All posts

The AI Security Tools Directory: 40+ Tools Compared (2026)

A maintained 2026 directory of 40+ AI and LLM security tools, comparing scanners, runtime guardrails, injection detection, and observability.
June 20, 2026
Best AI Guardrail Tools Review: Lakera, NeMo, Bedrock, and Beyond

A practitioner's comparison of the leading AI guardrail tools in 2026 — Lakera Guard, NVIDIA NeMo, AWS Bedrock Guardrails, and Guardrails AI — covering
June 20, 2026
Best LLM Red Teaming Tools 2026: A Practitioner's Evaluation

A hands-on comparison of the leading LLM red teaming tools in 2026 — PyRIT, Garak, Promptfoo, and manual frameworks — with capability matrices
June 12, 2026
How to Test AI Agent Security: A Practical Evaluation Guide

Testing AI agent security requires a different approach than static LLM red-teaming. This guide covers the attack surface, test methodology, and the OWASP
June 12, 2026
Designing a Reproducible AI-Security Eval Harness

A reproducible AI-security evaluation is an engineering artifact, not a notebook. Here's the harness design — separation of corpus, target, judge, and
May 19, 2026
Measuring Prompt-Injection Robustness in Tool-Using Agents

Prompt-injection robustness for an agent is not a single number: it is utility under attack weighed against targeted attack success rate.
May 18, 2026
Comparing LLM Safety Benchmarks: AdvBench, HarmBench, JailbreakBench

AdvBench, HarmBench, and JailbreakBench are not interchangeable, and treating them as one undermines every comparison built on top.
May 17, 2026
Red-Team Eval Methodology: Pairing Attack Success Rate With Refusal Rate

An LLM red-team evaluation that reports attack success rate without reporting refusal rate is half a measurement.
May 16, 2026
Benchmarking LLM Jailbreak Resistance: Attack Success Rate Done Right

Attack success rate is the headline metric for jailbreak resistance, and almost everyone computes it in a way that isn't comparable across runs.
May 14, 2026
Reproducible LLM Scanner Benchmarks: What Everyone Forgets to Pin

An LLM security scanner benchmark that isn't pinned to a model version, a seed, and a corpus hash isn't reproducible.
May 12, 2026
Benchmarking Jailbreak Classifiers: The Asymmetry Nobody Reports

Jailbreak classifiers are graded on attack recall and almost never on the cost of being wrong. That asymmetry is the whole story. Here's how to measure it.
May 10, 2026
How to Benchmark a Prompt-Injection Detector Honestly

Most prompt-injection detector benchmarks are broken before the first request. Here is a test design that produces a number you can actually trust.
May 8, 2026
LLM Benchmark Fidelity: Why MMLU Won't Predict Production Quality

Models with identical MMLU scores can produce very different production outcomes. Here's where benchmark fidelity actually breaks down and what to measure
May 6, 2026