Tag

#methodology

7 posts tagged methodology.

Designing a Reproducible AI-Security Eval Harness

A reproducible AI-security evaluation is an engineering artifact, not a notebook. Here's the harness design — separation of corpus, target, judge, and
May 19, 2026

Measuring Prompt-Injection Robustness in Tool-Using Agents

Prompt-injection robustness for an agent is not a single number: it is utility under attack measured against targeted attack success.
May 18, 2026

Comparing LLM Safety Benchmarks: AdvBench, HarmBench, JailbreakBench

AdvBench, HarmBench, and JailbreakBench are not interchangeable, and treating them as one undermines every comparison built on top.
May 17, 2026

Red-Team Eval Methodology: Pairing Attack Success Rate With Refusal Rate

An LLM red-team evaluation that reports attack success rate without reporting refusal rate is half a measurement.
May 16, 2026

Benchmarking LLM Jailbreak Resistance: Attack Success Rate Done Right

Attack success rate is the headline metric for jailbreak resistance, and almost everyone computes it in a way that isn't comparable across runs.
May 14, 2026

Reproducible LLM Scanner Benchmarks: What Everyone Forgets to Pin

An LLM security scanner benchmark that isn't pinned to a model version, a seed, and a corpus hash isn't reproducible.
May 12, 2026

How to Benchmark a Prompt-Injection Detector Honestly

Most prompt-injection detector benchmarks are broken before the first request. Here is a test design that produces a number you can actually trust.
May 8, 2026