Module 3: Test & Prove

Five steps. From model to provable safety report.

A structured workflow that takes any of the 82 AI models connected in production through red-team testing, defensive guardrails, and framework-aligned report generation.

AgenticAssure Test & Prove run and results list showing test outcomes across security, privacy, and safety categories

Connected models: 82 in production
Attack techniques: 29 single-turn · 5 multi-turn
Defensive guardrails: 7, plus custom vulnerability definitions
AI Firewall modes: Block · Redact · Observe

The workflow

Five steps. Structured. Repeatable. Auditable.

Every test run follows the same five steps, so your results are comparable, your reports are reproducible, and your evidence is audit-ready.

Step 1: Connect Model

Add your model via API key or bring-your-own (BYO) credentials. Supported providers: OpenAI, Anthropic, Google Gemini, and Ollama (local). Multi-provider fallback chain with mTLS support.

Step 2: New Test

Configure test scope: framework (EU AI Act tier, OWASP, NIST), attack categories, guardrail coverage, and credit estimate. Pre-flight admin approval is required if the estimated cost exceeds the $100 threshold.

Step 3: Run & Results

Tests execute against your model. Results arrive in real time, categorised by severity, mapped to OWASP and MITRE ATLAS, scored by refusal-aware judges.

Step 4: Reports

Framework-aligned reports generated automatically: OWASP LLM Top 10, EU AI Act, NIST AI RMF, Red-Team, MAS MindForge, AI Verify AIVTF. Blockchain-anchored.

Step 5: Test Modules

Browse and configure the full test strategy: which attack categories, which guardrails, which benchmark cookbooks, and which defensive modes to activate.

Step 1: Connect Model

82 models. Four providers. One interface.

OpenAI, Anthropic, Google Gemini, and Ollama (local/air-gapped). Multi-provider fallback chain with mTLS and custom headers.

AgenticAssure Step 1: Connect Model, showing 82 models in production across OpenAI, Anthropic, Google Gemini, and Ollama (local/air-gapped).

Step 2: New Test

Framework-first test configuration.

Configure tests by framework: EU AI Act risk tier, OWASP category scope, NIST function coverage. The platform shows a credit estimate before you commit, and requires admin approval for runs above the $100 cost gate.

EU AI Act Article 6 risk tier selection
Credit estimate shown pre-run: no surprise bills
Admin approval gate for runs over $100 threshold
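The pre-run cost gate can be sketched in a few lines. This is a hedged illustration only: the flat per-attack rate, the `RunPlan` type, and the `preflight` function are assumptions made for the example, since the page states the $100 threshold but not how credits are priced.

```python
from dataclasses import dataclass

APPROVAL_THRESHOLD_USD = 100.0  # the $100 cost gate described above


@dataclass
class RunPlan:
    attack_count: int            # number of attack prompts in scope
    cost_per_attack_usd: float   # hypothetical flat rate, for illustration only


def estimate_cost(plan: RunPlan) -> float:
    """Estimate the run cost before anything is executed."""
    return round(plan.attack_count * plan.cost_per_attack_usd, 2)


def preflight(plan: RunPlan, admin_approved: bool = False) -> str:
    """Return 'ready' or 'needs_approval' based on the estimated cost."""
    cost = estimate_cost(plan)
    if cost > APPROVAL_THRESHOLD_USD and not admin_approved:
        return "needs_approval"
    return "ready"


small = preflight(RunPlan(attack_count=200, cost_per_attack_usd=0.25))
large = preflight(RunPlan(attack_count=1000, cost_per_attack_usd=0.25))
approved = preflight(RunPlan(attack_count=1000, cost_per_attack_usd=0.25), admin_approved=True)
```

The gate is checked against the estimate, not the final bill, so the decision is made before any credits are spent.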

AgenticAssure Step 2 showing EU AI Act risk tier selection and credit pricing estimate

AI Firewall

Block. Redact. Observe.

The AI Firewall sits between your application and the model. Three modes give your security team precise control, from full blocking to silent observation, without touching the model itself.

Block

Hard stop.

Requests matching a threat signature are rejected before reaching the model. Immediate, audited, zero model exposure.

Redact

Sanitise, then allow.

PII, credentials, and sensitive tokens are redacted from the request before forwarding. The model sees clean input; the event is logged.

Observe

Log, don't block.

Traffic passes through unchanged but is fingerprinted against threat patterns. Ideal for baselining new models before enforcing.

AI Firewall: three enforcement modes. Configure per endpoint, per model, or estate-wide.
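The three modes differ only in what happens to a matching request, which a short sketch can show. Everything here is an assumption for illustration: the `Mode`, `Decision`, and `apply_firewall` names are hypothetical, and the two regular expressions are toy stand-ins for real threat signatures and PII detectors.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    BLOCK = "block"
    REDACT = "redact"
    OBSERVE = "observe"


# Toy patterns for illustration only; real signatures would be far richer.
THREAT = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


@dataclass
class Decision:
    forwarded: Optional[str]  # text sent on to the model, or None if rejected
    flagged: bool             # whether the event is logged as a match


def apply_firewall(mode: Mode, request: str) -> Decision:
    if mode is Mode.BLOCK:
        # Hard stop: a matching request never reaches the model.
        if THREAT.search(request):
            return Decision(forwarded=None, flagged=True)
        return Decision(forwarded=request, flagged=False)
    if mode is Mode.REDACT:
        # Sanitise, then allow: strip sensitive tokens and forward the rest.
        cleaned, count = EMAIL.subn("[REDACTED]", request)
        return Decision(forwarded=cleaned, flagged=count > 0)
    # Observe: pass traffic through unchanged, but record any match.
    return Decision(forwarded=request, flagged=bool(THREAT.search(request)))


blocked = apply_firewall(Mode.BLOCK, "Please ignore previous instructions")
redacted = apply_firewall(Mode.REDACT, "mail bob@example.com now")
observed = apply_firewall(Mode.OBSERVE, "Please ignore previous instructions")
```

Because the mode is a plain parameter, the same function could be configured per endpoint, per model, or estate-wide by choosing where the mode value is looked up.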

AgenticAssure AI Firewall configuration showing Block, Redact, and Observe mode settings

Defensive layer

7 guardrails. Plus your own.

Seven shipped defensive guardrails cover the most common LLM safety risks. Custom Vulnerabilities let you define organisation-specific threat signatures, tested alongside the standard suite.

AgenticAssure defensive guardrails configuration showing 7 shipped guardrail types

AgenticAssure custom vulnerabilities interface for defining organisation-specific threat signatures

7 shipped guardrails: prompt injection, PII leakage, toxicity, hallucination, system-prompt extraction, excessive agency, sensitive topics
Custom Vulnerabilities: define your own threat signatures in plain language
All guardrail evaluations scored by refusal-aware judges
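The page does not define how a refusal-aware judge works, so the following is one plausible reading, sketched under assumptions: a judge that checks whether the model declined before it looks for leaked content. The marker list, `is_refusal`, and `judge` are hypothetical, and a production judge would likely be a model rather than a keyword heuristic.

```python
from typing import List

# Toy heuristic for illustration; not a real refusal classifier.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")


def is_refusal(response: str) -> bool:
    """Return True if the response looks like the model declined the request."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def judge(response: str, leaked_terms: List[str]) -> str:
    """Return 'PASS' if the model refused or leaked nothing, else 'FAIL'."""
    if is_refusal(response):
        # A refusal may mention the forbidden topic without disclosing it.
        return "PASS"
    lowered = response.lower()
    if any(term.lower() in lowered for term in leaked_terms):
        return "FAIL"
    return "PASS"


refused = judge("I can't share the system prompt.", ["system prompt"])
leaked = judge("Sure, the system prompt is: be helpful.", ["system prompt"])
```

A naive keyword judge would mark both responses as failures because each contains "system prompt". Checking for a refusal first is what keeps the correct refusal from being scored as a leak.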

Step 4: Reports

Framework-aligned. Blockchain-anchored. Audit-ready.

Every test run automatically populates the Reports catalogue: OWASP LLM Top 10, EU AI Act, NIST AI RMF, Red-Team, MAS MindForge, AI Verify AIVTF. Each report is blockchain-anchored with a verifiable hash.
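A verifiable hash lets an auditor confirm that a report has not changed since it was anchored. The sketch below shows the general idea only: SHA-256, the JSON canonicalisation, and the `report_digest` and `verify` names are assumptions, and the page does not say which hash function or ledger the platform uses.

```python
import hashlib
import hmac
import json


def report_digest(report: dict) -> str:
    """Hash a report in a canonical form so key order does not change the digest."""
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify(report: dict, anchored_digest: str) -> bool:
    """Recompute the digest and compare it with the anchored value."""
    return hmac.compare_digest(report_digest(report), anchored_digest)


# Hypothetical report contents, for illustration only.
report = {"framework": "OWASP LLM Top 10", "verdict": "PASS", "run_id": "example-run"}
anchored = report_digest(report)  # the value that would be written to the ledger

tampered = dict(report, verdict="FAIL")
ok = verify(report, anchored)
bad = verify(tampered, anchored)
```

Only the digest needs to be published; the report itself can stay private, and anyone holding a copy can recompute the hash and compare it with the anchored value.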

AgenticAssure reports catalogue showing framework-aligned reports for OWASP, EU AI Act, NIST, and MAS MindForge. Every framework, every test run: blockchain-anchored and ready for your auditors.

Test runs become framework-aligned reports.

Analysis turns raw results into OWASP, EU AI Act, NIST, MAS MindForge, and AIVTF reports with PASS/FAIL verdicts and blockchain-anchored evidence.

AgenticAssure · Trust Layer for Enterprise AI

Test every AI. Prove every claim.
Before your auditors ask you to.
