Skip to main content
AgenticAssure

Module 3: Test & Prove

Five steps. From model to provable safety report

A structured workflow that takes any AI model, one of 82 connected in production, through red-team testing, defensive guardrails, and framework-aligned report generation.

AgenticAssure Test & Prove run and results list showing test outcomes across security, privacy, and safety categories

0

Connected models

In production

0

Attack techniques

29 single-turn · 5 multi-turn

0

Defensive guardrails

Plus custom vulnerability definitions

3

AI Firewall modes

Block · Redact · Observe

The workflow

Five steps. Structured. Repeatable. Auditable.

Every test run follows the same five steps, so your results are comparable, your reports are reproducible, and your evidence is audit-ready.

Step 1: Connect Model

Add your model via API key or BYO credentials. OpenAI, Anthropic, Google Gemini, Ollama (local). Multi-provider fallback chain with mTLS support.

Step 2: New Test

Configure test scope: framework (EU AI Act tier, OWASP, NIST), attack categories, guardrail coverage, and credit estimate. Pre-flight approval if cost exceeds threshold.

Step 3: Run & Results

Tests execute against your model. Results arrive in real time, categorised by severity, mapped to OWASP and MITRE ATLAS, scored by refusal-aware judges.

Step 4: Reports

Framework-aligned reports generated automatically: OWASP LLM Top 10, EU AI Act, NIST AI RMF, Red-Team, MAS MindForge, AI Verify AIVTF. Blockchain-anchored.

Step 5: Test Modules

Browse and configure the full test strategy: which attack categories, which guardrails, which benchmark cookbooks, and which defensive modes to activate.

Step 1: Connect Model

82 models. Four providers. One interface.

OpenAI, Anthropic, Google Gemini, and Ollama (local/air-gapped). Multi-provider fallback chain with mTLS and custom headers.

AgenticAssure Step 1 showing 82 connected models across OpenAI, Anthropic, Google, and Ollama providers
  1. 1 OpenAI models
  2. 2 Anthropic models
  3. 3 Google Gemini
  4. 4 Ollama (local)
Step 1: Connect Model. 82 models in production. OpenAI · Anthropic · Google Gemini · Ollama (local/air-gapped).

Step 2: New Test

Framework-first test configuration.

Configure tests by framework: EU AI Act risk tier, OWASP category scope, NIST function coverage. The platform shows a credit estimate before you commit, and requires admin approval for runs above the $100 cost gate.

  • EU AI Act Article 6 risk tier selection
  • Credit estimate shown pre-run: no surprise bills
  • Admin approval gate for runs over $100 threshold
EU AI Act tiering
AgenticAssure Step 2 showing EU AI Act risk tier selection and credit pricing estimate
AI Firewall

Block. Redact. Observe.

The AI Firewall sits between your application and the model. Three modes give your security team precise control, from full blocking to silent observation, without touching the model itself.

Block

Hard stop.

Requests matching a threat signature are rejected before reaching the model. Immediate, audited, zero model exposure.

Redact

Sanitise, then allow.

PII, credentials, and sensitive tokens are redacted from the request before forwarding. The model sees clean input; the event is logged.

Observe

Log, don't block.

Traffic passes through unchanged but is fingerprinted against threat patterns. Ideal for baselining new models before enforcing.

AI Firewall: three enforcement modes. Configure per endpoint, per model, or estate-wide.

AI Firewall
AgenticAssure AI Firewall configuration showing Block, Redact, and Observe mode settings
Defensive layer

7 guardrails. Plus your own.

Seven shipped defensive guardrails cover the most common LLM safety risks. Custom Vulnerabilities let you define organisation-specific threat signatures, tested alongside the standard suite.

7 guardrails
AgenticAssure defensive guardrails configuration showing 7 shipped guardrail types
Custom vulnerabilities
AgenticAssure custom vulnerabilities interface for defining organisation-specific threat signatures
  • 7 shipped guardrails: prompt injection, PII leakage, toxicity, hallucination, system-prompt extraction, excessive agency, sensitive topics
  • Custom Vulnerabilities: define your own threat signatures in plain language
  • All guardrail evaluations scored by refusal-aware judges
Step 4: Reports

Framework-aligned. Blockchain-anchored. Audit-ready.

Every test run automatically populates the Reports catalogue: OWASP LLM Top 10, EU AI Act, NIST AI RMF, Red-Team, MAS MindForge, AI Verify AIVTF. Each report is blockchain-anchored with a verifiable hash.

Reports catalogue
AgenticAssure reports catalogue showing framework-aligned reports for OWASP, EU AI Act, NIST, and MAS MindForge
Reports catalogue: every framework, every test run. Blockchain-anchored, ready for your auditors.

Test runs become framework-aligned reports.

Analysis turns raw results into OWASP, EU AI Act, NIST, MAS MindForge, and AIVTF reports with PASS/FAIL verdicts and blockchain-anchored evidence.

AgenticAssure · Trust Layer for Enterprise AI

Trust layer for enterprise AI

Test every AI. Prove every claim.
Before your auditors ask you to.

Book a demo