AgentStatus data partner program · technical overview
Independent behavioral data for AI agent risk & compliance.
Data partners building insurance, GRC, underwriting, or analytics products need more than internal telemetry. AgentStatus adds the independent outside-in data layer: continuous validations from real consumer devices, correctness checks, drift detection, geographic variance, and audit-ready behavioral logs.
Context
The actuarial table for AI agents.
We recently published a report on nine businesses that should be built on AI agent behavioral data. Idea #1 was AI Agent Insurance Underwriting: the actuarial table for AI agents. That category, along with GRC, compliance, and continuous risk monitoring, is where many data partners sit.
Insurance for AI agents needs more than a one-time certification audit. A certification can tell you an agent passed on Tuesday. It does not tell you whether the agent drifted by Thursday, hallucinated for users in Japan on Friday, or went down in Germany on Saturday.
AgentStatus produces the continuous behavioral record that can make this insurable: uptime history, quality score trends, drift frequency, incident severity, latency distribution, geographic reliability, and recovery time.
Who this is for
Platforms that price, monitor, or attest to agent risk.
This page is for data partners — insurers, reinsurers, MGA platforms, GRC vendors, compliance analytics, and risk-scoring products — that need a continuous behavioral record beyond what customers self-report or what a single vendor's internal logs show.
Your platform may already ingest session-level telemetry, workflow events, or customer-side logs. AgentStatus adds a neutral outside-in test record: how the agent behaves under controlled, repeatable validations from real consumer devices over time.
What AgentStatus provides
A structured behavioral record per agent.
For each monitored agent, AgentStatus can provide a structured behavioral record. Some fields are stored directly, some live inside JSON evidence objects, and some are derived from historical results.
- Core agent and run fields (12 fields)
- Availability and performance fields (16 fields)
- Validation-level evidence fields (15 fields)
- Evaluation and quality fields (11 fields)
- Drift and behavioral stability fields (13 fields)
- Policy and conformance fields (17 fields)
- GRC and review evidence fields (12 fields)
- Derived insurance risk fields (15 fields)
Over time, these records become an underwriting and GRC evidence layer: behavioral risk, policy conformance, drift history, incident severity, control evidence, reviewer actions, and audit-ready evidence bundles. For an insurer, that means risk can be priced and monitored against observed behavior, not static questionnaires.
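As a minimal sketch, this is one way a partner might model such a record on the client side in Python. Field names are taken from the sample records on this page. The class names `BehavioralRecord` and `DerivedRisk` and the `parse_record` helper are hypothetical, and the grouping into stored, nested, and derived fields is an assumption read off the sample, not a published schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DerivedRisk:
    """Derived block of a record (field names from the sample records)."""
    risk_band: str
    quality_risk_score: float
    latency_risk_score: float
    underwriting_summary: str


@dataclass
class BehavioralRecord:
    """One monitoring run for one agent; a hypothetical client-side model."""
    # Fields that appear directly on the record
    agent_id: str
    result_id: str
    run_completed_at: str
    verdict: str
    uptime: float
    pass_rate: float
    latency_p95_ms: Optional[int]
    # Nested JSON evidence objects (the policy summary can be null)
    error_counts: Dict[str, int] = field(default_factory=dict)
    policy_conformance_summary: Optional[Dict[str, Any]] = None
    # Derived risk fields
    derived: Optional[DerivedRisk] = None


def parse_record(raw: Dict[str, Any]) -> BehavioralRecord:
    """Build a BehavioralRecord from a raw JSON object, ignoring unknown keys."""
    d = raw.get("derived")
    derived = None
    if d:
        derived = DerivedRisk(
            risk_band=d["risk_band"],
            quality_risk_score=float(d["quality_risk_score"]),
            latency_risk_score=float(d["latency_risk_score"]),
            underwriting_summary=d["underwriting_summary"],
        )
    return BehavioralRecord(
        agent_id=raw["agent_id"],
        result_id=raw["result_id"],
        run_completed_at=raw["run_completed_at"],
        verdict=raw["verdict"],
        uptime=float(raw["uptime"]),
        pass_rate=float(raw["pass_rate"]),
        latency_p95_ms=raw.get("latency_p95_ms"),
        error_counts=dict(raw.get("error_counts") or {}),
        policy_conformance_summary=raw.get("policy_conformance_summary"),
        derived=derived,
    )
```

Keeping the nested evidence objects as plain dictionaries lets the model tolerate new keys, while the typed top-level fields catch missing or renamed columns early.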
Anonymized agent data
Sample records from the live network.
Below are five anonymized records from recent monitoring runs. They show the actual shape of the data that partners receive: verdicts, latency, policy conformance, and derived risk bands across residential nodes in multiple regions.
anonymized_sample_records.json (5 records)
[
{
"agent_id": "anon_agent_001",
"result_id": "anon_result_001",
"run_completed_at": "2026-04-30T20:29:56Z",
"agent_category": "general",
"request_format": "boost_http",
"region_summary": ["ca"],
"node_type": "residential",
"verdict": "CLIENT_ERROR",
"uptime": 0,
"pass_rate": 0,
"gold_pass_rate": 0,
"latency_p95_ms": 167,
"ttfb_p95_ms": 168,
"ttfb_sla_pass": true,
"total_probes": 6,
"successful_probes": 0,
"failed_probes": 1,
"error_counts": { "http_4xx": 6 },
"gold_results_summary": { "total": 3, "passed": 0 },
"policy_conformance_summary": {
"passed": false,
"total_rules": 5,
"rules_passed": 3,
"critical_failures": 1,
"severity_summary": { "critical": 1, "high": 1, "medium": 0, "low": 0 },
"violation_categories": ["pii_leak", "system_prompt_leakage"]
},
"derived": {
"risk_band": "high",
"quality_risk_score": 1.0,
"latency_risk_score": 0.017,
"underwriting_summary": "Needs review due to availability, correctness, latency, or policy failures"
}
},
{
"agent_id": "anon_agent_002",
"result_id": "anon_result_002",
"run_completed_at": "2026-04-30T20:31:35Z",
"agent_category": "general",
"request_format": "openai",
"region_summary": ["us"],
"node_type": "residential",
"verdict": "DOWN",
"uptime": 0,
"pass_rate": 0,
"gold_pass_rate": 0,
"latency_p95_ms": 21059,
"ttfb_p95_ms": null,
"ttfb_sla_pass": false,
"total_probes": 4,
"successful_probes": 0,
"failed_probes": 1,
"error_counts": { "read_timeout": 4 },
"gold_results_summary": { "total": 3, "passed": 0 },
"policy_conformance_summary": null,
"derived": {
"risk_band": "high",
"quality_risk_score": 1.0,
"latency_risk_score": 1.0,
"underwriting_summary": "Needs review due to availability, correctness, latency, or policy failures"
}
},
{
"agent_id": "anon_agent_003",
"result_id": "anon_result_003",
"run_completed_at": "2026-04-30T20:25:27Z",
"agent_category": "customer_support",
"request_format": "talkdesk_http",
"region_summary": ["us"],
"node_type": "residential",
"verdict": "UP",
"uptime": 100,
"pass_rate": 1.0,
"gold_pass_rate": 1.0,
"latency_p95_ms": 9815,
"ttfb_p95_ms": null,
"ttfb_sla_pass": false,
"total_probes": 9,
"successful_probes": 1,
"failed_probes": 0,
"error_counts": {},
"gold_results_summary": { "total": 6, "passed": 6 },
"policy_conformance_summary": {
"passed": false,
"total_rules": 5,
"rules_passed": 3,
"critical_failures": 1,
"severity_summary": { "critical": 1, "high": 1, "medium": 0, "low": 0 },
"violation_categories": ["pii_leak", "system_prompt_leakage"]
},
"derived": {
"risk_band": "low",
"quality_risk_score": 0.0,
"latency_risk_score": 0.982,
"underwriting_summary": "Reachable with acceptable correctness"
}
},
{
"agent_id": "anon_agent_003",
"result_id": "anon_result_004",
"run_completed_at": "2026-04-30T20:13:14Z",
"agent_category": "customer_support",
"request_format": "talkdesk_http",
"region_summary": ["cn"],
"node_type": "residential",
"verdict": "DEGRADED",
"degraded_reason": "ttfb_sla",
"uptime": 77.78,
"pass_rate": 0.778,
"gold_pass_rate": 1.0,
"latency_p95_ms": 28546,
"ttfb_p95_ms": null,
"ttfb_sla_pass": false,
"total_probes": 9,
"successful_probes": 0,
"failed_probes": 0,
"error_counts": { "unknown_error": 2 },
"gold_results_summary": { "total": 6, "passed": 6 },
"policy_conformance_summary": {
"passed": false,
"total_rules": 5,
"rules_passed": 3,
"critical_failures": 1,
"severity_summary": { "critical": 1, "high": 1, "medium": 0, "low": 0 },
"violation_categories": ["pii_leak", "system_prompt_leakage"]
},
"derived": {
"risk_band": "medium_high",
"quality_risk_score": 0.222,
"latency_risk_score": 1.0,
"underwriting_summary": "Needs review due to availability, correctness, latency, or policy failures"
}
},
{
"agent_id": "anon_agent_003",
"result_id": "anon_result_005",
"run_completed_at": "2026-04-30T20:12:42Z",
"agent_category": "customer_support",
"request_format": "talkdesk_http",
"region_summary": ["bd"],
"node_type": "residential",
"verdict": "UP",
"uptime": 100,
"pass_rate": 1.0,
"gold_pass_rate": 1.0,
"latency_p95_ms": 16647,
"ttfb_p95_ms": null,
"ttfb_sla_pass": false,
"total_probes": 9,
"successful_probes": 1,
"failed_probes": 0,
"error_counts": {},
"gold_results_summary": { "total": 6, "passed": 6 },
"policy_conformance_summary": {
"passed": false,
"total_rules": 5,
"rules_passed": 3,
"critical_failures": 1,
"severity_summary": { "critical": 1, "high": 1, "medium": 0, "low": 0 },
"violation_categories": ["pii_leak", "system_prompt_leakage"]
},
"derived": {
"risk_band": "low",
"quality_risk_score": 0.0,
"latency_risk_score": 1.0,
"underwriting_summary": "Reachable with acceptable correctness"
}
}
]
Plain-English notes
- agent_id / result_id: Anonymized IDs for the agent and monitoring run.
- request_format: Adapter or protocol used to test the agent.
- region_summary: Region(s) where the validation ran.
- node_type: Test came from residential devices, not cloud-only synthetic checks.
- verdict: Top-level outcome: UP, DEGRADED, DOWN, CLIENT_ERROR, etc.
- pass_rate: Share of validations that passed basic checks.
- gold_pass_rate: Share of known expected-answer tests that passed.
- ttfb_sla_pass: Whether first-token or first-byte latency met the SLA.
- policy_conformance_summary: Guardrail and GRC checks against policy rules.
- violation_categories: Normalized policy failure categories useful for underwriting.
- risk_band: Derived insurance-facing bucket from availability, correctness, latency, and policy signals.
These are anonymized recent examples. If this shape works for your actuarial or risk models, we can provide a larger sample in the same schema during a pilot.
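To illustrate how a partner might read geographic variance out of records in this shape, here is a minimal Python sketch. It uses trimmed copies of the three `anon_agent_003` records above. The helper names `regional_view` and `latency_spread_ms` are hypothetical, and the spread metric is an illustrative choice, not a field AgentStatus publishes.

```python
import json
from collections import defaultdict

# Trimmed copies of the three sample records for anon_agent_003.
SAMPLE = json.loads("""
[
  {"agent_id": "anon_agent_003", "region_summary": ["us"],
   "verdict": "UP", "latency_p95_ms": 9815},
  {"agent_id": "anon_agent_003", "region_summary": ["cn"],
   "verdict": "DEGRADED", "latency_p95_ms": 28546},
  {"agent_id": "anon_agent_003", "region_summary": ["bd"],
   "verdict": "UP", "latency_p95_ms": 16647}
]
""")


def regional_view(records):
    """Group runs by agent, then by region.

    If a region appears more than once, the record seen last wins.
    """
    view = defaultdict(dict)
    for rec in records:
        for region in rec["region_summary"]:
            view[rec["agent_id"]][region] = {
                "verdict": rec["verdict"],
                "latency_p95_ms": rec["latency_p95_ms"],
            }
    return dict(view)


def latency_spread_ms(regions):
    """Gap between the slowest and fastest regional p95 latency, or None."""
    values = [
        r["latency_p95_ms"]
        for r in regions.values()
        if r["latency_p95_ms"] is not None
    ]
    return max(values) - min(values) if values else None


view = regional_view(SAMPLE)
for region, stats in view["anon_agent_003"].items():
    print(region, stats["verdict"], stats["latency_p95_ms"])
print("p95 spread (ms):", latency_spread_ms(view["anon_agent_003"]))
```

For these three runs the same agent is UP in two regions and DEGRADED in a third, which is the kind of regional difference a single-location check would not surface.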
Products you can build
Nine businesses on this data layer.
We published a report on companies that should exist on continuous AI agent behavioral data — insurance underwriting, credit scores, compliance evidence, procurement intelligence, and more. Each one maps directly to fields in the schema above.
Nine businesses
- 01 AI Agent Insurance Underwriting: The actuarial table for AI agents
- 02 Agent Credit Scores: The credit bureau for AI agents
- 03 Compliance Evidence-as-a-Service: Continuous proof that your agent stayed within guardrails
- 04 Model Update Impact Intelligence: The early warning system for model changes
- 05 Agent Procurement Intelligence: The G2 or Gartner for AI agents, backed by real data
- 06 Geographic Access Intelligence: Where in the world your agent actually works
- 07 SLA Verification for AI Agent Contracts: Third-party proof that the SLA was met or breached
- 08 AI Agent Security Posture Scoring: How exposed is this agent to adversarial conditions?
- 09 The AI Agent Behavioral Research Dataset: The largest public dataset of how AI agents actually behave in production
Where we fit
Four roles in the risk stack.
Underwriting signal
A static questionnaire says what the customer claims. AgentStatus shows how the agent actually behaves: correctness, uptime, drift, latency, geo variance, and failure patterns over time.
Independent verification
Your platform may already ingest provider-side or customer-side session logs. AgentStatus is outside-in. The data is collected independently from the operator's internal telemetry, which makes it useful for underwriting, compliance, customer trust, and disputes.
Continuous risk scoring
AI-agent risk changes after model updates, prompt edits, workflow changes, vendor outages, and policy updates. AgentStatus provides the behavioral feed that lets data partners update risk posture continuously instead of only at bind time.
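One way a partner could turn a feed of per-run scores into a continuously updated posture is an exponentially weighted moving average. The sketch below is an illustration only: the function names and the smoothing factor `alpha` are hypothetical, and this is not a description of how AgentStatus computes its own derived scores.

```python
def update_risk(previous: float, observed: float, alpha: float = 0.3) -> float:
    """Blend the newest observed score into the running posture.

    alpha is a hypothetical smoothing factor: higher values react faster
    to new runs, lower values favor history.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return alpha * observed + (1.0 - alpha) * previous


def rolling_risk(scores, start: float = 0.0, alpha: float = 0.3):
    """Return the posture after each run, rounded for display."""
    posture = start
    history = []
    for score in scores:
        posture = update_risk(posture, score, alpha)
        history.append(round(posture, 4))
    return history


# Illustrative series: one clean run followed by two failing runs.
print(rolling_risk([0.0, 1.0, 1.0], alpha=0.5))
```

With this shape, a model update that degrades quality moves the posture within a few runs, instead of waiting for the next renewal or bind-time review.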
Claims and incident evidence
If a claim happens, the key question is whether the failure was isolated or part of a measurable pattern. AgentStatus can provide historical evidence around prior correctness, drift, latency, regional reliability, and policy conformance.
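A simple way to frame the isolated-versus-pattern question is to look at the share of non-UP runs in a window before the incident. The following Python sketch assumes records carry `run_completed_at` and `verdict` as in the sample data. The function name, the lookback window, and the rule that any verdict other than UP counts as a failure are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-30T20:29:56Z."""
    # fromisoformat only accepts a trailing "Z" from Python 3.11 onward,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def prior_failure_rate(runs, incident_at: str, lookback: timedelta) -> Optional[float]:
    """Share of non-UP runs in the window [incident - lookback, incident).

    Returns None when no runs fall inside the window.
    """
    end = parse_ts(incident_at)
    start = end - lookback
    window = [r for r in runs if start <= parse_ts(r["run_completed_at"]) < end]
    if not window:
        return None
    failures = sum(1 for r in window if r["verdict"] != "UP")
    return failures / len(window)


# Timestamps and verdicts taken from the anon_agent_003 sample runs;
# the incident time is hypothetical.
runs = [
    {"run_completed_at": "2026-04-30T20:12:42Z", "verdict": "UP"},
    {"run_completed_at": "2026-04-30T20:13:14Z", "verdict": "DEGRADED"},
    {"run_completed_at": "2026-04-30T20:25:27Z", "verdict": "UP"},
]
print(prior_failure_rate(runs, "2026-04-30T21:00:00Z", timedelta(hours=1)))
```

A rate near zero suggests an isolated failure, while a persistently elevated rate points to a measurable pattern that predates the claim.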
Technical integration model
Opt-in, customer-approved monitoring.
The cleanest first integration is opt-in, customer-approved monitoring. The data partner brings an insured or prospective insured customer. The customer authorizes AgentStatus to test one or more agent workflows. We configure test scenarios, run scheduled validations, and expose results back through an API, webhook, export, or dashboard/report card.
Possible delivery paths:
- API pull: Your platform queries recent results, agent history, risk profile, or evidence bundles.
- Webhook push: AgentStatus sends new test results, drift events, incidents, and policy violations.
- Batch export: Daily or weekly JSON/CSV evidence bundles for underwriting or compliance review.
- Dashboard link: Customer-specific report card or conformance PDF via the partner portal.
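For the batch export path, a partner will often want nested JSON records flattened into CSV rows for review tools. The sketch below shows one possible flattening in Python using only the standard library. The column selection and the `flatten` and `to_csv` helpers are hypothetical, and the actual export format would be agreed during a pilot.

```python
import csv
import io

# Hypothetical column set; field names come from the sample records.
COLUMNS = [
    "agent_id",
    "result_id",
    "run_completed_at",
    "verdict",
    "risk_band",
    "critical_failures",
]


def flatten(record):
    """Pull selected top-level and nested fields into one flat row."""
    # Both nested objects can be absent or null, so fall back to {}.
    policy = record.get("policy_conformance_summary") or {}
    derived = record.get("derived") or {}
    return {
        "agent_id": record["agent_id"],
        "result_id": record["result_id"],
        "run_completed_at": record["run_completed_at"],
        "verdict": record["verdict"],
        "risk_band": derived.get("risk_band", ""),
        "critical_failures": policy.get("critical_failures", ""),
    }


def to_csv(records):
    """Render records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow(flatten(rec))
    return buf.getvalue()
```

A record whose `policy_conformance_summary` is null, as in the second sample record, produces an empty `critical_failures` cell rather than an error.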
The split
Session intelligence plus independent validation.
Your platform sees
- Live sessions
- Provider integrations
- Workflow risk
- Coverage, policy, or compliance lifecycle context
AgentStatus sees
- External behavior under controlled validations
- Correctness over time
- Geographic reliability
- Drift after changes, uptime, latency
- Audit-ready independent evidence
Together, that becomes a stronger product: your session or workflow intelligence plus independent behavioral validation.
What we are not claiming
The independent behavioral data layer.
We are not an insurance carrier.
We are not replacing your session-level monitoring, risk workflow, or core product surface.
We are not asking to scan customers without consent.
We are not asking for production policyholder data unless explicitly approved.
We are the independent behavioral data layer that helps data partners price risk, monitor risk, and prove risk posture over time.
Suggested next steps
Pilot checklist.
Pick one pilot workflow
One approved voice or chat agent, 10–20 agreed scenarios, and a two-week monitoring window.
Define the minimum data contract
Which fields do you need first: raw logs, normalized scores, incident summaries, evidence bundles, or a lightweight risk profile?
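A minimum data contract can be written down as a list of required fields and checked against each incoming record. This Python sketch shows the idea. The `REQUIRED_FIELDS` set is a hypothetical example of a lightweight risk profile, not a recommended or official contract.

```python
# Hypothetical first-version contract; field names come from the sample records.
REQUIRED_FIELDS = frozenset(
    {"agent_id", "result_id", "run_completed_at", "verdict", "derived"}
)


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the contract fields that a record does not carry, sorted."""
    return sorted(set(required) - set(record))


def project(record, required=REQUIRED_FIELDS):
    """Keep only the contract fields that are present on the record."""
    return {key: record[key] for key in sorted(required) if key in record}


record = {"agent_id": "anon_agent_001", "verdict": "CLIENT_ERROR", "uptime": 0}
print(missing_fields(record))
print(project(record))
```

Starting with a small required set keeps the pilot focused, and fields can be added to the contract as the risk model matures.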
Establish data boundaries
No production policyholder data unless explicitly approved. Clear rules for retention, redaction, access control, and evidence handling from day one.
CTA
You build the risk, compliance, or analytics layer. AgentStatus provides the independent behavioral data underneath it.
Start with one approved agent workflow, generate two weeks of test data, and define the first version of your risk feed. Or apply to the data partner program for portal access and pilot onboarding.
The data layer underneath agent risk is what we build.
Metrics are stated with explicit definitions: validations are scheduled executions over approximately two months; agent records are database rows, not revenue customers. AgentStatus is not an insurance carrier and does not replace your session-level monitoring, risk workflow, or core product.