AgentStatus × Maven AGI / Independent assurance for agents built on Maven

Independent external monitoring for Maven-powered customer surfaces, with first-look findings on six anonymized tenants.

We ran a first-look, outside-in validation sweep across six Maven-powered customer support assistants, using the same validations in the same window so the results compare head-to-head. The 2-hour window is short, but two tenants surfaced concentration patterns sharper than aggregate platform health would show. This brief presents the data, an honest framing of its limits, and a proposal to extend the run.

8k+

Agents monitored

18M+

Total validations

800+

Residential nodes

30+

Regions

agentstatus.dev | partner brief

What we understand about Maven AGI

We know how Maven AGI is built.

Maven AGI runs AI agents across customer support workflows, handling tickets, resolving queries, and escalating when needed. Your inside-out surface includes Conversation Analytics (transcript quality, resolution rates), a Knowledge Hub (what the agent knows), and performance dashboards (response times, deflection rates).

Those tell you what happened. They don't tell you what real users in São Paulo or Berlin experienced 8 minutes ago on a residential connection.

Maven AGI's REST API and webhook-based integrations make it straightforward to point AgentStatus validations at any deployed Maven agent endpoint. No internal access required, only what your public spec already exposes.

What AgentStatus is

Outside-in assurance for production AI agents.

• Validations that fire from 30+ real consumer devices on residential ISPs, not AWS.
• Gold Prompt Profiles: you define what "correct" looks like, we test against it continuously.
• Multi-turn validation: we test full conversations, not just single prompts.
• LLM-as-Judge evaluation: semantic correctness, not just HTTP 200.
• Model drift alerts: we tell you when your agent's answers start changing.
• Failure attribution: network issue vs model regression vs geographic variance.
• Explainability traces: step-level evidence for every pass/fail verdict.

Supporting capabilities: CI/CD gates via GitHub Actions, alerting via Slack, PagerDuty, and webhooks, and embeddable status badges for enterprise customers.
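
To make "semantic correctness, not just HTTP 200" concrete, here is a minimal sketch of how a single observation might be turned into a verdict. The verdict labels (UP, DEGRADED, CLIENT_ERROR, DOWN) and the driver names (TTFB SLA, gold_fail) are the ones used in the results in this brief; the `Observation` fields, the `classify` function, the ordering of the checks, and the 5-second threshold are all assumptions made for illustration, not AgentStatus's actual logic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical threshold: the brief does not state the real TTFB SLA.
TTFB_SLA_SECONDS = 5.0


@dataclass
class Observation:
    status_code: Optional[int]      # None means no response was received
    ttfb_seconds: Optional[float]   # time to first byte, if a response arrived
    gold_pass: Optional[bool]       # judge result against a Gold Prompt Profile


def classify(obs: Observation) -> Tuple[str, Optional[str]]:
    """Return (verdict, driver) for one observation. Illustrative only."""
    if obs.status_code is None or obs.status_code >= 500:
        return "DOWN", "transport"
    if 400 <= obs.status_code < 500:
        return "CLIENT_ERROR", f"http_{obs.status_code}"
    # An HTTP 200 is not enough: the answer must also pass the gold check.
    if obs.gold_pass is False:
        return "DEGRADED", "gold_fail"
    if obs.ttfb_seconds is not None and obs.ttfb_seconds > TTFB_SLA_SECONDS:
        return "DEGRADED", "TTFB SLA"
    return "UP", None


print(classify(Observation(200, 1.2, True)))
print(classify(Observation(200, 9.0, True)))
print(classify(Observation(200, 1.0, False)))
```

The point of the sketch is the separation of drivers: a slow but correct answer and a fast but wrong answer both count as DEGRADED, yet they carry different driver labels and call for different fixes.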

Where we fit

Outside-in vs inside-out.

01

Two views, both necessary

Maven AGI's analytics instrument from inside your stack outward. AgentStatus validates from the real user's position inward. Both views are necessary; neither replaces the other.

02

Aggregate health hides path-dependent failures

Dashboards show aggregate resolution rates. They don't show you that users on Telefonica residential IPs in Madrid are hitting rate-limit 403s while your Datadog reads green.

03

Real residential coverage, not datacenter checks

30+ residential nodes across 26 regions and 14 countries. Zero datacenter infrastructure. We test where your customers actually are.

04

No instrumentation required

No SDK to install inside your Maven agent. We validate the live endpoint, the same way a user would. Our access is scoped to what we send and receive.

The split

Two truths, one story.

Maven AGI, Inside-out

• Conversation Analytics
• Knowledge Hub
• Resolution & deflection dashboards
• Transcript quality & QA workflows
• Internal escalation flows

AgentStatus, Outside-in

• Residential validations from 30+ devices
• Gold Prompt Profiles & drift alerts
• Multi-turn conversation validation
• LLM-as-Judge semantic checks
• Failure attribution & evidence trail

Proof / metrics

What we ran.

A first-look validation sweep across six Maven-powered customer support surfaces, all chat.onmaven.app endpoints. Validations ran on a 5-minute scheduled cadence, with conformance validations layered on top.

• Window: ~2 hours on May 5, 2026
• Total snapshots: 1,188 (954 scheduled + 234 conformance)
• Total validation executions: 8,586
• Tenants observed: 6 anonymized
• Aggregate verdict mix (n = 954): UP 488 (51%) · DEGRADED 437 (46%) · CLIENT_ERROR 23 · DOWN 6
• Dominant degradation drivers: TTFB SLA (62% of degradations) · gold_fail (38%)
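
The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts reported in this list. The four verdict counts sum to 954, which matches the scheduled-snapshot count, so the percentages read as shares of scheduled snapshots; that reading is an inference from the numbers, not something stated elsewhere in the brief.

```python
# Counts exactly as reported in the list above.
verdicts = {"UP": 488, "DEGRADED": 437, "CLIENT_ERROR": 23, "DOWN": 6}
scheduled, conformance = 954, 234

total_verdicts = sum(verdicts.values())
total_snapshots = scheduled + conformance

# Share of each verdict, rounded to a whole percent.
shares = {
    name: round(100 * count / total_verdicts)
    for name, count in verdicts.items()
}

print("verdict rows:", total_verdicts)
print("total snapshots:", total_snapshots)
print("shares (%):", shares)
```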

The finding

The finding worth investigating: per-tenant variance.

The aggregate verdict mix hides the actual story. Across six tenants on the same platform, in the same window, the spread is wide:

Tenant	UP rate	Gold pass	p50 latency	Dominant driver
Tenant A	74%	95.4%	4.5s	TTFB SLA
Tenant B	71%	96.2%	6.4s	mixed
Tenant C	70%	95.6%	7.0s	mixed
Tenant D	64%	97.4%	12.2s	TTFB SLA
Tenant E	19%	97.1%	10.2s	TTFB SLA (143/157)
Tenant F	15%	80.3%	6.3s	gold_fail (109/113)

Two outliers, two different stories:

Tenant E is latency-bound. Same Maven platform as Tenant A, more than 2x the median latency, almost all DEGRADED rows driven by TTFB SLA. Configuration or per-tenant load issue, not a content problem.

Tenant F is semantically degraded. Gold pass rate at least 15 points below every other tenant (80.3% vs 95.4% to 97.4%). Almost all DEGRADED rows driven by gold_fail, not latency. Content or grounding problem, not a transport problem.

Two different remediation paths, surfacing on the same platform, in the same window. Aggregate platform health would show the same thing for both. Per-tenant outside-in evidence shows which is which.
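
The two outlier readings can be reproduced from the table with a short sketch. It assumes the parenthesized counts in the table (143/157 and 109/113) mean "DEGRADED rows with that driver / all DEGRADED rows for the tenant", which is how the surrounding text describes them. The `PATHS` mapping and the helper names are hypothetical and exist only for this illustration.

```python
# Rows copied from the per-tenant table: (UP rate %, gold pass %, p50 latency s).
tenants = {
    "A": (74, 95.4, 4.5),
    "B": (71, 96.2, 6.4),
    "C": (70, 95.6, 7.0),
    "D": (64, 97.4, 12.2),
    "E": (19, 97.1, 10.2),
    "F": (15, 80.3, 6.3),
}

# Dominant-driver counts reported for the two outliers:
# (driver, DEGRADED rows with that driver, all DEGRADED rows).
dominant = {"E": ("TTFB SLA", 143, 157), "F": ("gold_fail", 109, 113)}

# Hypothetical mapping from driver to the remediation path named in the text.
PATHS = {"TTFB SLA": "latency-bound", "gold_fail": "semantically degraded"}


def driver_share(tenant: str) -> float:
    """Fraction of a tenant's DEGRADED rows attributed to its dominant driver."""
    _, hits, total = dominant[tenant]
    return hits / total


def gold_gap(tenant: str) -> float:
    """Points between this tenant's gold pass rate and the lowest other tenant."""
    others = [row[1] for name, row in tenants.items() if name != tenant]
    return min(others) - tenants[tenant][1]


latency_ratio_e_vs_a = tenants["E"][2] / tenants["A"][2]

for name, (driver, _, _) in dominant.items():
    print(f"Tenant {name}: {PATHS[driver]}, "
          f"{driver_share(name):.0%} of DEGRADED rows from {driver}")
print(f"Tenant F gold gap vs nearest tenant: {gold_gap('F'):.1f} points")
print(f"Tenant E vs Tenant A p50 latency: {latency_ratio_e_vs_a:.2f}x")
```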

Specific tenant names available to Maven under mutual confidentiality.

What we are not claiming

An independent layer that coexists.

AgentStatus doesn't replace Maven's Conversation Analytics, Knowledge Hub, or performance dashboards. We don't have visibility into your knowledge base, your training data, or your internal escalation flows.

We are an independent assurance layer, a third party that validates what users experience in the wild, so you can prove your agents work to customers, compliance teams, and leadership without asking them to trust your own telemetry.

Ask for the conversation

Three things we'd want to validate with you.

01

Whether the use case fits

Maven agents in customer-facing production workflows with SLA commitments are the primary fit. We can validate quickly.

02

What a sandbox run looks like

Point us at one live agent endpoint. We run 48 hours of validations. You see the data before committing to anything.

03

Whether there's a joint motion

We're open to co-selling, co-marketing, or referral arrangements where it's a clean fit for a shared customer.

Closing

Maven AGI builds and operates the agents. AgentStatus proves they're working, for your customers, your SLAs, and anyone who needs evidence, not dashboards.

Book a sandbox run

Contact

dulra@carmel.so, roman@carmel.so

This brief reflects monitoring data from a 2-hour observation window on May 5, 2026, on publicly reachable chat.onmaven.app endpoints. Validations ran at conservative rate limits with no attempt to bypass authentication; no tenant data was collected beyond verdict metadata, latency aggregates, and short response previews. Findings are first-look only; confirming sustained patterns requires a longer window. Tenant names are anonymized in this brief and available to Maven under mutual confidentiality. AgentStatus is independent outside-in production monitoring for AI agents and is not affiliated with Maven AGI.