Engineering

How Our Local Semantic Router Achieves 20ms Classification

Gilad Gabay · January 10, 2026 · 1 min read

Deep dive into our new local semantic router powered by sentence-transformers. Learn how we cut classification latency from 200-300ms to just 20ms.

When we first built SharkRouter's intelligent routing, we used cloud APIs for query classification. It worked, but the 200-300ms of added latency per request made it a poor fit for real-time applications.

Today, we are sharing how we rebuilt our semantic router to run entirely locally with just 20ms latency.

The Problem with Cloud-Based Classification

Our original approach:

  1. Send query to embedding API (~100ms)
  2. Compare against tier examples (~50ms)
  3. Return classification (~50ms network overhead)

Total: 200-300ms added to every request.
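The three steps above can be sketched as follows. This is a minimal illustration, not our production code: `embed_api` is a hypothetical stand-in for whichever remote embedding endpoint you call, and the point is that every single classification pays at least one network round trip.

```python
import numpy as np

def classify_via_cloud(query, tier_examples, embed_api):
    """Rough sketch of the cloud-based flow. `embed_api` is a hypothetical
    stand-in for a remote embedding endpoint; each call incurs a network
    round trip, which is where most of the 200-300ms went."""
    q = np.asarray(embed_api([query]), dtype=float)[0]   # step 1: remote embed (~100ms)
    q = q / np.linalg.norm(q)
    best_tier, best_sim = None, -1.0
    for tier, examples in tier_examples.items():         # step 2: compare tiers (~50ms)
        embs = np.asarray(embed_api(examples), dtype=float)
        embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
        sim = float((embs @ q).max())
        if sim > best_sim:
            best_tier, best_sim = tier, sim
    return best_tier, best_sim                           # step 3: response travels back (~50ms)
```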

Our Solution: Local Sentence Transformers

We switched to paraphrase-multilingual-MiniLM-L12-v2, a 420MB model that runs entirely in your infrastructure.
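A minimal sketch of how a local router like this can work, assuming the standard sentence-transformers API: embed a handful of example queries per tier once at startup, then each incoming query needs only one local `encode` call plus a dot product. The class and tier names here are illustrative, not our actual implementation; the router takes any `encode` callable, such as `SentenceTransformer(...).encode`.

```python
import numpy as np

def _unit(v):
    """L2-normalize rows so dot products become cosine similarities."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

class LocalSemanticRouter:
    """Sketch of a local semantic router: tier examples are embedded once
    at startup; classification is a single local encode + matrix product."""

    def __init__(self, encode, tier_examples):
        # encode: callable mapping list[str] -> array of embeddings, e.g.
        #   SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2").encode
        self.encode = encode
        self.labels = []
        chunks = []
        for tier, examples in tier_examples.items():
            self.labels.extend([tier] * len(examples))
            chunks.append(_unit(encode(examples)))
        self.example_embs = np.vstack(chunks)  # precomputed once, reused forever

    def classify(self, query):
        q = _unit(self.encode([query]))[0]     # the only per-request model call
        sims = self.example_embs @ q
        best = int(np.argmax(sims))
        return self.labels[best], float(sims[best])
```

With the real model it would be wired up roughly like `router = LocalSemanticRouter(SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2").encode, tier_examples)`; because the example embeddings are precomputed, per-request cost is dominated by one local encode.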

Results

  • 10-20x faster than cloud APIs
  • $0 cost per classification
  • Same accuracy as before
  • 50+ languages supported
#semantic-router #performance

Gilad Gabay

Co-Founder & Chief Architect
