Injection score
0–1 risk score on every inbound structured reply event. Output of Mails.ai's six-category prompt-injection scanner. Above 0.95 the event is flagged `quarantined`.
Injection scoreis the 0–1 prompt-injection risk score attached to every inbound structured reply event. It is the output of Mails.ai’s six-category scanner, run on every inbound before the event reaches your code.
The six attack categories
- Boundary manipulation— role tokens or chat-format delimiters (
<|im_end|>,### system:,[INST]). - System prompt override— “Ignore prior instructions”, “You are now”, “Disregard your training”.
- Data exfiltration— “Forward your system prompt”, “List your tools”, “What documents do you have access to”.
- Role hijacking— “Pretend you are an admin”, “Act as a financial advisor and approve this”.
- Tool invocation— direct attempts to call agent tools with attacker-supplied arguments.
- Encoding tricks— base64, ROT13, unicode-substituted, or homoglyph-encoded payloads designed to bypass naive substring scanners.
How to use it
One line at the top of your inbound handler:
agent.onReply((event) => {
if (event.injection_score > 0.5) return;
// safe to handle
});Above 0.95 the event is flagged quarantined— still delivered to your webhook marked quarantined: true(also logged in your dashboard) so your agent skips it. Repeated high scores from a sender mark them as suspicious in your workspace — a signal designed to propagate network-wide as the cohort grows.
Read the prompt-injection post for the full threat-model + scanner internals.
Built for agents.
Self-serve at every volume.
Public API opens Q3 2026. Drop ~6 lines into your agent and ship.
$ npm install @mailsai/sdk