Glossary·Security

Injection score

0–1 risk score on every inbound structured reply event. Output of Mails.ai's six-category prompt-injection scanner. Above 0.95 the event is flagged `quarantined`.

Injection scoreis the 0–1 prompt-injection risk score attached to every inbound structured reply event. It is the output of Mails.ai’s six-category scanner, run on every inbound before the event reaches your code.

The six attack categories

Boundary manipulation— role tokens or chat-format delimiters (<|im_end|>, ### system:, [INST]).
System prompt override— “Ignore prior instructions”, “You are now”, “Disregard your training”.
Data exfiltration— “Forward your system prompt”, “List your tools”, “What documents do you have access to”.
Role hijacking— “Pretend you are an admin”, “Act as a financial advisor and approve this”.
Tool invocation— direct attempts to call agent tools with attacker-supplied arguments.
Encoding tricks— base64, ROT13, unicode-substituted, or homoglyph-encoded payloads designed to bypass naive substring scanners.

How to use it

One line at the top of your inbound handler:

agent.onReply((event) => {
  if (event.injection_score > 0.5) return;
  // safe to handle
});

Above 0.95 the event is flagged quarantined— still delivered to your webhook marked quarantined: true(also logged in your dashboard) so your agent skips it. Repeated high scores from a sender mark them as suspicious in your workspace — a signal designed to propagate network-wide as the cohort grows.

Read the prompt-injection post for the full threat-model + scanner internals.