AI Safety
Configure guardrails, blocked topics, and Serious Mode to keep your AI assistants on-brand and compliant.
Why AI safety matters
Your AI assistants are talking to real customers on your behalf. Without guardrails, they can occasionally go off-script — discussing topics outside their scope, making statements you'd rather not have associated with your brand, or mishandling sensitive situations.
The AI Safety page (at AI Assistant > AI Safety) gives you the tools to prevent this. It's not about limiting your AI — it's about making sure it stays focused, appropriate, and trustworthy.
How Autoch.at's safety system works
Autoch.at uses a layered approach to safety:
- Blocked keywords and topics — Hard stops that prevent the AI from engaging with specific content at all.
- Serious Mode — A special handling mode for sensitive inquiries that require a more careful, measured response.
- Handoff triggers — Automatic escalation to a human when the AI detects a situation it shouldn't handle alone.
These layers work together. A conversation might pass the keyword check but still trigger Serious Mode, which then escalates to a human.
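To make the layering concrete, here is a minimal sketch of how the checks might compose. Every name here (the keyword sets, the `check_message` function, its return values) is an illustrative assumption, not Autoch.at's actual implementation:

```python
# Hypothetical sketch of layered safety checks. The keyword lists and
# function names are illustrative, not the real product's API.

BLOCKED_KEYWORDS = {"lawsuit", "competitorx"}   # hard-stop terms (assumed)
SERIOUS_SIGNALS = {"emergency", "crisis"}       # sensitive-topic terms (assumed)

def check_message(text: str) -> str:
    """Return which safety layer, if any, handles this message."""
    words = set(text.lower().split())
    if words & BLOCKED_KEYWORDS:
        return "blocked_keyword"   # keyword layer fires first: hard stop
    if words & SERIOUS_SIGNALS:
        return "serious_mode"      # sensitive topic: careful handling, then escalation
    return "normal"                # no guardrail fired; the AI replies as usual

print(check_message("I'm filing a lawsuit"))   # → blocked_keyword
print(check_message("this is an emergency"))   # → serious_mode
print(check_message("where is my order?"))     # → normal
```

In this sketch the keyword check runs first because it is the cheapest and strictest layer; a message that passes it can still match a Serious Mode signal, mirroring the layering described above.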
Blocked keywords and topics
Blocked keywords are words or phrases that, if detected in a customer's message, will prevent the AI from responding normally. Instead, it will take one of the following actions:
- Redirect the conversation to a human agent.
- Send a pre-configured fallback message.
- End the conversation gracefully.
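The three actions above can be sketched as a simple dispatch. This is a hedged illustration only; the action names, field names, and messages are assumptions, and the real product configures all of this through the UI:

```python
# Illustrative mapping from a configured blocked-keyword action to the
# response a customer would see. All names here are hypothetical.

def handle_blocked(action: str,
                   fallback_message: str = "Let me connect you with our team.") -> dict:
    """Return the response envelope for a blocked-keyword hit."""
    if action == "redirect":
        # Hand the conversation to a human agent.
        return {"reply": fallback_message, "route_to_human": True, "ended": False}
    if action == "fallback":
        # Send the pre-configured fallback message and stay in the conversation.
        return {"reply": fallback_message, "route_to_human": False, "ended": False}
    if action == "end":
        # Close out the conversation gracefully.
        return {"reply": "Thanks for reaching out. This conversation has now ended.",
                "route_to_human": False, "ended": True}
    raise ValueError(f"unknown action: {action}")
```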
Adding a blocked keyword
- Go to AI Assistant > AI Safety.
- In the Blocked Keywords section, click Add Keyword.
- Enter the word or phrase.
- Choose the action to take when it's detected (redirect, fallback message, or end conversation).
- Click Save.
Use this for topics that are completely off-limits — competitor names you don't want the AI to discuss, legally sensitive terms, or anything that should always go straight to a human.
Serious Mode
Serious Mode is triggered when the AI detects that a conversation involves a sensitive topic — things like mental health crises, legal threats, medical emergencies, or expressions of distress.
When Serious Mode activates, the AI shifts its behavior:
- It stops following its normal playbook and persona.
- It responds with calm, measured language focused on acknowledging the situation.
- It immediately escalates to a human agent or provides emergency resources, depending on your configuration.
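The behavior shift above can be sketched as a branch that bypasses the normal persona when Serious Mode is active. The function name, flag, and reply wording are all illustrative assumptions, not the product's actual code:

```python
# Hedged sketch of the Serious Mode behavior shift. Names and message
# wording are assumptions for illustration only.

def reply(serious_mode_active: bool, persona_reply: str) -> dict:
    """Return the response envelope, dropping the persona in Serious Mode."""
    if not serious_mode_active:
        # Normal path: the AI follows its playbook and persona.
        return {"reply": persona_reply, "escalate": False}
    # Serious Mode: calm acknowledgement, then immediate human escalation.
    return {
        "reply": ("I hear you, and I want to make sure you get the right support. "
                  "I'm bringing in a member of our team now."),
        "escalate": True,
    }
```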
Configuring Serious Mode
- Go to AI Assistant > AI Safety.
- In the Serious Mode section, toggle it on.
- Choose the escalation action (route to a specific team, send an emergency message, or both).
- Optionally, customize the message the AI sends when Serious Mode activates.
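As a mental model, the settings from the steps above might be represented like this. The field names and values are hypothetical; Serious Mode is configured entirely in the UI, and this sketch only mirrors the options listed above:

```python
# Hypothetical representation of a Serious Mode configuration.
# Field names are assumptions, not a real export format.

serious_mode_config = {
    "enabled": True,                     # on by default for new workspaces
    "escalation": "route_and_message",   # route to a team, send a message, or both
    "route_to_team": "Support",          # hypothetical team name
    "activation_message": (
        "I want to make sure you get the right help. "
        "I'm connecting you with a member of our team now."
    ),
}
```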
Serious Mode is enabled by default for all new workspaces. We strongly recommend keeping it on. The cost of a false positive (a slightly awkward handoff) is much lower than the cost of an AI mishandling a genuine crisis.
Reviewing safety events
Every time a safety guardrail fires — whether it's a blocked keyword or a Serious Mode activation — Autoch.at logs it. You can review these events in Insights > Analytics to understand how often your guardrails are being triggered and whether your configuration needs adjustment.
If you're seeing a lot of false positives (the AI escalating normal conversations unnecessarily), you may need to refine your blocked keyword list or adjust the sensitivity of Serious Mode.
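If you export or sample your safety events, a quick false-positive rate per trigger can tell you which guardrail needs tuning. The event fields below are assumptions for illustration, not Autoch.at's actual event schema:

```python
# Sketch of gauging false positives from reviewed safety events.
# The "trigger" and "human_verdict" fields are assumed, not a real schema.

events = [
    {"trigger": "serious_mode", "human_verdict": "genuine"},
    {"trigger": "serious_mode", "human_verdict": "false_positive"},
    {"trigger": "blocked_keyword", "human_verdict": "genuine"},
    {"trigger": "serious_mode", "human_verdict": "false_positive"},
]

def false_positive_rate(events: list[dict], trigger: str) -> float:
    """Fraction of events for a trigger that a human judged unnecessary."""
    fired = [e for e in events if e["trigger"] == trigger]
    if not fired:
        return 0.0
    fp = sum(1 for e in fired if e["human_verdict"] == "false_positive")
    return fp / len(fired)

print(round(false_positive_rate(events, "serious_mode"), 2))  # → 0.67
```

A high rate for one trigger suggests refining that keyword or lowering that sensitivity, rather than loosening all guardrails at once.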