
Build an email support agent

A support agent sounds like a triage agent with extra steps — but it isn’t. Triage decides what to do; support decides what to say. Saying the wrong thing in a customer-facing reply is the kind of mistake that ends up on a slide deck. The pattern in this recipe defends against that with two gates: a confidence threshold on knowledge-base matches, and a risk tier that escalates anything legally or commercially sensitive away from the agent entirely.

  1. Poll the support inbox for unread messages.
  2. Read each one, extract the question.
  3. Match against the knowledge base; get back a confidence score.
  4. Risk-tier the ticket and route accordingly.
  5. Draft a reply if the tier allows it; queue everything for human review.

The agent never hits send.

Confidence comes from your KB lookup (typically a vector search over articles + a re-ranker). The agent uses the score to decide what to do:

  Confidence    Action
  >= 0.85       Draft directly from the matched article
  0.60 – 0.85   Draft conservatively; cite the article inline so the reviewer can verify
  < 0.60        Don’t draft. Flag for manual review with a “best guess” KB article attached

The two-tier draft (confident vs. citation-required) is what keeps the reviewer’s job manageable — they trust the high-confidence drafts, scrutinize the medium-confidence ones, and write the low-confidence ones from scratch.
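The thresholds above reduce to a small dispatcher. This is a sketch — `action_for` is a hypothetical helper name, and the cut-offs are the ones from the table, which you would tune against your own KB:

```python
def action_for(conf: float) -> str:
    """Map a KB-match confidence score to a drafting action.

    Thresholds mirror the table above; tune them per KB.
    """
    if conf >= 0.85:
        return "draft"            # trust the match, draft directly
    if conf >= 0.60:
        return "draft_with_cite"  # draft, but cite the article inline
    return "flag"                 # too uncertain -- hand to a human
```

Keeping this as one pure function makes the thresholds easy to unit-test and easy to change when the matcher improves.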

Independent of confidence, every ticket also gets a risk tier:

  • Low — password resets, FAQ-shaped questions. Draft → human approves.
  • Medium — refund requests, account changes, anything affecting billing. Draft → human approves with extra scrutiny.
  • High — legal threats, regulatory matters, fraud reports. Skip drafting. Escalate immediately to a real person with full context attached.

Risk doesn’t care about confidence: a high-confidence KB match for a refund question still goes through human review. This is the lesson of the Air Canada chatbot ruling, where a tribunal held the airline to a refund policy its chatbot had invented — never let an agent commit your company to anything without a human in the loop.
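A first-pass risk classifier can be as crude as keyword matching. The sketch below is a hypothetical stand-in (in production you would more likely use an LLM call or a trained classifier), deliberately biased toward over-escalation:

```python
# Illustrative keyword lists -- extend from your own ticket history.
HIGH_RISK = ("lawsuit", "lawyer", "legal", "regulator", "fraud", "chargeback")
MEDIUM_RISK = ("refund", "billing", "cancel my account", "charge")

def classify_risk(text: str) -> str:
    """Keyword-based risk tiering; when in doubt, escalate upward."""
    lowered = text.lower()
    if any(k in lowered for k in HIGH_RISK):
        return "high"
    if any(k in lowered for k in MEDIUM_RISK):
        return "medium"
    return "low"
```

Checking the high-risk list first matters: “chargeback” should land in the high tier even though it also contains “charge”.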

If you’re running this on Manus, the agent is configured through a SKILL.md rather than code:

# Support agent
## Reply style
- Replies are under 120 words.
- Cite KB articles inline: [KB-1234](https://kb.example.com/1234).
- Match the tone of the inbound message.
## Drafting rules
- Always show the draft before sending. Never auto-send.
- If confidence < 0.6, do not draft — flag for human.
- Refunds, account changes: draft, but flag for extra reviewer scrutiny.
- Legal threats, regulatory matters, fraud reports: never draft. Escalate.
## Polling
- Check the support inbox every 10 minutes.
- Process at most 5 tickets per cycle while the agent is in shakedown.

The “always show the draft before sending” rule is the load-bearing constraint. Don’t remove it.

If you’re rolling this yourself instead of using Manus, the loop looks like:

def handle(msg):
    # High-risk topics bypass the KB entirely: no draft, straight to a human.
    if classify_risk(msg) == "high":
        escalate_to_human(msg, reason="high-risk topic")
        return
    question = extract_question(msg)
    article, conf = kb.search(question)
    if conf < 0.60:
        # Too uncertain to draft; attach the best-guess article for the reviewer.
        flag_for_review(msg, article)
        return
    # Medium confidence (0.60 - 0.85) gets an inline citation for the reviewer.
    draft = generate_draft(msg, article, cite_inline=(conf < 0.85))
    queue_for_approval(msg, draft, article)

queue_for_approval is the choke point. In production it usually drops the draft into a Slack channel or an internal review tool, not directly into the support inbox.
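As a sketch of that choke point: Slack’s incoming webhooks accept a JSON body with a `text` field, so the reviewable part is just a payload builder. The function names and the `post` callable below are assumptions — swap in whatever HTTP client and review tool you actually use:

```python
import json

def build_review_payload(subject: str, draft: str, article_id: str) -> str:
    """Format a draft as a Slack-style webhook payload for reviewer sign-off."""
    text = (
        f":inbox_tray: *{subject}*\n"
        f"Matched article: {article_id}\n"
        f"Proposed reply:\n>{draft}"
    )
    return json.dumps({"text": text})

def queue_for_approval(subject, draft, article_id, post):
    """`post` is whatever delivers the payload to your review channel."""
    post(build_review_payload(subject, draft, article_id))
```

Separating payload construction from delivery keeps the formatting unit-testable without a network.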

  • Start with --limit 5. Process five tickets per cycle while you’re tuning the KB matcher and the risk classifier. Bump to 20 once the false-positive rate is acceptable.
  • Group similar tickets. If the agent sees three “where’s my receipt?” tickets in a row, batch them — same KB article, same draft template, one reviewer pass.
  • Mind KB drift. Tickets the agent can’t confidently match are the strongest signal of where to write new KB articles. Track them.
  • Polling, not webhooks. Support inboxes typically have multiple recipients; webhook fan-out gets complicated. Polling every 5–15 minutes is simpler and the latency is acceptable for support contexts.
  • Human-in-the-loop is non-negotiable. Even at 99% accuracy, the 1% that makes legal commitments destroys trust faster than the 99% builds it.
  • Audit everything. For support, you’ll want a complete record of which articles were matched and which drafts were sent — log every classification, lookup, and approval decision to your own store.
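On that last point, a minimal audit trail can be an append-only JSON-lines file: one record per decision. The field names below are illustrative; the point is that every classification, lookup, and approval lands somewhere you control:

```python
import json
import time

def audit(log_path: str, event: str, **fields) -> dict:
    """Append one decision record to a JSON-lines audit log and return it."""
    record = {"ts": time.time(), "event": event, **fields}
    with open(log_path, "a") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Usage, with hypothetical ticket and article IDs:

```python
audit("audit.jsonl", "kb_match", ticket="T-1", article="KB-1234", conf=0.91)
audit("audit.jsonl", "queued_for_approval", ticket="T-1")
```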