# Build an AI email triage agent

Source: https://developer.nylas.com/docs/cookbook/agents/email-triage-agent/

A triage agent does what you'd ask an over-caffeinated EA to do: open each unread message, decide whether it needs you in the next hour, the next day, the next week, or never, draft replies for the urgent ones, and bulk-archive the rest. The pattern below runs as a cron job every fifteen minutes and costs roughly $0.002 per 100 emails using GPT-4o-mini for classification.

Most of the work happens in two prompts and three CLI calls.

## The four buckets

| Bucket | Meaning | Action |
| --- | --- | --- |
| `URGENT` | Production incident, executive ask | Draft a reply within the hour |
| `ACTION` | Code review, meeting follow-up | Draft a reply same-day |
| `FYI` | Status update, FYI thread | Leave it alone |
| `NOISE` | Newsletter, marketing, automated alert | Archive |

Four is the right number. Three loses fidelity (everything ends up in "important"). Five and the model starts confusing categories.

## The classification prompt

Run it with `temperature=0` and `max_tokens=10` — you want deterministic output and a single token of output, not a paragraph. The model gets sender + subject + 200-char snippet, not the full body. That's plenty for >90% accuracy and it keeps the per-email cost trivial.

```text
You triage email into one of four categories:

URGENT  — production incidents, executive requests; reply within 1 hour
ACTION  — code reviews, meeting follow-ups; reply same day
FYI     — informational, no response needed
NOISE   — newsletters, marketing, automated notifications

From:    {sender}
Subject: {subject}
Snippet: {snippet}

Return ONLY the category name. Nothing else.
```

Always validate the output against the four valid strings — LLMs occasionally invent a category. Fall back to `FYI` on anything unrecognized.

## The loop

```python

from openai import OpenAI

VALID = {"URGENT", "ACTION", "FYI", "NOISE"}
client = OpenAI()

def fetch_unread(limit=50):
    out = subprocess.run(
        ["nylas", "email", "list", "--unread", "--limit", str(limit), "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

def classify(msg):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        max_tokens=10,
        messages=[{"role": "user", "content": CLASSIFY_PROMPT.format(
            sender=msg["from"][0]["email"],
            subject=msg["subject"],
            snippet=msg["snippet"][:200],
        )}],
    )
    cat = resp.choices[0].message.content.strip()
    return cat if cat in VALID else "FYI"

def draft_reply(msg):
    resp = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.7,
        messages=[{"role": "user", "content": DRAFT_PROMPT.format(
            sender=msg["from"][0]["email"],
            subject=msg["subject"],
            body=msg["body"],
        )}],
    )
    body = resp.choices[0].message.content
    subprocess.run(
        ["nylas", "email", "draft",
         "--to", msg["from"][0]["email"],
         "--subject", "Re: " + msg["subject"],
         "--body", body, "--json"],
        check=True,
    )

def archive(msg):
    subprocess.run(["nylas", "email", "archive", msg["id"]], check=True)

for msg in fetch_unread():
    cat = classify(msg)
    if cat in ("URGENT", "ACTION"):
        draft_reply(msg)
    elif cat == "NOISE":
        archive(msg)
    # FYI: do nothing
```

Drafts land in your drafts folder — the agent never sends. You review, edit, and hit send (or not).

## The drafting prompt

```text
Write a short, professional reply to this email. Three sentences max. Be direct.

From:    {sender}
Subject: {subject}
Body:    {body}

Reply:
```

Higher temperature here (0.7) lets the model produce natural prose. The "three sentences max" is load-bearing — without it, you'll get drafts that read like a politely overcompensating intern.

## Cron it

```cron
*/15 * * * * /usr/bin/python3 /opt/triage/triage.py >> /var/log/triage.log 2>&1
```

The whole thing is idempotent — only `--unread` messages are pulled, and once you read a draft (or the original) it falls out of subsequent runs.

## Cost math

GPT-4o-mini classification: ~$0.15 per 1M input tokens. A 200-char snippet plus the prompt is ~150 tokens. 100 emails ≈ 15K tokens ≈ $0.002. Drafting uses GPT-4o (~$2.50 / 1M input) but only on the URGENT + ACTION subset — typically under 20% of the inbox. A heavy day at 200 unread emails costs roughly a nickel.

## Privacy mode

For mail you don't want hitting OpenAI / Anthropic, swap to a local Ollama:

```python
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
# model="llama3.1" or similar
```

Llama 3.1 classifies almost as well as GPT-4o-mini for this task. Drafting quality drops noticeably with smaller models — keep that on a hosted LLM unless you're running a 70B+ parameter local model.

## Things to know

- **Only `--unread`.** The agent never re-classifies what it already drafted. If you re-read a draft, the original message stays read; the next cron skips it.
- **No auto-send.** Always land in drafts. The cost of a wrong send (to the wrong person, at the wrong tone) is way higher than the friction of one extra click.
- **Tune per inbox.** Eng inboxes hit URGENT differently than sales inboxes. Customize the four categories and the prompt to your context.

## Next steps

- [Email support agent](/docs/cookbook/agents/email-support-agent/) — adds knowledge-base lookup
- [Reach inbox zero](/docs/cookbook/agents/inbox-zero/) — interactive variant
- [Build an LLM agent with email & calendar tools](/docs/cookbook/cli/llm-agent-with-tools/)