A single send-and-forget email is easy. A conversation that spans five exchanges over three days is harder. The agent needs to remember what it said, what the other person said, what it’s waiting for, and where in the workflow it is — across process restarts, deploy cycles, and hours of silence between replies.
This recipe builds that loop: the agent sends, waits for a reply, restores context, decides what to say, replies, and waits again. It runs entirely on webhooks and the Threads API so there’s no polling and no missed messages.
## What you’ll build
- Send an initial outbound message and persist the conversation state.
- Receive replies via the `message.created` webhook.
- Restore the full conversation history from the thread.
- Feed the history to an LLM to generate the next reply.
- Send the reply in-thread and update the state.
- Handle the conversation lifecycle: escalation, completion, and dormancy.
## Before you begin
- An Agent Account with a `message.created` webhook subscribed. See Give your agent its own email.
- A durable data store (Postgres, Redis, DynamoDB, or similar) for conversation state. In-memory won’t survive restarts, and email conversations span hours.
- Access to an LLM for generating replies. The examples keep this abstract — any OpenAI-compatible API works.
## The conversation state model
Every active conversation needs a record that maps the Nylas `thread_id` to the agent’s internal state.
```js
// What the agent stores per conversation
const conversationRecord = {
  threadId: "nylas-thread-id",
  grantId: AGENT_GRANT_ID,
  contactName: "Alice Chen",
  purpose: "demo_followup", // What started this conversation
  step: "awaiting_reply", // Where in the workflow we are
  turnCount: 1, // How many exchanges have happened
  maxTurns: 10, // Safety cap before escalation
  createdAt: "2026-04-14T10:00:00Z",
  lastActivityAt: "2026-04-14T10:00:00Z",
  metadata: {}, // Workflow-specific data
};
```

The `step` field is the heart of it. It tracks what the agent is waiting for and determines how it handles the next inbound message.
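One way to keep `step` trustworthy is an explicit transition map, so the conversation can never wander into an undefined state. This is a hypothetical sketch: the step names other than `awaiting_reply`, `escalated`, and `completed` are illustrative, and the map itself is workflow-specific, not part of any API.

```javascript
// Hypothetical workflow state machine. Only the transitions listed here
// are legal; anything else keeps the conversation on its current step.
const TRANSITIONS = {
  awaiting_reply: ["awaiting_reply", "qualifying", "escalated", "completed"],
  qualifying: ["awaiting_reply", "escalated", "completed"],
  escalated: [], // Terminal: a human owns the thread now.
  completed: [], // Terminal: purpose fulfilled.
};

function applyStep(currentStep, proposedStep) {
  const allowed = TRANSITIONS[currentStep] ?? [];
  return allowed.includes(proposedStep) ? proposedStep : currentStep;
}
```

Running proposed steps through `applyStep` before persisting them means a bad value can degrade gracefully instead of corrupting the record.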
## Start the conversation
When the agent initiates contact, it sends the first message and creates the conversation record.
```js
async function startConversation({ to, subject, body, purpose, metadata }) {
  const sent = await nylas.messages.send({
    identifier: AGENT_GRANT_ID,
    requestBody: {
      to: [{ email: to.email, name: to.name }],
      subject,
      body,
    },
  });

  // Persist the conversation state keyed by thread_id.
  await db.conversations.create({
    threadId: sent.data.threadId,
    grantId: AGENT_GRANT_ID,
    contactEmail: to.email,
    contactName: to.name,
    purpose,
    step: "awaiting_reply",
    turnCount: 1,
    maxTurns: 10,
    createdAt: new Date().toISOString(),
    lastActivityAt: new Date().toISOString(),
    metadata: metadata ?? {},
  });

  return sent.data;
}
```

## Handle inbound replies
When a reply arrives, the webhook handler looks up the conversation, rebuilds history, and passes it to the LLM.
```js
app.post("/webhooks/nylas", async (req, res) => {
  res.status(200).end();

  const event = req.body;
  if (event.type !== "message.created") return;

  const msg = event.data.object;
  if (msg.grant_id !== AGENT_GRANT_ID) return;

  // Skip messages the agent sent (outbound fires message.created too).
  if (msg.from?.[0]?.email === agentEmail) return;

  const conversation = await db.conversations.findByThreadId(msg.thread_id);
  if (!conversation) {
    // New inbound message, not a reply to something we sent.
    await triageNewInbound(msg);
    return;
  }

  await continueConversation(msg, conversation);
});
```

## Restore context and generate a reply
Fetch the full thread from Nylas so the LLM has the complete conversation, not just the latest message.
```js
async function continueConversation(msg, conversation) {
  // Fetch full body (webhook only has summary fields).
  const fullMessage = await nylas.messages.find({
    identifier: AGENT_GRANT_ID,
    messageId: msg.id,
  });

  // Pull the entire thread so the LLM sees the full exchange.
  const thread = await nylas.threads.find({
    identifier: AGENT_GRANT_ID,
    threadId: conversation.threadId,
  });

  // Fetch every message in the thread for full conversation history.
  const allMessages = await Promise.all(
    thread.data.messageIds.map((id) =>
      nylas.messages.find({ identifier: AGENT_GRANT_ID, messageId: id }),
    ),
  );

  // Format as a conversation transcript for the LLM.
  const transcript = allMessages
    .map((m) => m.data)
    .sort((a, b) => a.date - b.date)
    .map((m) => ({
      from: m.from?.[0]?.email, // Attribute each turn so the LLM knows who spoke.
      body: m.body,
      date: new Date(m.date * 1000).toISOString(),
    }));

  // Check lifecycle constraints before generating a reply.
  if (conversation.turnCount >= conversation.maxTurns) {
    await escalate(conversation, "max turns reached");
    return;
  }

  // Generate the reply.
  const replyBody = await llm.generateReply({
    purpose: conversation.purpose,
    step: conversation.step,
    transcript,
    metadata: conversation.metadata,
  });

  // Send in-thread.
  const sent = await nylas.messages.send({
    identifier: AGENT_GRANT_ID,
    requestBody: {
      replyToMessageId: msg.id,
      to: [{ email: conversation.contactEmail, name: conversation.contactName }],
      subject: `Re: ${thread.data.subject}`,
      body: replyBody.text,
    },
  });

  // Update state.
  await db.conversations.update(conversation.threadId, {
    step: replyBody.nextStep ?? "awaiting_reply",
    turnCount: conversation.turnCount + 1,
    lastActivityAt: new Date().toISOString(),
    metadata: { ...conversation.metadata, ...replyBody.metadata },
  });
}
```

The LLM receives the full transcript and the current workflow step, so it can generate a contextually appropriate reply. It also returns a `nextStep` value that advances the conversation state machine.
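The shape of `llm.generateReply` is up to you. One way to sketch its core, assuming an OpenAI-compatible chat API, an illustrative system prompt, and a transcript whose entries also record their sender, is a pure prompt builder that turns the conversation into a messages array (the actual API call and JSON parsing are omitted; the `{ text, nextStep, metadata }` response contract is an assumption, not part of any SDK):

```javascript
// Hypothetical prompt builder behind llm.generateReply.
function buildPrompt({ purpose, step, transcript, agentEmail }) {
  const system = [
    `You are an email agent. Conversation purpose: ${purpose}.`,
    `Current workflow step: ${step}.`,
    `Reply with JSON: { "text": "...", "nextStep": "...", "metadata": {} }.`,
  ].join("\n");

  // Map each transcript entry to a chat turn, attributed by sender.
  const turns = transcript.map((m) => ({
    role: m.from === agentEmail ? "assistant" : "user",
    content: m.body,
  }));

  return [{ role: "system", content: system }, ...turns];
}
```

Because the builder is pure, it is easy to unit-test the prompt shape without ever calling the model.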
## Handle lifecycle events
Not every conversation ends neatly. Build handlers for the edges.
### Escalation
When the agent hits its turn limit, encounters a topic it can’t handle, or detects frustration, hand the conversation to a human.
```js
async function escalate(conversation, reason) {
  await db.conversations.update(conversation.threadId, {
    step: "escalated",
    metadata: { ...conversation.metadata, escalationReason: reason },
  });

  // Notify the human -- Slack, PagerDuty, internal API, whatever fits.
  await notifyHumanOperator({
    threadId: conversation.threadId,
    contact: conversation.contactEmail,
    reason,
  });
}
```

### Completion
When the agent determines the conversation’s purpose is fulfilled (the prospect booked a meeting, the support question was answered), mark it done so future messages on the same thread get handled correctly.
```js
async function completeConversation(conversation) {
  await db.conversations.update(conversation.threadId, {
    step: "completed",
    lastActivityAt: new Date().toISOString(),
  });
}
```

### Dormant threads
Someone might reply to a conversation that’s been inactive for weeks. Decide up front what happens: re-read the thread and resume, escalate, or send a fresh introduction.
```js
// In the webhook handler, before calling continueConversation:
const hoursSinceLastActivity =
  (Date.now() - new Date(conversation.lastActivityAt).getTime()) / 3600000;

if (hoursSinceLastActivity > 168) {
  // Over a week of silence -- escalate instead of auto-replying.
  await escalate(conversation, "dormant thread reopened after 7+ days");
  return;
}
```

## Keep in mind
- Filter out the agent’s own messages. `message.created` fires for outbound too. If you don’t check `msg.from`, the agent will try to reply to itself.
- Batch rapid replies. If someone sends two messages in quick succession, a short delay (30-60 seconds) before responding lets you treat them as one turn instead of generating two separate replies.
- Cap the conversation length. An unbounded conversation loop is a token sink and a risk. The `maxTurns` field is there for a reason — set it based on what’s realistic for the workflow.
- Persist conversation state durably. Redis with AOF, Postgres, DynamoDB — anything that survives restarts. The gap between messages can be days.
- The LLM doesn’t need every message. For long threads, summarize earlier messages and pass only the last 3-4 in full. This keeps token usage reasonable without losing critical context.
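The last point can be sketched as a small helper. This is a minimal version under one assumption: the caller supplies a `summarize` function (for example, a cheap LLM call) for the older messages; it is not part of the Nylas SDK.

```javascript
// Keep the last `keep` messages verbatim and collapse everything earlier
// into a single summary entry at the front of the transcript.
async function trimTranscript(transcript, { keep = 4, summarize }) {
  if (transcript.length <= keep) return transcript;

  const older = transcript.slice(0, -keep);
  const recent = transcript.slice(-keep);

  return [
    { body: `[Summary of ${older.length} earlier messages] ${await summarize(older)}` },
    ...recent,
  ];
}
```

Run the transcript through this before handing it to the LLM; short conversations pass through untouched, and long ones keep their recent turns intact.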
## What’s next
- Handle email replies in an agent loop — the simpler single-reply recipe this builds on
- Prevent duplicate agent replies — dedup patterns for when multiple webhooks fire close together
- Email threading for agents — how Message-ID, In-Reply-To, and References headers work
- Build a support agent — end-to-end tutorial applying this pattern to customer support
- Policies, Rules, and Lists — filter inbound to reduce noise before it reaches the agent