A single send-and-forget email is easy. A conversation that spans five exchanges over three days is harder. The agent has to remember what it said, what the other person said, what it’s waiting for, and where in the workflow it is — across process restarts, deploys, and hours of silence between replies.
This recipe builds that loop. The agent sends, waits for a reply, restores context, decides what to say, replies, and waits again. It runs entirely on webhooks and the Threads API, so there’s no polling and no missed messages.
The conversation state model
Every active conversation needs a record that maps the Nylas `thread_id` to the agent’s internal state.
```javascript
// What the agent stores per conversation
const conversationRecord = {
  threadId: "nylas-thread-id",
  grantId: AGENT_GRANT_ID,
  contactName: "Alice Chen",
  purpose: "demo_followup", // What started this conversation
  step: "awaiting_reply", // Where in the workflow we are
  turnCount: 1, // How many exchanges have happened
  maxTurns: 10, // Safety cap before escalation
  createdAt: "2026-04-14T10:00:00Z",
  lastActivityAt: "2026-04-14T10:00:00Z",
  metadata: {}, // Workflow-specific data
};
```

The `step` field is the heart of it. It tracks what the agent is waiting for and determines how it handles the next inbound message.
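The workflow behind `step` can be made explicit as a small transition map, which guards against a bad LLM output corrupting the record. This is a sketch under assumptions: the `negotiating` step is hypothetical, while `awaiting_reply`, `completed`, and `escalated` appear later in this recipe.

```javascript
// Allowed transitions per step. Terminal steps have no successors.
const STEPS = {
  awaiting_reply: { next: ["negotiating", "completed", "escalated"] },
  negotiating: { next: ["awaiting_reply", "completed", "escalated"] },
  completed: { next: [] }, // Terminal: no further agent replies
  escalated: { next: [] }, // Terminal: a human owns the thread
};

// Accept a proposed step only if the transition is legal; otherwise
// keep the current step unchanged.
function advanceStep(current, proposed) {
  const allowed = STEPS[current]?.next ?? [];
  return allowed.includes(proposed) ? proposed : current;
}

console.log(advanceStep("awaiting_reply", "completed")); // "completed"
console.log(advanceStep("completed", "awaiting_reply")); // stays "completed"
```

Validating the LLM's proposed step this way means a hallucinated state name degrades to a no-op instead of wedging the conversation.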
Start the conversation
When the agent initiates contact, it sends the first message and creates the conversation record.
```javascript
async function startConversation({ to, subject, body, purpose, metadata }) {
  const sent = await nylas.messages.send({
    identifier: AGENT_GRANT_ID,
    requestBody: {
      to: [{ email: to.email, name: to.name }],
      subject,
      body,
    },
  });

  // Persist the conversation state keyed by thread_id.
  await db.conversations.create({
    threadId: sent.data.threadId,
    grantId: AGENT_GRANT_ID,
    contactEmail: to.email,
    contactName: to.name,
    purpose,
    step: "awaiting_reply",
    turnCount: 1,
    maxTurns: 10,
    createdAt: new Date().toISOString(),
    lastActivityAt: new Date().toISOString(),
    metadata: metadata ?? {},
  });

  return sent.data;
}
```

Handle inbound replies
When a reply arrives, the webhook handler looks up the conversation, rebuilds history, and passes it to the LLM.
```javascript
app.post("/webhooks/nylas", async (req, res) => {
  res.status(200).end();

  const event = req.body;
  if (event.type !== "message.created") return;

  const msg = event.data.object;
  if (msg.grant_id !== AGENT_GRANT_ID) return;

  // Skip messages the agent sent (outbound fires message.created too).
  if (msg.from?.[0]?.email === agentEmail) return;

  const conversation = await db.conversations.findByThreadId(msg.thread_id);
  if (!conversation) {
    // New inbound message, not a reply to something we sent.
    await triageNewInbound(msg);
    return;
  }

  await continueConversation(msg, conversation);
});
```

Restore context and generate a reply
Fetch the full thread from Nylas so the LLM has the complete conversation, not just the latest message.
```javascript
async function continueConversation(msg, conversation) {
  // Fetch full body (webhook only has summary fields).
  const fullMessage = await nylas.messages.find({
    identifier: AGENT_GRANT_ID,
    messageId: msg.id,
  });

  // Pull the entire thread so the LLM sees the full exchange.
  const thread = await nylas.threads.find({
    identifier: AGENT_GRANT_ID,
    threadId: conversation.threadId,
  });

  // Fetch every message in the thread for full conversation history.
  const allMessages = await Promise.all(
    thread.data.messageIds.map((id) =>
      nylas.messages.find({ identifier: AGENT_GRANT_ID, messageId: id }),
    ),
  );

  // Format as a conversation transcript for the LLM.
  const transcript = allMessages
    .map((m) => m.data)
    .sort((a, b) => a.date - b.date)
    .map((m) => ({
      body: m.body,
      date: new Date(m.date * 1000).toISOString(),
    }));

  // Check lifecycle constraints before generating a reply.
  if (conversation.turnCount >= conversation.maxTurns) {
    await escalate(conversation, "max turns reached");
    return;
  }

  // Generate the reply.
  const replyBody = await llm.generateReply({
    purpose: conversation.purpose,
    step: conversation.step,
    transcript,
    metadata: conversation.metadata,
  });

  // Send in-thread.
  const sent = await nylas.messages.send({
    identifier: AGENT_GRANT_ID,
    requestBody: {
      replyToMessageId: msg.id,
      to: [{ email: conversation.contactEmail, name: conversation.contactName }],
      subject: `Re: ${thread.data.subject}`,
      body: replyBody.text,
    },
  });

  // Update state.
  await db.conversations.update(conversation.threadId, {
    step: replyBody.nextStep ?? "awaiting_reply",
    turnCount: conversation.turnCount + 1,
    lastActivityAt: new Date().toISOString(),
    metadata: { ...conversation.metadata, ...replyBody.metadata },
  });
}
```

The LLM receives the full transcript and the current workflow step, so it can generate a contextually appropriate reply. It also returns a `nextStep` value that advances the conversation state machine.
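The code above assumes `llm.generateReply` returns a structured object (`text`, `nextStep`, `metadata`) rather than raw prose. One way to get that shape reliably is a defensive parser between the model and the state update; this is a hypothetical sketch assuming the model is prompted to emit JSON.

```javascript
// Coerce the model's raw output into the { text, nextStep, metadata }
// shape the rest of the pipeline expects. Falls back safely when the
// model returns plain prose instead of JSON.
function parseLlmReply(raw) {
  try {
    const parsed = JSON.parse(raw);
    return {
      text: String(parsed.text ?? ""),
      nextStep: typeof parsed.nextStep === "string" ? parsed.nextStep : "awaiting_reply",
      metadata:
        parsed.metadata && typeof parsed.metadata === "object" ? parsed.metadata : {},
    };
  } catch {
    // Not JSON: treat the whole output as the reply text and stay in
    // the default waiting state.
    return { text: String(raw), nextStep: "awaiting_reply", metadata: {} };
  }
}
```

Defaulting `nextStep` to `"awaiting_reply"` on any malformed output keeps the conversation loop alive instead of stalling it on a parse failure.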
Handle lifecycle events
Not every conversation ends neatly. Build handlers for the edges.
Escalation
When the agent hits its turn limit, encounters a topic it can’t handle, or detects frustration, hand the conversation to a human.
```javascript
async function escalate(conversation, reason) {
  await db.conversations.update(conversation.threadId, {
    step: "escalated",
    metadata: { ...conversation.metadata, escalationReason: reason },
  });

  // Notify the human -- Slack, PagerDuty, internal API, whatever fits.
  await notifyHumanOperator({
    threadId: conversation.threadId,
    contact: conversation.contactEmail,
    reason,
  });
}
```

Completion
When the agent determines the conversation’s purpose is fulfilled (the prospect booked a meeting, the support question was answered), mark it done so future messages on the same thread get handled correctly.
```javascript
async function completeConversation(conversation) {
  await db.conversations.update(conversation.threadId, {
    step: "completed",
    lastActivityAt: new Date().toISOString(),
  });
}
```

Dormant threads
Someone might reply to a conversation that’s been inactive for weeks. Decide up front what happens: re-read the thread and resume, escalate, or send a fresh introduction.
```javascript
// In the webhook handler, before calling continueConversation:
const hoursSinceLastActivity =
  (Date.now() - new Date(conversation.lastActivityAt).getTime()) / 3600000;

if (hoursSinceLastActivity > 168) {
  // Over a week of silence -- escalate instead of auto-replying.
  await escalate(conversation, "dormant thread reopened after 7+ days");
  return;
}
```

Things to know
- **Filter out the agent’s own messages.** `message.created` fires for outbound too. If you don’t check `msg.from`, the agent will try to reply to itself.
- **Batch rapid replies.** If someone sends two messages in quick succession, a short delay (30–60 seconds) before responding lets you treat them as one turn instead of generating two separate replies.
- **Cap the conversation length.** An unbounded conversation loop is a token sink and a risk. The `maxTurns` field is there for a reason — set it based on what’s realistic for the workflow.
- **Persist conversation state durably.** Redis with AOF, Postgres, DynamoDB — anything that survives restarts. The gap between messages can be days.
- **The LLM doesn’t need every message.** For long threads, summarize earlier messages and pass only the last 3–4 in full. This keeps token usage reasonable without losing critical context.
- **Don’t ship without dedup and locking.** The race between webhook redelivery and concurrent workers shows up at any volume; treat it as a first-class concern, not an edge case.
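The transcript-trimming point can be sketched as a pure function over the transcript array built earlier. In this sketch the placeholder summary is just a message count; in practice you would substitute an actual LLM-generated summary of the older messages.

```javascript
// Keep the last `keepFull` messages verbatim and collapse everything
// older into a single placeholder entry.
function trimTranscript(transcript, keepFull = 4) {
  if (transcript.length <= keepFull) return transcript;
  const older = transcript.slice(0, -keepFull);
  const recent = transcript.slice(-keepFull);
  return [
    {
      body: `[${older.length} earlier messages omitted]`,
      date: older[older.length - 1].date,
    },
    ...recent,
  ];
}
```

Because the output has the same `{ body, date }` shape as the input, it drops straight into the `llm.generateReply` call without other changes.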
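The dedup point can be sketched as an idempotency guard keyed by message id, called at the top of the webhook handler before any processing. This in-process version only illustrates the shape; a real deployment needs an atomic store (Redis `SET NX` or a Postgres unique constraint) so the claim holds across workers and restarts.

```javascript
// In-process idempotency guard: the first caller to claim a message id
// wins; redeliveries of the same id are rejected.
const seen = new Map();
const TTL_MS = 24 * 60 * 60 * 1000; // Forget ids after a day

function claimMessage(messageId, now = Date.now()) {
  // Evict expired entries so the map doesn't grow without bound.
  for (const [id, ts] of seen) {
    if (now - ts > TTL_MS) seen.delete(id);
  }
  if (seen.has(messageId)) return false; // Redelivery: already handled
  seen.set(messageId, now);
  return true; // First delivery: safe to process
}
```

The handler then becomes `if (!claimMessage(msg.id)) return;` before the conversation lookup, so redelivered webhooks never generate a second reply.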
Next steps
- Handle email replies in an agent loop — the simpler single-reply recipe this builds on
- Prevent duplicate agent replies — dedup patterns for when multiple webhooks fire close together
- Migrate from transactional email — context if you’re moving from SendGrid/Resend/Postmark to a full mailbox
- Email threading for agents — how Message-ID, In-Reply-To, and References headers work
- Policies, Rules, and Lists — filter inbound to reduce noise before it reaches the agent