Get a meeting transcript and recording

Your bot sat through the meeting and recorded everything. Now you need the files: the transcript, the MP4 recording, the summary, and the action items. The catch is that the download links Nylas hands back expire after 3,600 seconds, so once you fetch them you have one hour to pull the bytes onto your own storage before the URLs stop working.

This recipe shows how to fetch the media for a finished meeting, what each object in the response holds, and how to download the files before the links go stale. If you haven’t created a bot yet, a POST /v3/grants/{grant_id}/notetakers call sends one into a meeting. See the Notetaker API guide for the full setup.

Make a GET request to the media endpoint with the grant ID and the notetaker ID. Nylas returns one object per media type that was enabled for the meeting, each with a pre-authenticated url you can download without sending your API key again. The response carries up to five objects: recording, transcript, summary, action_items, and thumbnail.

The endpoint is read-only and idempotent, so you can call it as often as you need. It only returns media once the notetaker reaches the media_available state. Call it too early and you’ll get a 404; call it after the retention window and you’ll get a 410.
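As a minimal sketch of that call using only the Python standard library: the base URL, the `/media` path (extrapolated from the `POST /v3/grants/{grant_id}/notetakers` route), the Bearer header, and the two exception classes are assumptions made for illustration, so check them against the API reference before relying on them.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.us.nylas.com"  # assumption: US region base URL


class MediaNotReady(Exception):
    """Hypothetical error: the notetaker is not in media_available yet (404)."""


class MediaExpired(Exception):
    """Hypothetical error: the retention window has passed (410)."""


def media_endpoint(grant_id: str, notetaker_id: str) -> str:
    # Assumed path; it is not spelled out in the text above.
    return f"{API_BASE}/v3/grants/{grant_id}/notetakers/{notetaker_id}/media"


def get_media(api_key: str, grant_id: str, notetaker_id: str) -> dict:
    """Fetch the media objects and map 404 and 410 to explicit errors."""
    request = urllib.request.Request(
        media_endpoint(grant_id, notetaker_id),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Accept": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            raise MediaNotReady(notetaker_id) from err
        if err.code == 410:
            raise MediaExpired(notetaker_id) from err
        raise
    # The media objects sit under the top-level "data" key.
    return body["data"]
```

Separating the two status codes lets a caller retry later on `MediaNotReady` and give up for good on `MediaExpired`.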

The data object holds up to five media objects, and each one bundles a download url with metadata like size, type, and ttl. Which objects appear depends on the settings the bot used: a video meeting returns all five, while an audio-only run skips the thumbnail. The table below maps each object to its MIME type and contents.

| Object | Type | Contents |
| --- | --- | --- |
| recording | video/mp4 | Full audio/video recording of the meeting. |
| transcript | application/json | Speaker-labelled transcript with text segments and timestamps. |
| summary | application/json | Short text summary of what the meeting covered. |
| action_items | application/json | List of action items pulled from the conversation. |
| thumbnail | image/png | Still frame captured around the midpoint of the recording. |

A couple of field details matter when you store this data. The recording.duration is the meeting length in seconds, so a value of 1800 means a 30-minute call. The recording.size is in bytes, so 52428800 is a 50 MB file (50 × 1024 × 1024 bytes). Both help you decide whether to stream the download or buffer it in memory. For the full field list, see Handling Notetaker media files.
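The unit conversions and the stream-or-buffer decision can be sketched as below. The `describe_recording` helper and the 10 MiB cut-off are hypothetical; pick a threshold that fits the memory available to your workers.

```python
BUFFER_LIMIT_BYTES = 10 * 1024 * 1024  # hypothetical cut-off, tune for your workers


def describe_recording(recording: dict) -> dict:
    """Convert the raw size and duration fields into friendlier units."""
    size_bytes = recording["size"]
    duration_seconds = recording["duration"]
    return {
        "minutes": duration_seconds / 60,
        "mebibytes": size_bytes / (1024 * 1024),
        # Stream anything larger than the limit instead of holding it in memory.
        "strategy": "stream" if size_bytes > BUFFER_LIMIT_BYTES else "buffer",
    }


info = describe_recording({"size": 52428800, "duration": 1800})
print(info)  # {'minutes': 30.0, 'mebibytes': 50.0, 'strategy': 'stream'}
```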

Every media url carries a ttl of 3,600 seconds, which is the one-hour window you have to download the file before the link stops working. The fix is simple: as soon as you get the response, fetch each url and write the bytes to your own disk or object storage. The same download pattern works for the recording, transcript, summary, action items, and thumbnail.
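One way to do the download, sketched with the standard library only. The `download_media` name and the `.part` temporary-file convention are illustrative choices, not part of the Nylas API.

```python
import shutil
import urllib.request
from pathlib import Path


def download_media(url: str, destination: Path, chunk_size: int = 1024 * 1024) -> int:
    """Stream a pre-authenticated media URL to disk; return the bytes written."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    partial = destination.with_name(destination.name + ".part")
    # No Authorization header here: the URL itself carries the access grant.
    with urllib.request.urlopen(url, timeout=60) as response, open(partial, "wb") as out:
        # Copy in chunks so a large recording never sits fully in memory.
        shutil.copyfileobj(response, out, chunk_size)
    # Rename only after the copy finishes, so readers never see a partial file.
    partial.replace(destination)
    return destination.stat().st_size
```

With a parsed media response in a variable named `media`, a call would look like `download_media(media["recording"]["url"], Path("media/recording.mp4"))`.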

If an hour passes before you download, the link is dead. Don’t try to “refresh” the same URL: call the Get Media endpoint again and Nylas mints a new set of URLs with a fresh 3,600-second window.

Notetaker media has a handful of behaviors that shape how you build a reliable download pipeline. The points below cover link expiry, the transcript shape, language settings, how to know when files are ready, and where the recordings actually live.

The ttl on each url is 3,600 seconds, and once it passes the link returns an error on any download attempt. This is a security constraint, not a bug: short-lived URLs limit exposure if one gets logged or shared by accident. Build your pipeline to re-fetch rather than cache URLs, because a stored link is worthless an hour later. The Get Media call is the only way to get fresh ones.
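A small sketch of the re-fetch rule: remember when the URLs were fetched and ask for a new set once they get close to the one-hour mark. `FetchedMedia`, `fresh_media`, and the 60-second safety margin are hypothetical names and values; `refetch` stands in for whatever function calls Get Media in your code.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

URL_TTL_SECONDS = 3600  # the link lifetime described above


@dataclass
class FetchedMedia:
    media: dict
    fetched_at: float = field(default_factory=time.time)

    def urls_usable(self, now: Optional[float] = None, safety_margin: float = 60.0) -> bool:
        """True while the URLs are comfortably inside their one-hour window."""
        now = time.time() if now is None else now
        return (now - self.fetched_at) < (URL_TTL_SECONDS - safety_margin)


def fresh_media(
    cached: Optional[FetchedMedia],
    refetch: Callable[[], dict],
    now: Optional[float] = None,
) -> FetchedMedia:
    """Reuse a recent response, otherwise call Get Media again via refetch."""
    if cached is not None and cached.urls_usable(now):
        return cached
    return FetchedMedia(media=refetch())
```

The timestamp is recorded locally at fetch time, so the check does not depend on how any response field is interpreted.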

Separately, Nylas keeps the underlying files for a maximum of 14 days, tracked by the expires_at and ttl fields on each object. After that the files are deleted for good, so download anything you want to keep well inside that window.

For most meetings the transcript file is a JSON object with type: "speaker_labelled", a language code, and a transcript array. Each array entry has a speaker name, a text segment, and start and end times in milliseconds. So a single speaker’s turn becomes one or more timed segments you can render as a caption track or search by timestamp.

In rare cases Nylas returns type: "raw" instead, where transcript is a plain string with no speakers or timing. Check the type field before you parse so your handler covers both shapes.
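A handler that covers both shapes might look like the sketch below. The entry key names (`speaker`, `text`, `start`, `end`) are assumptions inferred from the description above, and `Segment` and `parse_transcript` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    speaker: Optional[str]
    text: str
    start_ms: Optional[int]
    end_ms: Optional[int]


def parse_transcript(doc: dict) -> list[Segment]:
    """Normalize a transcript file into segments, whatever its type."""
    kind = doc.get("type")
    body = doc.get("transcript")
    if kind == "speaker_labelled":
        # Assumed entry keys: speaker, text, start, end (times in milliseconds).
        return [Segment(e["speaker"], e["text"], e["start"], e["end"]) for e in body]
    if kind == "raw":
        # A raw transcript is one plain string with no speakers or timing.
        return [Segment(None, body, None, None)]
    raise ValueError(f"unexpected transcript type: {kind!r}")
```

Mapping the raw shape to a single untimed segment lets downstream code iterate over one list type in both cases.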

Notetaker auto-detects the spoken language and reports it in the transcript’s language field. If your meetings run in a known set of languages, pass transcription_settings.expected_languages when you create the bot so the detected language usually matches one of your codes. This single setting noticeably improves accuracy on multilingual calls. See the supported language codes for the full list.
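For illustration, a request body that sets this option could be built as follows. Only `transcription_settings.expected_languages` comes from the text above; the `meeting_link` field name, the helper name, and the example codes `en` and `es` are assumptions, so confirm them against the API reference and the supported language list.

```python
import json


def notetaker_request_body(meeting_link: str, expected_languages: list[str]) -> str:
    """Build a JSON body for creating a bot with expected languages set."""
    body = {
        "meeting_link": meeting_link,  # assumed field name
        "transcription_settings": {
            "expected_languages": expected_languages,
        },
    }
    return json.dumps(body)


payload = notetaker_request_body("https://meet.example.com/abc", ["en", "es"])
```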

You have two ways to learn that media is ready. Polling means calling Get Media on a loop, but you’ll burn requests and eat 404s until processing finishes a few minutes after the bot leaves. The better path is the notetaker.media webhook, which fires once with state: "available" and includes the same download URLs. Wire that into your download step and you react in seconds instead of guessing. See the Notetaker webhooks recipe for a full handler.
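The webhook path can be sketched as a filter over incoming events. The payload layout used here (`data.object.state` and `data.object.media`) is an assumption, and `media_urls_from_webhook` is a hypothetical helper, so verify the shape against a real notification before relying on it.

```python
from typing import Optional


def media_urls_from_webhook(event: dict) -> Optional[dict]:
    """Return the media payload for a ready notetaker, or None to ignore the event."""
    if event.get("type") != "notetaker.media":
        return None
    # Assumed layout: the notetaker object is nested under data.object.
    notetaker = event.get("data", {}).get("object", {})
    if notetaker.get("state") != "available":
        return None
    return notetaker.get("media", {})
```

Returning `None` for everything else keeps the download step from firing on unrelated or not-yet-ready events.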

Recordings are sensitive, so store them carefully

A meeting recording is some of the most sensitive data your app will touch. The pre-authenticated URLs work like bearer credentials: anyone holding one can download the file for 3,600 seconds with no further auth. Don’t expose them in a frontend where they show up in browser network tabs, and proxy downloads through your backend instead. Once the files land on your infrastructure, put authentication, access control, and audit logging in front of them.