Exploring Teams' JSON Transcript

Fri, Feb 2, 2024 One-minute read

I was recently interested in doing some data analysis of a Microsoft Teams transcript - specifically to understand if I was talking too much and not giving other folks the opportunity to speak.

If you’ve ever wondered something similar (or wanted to do data analysis for other reasons), you might have seen the option to download meeting transcripts. The default options are docx files (Word documents) and VTT files:

The VTT files are handy, but turning that format into something you can process means writing a small parser. Not a complicated one, but something custom nevertheless.

powershell_parse_vtt_file.png
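For anyone who would rather do this in Python, here is a minimal sketch of that kind of parser. It assumes each cue in the Teams VTT file carries the speaker in a WebVTT voice tag (`<v Speaker Name>text</v>`); if your export looks different, the `VOICE` pattern needs adjusting. The function names `parse_vtt` and `seconds_per_speaker` are made up for this example.

```python
import re
from collections import defaultdict

# Cue timing line, e.g. "00:00:03.000 --> 00:00:05.500" (hours are optional in WebVTT)
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)
# Voice tag, e.g. "<v Alice>Hello</v>" (assumed format; the closing tag is optional)
VOICE = re.compile(r"<v\s+([^>]+)>(.*?)(?:</v>|$)", re.DOTALL)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp parts to seconds."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(text):
    """Return a list of cues as dicts with start, end, speaker, and text."""
    cues = []
    # Cues are separated by blank lines
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n")):
        lines = block.strip().split("\n")
        for index, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                break
        else:
            continue  # No timing line: the WEBVTT header or a note block
        parts = match.groups()
        payload = "\n".join(lines[index + 1:])
        voice = VOICE.search(payload)
        cues.append({
            "start": to_seconds(*parts[:4]),
            "end": to_seconds(*parts[4:]),
            "speaker": voice.group(1).strip() if voice else "Unknown",
            "text": (voice.group(2) if voice else payload).strip(),
        })
    return cues


def seconds_per_speaker(cues):
    """Total up cue durations for each speaker."""
    totals = defaultdict(float)
    for cue in cues:
        totals[cue["speaker"]] += cue["end"] - cue["start"]
    return dict(totals)
```

Reading the downloaded file into a string and calling `seconds_per_speaker(parse_vtt(text))` gives a rough answer to the "am I talking too much" question.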

While poking around, I noticed that the Microsoft Stream version of the transcript is actually delivered as JSON. You can see this by looking for the streamContent?format=json request in the Network tab of Chrome DevTools:

Open the Response tab, and you can see a much more detailed JSON representation of the meeting transcript:

Save that (or copy it to your clipboard), and have fun!
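As a starting point for the "have fun" part, here is a small Python sketch that totals speaking time per person from the saved JSON. The field names (`entries`, `speakerDisplayName`, `startOffset`, `endOffset`) and the `HH:MM:SS.fffffff` offset format are assumptions about the response. They are exposed as parameters so you can change them to match whatever you see in the Response tab.

```python
import json
from collections import defaultdict


def offset_to_seconds(offset):
    """Convert an 'HH:MM:SS.fffffff' offset string (assumed format) to seconds."""
    hours, minutes, seconds = offset.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def talk_time(transcript_json,
              entries_key="entries",              # assumed field name
              speaker_key="speakerDisplayName",   # assumed field name
              start_key="startOffset",            # assumed field name
              end_key="endOffset"):               # assumed field name
    """Return total speaking time in seconds for each speaker."""
    data = json.loads(transcript_json)
    totals = defaultdict(float)
    for entry in data.get(entries_key, []):
        speaker = entry.get(speaker_key) or "Unknown"
        duration = offset_to_seconds(entry[end_key]) - offset_to_seconds(entry[start_key])
        totals[speaker] += duration
    return dict(totals)


def talk_share(totals):
    """Turn per-speaker seconds into fractions of the total speaking time."""
    total = sum(totals.values())
    return {speaker: seconds / total for speaker, seconds in totals.items()} if total else {}
```

Passing the saved text to `talk_time` and then the result to `talk_share` shows what fraction of the meeting each person spent talking.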