Ideally, you’d plan projects around a set event like the inauguration months in advance, but ideas sometimes come at the last minute. Such was the case for the inspiration behind our annotated inauguration speech.
A week out from the inauguration, Vox’s Director of Programming, Allison Rockey, shared NPR’s live annotation of President Trump’s press conference in our “inspiration” Slack channel and asked if the Storytelling Studio could build something similar for Donald Trump’s inauguration address. Because of the time crunch, we decided the best way forward was to build the framework for a live annotation tool that we could iterate on over time.
Here’s how we did it.
Get into code quickly
Given the limited amount of time, we didn’t wait for a design to start building. Instead, we started building out the framework for filling a Google Doc with live captions. First, we looked at IBM Watson’s Speech to Text service for transcribing a live feed. While the transcription quality was good, and the service API makes it easy to handle live streaming media, we found a better approach: C-SPAN captions.
After researching television captioning, we came across OpenedCaptions, a service that makes C-SPAN 1 captions available over the web in real time (as described on Source and Nieman Lab). The captions were higher quality than the Watson Speech to Text output, and the service was easier to implement. However, OpenedCaptions provides a socket API endpoint, which Google Apps Script does not support; a script attached to a Google Doc can also refresh only once a minute. To solve for that, we built a very simple intermediate server to buffer the text and make it available as a REST API. From there, the code for the Google Apps Script was simple.
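To make the buffering idea concrete, here is a minimal sketch in Python of what such an intermediate server could look like. It is not the open-sourced server's actual code: the `CaptionBuffer` class, the `/captions` path, the `cursor` query parameter, and the port are all hypothetical names chosen for illustration, and the socket listener that would feed the buffer is left out.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


class CaptionBuffer:
    """Thread-safe buffer of caption words, each tagged with a sequence number."""

    def __init__(self, max_words=5000):
        self._lock = threading.Lock()
        self._words = []      # list of (sequence_number, word)
        self._next_seq = 0    # sequence number the next word will receive
        self._max_words = max_words

    def append(self, word):
        """Called by the (omitted) socket listener for each caption word."""
        with self._lock:
            self._words.append((self._next_seq, word))
            self._next_seq += 1
            # Drop the oldest words so memory stays bounded.
            if len(self._words) > self._max_words:
                self._words = self._words[-self._max_words:]

    def since(self, cursor):
        """Return (text, new_cursor) for all buffered words with seq >= cursor."""
        with self._lock:
            fresh = [word for seq, word in self._words if seq >= cursor]
            return " ".join(fresh), self._next_seq


BUFFER = CaptionBuffer()


class CaptionHandler(BaseHTTPRequestHandler):
    """GET /captions?cursor=N returns the words buffered since cursor N."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/captions":  # hypothetical endpoint name
            self.send_error(404)
            return
        try:
            cursor = int(parse_qs(url.query).get("cursor", ["0"])[0])
        except ValueError:
            self.send_error(400, "cursor must be an integer")
            return
        text, new_cursor = BUFFER.since(cursor)
        body = json.dumps({"text": text, "cursor": new_cursor}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the REST endpoint; the port is an arbitrary choice."""
    HTTPServer(("", port), CaptionHandler).serve_forever()
```

A client that can only poll once a minute would remember the `cursor` value from its last response and send it back on the next request, so each poll appends only the words that arrived in between.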
After setting up a mechanism for getting the transcript into a Google Doc, we needed to build out the front end of the application. Rather than make a pixel-perfect mock, we did a quick design that informed the build. From there the team worked together to finalize the design in browser.
Collaborate with the news community
The news community was a helpful resource. We referenced and were inspired by the write-up by Tyler Fisher and the NPR visual team explaining their process for live-annotating debates and speeches. Our version differs from the NPR workflow in that it retrieves captions using free, openly available services. For this, we worked with Dan Schultz, a former Knight-Mozilla fellow, who made OpenedCaptions.
To give back to the community, we have open-sourced the intermediate server code and the Google Apps Script.
Editorial workflow is a part of your product
A huge part of making a project like this work is making sure it’s easy enough for our users, in this case Vox reporters, to actually use the system. We had to build a workflow that supported multiple concurrent users, had a quick shorthand for denoting an annotation, and allowed for an approval layer. To do this, we relied on the “suggestions mode” built into Google Docs, and a system of single and double brackets to denote highlights and annotations. To identify the annotator, we created a list of names, initials, Twitter handles, and other writer metadata, which we later referenced to fill out the details shown after each annotation.
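As an illustration of how a bracket shorthand like this can be parsed, here is a small Python sketch. The exact syntax is an assumption, not the format Vox used: it supposes that single brackets wrap the highlighted passage and that double brackets immediately after it hold the writer's initials, a colon, and the annotation text. The `WRITERS` table, the `Annotation` class, and `parse_annotations` are hypothetical names, and the writer entry is placeholder data.

```python
import re
from dataclasses import dataclass

# Hypothetical writer metadata keyed by initials (placeholder data).
WRITERS = {
    "AB": {"name": "Example Writer", "twitter": "@example"},
}

# Assumed shorthand: [highlighted text][[INITIALS: annotation text]]
PATTERN = re.compile(r"\[([^\[\]]+)\]\[\[([A-Za-z]+):\s*([^\[\]]+)\]\]")


@dataclass
class Annotation:
    highlight: str   # the passage of the speech being annotated
    note: str        # the reporter's annotation
    author: dict     # metadata looked up from the initials


def parse_annotations(text):
    """Extract every highlight/annotation pair from a transcript string."""
    results = []
    for match in PATTERN.finditer(text):
        highlight, initials, note = match.groups()
        # Fall back to the raw initials when the writer is not in the table.
        author = WRITERS.get(initials.upper(), {"name": initials, "twitter": None})
        results.append(Annotation(highlight.strip(), note.strip(), author))
    return results
```

Calling `parse_annotations` on a line such as `We will [make America strong again][[AB: A campaign refrain.]]` yields one `Annotation` whose `author` is filled in from the metadata table, which is the lookup step described above.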
In addition to creating the workflow in Google Docs, we documented all the roles and wrote a document on how to use the Google Doc. Because most of the Vox newsroom was busy reporting and preparing for inauguration, having the workflow documented was extremely valuable so everyone could review it asynchronously.
We also held a training session the week of publication to ensure key editors could communicate how to add and edit annotations during the speech.
Ship the MVP
We approached this as an experiment from the start. While the transcript wasn’t published until after the speech ended, we added annotations in real time as if it were live. We’ll apply what we learned from this trial should we decide to iterate on it for future live events.