How an AI service won me over by becoming an AI platform - the Otter.ai machine learning transcription example

By Jon Reed, January 14, 2021
Summary:
I'd had plenty of satirical fun at AI's expense. But AI has also changed my content workflow. Here's how an AI service can become an AI platform, overcome glitches, and achieve a different level of user loyalty.


A couple years ago, if you told me I'd be writing a (mostly) positive post about Otter.ai, I would have been gobsmacked. The story of how Otter.ai won me over is, for me, a humble pie lesson in the practical potential of AI - and how an imaginative platform can have so much more impact than a standalone service.

Given how much I mocked the overhype on AI text-generator GPT-3, you wouldn't think I'd be a machine learning transcript advocate. At first, I wasn't.

For years, I kicked tires on machine learning transcription applications. I was underwhelmed. Yes, I appreciated the speed, and the pricing (at most, five percent of what I paid for human transcripts).

But on the road, crushing on article deadlines, I couldn't spare the time to get machine learning transcripts from raw output to usable. Poring through audio and re-typing quotes isn't viable on deadline - not when a human can do a far better job. Not to mention it's a miserable way to spend a jetlagged night in a hotel room.

Periodically, however, I would check back. I found that sometimes, getting an imperfect transcript in less than an hour (example: after a newsworthy keynote) was worth the imperfections. It was my colleague, Phil Wainewright, who told me his Otter.ai transcripts had been pretty good.

So I gave Otter.ai a closer look again. Its ability to get technical phrases right had drastically improved (including, at times, the correct spelling of names that are referenceable online). Eventually, the pandemic hit, and diginomica moved to a team pricing model for Otter.ai. This pricing model means I can have a deep catalog of audio transcribed, without worrying about the price per transcript anymore.

If anything, during the pandemic, such transcriptions are even more important - whether it's interviews, online discussions, or virtual event presentations. I set to work integrating Otter.ai with my workflow.

Team sharing - In my Otter.ai team account, I found it was easy to share transcripts (and audio) with colleagues. With each colleague, a folder has all the transcripts and recordings shared between us.

Audio playback controls - Otter had a vastly superior transcript interface to others I tested. I can search interview keywords, and press "play" right at that point in the text, which plays back the exact audio section I need. It's hard to express how useful a proper replay control is - until you try it.

Dropbox integration for audio processing and downloading - Though I like the Otter.ai interface, there is no way I would trust any cloud vendor as the sole storage for my transcriptions. Therefore, I had to manually download the text file. Otter's Dropbox integration solved that for me. Now I put the audio in my Dropbox folder on one of my laptops, and the text transcripts are soon delivered back to me there. The Dropbox integration wasn't easy as an early adopter - more on that in a moment. But it's been working well since.

The power of archived search - Instead of viewing transcripts as a one-off service, a light bulb went off. There is a payoff to a searchable archive of all my transcript content. I realized I should have a folder in Otter.ai for my own video transcripts and presentations.

Mobile phone notes - Sometimes, I have observations I would historically type in by hand (example: after a series of meetings). I now use the Otter.ai mobile app to record those notes, and categorize them based on topic. I've started to do this after the occasional medical check-in as well.

Overcoming problems with Otter.ai

When you commit to a platform, you can expect problems - especially as an early adopter. In particular, Otter.ai's first iteration of their Dropbox integration was buggy. Unbeknownst to me, I was in on the very early side. Fortunately for my sanity, the Otter.ai development team was responsive. Eventually, after some aggravating glitches, the integration locked in nicely - no further issues.

When I look at enterprise use cases, I don't look for super-smooth success stories. I look for well-tested partnerships. Recently, while Otter was enhancing their search, they changed something that messed up document-specific search. Their email support team shut me down with boilerplate responses. Their Twitter support, however, was aggressively responsive. One day later, the problem was fixed.

Otter.ai also made some mistakes when they changed their free pricing tier during the pandemic. The PR and communication around that change was, in my view, pretty tone deaf given the adverse circumstances.

However, I'm less interested in perfection than companies that keep at it. Another example: Otter's automatic Zoom transcript integration. I prefer to record my sessions on different video platforms and use the Dropbox integration for all of them, but I can see why others like the Zoom integration. Plugging apps into your platform is a smart way to go; Zoom itself is betting heavily on the same strategy with Zoom Apps.

My take - when AI doesn't overreach, count me in

I've found that Otter.ai does pretty darn well with "English as a second language" type English accents - something you can't always say for AI language services. Perhaps another reason Otter.ai has closed the transcription gap: my interviews tend to include a fair amount of enterprise tech terms. Even human transcriptionists struggle with these terms. Otter might not get all of them, but the point is, that's a challenge either way.

Is this the end of my use of human transcript services? Not necessarily. I can envision rare cases where I would still use a human. Otter.ai can handle imperfect recordings, but it would struggle with, for example, a keynote recording at an on-the-ground trade show, with lots of background noise. You don't realize how difficult it can be for machine transcriptions to make out human voices coming through keynote loudspeakers until you try it. But I would expect Otter.ai to comprise more than 95 percent of my transcripts going forward - with the bonus of having all of my transcripts in one searchable online location.

Given how hard I hammered AI for being "dumb" in my critique of hyper-personalization, the tone of this article may come as a surprise. But my issue has never been with AI per se, but with the over-reach and overhype when AI isn't there yet. For some things, AI is "good enough." I'd put machine-generated transcripts pretty high on that list.

Another AI-related writing service I make heavy use of is Grammarly. Grammarly also keeps getting better. It doesn't spot all mistakes yet - it still misses about 10-20 percent of the typos I want it to catch. But the writing issues Grammarly flags up are handy as well. I don't change every example of passive voice Grammarly spots, but I sure as heck want to know about it. The difference between Grammarly and Otter.ai? Grammarly isn't a platform for me yet. It's a useful AI service. I'm not as voluntarily locked-in as I am to Otter.

There may be other machine transcription services as good as or better than Otter. I always urge companies to do careful evaluations; your use case could be very different than mine. But after unleashing so much venom against AI last week, it seemed appropriate to acknowledge that, albeit with some kicking and screaming on my part, AI services are helping me do my job better.