The way that consumers interact with content is changing. We are in the middle of a paradigm shift in which consumers increasingly turn to video and audio content.
One of the most exciting developments in the audio world is text-to-speech technology.
To be clear, this isn’t exactly a new technology. Text-to-speech has existed for more than two decades. Nevertheless, it hasn’t yet caught on in mainstream media because of its lack of natural and realistic modulation.
Consumers felt the same way. For instance, news updates on smart speakers aren’t greatly loved because of their reliance on synthesised voices.
That said, text-to-speech technology is set to make a large leap forward due to one company.
That company is Amazon.
Amazon Polly changes everything because it is even closer to sounding like a human voice when reading text. The bottom line? Small and large publishers should pay attention to Amazon Polly, as it presents all kinds of exciting opportunities.
Why audio represents a great opportunity for publishers
But let’s back up. One basic premise is useful here. Generally speaking, publishers tend to move more slowly than companies in other industries.
According to Paul DeHart, the CEO of Blue Toad, “Publishers have slowly evolved into media companies.” In today’s environment, publishers “must continue to embrace new ways to create and deliver content to their readers.”
One of those new, exciting ways to create and deliver content is audio. According to the recent Infinite Dial study, U.S. consumers listen to an average of 17 hours of audio per week.
True, audio may not overtake text or video consumption. However, it offers consumers a convenient way to enjoy content wherever they are.
One of the most exciting trends in audio is the rise of the voice-activated speaker.
According to a Reuters Institute report titled The Future of Voice and the Implications for News, voice-activated speakers powered by assistants like Amazon’s Alexa and Google Assistant are growing faster than smartphones and tablets did at a similar stage. Also, the use of voice-activated speakers in the U.S., U.K., and Germany has roughly doubled in the last year.
“Voice could become a critical gateway to media going forward.” (Nic Newman, Senior Research Associate at Reuters Institute)
Publishers are still contemplating monetisation strategies for audio. However, one way forward could be including a sponsorship message at the beginning of the listening experience. A 15- or 30-second ad could be one quick and easy way to generate revenue from audio content.
Amazon Polly: What’s so special?
Audio is a tool that has become increasingly relevant for publishers. However, some consumers were resistant to text-to-speech technology because the synthetic reader sounded entirely inhuman. Until now, it simply wasn’t an enjoyable listening experience.
That said, Amazon Polly is one of the more exciting developments in the text-to-speech space. Amazon Polly, according to the tech juggernaut, “is a [Text-to-Speech (TTS)] service that turns text into lifelike speech using deep learning.”
Essentially, TTS is the generation of synthesised speech from text. It lets users, including publishers, “create applications that talk, and build entirely new categories of speech-enabled products.” Amazon offers dozens of lifelike voices across a collection of different languages. They include English, Danish, French, Japanese, Spanish, and Mandarin Chinese.
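For publishers curious what this looks like in practice, here is a minimal Python sketch of a standard text-to-speech request. It assumes the boto3 AWS SDK is installed and AWS credentials are configured. The helper names `build_speech_request` and `synthesize_to_file`, and the default voice Joanna, are illustrative assumptions and not something Amazon prescribes.

```python
from typing import Any, Dict


def build_speech_request(text: str, voice_id: str = "Joanna") -> Dict[str, Any]:
    """Assemble keyword arguments for Polly's SynthesizeSpeech operation."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "Text": text,
        "OutputFormat": "mp3",
        "VoiceId": voice_id,  # assumption: any voice from the Polly voice list
    }


def synthesize_to_file(text: str, path: str) -> None:
    """Send the request to Polly and write the returned audio to `path`.

    Needs the boto3 package and configured AWS credentials.
    """
    import boto3  # imported lazily so the builder above works without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_speech_request(text))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

With credentials in place, a call such as `synthesize_to_file("Good morning, readers.", "intro.mp3")` should write an MP3 file of the sentence.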
But along with standard TTS voices, Amazon Polly has two game-changing features. They display major leaps forward in text-to-speech technology.
#1 Neural Text-to-Speech (NTTS)
The first is Neural Text-to-Speech (NTTS). NTTS delivers advanced improvements in speech quality by understanding the differences in speaking styles and generating speech in an expressive and lifelike way.
NTTS learns to speak by listening to recorded human speech and then copying it. Basically, it is designed to learn speech the way that children do.
According to Julien Simon, the Global Tech Chief Evangelist at Amazon, NTTS is a true game-changer “because it increases naturalness and expressiveness.” It gets us ever closer to automated voices that sound like real humans.
Currently, NTTS is available for eleven voices supporting U.S. and U.K. English.
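In API terms, opting into NTTS comes down to one extra request parameter. The sketch below is a hedged illustration: the `Engine` parameter with the value `"neural"` is part of Polly's SynthesizeSpeech operation, while the function name `build_neural_request` and the short `NEURAL_VOICES` set are assumptions made for this example. The set is only a subset, so check the current Polly voice list before relying on it.

```python
from typing import Any, Dict

# Assumption: a small illustrative subset of neural-capable English voices.
NEURAL_VOICES = {"Joanna", "Matthew", "Amy"}


def build_neural_request(text: str, voice_id: str = "Joanna") -> Dict[str, Any]:
    """Build SynthesizeSpeech arguments that select the neural engine."""
    if voice_id not in NEURAL_VOICES:
        raise ValueError(f"{voice_id} is not in the illustrative neural voice set")
    return {
        "Text": text,
        "OutputFormat": "mp3",
        "VoiceId": voice_id,
        "Engine": "neural",  # omitting this parameter falls back to the standard engine
    }
```

The resulting dictionary can be passed to `boto3.client("polly").synthesize_speech(**request)` in the same way as a standard request.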
#2 Newscaster Style
The second is the so-called Newscaster Style, with which Amazon Polly has taken another large step forward.
Amazon Polly’s NTTS supports a newscaster-reading style tailored for narration use cases. Quite obviously, this is perfect for publishers who are looking for a different way to present their content.
Amazon Polly’s Newscaster Style makes narration sound extremely realistic. Newscaster Style is so advanced that it will modulate its voice based on whether it is reading a newscast, sportscast, or even a college lecture.
Amazon Polly users can also take advantage of Amazon Translate. Working in concert, the two services can translate publishers’ content into the consumer’s preferred language and then read it aloud. The bottom line for you? Your content can be accessible to a much larger audience.
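For the technically inclined, the Newscaster Style is requested through SSML markup instead of plain text. The sketch below wraps an article in the `amazon:domain` tag with `name="news"` and pairs it with the neural engine. The helper names and the default voice Matthew are assumptions for illustration, and only some voices support this style.

```python
from typing import Any, Dict
from xml.sax.saxutils import escape


def newscaster_ssml(text: str) -> str:
    """Wrap plain text in SSML that asks Polly for the news reading style."""
    # escape() protects characters such as & and < that would break the markup
    return f'<speak><amazon:domain name="news">{escape(text)}</amazon:domain></speak>'


def build_newscaster_request(text: str, voice_id: str = "Matthew") -> Dict[str, Any]:
    """Build SynthesizeSpeech arguments for a newscaster-style reading."""
    return {
        "Text": newscaster_ssml(text),
        "TextType": "ssml",   # tells Polly the input is SSML, not plain text
        "OutputFormat": "mp3",
        "VoiceId": voice_id,  # assumption: a voice that supports the news style
        "Engine": "neural",   # the newscaster style relies on the neural engine
    }
```

Escaping the article text first matters for real-world copy, since headlines often contain ampersands or angle brackets.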
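One way to picture the two services working together is the small pipeline sketched below. It is an illustration under stated assumptions, not a description of how any publisher has built it: the function names, the `VOICE_FOR_LANGUAGE` mapping, and the choice of voices are all hypothetical. The translate and synthesize steps are passed in as functions so the flow can be tried out without an AWS account.

```python
from typing import Callable, Dict

# Assumption: an illustrative mapping from language code to a Polly voice.
VOICE_FOR_LANGUAGE: Dict[str, str] = {"en": "Joanna", "fr": "Celine", "zh": "Zhiyu"}


def translate_and_speak(
    text: str,
    target_language: str,
    translate: Callable[[str, str, str], str],
    synthesize: Callable[[str, str], bytes],
    source_language: str = "en",
) -> bytes:
    """Translate `text` if needed, then return audio in a matching voice."""
    voice_id = VOICE_FOR_LANGUAGE[target_language]  # KeyError if unsupported
    if target_language != source_language:
        text = translate(text, source_language, target_language)
    return synthesize(text, voice_id)


def aws_translate(text: str, source: str, target: str) -> str:
    """Translate with Amazon Translate (needs boto3 and AWS credentials)."""
    import boto3

    result = boto3.client("translate").translate_text(
        Text=text, SourceLanguageCode=source, TargetLanguageCode=target
    )
    return result["TranslatedText"]


def aws_synthesize(text: str, voice_id: str) -> bytes:
    """Synthesize speech with Amazon Polly (needs boto3 and AWS credentials)."""
    import boto3

    response = boto3.client("polly").synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice_id
    )
    return response["AudioStream"].read()
```

With credentials configured, `translate_and_speak(article, "fr", aws_translate, aws_synthesize)` would return MP3 bytes of the French version.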
Which publishers already use Amazon Polly?
Amazon Polly is extremely exciting for large and small publishers alike.
While the technology is still young, some early adopters have already used Amazon Polly in their own work. Some of those publishers include Gannett, The Globe and Mail, Ringier, Success Magazine, TIM Media, Encyclopedia Britannica, and CommonLit.
The Globe and Mail’s Audio Now
One of those publishers is The Globe and Mail. It is one of Canada’s most-read print and digital newspapers. The Globe and Mail has used Amazon Polly in particular to increase user engagement.
According to Greg Doufas, the Chief Technical and Digital Officer at The Globe and Mail, the newspaper has used Amazon Polly in its overall effort to help consumers access and engage with the Globe and Mail’s award-winning journalism.
Its product is called Audio Now, which leverages Amazon Polly Newscaster. According to Doufas, Audio Now is a first for Canada.
The Globe and Mail readers can access Audio Now by simply clicking on a story that interests them. Because Canada is a multilingual country, The Globe and Mail offers Audio Now in English and French, each with male and female voices.
In addition, because the newspaper attracts a global audience, stories are also available in Mandarin Chinese. Yes, The Globe and Mail’s Audio Now product is on the newer side. However, it is already changing the way that Globe and Mail readers consume content.
Gannett embraces Amazon Polly
“Services like Amazon Polly and features like its Newscaster voice help us deliver breaking news and original reporting with increased speed and fidelity worthy of our brands.” (Scott Stein, VP of Content Ventures at Gannett)
You can see how useful Amazon Polly would be in a breaking-news environment. With news changing by the second, journalists simply don’t have the time to go into a recording booth and record a voiceover of their story.
The situation is different with Amazon Polly. By using Amazon Polly, journalists can spend more time breaking and reporting the news instead of spending it on a task that can be automated.
We are still in the early days of technologies like Amazon Polly. In all likelihood, there is more innovation ahead.
However, I believe it is already well worth paying attention to Amazon Polly. Simply put, audio is a tried-and-true way of consuming content. Whether we are commuting to work or are simply enjoying a relaxing evening at home, audio can be a terrific way to be entertained or learn something new.
Consumers are certainly paying attention. Audio content consumption is growing. In all likelihood, this trend will persist for some time. But even with these promising trends, publishers need to actively take advantage of technologies that improve the listening experience.
So what’s the bottom line? It is simple. Audio as a publishing channel is here to stay. That’s why publishers should pay attention.
Whether you work for a small or large publisher, it is worth your time to consider the possibilities. Think about how you can use audio to better serve your audience. I bet it’s an investment that will pay off.