There’s Nothing Nuanced About Microsoft’s Plans For Voice Recognition Technology

By Enrique Dans for Forbes

Several media outlets had already reported on Microsoft’s advanced talks to acquire Nuance Communications, a leader in voice recognition with a long and troubled history of mergers and acquisitions. The deal, finally announced on Monday, had been estimated at as much as $16 billion but was ultimately priced at $19.7 billion, a 23% premium over the company’s share price on Friday. That makes it Microsoft’s second-largest acquisition, after LinkedIn in June 2016 for $26.2 billion.

After countless mergers and acquisitions, Nuance Communications has ended up nearly monopolizing the market for speech recognition products. It started out as Kurzweil Computer Products, founded by Ray Kurzweil in 1974 to develop character recognition products, and was then acquired by Xerox, which renamed it ScanSoft and subsequently spun it off. ScanSoft was acquired by Visioneer in 1999, but the combined company retained the ScanSoft name. In 2001, ScanSoft acquired the Belgian company Lernout & Hauspie, which had previously acquired Dragon Systems, creators of the popular Dragon NaturallySpeaking. The aim was to compete in the speech recognition market with Nuance Communications, which had been publicly traded since 1995. Dragon was the clear leader in speech recognition accuracy, thanks to its use of hidden Markov models, a probabilistic method for temporal pattern recognition. Finally, in September 2005, ScanSoft acquired Nuance and took its name.

Since then, the company has grown rapidly through acquisitions, buying as many as 52 speech technology companies across all kinds of industries and markets. The result is a conglomerate that has largely monopolized commercial development in the field and licenses its technology to all kinds of companies: Apple’s Siri was originally based on Nuance technology, although it is unclear how dependent on the company it remains.

The Microsoft purchase reveals the company’s belief in voice as an interface. The pandemic has seen videoconferencing take off, triggering an explosion in the use of voice transcription technologies. Zoom, for example, incorporated automatic transcription in April last year using Otter.ai, so that at the end of each of my classes I automatically receive not only the video but also a full transcript (which works far better when the class is online than when it is held face-to-face in a classroom).

Microsoft, which is in the midst of a period of strong growth through acquisitions, had previously collaborated with Nuance in the healthcare industry, and many analysts feel that the acquisition is intended to deepen that collaboration. However, Microsoft could also be planning to integrate transcription technology into many other products, such as Teams, or throughout its cloud, Azure, allowing companies to make their corporate environments fully indexable by creating written records of meetings that can be retrieved at a later date.

Now, Microsoft will try to raise its voice (it has almost twenty billion reasons to do so) and use it to differentiate its products via voice interfaces. According to Microsoft, a pandemic that has pushed electronic and voice communications to the fore is now the stimulus for a future with more voice interfaces, so get ready to see more of that. No company plans a twenty-billion-dollar acquisition just to keep doing the same things it was doing before.
