Alexa, Tell me everything I need to know about you!
Smart speakers have become quite the household commodity nowadays. Recently, I was working on a research study related to smart speakers, as a result of which I got to know a good deal about Amazon’s Alexa. To my surprise, there’s not a single post out there that describes Alexa in its entirety. You have to browse Amazon’s website, scrolling across multiple pages, to learn about each new feature.
Wouldn’t life be better if you could find all the info in one place? For that very reason, I decided to combine all the information and present it all here. It was a lot and I do realize how valuable time is. So I did some editing, and have presented below a brief overview of Alexa. In this piece, you will find some insights on how it was developed, and how it evolved.
I have also made sure to include a list of references for this article. If you are interested to know more, feel free to check ‘em out. You will find them at the end of the piece.
Alexa – It all started with a Mere Idea
It all started when Jeff Bezos wanted to create a conversational system just like the one depicted in Star Trek: a device capable of tasks like playing music on request, helping make shopping lists, reminding you about meetings, and waking you up in the morning. The list goes on. All it needs is your voice command, with no clicking, typing, touching, or scrolling on screens.
With this idea, the journey of developing this amazing device began in Amazon’s Lab126 in early 2011. This is the same lab that created devices like Kindle and Fire Phone. The inventors had to work with the world’s largest, most challenging and tangled dataset – ‘The Human Voice’.
A meticulous process of developing, training, and testing new speech recognition algorithms began.
Strategic Acquisitions and Patent filing
Meanwhile, Amazon acquired companies working in the speech recognition domain, like Yap, Evi, and Ivona, and put the newly acquired teams to work on Alexa. Back then, the acquisitions led to speculation that Amazon might be working on its second smartphone or an assistant like Siri. But no one knew exactly what Amazon was up to.
Finally, after a year’s hard work, Lab126 engineers secretly started filing patents for various speech recognition and voice recognition algorithms and assigned those patents to Rawles LLC, one of Amazon’s shell companies. Currently, Amazon has around 266 families of patents related to speech recognition technology.
I also found some patents that seem to describe the basic functioning of Amazon Alexa. For example, US9047857B1, filed on 19th December 2012 and assigned to Rawles LLC, describes how a device transitions from a sleep state to an active state on detecting a wake word (similar to how Alexa wakes up). In the patent, Amazon claims not only the state transition but also that the device determines how the user said the wake word and the command, for example, in a raised voice or a calm voice. They might have done this to overcome prior art such as Siri and other voice assistants, which similarly move from an idle state to an active state on detecting a wake word.
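To make the sleep-to-active transition concrete, here is a minimal sketch of a wake-word state machine. It is only an illustration under my own assumptions, not the method claimed in the patent: the class name WakeWordDevice, the text-based matching, and the sleep() method are all hypothetical, and the sketch does not model how the wake word was spoken (raised or calm voice).

```python
from enum import Enum


class State(Enum):
    SLEEP = "sleep"
    ACTIVE = "active"


class WakeWordDevice:
    """Toy device that ignores speech until it hears the wake word."""

    def __init__(self, wake_word="alexa"):
        self.wake_word = wake_word
        self.state = State.SLEEP

    def hear(self, utterance):
        """Return the command to process, or None if the device stays asleep."""
        words = utterance.lower().replace(",", " ").split()
        if self.state is State.SLEEP:
            if self.wake_word not in words:
                return None  # no wake word: stay asleep and discard the audio
            self.state = State.ACTIVE
            # Keep only what was said after the wake word.
            words = words[words.index(self.wake_word) + 1:]
        return " ".join(words)

    def sleep(self):
        """Return to the idle state, e.g. after the command is handled."""
        self.state = State.SLEEP


device = WakeWordDevice()
print(device.hear("play some music"))         # ignored while asleep
print(device.hear("Alexa, play some music"))  # wakes up and returns the command
```

A real device works on audio rather than text, so the wake-word check would be an acoustic model instead of a string comparison.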
Another patent, US9401140B1, describes an unsupervised learning model used to train Alexa, whereas US9424840B1 describes how Alexa selects appropriate answers to queries. US9786294B1 is another good patent; it describes the various visual states of the Echo device, such as when the lights are illuminated while data is being processed and sent back and forth to the cloud.
Still, there was a lot left to be done. One of the most important milestones achieved by the Alexa team was the development of far-field technology, which can pick up voice commands from afar or even in noisy rooms. Voice-interactive devices already existed, but their microphones sat close to the user’s mouth, which meant a clear signal and less ambient noise. Amazon, however, was quite clear that it wanted a device that could be placed anywhere, such as a living room or a noisy environment, and still take commands like “Play some music!” from a distance. For this, Amazon used far-field technology built on a microphone array, with features such as beamforming, noise reduction, acoustic echo cancellation, and barge-in capabilities.
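Beamforming is the easiest of these features to illustrate. Below is a minimal sketch of delay-and-sum beamforming, the textbook form of the idea, and not necessarily what the Echo does. I am assuming that the per-microphone delays (in whole samples) are already known; in practice they depend on the array geometry and the direction of the speaker, and the function name delay_and_sum is my own.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its delay (in samples) and average.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   how many samples later the sound reached each microphone.
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]


# A toy "voice" signal that reaches three microphones 0, 1 and 2 samples late.
voice = [0, 1, 0, -1, 0, 1, 0, -1]
mics = [voice + [0, 0], [0] + voice + [0], [0, 0] + voice]

steered = delay_and_sum(mics, [0, 1, 2])    # aimed at the speaker
unsteered = delay_and_sum(mics, [0, 0, 0])  # no alignment

print(max(abs(x) for x in steered))    # the voice adds up coherently
print(max(abs(x) for x in unsteered))  # misaligned copies partly cancel
```

Sound arriving from the steered direction adds up, while sound from other directions is averaged down, which is why an array can pick a voice out of a noisy room.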
During my study, I found a few interesting patents that seem to be the base patents for this technology. For example, US8855295B1 and US9689960B1 seem to be among Amazon’s earliest patents on the implementation of acoustic echo cancellation and beamforming, respectively. US9947333B1 is another patent, describing how Alexa ignores background noise and transmits only the user’s command to the cloud for further processing. The technology has come a long way since then, with many improvements.
Alexa sees the world
Amazon Alexa was all set to meet the world. Just a month before the launch of its most ambitious project, Amazon filed a design patent – USD744541S1 for ‘The Echo Speaker’.
After its launch on November 6, 2014, Amazon Echo rapidly gained popularity, expanded to different regions with different accents on different continents, and soon became an integral part of many households. Amazon Echo was truly magical. But what next?
Amazon harnessed the power of the cloud and launched Alexa Skills Kit and Alexa Voice Service. Alexa Skills Kit, for the uninitiated, is a collection of self-service APIs and tools that make it fast and easy for developers to create new voice-driven capabilities for Alexa. Just as you have apps for Android, you have skills for Alexa. This further increased Alexa’s utility. Want to book an Uber? Just download Uber’s skill and book a cab. The world became simpler as things started getting done with voice commands alone. No typing, no scrolling. Amazon Alexa grew tremendously. Amazon has sold almost 100 million devices and has more than 70,000 Alexa skills.
With the help of Alexa Voice Service, third-party manufacturers can integrate Alexa into their devices. Amazon also made its 7-microphone array system and the accompanying technology available to third-party manufacturers so that they could develop a high-quality Alexa voice experience. Now any device could be easily integrated with Alexa.
Alexa, Remind me of your features again!
Amazon is continuously adding new features to Alexa and making it better. One of the most important features that helped Alexa gain further popularity is ‘Smart Home Skills’, whose working is described in patents like US9698999B2, US10026401B1, and US10031722B1. Among these, US9698999B2 describes how Alexa enables users to control their cloud-connected home devices, such as lights, a dishwasher, or a music system, and how Alexa communicates the user’s command to these devices. US10026401B1 and US10031722B1 describe how a user can give a cloud-connected light a name, for example ‘kitchen lights’, and how a command like “Alexa, switch on Kitchen lights” is then executed by Alexa.
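The naming idea can be sketched in a few lines. This is only my illustration of a user-chosen name being resolved to a device, not the mechanism in the patents: the registry DEVICES, the device ids, the phrase table ACTIONS, and the function parse_command are all hypothetical.

```python
import re

# Hypothetical registry: friendly name chosen by the user -> device id.
DEVICES = {"kitchen lights": "light-01", "dishwasher": "appliance-07"}

# Hypothetical phrasings mapped to a power state.
ACTIONS = {
    "switch on": "ON",
    "turn on": "ON",
    "switch off": "OFF",
    "turn off": "OFF",
}


def parse_command(utterance):
    """Return (device_id, power_state), or None if nothing matches."""
    # Drop the wake word and normalise case.
    text = re.sub(r"^alexa[,\s]+", "", utterance.strip().lower())
    for phrase, state in ACTIONS.items():
        if text.startswith(phrase + " "):
            # Whatever follows the action phrase is the friendly name.
            name = re.sub(r"^the\s+", "", text[len(phrase):].strip())
            device_id = DEVICES.get(name)
            if device_id is not None:
                return device_id, state
    return None


print(parse_command("Alexa, switch on Kitchen lights"))
print(parse_command("Alexa, turn off the dishwasher"))
```

In a real setup the resolved device id and state would be sent to the device maker’s cloud; here the function just returns them.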
Amazon also considered users’ suggestions about the problems they faced when they wanted to perform multiple tasks at a time. For this, Amazon launched ‘Routines’ in 2018, which enables a user to execute multiple commands with just one utterance. For example, just by saying, “Alexa, good night”, a user could have Alexa lock the doors, switch off the lights, and set the alarm for the next day. Many other new features, such as voice profiles, which give customized attention to each family member using the same Alexa device, and Alexa calling, made Amazon Alexa even more popular.
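At its simplest, a routine is a lookup from one trigger phrase to a list of ordinary commands. The sketch below shows that idea using the “good night” example above; the table ROUTINES and the function expand_routine are my own hypothetical names, not part of any Alexa API.

```python
# Hypothetical routine table: trigger phrase -> commands to run in order.
ROUTINES = {
    "good night": [
        "lock the doors",
        "switch off the lights",
        "set the alarm for the next day",
    ],
}


def expand_routine(utterance, routines=ROUTINES):
    """Return the list of commands that one utterance should trigger."""
    text = utterance.lower().strip()
    if text.startswith("alexa"):
        text = text[len("alexa"):].lstrip(", ")
    text = text.rstrip(".!?")
    # An utterance that is not a routine is treated as a single command.
    return list(routines.get(text, [text]))


for command in expand_routine("Alexa, good night"):
    print(command)
```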
Further, Amazon came up with many new algorithms that greatly improved the user experience. One such breakthrough, introduced last year, is contextual slot carry-over (US9754591B1). It came from people experienced in speech recognition and machine learning: Ruhi Sarikaya, who previously worked at IBM and Microsoft and holds more than 70 patents, developed it for Alexa along with his team.
With the help of contextual slot carry-over, Alexa was able to converse more naturally. For example, the dialogue flow could be, “Alexa, how’s the weather in San Francisco?”, “Are there any good Mexican restaurants there?”, “Thanks, send the directions to my phone.” Alexa became smart enough to understand that you want to know about good Mexican restaurants in San Francisco and the directions to reach one. Earlier, to do the same, you had to ask, “Alexa, how’s the weather in San Francisco?”, “Alexa, are there any good Mexican restaurants in San Francisco?”, “Alexa, send the directions to La Taqueria to my phone”. This was the first time any virtual assistant was given such a capability.
Figure 1: Context Slot Carry Over
Source: Amazon Blog post
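The carry-over idea can be sketched with a small rule: if the current request needs a slot (such as the city) that the user did not say, reuse the value from an earlier turn. This is a simplification under my own assumptions and not Amazon’s algorithm; the class DialogueContext, the intent names, and the slot names are hypothetical.

```python
class DialogueContext:
    """Remembers slot values from earlier turns of a conversation."""

    def __init__(self):
        self.slots = {}

    def resolve(self, intent, slots, needed):
        """Fill slots missing from this turn using earlier turns."""
        resolved = dict(slots)
        for name in needed:
            if name not in resolved and name in self.slots:
                resolved[name] = self.slots[name]  # carried over
        self.slots.update(resolved)
        return intent, resolved


context = DialogueContext()

# "Alexa, how's the weather in San Francisco?"
context.resolve("GetWeather", {"city": "San Francisco"}, needed=["city"])

# "Are there any good Mexican restaurants there?" (no city spoken)
intent, slots = context.resolve(
    "FindRestaurants", {"cuisine": "Mexican"}, needed=["city", "cuisine"]
)
print(intent, slots)
```

The second turn ends up with both the cuisine it stated and the city it inherited, which is the behaviour the dialogue above relies on.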
Another such remarkable algorithm made Alexa learn a user’s preferences over time. You no longer needed to tell Alexa every time to play Country music. On your command, “Alexa, play some music”, she would play Country music.
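One simple way to picture this kind of learning is to treat long playback as a like and a quick skip as a dislike. The sketch below does exactly that; it is my own illustration, not Amazon’s algorithm, and the class GenrePreferences and the 30-second cut-off are assumptions.

```python
from collections import defaultdict


class GenrePreferences:
    """Scores genres by whether the user kept listening or skipped."""

    def __init__(self, skip_threshold=30):
        self.skip_threshold = skip_threshold  # seconds; assumed cut-off
        self.scores = defaultdict(int)

    def record(self, genre, seconds_played):
        # Long playback counts as a like, a quick skip as a dislike.
        liked = seconds_played >= self.skip_threshold
        self.scores[genre] += 1 if liked else -1

    def favourite(self, default=None):
        """Return the best-scoring genre, or default if nothing is known."""
        if not self.scores:
            return default
        return max(self.scores, key=self.scores.get)


prefs = GenrePreferences()
prefs.record("country", 200)
prefs.record("country", 180)
prefs.record("jazz", 5)    # skipped almost immediately
prefs.record("pop", 240)

# "Alexa, play some music" can now default to the learned genre.
print(prefs.favourite())
```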
In 2017, Amazon also launched customized versions of Alexa: Alexa Auto for vehicles and Alexa for Business. Over time, it also extended its Echo family by adding a range of devices like the Echo Dot, Echo Show, Echo Look, Echo Spot, and Amazon Tap.
That is not all. A patent filed by Amazon in 2017, titled ‘Voice-based determination of physical and emotional characteristics of users’, indicates that Alexa would be able to detect a user’s mood, tell when the user is ill, and respond accordingly. Let’s see when Amazon launches this feature.
Recently, Amazon has been facing a lot of competition from Apple HomePod and Google Home. Amazon has to keep Alexa ahead of the game. How is it going to do this? By adding new features? By launching more variants of the Echo? Let’s see what amazing things Amazon is planning for our little Genie!
Alexa, More Features Please!
Figure 2: Future of Alexa
References
https://www.wired.com/2011/11/yap-siri-amazon/
https://venturebeat.com/2013/04/17/amazon-snaps-up-siri-esque-evi-but-why/
https://techcrunch.com/2013/01/24/amazon-gets-into-voice-recognition-buys-ivona-software-to-compete-against-apples-siri/
https://en.wikipedia.org/wiki/Shell_corporation
https://www.bloomberg.com/features/2016-amazon-echo/
https://patents.google.com/patent/US9047857B1
https://developer.amazon.com/alexa-voice-service/dev-kits/amazon-7-mic
https://developer.amazon.com/docs/alexa-voice-service/audio-hardware-configurations.html
https://www.theverge.com/2019/1/4/18168565/amazon-alexa-devices-how-many-sold-number-100-million-dave-limp
https://developer.amazon.com/blogs/alexa/post/a1470f4c-8aaf-4d12-b3bc-5cf38820ff82/alexa-it-s-bedtime-routines-for-avs
https://developer.amazon.com/blogs/alexa/post/15bf7d2a-5e5c-4d43-90ae-c2596c9cc3a6/how-alexa-is-learning-to-converse-more-naturally
https://developer.amazon.com/blogs/alexa/post/df460d13-612d-4a33-b4b4-547ab119d99d/how-alexa-can-use-song-playback-duration-to-learn-customers-preferences
https://www.amazon.com/dp/B0753K4CWG?tag=googhydr-20&hvadid=295796567981&hvpos=1t1&hvnetw=g&hvrand=17538587399523028360&hvpone=&hvptwo=&hvqmt=b&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=1027028&hvtargid=kwd-324445919971&ref=pd_sl_3j0otr496z_e
https://aws.amazon.com/alexaforbusiness/
https://patents.google.com/patent/US10096319B1
https://developer.amazon.com/blogs/alexa/post/09acb84e-6d56-463b-959c-9e15c3ed3893/announcing-more-personalized-skill-responses-when-alexa-recognizes-a-customer-s-voice-developer-preview
https://www.amazon.in/gp/help/customer/display.html?nodeId=202136300
https://patents.google.com/patent/US9754591B1
https://patents.google.com/patent/US8855295B1
https://patents.google.com/patent/US9689960B1
https://patents.google.com/patent/USD744541S1
https://patents.google.com/patent/US9826599B2
https://patents.google.com/patent/US10026401B1
https://patents.google.com/patent/US10031722B1
https://www.linkedin.com/in/ruhisarikaya
Authored by: Priya Sinha, Research Analyst, Infringement and Aadesh Srivastava, Senior Research Analyst, Infringement.