Amazon has started moving AI from the Alexa cloud to its...

With its custom Inferentia chips, Amazon Web Services (AWS), the division of Amazon specializing in cloud computing services, has reduced its reliance on the NVIDIA graphics processors it previously used. Amazon said Thursday it has moved most of the processing for its personal assistant Alexa to its own custom-designed Application Specific Integrated Circuit (ASIC) chips, aimed at making the work faster and cheaper while also improving the voice assistant's performance.

Amazon developer Sébastien Stormacq wrote in a blog post that adopting AWS Inferentia for Alexa's workloads results in 25% lower latency and an estimated 30% lower cost.


Today, we are announcing that the Amazon Alexa team has migrated the vast majority of their GPU-based machine learning inference workloads to Amazon Elastic Compute Cloud (EC2) Inf1 instances, powered by AWS Inferentia. This results in 25% lower end-to-end latency and 30% lower cost compared to GPU-based instances for Alexa's text-to-speech workloads. The lower latency allows Alexa engineers to innovate with more complex algorithms and improve the overall Alexa experience for our customers, the post says.

Stormacq describes Inferentia's hardware design as follows: AWS Inferentia is a custom chip, built by AWS, to accelerate machine learning inference workloads and optimize their cost. Each AWS Inferentia chip contains four NeuronCores. Each NeuronCore implements a high-performance systolic array matrix multiplication engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which cuts down on accesses to external memory, significantly reducing latency and increasing throughput.
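To give a rough intuition for why a systolic engine backed by a large on-chip cache helps, here is a minimal NumPy sketch of a tiled (blocked) matrix multiplication: each tile is loaded once and reused across many partial products, which is the same data-reuse pattern the hardware exploits to avoid trips to external memory. The tile size and the code itself are purely illustrative and say nothing about Inferentia's actual microarchitecture.

```python
import numpy as np

def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 64) -> np.ndarray:
    """Blocked matrix multiply: each (tile x tile) block of A and B is reused
    across many partial products once loaded, mimicking the data reuse that an
    on-chip cache provides and reducing external memory traffic."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    c = np.zeros((m, n), dtype=np.float32)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # These slices stand in for tiles held in fast local memory.
                c[i:i + tile, j:j + tile] += a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
    return c

if __name__ == "__main__":
    a = np.random.rand(256, 256).astype(np.float32)
    b = np.random.rand(256, 256).astype(np.float32)
    assert np.allclose(tiled_matmul(a, b), a @ b, rtol=1e-3)
```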

Alexa is Amazon’s cloud-based voice service that powers Amazon Echo devices and more than 140,000 models of speakers, lights, plugs, TVs and smart cameras. According to the company, customers have connected more than 100 million Alexa devices to date. But while these devices sit in homes or offices, Alexa’s brain is deployed on AWS: when someone with an Echo or Echo Dot uses the Alexa personal assistant, only a very small part of the processing is performed on the device itself.

The workloads behind an Alexa request rely primarily on artificial intelligence

When users of devices such as Amazon’s Echo line of smart speakers ask the voice assistant a question, the device detects the wake word ("Alexa") using its own on-board processing, then forwards the request to one of Amazon’s data centers for several processing steps. When Amazon’s servers determine a response, that response is in text form and must be converted into audible speech for the voice assistant.
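As a rough illustration of that flow, here is a minimal Python sketch. Every function in it is a hypothetical stand-in, not an actual Alexa or AWS interface; only the overall split (wake-word detection on the device, inference and text-to-speech in the data center) follows the description above.

```python
from typing import Optional

# All functions below are hypothetical stand-ins used to illustrate the
# request flow described in the article, not real Alexa or AWS APIs.

def detect_wake_word_on_device(transcript: str, wake_word: str = "alexa") -> bool:
    # Stand-in for the on-device wake-word detector.
    return transcript.lower().startswith(wake_word)

def run_inference_in_cloud(utterance: str) -> str:
    # Stand-in for the cloud-side speech recognition and natural-language
    # understanding models, which per the article now run on Inf1 instances.
    return f"Placeholder answer to: '{utterance}'"

def text_to_speech(answer: str) -> bytes:
    # Stand-in for the text-to-speech step that produces audio for playback.
    return answer.encode("utf-8")

def handle_request(transcript: str) -> Optional[bytes]:
    if not detect_wake_word_on_device(transcript):
        return None                                 # nothing leaves the device
    answer = run_inference_in_cloud(transcript)     # heavy lifting in the data center
    return text_to_speech(answer)                   # audio sent back to the Echo

if __name__ == "__main__":
    print(handle_request("Alexa, what's the weather tomorrow?"))
```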

These computation steps, performed after the request reaches the data center, have historically been handled by NVIDIA GPUs; NVIDIA built its artificial intelligence strategy precisely around adapting its graphics chips to the kinds of calculations required for model training and inference. Specialized in parallel computing, GPUs are far more efficient than CPUs at these tasks, which is why they were quickly adopted for this purpose.

But now Alexa will use AWS Inferentia, the first chip developed by Amazon that was designed specifically to accelerate deep learning computations. AWS Inferentia is designed to deliver high inference performance in the cloud, reduce the total cost of inference, and make it easy for developers to integrate machine learning into the features and capabilities of their business applications, Amazon said in its blog post. Because these chips are designed specifically for these tasks, they are even more efficient at them than GPUs.

First announced in 2018, Amazon’s chip is tailor-made to speed up large volumes of machine learning tasks such as text-to-speech conversion or image recognition. Cloud computing providers such as Amazon, Microsoft, and Alphabet Inc.’s Google have become the biggest buyers of computer chips, fueling a boom in data center sales at Intel, NVIDIA and others.

But big tech companies, anxious to reduce their dependence on the industry’s two giants, NVIDIA and Intel, are increasingly abandoning traditional silicon vendors to design their own custom chips. Apple introduced its first three Mac computers this week (a MacBook Air, a 13-inch MacBook Pro and a Mac Mini) built around its own in-house ARM-based processors. Apple has even said it plans to move all of its Macs to its own processors, away from Intel chips, over the next two years.

The personal assistant Alexa isn’t the only thing benefiting from the Inferentia processor: the chip powers Amazon’s AWS Inf1 instances, which are available to the general public and compete with Amazon’s GPU-powered G4 instances. Amazon’s AWS Neuron software development kit lets machine learning developers use Inferentia as a target for popular frameworks including TensorFlow, PyTorch, and MXNet, according to Stormacq.
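As a sketch of what targeting Inferentia from one of those frameworks looks like, the snippet below compiles a small PyTorch model with the torch-neuron package. Exact package and API names can vary between Neuron SDK releases, so treat it as a hedged outline of the compile-then-deploy workflow rather than verified, version-specific code.

```python
# Hedged sketch: compiling a PyTorch model for Inferentia with the AWS Neuron
# SDK (torch-neuron). API details may differ across Neuron SDK releases; this
# only illustrates the workflow described in the article.
import torch
import torch_neuron  # extends torch with the torch.neuron namespace

model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
).eval()

example = torch.rand(1, 128)

# Trace/compile the model for NeuronCores. The compilation itself can run on
# an ordinary machine; the resulting artifact is then deployed to an Inf1
# instance, where it executes on Inferentia.
neuron_model = torch.neuron.trace(model, example_inputs=[example])
neuron_model.save("model_neuron.pt")

# On the Inf1 instance, the compiled model loads like any TorchScript model.
loaded = torch.jit.load("model_neuron.pt")
print(loaded(example).shape)
```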

In addition to Alexa, Stormacq noted that Rekognition, Amazon’s much-criticized cloud-based facial recognition system, is also moving to the group’s silicon. In his post, he cites a few outside customers who use Inferentia, among them Snap Inc. for its Snapchat application and the Condé Nast group. The chip is also used by the insurer Anthem.

Customers, from Fortune 500 companies to startups, use Inf1 instances for machine learning inference. For example, Snap Inc. integrates machine learning into many aspects of Snapchat, and exploring innovation in this area is a top priority for them. After hearing about AWS Inferentia, they worked with AWS to adopt Inf1 instances to make machine learning easier to deploy, particularly in terms of performance and cost, the post says.

It’s amazing and exciting to see how all of these companies are coming out "out of nowhere" with their own chips to free themselves from the grip of established chip companies, in this case NVIDIA. Maybe this will eventually trickle down to regular PCs and other devices, like the Pi. Interesting times!, responded one commenter.

Source: AWS

And you?

What do you think of this migration of Alexa workloads from NVIDIA graphics chips to Amazon’s AWS Inferentia?
The migration reduces latency by 25% and costs by 30%. What do you make of these figures?

See also:

Amazon unveils its second generation of 64-bit Arm chips designed in-house and targeting cloud infrastructure, as the company also confirms its desire to reduce its dependence on the giant Intel
Amazon has announced that it will build its own chips to power its servers; should Intel be worried?
and Intel work together on artificial intelligence chip, slated for release in second half of 2019
Ampere unveils Altra: its new family of Arm chips dedicated to data centers, and its flagship, the first processor of its kind with up to 80 cores / 80 threads

