Over the past couple of years, China’s online retail, cloud hosting and entertainment streaming powerhouse Alibaba has really got its teeth into AI, not least because its business now demands that it exploit such technology as fully as it can. So it comes as not too much of a surprise that it has stepped up alongside its Chinese compatriot Huawei to introduce a new processor designed specifically for running AI tasks.
Known as the Hanguang 800, it is a Neural Processing Unit that the company has designed to boost the performance of search functions, recommendation engines and customer service in e-commerce. Not surprisingly, these are three areas that are high priorities for Alibaba’s online retailing operations.
The processor has been designed by the T-Head research unit, which operates under the aegis of the Alibaba DAMO Academy, and is said to have a Chinese name that translates as ‘Honey Badger’.
As well as making use of the processor itself, the company is also going to make it available to its customer base of commercial users, many of which will already be users of Alibaba Cloud services. Unlike Huawei, however, it will not be making the chips themselves available for those commercial customers to engineer their own AI solutions and services from the ground up.
Instead the company will be making access to the processors, or more specifically the servers that will be running them, available as cloud services. This is an interesting approach that is well suited to both the capabilities of the AI processor and delivery by cloud. Customers gain access to just those capabilities – the compute resources – on a classic ‘time and materials’ basis, and avoid the potentially huge investment associated with moving to a new systems architecture and developing new applications from the ground up. To be fair to Huawei, it too will be offering its customers access to servers running its new AI processors and resources as cloud-delivered SaaS.
This delivery model should also prove to be of interest to channel partners, and may indeed help to grow the market for them to add AI and machine learning services to their portfolios. It would give them direct access to supported AI and machine learning resources, without having to make significant investments building their own specialist facilities and resources.
The chip runs highly optimised algorithms tailored primarily for applications such as retail and logistics in the Alibaba ecosystem. For example, around one billion product images are uploaded every day by merchants to Taobao, Alibaba’s e-commerce site. It used to take one hour to categorise them and tailor search and personalised recommendations for hundreds of millions of consumers.
With Hanguang 800, it now takes five minutes to complete the same task. The chip recorded a peak single-chip performance of 78,563 IPS, with a computation efficiency of 500 IPS/W, in the ResNet-50 inference test. According to the company, these figures demonstrate that it outperforms industry averages.
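For context, the published figures can be sanity-checked with some back-of-envelope arithmetic. The derived power figure below is my own calculation and assumes the peak-throughput and efficiency numbers were measured under the same conditions, which the company has not confirmed:

```python
# Back-of-envelope check on the published Hanguang 800 figures.
# The two inputs come from the announcement; derived values are arithmetic only.

peak_ips = 78_563        # reported peak single-chip inferences per second (ResNet-50)
efficiency_ips_w = 500   # reported computation efficiency, IPS per watt

# Implied power draw at peak -- an assumption, since the announcement does not
# state that both figures were measured together.
implied_watts = peak_ips / efficiency_ips_w
print(f"Implied peak power: {implied_watts:.0f} W")   # roughly 157 W

# Speed-up on the Taobao image-categorisation workload: one hour down to five minutes.
speedup = 60 / 5
print(f"Workload speed-up: {speedup:.0f}x")           # 12x
```

A ~157 W envelope would put the part in the same broad power class as contemporary data-centre inference accelerators, which is consistent with it being offered through servers rather than edge devices.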
The Hanguang 800 was announced at Alibaba Cloud’s annual Apsara Computing Conference in Hangzhou. The company also used the event to launch a raft of new products, including the 3rd generation of its X-Dragon Architecture, claimed by the company as a driving factor in the growth of its operations across e-commerce, logistics, finance and what it calls New Retail.
X-Dragon provides seamless integration of different computing platforms - including the company’s Elastic Container Service (ECS), bare metal servers and virtual machines - within a single overall architecture. Its main goal is to improve the performance of cloud-native applications, and it has achieved a 30% increase in queries per second handled and a 60% decrease in latency. Combined with power savings when running cloud-native apps, the company claims a 50% reduction in unit computing cost.
Performance has become a focus for the company: it has also recently announced a self-developed data traffic control mechanism named High Precision Congestion Control (HPCC). The goal is to provide data transmission with ultra-low latency, high bandwidth and high stability.
Research has shown HPCC reacts faster to available bandwidth and congestion than alternatives, while maintaining close-to-zero queues. In simulations at 50% traffic load, HPCC shortens flow completion times by up to 95%, causing little congestion even under large-scale incasts. By addressing challenges such as delayed in-band network telemetry (INT) information during congestion and overreaction to INT information, HPCC can quickly utilise free bandwidth while maintaining near-zero in-network queues for ultra-low latency.
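The core idea can be sketched in a few lines. The sender reads INT fields carried back on acknowledgements, estimates how full each link along the path is, and scales its sending window so the data in flight stays just below the bandwidth-delay product. The sketch below is a heavily simplified illustration of that published concept, not Alibaba’s implementation; the function names, the update rule and all the numbers are my own assumptions:

```python
# Simplified sketch of an HPCC-style, INT-driven congestion controller.
# Illustration only -- not Alibaba's code; names and constants are assumed.

def estimate_link_utilisation(queue_bytes, tx_rate_bps, link_bw_bps, base_rtt_s):
    """Normalised utilisation of one link, derived from INT fields:
    queued bytes relative to the bandwidth-delay product, plus the
    link's current transmit rate relative to its capacity."""
    bdp_bytes = link_bw_bps * base_rtt_s / 8
    return queue_bytes / bdp_bytes + tx_rate_bps / link_bw_bps

def update_window(window_bytes, int_samples, target_utilisation=0.95,
                  additive_increase_bytes=1500):
    """Steer the window multiplicatively toward the target utilisation of
    the most-loaded link on the path, with a small additive increase so
    competing flows converge toward fairness."""
    bottleneck_u = max(estimate_link_utilisation(*s) for s in int_samples)
    return window_bytes * (target_utilisation / bottleneck_u) + additive_increase_bytes

# One hypothetical update step: a 100 Gbps link with a 10 microsecond base
# RTT that is slightly overloaded (a small standing queue has formed).
samples = [(20_000, 98e9, 100e9, 10e-6)]   # (queue B, tx rate bps, bw bps, RTT s)
new_window = update_window(125_000, samples)
print(f"New window: {new_window:.0f} bytes")   # shrinks below 125,000
```

Because the reaction is driven by precise per-link telemetry rather than inferred from packet loss or coarse congestion marks, the sender can cut its window before a queue builds, which is what keeps in-network queues near zero.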
Here is another pitch, again from China, at moving the goalposts of what technologies – both hardware and software – will be needed to make AI systems effective. It is certainly true that, if we think we have powerful and data-rich systems environments now, then when AI really takes off the only observation that can be made is, ‘You ain’t seen nothing yet’. Follow the logic of that and it is not unreasonable to speculate that the current technologies that hold sway in ‘big’ and ‘fast’ systems are unlikely to be key components of the future. Whether any of these Alibaba offerings will play a part is far too early to tell, but they will be in the mix.