AMD sketched its high-level datacenter plans for its next-generation Vega 7nm graphics processing unit (GPU) at Computex today. Piecing together bits of information dropped during AMD’s PC-heavy, hour-and-a-half presentation makes it apparent that Vega 7nm is finally aimed at high-performance deep learning (DL) and machine learning (ML) applications – artificial intelligence (AI), in other words. AMD’s EPYC successes may be paving the way for Vega 7nm in cloud AI training and inference applications.
AMD claims that the 7nm process node it has co-developed with its fab partners will yield twice the transistor density, twice the power efficiency and about a third more performance than its 14nm process node. Vega 7nm will include:
- 32GB of second-generation High Bandwidth Memory (HBM2) integrated into a multi-chip package
- Integrated AMD Infinity Fabric interface
- New deep learning instruction set operations
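As a back-of-the-envelope sketch, the stated process ratios can be normalized against 14nm (the 1.35x figure is my reading of “about a third more performance”; the normalization is mine, not AMD’s):

```python
# Normalized illustration of AMD's claimed 7nm-vs-14nm ratios:
# ~2x transistor density, ~2x power efficiency (perf/W), ~1.35x performance.
density_gain = 2.0
efficiency_gain = 2.0
performance_gain = 1.35

# Same transistor budget fits in roughly half the die area:
area_ratio = 1 / density_gain
# Holding performance constant, power drops to roughly half:
power_ratio = 1 / efficiency_gain

print(area_ratio, power_ratio, performance_gain)  # 0.5 0.5 1.35
```

In other words, AMD could in principle spend the density win on a smaller die, the efficiency win on lower board power, or both on more compute units – the keynote did not say which trade-off Vega 7nm makes.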
32GB of HBM2 will bring Vega to parity with NVIDIA’s high-end Tesla V100 datacenter GPUs. An educated guess says that not all Vega 7nm products will sport this high-end memory configuration – I think that showing off 32GB was a pointed message to AMD’s cloud customers.
AMD’s Infinity Fabric interface will enable high-bandwidth, coherent memory communications between Vega 7nm chips and AMD Zen processor chips, such as AMD’s Zen 2 7nm server chips. Infinity Fabric can connect chips within a multi-chip package, and it can connect sockets on a motherboard. Integrating Infinity Fabric will benefit both consumer and datacenter products. We’ll have to wait to see if Infinity Fabric can scale anywhere near as well as NVIDIA’s NVLink interconnect, but I think AMD is aiming for that level of scalability as well.
AMD did not divulge its new DL/ML instruction set operations. However, AMD’s message was clear: it intends for Vega 7nm to competitively accelerate both training and inference workloads. AMD also claims that it will do so at low power.
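Since AMD has not said what those operations are, any specifics are guesswork, but a reasonable assumption is that they extend the packed FP16 math in current Vega parts: reduced-precision multiplies with higher-precision accumulation, the pattern mixed-precision training relies on. A minimal NumPy sketch of that pattern (illustrative only – this is not AMD’s instruction set):

```python
import numpy as np

def fp16_dot_fp32_accumulate(a, b):
    """Dot product computed from FP16 inputs, accumulated in FP32.

    This mimics the reduced-precision multiply / higher-precision
    accumulate pattern that DL-oriented GPU instructions accelerate
    in hardware (an assumption about Vega 7nm, not a disclosed spec).
    """
    a16 = np.asarray(a).astype(np.float16)  # inputs stored at half precision
    b16 = np.asarray(b).astype(np.float16)
    # Widen to float32 before multiplying and summing to limit rounding error.
    return np.sum(a16.astype(np.float32) * b16.astype(np.float32))

print(fp16_dot_fp32_accumulate(np.ones(1024), np.full(1024, 0.5)))  # 512.0
```

Halving operand width is also where the “low power” claim would plausibly come from: FP16 operands halve memory traffic per multiply-accumulate relative to FP32.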
AMD is betting big on a fully open software ecosystem for AI, ML and DL, and says the complete stack is available today. Customers have transparent access to all source code, which will allow them to customize and optimize their frameworks to run on Vega 7nm. AMD says its ML ecosystem will enable software developers to compile models directly to Vega 7nm machine code.
AMD’s single ML ecosystem slide from its Computex keynote shows that AMD is aiming for image, video and speech recognition, as well as natural language processing. Supported frameworks include TensorFlow, PyTorch, Caffe/Caffe2 and MXNet.
There was also a claim about accelerating blockchain, but that seemed half-hearted. Blockchain looks like a solution in search of problems to solve, while AI is accelerating datacenter GPU purchasing today.
AMD CEO Dr. Lisa Su said that Vega 7nm is sampling to its partners and customers now and will launch in the second half of this year.
AMD downplayed Vega 7nm’s new AI features in its press releases, but its Computex keynote told a different story. NVIDIA may have company in the datacenter AI training market in early 2019.