Intel Gaudi2 AI Accelerator Gains 2x Performance Leap on GPT-3 with FP8 Software


  • Staff
What’s New: Today, MLCommons published results of the industry-standard MLPerf Training v3.1 benchmark for training AI models, with Intel submitting results for Intel® Gaudi®2 accelerators and 4th Gen Intel® Xeon® Scalable processors with Intel® Advanced Matrix Extensions (Intel® AMX). Intel Gaudi2 demonstrated a 2x performance leap on the v3.1 training GPT-3 benchmark with the implementation of the FP8 data type. The benchmark submissions reinforced Intel’s commitment to bring AI everywhere with competitive AI solutions.

“We continue to innovate with our AI portfolio and raise the bar with our MLPerf performance results in consecutive MLCommons AI benchmarks. Intel Gaudi and 4th Gen Xeon processors deliver a significant price-performance benefit for customers and are ready to deploy today. Our breadth of AI hardware and software configuration offers customers comprehensive solutions and choice tailored for their AI workloads.”
–Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group

Why It Matters: The newest MLCommons MLPerf results build on Intel’s strong AI performance in the previous MLPerf training results from June. The Intel Xeon processor remains the only CPU reporting MLPerf results, and Intel Gaudi2 is one of only three accelerator solutions upon which results are based, only two of which are commercially available.

Intel Gaudi2 and 4th Gen Xeon processors demonstrate compelling AI training performance in a variety of hardware configurations to address the increasingly broad array of customer AI compute requirements.


About the Intel Gaudi2 Results: Gaudi2 continues to be the only viable alternative to NVIDIA’s H100 for AI compute needs, delivering significant price-performance. MLPerf results for Gaudi2 displayed the AI accelerator’s increasing training performance:
  • Gaudi2 demonstrated a 2x performance leap with the implementation of the FP8 data type on the v3.1 training GPT-3 benchmark, reducing time-to-train by more than half compared to the June MLPerf benchmark, completing the training in 153.58 minutes on 384 Intel Gaudi2 accelerators. The Gaudi2 accelerator supports FP8 in both E5M2 and E4M3 formats, with the option of delayed scaling when necessary.
  • Intel Gaudi2 demonstrated training on the Stable Diffusion multi-modal model with 64 accelerators in 20.2 minutes, using BF16. In future MLPerf training benchmarks, Stable Diffusion performance will be submitted on the FP8 data type.
  • On eight Intel Gaudi2 accelerators, benchmark results were 13.27 and 15.92 minutes for BERT and ResNet-50, respectively, using BF16.
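
The E5M2 and E4M3 formats and the delayed-scaling option mentioned in the list above can be illustrated with a minimal Python sketch. It is not Intel's implementation. The helper `fp8_max` and the class `DelayedScaler` are hypothetical names introduced here, the history length is an arbitrary assumption, and because the text does not say which E4M3 variant is used, both the IEEE-style range and the finite-only range are computed.

```python
from collections import deque


def fp8_max(exp_bits, man_bits, ieee_like):
    """Largest finite value of a 1-sign / exp_bits / man_bits float format."""
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        # The top exponent code is reserved for inf/NaN.
        max_exp = (2 ** exp_bits - 2) - bias
        max_frac = 2.0 - 2.0 ** -man_bits
    else:
        # Finite-only variant: the top exponent is usable, and only the
        # all-ones mantissa at that exponent encodes NaN.
        max_exp = (2 ** exp_bits - 1) - bias
        max_frac = 2.0 - 2.0 ** (1 - man_bits)
    return max_frac * 2.0 ** max_exp


E5M2_MAX = fp8_max(5, 2, ieee_like=True)        # wide range, 2 mantissa bits
E4M3_IEEE_MAX = fp8_max(4, 3, ieee_like=True)   # narrower range, 3 mantissa bits
E4M3_FN_MAX = fp8_max(4, 3, ieee_like=False)    # finite-only E4M3 variant


class DelayedScaler:
    """Hypothetical helper: derives the scale from earlier steps' amax values."""

    def __init__(self, fmt_max, history_len=4):
        self.fmt_max = fmt_max
        self.history = deque(maxlen=history_len)

    def scale(self):
        # With no usable history, fall back to a neutral scale of 1.0.
        amax = max(self.history) if self.history else 0.0
        return self.fmt_max / amax if amax > 0 else 1.0

    def update(self, values):
        # Record this step's absolute maximum for use in later steps.
        self.history.append(max(abs(v) for v in values))


if __name__ == "__main__":
    print(E5M2_MAX, E4M3_IEEE_MAX, E4M3_FN_MAX)
    scaler = DelayedScaler(E4M3_FN_MAX)
    for step_values in ([0.5, -2.0], [1.0, -8.0], [0.25, 4.0]):
        s = scaler.scale()          # based on previous steps only
        scaler.update(step_values)
        print(f"scale used this step: {s}")
```

The point of the "delayed" part is that the scale applied at a given step comes from statistics gathered in earlier steps, so the current tensor does not have to be scanned before it is cast to FP8.
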
About the 4th Gen Xeon Results: Intel remains the only CPU vendor to submit MLPerf results. The MLPerf results for 4th Gen Xeon highlighted its strong performance:
  • Intel submitted results for ResNet50, RetinaNet, BERT and DLRM dcnv2. The 4th Gen Intel Xeon Scalable processors’ results for ResNet50, RetinaNet and BERT were similar to the strong out-of-box performance results submitted for the June 2023 MLPerf benchmark.
  • DLRM dcnv2 is a new model since June’s submission, with the CPU demonstrating a time-to-train of 227 minutes using only four nodes.
4th Gen Xeon processor performance demonstrates that many enterprise organizations can economically and sustainably train small to mid-sized deep learning models on their existing enterprise IT infrastructure with general-purpose CPUs, especially for use cases in which training is an intermittent workload.
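
To put the reported figures on a common footing, the short Python sketch below multiplies each time-to-train by the number of devices used. "Device-minutes" is an illustrative quantity introduced here, not an MLPerf metric, and the benchmarks differ too much for the totals to be compared directly. The sketch also derives the lower bound on the June GPT-3 time implied by the "more than half" claim, taking the rounded 2x figure at face value.

```python
# (benchmark, device count, minutes) as reported in the text above.
# For DLRM dcnv2 the "devices" are Xeon nodes, not accelerators.
RESULTS = [
    ("GPT-3 (Gaudi2, FP8)", 384, 153.58),
    ("Stable Diffusion (Gaudi2, BF16)", 64, 20.2),
    ("BERT (Gaudi2, BF16)", 8, 13.27),
    ("ResNet-50 (Gaudi2, BF16)", 8, 15.92),
    ("DLRM dcnv2 (Xeon nodes)", 4, 227.0),
]


def device_minutes(count, minutes):
    """Total device time consumed: devices multiplied by wall-clock minutes."""
    return count * minutes


def implied_previous_minimum(minutes_now, factor=2.0):
    """If time-to-train fell by more than 1/factor, the earlier run took
    longer than this many minutes."""
    return minutes_now * factor


if __name__ == "__main__":
    for name, count, minutes in RESULTS:
        print(f"{name}: {device_minutes(count, minutes):.2f} device-minutes")
    bound = implied_previous_minimum(153.58)
    print(f"June GPT-3 run implied to exceed {bound:.2f} minutes")
```
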

What’s Next: With software updates and optimizations, Intel anticipates further advances in AI performance results in forthcoming MLPerf benchmarks. Intel’s AI products provide customers with more choice of AI solutions to meet changing requirements for performance, efficiency and usability.

More Context: MLCommons Announcement

Oh, lawdy, it's Gaudi!

:winkt:
 
