NPU

Source: YouTube

Snippet from Wikipedia: AI accelerator

An AI accelerator, deep learning processor or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.

AI accelerators such as neural processing units (NPUs) are used in mobile devices such as Apple iPhones and Huawei cellphones, and personal computers such as AMD laptops and Apple silicon Macs. Accelerators are used in cloud computing servers, including tensor processing units (TPU) in Google Cloud Platform and Trainium and Inferentia chips in Amazon Web Services. A number of vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design.

Graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware, and are commonly used as AI accelerators, both for training and inference.