<aside>
<img src="/icons/table_gray.svg" alt="/icons/table_gray.svg" width="40px" />
Table of Contents
</aside>
Lecture 1: Quantize & De-Quantize a Tensor
Neural Network Quantization:
You can quantize:
- The Weights: NN Parameters
- The Activations
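Quantizing and de-quantizing a tensor can be sketched with linear (affine) quantization to int8, where a `scale` and `zero_point` map floats to integers and back. The function names below are illustrative, not from any particular library:

```python
import numpy as np

def linear_quantize(x, num_bits=8):
    """Affine (asymmetric) quantization of a float tensor to int8.

    Maps the range [x.min(), x.max()] onto [qmin, qmax] via a
    scale and an integer zero point.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def linear_dequantize(q, scale, zero_point):
    """Map int8 values back to (approximate) float values."""
    return scale * (q.astype(np.float32) - zero_point)
```

The round trip `linear_dequantize(*linear_quantize(x))` recovers `x` only approximately; each de-quantized value is off by at most one quantization step (the scale), which is the quantization error discussed below.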
<aside>
<img src="/icons/info-alternate_red.svg" alt="/icons/info-alternate_red.svg" width="40px" />
Quantizing a NN after it has been trained is called Post-Training Quantization (PTQ)
</aside>
Advantages of Quantization:
- Smaller Models
- Speed Gain:
    - Lower Memory Bandwidth
    - Faster Operations:
        - GEMM: General Matrix Multiply
        - GEMV: General Matrix-Vector Multiply
Challenges of Quantization:
- Quantization Error (the difference between the original values and the de-quantized results)
- Retraining
- Limited Hardware Support
- Calibration Dataset needed
- Packing/Unpacking
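The quantization error listed above can be measured directly by comparing a tensor with its quantize-then-de-quantize round trip. A minimal sketch, assuming symmetric per-tensor quantization (the helper name `quant_error` is hypothetical):

```python
import numpy as np

def quant_error(x, num_bits=8):
    """Mean squared error between x and its quantized round trip,
    using symmetric per-tensor quantization."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax        # symmetric: zero point is 0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    x_hat = q * scale                        # de-quantize
    return np.mean((x - x_hat) ** 2)
```

Comparing the error at different bit widths (e.g. 8-bit vs. 4-bit) shows why lower-precision formats need more care, such as calibration or retraining, to stay accurate.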