Similar Tracks
Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training
Umar Jamil
LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch
Umar Jamil
Coding a Transformer from scratch on PyTorch, with full explanation, training and inference.
Umar Jamil
LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Umar Jamil