Similar Tracks
Attention is all you need (Transformer) - Model explanation (including math), Inference and Training (Umar Jamil)
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation (Umar Jamil)
[Hindi] Coding a Transformer from scratch on Pytorch, with full explanation and training. (KNOWLEDGE DOCTOR)
NLP Demystified 15: Transformers From Scratch + Pre-training and Transfer Learning With BERT/GPT (Future Mojo)
LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch (Umar Jamil)
LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU (Umar Jamil)