Similar Tracks
LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch
Umar Jamil
Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math
Umar Jamil
Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer
Umar Jamil
LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Umar Jamil
Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)
Umar Jamil