Quantizing LLMs - How & Why (8-Bit, 4-Bit, GGUF & More)