Training LLM to play chess using Deepseek GRPO reinforcement learning