21:48

Qwen2.5-VL Technical Report—No Blinks Allowed! (Paper Walkthrough)

11:09

Large Language Diffusion Models (Paper Walkthrough)

18:19

DeepSeek new paper—Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

12:42

Beyond Next-Token Guessing: LLM Pretraining with Continuous Concepts (Paper Walkthrough)

14:08

Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance

16:12

On the Geometry of Deep Learning (Inspired by MLST's latest: Neural Nets Are Elastic Origami)

13:27

Distill or Drill? Distillation Scaling Laws (Paper Walkthrough)

09:59

Harnessing Language's Fractal Geometry with Recursive Inference Scaling (Paper Walkthrough)

13:08

Deep Networks Always Grok & Here is Why (Inspired by MLST's latest: Neural Nets Are Elastic Origami)

15:12

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling (Paper Walkthru)

30:32

Goku: Flow Based Video Generative Foundation Models (Paper Walkthrough)

21:37

Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 (Paper Walkthrough)

06:57

Feb 7, 2025 🚗New AI Papers Drive-Thru: LLM, IMO geometry, TD learning, VectorQ, omni-modal, Robot

13:24

Vision-Language Model Dialog Games for Self-Improvement | From Google DeepMind (Paper Walkthrough)

04:44

Feb 6, 2025 🚗New AI Papers Drive-Thru: VLM, InverseRL, LLMfast, envAI, LiDAR NeRF, fedAdam, DataFree

16:01

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

46:54

Feb 5, 2025 🚗New AI Papers Drive-Thru: RL, Diffusion, Transformer, Contrastive Learning, Adversarial

11:07

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation (Paper Walkthru)

42:41

Feb 4, 2025 🚗New AI Papers Drive-Thru: Alignment, diffusion, PINN, GNN, medical image, data synergy

14:59

OmniHuman-1: Now Your AI Avatar Can Dance, Sing, and Wave! (Paper Walkthrough)

25:48

Feb 3, 2025 🚗New AI Papers Drive-Thru: LLM, Diffusion, GNN, RL, Low-Rank, Edge, Meta-Learning & More

14:01

Wait, Think Again!—Simple test-time scaling (Paper Walkthrough)

12:10

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (PaperWalkthru)

12:59

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

16:58

Towards General-Purpose Model-Free Reinforcement Learning | ICLR 2025 (Paper Walkthrough)

25:04

DeepSeek-Prover-V1.5: Theorem proofs? Cracked. Next!🎲

11:16

DeepSeek's Multimodal Launch—Janus-Pro Tech Report (Paper Walkthrough)

10:33

Chain-of-Retrieval Augmented Generation (Paper Walkthrough)

28:08

UI-TARS: Pioneering Automated GUI Interaction with Native Agents (Paper Walkthrough)

27:16

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

18:42

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training (Paper Walkthrough)

14:01

Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

13:08

Evolving Deeper LLM Thinking (Paper Walkthrough)

18:27

Kimi K1.5 Technical Report: Scaling Reinforcement Learning with LLMs (Paper Walkthrough)

19:22

DeepSeek-R1: Open Source Challenging OpenAI o1 (Paper Walkthrough)

26:18

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps (Paper Walkthrough)

18:56

MiniCPM-V: A GPT-4V Level MLLM on Your Phone (Paper Walkthrough)

10:37

MangaNinja: Line Art Colorization with Precise Reference Following (Paper Walkthrough)

29:11

MiniMax-01: Scaling Foundation Models with Lightning Attention (Paper Walkthrough)

00:39

Minimax's sweet PDF-on-the-side feature makes paper reading and live QA so much easier!🍭

11:51

OmniManip: Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

18:32

Behind Kokoro TTS: StyleTTS 2 through Style Diffusion and Adversarial Training (Paper Walkthrough)

44:01

NVIDIA's Cosmos World Foundation Model Platform for Physical AI (Paper Walkthrough)

17:44

The GAN is dead; long live the GAN! A Modern GAN Baseline (Paper Walkthrough)

20:11

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking (Paper Walkthrough)

15:07

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images & Videos (Paper Walkthru)

15:09

ProTracker: Probabilistic Integration for Robust and Accurate Point Tracking (Paper Walkthrough)

09:04

OpenAI o1 Stumbles on Putnam: A True Test of Reasoning! (Paper Walkthrough)

17:13

ICLR: In-Context Learning of Representations (Paper Walkthrough)

22:59

4.5m (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Scams, and Malware

15:16

ICONS: Influence Consensus for Vision-Language Data Selection (Paper Walkthrough)

22:32

4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives (Paper Walkthrough)

14:16

Trellis img-to-3d -- Structured 3D Latents for Scalable and Versatile 3D Generation (Paper Walkthru)

20:06

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs (Paper Walkthrough)

15:23

LLMs Hallucinate? Improving Factuality with Explicit Working Memory (Paper Walkthrough)

38:49

DeepSeek-V3 Technical Report Walkthrough

15:48

Navigation World Models | Latest from Yann LeCun's Team (Paper Walkthrough)

14:45

Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps (Paper Walkthru)

14:47

The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion (Paper Walkthru)

14:58

Guiding a Diffusion Model with a Bad Version of Itself -- NeurIPS2024 Runner-Up (Paper Walkthrough)