FlashGRPO

sidequests to make grpo faster

Replicating The Circuit Kings

replicating ‘circuit tracing: revealing computational graphs in language models’ by the absolute beasts over at anthropic

Multi Query Low Rank Attention

a bit of stream of consciousness poasting

Constitutional Mech

Legal Mech Interp

Verify Verify Verify

verify the unverifiable

Why So Hard (Negative) On Your Self (Reinforcement)?

Exploring hard negative mining with BM25, self-selection, bandits, and FAISS

LegalBench

legalbench, a private legal benchmark

The Giving (Search) Tree

we do a little tree searchin’

Explosions in the Sky - A (not so very) Deep Dive into the World of Explosions FP16 Space

why you explodin’ fam

Generate Synthetic Data for Computer Vision Projects in Blender

freemium is better than premium