ml › section 11 of 14

Mixture of Experts

Sparse activation — the next axis of scale

4 lessons · 1 medium, 3 hard