Pythia (Hugging Face)
EleutherAI’s public scaling suite: GPT-NeoX–architecture models from 70M to 12B parameters, all trained on the public Pile dataset in the same order across sizes, built for interpretability research.
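Each model exposes its intermediate checkpoints as Hub revisions. A minimal loading sketch, assuming the published `stepN` revision naming on the EleutherAI organization (any size/step pair works the same way):

```python
from transformers import AutoTokenizer, GPTNeoXForCausalLM

# Load the 70M deduped model at training step 3000; omit `revision`
# to get the final checkpoint from the default branch.
model = GPTNeoXForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m-deduped",
    revision="step3000",
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m-deduped")
```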
Why it is included
Pythia checkpoints remain heavily downloaded for `text-generation`, and the suite is the gold-standard baseline for mechanistic interpretability work.
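A quick sketch of that `text-generation` path via the Transformers pipeline API; the 410M size is an arbitrary choice here:

```python
from transformers import pipeline

# Greedy decoding keeps the smoke test deterministic.
generate = pipeline("text-generation", model="EleutherAI/pythia-410m")
result = generate("The Pile is", max_new_tokens=30, do_sample=False)
print(result[0]["generated_text"])
```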
Best for
Researchers studying training dynamics, memorization, and layer-wise behavior.
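One concrete shape such a study takes is tracking loss on a fixed probe string across training steps. A sketch, assuming the published checkpoint schedule (revisions every 1000 steps up to `step143000`, plus log-spaced early steps):

```python
import torch
from transformers import AutoTokenizer, GPTNeoXForCausalLM

repo = "EleutherAI/pythia-70m-deduped"
tok = AutoTokenizer.from_pretrained(repo)
inputs = tok("The capital of France is Paris.", return_tensors="pt")

# Loss on the probe at a few points in training; a drop suggests
# when the pattern was acquired (or memorized).
for step in ("step1000", "step16000", "step143000"):
    model = GPTNeoXForCausalLM.from_pretrained(repo, revision=step)
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    print(f"{step}: {loss.item():.3f}")
```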
Strengths
- Public intermediate checkpoints (154 per model, published as Hub revisions)
- Documented training data: the public Pile, with reproducible data order
- Reproducible scaling ladder: eight sizes trained under one consistent recipe
Limitations
- Not competitive with frontier chat models for production use
Good alternatives
OLMo · GPT-NeoX · TinyLlama
Related tools
GPT-NeoX
EleutherAI framework and 20B-class models for training large autoregressive LMs with 3D parallelism—Apache-2.0 training stack.
OLMo
Allen AI’s fully open LLM pipeline: weights, training code, data mixes, and evaluation; a research-transparency flagship.
Hugging Face Transformers
State-of-the-art pretrained models for PyTorch, TensorFlow, and JAX.
OPT (Hugging Face)
Meta’s Open Pretrained Transformer suite (125M–175B), released with detailed training logbooks; canonical Hub org `facebook`, models under `facebook/opt-*`.
Axolotl
YAML-configured fine-tuning for LLMs: LoRA, QLoRA, FSDP, and many architectures on top of Hugging Face trainers.
BLOOM
BigScience 176B multilingual causal LM—landmark collaborative open training effort on Jean Zay (weights under BigScience Responsible AI License).
