Protein Language Modeling from Zero to Hero

This repository is a hands-on, step-by-step tutorial series inspired by Andrej Karpathy’s "Neural Networks: Zero to Hero" series, adapted to protein language modeling.

🚀 What is this project?

This project walks you through building protein language models from scratch, starting with a simple bigram model and working up to transformer architectures.
Along the way, you’ll learn how to apply reinforcement learning (RL) to fine-tune these models for protein-specific tasks.
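
To give a feel for the starting point, here is a minimal sketch of a character-level bigram model over amino-acid sequences. It is not taken from the notebooks: the toy sequences, the `^`/`$` boundary tokens, the add-one smoothing, and the function names `train_bigram` and `sample` are all assumptions made for illustration.

```python
import random

# Toy CDR-H3-like sequences, invented for illustration (not the repository's data).
SEQUENCES = ["ARDYYGSSYFDY", "ARGGYSSGWYFDV", "AKDLGYCSGGSCYFDY", "ARDRGYSYGYFDY"]

START, END = "^", "$"  # hypothetical boundary tokens


def train_bigram(sequences, smoothing=1.0):
    """Count residue-to-residue transitions and normalize each row to probabilities."""
    alphabet = sorted({aa for seq in sequences for aa in seq})
    prev_symbols = [START] + alphabet
    next_symbols = alphabet + [END]
    # Initialize every count with the smoothing constant so no transition has zero probability.
    counts = {p: {n: smoothing for n in next_symbols} for p in prev_symbols}
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: c / total for nxt, c in row.items()}
    return probs


def sample(probs, rng, max_len=30):
    """Generate one sequence by repeatedly sampling the next residue given the previous one."""
    out, prev = [], START
    while len(out) < max_len:
        symbols = list(probs[prev])
        weights = [probs[prev][s] for s in symbols]
        nxt = rng.choices(symbols, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


model = train_bigram(SEQUENCES)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample(model, rng) for _ in range(3)]
print(samples)
```

Each row of `model` is a probability distribution over the next residue, so the model "knows" only one residue of context. The later steps in the series can be read as ways of widening that context.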

📖 Stay Tuned!

This series is under active development.
Follow the repository, and join the journey from zero to hero in protein language modeling!

Inspired by Andrej Karpathy. Bridging AI and biology, one model at a time.

Files in this repository:

.gitignore
1a-ngram-for-cdrh3.ipynb
1b-slp-for-cdrh3.ipynb
2a-mlp-for-cdrh3.ipynb
2b-mlp-with-batchnorm-for-cdrh3.ipynb
README.md
esm_evaluate.py
oas_pair_heavy.csv
