
Exploring-transformers

A small, learning-focused repository for understanding Transformer internals by implementing them from scratch.

What’s inside

Byte Pair Encoding (BPE) tokenizer (train / encode / decode)

Currently a basic, notebook-level implementation. For real use it is probably better to stick with tiktoken.
To do: add and handle special tokens for training and using LLMs.
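
A minimal sketch of the train / encode / decode loop listed above, using only the standard library. It is not the notebook code from this repo: the function names (`train_bpe`, `merge`, `encode`, `decode`) are illustrative assumptions, it works on raw UTF-8 bytes with no pre-tokenization, and special tokens are not handled.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges; returns {(left_id, right_id): new_id}."""
    ids = list(text.encode("utf-8"))  # start from the 256 byte values
    merges = {}
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent pairs
        if not pairs:
            break  # fewer than two tokens left, nothing to merge
        best = max(pairs, key=pairs.get)  # most frequent pair
        new_id = 256 + step  # new ids start after the byte range
        merges[best] = new_id
        ids = merge(ids, best, new_id)
    return merges


def encode(text, merges):
    """Encode text by replaying the merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts preserve insertion order
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Map token ids back to bytes, then to text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

For example, `decode(encode(text, merges), merges)` should return `text` unchanged for any merges produced by `train_bpe`, and the encoded sequence is never longer than the raw byte sequence.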

Rotary Positional Embeddings (RoPE) (complex-number implementation)

Implementation based on the Llama code. Source: https://github.com/meta-llama/llama3/blob/main/llama/model.py
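
A minimal sketch of the complex-number formulation, written with Python's built-in `complex` type instead of the PyTorch tensors used in the linked Llama code. The function names mirror the ones in that file, but the signatures are simplified for illustration: this version handles a single head as a list of per-position vectors, and `theta=10000.0` is an assumed default.

```python
import cmath


def precompute_freqs_cis(head_dim, seq_len, theta=10000.0):
    """Return one unit complex number per (position, dimension pair)."""
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # One rotation frequency per pair of dimensions.
    freqs = [1.0 / (theta ** (i / head_dim)) for i in range(0, head_dim, 2)]
    # e^{i * pos * freq}: a rotation whose angle grows linearly with position.
    return [[cmath.rect(1.0, pos * f) for f in freqs] for pos in range(seq_len)]


def apply_rotary_emb(x, freqs_cis):
    """Rotate each position's vector; `x` is a list of seq_len vectors of floats."""
    out = []
    for vec, rotations in zip(x, freqs_cis):
        # View adjacent pairs (x0, x1), (x2, x3), ... as complex numbers.
        pairs = [complex(vec[i], vec[i + 1]) for i in range(0, len(vec), 2)]
        # Complex multiplication by a unit number is a 2D rotation.
        rotated = [p * r for p, r in zip(pairs, rotations)]
        # Flatten back to interleaved real values.
        out.append([part for z in rotated for part in (z.real, z.imag)])
    return out
```

Two properties are worth checking when experimenting with this: position 0 leaves a vector unchanged, and the dot product between a rotated query at position m and a rotated key at position n depends only on m - n, which is what makes the embedding relative.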

Attention implementation

Trying to implement the functions from the paper as code, including a GPT-2/Llama-style decoder-only LLM.

Currently finishing the model. Next: play around with training, evaluation, and fine-tuning pipelines/scripts.
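
A minimal sketch of the attention step for a decoder-only model, assuming the standard scaled dot-product form softmax(QK^T / sqrt(d_k))V with a causal mask. It covers a single head in pure Python with no learned projections; the function names `softmax` and `causal_attention` are illustrative and not taken from this repo.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head attention; q, k, v are lists of seq_len vectors.

    Position i only attends to positions 0..i, which is the causal mask
    used by decoder-only models.
    """
    d_k = len(q[0])
    scale = 1.0 / math.sqrt(d_k)
    out = []
    for i, query in enumerate(q):
        # Scores against keys at positions 0..i (later positions are masked out).
        scores = [
            scale * sum(a * b for a, b in zip(query, k[j]))
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out.append([
            sum(weights[j] * v[j][c] for j in range(i + 1))
            for c in range(len(v[0]))
        ])
    return out
```

A quick sanity check: with all-zero queries every visible position gets the same weight, so output row i is the mean of `v[0]` through `v[i]`, and the first row always equals `v[0]`.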

Purpose

This repo prioritizes clarity over optimization. It is meant for exploration, experimentation, and mapping theory → code — not for production use.