Exploring-transformers

A small, learning-focused repository for understanding Transformer internals by implementing them from scratch.

What’s inside

Byte Pair Encoding (BPE) tokenizer (train / encode / decode)

  • Currently a basic, notebook-level implementation. For real use it is probably better to stick with tiktoken.
  • To do: add and handle special tokens for training and using LLMs.
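
As a rough illustration of the train / encode / decode split, here is a minimal byte-level BPE sketch. It is not the notebook's code: the function names (`train_bpe`, `merge`, `encode`, `decode`) are hypothetical, it assumes merges start from the 256 raw byte values, and it ignores special tokens and any regex pre-splitting.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges; returns {(left_id, right_id): new_id}."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break  # fewer than two tokens left, nothing to merge
        pair = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + step  # ids 0..255 are reserved for raw bytes
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Apply the learned merges in training order (dicts keep insertion order)."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand token ids back to bytes, then to a string."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

Usage: `merges = train_bpe(corpus, 50)`, then `decode(encode(s, merges), merges)` should give back `s` for any valid UTF-8 string, since unmerged bytes simply stay as single-byte tokens.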

Rotary Positional Embeddings (RoPE) (complex-number implementation)
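
The complex-number view of RoPE can be sketched as follows: each adjacent pair of features is treated as one complex number and multiplied by a unit-length rotation whose angle grows with position. This is a stdlib-only sketch, not the repository's code; the names `rope_frequencies` and `apply_rope`, the adjacent-pair layout, and the base of 10000 are assumptions.

```python
import cmath


def rope_frequencies(head_dim, seq_len, base=10000.0):
    """freqs[pos][k] = exp(i * pos * theta_k), with theta_k = base ** (-2k / head_dim)."""
    assert head_dim % 2 == 0, "head_dim must be even to form complex pairs"
    thetas = [base ** (-2 * k / head_dim) for k in range(head_dim // 2)]
    return [[cmath.exp(1j * pos * t) for t in thetas] for pos in range(seq_len)]


def apply_rope(x, freqs_at_pos):
    """Rotate one vector `x` (length head_dim) using the rotations for its position."""
    out = []
    for k, rot in enumerate(freqs_at_pos):
        # View features (2k, 2k+1) as one complex number and rotate it.
        z = complex(x[2 * k], x[2 * k + 1]) * rot
        out.extend([z.real, z.imag])
    return out
```

Because every rotation has unit modulus, vector norms are preserved, and the dot product between a rotated query at position m and a rotated key at position n depends only on the offset m - n.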

Attention implementation

  • Trying to implement functional code from the paper, including GPT-2/Llama-style decoder-only LLMs.
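
For orientation, the core of a decoder-only model is causal scaled dot-product attention: each position scores itself against earlier positions only, normalizes the scores with a softmax, and takes a weighted sum of the values. The sketch below is a single-head, stdlib-only illustration with hypothetical names (`softmax`, `causal_attention`); it leaves out projections, multiple heads, and batching, and is not taken from the repository.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """q, k, v: lists of seq_len vectors (lists of floats). Returns seq_len output vectors."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Causal mask: position i only sees positions 0..i.
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        out.append([
            sum(weights[j] * v[j][c] for j in range(i + 1))
            for c in range(len(v[0]))
        ])
    return out
```

The explicit loops trade speed for readability; a tensor version would compute the full score matrix at once and mask the upper triangle before the softmax.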

Next steps: finish the model, then experiment with training, evaluation, and fine-tuning pipelines/scripts.

Purpose

This repo prioritizes clarity over optimization. It is meant for exploration, experimentation, and mapping theory → code — not for production use.