Matuco19/overengineered-rounding

Matuco19's Overengineered Rounding AI (MOR-AI)

MOR AI (formerly Matuco19's Overengineered Rounding AI) is a PyTorch implementation that solves the complex problem of... rounding floating-point numbers ¯\_(ツ)_/¯

Introduction

Instead of utilizing archaic O(1) mathematical algorithms natively baked into CPUs (like Python's built-in round()), this project treats "rounding" as a highly complex Sequence-to-Sequence (Seq2Seq) Machine Translation task. It tokenizes numbers char-by-char and autoregressively generates the rounded number using an Encoder-Decoder Transformer architecture.

Warning

Disclaimer: This project was created as a satirical exercise in overengineering and should not be deployed in latency-critical enterprise environments.

Tip

You can try the project without training by downloading the latest checkpoint from the releases page.

Installation

Ensure you have Python 3.9+ installed, then set up the environment:

# Create a virtual environment
python3 -m venv .venv

# Activate the virtual environment
# Windows:
.\.venv\Scripts\activate
# Unix/MacOS:
# source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Tip

If you have the just command runner installed, you can use it to run the project instead of calling python directly.

just py {args}

Usage

MOR AI is fully controlled via the main.py CLI. By default, logs remain quiet (showing only progress bars for training). You can append -v or --verbose to any command to print detailed informational logs.

1. Generating Data

Generate a synthetic .jsonl dataset to train the model. The generator uses Curriculum Learning to progressively create harder examples (like edge cases x.499 vs x.501).

Available --level options:

  • easy: Far from the .5 boundary (x.1 or x.9), with few decimal places.
  • medium: Closer to the boundary (x.3 or x.7), with more decimal places.
  • hard: Very close to the .5 boundary (x.45 to x.55) to test precision and carrying capability.
  • extreme: Exact boundaries (x.5), zero, and very large or very long numbers.
  • mixed: A uniform distribution across all difficulty levels.

python3 main.py generate --samples 50000 --level mixed --verbose
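
As a rough illustration of how difficulty levels could map to generated examples, here is a minimal sketch. It is not the project's generator: the `input` and `target` field names, the digit counts per level, and the half-away-from-zero tie rule are all assumptions, and the `extreme` and `mixed` levels are left out for brevity.

```python
import json
import random
from decimal import Decimal, ROUND_HALF_UP


def fraction_digits(level, rng):
    """Pick fractional digits for a level (ranges are assumed, not the project's)."""
    if level == "easy":
        return rng.choice("19")                             # x.1 or x.9
    if level == "medium":
        return rng.choice("37") + f"{rng.randrange(100):02d}"  # x.3dd or x.7dd
    if level == "hard":
        return f"{rng.randint(45000, 55000):05d}"           # x.45000 .. x.55000
    raise ValueError(f"unsupported level: {level}")


def make_example(level, rng):
    """Build one training pair as strings; field names are hypothetical."""
    sign = rng.choice(["", "-"])
    text = f"{sign}{rng.randint(0, 999)}.{fraction_digits(level, rng)}"
    # Assumed tie rule: halves round away from zero.
    rounded = Decimal(text).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    target = "0" if rounded == 0 else str(rounded)  # avoid emitting "-0"
    return {"input": text, "target": target}


rng = random.Random(0)
jsonl_lines = [json.dumps(make_example(level, rng))
               for level in ("easy", "medium", "hard")]
```

Each entry of `jsonl_lines` is one line of a `.jsonl` file. Building the input from digit strings rather than floats keeps the text free of binary floating-point noise.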

2. Training the Model

Train the Transformer. The training loop automatically scans the data/ directory for generated .jsonl files and utilizes AdamW and CosineAnnealingLR to optimize the network. The best checkpoint is saved to checkpoints/best_model.pt. The training process can be resumed from the latest checkpoint by running the same command again.

To accelerate the process (especially on CPU), you can configure parallel data loading via --num_workers and the number of threads PyTorch uses for inter-op parallelism via --interop_threads.

python3 main.py train --epochs 50 --batch_size 128 --lr 0.001 --num_workers 4 --interop_threads 2 --verbose
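
The AdamW and CosineAnnealingLR pairing mentioned above can be sketched as follows. This is a minimal illustration, not the project's training loop: the `nn.Linear` stand-in model and the choice to step the scheduler once per epoch are assumptions.

```python
import torch
from torch import nn

EPOCHS = 50  # mirrors --epochs in the command above

model = nn.Linear(8, 8)  # stand-in for the real Transformer (assumption)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

learning_rates = []
for epoch in range(EPOCHS):
    learning_rates.append(scheduler.get_last_lr()[0])
    # ... forward pass and loss.backward() over the batches would go here ...
    optimizer.step()
    scheduler.step()  # one cosine step per epoch (assumed granularity)

final_lr = scheduler.get_last_lr()[0]
```

With `T_max` equal to the number of epochs, the learning rate starts at the `--lr` value and follows a half cosine down towards zero by the last epoch.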

Warning

The number of workers defaults to max(1, (os.cpu_count() or 1) // 2), for example 4 workers on an 8-core CPU, which is intended as a sensible value for CPU training. If you are using a GPU, you can set --num_workers to 0 to avoid unnecessary overhead.
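
The default above can be checked in isolation. The helper name `default_num_workers` is hypothetical; only the expression inside it comes from this README.

```python
import os


def default_num_workers(cpu_count=None):
    """Half the logical CPUs, but never fewer than one worker."""
    if cpu_count is None:
        cpu_count = os.cpu_count()  # may itself be None on some platforms
    return max(1, (cpu_count or 1) // 2)
```

The `or 1` guard covers platforms where `os.cpu_count()` returns `None`, and `max(1, ...)` keeps single-core machines from ending up with zero workers.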

3. Running Inference

Load the trained checkpoint and generate an autoregressive sequence of chars to "round" a user-provided string.

python3 main.py run --value 3.14159 --verbose
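
To make "autoregressive" concrete, here is a minimal sketch of a greedy decoding loop. It assumes a scoring callable in place of the trained Transformer; `toy_scores` is a hypothetical stand-in that simply knows the right answer for non-negative inputs, and is not how the model works internally.

```python
SOS, EOS = "<SOS>", "<EOS>"
MAX_SEQ_LEN = 32  # default max_seq_len mentioned in this README


def greedy_decode(next_token_scores, source, max_len=MAX_SEQ_LEN):
    """Emit one token at a time, always taking the highest-scoring candidate."""
    generated = [SOS]
    while len(generated) < max_len:
        scores = next_token_scores(source, generated)  # dict: token -> score
        best = max(scores, key=scores.get)
        if best == EOS:
            break
        generated.append(best)
    return "".join(generated[1:])  # drop the leading <SOS>


def toy_scores(source, generated):
    """Hypothetical stand-in for the model; only valid for non-negative input."""
    target = str(int(float(source) + 0.5))
    position = len(generated) - 1
    wanted = target[position] if position < len(target) else EOS
    candidates = list("0123456789.-") + [EOS]
    return {token: (1.0 if token == wanted else 0.0) for token in candidates}


rounded = greedy_decode(toy_scores, "3.14159")
```

The `max_len` bound matters in practice: a scorer that keeps preferring the same token never reaches `<EOS>`, so the loop stops only because the length cap is hit.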

Technical Specifications

Architecture

MOR AI employs an Encoder-Decoder Transformer (Seq2Seq) architecture built natively in PyTorch. The numbers are processed through a custom character-level vocabulary that covers the digits 0-9, '.', and '-', alongside special tokens such as <PAD>, <SOS>, <EOS>, and <UNK>. During the forward pass, learned character embeddings are combined with classic absolute Sinusoidal Positional Encodings. This is crucial for the model to understand sequence order, such as differentiating the tenths place from the hundredths place. The decoding process utilizes Greedy Autoregressive Search paired with a causal lookahead mask to generate the rounded output char-by-char.
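
The pieces described above can be sketched in plain Python. The token order in `VOCAB`, the choice to wrap inputs in `<SOS>` and `<EOS>`, and all function names are assumptions made for illustration; the project's actual implementation may differ.

```python
import math

SPECIALS = ["<PAD>", "<SOS>", "<EOS>", "<UNK>"]
VOCAB = SPECIALS + list("0123456789") + [".", "-"]  # assumed ordering
TOKEN_TO_ID = {token: index for index, token in enumerate(VOCAB)}


def encode(text):
    """Map each character to an id, falling back to <UNK> for anything else."""
    unk = TOKEN_TO_ID["<UNK>"]
    ids = [TOKEN_TO_ID.get(char, unk) for char in text]
    return [TOKEN_TO_ID["<SOS>"]] + ids + [TOKEN_TO_ID["<EOS>"]]


def sinusoidal_encoding(position, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] = cos(same)."""
    values = []
    for even_index in range(0, d_model, 2):
        angle = position / (10000 ** (even_index / d_model))
        values.append(math.sin(angle))
        values.append(math.cos(angle))
    return values[:d_model]


def causal_mask(size):
    """True marks a future position the decoder must not attend to."""
    return [[column > row for column in range(size)] for row in range(size)]


token_ids = encode("3.14")
```

Because every position gets a distinct encoding vector, the same digit embedding looks different to the model in the tenths place than in the hundredths place.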

Training Concepts

The training process is heavily guided by Curriculum Learning. The FloatCurrGenerator meticulously structures training examples by their proximity to the .5 decision boundary and their total sequence length, ensuring the model tackles progressively harder mathematical edge cases.

For the loss function, standard Cross Entropy is computed over the vocabulary dimension at each output sequence step. Notably, label smoothing is intentionally skipped, as the act of rounding demands exact, deterministic matches rather than the probabilistic distributions typical in natural language tasks. Furthermore, training applies gradient norm clipping to prevent exploding gradients, which is a common destabilizing issue in deep Transformer training.
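
A minimal sketch of the loss and clipping steps in PyTorch follows. The padding index, the `max_norm` value of 1.0, ignoring padded steps in the loss, and the linear stand-in for the decoder output layer are all assumptions, not values taken from the project.

```python
import torch
from torch import nn

PAD_ID = 0       # assumed index of <PAD>
VOCAB_SIZE = 16  # 4 special tokens + 10 digits + "." + "-"

torch.manual_seed(0)
head = nn.Linear(32, VOCAB_SIZE)  # stand-in for the decoder's output projection
hidden = torch.randn(2, 5, 32)    # (batch, sequence, features)
targets = torch.randint(1, VOCAB_SIZE, (2, 5))
targets[0, -1] = PAD_ID           # a padded step, skipped by the loss

# label_smoothing=0.0 is the default; it is spelled out to mirror the text.
criterion = nn.CrossEntropyLoss(ignore_index=PAD_ID, label_smoothing=0.0)
logits = head(hidden)
loss = criterion(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()

# Rescale all gradients in place so their combined L2 norm is at most max_norm.
norm_before = torch.nn.utils.clip_grad_norm_(head.parameters(), max_norm=1.0)
norm_after = torch.sqrt(sum(p.grad.pow(2).sum() for p in head.parameters()))
```

Flattening the batch and sequence dimensions lets a single cross-entropy call score every output step against the vocabulary.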

Limitations & Known Issues

This approach introduces significant Computational Inefficiency. Treating rounding as a translation task means a single inference call involves thousands of sequential matrix multiplications, making it perhaps the slowest possible way to round a number. Also, because the model treats numbers as language tokens, an under-trained network is prone to Hallucinations. It might loop the - token indefinitely, or decide that 99.9 should confidently round to 101. The model also suffers from a Fixed Sequence Length, bound by a max_seq_len (default 32 chars). Providing a float longer than this limit will result in truncated reasoning.

Finally, the architecture experiences a complete Lack of Inductive Bias. Transformers possess no inherent understanding of integer values or base-10 carrying algorithms (e.g., carrying over the 1 when rounding 9.5 to 10). The network must learn the abstract concepts of addition purely via self-attention mechanisms.
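
For contrast, the carrying procedure the network has to discover implicitly fits in a few lines when written out by hand. This sketch works on digit characters, handles only non-negative inputs, and assumes ties round up; the function name is hypothetical.

```python
def round_half_up_digits(text):
    """Round a non-negative decimal string to an integer string, digit by digit."""
    whole, _, fraction = text.partition(".")
    digits = [int(char) for char in whole] or [0]
    round_up = bool(fraction) and fraction[0] >= "5"
    position = len(digits) - 1
    while round_up and position >= 0:
        digits[position] += 1
        if digits[position] < 10:
            round_up = False          # no carry needed, stop here
        else:
            digits[position] = 0      # carry into the next digit to the left
            position -= 1
    if round_up:                      # carry ran past the leading digit
        digits.insert(0, 1)
    return "".join(str(digit) for digit in digits)


carried = round_half_up_digits("99.5")
```

The output can be one character longer than the integer part of the input, as with 99.5, which is part of what makes the task awkward for a character-level sequence model.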

License

This project is open-source and available under the MATCO-Open-Source License. See the LICENSE file for details.