kb777only/Onson

Onson & Onsonc

Onson is a compiled, class-based, imperative, event-driven programming language with Python-like indentation — featuring value-type structs, lists, GDScript-style events, scope-based concurrency, and optional raw-pointer memory access. Onsonc is its compiler: it transpiles Onson (.on) source to C++, then invokes a host C++ compiler (g++ or clang++) to produce a native executable.

  hello.on ──[lex]──► tokens ──[parse]──► AST ──[codegen]──► C++ ──[g++]──► ./hello

Onsonc itself is written in Python (so the project lives under a python/ directory in the original layout). A source-to-source transpiler is mostly string/AST manipulation, which Python does cleanly with zero build friction, and its own indentation-significant grammar mirrors Onson's.

Requirements: Python 3.10+ and a C++20 compiler (g++ or clang++).

Install

Install the onsonc command system-wide, then compile from anywhere:

./install.sh          # uses pipx if available, else pip --user
# or, equivalently:
pipx install .        # recommended (isolated)
pip install --user .  # alternative
make install          # same as pip --user

Then:

onsonc version
onsonc compile path/to/program.on --run

To build distributable artifacts (dist/*.whl, *.tar.gz): make release.

Run from the source tree (no install)

./onsonc-cli compile examples/hello.on --run
# or:  PYTHONPATH=. python3 -m onsonc compile examples/hello.on --run

Usage

# Compile to a native binary and run it:
onsonc compile examples/hello.on --run

# Transpile to C++ and stop (writes <name>.cpp + onson_runtime.hpp):
onsonc compile examples/demo.on --emit-cpp

# Build a binary at a chosen path:
onsonc compile examples/inheritance.on -o /tmp/animals

onsonc <file.on> is accepted as shorthand for onsonc compile <file.on>.

compile options

Option              Meaning
-o, --output PATH   Output executable path (default: source name without .on).
--run               Run the produced binary after building.
--emit-cpp          Transpile to C++ and stop.
--cxx NAME          C++ backend to use (default: g++, then clang++).
--std STD           C++ standard (default: c++20).
-O, --opt N         Optimization level (default: 2).
--keep              Keep the generated C++ build directory and print its path.
-q, --quiet         Suppress the compilation report.
-v, --verbose       Print the underlying C++ compiler command.
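
The two defaults in the table (backend order and output path) can be sketched in a few lines of Python. This is an illustration only, not the code in driver.py; the function names pick_backend and default_output are hypothetical.

```python
import shutil
from pathlib import Path


def pick_backend(preferred=None, which=shutil.which):
    """Return the first usable C++ compiler name.

    With no --cxx value, try g++ first and then clang++ (the documented default order).
    `which` is injectable so the lookup can be tested without a real compiler.
    """
    candidates = [preferred] if preferred else ["g++", "clang++"]
    for name in candidates:
        if which(name):
            return name
    raise RuntimeError(f"no C++ compiler found (tried: {', '.join(candidates)})")


def default_output(source):
    """Default executable path: the source path with its .on suffix removed."""
    return Path(source).with_suffix("")
```

For example, default_output("examples/hello.on") yields examples/hello, which matches the ./hello shown in the pipeline diagram.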

Compilation report

Every compile prints a report to stderr (so program output on stdout stays clean under --run). It separates the two phases — Onson→C++ and C++→binary — and gives a static estimate of the program's peak worker threads:

onsonc 0.3.0  |  concurrency.on
  frontend (.on -> C++)     0.77 ms   19 .on lines -> 24 C++ lines
  backend  (C++ -> bin)  3380.10 ms   g++ 16.1.1
  total                  3380.86 ms
  output                ./concurrency  (77.6 KB)
  structure             0 class(es), 0 method(s), 1 function(s)
  concurrency           3 delegate site(s); est. peak 4 thread(s) (1 main + 3 worker(s))

The thread estimate counts delegate(...) sites and computes the peak number of concurrently-live worker threads per scope (a var-bound delegate Task lives until its scope ends; a bare one is transient). It is intra-procedural; delegates inside loops are flagged since they may spawn more over time. Use -q to hide the report.
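
One way to picture the per-scope estimate is the small model below. It is a sketch under stated assumptions, not the algorithm in analysis.py: a scope is modeled as an ordered list of items, the strings "bound" and "bare" and the Scope class are hypothetical, and loops are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    # Each item is "bound" (var t = delegate(...)), "bare" (delegate(...)),
    # or a nested Scope. These labels are illustrative, not onsonc's AST.
    items: list = field(default_factory=list)


def peak_workers(scope):
    """Estimate the peak number of concurrently live worker threads in a scope."""
    live = 0   # var-bound tasks that stay alive until this scope ends
    peak = 0
    for item in scope.items:
        if item == "bound":
            live += 1
            peak = max(peak, live)
        elif item == "bare":
            # A bare delegate is transient: it overlaps the live tasks, then is gone.
            peak = max(peak, live + 1)
        else:
            # A nested scope adds its own peak on top of what is live here.
            peak = max(peak, live + peak_workers(item))
    return peak
```

Three var-bound delegates in one scope give a peak of 3 workers, which with the main thread corresponds to the "est. peak 4 thread(s)" line in the sample report.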

Language tour

Hello & basics

print("Hello from Onson!")

var name = "world"        # 'var' infers the type from the value
int x = 6                 # or declare the type first (type-first)
print("6 * 7 =", x * 7)
  • Indentation is significant; blocks open with : and an indent. # starts a comment.
  • Literals: integers, floats, "strings", true / false, null.
  • Types: int (64-bit), float, string, bool, void, Event, list_of_T, ptr_of_T, and class/struct types.
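
Significant indentation is commonly handled in the lexer by turning changes in leading whitespace into block tokens. The sketch below shows that general technique in Python; it is not the code in lexer.py, and the token names INDENT, DEDENT and LINE are assumptions made for illustration.

```python
def indent_tokens(source):
    """Convert leading whitespace into INDENT/DEDENT tokens, one LINE token per statement."""
    tokens = []
    stack = [0]  # indentation widths of the currently open blocks
    for raw in source.splitlines():
        # Naive comment strip: does not handle '#' inside string literals.
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue  # blank or comment-only lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise SyntaxError(f"inconsistent dedent: {raw!r}")
        tokens.append(("LINE", line.strip()))
    while len(stack) > 1:  # close any blocks still open at end of file
        stack.pop()
        tokens.append(("DEDENT", ""))
    return tokens
```

A parser can then treat INDENT and DEDENT like opening and closing braces.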

Variables

Declare with =; the type is either inferred (var) or written first, so an IDE/compiler can catch mismatches before runtime:

var count = 0            # inferred (int)
int n = 5                # explicit type
float ratio = 0.5
string label = "hi"
int blank               # typed, default-initialized (0)

Functions

Names are CamelCase (PascalCase), like C#. A function must declare its return type with returns Type (use returns void for none). Parameters are type-first.

func Classify(int n) returns string:
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

func Greet(string who) returns void:
    print("hi", who)

var i = 0
while i < 3:
    print(Classify(i - 1))
    i = i + 1

for k in range(0, 3):     # range(n) or range(start, end)
    print(k)

Compiling a non-CamelCase function name prints a warning suggesting the CamelCase form (the init constructor is exempt).
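
A naming check of this kind might look like the following Python sketch. It is a guess at the behavior described, not onsonc's implementation; the helper name camel_case_suggestion and the exact conversion rule for underscores are assumptions.

```python
import re


def camel_case_suggestion(name):
    """Return a CamelCase suggestion for a function name, or None if no warning is needed."""
    # `init` is exempt; names already in CamelCase need no suggestion.
    if name == "init" or re.fullmatch(r"[A-Z][A-Za-z0-9]*", name):
        return None
    parts = [part for part in re.split(r"_+", name) if part]
    return "".join(part[0].upper() + part[1:] for part in parts)
```

Under these assumptions, classify maps to Classify and sum_coords maps to SumCoords.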

Classes & inheritance (reference types)

class Animal:
    string name                    # type-first fields

    func init(string name):        # `init` is the constructor
        self.name = name

    func Speak() returns string:
        return "..."

class Dog inherits Animal:
    func init(string name):
        super(name)                # call the base constructor

    func Speak() returns string:   # overrides Animal.Speak (virtual dispatch)
        return "Woof"
  • Class instances are reference types (std::shared_ptr); members are accessed through self (for example, self.name).
  • Methods are virtual, so a base method calling self.Speak() runs the most-derived override.

Structs (value types)

A struct is like a class but a value type: copied on assignment, with no inheritance. Use it for plain data and predictable memory layout.

struct Point:
    int x
    int y

    func SumCoords() returns int:
        return self.x + self.y

var a = Point(1, 2)
var b = a            # an independent copy
b.x = 100            # does not affect 'a'
print(a.x, b.x)      # 1 100

Structs compose by value (a struct field inside another struct is stored inline) and can be list elements, as in list_of_Point.

Lists

A list type is list_of_T (→ std::vector<T>):

list_of_int nums = [10, 20, 30]
nums[1] = 99                       # index assignment (0-based)
append(nums, 40)                   # grow the list
print("len:", len(nums))           # 4

for n in nums:                     # iterate the elements
    print("  -", n)

func Sum(list_of_int xs) returns int:   # lists as params / return types
    var total = 0
    for x in xs:
        total = total + x
    return total
  • T may be any type, including another list: list_of_list_of_int grid = [[1, 2], [3]].
  • [] is the empty list (the element type comes from the declared type).
  • Works as a field, parameter, and return type.

Events (GDScript-style signals)

class Button:
    string label
    Event clicked

    func Press() returns void:
        self.clicked.emit()

func OnClick() returns void:
    print("clicked!")

var b = Button("OK")
var token = b.clicked.subscribe(OnClick)   # subscribe returns a token
b.Press()                                   # fires handlers
b.clicked.unsubscribe(token)                # remove by token

subscribe(handler) registers a handler (a func) and returns an int token; unsubscribe(token) removes it; emit() invokes every current handler.
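
The token-based contract described above can be modeled in a few lines of Python. This is a behavioral sketch only; the real onson::Event lives in the generated C++ runtime, and details such as token numbering and handler order are assumptions here.

```python
class Event:
    """Minimal model of a parameterless signal with token-based unsubscribe."""

    def __init__(self):
        self._handlers = {}    # token -> handler, in subscription order
        self._next_token = 0   # assumed: tokens are increasing ints

    def subscribe(self, handler):
        token = self._next_token
        self._next_token += 1
        self._handlers[token] = handler
        return token

    def unsubscribe(self, token):
        # Removing an unknown token is treated as a no-op in this sketch.
        self._handlers.pop(token, None)

    def emit(self):
        # Iterate over a copy so a handler may unsubscribe during emit.
        for handler in list(self._handlers.values()):
            handler()
```

Returning a token rather than comparing handlers keeps unsubscribe unambiguous when the same function is subscribed more than once.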

Concurrency

func Work(int id) returns void:
    print("worker " + str(id) + " done")

var a = delegate(Work(1))     # runs concurrently, returns a Task
var b = delegate(Work(2))
# a and b join automatically when their scope ends (RAII) — no async/await

delegate(Call()) runs Call() on a real OS thread (std::async). The returned Task joins in its destructor, so concurrency is bounded by scope — no async/await ceremony.
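
The scope-bounded behavior can be approximated in Python, with the caveat that Python has no destructors that run at scope exit, so a with block stands in for the Onson scope. The Scope class and its delegate method are hypothetical names; this is an analogy, not how the C++ runtime is written.

```python
import threading


class Scope:
    """Stand-in for an Onson block: tasks started inside are joined on exit."""

    def __enter__(self):
        self._threads = []
        return self

    def delegate(self, fn, *args):
        # Start the call on a real OS thread right away.
        thread = threading.Thread(target=fn, args=args)
        thread.start()
        self._threads.append(thread)
        return thread

    def __exit__(self, exc_type, exc, tb):
        # Join in reverse order, mirroring how C++ destroys locals.
        for thread in reversed(self._threads):
            thread.join()
        return False


results = []


def work(worker_id):
    results.append(f"worker {worker_id} done")


with Scope() as scope:
    a = scope.delegate(work, 1)
    b = scope.delegate(work, 2)
# Both workers have finished by this point; their relative order is not fixed.
```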

Pointers & memory

Addresses are plain numbers; a ptr_of_T is a dereferenceable typed address (→ C++ T*). The -> operator writes a value into a target.

int n = 10
ptr_of_int p = n.getAddress()    # &n, as a pointer
print(p.readAddress())           # 10    (*p)
print(p.isValidAddress())        # true  (non-null)
99 -> p                          # *p = 99  -> writes into n
print(n)                         # 99

# Walk memory with byte offsets; .memSize() is sizeof (in bytes).
ptr_of_int q = n.getAddress().memOffset(int.memSize())

Form                 On                                     Result
x.getAddress()       any variable                           its address (an int)
x.memSize()          a value, or a type (float.memSize())   byte size (sizeof)
addr.memOffset(n)    an address / pointer                   the address shifted by n bytes
p.readAddress()      a pointer                              the value at *p
p.isValidAddress()   a pointer                              true if non-null
value -> target                                             write value into target (through it, if a pointer)

-> writes through a pointer (*p = value); it does not change where the pointer points. To point a pointer somewhere, use = or a declaration. To reach a nearby address, take a variable's address with getAddress() and shift it with memOffset().
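
For readers more familiar with Python, the same operations have rough counterparts in the standard ctypes module. This is an analogy, not Onson code: c_longlong is chosen because Onson's int maps to long long, and the computed offset address is never dereferenced.

```python
import ctypes

n = ctypes.c_longlong(10)
p = ctypes.pointer(n)                       # like: ptr_of_int p = n.getAddress()

value_before = p[0]                         # like: p.readAddress()
is_valid = bool(p)                          # like: p.isValidAddress() (non-null check)

p[0] = 99                                   # like: 99 -> p   (writes into n)

size = ctypes.sizeof(ctypes.c_longlong)     # like: int.memSize()
shifted = ctypes.addressof(n) + size        # like: n.getAddress().memOffset(int.memSize())

null = ctypes.POINTER(ctypes.c_longlong)()  # a null pointer is falsy
```

As in Onson, the non-null check says nothing about whether the shifted address is safe to read.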

Memory model

No manual memory management. Class instances are reference-counted std::shared_ptr (RAII); structs, Task, Event, and lists are scope-bound values, freed automatically. The raw-pointer features above are an explicit opt-in for low-level work.

Built-ins

Builtin                  Meaning
print(a, b, ...)         Print arguments separated by spaces, then a newline.
str(x)                   Convert an int/float/bool/string to string.
len(x)                   Element count of a string or list (vs x.memSize(), the byte size).
append(list, value)      Append value to a list.
range(n) / range(a, b)   Iteration bounds for for.
delegate(Call())         Run Call() concurrently, returning a Task.

The pointer/memory methods (getAddress, readAddress, isValidAddress, memOffset, memSize) are listed in the table under Pointers & memory.

How transpilation works

Onson has no headers. The compiler auto-generates everything: it emits one C++ translation unit plus a bridge header, onson_runtime.hpp, that supplies onson::Event, onson::Task / onson::delegate, onson::str / len, and the address/pointer helpers. Type mapping:

Onson       C++
int         long long
float       double
string      std::string
bool        bool
Event       onson::Event
list_of_T   std::vector<T>
ptr_of_T    T*
struct T    T (value)
class T     std::shared_ptr<T>
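
Because list_of_T and ptr_of_T nest, the mapping is naturally recursive. The Python sketch below reproduces the table as a function; it is not the code in codegen.py, and to_cpp and its classes parameter are hypothetical.

```python
PRIMITIVES = {
    "int": "long long",
    "float": "double",
    "string": "std::string",
    "bool": "bool",
    "Event": "onson::Event",
}


def to_cpp(onson_type, classes=frozenset()):
    """Map an Onson type name to a C++ type, following the table above.

    `classes` holds the names declared with `class`; any other user-defined
    name is treated as a struct and emitted by value.
    """
    if onson_type.startswith("list_of_"):
        return f"std::vector<{to_cpp(onson_type[len('list_of_'):], classes)}>"
    if onson_type.startswith("ptr_of_"):
        return f"{to_cpp(onson_type[len('ptr_of_'):], classes)}*"
    if onson_type in PRIMITIVES:
        return PRIMITIVES[onson_type]
    if onson_type in classes:
        return f"std::shared_ptr<{onson_type}>"
    return onson_type  # struct: plain value type
```

For instance, list_of_list_of_int becomes std::vector<std::vector<long long>> under this mapping.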

Classes are forward-declared and their methods are defined out-of-line so any method can reference any class. Inspect the output for any program with --emit-cpp.

Project layout

onsonc/                 the compiler package
  lexer.py              indentation-aware tokenizer
  parser.py             recursive-descent parser -> AST
  ast_nodes.py          AST dataclasses
  codegen.py            AST -> C++ (incl. lightweight type inference)
  analysis.py           program stats + worker-thread estimate for the report
  runtime.py            the auto-emitted C++ runtime (single source of truth)
  driver.py             CLI (compile/version) + pipeline + compile report
  errors.py             diagnostics with file:line:col + caret
runtime/onson_runtime.hpp   reference copy of the emitted runtime header
examples/*.on           sample programs
tests/test_examples.py  end-to-end tests (compile, run, diff stdout)
benchmarks/             Onson vs Python performance comparison (run.py)
pyproject.toml          packaging (installs the `onsonc` command)
Makefile / install.sh   install / release helpers
onsonc-cli              run from the source tree without installing
LICENSE                 MIT

Running the tests

python3 tests/test_examples.py   # or: make test

Each example is compiled to a native binary, run, and its output diffed against a fixture (the two concurrent examples are compared order-independently).
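
An order-independent comparison can be as simple as comparing sorted lines. The helper below is a sketch of that idea, not the code in tests/test_examples.py; the name outputs_match and the ordered flag are assumptions.

```python
def outputs_match(actual, expected, ordered=True):
    """Compare program output with a fixture, line by line.

    With ordered=False the lines are compared as multisets, which suits
    concurrent programs whose workers may finish in any order.
    """
    actual_lines = actual.splitlines()
    expected_lines = expected.splitlines()
    if ordered:
        return actual_lines == expected_lines
    return sorted(actual_lines) == sorted(expected_lines)
```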

Benchmark: Onson vs Python

python3 benchmarks/run.py

Counts primes below N four ways — Onson (1 thread), Onson (4 delegate() workers), Python (1 thread), Python (4 threads) — and reports the speedups. On a 4-thread machine at N = 1,000,000:

  Onson  . 1 thread        0.682 s   (count 78498)
  Onson  . 4 workers       0.355 s
  Python . 1 thread        6.291 s   (count 78498)
  Python . 4 threads       7.853 s   (count 78498)

  Onson is    9.2x faster than Python (single-threaded)
  Onson parallel speedup:    1.92x  (4 delegate workers vs 1)
  Python thread speedup:     0.80x  (GIL: ~1x, no CPU gain)
  Onson 4 workers vs Python 4 threads:   22.2x faster

Two things stand out: compiled Onson is about 9× faster on the tight integer loop (9.2× in the run above), and delegate() gives real parallel scaling (1.92× with 4 workers) while Python's threads are serialized by the GIL. See benchmarks/README.md.
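
The Python side of such a comparison could be structured as below. This is a sketch, not the contents of benchmarks/run.py: the trial-division test and the even range split are assumptions about how the work is divided. On a GIL-enabled CPython build the threaded version is not expected to run faster than the single-threaded one.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n):
    """Trial division; deliberately simple so the loop is CPU-bound."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(lo, hi):
    """Count primes in the half-open range [lo, hi)."""
    return sum(1 for n in range(lo, hi) if is_prime(n))


def count_primes_parallel(limit, workers=4):
    """Split [0, limit) into roughly equal chunks, one per worker thread."""
    step = max(1, -(-limit // workers))  # ceiling division, at least 1
    chunks = [(start, min(start + step, limit)) for start in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: count_primes(*chunk), chunks))
```

Both functions return the same count for the same limit; only the wall-clock time differs.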

v0.3 scope / limitations

Intentionally out of scope for now:

  • Event handlers are free functions (no bound-method handlers yet) and signals are parameterless.
  • No generics, modules/imports, or collection types beyond string / list_of_T.
  • delegate(...) takes a single call expression and returns a void task.
  • isValidAddress() is a non-null check; it cannot validate an arbitrary address.

These are extension points rather than design limits — the pipeline (lexer → parser → codegen → C++ backend) is structured to grow.

License

Released under the MIT License.
