Amalgam® is a programming language developed for genetic programming and instance-based machine learning, as well as for simulation, agent-based modeling, data storage and retrieval, the mathematics of probability theory and information theory, and game content and AI. The language format is somewhat LISP-like in that it uses a parenthesized list format with prefix notation and is designed for functional programming. There is a one-to-one mapping between the code and the corresponding parse tree. This one-to-one parse-tree design, coupled with the ability to sandbox execution, makes it well suited for LLMs.
Whereas virtually all practical programming languages are primarily designed for some combination of programmer productivity and computational performance, Amalgam prioritizes code matching and merging, as well as a deep equivalence of code and data. Amalgam uses entities to store code and data, with a rich query system to find entities by their labels, which are roughly equivalent to attributes in object-oriented programming. There is no distinction between class and instance; entities can be used as prototypes to be copied and modified. Though code and data are represented as trees from the root of each entity, graphs in code and data structures are permitted and are automatically flattened to code using special references. Amalgam supports genetic programming and code mixing by being weakly typed, and it attempts to find a way to execute code regardless of whether types match.
Amalgam takes inspiration from many programming languages, but the largest influences are LISP, Scheme, Haskell, Perl, Smalltalk, and Python. Despite being much like LISP, there is deliberately no macro system. This is to make sure that code is semantically similar whenever the code is similar, regardless of context. This makes it easy to find the difference between x and y as an executable patch, and then apply that patch to z as (call (difference x y) {_ z}), or semantically mix blocks of code a and b as (mix a b). Though Amalgam is optimized for functional programming, it is not a purely functional language. It has imperative and object-oriented capabilities.
Genetic programming can create arbitrary code, so there is always a chance that an evolved program consumes considerable CPU or memory resources, or may attempt to affect the system outside of the interpreter. For these reasons, there are many strict sandboxing aspects of the language with optional constraints on access, CPU, and memory. Amalgam has a rich permissions system, which controls what entities and code are able to do, whether writing to the console or executing system commands.
The Amalgam interpreter does not currently have rich support for linking C libraries into the language, but that is planned for future functionality.
Initial development on Amalgam began in 2011. It was first offered as a commercial product in 2014 at Hazardous Software Inc. and was open sourced in September 2023 by Howso Incorporated (formerly known as Diveplane Corporation, a company spun out of Hazardous Software Inc.).
When referencing the language: 'Amalgam', 'amalgam', 'amalgam-lang', and 'amalgam language' are used interchangeably with Amalgam being preferred. When referencing the interpreter: 'Amalgam interpreter', 'interpreter', 'Amalgam app', and 'Amalgam lib' are used interchangeably.
See the Amalgam beginner's guide to get started.
Full Amalgam language usage documentation is located in the Amalgam Language Reference.
Further examples can be found in the examples directory.
The primary file extensions consist of:
- .amlg - Amalgam script
- .mdam - Amalgam metadata, primarily just current random seed
- .caml - compressed Amalgam for fast storage and loading, that may contain many entities
Syntax highlighting is provided as a plugin for 2 major vendors:
Since debugging Amalgam code is only supported in VSCode, it is recommended that VSCode be the main IDE for writing and debugging Amalgam code.
Debugging Amalgam is supported through the VSCode Plugin.
The Amalgam interpreter is written to conform to the C++17 standard, so in principle it should compile on virtually any modern system.
An interpreter application and shared library (dll/so/dylib) are built for each release. A versioned tarball is created for each target platform in the build matrix:
| Platform | Variants 1 | Automated Testing | Notes |
|---|---|---|---|
| Windows amd64 | MT, ST, OMP, MT-NoAVX | ✔️ | |
| Linux amd64 | MT, ST, OMP, MT-NoAVX, ST-PGC, AFMI | ✔️ | ST-PGC and AFMI are for testing only, not packaged for release |
| Linux arm64: 8.2-a+simd+rcpc | MT, ST, OMP | ✔️ | Tested with qemu |
| Linux arm64: 8-a+simd | ST | ✔️ | Tested with qemu |
| macOS arm64: 8.4-a+simd | MT, ST, OMP | ✔️ | M1 and newer supported (amd64 NoAVX also tested w/ emulation) |
| WASM 64-bit | ST | ✔️ | Built on linux using emscripten, headless test with node:18 + jest |
- 1 Variant meanings:
  - MT
    - Multi-threaded
    - Binary postfix: '-mt'
    - Interpreter uses threads for parallel operations to increase throughput at the cost of some overhead
  - ST
    - Single-threaded
    - Binary postfix: '-st'
    - Interpreter does not use multiple threads for parallel operations
  - OMP
    - OpenMP
    - Binary postfix: '-omp'
    - Interpreter uses OpenMP threading internally to minimize latency of query operations
    - Can be built single-threaded or multi-threaded
  - MT-NoAVX
    - Multi-threaded but without AVX intrinsics
    - Binary postfix: '-mt-noavx'
    - Interpreter uses threads for parallel operations to increase throughput at the cost of some overhead but without AVX intrinsics
    - Useful in emulators, virtualized environments, and older hardware that lack AVX support
    - amd64 only, as AVX intrinsics are only applicable to variants of this architecture
  - ST-PGC
    - Single-threaded with pedantic garbage collection (PGC)
    - Binary postfix: '-st-pgc'
    - Interpreter does not use any threads and performs garbage collection at every operation
    - Very slow by nature, intended only to be used for verification during testing or debugging
  - AFMI
    - Stands for Amalgam fast memory integrity, which performs memory instrumentation to catch a variety of potential bugs
    - Incurs a small performance penalty, so it can be used in realistic workflows for debugging
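The postfixes listed above can be turned into a small lookup helper. The following is a minimal sketch, not part of the Amalgam tooling: the function name `amalgam_binary_for_variant` is hypothetical, the `amalgam` base name is taken from the command examples in this README, and AFMI is left out because no postfix is documented for it.

```shell
# Sketch: map a build variant name to the binary name implied by its postfix.
# The function name is hypothetical; the "amalgam" base name follows the
# examples in this README. AFMI is omitted (no documented postfix).
amalgam_binary_for_variant() {
  case "$1" in
    MT)       echo "amalgam-mt" ;;
    ST)       echo "amalgam-st" ;;
    OMP)      echo "amalgam-omp" ;;
    MT-NoAVX) echo "amalgam-mt-noavx" ;;
    ST-PGC)   echo "amalgam-st-pgc" ;;
    *)        echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

amalgam_binary_for_variant MT-NoAVX   # prints: amalgam-mt-noavx
```

Returning a non-zero status for unknown names lets a calling script stop early instead of trying to run a binary that does not exist.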
Pre-built binaries use CMake+Ninja for CI/CD. See PR workflow for automated build steps.
Though Amalgam is intended to support any C++17 compliant compiler, the current specific tool and OS versions used are:
- CMake 3.30
- Ninja 1.10
- Windows:
- Visual Studio 2026 v145
- Linux:
- Ubuntu 20.04, gcc-10
- macOS (Darwin):
- macOS 13, AppleClang 15.0
- WASM:
- Ubuntu 20.04, emscripten 3.1.67
Running the pre-built interpreter has specific runtime requirements per platform:
- 64-bit OS
- AVX2 intrinsics
  - A "no-avx" variant of the multi-threading binary is built on amd64 platforms for environments without AVX2 intrinsics.
  - See Dev/local Builds for compiling with other intrinsics (e.g., AVX512)
- Windows:
  - OS:
    - Microsoft Windows 10 or later
    - Microsoft Windows Server 2016 or later
  - Arch: amd64
- Linux:
  - glibc 2.31 or later
  - Arch: amd64 or arm64
    - Specific arm64 builds: armv8-a+simd and armv8.2-a+simd+rcpc
- macOS:
  - macOS 12 or higher
  - Arch: arm64
    - Specific arm64 builds: armv8.4-a+simd (M1 or later)
- WASM:
  - WASM support is still experimental.
  - No specific runtime requirements at this time
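On Linux amd64, the AVX2 requirement can be checked before choosing between the standard and "no-avx" multi-threaded binaries. The following is a minimal sketch under stated assumptions: the function name `pick_amalgam_mt_binary` is hypothetical, CPU feature flags are read from `/proc/cpuinfo` (Linux only), and the optional file argument exists purely to make the function easy to test.

```shell
# Sketch: pick the multi-threaded binary based on whether the CPU reports AVX2.
# Assumes Linux on amd64, where CPU feature flags are listed in /proc/cpuinfo.
# The optional argument (a cpuinfo-style file) is only there to ease testing.
pick_amalgam_mt_binary() {
  cpuinfo="${1:-/proc/cpuinfo}"
  if grep -qw 'avx2' "$cpuinfo" 2>/dev/null; then
    echo "amalgam-mt"        # AVX2 reported: use the standard binary
  else
    echo "amalgam-mt-noavx"  # no AVX2 found: fall back to the no-avx variant
  fi
}

echo "Suggested binary: $(pick_amalgam_mt_binary)"
```

The fallback only makes sense on amd64, since the no-avx variant is not built for arm64.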
Dev and local builds can be run using either a CLI or an IDE.
The workflow for building the interpreter on all platforms is fairly straightforward. CMake Presets are used to define all settings in the CMake presets file.
Example for release build on Linux amd64:
```shell
PRESET="amd64-release-linux"
cmake --preset $PRESET # configure/generate (./out/build)
cmake --build --preset $PRESET # build
cmake --build --preset $PRESET --target test # test
cmake --build --preset $PRESET --target install # install (./out/install)
cmake --build --preset $PRESET --target package # package (./out/package)
```
The above performs a local "build install". To install to a custom location, set an install prefix when configuring and then run the install target. Depending on permissions, admin access (an elevated prompt on Windows, sudo/su on Linux/macOS) might be needed:
```shell
cmake --preset $PRESET -DCMAKE_INSTALL_PREFIX="/path/to/install/location" # reconfigure with a custom prefix
cmake --build --preset $PRESET --target install # install to the custom prefix
```
Depending on the platform, not all tests will run successfully out of the box, especially when cross compiling. For those cases (e.g., arm64 on Mac M1 or AVX2/AVX512), the tests that are runnable on the specific platform can be included/excluded by running CTest directly (not through CMake, like above):
```shell
ctest --preset $PRESET --label-exclude 'advanced_intrinsics'
```
To see all available test labels:
```shell
ctest --preset $PRESET --print-labels
```
All CTest run options can be found on the CMake website.
Automation uses the CMake generated build system (ninja), but Visual Studio or VSCode are the best options for local development. A helper script open-in-vs.bat is provided to set up the CLI with VS build tools and open the IDE. On Windows, it can be used via:
- Default (no args) : Visual Studio solution (CMake generated, "amd64-windows-vs" preset)
open-in-vs.bat
- vs_cmake : Visual Studio directory (load from directory with CMake file)
open-in-vs.bat vs_cmake
- vscode : VSCode directory (load from directory with CMake file)
open-in-vs.bat vscode
- vs_static : Visual Studio solution (local static non-CMake generated: Amalgam.sln)
open-in-vs.bat vs_static
Note: on Windows, some issues have been found with using the CMake generated VS solutions and the native CMake support in Visual Studio and VSCode. If the developer experience is unstable, it is recommended that the vs_static build be used instead of the CMake generated build.
Some specific build customizations are important to note. These customizations can be altered in the main CMake file:
For debugging the C++ code, example launch files are provided in launch.json (VSCode) and launch.vs.json (Visual Studio) when opening a folder as a CMake project.
Remote debugging for linux is supported in both IDEs.
Unit tests can be run from an Amalgam executable with the --validate-amalgam parameter. No additional files are required, as the majority of the tests are run from the built-in documentation.
```shell
./amalgam-mt --validate-amalgam
```
Running the binary without any parameters yields a read-execute-print loop, as is typical of many interpreters. Help is available from within:
```shell
./amalgam-mt
```
Help for command line parameters can be obtained via the standard --help parameter:
```shell
./amalgam-mt --help
```
To run an Amalgam script:
```shell
./amalgam-mt test.amlg
```
If the Amalgam file begins with a shebang (#!) followed by the path to the executable binary (which, coincidentally, is also the syntax for a private label), the script can be run as:
```shell
./test.amlg
```
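Setting up such a script might look like the following. This is a minimal sketch under stated assumptions: the interpreter path `/usr/local/bin/amalgam-mt` and the file name `hello.amlg` are hypothetical, and the one-line program assumes that `print` accepts a string in this form. The script itself is not executed here, since that requires an installed interpreter.

```shell
# Sketch: create an executable Amalgam script that starts with a shebang.
# INTERPRETER is a hypothetical install path; point it at your own binary.
INTERPRETER="/usr/local/bin/amalgam-mt"
SCRIPT="${SCRIPT:-./hello.amlg}"

# First line: the shebang. Second line: a one-line Amalgam program
# (assumed here to be a valid use of print).
printf '#!%s\n(print "hello from Amalgam\\n")\n' "$INTERPRETER" > "$SCRIPT"

chmod +x "$SCRIPT"   # mark the script as executable
head -n 1 "$SCRIPT"  # show the shebang line that was written
# "$SCRIPT"          # uncomment once INTERPRETER points at a real binary
```

The shebang must be the very first line of the file and must contain the full path to the interpreter binary for the operating system to launch it.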