xberg-io/tree-sitter-language-pack


tree-sitter-language-pack


A comprehensive collection of tree-sitter language parsers with polyglot bindings — 306 languages, one Rust core, downloaded on demand.

What and Why?

tree-sitter generates fast, incremental parsers from grammars for any programming language. As agentic tooling makes processing, inspecting, and analyzing code ever more critical, tree-sitter-language-pack bundles the most comprehensive collection of grammars available behind a single API.

The core is written in Rust with polyglot bindings for 15 other languages, plus a CLI and MCP server for working with code from the shell. Grammars are built into multi-platform binaries and downloaded on demand, so the package stays small while offering 300+ parsers.

Features

| Feature | Description |
| --- | --- |
| 306 languages | Pre-compiled parsers at ABI 14 (backwards compatible with tree-sitter 0.21–0.26) |
| Code intelligence | Extract functions, classes, imports, docstrings, and symbols from source |
| Data extraction | Hierarchical key-value trees from 17 config/data formats (JSON, YAML, TOML, XML, CSV, …) |
| Host-native language API | get_language() returns native Language objects in Python, Node.js, Go, Java, C#, Kotlin, Swift, Zig, and C |
| On-demand downloads | Parsers are fetched on first use and cached locally for fast, offline reuse |
| Selective installation | Download only the languages you need; unused parsers are never downloaded |
| Polyglot bindings | Native bindings across 15 languages plus a C ABI for everything else |
| CLI & MCP server | ts-pack download to pre-fetch parsers; MCP integration for AI agents |
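
The on-demand download behaviour in the table can be pictured as a fetch-once cache. The sketch below is not the library's implementation. `ParserCache`, the `.bin` file suffix, and the injected `fetch` callable are hypothetical names chosen only to illustrate the pattern of "download on first use, reuse from disk afterwards".

```python
from pathlib import Path
from typing import Callable


class ParserCache:
    """Fetch-on-first-use cache (illustrative only, not the library's code)."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]) -> None:
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical downloader: language name -> parser bytes

    def path_for(self, language: str) -> Path:
        # Hypothetical on-disk layout: one file per language.
        return self.cache_dir / f"{language}.bin"

    def get(self, language: str) -> Path:
        target = self.path_for(language)
        if target.exists():
            return target  # cache hit: no download, works offline
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so a failed download
        # never leaves a truncated file at the final path.
        tmp = target.with_name(target.name + ".part")
        tmp.write_bytes(self.fetch(language))
        tmp.replace(target)
        return target
```

Because languages that are never requested are never passed to `fetch`, the same structure also gives the selective-installation behaviour described in the table.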

Supported Languages

This pack includes 306 languages. See the full language list for every supported grammar with extensions and repository links.

⭐ Star this repo to show your support — it helps others discover tree-sitter-language-pack.

Quick Start

Language Packages

Rust
cargo add tree-sitter-language-pack

See Rust README for full documentation.

Python
pip install tree-sitter-language-pack

See Python README for full documentation.
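
As a rough illustration of the host-native API in Python, the sketch below asks for a language by name and hands it to a parser. Only `get_language()` is named in this README. The import path `tree_sitter_language_pack`, the use of py-tree-sitter's `Parser(language)` constructor (available in recent py-tree-sitter releases), and the helper name `top_level_node_types` are assumptions of this example, so check the Python README for the authoritative usage.

```python
def top_level_node_types(source: str, language_name: str = "python"):
    """Return the node types of the root's direct children.

    Returns None when the required packages are not installed.
    """
    try:
        # get_language() is named in the feature table; pairing it with
        # py-tree-sitter's Parser is an assumption of this sketch.
        from tree_sitter import Parser
        from tree_sitter_language_pack import get_language
    except ImportError:
        return None

    parser = Parser(get_language(language_name))
    tree = parser.parse(source.encode("utf-8"))
    return [child.type for child in tree.root_node.children]


if __name__ == "__main__":
    print(top_level_node_types("def greet():\n    return 'hi'\n"))
```

The first call for a given language may trigger a parser download, as described under Features.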

Node.js
npm install @kreuzberg/tree-sitter-language-pack

See Node.js README for full documentation.

Go
go get github.com/xberg-io/tree-sitter-language-pack/packages/go

See Go README for full documentation.

Java

Available on Maven Central as dev.kreuzberg.treesitterlanguagepack:tree-sitter-language-pack. See Java README for the dependency snippet and current version.

C#
dotnet add package TreeSitterLanguagePack

See .NET README for full documentation.

Ruby
gem install tree_sitter_language_pack

See Ruby README for full documentation.

PHP
composer require xberg-io/tree-sitter-language-pack

See PHP README for full documentation.

Elixir

Add {:tree_sitter_language_pack, "~> 1.0"} to your mix.exs dependencies. See Elixir README for full documentation.

WebAssembly
npm install @kreuzberg/tree-sitter-language-pack-wasm

See WebAssembly README for full documentation.

C/C++ (FFI)

Build from source as part of this workspace. See FFI README for full documentation.

CLI
cargo install ts-pack-cli
brew install xberg-io/tap/ts-pack

Or run without a persistent install (the proxy package fetches the native binary):

npx @kreuzberg/ts-pack-cli parse <file>
uvx --from ts-pack-cli ts-pack parse <file>

See CLI README for full documentation.
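
When scripting around the CLI, the install options above can be folded into one helper that picks whichever runner is available. This is a minimal sketch: the function name `ts_pack_parse_command` is hypothetical, the command lines are copied from this section, and nothing is assumed about the output format of `ts-pack parse`.

```python
import shutil
from typing import Callable, List, Optional


def ts_pack_parse_command(
    file_path: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build an argv list for `ts-pack parse <file>` using the first available runner."""
    if which("ts-pack"):
        # Installed via cargo or Homebrew.
        return ["ts-pack", "parse", file_path]
    if which("npx"):
        # No persistent install: the npm proxy package fetches the binary.
        return ["npx", "@kreuzberg/ts-pack-cli", "parse", file_path]
    if which("uvx"):
        return ["uvx", "--from", "ts-pack-cli", "ts-pack", "parse", file_path]
    raise FileNotFoundError("none of ts-pack, npx, or uvx was found on PATH")
```

A caller would pass the result to `subprocess.run(..., check=True)`. The `which` parameter exists only so the lookup can be replaced in tests.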

MCP Server

The CLI bundles an MCP server for integration with AI agents. Start it with:

ts-pack mcp

The server runs over stdio by default. For HTTP transport:

ts-pack mcp --transport http --host 127.0.0.1 --port 8011

Register with Claude Code:

claude mcp add ts-pack -- ts-pack mcp

Or add it to your Claude Desktop config. On macOS the file is ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "ts-pack": {
      "command": "ts-pack",
      "args": ["mcp"]
    }
  }
}
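
If the config file already lists other MCP servers, pasting the snippet over it would drop them. The sketch below merges the same `ts-pack` entry into an existing file instead. The helper name `register_ts_pack` is hypothetical, and the caller supplies the config path.

```python
import json
from pathlib import Path


def register_ts_pack(config_path: Path) -> dict:
    """Add the ts-pack entry to an MCP client config, keeping other servers intact."""
    config = {}
    if config_path.exists():
        # Treat an empty file the same as a missing one.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")

    servers = config.setdefault("mcpServers", {})
    servers["ts-pack"] = {"command": "ts-pack", "args": ["mcp"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Running it twice is harmless: the second call rewrites the same entry.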

The MCP server exposes 8 tools: parse, process, detect_language, list_languages, info, download, cache_dir, and clean_cache. It also provides resources for the available language catalog and a prompt for code analysis.
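
MCP clients talk to a stdio server with JSON-RPC 2.0 messages, one JSON object per line. The sketch below only builds such messages. It does not start the server, and a real session must first complete the protocol's `initialize` handshake. The helper names are hypothetical, and the assumption that `list_languages` accepts an empty argument object is not confirmed by this README.

```python
import json
from typing import Any, Dict, Optional


def jsonrpc_request(
    request_id: int, method: str, params: Optional[Dict[str, Any]] = None
) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def call_tool(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Build an MCP tools/call request for the named tool."""
    return jsonrpc_request(
        request_id, "tools/call", {"name": name, "arguments": arguments}
    )


# Ask the server which tools it offers.
list_tools = jsonrpc_request(1, "tools/list")

# list_languages is one of the tools named above; the empty
# arguments object is an assumption of this sketch.
list_languages = call_tool(2, "list_languages", {})
```

In practice an MCP client library or one of the agents below handles this exchange; the sketch is only meant to show what travels over the transport.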

The marketplace plugin from xberg-io/plugins registers the server automatically. To use it instead of registering manually, see AI Coding Assistants below.

For detailed setup, transport options, and tool reference, see the MCP Server guide.

AI Coding Assistants

Install the tree-sitter-language-pack plugin from the xberg-io/plugins marketplace. It ships the tree-sitter-language-pack agent skills (parsing and extracting code intelligence from 300+ languages) and works with the coding agents listed below. Follow the instructions for your harness.

Claude Code
/plugin marketplace add xberg-io/plugins
/plugin install tree-sitter-language-pack@kreuzberg

Codex CLI
/plugins add https://github.com/xberg-io/plugins

Then search for tree-sitter-language-pack and select Install Plugin.

Cursor

Settings → Plugins → Add from URL → https://github.com/xberg-io/plugins, then select tree-sitter-language-pack.

Gemini CLI
gemini extensions install https://github.com/xberg-io/plugins

Factory Droid
droid plugin marketplace add https://github.com/xberg-io/plugins
droid plugin install tree-sitter-language-pack@kreuzberg

GitHub Copilot CLI
copilot plugin marketplace add https://github.com/xberg-io/plugins
copilot plugin install tree-sitter-language-pack@kreuzberg

opencode

Add the package to opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "plugin": ["@kreuzberg/opencode-tree-sitter-language-pack"]
}

Documentation

Full guides, the host-native language API, data extraction, the CLI and MCP server, and the complete language list live at docs.tree-sitter-language-pack.xberg.io.

Part of Kreuzberg.dev

  • Kreuzberg — document intelligence: text, tables, metadata from 91+ formats with optional OCR.
  • Xberg Enterprise — managed extraction API with SDKs, dashboards, and observability.
  • crawlberg — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
  • html-to-markdown — fast, lossless HTML→Markdown engine.
  • liter-llm — universal LLM API client with native bindings for 14 languages and 143 providers.
  • tree-sitter-language-pack — tree-sitter grammars and code-intelligence primitives.
  • alef — the polyglot binding generator that produces every per-language binding across the 5 polyglot repos.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Join our Discord community for questions and discussion.

License

MIT — see LICENSE for details.

All included tree-sitter grammars are permissively licensed (MIT, Apache-2.0, BSD, ISC, or similar). Copyleft licenses (GPL, AGPL, LGPL, MPL) are not accepted. See CONTRIBUTING.md for grammar inclusion criteria.
