rapid_textrank

High-performance TextRank keyword extraction in Rust with Python bindings.

Extract keywords and key phrases from text roughly 10-100x faster than pure-Python implementations, with 7 core algorithm variants, an AutoRank ensemble, and stopword support for 18 languages. Computation runs in Rust, and the Python GIL is released during extraction.

Install

pip install rapid_textrank

Optional extras: pip install "rapid_textrank[spacy]" for spaCy tokenization, pip install "rapid_textrank[topic]" for gensim LDA utilities. The quotes keep shells such as zsh from treating the square brackets as a glob pattern.

Quick Start

from rapid_textrank import extract_keywords

text = """
Machine learning is a subset of artificial intelligence that enables
systems to learn and improve from experience. Deep learning, a type of
machine learning, uses neural networks with many layers.
"""

keywords = extract_keywords(text, top_n=5, language="en")
for phrase in keywords:
    print(f"{phrase.text}: {phrase.score:.4f}")

Output:

machine learning: 0.2341
deep learning: 0.1872
artificial intelligence: 0.1654
neural networks: 0.1432
systems: 0.0891
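
Because the GIL is released during extraction, several documents can in principle be processed concurrently from ordinary Python threads. The helper below is a minimal sketch of that pattern and is not part of rapid_textrank: `extract_many` is a hypothetical name, and it accepts any callable shaped like `extract_keywords`, so it can be tried without the package installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def extract_many(
    extract: Callable[[str], T],
    documents: Iterable[str],
    max_workers: int = 4,
) -> List[T]:
    """Run `extract` over each document in a thread pool, preserving input order.

    Hypothetical helper for illustration; not a rapid_textrank API.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(extract, documents))
```

With the package installed, a call such as `extract_many(lambda doc: extract_keywords(doc, top_n=5, language="en"), docs)` should fit this shape. Whether threads give a real speedup depends on document size and available cores, so measure before relying on it.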

Recommended Default: AutoRank

If you do not want to choose a variant manually, use AutoRank. It runs the full eligible keyword ensemble for the document and returns consensus metadata alongside the ranked phrases.

from rapid_textrank import AutoRank

extractor = AutoRank(top_n=5, language="en")
result = extractor.extract_keywords(text)

for phrase, support in zip(result.phrases, result.consensus.phrase_support):
    print(
        f"{phrase.text}: {phrase.score:.4f} "
        f"(confidence={support.confidence:.2f}, variants={support.supporting_variants})"
    )
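
This README does not say how AutoRank fuses its variants. To illustrate what consensus metadata of this kind can look like, here is a sketch that uses reciprocal rank fusion, a generic rank-aggregation technique chosen for the example. The function name, the constant `k`, and the variant labels are all assumptions, not the library's method or API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def fuse_rankings(
    rankings: Dict[str, List[str]], k: int = 60
) -> List[Tuple[str, float, List[str]]]:
    """Fuse per-variant ranked phrase lists with reciprocal rank fusion.

    Assumes each list contains a phrase at most once.
    Returns (phrase, fused_score, supporting_variants), best first.
    """
    scores: Dict[str, float] = defaultdict(float)
    support: Dict[str, List[str]] = defaultdict(list)
    for variant, phrases in rankings.items():
        for rank, phrase in enumerate(phrases, start=1):
            scores[phrase] += 1.0 / (k + rank)  # higher ranks contribute more
            support[phrase].append(variant)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return [(p, scores[p], sorted(support[p])) for p in ordered]


# Made-up rankings from three hypothetical variants.
rankings = {
    "base": ["machine learning", "neural networks", "systems"],
    "position": ["machine learning", "deep learning", "neural networks"],
    "single": ["deep learning", "machine learning", "experience"],
}
for phrase, score, variants in fuse_rankings(rankings):
    print(f"{phrase}: {score:.4f} (variants={variants})")
```

A phrase that several variants rank highly ends up with both a higher fused score and a longer list of supporting variants, which is the intuition behind reporting per-phrase support.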

Performance

Extraction latency (single document, end-to-end including tokenization):

Document size	rapid_textrank	pytextrank + spaCy	Speedup
~20 words	~0.1 ms	~5 ms	~50x
~100 words	~0.3 ms	~15 ms	~50x
~1,000 words	~2 ms	~80 ms	~40x

See benchmarks for methodology and full results.
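
To check figures like these on your own hardware, a small timing loop is usually enough. The sketch below is not the project's benchmark harness; `median_latency_ms` is a hypothetical helper that times any single-argument callable, so it works with `extract_keywords` or with a stand-in function.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(
    fn: Callable[[str], object],
    text: str,
    repeats: int = 50,
    warmup: int = 5,
) -> float:
    """Return the median wall-clock latency of fn(text) in milliseconds."""
    for _ in range(warmup):
        fn(text)  # warm caches and lazy initialisation before timing
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(text)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

The median is used rather than the mean so that an occasional slow run (garbage collection, scheduler noise) does not dominate the result.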

Steering Extraction with BiasedTextRank

Use focus terms to pull results toward a specific domain. Here we steer toward security/privacy phrases in a mixed document:

from rapid_textrank import BiasedTextRank

text = """
We encrypt data at rest using AES-256 and enforce TLS 1.2+ for data in transit.
Access to production is gated by MFA and short-lived credentials. Audit logs are
retained for 180 days and monitored for anomalous access patterns. Personal data
processing is limited to the declared purpose, and retention follows a documented
schedule. We support DSAR workflows and apply data minimization by default.
"""

extractor = BiasedTextRank(
    focus_terms=["privacy", "encrypt", "tls", "mfa", "audit", "retention"],
    bias_weight=8.0,
    top_n=10,
    language="en",
)

result = extractor.extract_keywords(text)
for phrase in result.phrases[:5]:
    print(f"{phrase.text}: {phrase.score:.4f}")

You can override focus terms per call without creating a new extractor:

result = extractor.extract_keywords(text, focus_terms=["privacy", "retention", "dsar"])
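
The idea of steering PageRank toward focus terms can be sketched with a personalized restart vector. This is a pure-Python illustration under stated assumptions, not the library's Rust implementation: how rapid_textrank turns `bias_weight` into restart probabilities is not documented here, so the sketch simply gives focus nodes a weight of `bias_weight` and every other node a weight of 1.0 before normalizing.

```python
from typing import Dict, Iterable, Set


def personalized_pagerank(
    graph: Dict[str, Set[str]],
    focus_terms: Iterable[str],
    bias_weight: float = 8.0,
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """PageRank on an undirected graph with a restart vector biased to focus terms."""
    focus = set(focus_terms)
    # Assumed weighting: focus nodes get bias_weight, all other nodes get 1.0.
    raw = {node: (bias_weight if node in focus else 1.0) for node in graph}
    total = sum(raw.values())
    restart = {node: weight / total for node, weight in raw.items()}

    scores = dict(restart)
    for _ in range(iterations):
        scores = {
            node: (1.0 - damping) * restart[node]
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


# Toy symmetric co-occurrence graph: four terms that each co-occur with "data".
graph = {
    "encrypt": {"data"},
    "tls": {"data"},
    "data": {"encrypt", "tls", "retention", "schedule"},
    "retention": {"data"},
    "schedule": {"data"},
}
plain = personalized_pagerank(graph, focus_terms=[])
biased = personalized_pagerank(graph, focus_terms=["encrypt"])
print(f"encrypt without bias: {plain['encrypt']:.4f}, with bias: {biased['encrypt']:.4f}")
```

Without focus terms, "encrypt" and "tls" score identically because they sit in symmetric positions. Adding "encrypt" as a focus term raises its score relative to "tls" while the scores still sum to 1.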

Choosing an Algorithm

Use AutoRank when you want the library to pick and fuse the right keyword variants for you. Use the table below when you want to select a specific algorithm yourself.

Algorithm	Best for	Key idea
BaseTextRank	General-purpose keyword extraction	Standard co-occurrence graph + PageRank
PositionRank	Short/structured text (abstracts, news)	Biases toward early-occurring terms
BiasedTextRank	Domain-focused extraction	Steers PageRank toward user-supplied focus terms
SingleRank	Long technical documents	Weighted edges + cross-sentence co-occurrence window
TopicRank	Multi-topic documents needing diversity	Clusters candidates into topics, ranks topics
TopicalPageRank	LDA-guided extraction	Personalization vector from per-word topic weights
MultipartiteRank	Fine-grained topic diversity	Multipartite graph separates candidates by topic cluster

If you prefer to pick a single algorithm and are unsure which one, start with BaseTextRank. See the algorithm guide for a decision flowchart.
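
For readers new to the family, the "co-occurrence graph + PageRank" idea behind BaseTextRank can be sketched in a few lines of pure Python. This is an illustration only, not the library's implementation: the stopword list, the regex tokenizer, and the window size are assumptions, sentence boundaries are ignored, and the step that merges adjacent words into phrases is left out.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Tiny illustrative stopword list; the library ships its own per-language lists.
STOPWORDS = {"a", "an", "and", "is", "of", "that", "the", "to", "from", "with", "uses"}


def cooccurrence_graph(tokens: List[str], window: int = 3) -> Dict[str, Set[str]]:
    """Link words that appear together inside a sliding window of `window` tokens."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def pagerank(
    graph: Dict[str, Set[str]], damping: float = 0.85, iterations: int = 50
) -> Dict[str, float]:
    """Plain PageRank with a uniform restart vector on an undirected graph."""
    n = len(graph)
    scores = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1.0 - damping) / n
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


text = (
    "Machine learning is a subset of artificial intelligence. "
    "Deep learning, a type of machine learning, uses neural networks."
)
tokens = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
scores = pagerank(cooccurrence_graph(tokens))
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
for word, score in ranked[:5]:
    print(f"{word}: {score:.4f}")
```

The other variants in the table mostly change one ingredient of this recipe: the restart vector (PositionRank, BiasedTextRank, TopicalPageRank), the edge weights (SingleRank), or the nodes themselves (TopicRank, MultipartiteRank).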

Learn More

Topic	Link
Getting Started	Installation, quickstart, recipes
Algorithms	How TextRank works + all 7 variants
API Reference	extract_keywords(), extractor classes, JSON interface
Performance	Benchmarks and comparison with alternatives
Development	Contributing guide

License

MIT
