Yehonatal/fidel-tools

The most comprehensive, schema-driven NLP pre-processing framework for Amharic.

Website · Issues · Contributing


What is Fidel Tools?

Fidel Tools is a modular, schema-driven pre-processing and Natural Language Processing (NLP) toolkit designed specifically for Amharic and other Ethiopic-script text. Out of the box it provides high-performance components for character normalization, sentence boundary tokenization, prefix-aware stopword removal, light stemming, and bidirectional transliteration.

Natural Language Processing in the Ethiopic ecosystem is a half-solved problem. Most implementations rely on hardcoded, unconfigurable logic and suffer from low accuracy. We believe developers deserve a production-grade, highly customizable solution; hence, Fidel Tools.


Packages

Fidel Tools is managed as a monorepo workspace. Check the individual package directories and their respective changelogs:

Package                      Description                                     Version   Changelog
@fidel-tools/core            Core processing pipeline and NLP engine         0.1.6     Changelog
@fidel-tools/lang-am         Amharic language pack & schema configurations   0.1.6     Changelog
@fidel-tools/validate-pack   CLI tool to validate & fix language packs       0.1.6     Changelog

Quick Start

Installation

pnpm add @fidel-tools/core @fidel-tools/lang-am

Usage

import { Pipeline } from '@fidel-tools/core'
import amPack from '@fidel-tools/lang-am'

const nlp = new Pipeline(amPack)

// Normalize homophones, labialization, and gemination
const text = nlp.normalize("ሐኪም ኀይሉ በልቷልልል!")
console.log(text) // "ሃኪም ሃይሉ በልቱዋልል!"

// Remove stopwords using boundary rules
const cleaned = nlp.removeStopwords("ያወጣውን የተጨማሪ እሴት")
console.log(cleaned) // "ያወጣውን የ እሴት"

// Stem Amharic words
const stem = nlp.stem("ልጆቻቸውን")
console.log(stem) // "ልጅ"
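
The three calls above can also be chained into one pre-processing step. The following is a minimal sketch, not part of the library: the `TextPipeline` interface and the `preprocess` helper are hypothetical names, the method signatures are inferred from the usage example, and both the step order (normalize, then remove stopwords, then stem) and the whitespace tokenization are assumptions.

```typescript
// Assumed shape of the methods shown in the usage example above.
// This interface is hypothetical and is not exported by @fidel-tools/core.
interface TextPipeline {
  normalize(text: string): string;
  removeStopwords(text: string): string;
  stem(word: string): string;
}

// Hypothetical helper: normalize the text, drop stopwords,
// then stem each remaining whitespace-separated token.
function preprocess(nlp: TextPipeline, text: string): string[] {
  const normalized = nlp.normalize(text);
  const withoutStopwords = nlp.removeStopwords(normalized);
  return withoutStopwords
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map((token) => nlp.stem(token));
}
```

If the Pipeline instance matches this shape, it could be passed directly, for example `preprocess(nlp, "ያወጣውን የተጨማሪ እሴት")`. Whether stemming should run after stopword removal depends on the application, so treat the order as a starting point.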

Production Deployment Notes

Rate Limiter Limitation

The built-in rate limiter (apps/api/src/middleware/rateLimiter.ts) keeps request counters in an in-memory JavaScript store. This is sufficient for single-instance deployments and local testing, but the state is volatile: it resets on every server restart and is not shared between processes. If you deploy the API across multiple instances, refactor the memory store in rateLimiter.ts to use a shared cache such as Redis, so that all instances enforce the same limits.
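
One way to prepare for that refactor is to put the counter storage behind a small interface so the backing store can be swapped. The sketch below is illustrative only and does not reflect the actual contents of rateLimiter.ts: `CounterStore`, `MemoryCounterStore`, and `FixedWindowLimiter` are hypothetical names, and the fixed-window algorithm is an assumption. A Redis-backed store would implement the same interface, typically with the Redis `INCR` and `PEXPIRE` commands.

```typescript
// Hypothetical storage abstraction; not part of fidel-tools.
interface CounterStore {
  // Increments the counter for `key`, opening a new window of `windowMs`
  // milliseconds if none is active, and resolves to the updated count.
  increment(key: string, windowMs: number): Promise<number>;
}

// In-memory implementation: state is lost on restart and is local to one process.
class MemoryCounterStore implements CounterStore {
  private counters = new Map<string, { count: number; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async increment(key: string, windowMs: number): Promise<number> {
    const t = this.now();
    const entry = this.counters.get(key);
    if (!entry || entry.expiresAt <= t) {
      // No active window for this key: start a fresh one.
      this.counters.set(key, { count: 1, expiresAt: t + windowMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Fixed-window limiter that depends only on the CounterStore interface.
class FixedWindowLimiter {
  private store: CounterStore;
  private limit: number;
  private windowMs: number;

  constructor(store: CounterStore, limit: number, windowMs: number) {
    this.store = store;
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Resolves to true while the client is within its limit for the current window.
  async allow(clientId: string): Promise<boolean> {
    const count = await this.store.increment(`rl:${clientId}`, this.windowMs);
    return count <= this.limit;
  }
}
```

Because the limiter never touches the Map directly, moving to a shared cache only requires a second `CounterStore` implementation; the middleware logic stays unchanged.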


Contribution

Fidel Tools is free and open-source software licensed under the MIT License. You can support the project by:

  • Contributing features, fixes, or new language packs. Read our Contributing Guide.
  • Opening issues or submitting feature requests.

Academic References

The processing logic draws on academic foundations in Ethiopic NLP:
