Upload Excel → OpenMetadata flags PII → AI explains what breaks → Quality scored → Compliance reported → Lineage shown. What takes a data steward 3 hours now takes 30 seconds.
Built for the WeMakeDevs x OpenMetadata Hackathon (Apr 17–26, 2026)
🔗 Live Demo: metalens-production.up.railway.app
📦 GitHub: pratyakshyamishra43-coder/MetaLens
Data teams work with Excel files daily but have no easy way to:
- Know which columns contain sensitive (PII) data
- Understand what downstream pipelines break if a column is modified
- Get a plain-English explanation of what the data means
- Assess data quality without writing custom scripts
- Generate compliance reports for GDPR/HIPAA audits
MetaLens connects your Excel file to OpenMetadata's governance layer and uses AI to answer all of this instantly.
- Upload any `.xlsx` file on the dashboard
- MetaLens matches your columns to OpenMetadata tables
- PII columns are flagged and masked automatically
- AI explains your data, scores quality, and shows lineage
- Compliance report generated for GDPR/HIPAA/PCI-DSS
- Export everything as a PDF in one click
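The column-matching step can be illustrated with a small fuzzy matcher. This is a minimal sketch, not the project's actual matcher: it uses only the standard library's `difflib`, and the names `normalize`, `match_columns`, and the `0.8` threshold are hypothetical choices made for illustration. The AI-assisted part of the matching is not shown.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'Account ID' and 'account_id' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_columns(excel_columns, catalog_columns, threshold=0.8):
    """Map each Excel column to its closest catalog column, or None below the threshold.

    The threshold is an assumed value, not one taken from MetaLens.
    """
    matches = {}
    for excel_col in excel_columns:
        best_name, best_score = None, 0.0
        for catalog_col in catalog_columns:
            score = SequenceMatcher(
                None, normalize(excel_col), normalize(catalog_col)
            ).ratio()
            if score > best_score:
                best_name, best_score = catalog_col, score
        matches[excel_col] = best_name if best_score >= threshold else None
    return matches


# Hypothetical column names, used only to exercise the function.
result = match_columns(
    ["Account ID", "E-mail", "Notes"],
    ["account_id", "email", "balance"],
)
```

Columns that fall below the threshold map to `None`, which leaves room for a second pass (for example an AI suggestion) instead of forcing a poor match.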
| Feature | Description |
|---|---|
| 🛡 PII Detection | Auto-flags Sensitive & NonSensitive columns from OpenMetadata tags |
| 🔒 PII Masking | Sensitive column values masked as *** MASKED *** in preview |
| 💥 Column Impact Analyzer | "What breaks if I delete this column?" — AI answer + impact score 1–10 |
| 📊 Data Quality Score | Circular score (0–100) with null, PII risk & completeness breakdown |
| 📋 Metadata Completeness | Every column graded on description, tags, type, sensitivity |
| 🧩 Smart Table Matcher | Fuzzy + AI matching of Excel columns to OpenMetadata schema |
| ⚖ Compliance Report | Auto-generated GDPR / HIPAA / PCI-DSS matrix per column |
| ✏️ Description Editor | Write column descriptions back to OpenMetadata inline |
| 🔗 Lineage View | Upstream/downstream table + column-level dependencies |
| 📄 PDF Export | Download full governance report as a professional PDF |
| 🔍 Live Table Search | Search any table across OpenMetadata catalog in real time |
| 🌐 Live Deployment | Hosted on Railway, publicly accessible |
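As an illustration of the PII Masking row, the sketch below replaces values in sensitive columns with the `*** MASKED ***` marker before rows are shown in a preview. The function name `mask_preview` and the row format (a list of dicts) are assumptions for this example, not the project's actual code.

```python
MASK = "*** MASKED ***"


def mask_preview(rows, sensitive_columns):
    """Return a copy of the preview rows with sensitive column values masked."""
    sensitive = set(sensitive_columns)
    return [
        {col: (MASK if col in sensitive else value) for col, value in row.items()}
        for row in rows
    ]


# Hypothetical preview row; 'email' stands in for a column tagged PII.Sensitive.
preview = mask_preview(
    [{"account_id": 1, "email": "a@example.com"}],
    sensitive_columns=["email"],
)
```

Masking a copy rather than mutating the input keeps the original values available for profiling (null counts, types) while the rendered preview never exposes them.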
1. Dashboard — Upload your Excel file to begin analysis

2. Analysis — Columns matched to OpenMetadata; PII auto-detected and masked

3. Impact Scores — Every column scored 1–10 for deletion risk

4. Data Quality — Null penalties + PII exposure rolled into a single score

5. AI Chat — Ask anything about your data; AI answers using metadata + profile

6. Risk Score — "What breaks if I delete this column?" answered in seconds

7. Lineage — See where your data comes from and what depends on it

8. Metadata Completeness — Every column graded on description, tags, type, sensitivity

9. Smart Match — Excel columns mapped to OpenMetadata schema with AI suggestions

10. Compliance Report — GDPR / HIPAA / PCI-DSS matrix auto-generated from PII findings
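One way a single 0–100 quality score could be built from null penalties and PII exposure is sketched below. The formula and the weights (`null_weight`, `pii_weight`) are assumptions chosen for illustration; the actual MetaLens scoring may differ.

```python
def quality_score(null_fractions, sensitive_columns, null_weight=60.0, pii_weight=40.0):
    """Return a 0-100 score from per-column null fractions and the set of PII columns.

    null_fractions maps column name -> fraction of null cells (0.0 to 1.0).
    The weights are hypothetical and only show how penalties can be combined.
    """
    if not null_fractions:
        return 0.0
    sensitive = set(sensitive_columns)
    columns = list(null_fractions)
    mean_null = sum(null_fractions.values()) / len(columns)
    pii_share = sum(1 for col in columns if col in sensitive) / len(columns)
    score = 100.0 - null_weight * mean_null - pii_weight * pii_share
    # Clamp so extreme inputs never leave the 0-100 range.
    return round(max(0.0, min(100.0, score)), 1)


# Hypothetical profile: three columns, one of them sensitive.
score = quality_score(
    {"account_id": 0.0, "email": 0.1, "balance": 0.2},
    sensitive_columns=["email"],
)
```

With these assumed weights, a file with no nulls and no sensitive columns scores 100, and each penalty lowers the score in proportion to how much of the file it affects.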

| Layer | Technology |
|---|---|
| Backend | Python 3, Flask |
| Excel Parsing | pandas, openpyxl |
| Metadata | OpenMetadata REST API (sandbox) |
| AI | Groq API — LLaMA 3.3 70B |
| PDF Export | ReportLab |
| Frontend | Jinja2, custom CSS, Inter font |
| Deployment | Railway (gunicorn) |
# Clone the repo
git clone https://github.com/pratyakshyamishra43-coder/MetaLens.git
cd MetaLens
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Add your OPENMETADATA_TOKEN and GROQ_API_KEY to .env
# Run
python app.py
# Visit http://127.0.0.1:5000

OPENMETADATA_TOKEN=your_90_day_personal_access_token
GROQ_API_KEY=your_groq_api_key
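The two variables above could be validated at startup so that a missing key fails fast instead of surfacing later as an API error. This is a minimal sketch: `load_settings` and `REQUIRED_VARS` are hypothetical names, and the code assumes the values are already present in the process environment (for example exported in the shell or loaded by a dotenv helper).

```python
import os

REQUIRED_VARS = ("OPENMETADATA_TOKEN", "GROQ_API_KEY")


def load_settings(env=os.environ):
    """Return the required settings, raising early if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Passing a plain dict instead of os.environ makes the check easy to exercise.
settings = load_settings({"OPENMETADATA_TOKEN": "token", "GROQ_API_KEY": "key"})
```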
MetaLens/
├── app.py
├── fetch_metadata.py
├── parse_excel.py
├── pipeline.py
├── requirements.txt
├── Procfile
├── .env.example
├── accounts_sample.xlsx
├── static/
│   └── style.css
└── templates/
    ├── base.html
    ├── index.html
    ├── analysis.html
    ├── quality.html
    ├── chat.html
    ├── lineage.html
    ├── completeness.html
    ├── smart_match.html
    └── compliance.html
MetaLens uses the OpenMetadata REST API to:
- Fetch table schema and column-level PII tags (`PII.Sensitive`, `PII.NonSensitive`)
- Pull upstream/downstream lineage edges and column-level lineage
- Search across all tables in the catalog live
- Write column descriptions back to OpenMetadata via PATCH API
- Drive compliance classification using DataSensitivity and DataTier tags
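To show how column-level PII tags can be read from a table entity, the sketch below walks a table JSON document and picks out the `PII.*` tag for each column. It assumes the response shape commonly returned by OpenMetadata's table endpoint (a `columns` list whose items carry a `tags` list with a `tagFQN` field); the function name and the sample data are hypothetical and do not reproduce the real ACCOUNTS table.

```python
def pii_tags_by_column(table):
    """Map each column name to its PII tag (e.g. 'PII.Sensitive'), or None if untagged."""
    result = {}
    for column in table.get("columns", []):
        # 'tags' may be missing or null on untagged columns.
        tag_names = [tag.get("tagFQN", "") for tag in column.get("tags") or []]
        pii = [name for name in tag_names if name.startswith("PII.")]
        result[column["name"]] = pii[0] if pii else None
    return result


# Hypothetical table entity, shaped like an assumed API response.
sample_table = {
    "name": "ACCOUNTS",
    "columns": [
        {"name": "account_id", "tags": [{"tagFQN": "PII.NonSensitive"}]},
        {"name": "email", "tags": [{"tagFQN": "PII.Sensitive"}, {"tagFQN": "Tier.Tier1"}]},
        {"name": "balance", "tags": []},
    ],
}
flags = pii_tags_by_column(sample_table)
```

Keeping this step as a pure function over the JSON body means the HTTP call and authentication can be handled separately and the tag logic can be tested without network access.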
Active demo table: ACME_MYSQL.default.FINANCIAL_STAGING.ACCOUNTS
Note: Description write-back is fully implemented. The OpenMetadata sandbox restricts `EditAll` for free accounts, so this feature works on self-hosted OpenMetadata deployments.
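For the description write-back, a PATCH request body can be expressed as a JSON Patch (RFC 6902) document that targets one column by its position in the table's `columns` list. The sketch below only builds that body. It assumes the endpoint accepts JSON Patch, which should be confirmed against the OpenMetadata API reference, and `description_patch` is a hypothetical helper name.

```python
import json


def description_patch(table, column_name, description):
    """Build a JSON Patch document that sets the description of one column."""
    for index, column in enumerate(table.get("columns", [])):
        if column["name"] == column_name:
            return [
                {
                    "op": "add",  # 'add' on an existing object member replaces its value
                    "path": f"/columns/{index}/description",
                    "value": description,
                }
            ]
    raise KeyError(column_name)


# Hypothetical table with two columns; only the column order matters here.
patch = description_patch(
    {"columns": [{"name": "account_id"}, {"name": "email"}]},
    "email",
    "Customer contact email",
)
body = json.dumps(patch)
```

Because the path is index-based, the patch should be built from a freshly fetched table entity so the column position matches what the server currently holds.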
| Date | Shipped |
|---|---|
| Apr 17–18 | Project setup, OpenMetadata auth, fetch_metadata.py |
| Apr 19–20 | Flask app, 5 pages, full UI, pipeline.py |
| Apr 21 | Railway deployment, table search, PII masking |
| Apr 22 | Column impact analyzer, DataSensitivity badges |
| Apr 23 | Column lineage, metadata completeness, smart matcher |
| Apr 24 | PDF export, description editor, compliance report |
| Apr 25 | Final polish, README, demo video, submission |
Pratyakshya Mishra, first-year CS undergrad, India
Hackathon: WeMakeDevs x OpenMetadata | April 17–26, 2026