This project was developed for the Redrob AI-Based Candidate Ranking Hackathon.
The system processes large-scale candidate profile data, engineers hiring-relevant features, ranks candidates against the target job requirements, and generates a competition-compliant Top-100 shortlist.
The solution is designed to be:
- Fully reproducible
- CPU-only
- Offline
- Fast enough for large candidate pools
- Explainable through candidate-specific reasoning
- Compatible with Redrob's evaluation constraints
Key results:
- Processed approximately 100,000 candidate profiles
- Generated a ranked Top-100 shortlist
- Produced a valid submission CSV matching competition specifications
- Generated candidate-specific ranking explanations
- Fully reproducible ranking pipeline
- No external API calls during ranking
Hugging Face Space: https://huggingface.co/spaces/Sanketh3119/redrob
redrob/
│
├── data/
│ ├── candidates.jsonl
│ ├── candidate_schema.json
│ ├── job_description.docx
│ ├── sample_submission.csv
│ └── validation files
│
├── scripts/
│ ├── parse_candidates.py
│ ├── feature_engineering.py
│ ├── rank_candidates.py
│ ├── create_shortlist.py
│ ├── create_submission.py
│ └── run_pipeline.py
│
├── outputs/
│ ├── ranked_candidates.csv
│ ├── top_100_candidates.csv
│ └── final_submission.csv
│
├── requirements.txt
├── submission_metadata.yaml
├── validate_submission.py
└── README.md
The ranking pipeline combines candidate profile information, career history, skills, engagement signals, and hiring relevance indicators into a unified scoring framework.
- Retrieval/Search Experience
- Ranking/Relevance Experience
- Recommendation Systems Experience
- Evaluation Framework Experience
- Python Expertise
- Machine Learning Experience
- Current Role Relevance
- Product Company Experience
- Seniority Level
- Years of Experience
- Profile Completeness
- Open-to-Work Status
- Recruiter Response Rate
- Search Appearances
- Saved by Recruiters
- Interview Completion Rate
- GitHub Activity
- Verified Email
- Verified Phone
- LinkedIn Connected
Candidate profiles are extracted from the provided dataset and normalized into a structured format.
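As a rough illustration of this step, the sketch below reads JSONL lines and normalizes each record into a fixed set of keys. The field names (`current_title`, `years_experience`, `skills`) and the helper names are assumptions made for the example; the real schema is defined in `data/candidate_schema.json` and may differ.

```python
import json
from typing import Any, Iterable, Iterator


def normalize_profile(raw: dict[str, Any]) -> dict[str, Any]:
    """Flatten one raw record into a fixed set of keys (field names are hypothetical)."""
    skills = raw.get("skills") or []
    return {
        "candidate_id": str(raw.get("candidate_id", "")).strip(),
        "title": str(raw.get("current_title") or "").strip(),
        "years_experience": float(raw.get("years_experience") or 0.0),
        # Lowercase, trim, de-duplicate, and sort so the output is stable.
        "skills": sorted({str(s).strip().lower() for s in skills if str(s).strip()}),
    }


def parse_jsonl(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one normalized profile per valid JSON object line; skip blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(raw, dict):
            yield normalize_profile(raw)
```

Because `parse_jsonl` accepts any iterable of strings, it can be fed an open file handle for `data/candidates.jsonl` and will stream records without loading the whole file into memory.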
Relevant hiring signals are converted into numerical features suitable for ranking.
Examples:
- Retrieval expertise
- Ranking experience
- Recommendation system experience
- Evaluation framework familiarity
- Product-company background
- Candidate engagement metrics
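One simple way to turn such signals into numbers is keyword matching plus clamped numeric fields. The sketch below is illustrative only: the keyword lists, the 20-year experience cap, and the profile field names (`summary`, `recruiter_response_rate`) are assumptions, not the project's actual feature definitions.

```python
# Hypothetical keyword lists; the real pipeline may use different terms.
KEYWORDS = {
    "retrieval": ("retrieval", "search"),
    "ranking": ("ranking", "relevance", "learning to rank"),
    "recommendation": ("recommendation", "recommender"),
    "evaluation": ("evaluation", "ndcg", "a/b test"),
}


def engineer_features(profile: dict) -> dict[str, float]:
    """Map a normalized profile to numeric features in the range [0, 1]."""
    text = " ".join([
        profile.get("title", ""),
        profile.get("summary", ""),
        " ".join(profile.get("skills", [])),
    ]).lower()

    # Binary flags: 1.0 if any keyword for the topic appears in the profile text.
    features = {
        name: float(any(term in text for term in terms))
        for name, terms in KEYWORDS.items()
    }
    # Years of experience, capped at an assumed 20 years and scaled to [0, 1].
    features["experience"] = min(float(profile.get("years_experience", 0.0)), 20.0) / 20.0
    # Engagement rate clamped to [0, 1] in case of out-of-range input.
    rate = float(profile.get("recruiter_response_rate", 0.0))
    features["response_rate"] = min(max(rate, 0.0), 1.0)
    return features
```

Keeping every feature in the same range makes the weights in a later scoring step directly comparable.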
A weighted scoring model computes a final ranking score for each candidate.
The model prioritizes candidates with:
- Strong search/retrieval experience
- Ranking system experience
- Recommendation system expertise
- Relevant ML experience
- Positive recruiter engagement signals
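A weighted scoring model of this kind can be sketched as a normalized weighted sum. The weight values and feature names below are placeholders chosen for illustration; they are not the weights used in the submission.

```python
# Hypothetical weights reflecting the stated priorities; the actual values may differ.
WEIGHTS = {
    "retrieval": 0.30,
    "ranking": 0.25,
    "recommendation": 0.20,
    "ml": 0.15,
    "engagement": 0.10,
}


def score_candidate(features: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of the features, assuming each feature lies in [0, 1]."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * features.get(name, 0.0) for name, w in weights.items())
    # Rounding keeps the score stable when written to CSV and read back.
    return round(weighted_sum / total_weight, 6)
```

Dividing by the total weight keeps the score in [0, 1] even if the weights are later retuned and no longer sum to one.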
Candidates are sorted by score and assigned ranks.
Only the top 100 candidates are retained for submission.
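The sorting and cut-off step might look like the following sketch. Breaking score ties by candidate ID is an assumption added here so that the ordering is fully determined; the function name and signature are hypothetical.

```python
def rank_top_n(scores: dict[str, float], n: int = 100) -> list[tuple[int, str, float]]:
    """Sort candidates by descending score and return (rank, candidate_id, score) for the top n."""
    # Ties are broken by candidate ID (an assumed rule) so the order never depends on input order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, candidate_id, score)
        for rank, (candidate_id, score) in enumerate(ordered[:n], start=1)
    ]
```

For example, `rank_top_n({"b": 0.5, "a": 0.5, "c": 0.9}, n=2)` returns `[(1, "c", 0.9), (2, "a", 0.5)]`.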
Each shortlisted candidate receives an explanation generated directly from profile data and engineered features.
Example:
Senior Applied Scientist at Meta with 16.2 years of experience. Strong retrieval, ranking, recommendation systems, and evaluation expertise. High overall alignment with search and ranking focused ML roles.
The reasoning is generated using profile attributes and ranking features only, reducing the risk of hallucinations.
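Template-based generation is one way to keep explanations grounded in profile data. The sketch below fills fixed sentence templates from profile fields and feature values; the field names (`title`, `company`, `years_experience`), the feature keys, and the 0.5 threshold are assumptions for illustration.

```python
# Hypothetical mapping from feature keys to the phrases used in the explanation.
STRENGTH_LABELS = {
    "retrieval": "retrieval",
    "ranking": "ranking",
    "recommendation": "recommendation systems",
    "evaluation": "evaluation",
}


def join_with_and(items: list[str]) -> str:
    """Join phrases as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 1:
        return "".join(items)
    if len(items) == 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_reasoning(profile: dict, features: dict[str, float], threshold: float = 0.5) -> str:
    """Compose an explanation using only values present in the profile and features."""
    sentences = [
        f"{profile['title']} at {profile['company']} "
        f"with {profile['years_experience']:.1f} years of experience."
    ]
    strengths = [
        label for key, label in STRENGTH_LABELS.items()
        if features.get(key, 0.0) >= threshold
    ]
    if strengths:
        sentences.append(f"Strong {join_with_and(strengths)} expertise.")
    return " ".join(sentences)
```

Since every phrase comes from a fixed template or a profile field, the text cannot mention a skill or employer that is absent from the input.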
The entire pipeline is deterministic.
No hosted AI services are used during ranking.
No external API calls are made.
Given the same input data, the pipeline will always produce the same ranked output.
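One way to check this property is to run the pipeline twice and compare a hash of the output files. The helper below is a minimal sketch using only the standard library; the function name is hypothetical and is not part of the project's scripts.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If two consecutive runs on the same input yield the same digest for `outputs/final_submission.csv`, the outputs are byte-for-byte identical.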
Tested on:
- Windows 11
- Python 3.11
- CPU-only execution
- 16 GB RAM
Create a virtual environment:
python -m venv .venv
Activate the environment (Windows):
.venv\Scripts\activate
Activate the environment (Linux/macOS):
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Run the pipeline steps individually:
python scripts/parse_candidates.py
python scripts/feature_engineering.py
python scripts/rank_candidates.py
python scripts/create_submission.py
Run the complete pipeline:
python scripts/run_pipeline.py
Output:
outputs/final_submission.csv
Generated output:
candidate_id,rank,score,reasoning
Validation requirements satisfied:
- Exactly 100 candidates
- Unique ranks (1–100)
- Unique candidate IDs
- Monotonically decreasing scores
- UTF-8 encoded CSV
- Human-readable reasoning
Validate the submission file:
python validate_submission.py
Expected output:
✓ Submission validation passed
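The checks listed above could be implemented roughly as follows. This is a sketch, not the contents of `validate_submission.py`: it assumes rows are listed in rank order, treats "monotonically decreasing" as "never increasing" (so ties pass), and takes already-decoded text, which means the UTF-8 check would happen when the file is read.

```python
import csv
import io

EXPECTED_HEADER = ["candidate_id", "rank", "score", "reasoning"]


def validate_submission(csv_text: str, expected_rows: int = 100) -> list[str]:
    """Return a list of validation errors; an empty list means the submission passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_HEADER:
        return ["unexpected header"]

    rows = list(reader)
    errors = []
    if len(rows) != expected_rows:
        errors.append(f"expected {expected_rows} rows, found {len(rows)}")

    ids = [row["candidate_id"] for row in rows]
    if len(set(ids)) != len(ids):
        errors.append("candidate IDs are not unique")

    try:
        ranks = [int(row["rank"]) for row in rows]
        scores = [float(row["score"]) for row in rows]
    except (TypeError, ValueError):
        return errors + ["rank or score is not numeric"]

    # Assumption: rows appear in rank order, so ranks must read 1, 2, ..., N.
    if ranks != list(range(1, len(rows) + 1)):
        errors.append("ranks are not exactly 1..N in order")
    # Assumption: ties are allowed, so only an increase counts as a violation.
    if any(later > earlier for earlier, later in zip(scores, scores[1:])):
        errors.append("scores increase further down the ranking")
    if any(not (row["reasoning"] or "").strip() for row in rows):
        errors.append("empty reasoning")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a single run report every problem in the file.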
| Requirement | Status |
|---|---|
| CPU Only | ✓ |
| No GPU Required | ✓ |
| No External APIs | ✓ |
| Offline Ranking | ✓ |
| Deterministic Output | ✓ |
| Reproducible Pipeline | ✓ |
Development assistance was used during project implementation and documentation.
No external AI services are used during candidate ranking or submission generation.
All ranking decisions are computed locally using engineered features and scoring logic.
Possible future improvements:
- Learning-to-Rank models
- Better skill relevance matching
- Automatic honeypot detection
- Enhanced reasoning generation
- Candidate-job semantic matching
- Explainability dashboards
Sanketh
Redrob Hackathon 2026 Submission
AI-Based Candidate Ranking System


