Redrob AI-Based Candidate Ranking System

Overview

This project was developed for the Redrob AI-Based Candidate Ranking Hackathon.

The system processes large-scale candidate profile data, engineers hiring-relevant features, ranks candidates against the target job requirements, and generates a competition-compliant Top-100 shortlist.

The solution is designed to be:

Fully reproducible
CPU-only
Offline
Fast enough for large candidate pools
Explainable through candidate-specific reasoning
Compatible with Redrob's evaluation constraints

Results

Processed approximately 100,000 candidate profiles
Generated a ranked Top-100 shortlist
Produced a valid submission CSV matching competition specifications
Generated candidate-specific ranking explanations
Fully reproducible ranking pipeline
No external API calls during ranking

Demo

Hugging Face Space

https://huggingface.co/spaces/Sanketh3119/redrob

Demo Screenshots

Ranking System

Top Ranked Candidates

Project Structure

Repository Structure

redrob/
│
├── data/
│   ├── candidates.jsonl
│   ├── candidate_schema.json
│   ├── job_description.docx
│   ├── sample_submission.csv
│   └── validation files
│
├── scripts/
│   ├── parse_candidates.py
│   ├── feature_engineering.py
│   ├── rank_candidates.py
│   ├── create_shortlist.py
│   ├── create_submission.py
│   └── run_pipeline.py
│
├── outputs/
│   ├── ranked_candidates.csv
│   ├── top_100_candidates.csv
│   └── final_submission.csv
│
├── requirements.txt
├── submission_metadata.yaml
├── validate_submission.py
└── README.md

Methodology

The ranking pipeline combines candidate profile information, career history, skills, engagement signals, and hiring relevance indicators into a unified scoring framework.

Feature Categories

Technical Relevance

Retrieval/Search Experience
Ranking/Relevance Experience
Recommendation Systems Experience
Evaluation Framework Experience
Python Expertise
Machine Learning Experience

Career Quality

Current Role Relevance
Product Company Experience
Seniority Level
Years of Experience

Candidate Signals

Profile Completeness
Open-to-Work Status
Recruiter Response Rate
Search Appearances
Saved by Recruiters
Interview Completion Rate
GitHub Activity

Verification Signals

Verified Email
Verified Phone
LinkedIn Connected

Ranking Process

Step 1 — Candidate Parsing

Candidate profiles are extracted from the provided dataset and normalized into a structured format.
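As a rough illustration of this step, the sketch below reads JSONL lines and normalizes each record. The field names (`candidate_id`, `title`, `skills`, `years_experience`) are assumptions made for the example; the real names are defined in `data/candidate_schema.json`, and this is not the code in `scripts/parse_candidates.py`.

```python
import json
from typing import Iterable, Iterator


def parse_candidates(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one normalized record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the input
        raw = json.loads(line)
        # Field names below are hypothetical placeholders for the real schema.
        yield {
            "candidate_id": str(raw.get("candidate_id", "")),
            "title": (raw.get("title") or "").strip(),
            # Lowercase, de-duplicate, and sort skills so the output is stable.
            "skills": sorted({s.strip().lower() for s in raw.get("skills") or []}),
            "years_experience": float(raw.get("years_experience") or 0.0),
        }


sample = [
    '{"candidate_id": "c1", "title": " Senior Applied Scientist ", '
    '"skills": ["Python", "python", "Ranking"], "years_experience": 16.2}',
    "",
    '{"candidate_id": "c2", "skills": null}',
]
records = list(parse_candidates(sample))
```

Passing an open file handle instead of a list works the same way, since the function only iterates over lines and never loads the whole file into memory.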

Step 2 — Feature Engineering

Relevant hiring signals are converted into numerical features suitable for ranking.

Examples:

Retrieval expertise
Ranking experience
Recommendation system experience
Evaluation framework familiarity
Product-company background
Candidate engagement metrics
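One simple way to turn free-text profile fields into numerical features like those listed above is whole-phrase keyword matching. The sketch below is only an illustration: the keyword lists and the `keyword_features` helper are hypothetical and may differ from what `scripts/feature_engineering.py` does.

```python
import re

# Hypothetical keyword lists; the project's actual vocabulary is not documented here.
FEATURE_KEYWORDS = {
    "retrieval": ("retrieval", "search"),
    "ranking": ("ranking", "learning to rank", "relevance"),
    "recommendation": ("recommendation", "recommender"),
    "evaluation": ("evaluation", "ndcg", "a/b test"),
}


def keyword_features(text: str) -> dict:
    """Return a 0.0/1.0 flag per feature: 1.0 if any keyword appears as a whole phrase."""
    lowered = text.lower()
    features = {}
    for name, keywords in FEATURE_KEYWORDS.items():
        # \b anchors prevent partial matches such as "search" inside "research".
        hit = any(
            re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
            for keyword in keywords
        )
        features[name] = 1.0 if hit else 0.0
    return features


example = keyword_features("Built search ranking and recommender systems, tracked NDCG")
```

Binary flags keep the example short; a real pipeline might prefer counts or recency-weighted values so that a single passing mention does not score the same as years of experience.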

Step 3 — Candidate Scoring

A weighted scoring model computes a final ranking score for each candidate.

The model prioritizes candidates with:

Strong search/retrieval experience
Ranking system experience
Recommendation system expertise
Relevant ML experience
Positive recruiter engagement signals
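A weighted scoring model of this kind can be sketched as a clamped weighted sum. The weights and feature names below are invented for illustration and only mirror the priority order listed above; the project's actual weights are not stated in this README.

```python
# Hypothetical weights chosen for illustration; they sum to 1.0.
WEIGHTS = {
    "retrieval": 0.30,
    "ranking": 0.25,
    "recommendation": 0.20,
    "ml_experience": 0.15,
    "recruiter_engagement": 0.10,
}


def score_candidate(features: dict) -> float:
    """Weighted sum of features, each clamped to [0, 1]; missing features count as 0."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, float(features.get(name, 0.0))))
        total += weight * value
    # Rounding keeps scores stable against floating-point summation noise.
    return round(total, 6)


full_match = score_candidate({name: 1.0 for name in WEIGHTS})
partial_match = score_candidate({"retrieval": 1.0, "ranking": 2.0})
```

Clamping each feature to the unit interval means a single out-of-range value (such as the `2.0` above) cannot dominate the final score.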

Step 4 — Top-100 Selection

Candidates are sorted by score and assigned ranks.

Only the top 100 candidates are retained for submission.
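The selection step can be sketched in a few lines. Breaking score ties by `candidate_id` is an assumption added here so that equal scores always sort the same way; the README does not say how the project resolves ties.

```python
def select_top(scored: list, n: int = 100) -> list:
    """Sort by score descending, break ties by candidate_id, and assign ranks 1..n."""
    ordered = sorted(scored, key=lambda c: (-c["score"], c["candidate_id"]))
    return [
        {"candidate_id": c["candidate_id"], "rank": rank, "score": c["score"]}
        for rank, c in enumerate(ordered[:n], start=1)
    ]


pool = [
    {"candidate_id": "b", "score": 0.90},
    {"candidate_id": "a", "score": 0.90},
    {"candidate_id": "c", "score": 0.95},
    {"candidate_id": "d", "score": 0.10},
]
top = select_top(pool, n=3)
```

With an explicit tie-break, the result does not depend on the order in which candidates were read from disk.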

Step 5 — Reasoning Generation

Each shortlisted candidate receives an explanation generated directly from profile data and engineered features.

Example:

Senior Applied Scientist at Meta with 16.2 years of experience. Strong retrieval, ranking, recommendation systems, and evaluation expertise. High overall alignment with search and ranking focused ML roles.

The reasoning is generated using profile attributes and ranking features only, reducing the risk of hallucinations.
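A template-based generator is one way to produce such text without a language model. The sketch below reproduces the first two sentences of the example above; the field names, the label table, and the 0.5 threshold are all assumptions for illustration rather than the project's implementation.

```python
# Hypothetical mapping from feature name to the phrase used in the explanation.
STRENGTH_LABELS = [
    ("retrieval", "retrieval"),
    ("ranking", "ranking"),
    ("recommendation", "recommendation systems"),
    ("evaluation", "evaluation"),
]


def join_labels(labels: list) -> str:
    """Join phrases as 'a', 'a and b', or 'a, b, and c'."""
    if len(labels) <= 2:
        return " and ".join(labels)
    return ", ".join(labels[:-1]) + ", and " + labels[-1]


def build_reasoning(profile: dict, features: dict) -> str:
    """Fill a fixed template from profile fields and features; nothing else is added."""
    text = (
        f"{profile['title']} at {profile['company']} "
        f"with {profile['years_experience']:.1f} years of experience."
    )
    # A feature counts as a strength at or above an assumed threshold of 0.5.
    strengths = [label for key, label in STRENGTH_LABELS if features.get(key, 0.0) >= 0.5]
    if strengths:
        text += f" Strong {join_labels(strengths)} expertise."
    return text


reasoning = build_reasoning(
    {"title": "Senior Applied Scientist", "company": "Meta", "years_experience": 16.2},
    {"retrieval": 1.0, "ranking": 1.0, "recommendation": 1.0, "evaluation": 1.0},
)
```

Because every word comes either from the template or from a stored field, the output can be traced back to the candidate's data line by line.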

Reproducibility

The entire pipeline is deterministic.

No hosted AI services are used during ranking.

No external API calls are made.

Given the same input data, the pipeline will always produce the same ranked output.
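One way to check this claim in practice is to hash the serialized output of two runs and compare the digests. The helper below is a sketch using only the standard library; `rows_digest` is a hypothetical name and is not part of the repository.

```python
import csv
import hashlib
import io


def rows_digest(rows: list, fieldnames: list) -> str:
    """Serialize rows to CSV text with fixed line endings and return its SHA-256 hex digest."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return hashlib.sha256(buffer.getvalue().encode("utf-8")).hexdigest()


fields = ["candidate_id", "rank", "score"]
run_one = [{"candidate_id": "a", "rank": 1, "score": 0.9}]
run_two = [{"candidate_id": "a", "rank": 1, "score": 0.9}]
same_output = rows_digest(run_one, fields) == rows_digest(run_two, fields)
```

Fixing the line terminator matters when comparing runs across Windows and Linux, where default line endings differ.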

Environment

Tested on:

Windows 11
Python 3.11
CPU-only execution
16 GB RAM

Installation

Create a virtual environment:

python -m venv .venv

Activate the environment:

Windows

.venv\Scripts\activate

Linux / macOS

source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Running the Pipeline

Parse Candidate Data

python scripts/parse_candidates.py

Generate Features

python scripts/feature_engineering.py

Rank Candidates

python scripts/rank_candidates.py

Generate Submission

python scripts/create_submission.py

One-Command Execution

Run the complete pipeline:

python scripts/run_pipeline.py

Output:

outputs/final_submission.csv
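A runner of this kind can be as simple as invoking the stage scripts in order and stopping at the first failure. The sketch below shows one plausible shape using the four stages documented above; it is an assumption about structure, not the contents of `scripts/run_pipeline.py`.

```python
import subprocess
import sys

# The four documented stages, in the order they are run by hand.
STAGES = [
    "scripts/parse_candidates.py",
    "scripts/feature_engineering.py",
    "scripts/rank_candidates.py",
    "scripts/create_submission.py",
]


def build_commands(python: str = sys.executable) -> list:
    """Return one argv list per stage, using the given interpreter."""
    return [[python, stage] for stage in STAGES]


def run_all() -> None:
    """Run every stage in order; check=True raises CalledProcessError on a non-zero exit."""
    for command in build_commands():
        subprocess.run(command, check=True)
```

Calling `run_all()` from the repository root would execute the stages with the same interpreter that is running the script, which keeps the active virtual environment in effect.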

Submission Format

Generated output:

candidate_id,rank,score,reasoning

Validation requirements satisfied:

Exactly 100 candidates
Unique ranks (1–100)
Unique candidate IDs
Monotonically decreasing scores
UTF-8 encoded CSV
Human-readable reasoning
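The structural requirements above can be checked mechanically. The function below is a minimal sketch of such checks, not the logic of `validate_submission.py`; it assumes tied scores are acceptable (non-increasing order) and that rank and score values parse as numbers.

```python
import csv
import io

EXPECTED_COLUMNS = ["candidate_id", "rank", "score", "reasoning"]


def check_submission(csv_text: str, expected_rows: int = 100) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    rows = list(reader)
    problems = []
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")
    ranks = [int(row["rank"]) for row in rows]
    if sorted(ranks) != list(range(1, len(rows) + 1)):
        problems.append("ranks are not the unique values 1..N")
    ids = [row["candidate_id"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("candidate IDs are not unique")
    # Compare scores in rank order; ties are allowed under the stated assumption.
    ordered = sorted(rows, key=lambda row: int(row["rank"]))
    scores = [float(row["score"]) for row in ordered]
    if any(earlier < later for earlier, later in zip(scores, scores[1:])):
        problems.append("scores increase somewhere down the ranking")
    if any(not (row["reasoning"] or "").strip() for row in rows):
        problems.append("at least one row has empty reasoning")
    return problems


header = "candidate_id,rank,score,reasoning\n"
good = header + "a,1,0.9,Strong fit\nb,2,0.8,Good fit\nc,3,0.8,Fair fit\n"
result = check_submission(good, expected_rows=3)
```

Reading the file with `open(path, encoding="utf-8", newline="")` before passing its text in would also surface encoding errors, which covers the UTF-8 requirement.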

Validation

Validate the submission file:

python validate_submission.py

Expected output:

✓ Submission validation passed

Compute Compliance

Requirement	Status
CPU Only	✓
No GPU Required	✓
No External APIs	✓
Offline Ranking	✓
Deterministic Output	✓
Reproducible Pipeline	✓

AI Tool Usage

AI-assisted development tools were used during project implementation and documentation.

No external AI services are used during candidate ranking or submission generation.

All ranking decisions are computed locally using engineered features and scoring logic.

Future Improvements

Learning-to-Rank models
Better skill relevance matching
Automatic honeypot detection
Enhanced reasoning generation
Candidate-job semantic matching
Explainability dashboards

Author

Sanketh

Redrob Hackathon 2026 Submission

AI-Based Candidate Ranking System
