Redrob AI-Based Candidate Ranking System

Overview

This project was developed for the Redrob AI-Based Candidate Ranking Hackathon.

The system processes large-scale candidate profile data, engineers hiring-relevant features, ranks candidates against the target job requirements, and generates a competition-compliant Top-100 shortlist.

The solution is designed to be:

  • Fully reproducible
  • CPU-only
  • Offline
  • Fast enough for large candidate pools
  • Explainable through candidate-specific reasoning
  • Compatible with Redrob's evaluation constraints

Results

  • Processed approximately 100,000 candidate profiles
  • Generated a ranked Top-100 shortlist
  • Produced a valid submission CSV matching competition specifications
  • Generated candidate-specific ranking explanations
  • Fully reproducible ranking pipeline
  • No external API calls during ranking

Demo

Hugging Face Space

https://huggingface.co/spaces/Sanketh3119/redrob


Demo Screenshots

[Screenshots: Ranking System; Top Ranked Candidates; Project Structure]

Repository Structure

redrob/
│
├── data/
│   ├── candidates.jsonl
│   ├── candidate_schema.json
│   ├── job_description.docx
│   ├── sample_submission.csv
│   └── validation files
│
├── scripts/
│   ├── parse_candidates.py
│   ├── feature_engineering.py
│   ├── rank_candidates.py
│   ├── create_shortlist.py
│   ├── create_submission.py
│   └── run_pipeline.py
│
├── outputs/
│   ├── ranked_candidates.csv
│   ├── top_100_candidates.csv
│   └── final_submission.csv
│
├── requirements.txt
├── submission_metadata.yaml
├── validate_submission.py
└── README.md

Methodology

The ranking pipeline combines candidate profile information, career history, skills, engagement signals, and hiring relevance indicators into a unified scoring framework.

Feature Categories

Technical Relevance

  • Retrieval/Search Experience
  • Ranking/Relevance Experience
  • Recommendation Systems Experience
  • Evaluation Framework Experience
  • Python Expertise
  • Machine Learning Experience

Career Quality

  • Current Role Relevance
  • Product Company Experience
  • Seniority Level
  • Years of Experience

Candidate Signals

  • Profile Completeness
  • Open-to-Work Status
  • Recruiter Response Rate
  • Search Appearances
  • Saved by Recruiters
  • Interview Completion Rate
  • GitHub Activity

Verification Signals

  • Verified Email
  • Verified Phone
  • LinkedIn Connected

Ranking Process

Step 1 — Candidate Parsing

Candidate profiles are extracted from the provided dataset and normalized into a structured format.
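
As a rough illustration of this step, the sketch below reads JSONL lines and normalizes each record into a flat dictionary. The field names (`candidate_id`, `title`, `skills`, `years_experience`) are assumptions made for the example; the actual schema lives in `data/candidate_schema.json` and may differ, as may the logic in `scripts/parse_candidates.py`.

```python
import json
from typing import Iterable, Iterator


def iter_candidates(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one normalized candidate dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the input
        raw = json.loads(line)
        # NOTE: every field name below is hypothetical, not the real schema.
        skills = raw.get("skills") or []
        yield {
            "candidate_id": str(raw.get("candidate_id", "")),
            "title": (raw.get("title") or "").strip(),
            # lower-case, de-duplicate, and sort so the output is stable
            "skills": sorted({s.strip().lower() for s in skills if s.strip()}),
            "years_experience": float(raw.get("years_experience") or 0.0),
        }


sample = [
    '{"candidate_id": "c1", "title": " Search Engineer ", '
    '"skills": ["Python", "python", "Ranking"], "years_experience": 6}',
    "",
    '{"candidate_id": "c2", "title": null, "skills": null}',
]
parsed = list(iter_candidates(sample))
```

In practice the same generator could be fed an open file handle for `data/candidates.jsonl`, which keeps memory use flat for large candidate pools.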

Step 2 — Feature Engineering

Relevant hiring signals are converted into numerical features suitable for ranking.

Examples:

  • Retrieval expertise
  • Ranking experience
  • Recommendation system experience
  • Evaluation framework familiarity
  • Product-company background
  • Candidate engagement metrics
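
One simple way to turn free-text profile content into numerical features like those listed above is keyword coverage. The sketch below is only an assumption about how such features might be built; the keyword lists and the `text_features` helper are hypothetical and are not taken from `scripts/feature_engineering.py`.

```python
import re

# Hypothetical keyword lists; the project's real feature definitions
# are not shown in this README.
KEYWORDS = {
    "retrieval": ("retrieval", "search", "information retrieval"),
    "ranking": ("ranking", "relevance", "learning to rank"),
    "recsys": ("recommendation", "recommender"),
}


def text_features(profile_text: str) -> dict[str, float]:
    """Return, per feature, the fraction of its keywords found in the text."""
    text = profile_text.lower()
    feats = {}
    for name, terms in KEYWORDS.items():
        # \b anchors the match at a word start so "research" does not
        # count as a hit for "search".
        hits = sum(1 for t in terms if re.search(r"\b" + re.escape(t), text))
        feats[name] = hits / len(terms)
    return feats


feats = text_features(
    "Built search ranking and relevance models for a recommender system"
)
```

Each feature lands in the range 0 to 1, which makes it straightforward to combine features with fixed weights later.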

Step 3 — Candidate Scoring

A weighted scoring model computes a final ranking score for each candidate.

The model prioritizes candidates with:

  • Strong search/retrieval experience
  • Ranking system experience
  • Recommendation system expertise
  • Relevant ML experience
  • Positive recruiter engagement signals
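
A weighted scoring model of this kind can be sketched as a clamped weighted sum. The weights and feature names below are illustrative assumptions only; the README does not state the actual values used in `scripts/rank_candidates.py`.

```python
# Hypothetical weights that sum to 1.0; the real weights are not documented.
WEIGHTS = {
    "retrieval": 0.30,
    "ranking": 0.25,
    "recsys": 0.20,
    "ml": 0.15,
    "engagement": 0.10,
}


def score(features: dict[str, float]) -> float:
    """Weighted sum of features, each clamped to [0, 1] before weighting."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = features.get(name, 0.0)   # missing features contribute 0
        value = min(max(value, 0.0), 1.0)  # guard against out-of-range inputs
        total += weight * value
    return total


strong = score({"retrieval": 1.0, "ranking": 1.0, "recsys": 0.5, "ml": 1.0})
weak = score({"engagement": 1.0})
```

Because the computation uses only fixed weights and local features, the same inputs always yield the same score, which is consistent with the determinism claimed for the pipeline.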

Step 4 — Top-100 Selection

Candidates are sorted by score and assigned ranks.

Only the top 100 candidates are retained for submission.
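
The selection step can be sketched as a sort followed by rank assignment. Breaking ties by candidate ID is an assumption added here to keep the ordering deterministic; the README does not say how the project handles equal scores.

```python
def top_k(scored, k=100):
    """Return (candidate_id, rank, score) for the k best candidates.

    `scored` is a list of (candidate_id, score) pairs. Ties on score are
    broken by candidate_id (an assumption) so the order is reproducible.
    """
    ordered = sorted(scored, key=lambda item: (-item[1], item[0]))
    return [
        (cid, rank, s)
        for rank, (cid, s) in enumerate(ordered[:k], start=1)
    ]


shortlist = top_k([("b", 0.5), ("a", 0.5), ("c", 0.9)], k=2)
```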

Step 5 — Reasoning Generation

Each shortlisted candidate receives an explanation generated directly from profile data and engineered features.

Example:

Senior Applied Scientist at Meta with 16.2 years of experience. Strong retrieval, ranking, recommendation systems, and evaluation expertise. High overall alignment with search and ranking focused ML roles.

The reasoning is generated using profile attributes and ranking features only, reducing the risk of hallucinations.
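
One way to generate explanations strictly from profile attributes and features is a fixed template. The sketch below assumes hypothetical profile keys (`title`, `company`, `years_experience`), feature names, and a 0.5 threshold; none of these are confirmed by the source.

```python
# Hypothetical mapping from feature names to the wording used in the text.
STRENGTH_LABELS = {
    "retrieval": "retrieval",
    "ranking": "ranking",
    "recsys": "recommendation systems",
    "evaluation": "evaluation",
}


def build_reasoning(profile: dict, features: dict, threshold: float = 0.5) -> str:
    """Compose a short explanation using only known fields and features."""
    text = (
        f"{profile['title']} at {profile['company']} "
        f"with {profile['years_experience']:.1f} years of experience."
    )
    strengths = [
        label
        for key, label in STRENGTH_LABELS.items()
        if features.get(key, 0.0) >= threshold
    ]
    if strengths:
        text += " Strong " + ", ".join(strengths) + " expertise."
    return text


reasoning = build_reasoning(
    {"title": "Senior Applied Scientist", "company": "Meta", "years_experience": 16.2},
    {"retrieval": 0.9, "ranking": 0.8, "recsys": 0.2, "evaluation": 0.6},
)
```

Since every word in the output comes from either the template or the candidate's own data, the text cannot mention a skill or employer that is absent from the profile.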


Reproducibility

The entire pipeline is deterministic.

No hosted AI services are used during ranking.

No external API calls are made.

Given the same input data, the pipeline will always produce the same ranked output.
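
A simple way to check this claim is to run the pipeline twice and compare file digests of the two outputs. The helper names below are hypothetical and are not part of the repository.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a, path_b):
    """True when two files are byte-for-byte identical."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

For example, copying `outputs/final_submission.csv` aside after a first run and calling `same_output` against the file from a second run should return `True` if the pipeline is deterministic.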


Environment

Tested on:

  • Windows 11
  • Python 3.11
  • CPU-only execution
  • 16 GB RAM

Installation

Create a virtual environment:

python -m venv .venv

Activate the environment:

Windows

.venv\Scripts\activate

Linux / Mac

source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Running the Pipeline

Parse Candidate Data

python scripts/parse_candidates.py

Generate Features

python scripts/feature_engineering.py

Rank Candidates

python scripts/rank_candidates.py

Generate Submission

python scripts/create_submission.py

One-Command Execution

Run the complete pipeline:

python scripts/run_pipeline.py

Output:

outputs/final_submission.csv
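
For orientation, an orchestration script of this kind could simply run each stage in order and stop at the first failure. The sketch below is an assumption about how `scripts/run_pipeline.py` might work, not its actual contents; in particular, the position of `create_shortlist.py` in the sequence is inferred from the repository listing.

```python
import subprocess
import sys

# Assumed stage order, inferred from the repository structure.
STAGES = [
    "scripts/parse_candidates.py",
    "scripts/feature_engineering.py",
    "scripts/rank_candidates.py",
    "scripts/create_shortlist.py",
    "scripts/create_submission.py",
]


def run_stages(stages=STAGES, runner=subprocess.run):
    """Run each stage with the current interpreter; abort on the first error.

    `runner` is injectable so the sequence can be exercised without
    actually launching the scripts.
    """
    for script in stages:
        # check=True raises CalledProcessError on a non-zero exit code
        runner([sys.executable, script], check=True)
    return len(stages)

# Typical use from the repository root: run_stages()
```

Using `sys.executable` keeps every stage inside the active virtual environment rather than whichever `python` happens to be first on the PATH.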

Submission Format

Generated output:

candidate_id,rank,score,reasoning

Validation requirements satisfied:

  • Exactly 100 candidates
  • Unique ranks (1–100)
  • Unique candidate IDs
  • Monotonically decreasing scores
  • UTF-8 encoded CSV
  • Human-readable reasoning
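
The checks above can be sketched as a small function over the parsed CSV rows. This is an illustrative assumption, not the contents of `validate_submission.py`; in particular, it treats tied scores as acceptable, which the competition rules may or may not allow.

```python
import csv
import io


def validate_rows(rows, expected=100):
    """Return a list of error messages; an empty list means the rows pass."""
    errors = []
    if len(rows) != expected:
        errors.append(f"expected {expected} rows, found {len(rows)}")
    ranks = [int(r["rank"]) for r in rows]
    if sorted(ranks) != list(range(1, expected + 1)):
        errors.append("ranks must be unique and cover 1..N")
    ids = [r["candidate_id"] for r in rows]
    if len(set(ids)) != len(ids):
        errors.append("duplicate candidate_id")
    # Scores are compared in rank order; ties are tolerated (assumption).
    by_rank = sorted(rows, key=lambda r: int(r["rank"]))
    scores = [float(r["score"]) for r in by_rank]
    if any(b > a for a, b in zip(scores, scores[1:])):
        errors.append("scores must not increase as rank increases")
    if any(not r["reasoning"].strip() for r in rows):
        errors.append("empty reasoning")
    return errors


good = "candidate_id,rank,score,reasoning\nc1,1,0.9,ok\nc2,2,0.8,ok\nc3,3,0.7,ok\n"
bad = "candidate_id,rank,score,reasoning\nc1,1,0.5,ok\nc1,2,0.8,ok\nc3,3,0.7, \n"
good_errors = validate_rows(list(csv.DictReader(io.StringIO(good))), expected=3)
bad_errors = validate_rows(list(csv.DictReader(io.StringIO(bad))), expected=3)
```

For a real file, opening it with `encoding="utf-8"` before passing it to `csv.DictReader` also exercises the UTF-8 requirement, since a decoding error would surface at read time.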

Validation

Validate the submission file:

python validate_submission.py

Expected output:

✓ Submission validation passed

Compute Compliance

The pipeline meets the following compute requirements:

  • CPU only
  • No GPU required
  • No external APIs
  • Offline ranking
  • Deterministic output
  • Reproducible pipeline

AI Tool Usage

AI-assisted development tools were used during project implementation and documentation.

No external AI services are used during candidate ranking or submission generation.

All ranking decisions are computed locally using engineered features and scoring logic.


Future Improvements

  • Learning-to-Rank models
  • Better skill relevance matching
  • Automatic honeypot detection
  • Enhanced reasoning generation
  • Candidate-job semantic matching
  • Explainability dashboards

Author

Sanketh

Redrob Hackathon 2026 Submission

AI-Based Candidate Ranking System

About

AI-Powered Candidate Ranking System for Search & Recommendation Engineering Roles using Feature Engineering, Scoring Models, and Recruiter Signal Analysis.
