hari-huynh/JobQA-KnowledgeGraph

Knowledge Graph-Powered Job Post Question Answering System

HCMUS - NLP Group Project - Semester II/2023-2024

📝 Introduction

This project aims to develop a question answering system that gives comprehensive, informative responses to queries related to job postings. The core component of the system is a knowledge graph constructed from a large collection of job postings. This knowledge graph serves as the retrieval engine in a Retrieval-Augmented Generation (RAG) setup, enabling a language model to retrieve and process the relevant information.

🍴 Usage

Clone the GitHub repo

git clone https://github.com/hari-huynh/MultiHop-QA-KnowledgeGraph.git

Install requirements

pip install -r requirements.txt

Scrape job posts from Indeed

The knowledge graph is constructed from job posts scraped from Indeed. To scrape job post information from Indeed, run the following commands:

cd knowledge_graph
python scrape_jd.py --url "indeed-url-for-scraping" --job "role-that-you-want-to-scrape" --loc "the-location"

Example:

python scrape_jd.py --url "https://vn.indeed.com/jobs" --job "Artificial Intelligence" --loc "Thành phố Hồ Chí Minh"
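For readers wondering how the three flags might be consumed, here is a minimal sketch of a command-line front end using `argparse`. It is not the actual contents of `scrape_jd.py`: the helper names `build_parser` and `build_search_url` are hypothetical, and the `q` and `l` query parameters are an assumption about how the search URL is formed.

```python
import argparse
from urllib.parse import urlencode


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags shown in the usage line above."""
    parser = argparse.ArgumentParser(description="Illustrative scraper CLI sketch.")
    parser.add_argument("--url", required=True, help="Base search URL")
    parser.add_argument("--job", required=True, help="Role to search for")
    parser.add_argument("--loc", required=True, help="Location to search in")
    return parser


def build_search_url(base_url: str, job: str, loc: str) -> str:
    # Assumption: the role and location are passed as "q" and "l" query parameters.
    return f"{base_url}?{urlencode({'q': job, 'l': loc})}"


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--url", "https://vn.indeed.com/jobs", "--job", "Artificial Intelligence", "--loc", "Hanoi"]
)
print(build_search_url(args.url, args.job, args.loc))
```

`urlencode` takes care of escaping spaces and non-ASCII characters, which matters for locations such as "Thành phố Hồ Chí Minh".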

The result is a JSON file containing several job posts and their corresponding information, such as titles, company names, and job descriptions. See the examples in the job_posts_data folder inside knowledge_graph.
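As a rough illustration of working with such a file, the sketch below parses a small inline JSON sample and prints one line per post. The field names `title`, `company`, and `description` and the sample records are assumptions made for this example; check the files in job_posts_data for the actual keys.

```python
import json

# Hypothetical sample; the real files may use different keys and nesting.
SAMPLE = """
[
  {"title": "AI Engineer", "company": "Example Corp",
   "description": "Build NLP models with Python."},
  {"title": "Data Scientist", "company": "Sample Ltd",
   "description": "Analyse data using SQL and Python."}
]
"""


def summarize_posts(raw_json: str) -> list[str]:
    """Return one 'title @ company' line per job post."""
    posts = json.loads(raw_json)
    return [f"{post['title']} @ {post['company']}" for post in posts]


for line in summarize_posts(SAMPLE):
    print(line)
```

To read a scraped file instead of the inline sample, pass the contents of the file (for example from `pathlib.Path(...).read_text(encoding="utf-8")`) to `summarize_posts`.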

Construct & Update Knowledge Graph

python update_kg.py

This creates a knowledge graph with a predefined schema if one does not exist yet; otherwise it updates the existing knowledge graph with the new data.
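The create-or-update behaviour can be pictured as an idempotent merge: nodes and edges are added when missing and left alone when already present. The sketch below shows that idea with a plain in-memory structure. It is not the code in `update_kg.py`; the `KnowledgeGraph` class, the `Job` and `Company` labels, and the `RECRUITS` relation are hypothetical stand-ins for the project's schema and graph store.

```python
from dataclasses import dataclass, field

NodeKey = tuple[str, str]  # (label, name)


@dataclass
class KnowledgeGraph:
    """Toy in-memory graph used only to illustrate merge semantics."""
    nodes: dict[NodeKey, dict] = field(default_factory=dict)
    edges: set[tuple[NodeKey, str, NodeKey]] = field(default_factory=set)

    def merge_node(self, label: str, name: str, **props) -> NodeKey:
        # Create the node if it is missing, then update its properties.
        key = (label, name)
        self.nodes.setdefault(key, {}).update(props)
        return key

    def merge_edge(self, source: NodeKey, relation: str, target: NodeKey) -> None:
        # A set ignores duplicates, so repeated merges do not add new edges.
        self.edges.add((source, relation, target))


def update_graph(graph: KnowledgeGraph, posts: list[dict]) -> KnowledgeGraph:
    for post in posts:
        job = graph.merge_node("Job", post["title"])
        company = graph.merge_node("Company", post["company"])
        graph.merge_edge(company, "RECRUITS", job)
    return graph


posts = [{"title": "AI Engineer", "company": "Example Corp"}]
graph = update_graph(KnowledgeGraph(), posts)  # first run creates the graph
graph = update_graph(graph, posts)             # second run changes nothing
print(len(graph.nodes), len(graph.edges))
```

Keying a job node on its title alone is a simplification; a real schema would likely need a more specific identifier so that identical titles at different companies stay distinct.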

QA with LLM

Once the knowledge graph is populated with job post information, you can start asking questions about job posts.

cd react_agent
python main.py

📦 Components

Data Collection Module

This module is responsible for automatically scraping and collecting job post data on a daily basis.

Knowledge Graph Module

This module acts as an information processor. It extracts entities and relationships from incoming data, and then uses this information to update the knowledge graph itself.
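To make the extraction step concrete, the sketch below turns one job post into (subject, relation, object) triples. It uses simple keyword matching as a stand-in for whatever extraction method the project uses; the `KNOWN_SKILLS` vocabulary, the relation names `RECRUITS` and `REQUIRES`, and the record fields are all assumptions made for illustration.

```python
# Hypothetical skill vocabulary; a real system would extract skills more robustly.
KNOWN_SKILLS = ("Python", "SQL", "PyTorch")


def extract_triples(post: dict) -> list[tuple[str, str, str]]:
    """Extract (subject, relation, object) triples from one job post."""
    triples = [(post["company"], "RECRUITS", post["title"])]
    description = post["description"].lower()
    for skill in KNOWN_SKILLS:
        # Naive case-insensitive substring match, for illustration only.
        if skill.lower() in description:
            triples.append((post["title"], "REQUIRES", skill))
    return triples


post = {
    "title": "AI Engineer",
    "company": "Example Corp",
    "description": "Experience with Python and PyTorch.",
}
for triple in extract_triples(post):
    print(triple)
```

Each triple maps directly onto two nodes and one edge, which is the form a graph update needs.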

Inference Module

This module is responsible for generating responses to user queries. Its core is a ReAct agent with two tools: Knowledge Graph Search and Tavily Search.
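The control flow of a two-tool ReAct-style agent can be sketched without any model or network access. In the example below, both tools are stubs and `scripted_policy` is a hand-written stand-in for the language model's reasoning step; none of these names come from the project, and the real agent in react_agent may be structured differently.

```python
from typing import Callable


def knowledge_graph_search(query: str) -> str:
    # Stub: a tiny lookup table stands in for a real graph query.
    facts = {"ai engineer skills": "Python, PyTorch"}
    return facts.get(query.lower(), "no match")


def web_search(query: str) -> str:
    # Stub: stands in for an external web search tool.
    return f"(stubbed web result for '{query}')"


TOOLS: dict[str, Callable[[str], str]] = {
    "knowledge_graph_search": knowledge_graph_search,
    "web_search": web_search,
}


def scripted_policy(question: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the LLM: choose the next (action, input) pair."""
    if not observations:
        return "knowledge_graph_search", question  # try the graph first
    if observations[-1] == "no match" and len(observations) == 1:
        return "web_search", question              # fall back to the web
    return "finish", observations[-1]              # answer with the last observation


def run_agent(question: str, max_steps: int = 4) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(question, observations)
        if action == "finish":
            return action_input
        observations.append(TOOLS[action](action_input))  # act, then observe
    return "no answer within step limit"


print(run_agent("AI engineer skills"))
print(run_agent("remote jobs in Berlin"))
```

The first question is answered from the stubbed graph, while the second falls through to the stubbed web search, mirroring how a second tool can cover questions outside the knowledge base.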

🔑 Key Features

  • Answer detailed questions about available jobs, both inside and outside the knowledge base: required skills, experience, etc.

  • Answer questions about companies: position, field, etc.

  • Answer questions that require reasoning over job information.

  • Suggest suitable jobs.

📚 References

[1] Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." arXiv, 2020, arXiv:2005.11401. Accessed 3 Jul. 2024. https://ar5iv.labs.arxiv.org/html/2005.11401

[2] Hogan, Aidan, et al. "Knowledge Graphs." arXiv, 2020. Accessed 3 Jul. 2024. https://doi.org/10.1145/3447772

[3] Yao, Shunyu, et al. "ReAct: Synergizing Reasoning and Acting in Language Models." arXiv, 2022, arXiv:2210.03629. Accessed 3 Jul. 2024. https://arxiv.org/abs/2210.03629

[4] Team, Gemini, et al. "Gemini: A Family of Highly Capable Multimodal Models." arXiv, 2023, arXiv:2312.11805. Accessed 3 Jul. 2024. https://arxiv.org/abs/2312.11805

[5] https://www.superannotate.com/blog/llm-agents

🛠️ Tech Stack

💻 Demo

chatbot_demo.mp4

🤝 Contributors

  • Casper (hari-huynh)

  • Bailey (QuangTruong-Nguyen)

  • Yixin (TaiQuach123)
