Driveights/ProjectMIRCV

ProjectMIRCV

Description

This project involves developing a search engine for the Multimedia Information Retrieval and Computer Vision course, part of the Master's program in Artificial Intelligence and Data Engineering at the University of Pisa for the academic year 2023/2024. Each folder in this repository includes a README.md file detailing the classes it contains.

Jar Files

In the project directory, you will find two JAR files:

  1. BuildIndex.jar: Used for creating the search index.

    • To Run: java -jar .\BuildIndex.jar encodingType compressDocID compressFreq dimBlock scoringFunction stopWordRemoval stemming
    • Parameters:
      • encodingType: str ("bin" or "text"), specifies the type of encoding to be used.
      • compressDocID: str ("none" or "variablebyte"), how document IDs will be compressed.
      • compressFreq: str ("none" or "variablebyte" or "unarycode"), how frequencies will be compressed.
      • dimBlock: int, specifies the size of each block.
      • scoringFunction: str ("TFIDF" or "BM25" or "BM11" or "BM15"), specifies the scoring function to be used.
      • stopWordRemoval: boolean, if true, stopwords will be removed.
      • stemming: boolean, if true, stemming will be performed.
  2. Query.jar: Used for executing queries on the search engine.

    • To Run: java -jar .\Query.jar numResults scoringFunction queryMode compressDocID compressFreq docProcessor dimBlock stopWordRemoval stemming
    • Parameters:
      • numResults: int, specifies the number of documents retrieved by a query.
      • scoringFunction: str ("TFIDF" or "BM25" or "BM11" or "BM15"), specifies the scoring function to be used.
      • queryMode: str ("conjunctive" or "disjunctive"), specifies how the query will be performed.
      • compressDocID: str ("none" or "variablebyte" or "unarycode"), how document IDs will be compressed.
      • compressFreq: str ("none" or "variablebyte" or "unarycode"), how frequencies will be compressed.
      • docProcessor: str ("DAAT" or "MaxScore"), specifies how the documents will be processed.
      • dimBlock: int, specifies the size of each block.
      • stopWordRemoval: boolean, if true, stopwords will be removed.
      • stemming: boolean, if true, stemming will be performed.
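
The compressDocID and compressFreq options above name two classic integer encodings, variable-byte and unary code. The following is a minimal, illustrative sketch of both, not the code shipped in the JARs. The class name CompressionSketch and its methods are hypothetical, and the bit conventions are assumptions: here the high bit marks the last byte of a variable-byte number, and a positive integer n is written in unary as n-1 one-bits followed by a zero-bit. The project may use different conventions.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of the ProjectMIRCV sources.
public class CompressionSketch {

    // Variable-byte: 7 payload bits per byte, most significant group first.
    // Assumed convention: the high bit is set on the LAST byte of each number.
    public static byte[] variableByteEncode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            if (v < 0) {
                throw new IllegalArgumentException("only non-negative values are supported");
            }
            // Count how many 7-bit groups the value needs (at least one).
            int groups = 1;
            for (int t = v >>> 7; t != 0; t >>>= 7) {
                groups++;
            }
            for (int g = groups - 1; g >= 0; g--) {
                int chunk = (v >>> (7 * g)) & 0x7F;
                if (g == 0) {
                    chunk |= 0x80; // terminator flag on the final byte
                }
                out.write(chunk);
            }
        }
        return out.toByteArray();
    }

    // Inverse of variableByteEncode under the same assumed convention.
    public static int[] variableByteDecode(byte[] bytes) {
        List<Integer> decoded = new ArrayList<>();
        int current = 0;
        for (byte b : bytes) {
            current = (current << 7) | (b & 0x7F);
            if ((b & 0x80) != 0) { // last byte of this number
                decoded.add(current);
                current = 0;
            }
        }
        int[] result = new int[decoded.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = decoded.get(i);
        }
        return result;
    }

    // Unary code for a positive integer n: (n - 1) one-bits, then a zero-bit.
    // Returned as a string of '0'/'1' characters for readability only.
    public static String unaryEncode(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("unary code is defined here for n >= 1");
        }
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < n - 1; i++) {
            bits.append('1');
        }
        return bits.append('0').toString();
    }

    public static void main(String[] args) {
        int[] docIdGaps = {5, 130};
        byte[] encoded = variableByteEncode(docIdGaps);
        // 5 fits in one byte, 130 needs two, so three bytes in total.
        System.out.println("encoded bytes: " + encoded.length);
        System.out.println("round trip: " + Arrays.toString(variableByteDecode(encoded)));
        System.out.println("unary(3) = " + unaryEncode(3)); // prints unary(3) = 110
    }
}
```

Variable-byte keeps small numbers in a single byte, which suits gaps between sorted document IDs. Unary code costs n bits for the value n, so it is only compact for very small values such as typical term frequencies.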

General Information

The repository also includes the project documentation (Documentazione.pdf), which describes the structure of the project, the design choices made, and the results obtained with the search engine. In addition, every Java class is commented to explain the code in detail.