A simple desktop search engine built with Python, Tkinter, and Whoosh. It indexes local files and provides full-text search over their contents.
import os, sys, json, csv, datetime
import tkinter as tk
from tkinter import ttk, filedialog, messagebox
from whoosh import index, qparser
from whoosh.fields import Schema, TEXT, ID, DATETIME
from whoosh.analysis import StemmingAnalyzer
from whoosh.highlight import Highlighter, HtmlFormatter, ContextFragmenter
from whoosh.qparser import QueryParser, MultifieldParser
from whoosh.query import DateRange, Every, Term
import whoosh.index as windex
try:
    from pypdf import PdfReader
    HAS_PDF = True
except ImportError:
    HAS_PDF = False
try:
    import openpyxl
    HAS_XLSX = True
except ImportError:
    HAS_XLSX = False
INDEX_DIR = os.path.join(os.path.expanduser("~"), ".mini_search_index")
RESULTS_PER_PAGE = 5
SCHEMA = Schema(
    path=ID(stored=True, unique=True),
    filename=TEXT(stored=True),
    filetype=ID(stored=True),
    content=TEXT(stored=True, analyzer=StemmingAnalyzer()),
    modified=DATETIME(stored=True),
)

- Index local folders
- Full-text search using Whoosh
- Search inside:
  - TXT files
  - PDF files
  - JSON files
  - CSV files
  - XLSX files
- Keyword highlighting
- File type filtering
- Date filtering
- Pagination for results
- Index statistics window
- Simple Tkinter GUI
Install required packages:
pip install whoosh pypdf openpyxl
Run the application:
python app.py
Replace app.py with your actual filename.
- Choose a folder containing files
- Select file formats to index
- Click Build Index
- Enter a search query
- Browse paginated results
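The paging step can be sketched as follows. This is a minimal illustration, not the application's actual code: the `paginate` helper is a hypothetical name, and only the `RESULTS_PER_PAGE = 5` constant comes from the source. Whoosh also provides `Searcher.search_page`, which the application may use instead.

```python
import math

RESULTS_PER_PAGE = 5  # same value as the constant in the source


def paginate(results, page, per_page=RESULTS_PER_PAGE):
    """Return (items_on_page, total_pages) for a 1-based page number.

    Hypothetical helper: out-of-range page numbers are clamped to the
    first or last page instead of raising an error.
    """
    total_pages = max(1, math.ceil(len(results) / per_page))
    page = min(max(page, 1), total_pages)
    start = (page - 1) * per_page
    return results[start:start + per_page], total_pages


if __name__ == "__main__":
    hits = [f"doc{i}.txt" for i in range(12)]
    items, pages = paginate(hits, page=3)
    print(items, pages)
```

Clamping keeps a "Next" button harmless on the last page, which is one reasonable choice for a small GUI.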
The search index is stored locally at:
~/.mini_search_index
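To see where that path resolves on a given machine and how much disk space the index takes, a small standalone check along these lines may help. `INDEX_DIR` is built the same way as in the source; `index_size_bytes` is a hypothetical helper that is not part of the application.

```python
import os

# Built the same way as INDEX_DIR in the source.
INDEX_DIR = os.path.join(os.path.expanduser("~"), ".mini_search_index")


def index_size_bytes(index_dir=INDEX_DIR):
    """Sum the sizes of all files under index_dir (0 if it does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                pass
    return total


if __name__ == "__main__":
    print(INDEX_DIR, index_size_bytes(), "bytes")
```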
| Extension | Description |
|---|---|
| .txt | Text files |
| .pdf | PDF documents |
| .json | JSON files |
| .csv | CSV spreadsheets |
| .xlsx | Excel workbooks |
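As a rough illustration of how the text-based formats in the table could be turned into indexable text, here is a sketch that uses only the standard library. The names `extract_text` and `EXTRACTORS` are hypothetical and not taken from the application. PDF and XLSX are left out because they depend on the optional pypdf and openpyxl packages.

```python
import csv
import json
import os


def _read_txt(path):
    with open(path, encoding="utf-8", errors="replace") as fh:
        return fh.read()


def _read_json(path):
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # Re-serialise so keys and values end up in one searchable string.
    return json.dumps(data, ensure_ascii=False)


def _read_csv(path):
    with open(path, newline="", encoding="utf-8", errors="replace") as fh:
        # One line per row, cells separated by spaces.
        return "\n".join(" ".join(row) for row in csv.reader(fh))


# Hypothetical dispatch table: extension -> reader function.
EXTRACTORS = {".txt": _read_txt, ".json": _read_json, ".csv": _read_csv}


def extract_text(path):
    """Return the file's text, or None if it is unsupported or unreadable."""
    reader = EXTRACTORS.get(os.path.splitext(path)[1].lower())
    if reader is None:
        return None
    try:
        return reader(path)
    except (OSError, ValueError, csv.Error):
        # ValueError also covers invalid JSON and undecodable bytes.
        return None
```

A dispatch table keeps each format's quirks in its own function, so supporting another extension only means adding one entry.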
The engine supports:
- Fuzzy search
- Wildcards
- Filename search
- Content search
- File type filtering
- Date range filtering
Example queries:
machine learning
report*
python~1
invoice
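To see how Whoosh interprets query strings like the ones above, the following sketch parses them against a reduced schema. It assumes the application searches the `filename` and `content` fields with a `MultifieldParser` (the source imports it, but the exact configuration is not shown). `parse_examples` is a hypothetical helper, and the import is guarded so the snippet returns an empty result when Whoosh is not installed.

```python
try:
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.qparser import FuzzyTermPlugin, MultifieldParser
    HAS_WHOOSH = True
except ImportError:
    HAS_WHOOSH = False

EXAMPLE_QUERIES = ["machine learning", "report*", "python~1", "invoice"]


def parse_examples(queries=EXAMPLE_QUERIES):
    """Map each query string to its parsed Whoosh query ({} without Whoosh)."""
    if not HAS_WHOOSH:
        return {}
    # Reduced stand-in for the application's schema (assumption).
    schema = Schema(
        path=ID(stored=True, unique=True),
        filename=TEXT(stored=True),
        content=TEXT(stored=True),
    )
    parser = MultifieldParser(["filename", "content"], schema=schema)
    # Whoosh does not enable the fuzzy "term~N" syntax by default.
    parser.add_plugin(FuzzyTermPlugin())
    return {text: parser.parse(text) for text in queries}


if __name__ == "__main__":
    for text, query in parse_examples().items():
        print(f"{text!r:20} -> {query!r}")
```

Printing the parsed objects is a quick way to check whether a wildcard or fuzzy suffix was recognised or treated as plain text.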
project/
│
├── app.py
└── README.md
- Hidden folders are skipped during indexing
- Invalid or unreadable files are ignored safely
- PDF and XLSX support are optional depending on installed packages
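The first two notes can be sketched with `os.walk`. This is an illustration under stated assumptions rather than the application's code: "hidden" is taken to mean a folder whose name starts with a dot, and `iter_indexable_files` and `read_safely` are hypothetical names.

```python
import os


def iter_indexable_files(root, extensions=(".txt", ".json", ".csv")):
    """Yield matching file paths under root, never entering hidden folders."""
    wanted = tuple(ext.lower() for ext in extensions)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into dot-folders.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.lower().endswith(wanted):
                yield os.path.join(dirpath, name)


def read_safely(path):
    """Return the file's text, or None if it cannot be opened or read."""
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            return fh.read()
    except OSError:
        return None


if __name__ == "__main__":
    for path in iter_indexable_files(os.getcwd()):
        text = read_safely(path)
        if text is not None:
            print(path, len(text))
```

Assigning to `dirnames[:]` rather than rebinding the name matters here, because `os.walk` only honours changes made to the list it handed out.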
- Python
- Tkinter
- Whoosh
- PyPDF
- OpenPyXL
This project is open-source and free to use.