harshuljain13/opensearch-at-scale

OpenSearch at Scale Workshop

A hands-on workshop covering OpenSearch from foundations to production-scale deployments, designed for engineers who want both theoretical depth and industry best practices.

Workshop Flow

| #  | Notebook                              | Focus Area                                                                  |
|----|---------------------------------------|-----------------------------------------------------------------------------|
| 1  | 01_foundations.ipynb                  | Core concepts: indices, mappings, analyzers, document lifecycle             |
| 2  | 02_ingestion_pipelines.ipynb          | Bulk ingestion, ingest processors, Data Prepper, OSIS                       |
| 3  | 03_text_search_deep_dive.ipynb        | BM25, analyzers, relevance tuning, query DSL mastery                        |
| 4  | 04_vector_search.ipynb                | k-NN plugin, FAISS/NMSLIB/Lucene engines, embedding strategies              |
| 5  | 05_hybrid_search.ipynb                | Combining BM25 + vector, Reciprocal Rank Fusion, search pipelines           |
| 6  | 06_hyde_and_advanced_retrieval.ipynb  | HyDE, query expansion, re-ranking, RAG patterns                             |
| 7  | 07_neural_search_ml_commons.ipynb     | ML Commons framework, neural search pipelines, model deployment             |
| 8  | 08_serving_architecture.ipynb         | Read replicas, cross-cluster search, caching, search pipelines              |
| 9  | 09_scalability_optimizations.ipynb    | Shard strategies, segment merging, refresh tuning, circuit breakers         |
| 10 | 10_production_patterns.ipynb          | Monitoring, benchmarking (OpenSearch Benchmark), security, index lifecycle management (ISM) |
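As a preview of the hybrid search notebook's topic, Reciprocal Rank Fusion (RRF) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's code or OpenSearch's built-in implementation: the function name `reciprocal_rank_fusion` and the sample document IDs are hypothetical, and `k=60` is only the conventional default for the smoothing constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is a conventional default (an assumption
    here, not a value taken from the workshop).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a BM25 query and a k-NN query.
bm25_hits = ["a", "b", "c"]
knn_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([bm25_hits, knn_hits])
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

RRF uses only rank positions, so BM25 scores and vector similarity scores do not need to be normalized onto a common scale before they are combined.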

Prerequisites

  • Python 3.10+
  • Docker (for local OpenSearch cluster)
  • opensearch-py client library
  • AWS account (optional, for managed OpenSearch examples)

Setup

pip install -r requirements.txt
docker-compose up -d  # Starts local OpenSearch cluster
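
The containers may need some time after `docker-compose up -d` returns before the cluster accepts requests. The sketch below polls the cluster health endpoint until the status is green or yellow. It makes assumptions this README does not state: that the compose file publishes the REST API at `http://localhost:9200` over plain HTTP with the security plugin disabled (with security enabled you would need HTTPS and credentials). The helper names `fetch_health` and `wait_for_cluster` are hypothetical, and only the Python standard library is used.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout=2.0):
    """GET <base_url>/_cluster/health and return the decoded JSON body."""
    url = f"{base_url}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_for_cluster(base_url="http://localhost:9200", attempts=30,
                     delay=2.0, fetch=fetch_health):
    """Poll cluster health until it is green or yellow, then return it.

    base_url is an assumed default; adjust it to match docker-compose.yml.
    Raises TimeoutError if the cluster is not ready after `attempts` polls.
    """
    for _ in range(attempts):
        try:
            health = fetch(base_url)
        except OSError:
            # Connection refused/reset while the node is still starting
            # (urllib.error.URLError is a subclass of OSError).
            time.sleep(delay)
            continue
        if health.get("status") in ("green", "yellow"):
            return health
        time.sleep(delay)
    raise TimeoutError(f"cluster at {base_url} not ready after {attempts} attempts")
```

Calling `wait_for_cluster()` before running the first notebook blocks until the cluster responds. A single-node local cluster typically reports yellow once any index has replicas configured, which is why yellow is accepted here.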

Inspired By

  • AWS OpenSearch documentation & blogs
  • Real-world DynamoDB-to-OpenSearch ingestion patterns
  • Industry papers on hybrid search, HyDE, and vector DB scalability
