A comprehensive, hands-on workshop covering OpenSearch from foundations to production-scale deployments. Designed for engineers who want both theoretical depth and industry best practices.

| # | Notebook | Focus Area |
|---|---|---|
| 1 | 01_foundations.ipynb | Core concepts: indices, mappings, analyzers, document lifecycle |
| 2 | 02_ingestion_pipelines.ipynb | Bulk ingestion, ingest processors, Data Prepper, OSIS |
| 3 | 03_text_search_deep_dive.ipynb | BM25, analyzers, relevance tuning, query DSL mastery |
| 4 | 04_vector_search.ipynb | k-NN plugin, FAISS/NMSLIB/Lucene engines, embedding strategies |
| 5 | 05_hybrid_search.ipynb | Combining BM25 + vector, Reciprocal Rank Fusion, search pipelines |
| 6 | 06_hyde_and_advanced_retrieval.ipynb | HyDE, query expansion, re-ranking, RAG patterns |
| 7 | 07_neural_search_ml_commons.ipynb | ML Commons framework, neural search pipelines, model deployment |
| 8 | 08_serving_architecture.ipynb | Read replicas, cross-cluster search, caching, search pipelines |
| 9 | 09_scalability_optimizations.ipynb | Shard strategies, segment merging, refresh tuning, circuit breakers |
| 10 | 10_production_patterns.ipynb | Monitoring, benchmarking (OpenSearch Benchmark), security, ILM |

- Python 3.10+
- Docker (for local OpenSearch cluster)
- `opensearch-py` client library
- AWS account (optional, for managed OpenSearch examples)

Install the dependencies with `pip install -r requirements.txt`, then start the local OpenSearch cluster with `docker-compose up -d`.

- AWS OpenSearch documentation & blogs
- Real-world DynamoDB-to-OpenSearch ingestion patterns
- Industry papers on hybrid search, HyDE, and vector DB scalability
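
Once the local cluster started with `docker-compose up -d` is running, a quick connectivity check can confirm the setup before opening the notebooks. The sketch below uses the `opensearch-py` client and rests on several assumptions that are not stated in this README: the cluster listens on `localhost:9200`, the security plugin is enabled with a self-signed certificate, and credentials are read from the hypothetical environment variables `OPENSEARCH_USER` and `OPENSEARCH_PASSWORD`. Adjust these to match the actual `docker-compose.yml`.

```python
import os


def local_client_kwargs(host="localhost", port=9200):
    """Build keyword arguments for opensearchpy.OpenSearch.

    Host, port, and credentials are assumed local defaults, not values
    taken from this repository.
    """
    # Hypothetical environment variable names; change them to suit your setup.
    user = os.environ.get("OPENSEARCH_USER", "admin")
    password = os.environ.get("OPENSEARCH_PASSWORD", "admin")
    return {
        "hosts": [{"host": host, "port": port}],
        "http_auth": (user, password),
        "use_ssl": True,         # assumes the security plugin is enabled
        "verify_certs": False,   # assumes a self-signed demo cert; local use only
        "ssl_show_warn": False,  # silence the warning caused by verify_certs=False
    }


def check_cluster():
    """Connect to the cluster and return its version string."""
    # Imported lazily so the helper above works without opensearch-py installed.
    from opensearchpy import OpenSearch

    client = OpenSearch(**local_client_kwargs())
    info = client.info()  # GET / on the cluster
    return info["version"]["number"]
```

Calling `check_cluster()` from a Python shell should return the cluster's version string if the containers are up. A connection error usually means the cluster is still starting or the assumed port, TLS, or credential settings differ from the compose file.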