An intelligent audio analysis tool for detecting and analyzing drops in electronic music. Built with Python, Streamlit, and machine learning.
Here Comes The Drop is a comprehensive drop detection and analysis system for EDM and electronic music. It analyzes audio tracks to identify tension build-ups, drops, and impact events using audio signal processing and optional machine learning predictions.
- Feature Extraction: RMS energy, bass RMS, spectral centroid, spectral flux, and onset strength
- Temporal Analysis: Sliding window statistics for tracking changes over time
- Score Calculation:
  - Tension Score: Measures build-up and anticipation
  - Impact Score: Detects sudden energy releases
  - Stability Score: Tracks rhythmic consistency
- Automatic detection of:
  - Drops: The moment when tension releases into the main beat
  - High Tension Zones: Build-up sections before drops
  - Impact Events: Sudden energy spikes
- Real-time audio playback with waveform display
- Interactive timeline with zoomable, toggleable trace layers
- Click-to-seek functionality on timeline and events
- Color-coded event markers
- Train custom drop detection models from your annotations
- Real-time ML predictions with confidence scores
- Export training data in JSON format
- Manual drop marking for creating training datasets
- Confirm or reject auto-detected drops
- Export annotations for ML training
- Python 3.8 or higher
- pip package manager
- Clone the repository:
git clone https://github.com/yourusername/here-comes-the-drop.git
cd here-comes-the-drop
- Install dependencies:
pip install -r requirements.txt
Start the Streamlit app:
streamlit run app.py
The app will open in your default browser at http://localhost:8501.
- Upload an audio file (WAV, MP3, FLAC, or OGG)
- Adjust analysis parameters in the sidebar:
  - Hop length (step between analysis frames, in samples)
  - Temporal window size
  - Score weights for tension, impact, and stability
  - Detection thresholds
- Click Analyze
- Explore the interactive timeline and detected events
- Click on events or the timeline to seek playback
- Analyze multiple tracks and add manual annotations
- Confirm or reject auto-detected drops
- Run the training script:
python -m ml.train_model
- Enable "ML predictions" in the sidebar to use the trained model
here-comes-the-drop/
├── app.py # Main Streamlit application
├── requirements.txt # Python dependencies
│
├── core/ # Core analysis modules
│ ├── audio_loader.py # Audio file loading
│ ├── feature_extractor.py # Audio feature extraction
│ ├── temporal_features.py # Temporal analysis
│ ├── score_calculator.py # Tension/Impact/Stability scores
│ └── event_detector.py # Drop/tension/impact detection
│
├── visualization/ # Visualization components
│ ├── timeline.py # Interactive timeline chart
│ └── audio_player.py # Audio playback control
│
├── export/ # Data export utilities
│ ├── data_exporter.py # JSON export
│ └── annotation_manager.py # Annotation management
│
├── ml/ # Machine learning components
│ ├── dataset_builder.py # Training dataset creation
│ ├── train_model.py # Model training script
│ └── predictor.py # ML inference
│
├── output/ # Analysis results (JSON)
├── annotations/ # Manual annotations (JSON)
├── models/ # Trained ML models
└── samples/ # Sample audio files
The system extracts frame-based audio features using librosa:
- RMS Energy: Overall loudness per frame
- Bass RMS: Energy in the 20-200 Hz range
- Spectral Centroid: Brightness of the sound
- Spectral Flux: Rate of spectral change
- Onset Strength: Percussive transient detection
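As a rough illustration of what these frame-based features measure, the sketch below computes four of them with plain NumPy. The project itself uses librosa, so this is a stand-in and not the code in `core/feature_extractor.py`. The function name `frame_features`, the `n_fft` value, and the exact formulas (for example, treating bass RMS as the RMS of spectral magnitudes between 20 and 200 Hz) are assumptions made for the example. Onset strength is omitted.

```python
import numpy as np

def frame_features(y, sr, n_fft=2048, hop_length=512):
    """Per-frame RMS, bass-band RMS, spectral centroid and spectral flux.

    Hypothetical helper for illustration; not the project's extractor.
    """
    y = np.asarray(y, dtype=float)
    if len(y) < n_fft:
        raise ValueError("signal is shorter than one analysis frame")

    # Slice the signal into overlapping frames: shape (n_frames, n_fft).
    n_frames = 1 + (len(y) - n_fft) // hop_length
    idx = np.arange(n_fft)[None, :] + hop_length * np.arange(n_frames)[:, None]
    frames = y[idx]

    # Magnitude spectrum of each Hann-windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

    # RMS energy: overall loudness of the raw frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # Bass RMS: RMS of the spectral magnitudes in the 20-200 Hz band.
    bass = (freqs >= 20) & (freqs <= 200)
    bass_rms = np.sqrt(np.mean(mag[:, bass] ** 2, axis=1))

    # Spectral centroid: magnitude-weighted mean frequency ("brightness").
    centroid = (mag * freqs).sum(axis=1) / np.maximum(mag.sum(axis=1), 1e-10)

    # Spectral flux: size of the positive spectral change since the last frame.
    diff = np.diff(mag, axis=0, prepend=mag[:1])
    flux = np.sqrt((np.maximum(diff, 0.0) ** 2).sum(axis=1))

    return {"rms": rms, "bass_rms": bass_rms, "centroid": centroid, "flux": flux}
```

Each returned array has one value per frame, so frame `i` starts at `i * hop_length / sr` seconds.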
A sliding window calculates statistics over time:
- Slope (rate of change)
- Variance (instability)
- Mean (average level)
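A minimal sketch of such sliding-window statistics follows. It assumes a trailing window (each frame looks back over the previous `win` frames) and a least-squares line fit for the slope. Both choices, and the name `window_stats`, are assumptions for illustration and may differ from `core/temporal_features.py`.

```python
import numpy as np

def window_stats(x, win):
    """Trailing-window mean, variance and slope for every frame of x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = np.zeros(n)
    var = np.zeros(n)
    slope = np.zeros(n)
    for i in range(n):
        # The window covers up to `win` frames ending at frame i.
        seg = x[max(0, i - win + 1): i + 1]
        mean[i] = seg.mean()
        var[i] = seg.var()
        if len(seg) > 1:
            # Slope of the best-fit line, in feature units per frame.
            slope[i] = np.polyfit(np.arange(len(seg)), seg, 1)[0]
    return mean, var, slope
```

The loop favors clarity over speed; a long track would likely call for a vectorized or cumulative-sum version.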
Combines features into interpretable scores (0-1):
- Tension: High when energy is building (rising RMS, increasing flux)
- Impact: Spikes when bass energy rises suddenly
- Stability: High during consistent rhythmic sections
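One way scores like these could be built is sketched below: each input is min-max normalized to the 0-1 range and then combined. The formulas, the default weights, and the function names are all assumptions for the sake of the example; the actual formulas live in `core/score_calculator.py` and are controlled by the sidebar weights.

```python
import numpy as np

def normalize(x):
    """Min-max scale to the 0-1 range (all zeros if the input is constant)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def tension_score(rms_slope, flux_slope, w_rms=0.6, w_flux=0.4):
    # Only rising trends contribute; weights are hypothetical defaults.
    rising_rms = normalize(np.maximum(rms_slope, 0.0))
    rising_flux = normalize(np.maximum(flux_slope, 0.0))
    return (w_rms * rising_rms + w_flux * rising_flux) / (w_rms + w_flux)

def impact_score(bass_rms):
    # Frame-to-frame increase in bass energy; decreases are ignored.
    bass_rms = np.asarray(bass_rms, dtype=float)
    jump = np.diff(bass_rms, prepend=bass_rms[0])
    return normalize(np.maximum(jump, 0.0))

def stability_score(rms_variance):
    # Low windowed variance means a steady section, so invert it.
    return 1.0 - normalize(rms_variance)
```

Min-max scaling makes the scores relative to each track, so a quiet track and a loud one both span the full 0-1 range.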
Uses configurable thresholds to identify:
- Drops: High tension → high impact → high stability
- Tension Zones: Extended periods of rising tension
- Impact Events: Sudden energy increases
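The drop rule above (high tension, then high impact, then high stability) can be sketched as a simple threshold check around each impact frame. The thresholds, the look-back and look-ahead lengths, and the name `detect_drops` are assumptions chosen for illustration, not the values used by `core/event_detector.py`.

```python
import numpy as np

def detect_drops(tension, impact, stability,
                 t_thr=0.6, i_thr=0.7, s_thr=0.5,
                 lookback=20, lookahead=20):
    """Return frame indices that look like drops under a simple rule."""
    tension, impact, stability = (
        np.asarray(a, dtype=float) for a in (tension, impact, stability)
    )
    drops = []
    for i in np.flatnonzero(impact >= i_thr):
        before = tension[max(0, i - lookback): i]      # build-up region
        after = stability[i + 1: i + 1 + lookahead]    # groove after the hit
        if before.size == 0 or after.size == 0:
            continue
        if before.max() >= t_thr and after.mean() >= s_thr:
            # Skip impacts that fall too close to the previous drop.
            if not drops or i - drops[-1] > lookahead:
                drops.append(int(i))
    return drops
```

Multiplying a returned index by `hop_length / sr` converts it to a timestamp in seconds.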
- Random Forest classifier trained on annotated examples
- Features: Audio characteristics + scores at current frame
- Predicts one of three classes: normal, pre_drop, or drop
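The sketch below shows how a scikit-learn Random Forest can produce a class label and a confidence score for one frame. The training data is synthetic and uses only three made-up columns (tension, impact, stability); the real feature set, hyperparameters, and training data come from `ml/dataset_builder.py` and `ml/train_model.py` and are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def cluster(center, n=50):
    """Synthetic frames scattered around a (tension, impact, stability) point."""
    return np.asarray(center) + 0.05 * rng.standard_normal((n, 3))

# Toy training set: one well-separated cluster per class (illustration only).
X = np.vstack([
    cluster([0.1, 0.1, 0.8]),   # normal: calm and steady
    cluster([0.9, 0.2, 0.3]),   # pre_drop: tension is high
    cluster([0.3, 0.9, 0.6]),   # drop: impact is high
])
y = np.array(["normal"] * 50 + ["pre_drop"] * 50 + ["drop"] * 50)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Class probabilities for one new frame; the largest one is the confidence.
proba = clf.predict_proba([[0.88, 0.15, 0.30]])[0]
label = clf.classes_[np.argmax(proba)]
confidence = float(proba.max())
print(label, round(confidence, 2))
```

`predict_proba` returns one probability per class in the order given by `clf.classes_`, which is why the label is looked up there.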
All analysis parameters are adjustable via the sidebar:
- Hop Length: Step between successive analysis frames (512-4096 samples); smaller values give finer time resolution at higher compute cost
- Temporal Window: Sliding window duration (0.5-3.0 seconds)
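These two parameters interact: the hop length fixes how much time one frame covers, and the window duration then translates into a number of frames. A small sketch of that arithmetic follows, assuming a sample rate of 22050 Hz (librosa's default when loading audio; the rate the app actually uses is not stated here). The helper names are hypothetical.

```python
def frame_duration(hop_length, sr):
    """Seconds of audio between the starts of two consecutive frames."""
    return hop_length / sr

def window_frames(window_seconds, sr, hop_length):
    """Number of frames that fit in a temporal window of the given duration."""
    return max(1, round(window_seconds * sr / hop_length))

sr = 22050
print(frame_duration(512, sr))        # about 0.023 s per frame
print(window_frames(1.0, sr, 512))    # 43 frames in a 1-second window
print(window_frames(1.0, sr, 4096))   # 5 frames in a 1-second window
```

At the largest hop length a short window holds only a handful of frames, so slope and variance estimates become noisier.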
Customize how features contribute to tension, impact, and stability scores.
Adjust sensitivity for detecting tension zones, impacts, and drops.
Analysis results are exported to JSON:
- output/features_<trackname>.json: Frame-by-frame audio features
- output/scores_<trackname>.json: Tension/Impact/Stability scores
- output/events_<trackname>.json: Detected drops and events
- annotations/<trackname>_annotations.json: Manual annotations
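For reference, writing an events file that follows this naming scheme could look like the sketch below. Only the file name pattern comes from the list above; the JSON schema (the `track` and `events` keys and the fields of each event) and the function name `export_events` are assumptions, and the real layout is defined by `export/data_exporter.py`.

```python
import json
from pathlib import Path

def export_events(track_name, events, out_dir="output"):
    """Write detected events to <out_dir>/events_<track_name>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"events_{track_name}.json"
    payload = {"track": track_name, "events": events}  # hypothetical schema
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path

# Example call with a made-up event list:
# export_events("mytrack", [{"type": "drop", "time": 62.4}])
```

Reading the file back with `json.loads(path.read_text())` returns the same dictionary, which makes the exports easy to reuse as training input.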
librosa>=0.10.0 # Audio analysis
numpy>=1.24.0 # Numerical computing
scipy>=1.10.0 # Scientific computing
pandas>=2.0.0 # Data manipulation
plotly>=5.15.0 # Interactive visualization
streamlit>=1.37.0 # Web interface
soundfile>=0.12.0 # Audio I/O
scikit-learn>=1.3.0 # Machine learning
Contributions are welcome! Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
- Share annotated datasets
MIT License - feel free to use this project for any purpose.
Built with:
- librosa - Audio analysis
- Streamlit - Web interface
- Plotly - Interactive charts
- scikit-learn - Machine learning
Analyze. Annotate. Train. Detect.
Made with passion for electronic music.