(Realtime) Temporal Convolutions in PyTorch
Updated Apr 7, 2025 - Python
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
Dual-model speech AI toolkit for speaker verification and speaker-aware diarization, with streaming inference, meeting analysis, long-audio monitoring, and speaker-bank integration.
Pure PyTorch + 🤗 Transformers reimplementation of Megalodon (CEMA + chunked attention) - readable, hackable, no CUDA kernels required
World's most deployable time series foundation model — 200K-6.5M params, zero-shot forecasting, streaming RNN inference, ONNX edge deployment, runs on Raspberry Pi
Open ML systems platform for training, profiling, evaluating, and serving AI models.
Lossless AI model compression - ~34% smaller with bit-identical weights; the autopilot profiles your machine, picks the highest fidelity that runs, and streams models bigger than your RAM.
Don't Think It Twice: Exploit Shift Invariance for Efficient Online Streaming Inference of CNNs
CascadeLUT: Information-Ordered Streaming Inference for Bandwidth-Constrained FPGAs [FPL'26]
CPU-native inference runtime. Local-propagation paradigm: the active region pays the cost, not the field. Bit-exact across architectures. Validated for streaming anomaly detection and audio VAD.
Streaming version of S4ND-U-Net
Real-time music-genre classification: spectrogram CNN, ONNX-optimised, served as a streaming/chunked classifier with PyTorch-vs-ONNX benchmarks. Track-aware GTZAN eval.