Fast and Easy Infinite Neural Networks in Python
CVPR 2024: Improved Implicit Neural Representation with Fourier Reparameterized Training
ICML 2025: Inductive Gradient Adjustment for Spectral Bias in Implicit Neural Representations
Existing literature on training-data analysis.
A unified framework for attributing model components, data, and training dynamics to model behavior.
Official repository for "FOCUS: First Order Concentrated Updating Scheme"
Code for "What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers" (NeurIPS 2025)
Code for "Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics"
Source code for "Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies"
Code for "Effect of equivariance on training dynamics"
Official repository for the EMNLP 2024 paper "How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics"
An external LLM monitors and diagnoses MoE expert ecology during training, preventing routing collapse without auxiliary-loss engineering. 16 experts, 3 tiers, top-2 gating, Claude-in-the-loop.
Code and data for "Three Phases of Expert Routing: How Load Balance Evolves During MoE Training"
Cross-Family Convergence of Neural Network Weight Skeletons. Companion to Zenodo paper (10.5281/zenodo.19652706).
Code for tracking concept emergence via attention-head binding (EB*). Pythia experiments across 160M-2.8B parameters.
Code for "Abrupt Learning in Transformers: A Case Study on Matrix Completion" (NeurIPS 2024)
σFlow-PDE: A drop-in H-Bar training engine that escapes the σ-trap in neural PDE solvers via live σ/δ/α ODE integration, autonomous phase curriculum, and auto-falsification.
Reimplementation of the Sliced Information Plane (SIP) framework from Wongso, Ghosh, and Motani (2025) for analyzing deep neural network training dynamics. The repo uses Sliced Mutual Information (SMI) to obtain scalable, finite dependence estimates in high-dimensional, deterministic settings, and applies them to MNIST MLP experiments.
Supplementary code for the paper "Dynamic Rescaling for Training GNNs", published at NeurIPS 2024
Atomic benchmark suite showing drift can act as an early warning before direct symmetry detection in gradual-breaking regimes, with reversal controls, finite-budget sensitivity tests, and exact alarm-time validation.