Tutorials
- Tutorial by Stanley Osher: PDE-Based Trustworthy Deep Learning (video: Stanley Osher, "Trustworthy deep learning", YouTube)
- NeurIPS talk: ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies
- SIAM MDS talk: ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies
- Talk by Lars Ruthotto: Partial Differential Equations Meet Deep Learning: Old Solutions for New Problems and Vice Versa
- Mathematics of Deep Learning
- Talk by Lars Ruthotto: Machine Learning Meets Optimal Transport: Old Solutions for New Problems and Vice Versa
ODE/PDE/SDE
- Lars Ruthotto: Deep Neural Networks Motivated by Partial Differential Equations (video 1 / video 2)
- SIAM MDS Minitutorial: Deep Neural Networks for High-Dimensional Parabolic PDEs by Christoph Reisinger
- ICML 18: Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations.
- Integrating Physics-Based Modeling With Machine Learning: A Survey
- NIPS 18 best paper: Neural Ordinary Differential Equations
- NIPS 18 Nonlocal neural networks, nonlocal diffusion and nonlocal modeling
- Deep neural networks motivated by partial differential equations
- ICML 18 PDE-Net: Learning PDEs from Data
- Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit (presentation video)
- AISTATS 20: Scalable Gradients for Stochastic Differential Equations
- ICLR 21 Learning Neural Event Functions for Ordinary Differential Equations
- PNAS Solving high-dimensional partial differential equations using deep learning
With Graphs
- GitHub: zongyi-li/graph-pde (using graph networks to solve PDEs)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations
- Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid Flow Prediction
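The graph-based solvers above all rest on the same primitive: a spatial differential operator discretized as a graph Laplacian. A minimal numpy sketch of one explicit diffusion step (the ring graph, step size, and initial condition are illustrative choices, not from any of the papers):

```python
import numpy as np

# One explicit step of the heat equation du/dt = -L u, where L is the
# graph Laplacian of a small ring graph. Graph-network PDE solvers
# generalize this kind of local operator to irregular meshes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 4-node ring (illustrative)
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A             # graph Laplacian

u = np.array([1.0, 0.0, 0.0, 0.0])         # heat concentrated at node 0
h = 0.1                                    # stable: h < 2 / lambda_max(L)
for _ in range(200):
    u = u - h * L @ u                      # forward Euler diffusion step
print(u)  # heat spreads toward the uniform state; total heat is conserved
```

The row sums of a graph Laplacian are zero, so the total "heat" `u.sum()` is invariant under the update, mirroring conservation in the continuous PDE.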
Deep Learning and ODEs
A very early paper using differential equations to design a residual-like network:
- Chen Y, Yu W, Pock T. On learning optimized reaction diffusion processes for effective image restoration. CVPR 2015.
The first papers introducing the idea of linking ODEs and deep ResNets:
- Weinan E. A proposal on machine learning via dynamical systems[J]. Communications in Mathematics and Statistics, 2017, 5(1): 1-11.
- Sonoda S, Murata N. Transport analysis of infinitely deep neural network[J]. The Journal of Machine Learning Research, 2019, 20(1): 31-82. (first posted to arXiv in 2017)
- Haber E, Ruthotto L. Stable architectures for deep neural networks[J]. Inverse Problems, 2017, 34(1): 014004.
- Lu Y, Zhong A, Li Q, et al. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations[J]. arXiv preprint arXiv:1710.10121, 2017. (ICLR Workshop 2018 / ICML 2018)
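The link these papers draw is that a residual block x_{k+1} = x_k + h f(x_k) is exactly a forward-Euler step of the ODE dx/dt = f(x), so a deeper network with a smaller residual scaling approximates the same continuous flow. A toy numpy sketch (the tanh dynamics and random weights are illustrative, not from any of the papers):

```python
import numpy as np

def f(x, W):
    # A toy "layer" function with shared weights W (illustrative).
    return np.tanh(W @ x)

def resnet_forward(x, W, depth, h):
    # Stack of residual blocks: x_{k+1} = x_k + h * f(x_k).
    # This is forward Euler applied to dx/dt = f(x).
    for _ in range(depth):
        x = x + h * f(x, W)
    return x

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))
x0 = rng.standard_normal(4)

# Doubling the depth while shrinking the step size approximates the
# same ODE solution at time T = depth * h = 1.
coarse = resnet_forward(x0, W, depth=10, h=0.1)
fine = resnet_forward(x0, W, depth=1000, h=0.001)
print(np.linalg.norm(coarse - fine))  # both depths approximate x(1)
```

The point of the sketch: depth and step size are exchangeable once the network is read as a discretized flow, which is what makes stability and scheme design (sections below) meaningful for architectures.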
Architecture Design
- Chang B, Meng L, Haber E, et al. Reversible architectures for arbitrarily deep residual neural networks[C]//Thirty-Second AAAI Conference on Artificial Intelligence. 2018.
- Haber E, Ruthotto L. Stable architectures for deep neural networks[J]. Inverse Problems, 2017.
- Lu Y, et al. Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations. ICML 2018.
- Chang B, Chen M, Haber E, et al. AntisymmetricRNN: A dynamical system view on recurrent neural networks[J]. arXiv preprint arXiv:1902.09689, 2019. (ICLR 2019)
- Rubanova Y, Chen R T Q, Duvenaud D. Latent ODEs for Irregularly-Sampled Time Series. NeurIPS 2019.
- Chen R T Q, Duvenaud D. Neural Networks with Cheap Differential Operators[C]//2019 ICML Workshop on Invertible Neural Nets and Normalizing Flows (INNF). 2019.
- Dupont E, Doucet A, Teh Y W. Augmented Neural ODEs[J]. arXiv preprint arXiv:1904.01681, 2019.
- Zhong Y D, Dey B, Chakraborty A. Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control[J]. arXiv preprint arXiv:1909.12077, 2019.
- Che Z, Purushotham S, Cho K, et al. Recurrent neural networks for multivariate time series with missing values[J]. Scientific reports, 2018, 8(1): 6085.
Modeling other networks
- Tao Y, Sun Q, Du Q, et al. Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling. NeurIPS 2018. (Modeling nonlocal neural networks)
- Lu Y, Li Z, He D, et al. Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View. arXiv preprint arXiv:1906.02762, 2019. (modeling Transformer-like seq2seq networks)
- Variational Integrator Networks for Physically Meaningful Embeddings [link]
Changing schemes
- Zhang L, Schaeffer H. Forward Stability of ResNet and Its Variants. arXiv preprint arXiv:1811.09885, 2018.
- Zhu M, Chang B, Fu C. Convolutional Neural Networks combined with Runge-Kutta Methods. arXiv:1802.08831, 2018.
- Xie X, Bao F, Maier T, Webster C. Analytic Continuation of Noisy Data Using Adams Bashforth ResNet. arXiv:1905.10430, 2019.
- Dynamical System Inspired Adaptive Time Stepping Controller for Residual Network Families. AAAI 2020.
- Herty M, Trimborn T, Visconti G. Kinetic Theory for Residual Neural Networks[J]. arXiv preprint arXiv:2001.04294, 2020.
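The scheme-changing papers above replace the forward-Euler update of a plain residual block with a higher-order integrator such as Runge-Kutta. A hedged numpy sketch comparing an Euler block with a midpoint (RK2) block on toy dynamics (the tanh layer and random weights are illustrative assumptions):

```python
import numpy as np

def f(x, W):
    # Toy layer function shared by both schemes (illustrative).
    return np.tanh(W @ x)

def euler_block(x, W, h):
    # Plain ResNet block = one forward-Euler step.
    return x + h * f(x, W)

def midpoint_block(x, W, h):
    # RK2 (midpoint) block: evaluate f again at a half step.
    k1 = f(x, W)
    k2 = f(x + 0.5 * h * k1, W)
    return x + h * k2

rng = np.random.default_rng(1)
W = 0.1 * rng.standard_normal((3, 3))
x0 = rng.standard_normal(3)

# Reference solution: very fine Euler integration up to T = 1.
ref = x0.copy()
for _ in range(100000):
    ref = euler_block(ref, W, 1e-5)

def error(block, steps):
    # Integrate to T = 1 with the given block and measure the error.
    x = x0.copy()
    h = 1.0 / steps
    for _ in range(steps):
        x = block(x, W, h)
    return np.linalg.norm(x - ref)

# At equal depth, the second-order scheme tracks the flow more closely.
print(error(euler_block, 10), error(midpoint_block, 10))
```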
Training Algorithm
Adjoint Method
- Li Q, Chen L, Tai C, et al. Maximum principle based algorithms for deep learning[J]. The Journal of Machine Learning Research, 2017, 18(1): 5998-6026.
- Li Q, Hao S. An optimal control approach to deep learning and applications to discrete-weight neural networks[J]. arXiv preprint arXiv:1803.01299, 2018.
- Chen T Q, Rubanova Y, Bettencourt J, et al. Neural ordinary differential equations[C]//Advances in neural information processing systems. 2018: 6571-6583.
- Zhang D, Zhang T, Lu Y, et al. You only propagate once: Painless adversarial training using maximal principle[J]. arXiv preprint arXiv:1905.00877, 2019. (NeurIPS 2019)
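The adjoint/maximum-principle approach in these papers computes gradients by integrating an adjoint state backward in time rather than differentiating through stored activations. A self-contained numpy sketch for the linear ODE dz/dt = A z with loss L = 0.5 ||z(T)||^2, checked against a finite difference (the dynamics, horizon, and loss are illustrative choices, not any paper's setup):

```python
import numpy as np

def solve(z0, A, T, n):
    # Forward pass: Euler discretization of dz/dt = A z.
    h = T / n
    traj = [z0]
    for _ in range(n):
        traj.append(traj[-1] + h * A @ traj[-1])
    return traj

def adjoint_grad(z0, A, T, n):
    # Discrete adjoint: a(T) = dL/dz(T), then run a backward in time
    # (the continuous analogue is da/dt = -A^T a) while accumulating
    # dL/dA = integral of a(t) z(t)^T dt.
    h = T / n
    traj = solve(z0, A, T, n)
    a = traj[-1].copy()                 # dL/dz(T) = z(T) for this loss
    grad = np.zeros_like(A)
    for k in range(n, 0, -1):
        grad += h * np.outer(a, traj[k - 1])
        a = a + h * A.T @ a             # backward adjoint step
    return grad

rng = np.random.default_rng(2)
A = 0.3 * rng.standard_normal((3, 3))
z0 = rng.standard_normal(3)
T, n = 1.0, 2000

def loss(A):
    zT = solve(z0, A, T, n)[-1]
    return 0.5 * zT @ zT

# Check one entry of the adjoint gradient against finite differences.
g = adjoint_grad(z0, A, T, n)
eps = 1e-5
Ap = A.copy()
Ap[0, 1] += eps
fd = (loss(Ap) - loss(A)) / eps
print(g[0, 1], fd)
```

Because the backward pass only carries the adjoint vector, memory does not grow with depth, which is the property the Neural ODE and YOPO papers above exploit.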
Multigrid-like algorithms
- Chang B, Meng L, Haber E, et al. Multi-level residual networks from dynamical systems view[J]. arXiv preprint arXiv:1710.10348, 2017.
- Günther S, Ruthotto L, Schroder J B, et al. Layer-parallel training of deep residual neural networks[J]. SIAM Journal on Mathematics of Data Science, 2020, 2(1): 1-23.
- Parpas P, Muir C. Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks. arXiv:1902.02542.
Linking SDEs
- Lu Y, et al. Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations. ICML 2018.
- Sun Q, Tao Y, Du Q. Stochastic training of residual networks: a differential equation viewpoint[J]. arXiv preprint arXiv:1812.00174, 2018.
- Tzen B, Raginsky M. Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit[J]. arXiv preprint arXiv:1905.09883, 2019.
- Twomey N, Kozłowski M, Santos-Rodríguez R. Neural ODEs with stochastic vector field mixtures[J]. arXiv preprint arXiv:1905.09905, 2019.
- Neural Jump Stochastic Differential Equations. arXiv:1905.10403.
- Neural Stochastic Differential Equations. arXiv:1905.11065.
- Wang B, Yuan B, Shi Z, et al. EnResNet: ResNet ensemble via the Feynman-Kac formalism. arXiv preprint arXiv:1811.10745, 2018.
- Li X, Wong T K L, Chen R T Q, et al. Scalable Gradients for Stochastic Differential Equations[J]. arXiv preprint arXiv:2001.01328, 2020.
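The SDE papers above read a residual block with injected noise as an Euler-Maruyama step of dx = f(x) dt + sigma dW; averaging many noisy forward passes, as in the Feynman-Kac ensemble work, estimates the expected output. A toy numpy sketch (the dynamics, noise level, and sample count are illustrative assumptions):

```python
import numpy as np

def f(x, W):
    # Toy layer function (illustrative).
    return np.tanh(W @ x)

def noisy_forward(x, W, depth, h, sigma, rng):
    # Euler-Maruyama: x_{k+1} = x_k + h f(x_k) + sqrt(h) sigma xi_k.
    # With sigma = 0 this reduces to a plain (Euler) residual network.
    for _ in range(depth):
        x = x + h * f(x, W) + np.sqrt(h) * sigma * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(3)
W = 0.1 * rng.standard_normal((4, 4))
x0 = rng.standard_normal(4)

clean = noisy_forward(x0, W, 20, 0.05, 0.0, rng)   # noise-free ResNet path
samples = [noisy_forward(x0, W, 20, 0.05, 0.3, rng) for _ in range(2000)]
mean = np.mean(samples, axis=0)
print(np.linalg.norm(mean - clean))  # ensemble mean tracks the clean path
```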
Theoretical Papers
- Weinan E, Han J, Li Q. A mean-field optimal control formulation of deep learning[J]. Research in the Mathematical Sciences, 2019, 6(1): 10.
- Thorpe M, van Gennip Y. Deep limits of residual neural networks[J]. arXiv preprint arXiv:1810.11741, 2018.
- Avelin B, Nyström K. Neural ODEs as the Deep Limit of ResNets with constant weights[J]. arXiv preprint arXiv:1906.12183, 2019.
- Zhang H, Gao X, Unterman J, et al. Approximation Capabilities of Neural Ordinary Differential Equations[J]. arXiv preprint arXiv:1907.12998, 2019.
- Hu K, Kazeykina A, Ren Z. Mean-field Langevin System, Optimal Control and Deep Neural Networks[J]. arXiv preprint arXiv:1909.07278, 2019.
- Tzen B, Raginsky M. Theoretical guarantees for sampling and inference in generative models with latent diffusions[J]. arXiv preprint arXiv:1903.01608, 2019. (COLT 2019)
Robustness
- Zhang J, Han B, Wynter L, Low KH, Kankanhalli M. Towards robust resnet: A small step but a giant leap. IJCAI 2019.
- Yan H, Du J, Tan V Y F, et al. On Robustness of Neural Ordinary Differential Equations[J]. arXiv preprint arXiv:1910.05513, 2019.
- Liu X, Si S, Cao Q, et al. Neural SDE: Stabilizing Neural ODE Networks with Stochastic Noise[J]. arXiv preprint arXiv:1906.02355, 2019.
- Reshniak V, Webster C. Robust learning with implicit residual networks[J]. arXiv preprint arXiv:1905.10479, 2019.
- Wang B, Yuan B, Shi Z, et al. EnResNet: ResNet ensemble via the Feynman-Kac formalism[J]. arXiv preprint arXiv:1811.10745, 2018. (NeurIPS 2019)
Generative Models
- Chen R T Q, Rubanova Y, Bettencourt J, Duvenaud D. Neural Ordinary Differential Equations. NeurIPS 2018. (best paper award)
- Grathwohl W, Chen R T Q, Bettencourt J, Sutskever I, Duvenaud D. FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models. ICLR 2019. (oral)
- Behrmann J, Grathwohl W, Chen R T Q, Duvenaud D, Jacobsen J-H. Invertible Residual Networks. ICML 2019. (long oral)
- Chen R T Q, Behrmann J, Duvenaud D, Jacobsen J-H. Residual Flows for Invertible Generative Modeling. NeurIPS 2019. (spotlight)
- Quaglino A, Gallieri M, Masci J, et al. Accelerating Neural ODEs with Spectral Elements[J]. arXiv preprint arXiv:1906.07038, 2019.
- Yıldız Ç, Heinonen M, Lähdesmäki H. ODE²VAE: Deep Generative Second Order ODEs with Bayesian Neural Networks[J]. arXiv preprint arXiv:1905.10994, 2019. (NeurIPS 2019)
- ANODEV2: A Coupled Neural ODE Framework. arXiv:1906.04596.
- Port-Hamiltonian Approach to Neural Network Training. CDC 2019.
- Finlay C, Jacobsen J-H, Nurbekyan L, Oberman A M. How to Train Your Neural ODE. arXiv:2002.02798.
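FFJORD and related continuous normalizing flows need the trace of the Jacobian along the ODE, which they make tractable with the Hutchinson estimator tr(J) = E[v^T J v] over random probe vectors. A sketch on a fixed random matrix standing in for the Jacobian (the matrix size and sample count are illustrative):

```python
import numpy as np

# Hutchinson trace estimator: tr(J) = E[v^T J v] for probes v with
# E[v v^T] = I. Each sample needs only a vector-Jacobian product, which
# is what makes the trick O(d) inside a continuous normalizing flow.
rng = np.random.default_rng(4)
J = rng.standard_normal((5, 5))  # stand-in for a Jacobian (illustrative)

def hutchinson_trace(J, n_samples, rng):
    est = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=J.shape[0])  # Rademacher probe
        est += v @ J @ v
    return est / n_samples

print(np.trace(J), hutchinson_trace(J, 20000, rng))  # estimate vs exact
```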
Image Processing
- Liu R, Lin Z, Zhang W, et al. Learning PDEs for image restoration via optimal control[C]//European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010: 115-128.
- Chen Y, Yu W, Pock T. On learning optimized reaction diffusion processes for effective image restoration. CVPR 2015.
- Xiaoshuai Zhang*, Yiping Lu*, Jiaying Liu, Bin Dong. "Dynamically Unfolding Recurrent Restorer: A Moving Endpoint Control Method for Image Restoration." Seventh International Conference on Learning Representations (ICLR) 2019. (*equal contribution)
- Xixi Jia, Sanyang Liu, Xiangchu Feng, Lei Zhang, "FOCNet: A Fractional Optimal Control Network for Image Denoising," in CVPR 2019.
Very early work on learning ODEs/PDEs
- Zhu S C, Mumford D B. Prior learning and Gibbs reaction-diffusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997.
- Gilboa G, Sochen N, Zeevi Y Y. Estimation of optimal PDE-based denoising in the SNR sense[J]. IEEE Transactions on Image Processing, 2006, 15(8): 2269-2280.
- Bongard J, Lipson H. Automated reverse engineering of nonlinear dynamical systems[J]. Proceedings of the National Academy of Sciences, 2007, 104(24): 9943-9948.
Review Paper
- Liu G H, Theodorou E A. Deep learning theory review: An optimal control and dynamical systems perspective[J]. arXiv preprint arXiv:1908.10920, 2019.
3D Vision
- He X, Cao H L, Zhu B. AdvectiveNet: An Eulerian-Lagrangian Fluidic Reservoir for Point Cloud Processing[J]. arXiv preprint arXiv:2002.00118, 2020.