Stable Differentiable Modal Synthesis for Learning Nonlinear Dynamics

Victor Zheleznov¹, Stefan Bilbao¹, Alec Wright¹ and Simon King²

¹Acoustics and Audio Group, University of Edinburgh, Edinburgh, UK
²Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK

Accompanying web-page for the JAES submission

Code arXiv

Abstract

Modal methods are a long-standing approach to physical modelling synthesis. Extensions to nonlinear problems are possible, including the case of high-amplitude vibration of a string. A modal decomposition leads to a densely coupled nonlinear system of ordinary differential equations. Recent work on scalar auxiliary variable techniques has enabled the construction of explicit and stable numerical solvers for such classes of nonlinear systems. On the other hand, machine learning approaches (in particular neural ordinary differential equations) have been successful in modelling nonlinear systems automatically from data. In this work, we examine how scalar auxiliary variable techniques can be combined with neural ordinary differential equations to yield a stable differentiable model capable of learning nonlinear dynamics. The proposed approach leverages the analytical solution for the linear vibration of the system’s modes, so that the physical parameters of the system remain easily accessible after training without the need for a parameter encoder in the model architecture. As a proof of concept, we generate synthetic data for the nonlinear transverse vibration of a string and show that the model can be trained to reproduce the nonlinear dynamics of the system. Sound examples are presented.
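To illustrate what an analytical solution for the linear vibration of a single mode can look like, here is a minimal sketch. It assumes an underdamped mode governed by q'' + 2σq' + ω²q = 0. The names `exact_mode_step`, `omega`, `sigma` and `k` are hypothetical and are not taken from the manuscript or the accompanying repository, and the formulation used in the paper may differ.

```python
import math


def exact_mode_step(q, p, omega, sigma, k):
    """Advance one underdamped linear mode by a time step k, exactly.

    Assumed model (illustrative only): q'' + 2*sigma*q' + omega**2 * q = 0,
    with displacement q, velocity p = q', and sigma < omega.
    """
    omega_d = math.sqrt(omega**2 - sigma**2)  # damped angular frequency
    decay = math.exp(-sigma * k)              # amplitude decay over one step
    c = math.cos(omega_d * k)
    s = math.sin(omega_d * k)
    # Closed-form solution of the linear ODE evaluated at t = k
    q_next = decay * (q * c + (p + sigma * q) / omega_d * s)
    p_next = decay * (p * c - (sigma * p + omega**2 * q) / omega_d * s)
    return q_next, p_next


# Usage: step a 220 Hz mode with light damping at a 44.1 kHz sample rate
q, p = 1e-3, 0.0
for _ in range(3):
    q, p = exact_mode_step(q, p, omega=2 * math.pi * 220.0, sigma=1.5, k=1 / 44100)
```

Because the update is the exact solution, it introduces no time-discretisation error for the linear part, and the physical quantities (here `omega` and `sigma`) appear explicitly rather than being hidden inside network weights.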

Sound Examples

Below are selected sound examples from the datasets used in the submission, listed with their string and excitation parameters. All sound examples can be downloaded from the accompanying repository.

Test Dataset

$\gamma$	$\kappa$	$\nu$	$x_{\mathrm{e}}$	$x_{\mathrm{o}}$	$f_{\mathrm{amp}}$	$T_{\mathrm{e}}$	Note
232.5	1.05	174.4	0.12	0.90	4.9e+04	7.2e-04	Largest relative MSE for audio output (illustrated in the manuscript)
209.0	1.08	129.1	0.37	0.23	4.9e+04	1.2e-03	Smallest relative MSE for audio output
196.4	1.05	171.9	0.85	0.89	4.8e+04	1.3e-03	Strongest nonlinear effects (illustrated in the manuscript)
243.1	1.05	146.1	0.43	0.82	4.4e+04	1.4e-03	Random example #1
180.9	1.08	157.1	0.13	0.32	4.9e+04	1.4e-03	Random example #2
184.0	1.08	148.7	0.63	0.74	4.3e+04	6.6e-04	Random example #3
246.9	1.08	167.4	0.87	0.64	4.2e+04	9.9e-04	Random example #4
202.0	1.07	171.1	0.81	0.79	3.8e+04	1.1e-03	Random example #5
196.8	1.08	140.5	0.29	0.50	4.2e+04	1.5e-03	Random example #6
202.8	1.06	126.1	0.57	0.61	4.1e+04	5.3e-04	Random example #7
191.7	1.06	139.1	0.65	0.87	3.8e+04	1.3e-03	Random example #8
229.8	1.08	138.6	0.81	0.55	4.8e+04	1.1e-03	Random example #9
190.5	1.08	125.2	0.87	0.58	4.8e+04	1.4e-03	Random example #10

Validation Dataset

$\gamma$	$\kappa$	$\nu$	$x_{\mathrm{e}}$	$x_{\mathrm{o}}$	$f_{\mathrm{amp}}$	$T_{\mathrm{e}}$	Note
203.2	1.07	171.0	0.83	0.38	4.7e+04	5.0e-04	Largest relative MSE for audio output
210.2	1.06	139.1	0.78	0.49	4.3e+04	1.4e-03	Smallest relative MSE for audio output
202.0	1.07	154.0	0.23	0.60	4.9e+04	1.4e-03	Random example #1
231.3	1.06	160.8	0.39	0.21	3.5e+04	9.6e-04	Random example #2
177.1	1.08	170.5	0.30	0.19	4.7e+04	1.2e-03	Random example #3
229.6	1.06	140.5	0.79	0.27	4.3e+04	8.0e-04	Random example #4
190.7	1.06	130.9	0.22	0.28	3.5e+04	1.4e-03	Random example #5

Training Dataset

$\gamma$	$\kappa$	$\nu$	$x_{\mathrm{e}}$	$x_{\mathrm{o}}$	$f_{\mathrm{amp}}$	$T_{\mathrm{e}}$	Note
169.5	1.03	169.9	0.72	0.31	2.5e+04	5.3e-04	Largest relative MSE for audio output
148.6	1.02	129.4	0.47	0.56	3.0e+04	1.3e-03	Smallest relative MSE for audio output
150.3	1.01	172.5	0.37	0.16	3.4e+04	1.5e-03	Random example #1
144.9	1.04	164.9	0.28	0.71	2.6e+04	1.0e-03	Random example #2
167.3	1.05	138.4	0.14	0.21	2.8e+04	8.1e-04	Random example #3
141.4	1.02	163.2	0.67	0.11	2.9e+04	7.5e-04	Random example #4
125.4	1.01	173.1	0.62	0.72	2.9e+04	7.2e-04	Random example #5
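
The Note column in the tables above ranks examples by relative MSE for the audio output. As a rough sketch, one common definition normalises the squared error by the energy of the reference signal. This definition and the name `relative_mse` are assumptions made for illustration; the exact metric used in the submission is defined in the manuscript.

```python
def relative_mse(reference, estimate):
    """Squared error between two signals, normalised by the reference energy.

    Assumed definition (illustrative only):
        sum((estimate - reference)**2) / sum(reference**2)
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    error_energy = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    reference_energy = sum(r ** 2 for r in reference)
    if reference_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    return error_energy / reference_energy


# Usage: a perfect prediction gives 0, an all-zero prediction gives 1
perfect = relative_mse([0.5, -0.25, 0.125], [0.5, -0.25, 0.125])
silent = relative_mse([1.0, -1.0], [0.0, 0.0])
```

Normalising by the reference energy makes the score comparable across examples with different excitation amplitudes.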