Recent research by a team of scientists from Bocconi University, Politecnico di Torino, and the Bocconi Institute for Data Science and Analytics sheds light on the intricate landscape of neural networks. The study, centered on the negative spherical perceptron, a basic non-convex neural network model, unveils a star-shaped geometry within its solution space.
The researchers set out to reproduce analytically a phenomenon that had been observed empirically in neural network landscapes. Luca Saglietti, co-author of the paper, explained, "Recent research on the landscape of neural networks has shown that independent stochastic gradient descent (SGD) trajectories often land in the same low-loss basin, and often no barrier is found along the linear interpolation between them."
The analytical method introduced by the researchers allows the computation of energy barriers along the linear interpolation between pairs of solutions. The method was applied to the negative spherical perceptron, a continuous, non-convex model known for its rich solution space in the overparameterized regime.
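The quantity the method computes can be illustrated numerically. The sketch below (our own illustration, not the paper's code; the function names are ours, and we take the "energy" of a configuration to be its number of violated constraints) walks along the straight line between two weight vectors, rescales each point back onto the sphere, and reports how much the energy rises above the endpoints:

```python
import numpy as np

def energy(w, X, kappa):
    # number of violated constraints x^mu . w / sqrt(N) >= kappa
    # (an illustrative choice of "energy" for this sketch)
    N = len(w)
    return int(np.sum(X @ w / np.sqrt(N) < kappa))

def barrier_on_segment(w1, w2, X, kappa, n_points=101):
    # evaluate the energy along the linear interpolation
    # w(t) = (1 - t) w1 + t w2, rescaled onto the sphere |w|^2 = N
    N = len(w1)
    energies = []
    for t in np.linspace(0.0, 1.0, n_points):
        w = (1 - t) * w1 + t * w2
        w *= np.sqrt(N) / np.linalg.norm(w)
        energies.append(energy(w, X, kappa))
    # the barrier is the excess of the highest point over the endpoints
    return max(energies) - max(energies[0], energies[-1])

# toy usage with random data (illustrative only)
rng = np.random.default_rng(0)
P, N, kappa = 30, 100, -0.5
X = rng.standard_normal((P, N))
w1 = rng.standard_normal(N); w1 *= np.sqrt(N) / np.linalg.norm(w1)
w2 = rng.standard_normal(N); w2 *= np.sqrt(N) / np.linalg.norm(w2)
print(barrier_on_segment(w1, w2, X, kappa))
```

A barrier of zero along the segment means the two configurations are linearly connected at that energy level, which is the connectivity property discussed in the study.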
Clarissa Lauditi, another co-author, emphasized, "In the negative perceptron, the constraints can be relaxed, and in this overparameterized regime, the geometry of the solution space becomes surprisingly rich."
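The relaxed constraints Lauditi describes can be made concrete with a small sketch. In the negative spherical perceptron, a weight vector on the sphere |w|² = N must satisfy P random constraints x^μ·w/√N ≥ κ with a negative margin κ, which makes each constraint easier to meet than in the classical perceptron. The function names and the projected perceptron-style update below are our own illustrative choices, not the authors' method:

```python
import numpy as np

def margins(w, X):
    # per-pattern stabilities x^mu . w / sqrt(N)
    return X @ w / np.sqrt(len(w))

def is_solution(w, X, kappa):
    # w solves the negative perceptron if every stability exceeds kappa
    return bool(np.all(margins(w, X) >= kappa))

def find_solution(X, kappa, steps=5000, lr=0.05, seed=0):
    # projected gradient descent on the hinge loss sum_mu max(0, kappa - margin_mu),
    # re-normalizing onto the sphere |w|^2 = N after each step
    rng = np.random.default_rng(seed)
    P, N = X.shape
    w = rng.standard_normal(N)
    w *= np.sqrt(N) / np.linalg.norm(w)
    for _ in range(steps):
        viol = margins(w, X) < kappa  # currently unsatisfied constraints
        if not viol.any():
            break
        w += lr * X[viol].sum(axis=0) / np.sqrt(N)  # push violated margins up
        w *= np.sqrt(N) / np.linalg.norm(w)         # project back to the sphere
    return w

# overparameterized regime: fewer constraints than weights (P < N), negative kappa
rng = np.random.default_rng(1)
P, N, kappa = 50, 200, -0.5
X = rng.standard_normal((P, N))
w = find_solution(X, kappa)
print(is_solution(w, X, kappa))
```

With P well below N and κ < 0, such a toy problem typically has many solutions, which is the regime in which the rich solution-space geometry emerges.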
By employing the replica method, a well-established technique from the statistical physics of disordered systems, the researchers uncovered a surprising result: the solutions of the negative perceptron are arranged in a star-shaped geometry. Enrico Malatesta, co-author, noted, "Most of them are located on the tips of the star, but there exists a subset of solutions (located in the core of the star) that are connected through a straight line to almost all the other solutions."
The star-shaped geometry has significant implications for the behavior of training algorithms used in deep learning. Malatesta added, "Common algorithms used in deep learning have a bias towards the solutions located in the core of the star. Those solutions have desirable properties, e.g. better robustness and generalization capabilities."
This research provides valuable insights into the geometry of the solution space of the negative spherical perceptron, with implications for the performance of training algorithms. Gabriele Perugini, co-author, expressed the team's future goals, stating, "Our future research aims to understand to what extent the star-shaped geometry may be a universal property of overparameterized neural networks and weakly constrained optimization problems."
In conclusion, the study presents intriguing possibilities for the universality of the star-shaped geometry in overparameterized neural networks, hinting at potential connectivity properties in various non-convex optimization problems.