
NeurIPS 2022: Paper review #6

25 January 2023

Quantitative Research

G-Research were headline sponsors at NeurIPS 2022, in New Orleans.

ML is a fast-evolving discipline; attending conferences like NeurIPS and keeping up-to-date with the latest developments is key to the success of our quantitative researchers and machine learning engineers.

Our NeurIPS 2022 paper review series gives you the opportunity to hear about the research and papers that our quants and ML engineers found most interesting from the conference.

Here, Maria R, Quantitative Researcher at G-Research, discusses two papers from NeurIPS:

Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations
AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations

Steffen Schotthöfer, Emanuele Zangrando, Jonas Kusch, Gianluca Ceruti, Francesco Tudisco

Various results suggest that at the end of training, a large proportion of the parameters in modern neural networks are unnecessary (for example, the lottery ticket hypothesis).

Building on this observation, the authors approximate the weight matrix of the network by a matrix of much smaller rank, of the form USVᵀ, obtained from a singular value decomposition of the original matrix by keeping only the top singular values (i.e. the first entries on the diagonal of S).
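As a rough illustration of the truncation step, the NumPy sketch below keeps the top singular values of a weight matrix and compares parameter counts. The matrix `W`, its shape and the chosen rank are made up for the example; this is not the authors' code.

```python
import numpy as np

def truncated_svd(W, rank):
    """Return factors (U, S, Vt) of the best rank-`rank` approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Singular values come back sorted in decreasing order, so keep the first ones.
    return U[:, :rank], np.diag(s[:rank]), Vt[:rank, :]

rng = np.random.default_rng(0)
# Hypothetical 64 x 32 weight matrix that is close to rank 4 (low rank plus small noise).
W = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))
W += 1e-3 * rng.standard_normal(W.shape)

rank = 4
U, S, Vt = truncated_svd(W, rank)
W_low = U @ S @ Vt

rel_error = np.linalg.norm(W - W_low) / np.linalg.norm(W)
params_full = W.size                     # 64 * 32 = 2048
params_low = U.size + rank + Vt.size     # 64*4 + 4 + 4*32 = 388
print(f"relative error {rel_error:.1e}, parameters {params_full} -> {params_low}")
```

When the matrix really is close to low rank, the factors hold far fewer numbers than the full matrix while reproducing it almost exactly.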

To ensure the low-rank structure of the matrix is preserved, rather than optimising with discrete stochastic gradient descent steps, the authors integrate the gradient flow equations for each of U, S and V.
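To give a feel for what updating the factors separately can look like, here is a simplified fixed-rank sketch: update the two bases first, re-orthonormalise them, then update the small matrix S in the new bases. It assumes a toy loss 0.5 * ||W - A||² with a made-up target `A` and explicit Euler steps. It is not the paper's Algorithm 1, and the function names are illustrative only.

```python
import numpy as np

def grad_loss(W, A):
    """Gradient of the toy loss 0.5 * ||W - A||_F^2 with respect to W."""
    return W - A

def low_rank_step(U, S, V, A, lr):
    """One fixed-rank update of the factors of W = U S V^T."""
    # Column basis: take a gradient step on K = U S, then re-orthonormalise.
    K = U @ S
    K = K - lr * grad_loss(K @ V.T, A) @ V
    U_new, _ = np.linalg.qr(K)
    # Row basis: take a gradient step on L = V S^T, then re-orthonormalise.
    L = V @ S.T
    L = L - lr * grad_loss(U @ L.T, A).T @ U
    V_new, _ = np.linalg.qr(L)
    # Coefficients: express S in the new bases, then take a gradient step on it.
    S_new = (U_new.T @ U) @ S @ (V.T @ V_new)
    S_new = S_new - lr * U_new.T @ grad_loss(U_new @ S_new @ V_new.T, A) @ V_new
    return U_new, S_new, V_new

def loss(U, S, V, A):
    return 0.5 * np.linalg.norm(U @ S @ V.T - A) ** 2

rng = np.random.default_rng(0)
m, n, r = 20, 15, 3
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # hypothetical rank-3 target
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
S = np.eye(r)

initial_loss = loss(U, S, V, A)
for _ in range(50):
    U, S, V = low_rank_step(U, S, V, A, lr=0.5)
final_loss = loss(U, S, V, A)
print(f"loss {initial_loss:.3f} -> {final_loss:.3e}")
```

Because U and V are re-orthonormalised at every step and S stays r x r, the product USVᵀ never leaves the set of matrices of rank at most r.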

The ordinary differential equation integrator they propose also allows the rank of the approximating matrix (i.e. the dimension of the manifold on which optimisation occurs) to be optimised during training. Pseudocode is presented in Algorithm 1; theoretical convergence guarantees and numerical results (on MNIST, CIFAR-10 and ImageNet1K) are given in Section 5.
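One simple way to let the rank change is to drop the singular values of S that fall below a tolerance. The sketch below does this with a relative tolerance `rel_tol`, which is an assumed parameter for illustration; the exact rule and tolerance used in the paper may differ.

```python
import numpy as np

def truncate_rank(U, S, V, rel_tol):
    """Shrink U S V^T to the smallest rank whose discarded singular values
    have combined norm at most rel_tol * ||S||_F."""
    P, sigma, Qt = np.linalg.svd(S)
    # tail[k] = norm of the singular values discarded when keeping the first k.
    tail = np.sqrt(np.cumsum(sigma[::-1] ** 2)[::-1])
    tail = np.append(tail, 0.0)  # keeping everything discards nothing
    threshold = rel_tol * np.linalg.norm(sigma)
    # First k whose discarded part is small enough; always keep at least rank 1.
    k = max(1, int(np.argmax(tail <= threshold)))
    # Rotate the bases so that the new S is diagonal with the kept singular values.
    return U @ P[:, :k], np.diag(sigma[:k]), V @ Qt[:k, :].T

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((10, 4)))
V, _ = np.linalg.qr(rng.standard_normal((8, 4)))
S = np.diag([5.0, 3.0, 1e-3, 1e-4])  # two dominant and two negligible directions

U2, S2, V2 = truncate_rank(U, S, V, rel_tol=0.01)
error = np.linalg.norm(U @ S @ V.T - U2 @ S2 @ V2.T)
print(S2.shape, error)
```

Here the two negligible directions are removed, so the rank falls from 4 to 2 at the cost of an approximation error well below the tolerance.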

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

Krishnateja Killamsetty, Guttu Sai Abhishek, Aakriti, Alexandre V. Evfimievski, Lucian Popa, Ganesh Ramakrishnan, Rishabh Iyer

The paper aims to reduce the significant computational cost of hyper-parameter tuning.

Typically, hyper-parameter optimisation is performed on randomly chosen subsets of the training data. The authors propose a gradient-based informative subset selection model that allows for much faster tuning. They combine this with standard hyper-parameter search in their proposed hyper-parameter tuning framework, AUTOMATA.

They empirically test their framework on various benchmark datasets: SST2 (text), glue-SST2 (text), CIFAR10 (image), CIFAR100 (image), and CONNECT-4 (tabular). The speed-up is considerable: 10 to 30 times relative to tuning on the full dataset when no scheduler is used, and two to three times when the ASHA scheduler is used. The performance loss is no more than 3% in both cases.

The AUTOMATA framework consists of three components: [1] a hyper-parameter search algorithm that determines which configurations need to be evaluated, [2] the proposed gradient-based subset selection algorithm (SSD) that trains and evaluates each configuration efficiently, and [3] a hyper-parameter scheduling algorithm, which provides early stopping by eliminating the poor configurations quickly.
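The way the three components fit together can be sketched on a toy problem. Everything below is a stand-in chosen for brevity: random search for [1], a uniform random subset in place of the gradient-based selection for [2], and plain synchronous successive halving (not ASHA) for [3], all applied to a made-up linear regression task where the learning rate is the only hyper-parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: linear regression trained by gradient descent.
X = rng.standard_normal((400, 5))
y = X @ np.arange(1.0, 6.0) + 0.1 * rng.standard_normal(400)
X_train, y_train, X_val, y_val = X[:300], y[:300], X[300:], y[300:]

def search(n_configs):
    """[1] Search algorithm: plain random search over learning rates."""
    return [{"lr": float(10 ** rng.uniform(-4, -1))} for _ in range(n_configs)]

def select_subset(fraction):
    """[2] Subset selection: a uniform random stand-in that returns
    sample indices and per-sample weights."""
    size = int(fraction * len(X_train))
    idx = rng.choice(len(X_train), size=size, replace=False)
    return idx, np.ones(size)

def train_and_score(config, idx, weights, epochs):
    """Train from scratch on the weighted subset, then report validation loss."""
    w = np.zeros(X_train.shape[1])
    Xs, ys = X_train[idx], y_train[idx]
    for _ in range(epochs):
        residual = Xs @ w - ys
        w -= config["lr"] * Xs.T @ (weights * residual) / weights.sum()
    return float(np.mean((X_val @ w - y_val) ** 2))

def tune(n_configs=8, fraction=0.1, epochs=20):
    """[3] Scheduler: keep the better half each round and double the budget."""
    configs = search(n_configs)
    idx, weights = select_subset(fraction)
    while len(configs) > 1:
        scores = [train_and_score(c, idx, weights, epochs) for c in configs]
        order = np.argsort(scores)
        configs = [configs[i] for i in order[: len(configs) // 2]]
        epochs *= 2
    return configs[0]

best = tune()
print(best)
```

The saving comes from two places: each configuration trains on only 10% of the data, and poor configurations are dropped before they receive a larger training budget.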

The novel SSD method (step [2]) assigns weights to individual data samples, which allows it to find, for a given choice of hyper-parameters, the most informative data subset on which to evaluate the model. It finds the optimal subset and weights through gradient descent on the difference between the training loss on the full dataset and on the weighted subset (Eq. 2), which it alternates with the usual training epochs for the network's parameters.
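The weighting step can be illustrated with a small sketch. It makes two assumptions that are not taken from the paper: the mismatch between full data and weighted subset is measured on gradients, and the subset itself is fixed in advance, so only the weights are learned. The per-sample gradient matrix `G` is random data for the example.

```python
import numpy as np

def fit_subset_weights(G, subset, steps=500):
    """Projected gradient descent for non-negative weights w such that the
    weighted subset gradient G[subset].T @ w approximates the full-data mean gradient."""
    target = G.mean(axis=0)
    Gs = G[subset]
    # Step size 1/L, where L is the squared largest singular value of Gs.
    lr = 1.0 / np.linalg.norm(Gs, ord=2) ** 2
    w = np.full(len(subset), 1.0 / len(subset))  # start from uniform weights
    for _ in range(steps):
        residual = Gs.T @ w - target
        # Gradient step on 0.5 * ||residual||^2, then clip to keep weights >= 0.
        w = np.maximum(w - lr * Gs @ residual, 0.0)
    return w

rng = np.random.default_rng(0)
G = rng.standard_normal((200, 10))             # hypothetical per-sample gradients
subset = rng.choice(200, size=20, replace=False)

uniform = np.full(20, 1.0 / 20)
w = fit_subset_weights(G, subset)

target = G.mean(axis=0)
err_uniform = np.linalg.norm(G[subset].T @ uniform - target)
err_fitted = np.linalg.norm(G[subset].T @ w - target)
print(f"mismatch with uniform weights {err_uniform:.3f}, with fitted weights {err_fitted:.3f}")
```

In the framework described above, this kind of weight fitting would be interleaved with ordinary training epochs, so the subset and weights track the model as it changes.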


G-Research at NeurIPS 2022

We work on a very mature problem at G-Research, predicting financial markets, which means we need to stay at the cutting edge of what we do. That's why events like NeurIPS, where we are a top-tier sponsor, are crucial for our business, as they bring together the best machine learning practitioners to present and discuss the latest research and innovation in ML. We encourage our quant researchers and machine learning engineers to attend leading conferences in person to further develop their skills and stay abreast of the latest technological developments from some of the brightest minds in the industry.

Additionally, a number of our talent acquisition team were on hand throughout the week to talk to attendees about what we do, including the various research and engineering roles we are currently hiring for, and we kept everyone fuelled with the help of our head barista.

As busy as we were inside NeurIPS as a headline sponsor, we also ran a number of events outside the conference hall during the week, not least the first-ever G-Research boat party, held on a classic paddle steamer on the Mississippi. What better way to bring together like-minded people in New Orleans? We were delighted that so many people wanted to come along. As well as providing a unique networking opportunity, this event also gave us the chance to give our guests a flavour of what life is like at G-Research.

We pride ourselves on cultivating an environment where smart people come together to challenge themselves, enjoy their work and achieve things as a team. And there's also plenty of opportunity for fun along the way. You know what they say about all work and no play. Want to learn more about G-Research or meet us at a future event? Visit our website to find out more.
