Sharpness-Aware Minimization Leads to Low-Rank Features
Maksym Andriushchenko, Dara Bahri, Hossein Mobahi, Nicolas Flammarion
In overparametrised neural networks, the sharpness of the minima found by training has been observed to correlate with the generalisation error of the model: flatter minima tend to generalise better. Sharpness-aware minimisation (SAM) is a recent algorithm that adds an explicit sharpness penalty to the optimisation objective by minimising the worst-case loss in a small neighbourhood of the current weights, and it has been shown to improve generalisation.
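To make this objective concrete, below is a minimal NumPy sketch of the standard single-ascent-step approximation used by SAM. The names sam_step, grad_fn, lr and rho, as well as the toy quadratic loss, are illustrative choices and not taken from the paper.

```python
# A minimal sketch of the SAM update rule in plain NumPy (the model and loss
# are abstracted into a generic grad_fn; lr and rho are illustrative values).
# SAM approximately minimises the worst-case loss in a small neighbourhood,
#     min_w  max_{||eps|| <= rho}  L(w + eps),
# using a single gradient-ascent step to find the perturbation eps.
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update: ascend to a nearby 'sharp' point, then descend from it."""
    g = grad_fn(w)                                   # gradient at the current weights
    eps = rho * g / (np.linalg.norm(g) + 1e-12)      # normalised ascent step
    g_sam = grad_fn(w + eps)                         # gradient at the perturbed weights
    return w - lr * g_sam                            # descent using the SAM gradient

# Toy usage on a quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
for _ in range(200):
    w = sam_step(w, grad_fn=lambda v: v)
print(w)  # approaches the minimum at the origin
```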
In this paper, the authors investigate the effect that SAM has on the features learned by the model. They demonstrate that, compared to networks trained with standard minimisation algorithms, SAM reduces the feature rank at different layers, measured as the number of principal components needed to capture 99% of the variance of the feature matrix. Low-rank features are useful in their own right, for instance to reduce the dimensionality of the representations and make downstream tasks that operate on them more efficient. In contrast, the authors find that directly enforcing a low feature rank during standard training does not improve generalisation. This suggests that low-rank features are a useful side effect of SAM rather than a full explanation of its generalisation benefits.
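As an illustration of this rank measure, the following sketch computes the number of principal components needed to explain 99% of the variance of a feature matrix. The function name feature_rank and the toy data are illustrative, not from the paper.

```python
# Sketch of the feature-rank measure described above: the smallest number of
# principal components whose cumulative explained variance reaches 99%.
# `features` is assumed to be an (examples x feature-dimensions) matrix.
import numpy as np

def feature_rank(features, threshold=0.99):
    centred = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    explained = np.cumsum(variances) / variances.sum()
    return int(np.searchsorted(explained, threshold) + 1)

# Toy usage: 64-dimensional features that actually lie in a 5-dimensional subspace.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 64))
print(feature_rank(features))  # recovers the dimensionality of the subspace (5)
```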
To further understand the mechanism behind this effect, the authors study a two-layer ReLU network. They show, both empirically and theoretically, that SAM decreases the pre-activation values within the network. This, in turn, reduces the number of non-zero activations and yields the observed low feature rank.
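A toy NumPy illustration of this mechanism, which makes no claim about the paper's actual experiments: in a randomly initialised two-layer ReLU network, pushing the pre-activations downwards (here via an explicit shift, a crude stand-in for the pressure SAM applies) deactivates units, and the rank of the post-activation features drops with them, since that rank is bounded by the number of units active on at least one input.

```python
# Toy illustration (not the paper's experiment): for features = relu(X @ W1 + b1),
# the feature rank is at most the number of hidden units that are active on at
# least one input. Pushing pre-activations down (here with an explicit shift)
# deactivates units, so the fraction of non-zero activations and the feature
# rank both drop.
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_hidden = 200, 20, 50
X = rng.normal(size=(n, d_in))
W1 = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)
b1 = rng.normal(size=d_hidden)

for shift in [0.0, 1.0, 2.0, 3.0]:           # larger shift -> smaller pre-activations
    pre = X @ W1 + b1 - shift
    feats = np.maximum(pre, 0.0)             # ReLU features
    active = (pre > 0).mean()                # fraction of non-zero activations
    rank = np.linalg.matrix_rank(feats)
    print(f"shift={shift:.1f}  active={active:.2f}  rank={rank}/{d_hidden}")
```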