
ICML 2024: Paper Review #2

24 September 2024
  • Quantitative Research

Machine Learning (ML) is a fast-evolving discipline, which means attending conferences and hearing about the very latest research are key to the ongoing development and success of our quantitative researchers and ML engineers.

In this paper review series, our ICML 2024 attendees reveal the research and papers they found most interesting.

Here, discover the perspectives of Danny, one of our machine learning engineers, as he discusses his most compelling findings from the conference.

Compute Better Spent: Replacing Dense Layers with Structured Matrices

Shikai Qiu, Andres Potapczynski, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson

In this paper, the authors seek more compute-efficient alternatives to dense linear layers, investigating structured options such as low-rank matrices, Monarch matrices and Kronecker products.

The authors claim that these approaches have largely failed in the past because hyperparameters were poorly chosen when the structured alternatives were used in place of dense linear layers.

To address this, they adapt the initialisation scheme derived from the maximal update parametrisation (μP) work to support these structured matrices, and use it to optimise a few simple hyperparameters, such as the learning rate. They show that, by doing this, they achieve better test performance per FLOP on a number of tasks.
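To make the idea concrete, here is a minimal sketch (an illustration, not the authors' code) of one of the structured alternatives studied: a dense layer replaced by a low-rank factorisation. The initialisation constants below are assumptions in the spirit of μP-style width scaling, not the paper's exact derived scheme.

```python
# Minimal sketch: a dense d_out x d_in layer replaced by a rank-r
# factorisation. The initialisation scales are illustrative assumptions
# in the spirit of muP, not the paper's exact derivation.
import torch
import torch.nn as nn


class LowRankLinear(nn.Module):
    """Computes y = U (V x), costing O(r * (d_in + d_out)) parameters
    and FLOPs instead of O(d_in * d_out) for a dense layer."""

    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.empty(d_out, rank))
        self.V = nn.Parameter(torch.empty(rank, d_in))
        # Scale each factor so entries of the product U @ V keep a
        # width-independent magnitude as d_in and rank grow.
        nn.init.normal_(self.U, std=rank ** -0.5)
        nn.init.normal_(self.V, std=d_in ** -0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.V.T @ self.U.T


# A rank-64 replacement for nn.Linear(1024, 1024): ~131k parameters
# instead of ~1.05M, after which hyperparameters like the learning rate
# are re-tuned, as the paper recommends.
layer = LowRankLinear(d_in=1024, d_out=1024, rank=64)
print(layer(torch.randn(8, 1024)).shape)  # torch.Size([8, 1024])
```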


Emergent Equivariance in Deep Ensembles

Jan E. Gerken and Pan Kessel

In this work, the authors use the theory of neural tangent kernels to prove that ensembles of infinitely wide deep neural networks are equivariant at all stages of training if trained with full data augmentation.

The authors then demonstrate this property empirically for ensembles of wide and deep neural networks applied to several image classification tasks. In these cases, the neural network ensemble becomes equivariant to relevant symmetries in the data even when the underlying members of the ensemble do not display this property.
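As a concrete illustration of the property being measured, the sketch below (a toy under stated assumptions, not the authors' code) computes how far an ensemble's averaged class probabilities deviate across 90-degree rotations of the input; for image classification with class-probability outputs, equivariance under these rotations amounts to invariance of the predictions. The `models` list is a hypothetical stand-in for independently trained ensemble members.

```python
# Toy check of the emergent-invariance property: for a predictor that is
# exactly invariant under the C4 group of 90-degree rotations, the "gap"
# below is zero. The paper's result is that it vanishes for infinitely
# wide ensembles trained with full data augmentation.
import torch


def ensemble_probs(models, x: torch.Tensor) -> torch.Tensor:
    """Average class probabilities over ensemble members."""
    with torch.no_grad():
        return torch.stack([m(x).softmax(dim=-1) for m in models]).mean(dim=0)


def c4_invariance_gap(models, x: torch.Tensor) -> float:
    """Largest per-class spread of the ensemble's probabilities across
    the four 90-degree rotations of the input image."""
    outputs = torch.stack([
        ensemble_probs(models, torch.rot90(x, k=k, dims=(-2, -1)))
        for k in range(4)
    ])
    return (outputs.max(dim=0).values - outputs.min(dim=0).values).max().item()


# Untrained toy members as placeholders for trained networks: the gap is
# large here, and per the paper it shrinks for wide ensembles trained
# with full augmentation.
models = [
    torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
    for _ in range(5)
]
print(c4_invariance_gap(models, torch.randn(1, 3, 32, 32)))
```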


Read more of our quantitative researchers' thoughts

ICML 2024: Paper Review #1

Discover the perspectives of Yousuf, one of our machine learning engineers, on the following papers:

  • Arrows of Time for Large Language Models
  • Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Read now
ICML 2024: Paper Review #3

Discover the perspectives of Jonathan, one of our software engineers, on the following papers:

  • A Universal Class of Sharpness-Aware Minimization Algorithms
  • Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Read now
ICML 2024: Paper Review #4

Discover the perspectives of Evgeni, one of our senior quantitative researchers, on the following papers:

  • Trained Random Forests Completely Reveal your Dataset
  • Test-of-time Award: DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Read now
ICML 2024: Paper Review #5

Discover the perspectives of Michael, one of our Scientific Directors, on the following papers:

  • Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
  • Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Read now
ICML 2024: Paper Review #6

Discover the perspectives of Fabian, one of our senior quantitative researchers, on the following papers:

  • I/O Complexity of Attention, or How Optimal is Flash Attention?
  • Simple Linear Attention Language Models Balance the Recall-Throughput Tradeoff
Read now
ICML 2024: Paper Review #7

Discover the perspectives of Ingmar, one of our quantitative researchers, on the following papers:

  • Offline Actor-Critic Reinforcement Learning Scales to Large Models
  • Information-Directed Pessimism for Offline Reinforcement Learning
Read now
ICML 2024: Paper Review #8

Discover the perspectives of Oliver, one of our quantitative researchers, on the following papers:

  • Better & Faster Large Language Models via Multi-token Prediction
  • Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Read now
