The best of ICML 2022 – Paper reviews (part 2)

8 October 2022

Quantitative research

This article is one of a series of paper reviews from our researchers and machine learning engineers.

Last month, G-Research were Diamond sponsors at ICML, hosted this year in Baltimore, US.

As well as having a stand and team in attendance, a number of our quantitative researchers and machine learning engineers attended as part of their ongoing learning and development.

We asked our quants and machine learning practitioners to write about some of the papers and research that they found most interesting.

Here, Johann, Quantitative Research Manager at G-Research, discusses three papers.

Unaligned Supervision for Automatic Music Transcription in-the-Wild

Ben Maman, Amit H. Bermano

The fascinating story of the 14-year-old Mozart, who transcribed the score of Allegri's closely guarded Miserere – only ever sung at Easter within the secretive walls of St. Peter's in Rome – after hearing it just once, shows that Automatic Music Transcription (AMT), the holy grail of Music Information Retrieval, is at least a 250-year-old dream.

AMT is challenging for many reasons, such as notes sharing frequencies, polyphony, echo effects and the complexity of multi-instrument performances. A bottleneck for data-hungry methods, such as DNNs, is the lack of accurately annotated datasets, owing to the cost of manual annotation.

This paper introduces NoteEM, a method for simultaneously training a transcriber and aligning scores to their corresponding performances to generate annotated data. The expectation maximisation approach works in three stages:

First, an existing transcription architecture is bootstrapped on synthetic data generated from digitised performances of musical pieces. In the E-step, the resulting network predicts the transcription of unlabelled recordings, and the unaligned scores are warped to fit these likelihood predictions and then used as labels. In the M-step, the architecture is trained on the labels just obtained. The E-step and M-step are then repeated.
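The loop described above can be sketched on toy data. This is a minimal illustration and not the authors' implementation: the one-dimensional frame features, the Gaussian centroid "transcriber" and all function names (`frame_log_probs`, `align_score`, `note_em_toy`) are assumptions made here, and a simple monotonic dynamic-programming alignment stands in for the paper's warping step.

```python
import numpy as np

def frame_log_probs(frames, centroids, sigma=1.0):
    """Toy 'transcriber': Gaussian log-likelihood of each frame under each pitch."""
    return -((frames[:, None] - centroids[None, :]) ** 2) / (2 * sigma ** 2)

def align_score(log_probs, score):
    """E-step: warp the unaligned score onto the frames.

    Finds the monotonic assignment of frames to score notes (notes stay in
    order, each note gets at least one frame) with the highest total likelihood.
    """
    n_frames, n_notes = log_probs.shape[0], len(score)
    best = np.full((n_frames, n_notes), -np.inf)
    back = np.zeros((n_frames, n_notes), dtype=int)
    best[0, 0] = log_probs[0, score[0]]
    for t in range(1, n_frames):
        for n in range(n_notes):
            stay = best[t - 1, n]
            move = best[t - 1, n - 1] if n > 0 else -np.inf
            back[t, n] = n - 1 if move > stay else n
            best[t, n] = max(stay, move) + log_probs[t, score[n]]
    # Backtrack from the last frame of the last note.
    path = np.empty(n_frames, dtype=int)
    n = n_notes - 1
    for t in range(n_frames - 1, -1, -1):
        path[t] = n
        n = back[t, n]
    return np.asarray(score)[path]

def fit_centroids(frames, labels, old_centroids):
    """M-step: retrain the toy transcriber on the aligned labels."""
    new = old_centroids.copy()
    for k in range(len(old_centroids)):
        if np.any(labels == k):
            new[k] = frames[labels == k].mean()
    return new

def note_em_toy(frames, score, init_centroids, n_iters=5):
    """Alternate alignment (E-step) and retraining (M-step)."""
    centroids = np.asarray(init_centroids, dtype=float)
    for _ in range(n_iters):
        labels = align_score(frame_log_probs(frames, centroids), score)
        centroids = fit_centroids(frames, labels, centroids)
    return centroids, labels

# Toy "recording": three pitches, a five-note score with unknown note durations.
rng = np.random.default_rng(0)
true_centres = np.array([0.0, 5.0, 10.0])
score = [0, 2, 1, 2, 0]
true_labels = np.repeat(score, [6, 4, 7, 3, 5])
frames = true_centres[true_labels] + 0.3 * rng.normal(size=true_labels.size)

# The initial centroids play the role of the model bootstrapped on synthetic data.
centroids, labels = note_em_toy(frames, score, init_centroids=[1.5, 4.0, 8.0])
print("frame accuracy:", (labels == true_labels).mean())
print("refined centroids:", centroids.round(2))
```

The point of the sketch is the division of labour: the score supplies which notes occur and in what order, the current model supplies when they most plausibly occur, and the aligned result becomes the training signal for the next round.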

The resulting method achieves state-of-the-art performance for in-the-wild transcription on a wide variety of instruments and genres.

Dataset Condensation via Efficient Synthetic-Data Parameterization

Jang-Hyun Kim, Jinuk Kim, Seong Joon Oh, Sangdoo Yun, Hwanjun Song, Joonhyun Jeong, Jung-Woo Ha, Hyun Oh Song

The great success of machine learning in recent years has come at the price of ever-growing computational and memory costs for the vast amounts of data needed.

The field of data condensation tries to remove the dependency on massive data by synthesizing the relevant information into more compact datasets. The challenge is to synthesize data on which a model trains to a performance comparable with training on the original data.

One approach is gradient matching: given a model to be trained, the synthetic data is constructed so that the gradients it produces during training match those produced by the original dataset. In the paper, the authors add an intermediate step to the construction of the synthetic dataset: the stored synthetic data is first expanded by mapping each of its elements to multiple elements of a larger set, and gradients are then matched against the original dataset via this augmented synthetic set.
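To make the idea concrete, here is a small sketch of gradient matching with an expansion step, under several simplifying assumptions that are ours rather than the paper's: the model is linear with a squared-error loss, the expansion (`multi_formation`, a hypothetical helper) splits each stored vector into chunks and upsamples them by repetition, and the stored data is optimised with finite-difference gradients instead of backpropagation.

```python
import numpy as np

def multi_formation(condensed, factor):
    """Expand each stored vector into `factor` synthetic examples:
    split it into chunks, then upsample each chunk back to full length."""
    m, d = condensed.shape
    chunks = condensed.reshape(m * factor, d // factor)
    return np.repeat(chunks, factor, axis=1)

def loss_gradient(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def matching_loss(condensed, cond_labels, X_real, y_real, probes, factor):
    """Squared distance between real and synthetic gradients, summed over
    a few probe parameter vectors."""
    X_syn = multi_formation(condensed, factor)
    y_syn = np.repeat(cond_labels, factor)
    return sum(
        np.sum((loss_gradient(w, X_real, y_real) - loss_gradient(w, X_syn, y_syn)) ** 2)
        for w in probes
    )

def numerical_grad(f, x, eps=1e-5):
    """Central finite differences; adequate for a tiny illustrative problem."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

def condense(X_real, y_real, cond_init, cond_labels, probes, factor,
             n_steps=200, lr=0.1):
    """Gradient descent on the stored data; a step is kept only if it
    lowers the matching loss, otherwise the step size is halved."""
    f = lambda c: matching_loss(c, cond_labels, X_real, y_real, probes, factor)
    condensed = cond_init.copy()
    for _ in range(n_steps):
        candidate = condensed - lr * numerical_grad(f, condensed)
        if f(candidate) < f(condensed):
            condensed = candidate
        else:
            lr *= 0.5
    return condensed

# Toy data: 200 real examples in two classes, condensed to one stored vector per class.
rng = np.random.default_rng(0)
d, factor = 8, 2
pattern = np.linspace(-1.0, 1.0, d)
y_real = rng.choice([-1.0, 1.0], size=200)
X_real = y_real[:, None] * pattern[None, :] + 0.5 * rng.normal(size=(200, d))
probes = [rng.normal(size=d) for _ in range(5)]
cond_labels = np.array([-1.0, 1.0])
cond_init = 0.1 * rng.normal(size=(2, d))

initial_loss = matching_loss(cond_init, cond_labels, X_real, y_real, probes, factor)
condensed = condense(X_real, y_real, cond_init, cond_labels, probes, factor)
final_loss = matching_loss(condensed, cond_labels, X_real, y_real, probes, factor)
print("matching loss before:", initial_loss, "after:", final_loss)
```

Here two stored vectors yield four training examples, so the same storage budget feeds the model more synthetic data; the matching itself is always evaluated on the expanded set, never on the stored vectors directly.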

FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting

Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin

As we know, long-term time series prediction is difficult because accuracy tends to degrade quickly as the horizon increases. Transformers have improved long-term forecasting, but they are computationally expensive to train and struggle with the global trends of a time series: predictions for each time step are made individually and independently, so the forecast is unlikely to preserve global properties and statistics.

This paper suggests a seasonal-trend decomposition, based on Fourier analysis, in which the Transformer methods are applied in the frequency domain. For efficiency, the method relies on the assumption that time series tend to have a sparse representation in frequency space. The seasonal-trend component captures the global profile while the Transformer captures more detailed structure. The resulting model, baptised Frequency Enhanced Decomposed Transformer (FEDformer), looks to be both more effective and more efficient.
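The two ingredients named above, a seasonal-trend split and a sparse frequency representation, can be illustrated without any Transformer at all. The sketch below is only that: the moving-average trend, the helper names and the choice to keep the largest-magnitude Fourier modes are assumptions made for illustration, not necessarily the paper's own decomposition or mode-selection rule.

```python
import numpy as np

def moving_average_trend(x, window):
    """Trend estimate: moving average with edge padding so the length is kept."""
    pad_left = (window - 1) // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(x, (pad_left, pad_right), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def sparse_frequency_approx(x, n_modes):
    """Keep only the n_modes largest Fourier coefficients and transform back."""
    spectrum = np.fft.rfft(x)
    keep = np.argsort(np.abs(spectrum))[-n_modes:]
    sparse = np.zeros_like(spectrum)
    sparse[keep] = spectrum[keep]
    return np.fft.irfft(sparse, n=len(x))

# Toy series: linear trend + two seasonal components + a little noise.
rng = np.random.default_rng(0)
t = np.arange(256)
pure_seasonal = np.sin(2 * np.pi * t / 32) + 0.5 * np.sin(2 * np.pi * t / 8)
series = 0.02 * t + pure_seasonal + 0.05 * rng.normal(size=t.size)

trend = moving_average_trend(series, window=32)
seasonal = series - trend

# 4 of the 129 available frequency modes stand in for the seasonal part.
approx = sparse_frequency_approx(seasonal, n_modes=4)
rel_error = np.linalg.norm(seasonal - approx) / np.linalg.norm(seasonal)
print("relative error with 4 modes:", rel_error)
```

If the sparsity assumption holds, a handful of modes carries most of the seasonal signal, which is what makes working in the frequency domain cheaper than attending over every time step.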
