Prithvish Nembhani
3 min read · Apr 14, 2022


Reproduction of Deep learning Paper: “Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches”

Yet another year, and deep learning remains a booming area of research; thanks to the sheer variety of applications, the field keeps growing. But are we really making progress? The authors of the paper raise exactly this concern and try to answer it for researchers in the field. To do so, they select 18 algorithms from papers published at top venues in recent years and compare them against conceptually simple baseline methods. They conclude that only 7 of the 18 algorithms could be reproduced with reasonable effort, and that 6 of those 7 could be outperformed by the simple baselines. The paper gives a brief overview of the results and presents the details of the experiments in an appendix, available together with the source code.

For the comparison against the complex deep learning algorithms, the authors use the following baseline algorithms:

Non-personalized: Top Popular
Collaborative Filtering: ItemKNN, UserKNN, P3alpha, RP3beta
Hybrid: ItemKNN CF + CBF
Machine Learning: SLIM

The list of baseline algorithms can also be found in the supporting document available as an appendix on GitHub. In the code, the baselines are organized as separate folders inside the main folder, and their sub-folders contain the different variations against which the complex algorithms are compared. The authors mention that the code used in the experiments is the same code provided by the authors of the original papers.

The authors also provide statistics on which of the algorithms outperformed the simple baselines.

The complex deep learning algorithms are grouped by the conference at which they were presented, each under its own subsection.

The code is mostly written in Python and built on TensorFlow, which makes it easy to reproduce. The installation steps are described well on GitHub: create a virtual environment, install all the libraries and packages from the requirements list, and run the script for the particular experiment.

There is a separate script for every experiment, prefixed with run_, in the main folder, and the experiments are organized per conference. Once a script is executed, the dataset is downloaded automatically and the training/test split is performed automatically according to the split described in the original papers. After every experiment the metrics are recorded and stored in the results folder, with intermediate results written every 10 epochs. The results of every experiment are kept in separate folders based on the conference name; the sketch below illustrates this flow.
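To make the flow concrete, here is a minimal, self-contained sketch of what such a run_ script could look like. Every helper in it (download_dataset, split_train_test, train_one_epoch, evaluate) is a placeholder chosen only to mirror the steps described above; this is not the repository's actual API.

```python
import json
import os

# Placeholder stand-ins for the repository's data and model code;
# names and behaviour here are purely illustrative.
def download_dataset(name):
    return list(range(100))

def split_train_test(data, test_fraction=0.2):
    cut = int(len(data) * (1 - test_fraction))
    return data[:cut], data[cut:]

def train_one_epoch(model_state, train):
    return model_state + 1  # pretend one epoch of training happened

def evaluate(model_state, test):
    return {"HR@10": 0.0, "NDCG@10": 0.0}  # dummy metric values

def run_experiment(dataset_name, n_epochs=50, eval_every=10,
                   results_folder="results/EXAMPLE_CONF"):
    """Sketch of a run_ script: fetch the data, split it, train the model,
    and write metrics to the results folder every `eval_every` epochs."""
    os.makedirs(results_folder, exist_ok=True)
    data = download_dataset(dataset_name)
    train, test = split_train_test(data)

    model_state = 0
    for epoch in range(1, n_epochs + 1):
        model_state = train_one_epoch(model_state, train)
        if epoch % eval_every == 0:
            metrics = evaluate(model_state, test)
            out_path = os.path.join(results_folder,
                                    f"{dataset_name}_epoch_{epoch}.json")
            with open(out_path, "w") as f:
                json.dump(metrics, f, indent=2)

if __name__ == "__main__":
    run_experiment("example_dataset")
```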

The scripts take the following toggle parameters for the experiments:

‘-b’ or ‘--baseline_tune’: run the baseline hyperparameter search
‘-a’ or ‘--DL_article_default’: train the deep learning algorithm with the original hyperparameters
‘-p’ or ‘--print_results’: generate the LaTeX tables for this experiment

which can be set to TRUE or FALSE when running the experiment, as illustrated in the sketch below.
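As an illustration of how such TRUE/FALSE toggles are typically wired up in Python, here is a minimal argparse sketch. The flag names follow the list above, but the parsing details (the str2bool helper, the defaults) are assumptions made for illustration, not necessarily how the repository implements them.

```python
import argparse

def str2bool(value):
    # Interpret the TRUE/FALSE strings passed on the command line.
    return str(value).lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="Run one reproduction experiment")
parser.add_argument("-b", "--baseline_tune", type=str2bool, default=False,
                    help="Run the baseline hyperparameter search")
parser.add_argument("-a", "--DL_article_default", type=str2bool, default=False,
                    help="Train the deep learning algorithm with the original hyperparameters")
parser.add_argument("-p", "--print_results", type=str2bool, default=False,
                    help="Generate the LaTeX tables for this experiment")

args = parser.parse_args()
print(args.baseline_tune, args.DL_article_default, args.print_results)
```

With this sketch, an invocation along the lines of python run_example.py -b True -a True -p False would tune the baselines and train the deep learning model without printing the result tables.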

Once started, an experiment takes a few hours to finish depending on the compute facility, after which the results can be compared from the results folder. The hyperparameters, metrics, and test and validation results are saved as JSON files inside a zip archive, as well as in a txt file.
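Here is a small sketch of how one might inspect such a zip of JSON results using only the standard library. The archive path is hypothetical; point it at whatever file a given experiment actually writes.

```python
import json
import zipfile

# Hypothetical path; replace it with the archive written by an experiment.
archive_path = "results/EXAMPLE_CONF/example_experiment.zip"

with zipfile.ZipFile(archive_path) as archive:
    for member in archive.namelist():
        if member.endswith(".json"):
            with archive.open(member) as f:
                results = json.load(f)
            print(member, results)  # e.g. hyperparameters or metric dictionaries
```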

The results of every experiment can also be compared with those reported in the appendix.

University Name: Delft University of Technology (TU Delft)

Course: Deep learning

Authors of the article: Alexandra Dobrița, Kevin Shidqi Prakoso, Prithvish Vijaykumar Nembhani
