Machine Learning 2 Seminar (Summer Term 2021)
Overview
- Format: Virtual seminar with remote supervision and live-streamed student presentations (0/2/0)
- Teaching: Mark Schöne, Holger Heidrich, Bjoern Andres
- Creditable toward the modules CMS-CE-EL1, CMS-CE-EL2, CMS-CLS-ELG, CMS-VC-ELV1, CMS-VC-ELV2, INF-04-FG-IS, INF-B-520, INF-B-540, INF-BAS2, INF-BAS7, INF-LE-MA, INF-VERT2, INF-VERT7, MATH-MA-INFGDV
- Enrolment - Students enrolled in the study program Computational Modeling and Simulation (CMS) must additionally register via SELMA.
- Talk scheduling - Student presentations will be clustered in two virtual conferences:
- From May 31st through June 2nd
- From July 19th through July 22nd
- Paper selection is now open for all registered students who have scheduled their talk.
- Supervision will be exclusively by email; there will be neither live meetings nor a kickoff meeting.
Talks
| Date | Time | Stream | Topic |
|---|---|---|---|
| June 1st | 09:40 - 10:20 | (cancelled) | |
| | 11:00 - 11:40 | Join live | Deep Double Descent: Where Bigger Models and More Data Hurt |
| | 11:40 - 12:20 | (cancelled) | |
| June 2nd | 09:00 - 09:40 | Join live | Consistent k-Clustering for General Metrics |
| | 11:40 - 12:20 | Join live | On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization |
| | 12:20 - 13:00 | Join live | Optimizing Rank-Based Metrics With Blackbox Differentiation |
| | 15:20 - 16:00 | Join live | |
| | 16:00 - 16:40 | Join live | |
| | 16:40 - 17:20 | Join live | Towards Understanding the Invertibility of Convolutional Neural Networks |
| | 17:20 - 18:00 | Join live | The Linear Ordering Problem, Chapter 4 (Branch-and-Bound) |
| July 19th | 09:00 - 09:40 | Join live | Deep Learning using Linear Support Vector Machines |
| | 10:20 - 11:00 | Join live | Implicit Regularization in Deep Matrix Factorization |
| | 11:00 - 11:40 | Join live | Invertible Residual Networks |
| | 15:20 - 16:00 | Join live | The Linear Ordering Problem, Chapter 5 (Branch-and-Cut) |
| | 16:40 - 17:20 | Join live | Exponential expressivity in deep neural networks through transient chaos |
| July 20th | 09:40 - 10:20 | Join live | Sets Clustering |
| | 10:20 - 11:00 | Join live | Invertible Convolutional Networks |
| | 11:00 - 11:40 | Join live | Totally Deep Support Vector Machines |
| | 15:20 - 16:00 | Join live | The Linear Ordering Polytope |
| | 16:00 - 16:40 | Join live | A Better k-means++ Algorithm via Local Search |
| | 16:40 - 17:20 | Join live | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks |
| | 17:20 - 18:00 | Join live | |
| July 21st | 09:40 - 10:20 | Join live | Graph convolutional networks: Geometric deep learning on graphs and manifolds using mixture model CNNs |
| | 10:20 - 11:00 | Join live | Attention is All you Need |
| | 11:00 - 11:40 | Join live | Structured Prediction with Partial Labelling through the Infimum Loss |
| | 11:40 - 12:20 | Join live | Not All Samples Are Created Equal: Deep Learning with Importance Sampling |
| | 12:20 - 13:00 | Join live | Semi-Supervised Classification with Graph Convolutional Networks |
| July 22nd | 09:00 - 09:40 | Join live | t-SNE 1: Stochastic neighbor embedding |
| | 09:40 - 10:20 | Join live | t-SNE 2 |
| | 10:20 - 11:00 | Join live | |
Contents
In this seminar, participating students will read, understand, prepare and present the contents of a research article or book chapter on a topic from the field of machine learning. The article or book chapter is chosen by the student from the list below, or suggested by the student for approval at the beginning of the term. The preparation includes relevant foundational and related work. By attending at least 10 presentations by their peers, students will gain an overview of diverse topics in the field of machine learning.
Prerequisites
Prerequisites for taking this course are a strong background in mathematics (especially linear algebra and analysis) and theoretical computer science, as well as a basic knowledge of machine learning comparable to the contents of the course Machine Learning 1. For some of the suggested articles and book chapters, additional knowledge from the field of mathematical optimization is required.
Requirements
Requirements for passing this course are:
- A 30-minute oral presentation by the student, streamed live to their peers and the teachers → Scheduling
- A written report of at least six pages to be submitted by the student before their presentation → Template with instructions
- Active participation in at least 10 presentations by other students
- Additional forms of examination if required by the module toward which the course is to be credited
Supervision
In their preparation, participating students are supervised remotely by email. They are strongly encouraged to send a brief progress report every Friday.
Suggested Research Articles and Book Chapters
Unsupervised Learning
- Vivien Cabannes, Alessandro Rudi and Francis R. Bach. Structured Prediction with Partial Labelling through the Infimum Loss. ICML 2020
- Silvio Lattanzi, Thomas Lavastida, Benjamin Moseley and Sergei Vassilvitskii. Online Scheduling via Learned Weights. SODA 2020
Structured Learning (Graphical Models)
- Chirag Pabbaraju, Po-Wei Wang, J. Zico Kolter. Efficient semidefinite-programming-based inference for binary and multi-class MRFs. NeurIPS 2020
- Alexander Shekhovtsov, Paul Swoboda, Bogdan Savchynskyy. Maximum Persistency via Iterative Relaxed Inference in Graphical Models. IEEE Trans. Pattern Anal. Mach. Intell. 40(7): 1668-1682 (2018)
- Paul Swoboda, Alexander Shekhovtsov, Jörg Hendrik Kappes, Christoph Schnörr, Bogdan Savchynskyy. Partial Optimality by Pruning for MAP-Inference with General Graphical Models. IEEE Trans. Pattern Anal. Mach. Intell. 38(7): 1370-1382 (2016)
- Stefan Haller, Paul Swoboda, Bogdan Savchynskyy. Exact MAP-Inference by Confining Combinatorial Search with LP Relaxation. AAAI 2018
- Siddharth Tourani, Alexander Shekhovtsov, Carsten Rother, Bogdan Savchynskyy. Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization. AISTATS 2020
- Paul Swoboda, Jan Kuske, Bogdan Savchynskyy. A Dual Ascent Framework for Lagrangean Decomposition of Combinatorial Problems. CVPR 2017
- Lena Gorelick, Yuri Boykov, Olga Veksler, Ismail Ben Ayed, Andrew Delong. Local Submodularization for Binary Pairwise Energies. IEEE Trans. Pattern Anal. Mach. Intell. 39(10): 1985-1999 (2017)
Clustering
- Hendrik Fichtenberger, Silvio Lattanzi, Ashkan Norouzi-Fard and Ola Svensson. Consistent k-Clustering for General Metrics. SODA 2021
- Jafar Jafarov, Sanchit Kalhan, Konstantin Makarychev, Yury Makarychev. Correlation Clustering with Asymmetric Classification Errors. ICML 2020
- Ibrahim Jubran, Murad Tukan, Alaa Maalouf and Dan Feldman. Sets Clustering. ICML 2020
- Silvio Lattanzi and Christian Sohler. A Better k-means++ Algorithm via Local Search. ICML 2019
- Jörg Hendrik Kappes, Paul Swoboda, Bogdan Savchynskyy, Tamir Hazan, Christoph Schnörr. Multicuts and Perturb & MAP for Probabilistic Graph Clustering. Journal of Mathematical Imaging and Vision 56(2):221-237 (2016)
- Dan Feldman, Michael Langberg. A unified framework for approximating and clustering data. STOC 2011
- Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation Clustering in General Weighted Graphs. Theoretical Computer Science 361(2-3):172-187 (2006)
- Nikhil Bansal, Avrim Blum and Shuchi Chawla. Correlation Clustering. Machine Learning 56:89-113 (2004)
Ordering
- Rafael Martí and Gerhard Reinelt. The Linear Ordering Problem. Springer 2011.
  - Chapters 2-3 (Heuristics and Meta-Heuristics)
  - Chapter 4 (Branch-and-Bound)
  - Chapter 5 (Branch-and-Cut)
  - Chapter 6 (Linear Ordering Polytope)
- Christoph Buchheim, Angelika Wiegele and Lanbo Zheng. Exact Algorithms for the Quadratic Linear Ordering Problem. INFORMS J. Comput. 22(1):168-177 (2010)
Embedding
- George C. Linderman and Stefan Steinerberger. Clustering with t-SNE, provably. SIAM Journal on Mathematics of Data Science 2019
- Geoffrey Hinton and Sam Roweis. Stochastic Neighbor Embedding. NIPS 2002
Deep Learning (Theory)
- Marin Vlastelica Pogančić, Anselm Paulus, Vít Musil, Georg Martius, Michal Rolínek. Differentiation of Blackbox Combinatorial Solvers. ICLR 2020
- Wei Hu, Lechao Xiao and Jeffrey Pennington. Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. ICLR 2020
- Ben Adlam and Jeffrey Pennington. The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. ICML 2020
- Sanjeev Arora, Nadav Cohen, Wei Hu and Yuping Luo. Implicit Regularization in Deep Matrix Factorization. NeurIPS 2019
- Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud and Joern-Henrik Jacobsen. Invertible Residual Networks. ICML 2019
- Sanjeev Arora, Nadav Cohen and Elad Hazan. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization. ICML 2018
- Samuel S. Schoenholz, Justin Gilmer, Surya Ganguli and Jascha Sohl-Dickstein. Deep Information Propagation. ICLR 2017
- Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang and Honglak Lee. Towards Understanding the Invertibility of Convolutional Neural Networks. IJCAI 2017
- Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein and Surya Ganguli. Exponential expressivity in deep neural networks through transient chaos. NIPS 2016
- Yichuan Tang. Deep Learning using Linear Support Vector Machines. ICML 2013 Workshop on Challenges in Representation Learning
Deep Learning (Applied)
- Marin Vlastelica, Michal Rolínek and Georg Martius. Neuro-algorithmic Policies enable Fast Combinatorial Generalization. arXiv 2021
- Michal Rolínek, Vít Musil, Anselm Paulus, Marin Vlastelica, Claudio Michaelis and Georg Martius. Optimizing Rank-Based Metrics With Blackbox Differentiation. CVPR 2020
- Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak and Ilya Sutskever. Deep Double Descent: Where Bigger Models and More Data Hurt. ICLR 2020
- Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan and Alan Edelman. Universal Differential Equations for Scientific Machine Learning. arXiv 2020
- Jonathan Frankle and Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. ICLR 2019
- Emma Strubell, Ananya Ganesh and Andrew McCallum. Energy and Policy Considerations for Deep Learning in NLP. ACL 2019
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention is All you Need. NIPS 2017
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht and Oriol Vinyals. Understanding deep learning requires rethinking generalization. ICLR 2017
- Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda and Michael M. Bronstein. Geometric deep learning on graphs and manifolds using mixture model CNNs. CVPR 2017