Machine Learning 2 Seminar (Summer Term 2021)
Overview
- Format: Virtual seminar with remote supervision and live-streamed student presentations (0/2/0)
- Teaching: Mark Schöne, Holger Heidrich, Bjoern Andres
- Creditable toward the modules CMS-CE-EL1, CMS-CE-EL2, CMS-CLS-ELG, CMS-VC-ELV1, CMS-VC-ELV2, INF-04-FG-IS, INF-B-520, INF-B-540, INF-BAS2, INF-BAS7, INF-LE-MA, INF-VERT2, INF-VERT7, MATH-MA-INFGDV
- Enrolment - Students enrolled in the study program Computational Modeling and Simulation (CMS) must additionally register via SELMA.
- Talk scheduling - Student presentations will be clustered in two virtual conferences:
- From May 31st through June 2nd
- From July 19th through July 22nd
- Paper selection is now open for all registered students who have scheduled their talk.
- Supervision will be exclusively by email; there will be neither live meetings nor a kickoff meeting.
Talks
| Date | Time | Stream | Topic |
|---|---|---|---|
| June 1st | 09:40 - 10:20 | (cancelled) | |
| | 11:00 - 11:40 | Join live | Deep Double Descent: Where Bigger Models and More Data Hurt |
| | 11:40 - 12:20 | (cancelled) | |
| June 2nd | 09:00 - 09:40 | Join live | Consistent k-Clustering for General Metrics |
| | 11:40 - 12:20 | Join live | On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization |
| | 12:20 - 13:00 | Join live | Optimizing Rank-Based Metrics With Blackbox Differentiation |
| | 15:20 - 16:00 | Join live | |
| | 16:00 - 16:40 | Join live | |
| | 16:40 - 17:20 | Join live | Towards Understanding the Invertibility of Convolutional Neural Networks |
| | 17:20 - 18:00 | Join live | The Linear Ordering Problem, Chapter 4 (Branch-and-Bound) |
| July 19th | 09:00 - 09:40 | Join live | Deep Learning using Linear Support Vector Machines |
| | 10:20 - 11:00 | Join live | Implicit Regularization in Deep Matrix Factorization |
| | 11:00 - 11:40 | Join live | Invertible Residual Networks |
| | 15:20 - 16:00 | Join live | The Linear Ordering Problem, Chapter 5 (Branch-and-Cut) |
| | 16:40 - 17:20 | Join live | Exponential expressivity in deep neural networks through transient chaos |
| July 20th | 09:40 - 10:20 | Join live | Sets Clustering |
| | 10:20 - 11:00 | Join live | Invertible Convolutional Networks |
| | 11:00 - 11:40 | Join live | Totally Deep Support Vector Machines |
| | 15:20 - 16:00 | Join live | The Linear Ordering Polytope |
| | 16:00 - 16:40 | Join live | A Better k-means++ Algorithm via Local Search |
| | 16:40 - 17:20 | Join live | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks |
| | 17:20 - 18:00 | Join live | |
| July 21st | 09:40 - 10:20 | Join live | Graph convolutional networks: Geometric deep learning on graphs and manifolds using mixture model CNNs |
| | 10:20 - 11:00 | Join live | Attention is All you Need |
| | 11:00 - 11:40 | Join live | Structured Prediction with Partial Labelling through the Infimum Loss |
| | 11:40 - 12:20 | Join live | Not All Samples Are Created Equal: Deep Learning with Importance Sampling |
| | 12:20 - 13:00 | Join live | Semi-Supervised Classification with Graph Convolutional Networks |
| July 22nd | 09:00 - 09:40 | Join live | t-SNE 1: Stochastic neighbor embedding |
| | 09:40 - 10:20 | Join live | t-SNE 2 |
| | 10:20 - 11:00 | Join live | |
Contents
In this seminar, participating students will read, understand, prepare and present the contents of a research article or book chapter on a topic from the field of machine learning. The article or book chapter is chosen by the student from the list below, or suggested by the student for approval at the beginning of the term. The preparation includes relevant foundational and related work. By attending at least 10 presentations by their peers, students will gain an overview of diverse topics in the field of machine learning.
Prerequisites
Prerequisites for taking this course are a strong background in mathematics (especially linear algebra and analysis) and theoretical computer science, as well as a basic knowledge of machine learning comparable to the contents of the course Machine Learning 1. For some of the suggested articles and book chapters, additional knowledge from the field of mathematical optimization is required.
Requirements
Requirements for passing this course are:
- A 30-minute oral presentation by the student, streamed live to their peers and the teachers → Scheduling
- A written report of at least six pages to be submitted by the student before their presentation → Template with instructions
- Active participation in at least 10 presentations by other students
- Additional forms of examination if required by the module toward which the course is to be credited
Supervision
In their preparation, participating students are supervised remotely by email. They are strongly encouraged to send a brief progress report every Friday.
Suggested Research Articles and Book Chapters
Unsupervised Learning
- Vivien Cabannes, Alessandro Rudi and Francis R. Bach. Structured Prediction with Partial Labelling through the Infimum Loss. ICML 2020
- Silvio Lattanzi, Thomas Lavastida, Benjamin Moseley and Sergei Vassilvitskii. Online Scheduling via Learned Weights. SODA 2020
Structured Learning (Graphical Models)
- Chirag Pabbaraju, Po-Wei Wang, J. Zico Kolter. Efficient semidefinite-programming-based inference for binary and multi-class MRFs. NeurIPS 2020
- Alexander Shekhovtsov, Paul Swoboda, Bogdan Savchynskyy. Maximum Persistency via Iterative Relaxed Inference in Graphical Models. IEEE Trans. Pattern Anal. Mach. Intell. 40(7): 1668-1682 (2018)
- Paul Swoboda, Alexander Shekhovtsov, Jörg Hendrik Kappes, Christoph Schnörr, Bogdan Savchynskyy. Partial Optimality by Pruning for MAP-Inference with General Graphical Models. IEEE Trans. Pattern Anal. Mach. Intell. 38(7): 1370-1382 (2016)
- Stefan Haller, Paul Swoboda, Bogdan Savchynskyy. Exact MAP-Inference by Confining Combinatorial Search with LP Relaxation. AAAI 2018
- Siddharth Tourani, Alexander Shekhovtsov, Carsten Rother, Bogdan Savchynskyy. Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization. AISTATS 2020
- Paul Swoboda, Jan Kuske, Bogdan Savchynskyy. A Dual Ascent Framework for Lagrangean Decomposition of Combinatorial Problems. CVPR 2017
- Lena Gorelick, Yuri Boykov, Olga Veksler, Ismail Ben Ayed, Andrew Delong. Local Submodularization for Binary Pairwise Energies. IEEE Trans. Pattern Anal. Mach. Intell. 39(10): 1985-1999 (2017)
Clustering
- Hendrik Fichtenberger, Silvio Lattanzi, Ashkan Norouzi-Fard and Ola Svensson. Consistent k-Clustering for General Metrics. SODA 2021
- Jafar Jafarov, Sanchit Kalhan, Konstantin Makarychev, Yury Makarychev. Correlation Clustering with Asymmetric Classification Errors. ICML 2020
- Ibrahim Jubran, Murad Tukan, Alaa Maalouf and Dan Feldman. Sets Clustering. ICML 2020
- Silvio Lattanzi and Christian Sohler. A Better k-means++ Algorithm via Local Search. ICML 2019
- Jörg Hendrik Kappes, Paul Swoboda, Bogdan Savchynskyy, Tamir Hazan, Christoph Schnörr. Multicuts and Perturb & MAP for Probabilistic Graph Clustering. Journal of Mathematical Imaging and Vision 56(2):221-237 (2016)
- Dan Feldman, Michael Langberg. A unified framework for approximating and clustering data. STOC 2011
- Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation Clustering in General Weighted Graphs. Theoretical Computer Science 361(2-3):172-187 (2006)
- Nikhil Bansal, Avrim Blum and Shuchi Chawla. Correlation Clustering. Machine Learning 56:89-113 (2004)
Ordering
- Rafael Martí and Gerhard Reinelt. The Linear Ordering Problem. Springer 2011.
  - Chapters 2-3 (Heuristics and Meta-Heuristics)
  - Chapter 4 (Branch-and-Bound)
  - Chapter 5 (Branch-and-Cut)
  - Chapter 6 (Linear Ordering Polytope)
- Christoph Buchheim, Angelika Wiegele and Lanbo Zheng. Exact Algorithms for the Quadratic Linear Ordering Problem. INFORMS J. Comput. 22(1):168-177 (2010)
Embedding
- George C. Linderman and Stefan Steinerberger. Clustering with t-SNE, provably. SIAM Journal on Mathematics of Data Science 2019
- Geoffrey Hinton and Sam Roweis. Stochastic Neighbor Embedding. NIPS 2002
Deep Learning (Theory)
- Marin Vlastelica Pogančić, Anselm Paulus, Vít Musil, Georg Martius, Michal Rolínek. Differentiation of Blackbox Combinatorial Solvers. ICLR 2020
- Wei Hu, Lechao Xiao and Jeffrey Pennington. Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. ICLR 2020
- Ben Adlam and Jeffrey Pennington. The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. ICML 2020
- Sanjeev Arora, Nadav Cohen, Wei Hu and Yuping Luo. Implicit Regularization in Deep Matrix Factorization. NeurIPS 2019
- Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud and Joern-Henrik Jacobsen. Invertible Residual Networks. ICML 2019
- Sanjeev Arora, Nadav Cohen and Elad Hazan. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization. ICML 2018
- Samuel S. Schoenholz, Justin Gilmer, Surya Ganguli and Jascha Sohl-Dickstein. Deep Information Propagation. ICLR 2017
- Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang and Honglak Lee. Towards Understanding the Invertibility of Convolutional Neural Networks. IJCAI 2017
- Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein and Surya Ganguli. Exponential expressivity in deep neural networks through transient chaos. NIPS 2016
- Yichuan Tang. Deep Learning using Linear Support Vector Machines. ICML 2013 Workshop on Challenges in Representation Learning
Deep Learning (Applied)
- Marin Vlastelica, Michal Rolínek and Georg Martius. Neuro-algorithmic Policies enable Fast Combinatorial Generalization. arXiv 2021
- Michal Rolínek, Vít Musil, Anselm Paulus, Marin Vlastelica, Claudio Michaelis and Georg Martius. Optimizing Rank-Based Metrics With Blackbox Differentiation. CVPR 2020
- Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak and Ilya Sutskever. Deep Double Descent: Where Bigger Models and More Data Hurt. ICLR 2020
- Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan and Alan Edelman. Universal Differential Equations for Scientific Machine Learning. arXiv 2020
- Jonathan Frankle and Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. ICLR 2019
- Emma Strubell, Ananya Ganesh and Andrew McCallum. Energy and Policy Considerations for Deep Learning in NLP. ACL 2019
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention is All you Need. NIPS 2017
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht and Oriol Vinyals. Understanding deep learning requires rethinking generalization. ICLR 2017
- Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda and Michael M. Bronstein. Geometric deep learning on graphs and manifolds using mixture model CNNs. CVPR 2017