Computer Vision Seminar (Summer Term 2021)

Overview

Format: Virtual seminar with remote supervision and televised student presentations (0/2/0)
Teaching: Mark Schöne, Holger Heidrich, Bjoern Andres
Creditable toward the modules CMS-CLS-ELG, CMS-VC-ELV1, CMS-VC-ELV2, INF-04-FG-IS, INF-B-520, INF-B-540, INF-BAS7, INF-LE-MA, INF-VERT2, INF-VERT7, MATH-MA-INFGDV
Enrolment - Students enrolled in the study program Computational Modeling and Simulation (CMS) need to register via SELMA first, in addition to the seminar registration.
Talk scheduling - Student presentations will be clustered in two virtual conferences:
- From June 3rd through June 7th
- From July 12th through July 14th
Paper selection is now open for all registered students who have scheduled their talk.
Supervision will be exclusively by email. There will be no live meetings and no kickoff meeting.

Talks

Date	Time	Seminar	Topic
June 3rd	09:00 - 09:40	Join live	ORB-SLAM: A Versatile and Accurate Monocular SLAM System
June 4th	09:40 - 10:20	Join live	A Primal-Dual Solver for Large-Scale Tracking-by-Assignment
	11:00 - 11:40		(cancelled)
	11:40 - 12:20	Join live	An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
	16:00 - 16:40		(cancelled)
June 7th	11:00 - 11:40	Join live	Image-to-Image Translation with Conditional Adversarial Networks
July 12th	11:00 - 11:40	Join live	Image-Based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era
	15:20 - 16:00	Join live	Cut, Glue, & Cut: A Fast, Approximate Solver for Multicut Partitioning
	16:00 - 16:40	Join live	Fusion moves for correlation clustering
July 13th	09:40 - 10:20	Join live	K-convexity Shape Priors for Segmentation
	10:20 - 11:00	Join live	Regions with CNN features - R-CNNs
July 14th	09:00 - 09:40	Join live	Guided Image Generation with Conditional Invertible Neural Networks
	09:40 - 10:20	Join live	LocalViT: Bringing Locality to Vision Transformers
	10:20 - 11:00	Join live	Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy
	11:00 - 11:40	Join live	A Style-Based Generator Architecture for Generative Adversarial Networks
	12:20 - 13:00	Join live	Joint Variational Method of Shape from Shading and Stereo
	15:20 - 16:00	Join live	Training data-efficient image transformers & distillation through attention
	16:00 - 16:40	Join live	Transformers in Vision: A Survey
	16:40 - 17:20	Join live	Text-based Editing of Talking-head Video
	17:20 - 18:00	Join live	Deep Residual Learning for Image Recognition

In this seminar, participating students will read, understand, prepare and present the contents of a research article or book chapter on a topic from the field of computer vision. The article or book chapter will be chosen by the student from the list below, or suggested by the student for approval at the beginning of the term. The preparation will include relevant foundational and related work. By attending at least 10 presentations of their peers, students will get an overview of diverse topics in the field of computer vision.

Prerequisites

Prerequisites for taking this course are a strong background in mathematics (esp. linear algebra and analysis) and theoretical computer science. For some of the suggested articles and book chapters, additional knowledge from the field of mathematical optimization or machine learning is required.

Requirements

Requirements for passing this course are:

- A 30-minute oral presentation by the student, televised to their peers and the teachers → Scheduling
- A written report of at least six pages to be submitted by the student before their presentation → Template with instructions
- Active participation in at least 10 presentations by other students
- Additional forms of examination if required by the module toward which the course is to be credited

Supervision

In their preparation, participating students are supervised remotely, by email. They are strongly encouraged to report on their progress briefly, every Friday, by email.

Suggested Research Articles

Image Decomposition

Hossam N. Isack, Lena Gorelick, Karin Ng, Olga Veksler, Yuri Boykov. K-convexity Shape Priors for Segmentation. ECCV 2018
Lena Gorelick, Olga Veksler, Yuri Boykov, Claudia Nieuwenhuis. Convexity Shape Prior for Binary Segmentation. Trans. Pattern Anal. Mach. Intell. 39(2): 258-271 (2017)
Julian Yarkony, Charless C. Fowlkes. Planar Ultrametrics for Image Segmentation. NIPS 2015
Thorsten Beier, Fred A. Hamprecht and Jörg H. Kappes. Fusion moves for correlation clustering. CVPR 2015
Thorsten Beier, Thorben Kröger, Jörg H. Kappes, Ullrich Köthe and Fred A. Hamprecht. Cut, Glue, & Cut: A Fast, Approximate Solver for Multicut Partitioning. CVPR 2014
Sungwoong Kim, Chang Dong Yoo, Sebastian Nowozin, Pushmeet Kohli. Image Segmentation Using Higher-Order Correlation Clustering. Trans. Pattern Anal. Mach. Intell. 36(9): 1761-1774 (2014)
Julian Yarkony, Alexander T. Ihler, Charless C. Fowlkes. Fast Planar Correlation Clustering for Image Segmentation. ECCV 2012
Sebastian Nowozin and Christoph H. Lampert. Global Interactions in Random Field Models: A Potential Function Ensuring Connectedness. SIAM J. Imaging Sci. 3(4): 1048-1074 (2010)

Tracking

Stefan Haller, Mangal Prakash, Lisa Hutschenreiter, Tobias Pietzsch, Carsten Rother, Florian Jug, Paul Swoboda, Bogdan Savchynskyy. A Primal-Dual Solver for Large-Scale Tracking-by-Assignment. AISTATS 2020
Andrea Hornakova, Roberto Henschel, Bodo Rosenhahn, Paul Swoboda. Lifted Disjoint Paths with Application in Multiple Object Tracking. ICML 2020
Shaofei Wang, Steffen Wolf, Charless C. Fowlkes, Julian Yarkony. Tracking Objects with Higher Order Interactions via Delayed Column Generation. AISTATS 2017

Matching

Michal Rolinek, Paul Swoboda, Dominik Zietlow, Anselm Paulus, Vit Musil, Georg Martius. Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers. ECCV 2020
Paul Swoboda, Carsten Rother, Hassan Abu Alhaija, Dagmar Kainmüller, Bogdan Savchynskyy. A Study of Lagrangean Decompositions and Dual Ascent Solvers for Graph Matching. CVPR 2017
Thomas Windheuser, Ulrich Schlickewei, Frank R. Schmidt and Daniel Cremers. Large-Scale Integer Linear Programming for Orientation Preserving 3D Shape Matching. Comput. Graph. Forum 30(5): 1471-1480 (2011)

Pose Estimation

Frank Michel, Alexander Kirillov, Eric Brachmann, Alexander Krull, Stefan Gumhold, Bogdan Savchynskyy, Carsten Rother. Global Hypothesis Generation for 6D Object Pose Estimation. CVPR 2017

Motion Analysis

Margret Keuper. Higher-Order Minimum Cost Lifted Multicuts for Motion Segmentation. ICCV 2017

Reconstruction

Yuhao Xiao, Dingding Xu, Guijin Wang, Xiaowei Hu, Yongbing Zhang, Xiangyang Ji and Li Zhang. Confidence Map Based 3D Cost Aggregation with Multiple Minimum Spanning Trees for Stereo Matching. Asian Conference on Pattern Recognition (ACPR) 2019
Daniel Maurer, Yong Chul Ju, Michael Breuß and Andrés Bruhn. Combining Shape from Shading and Stereo: A Joint Variational Method for Estimating Depth, Illumination and Albedo. International Journal of Computer Vision 126: 1342-1366 (2018)
Raúl Mur-Artal, J. M. M. Montiel and Juan D. Tardós. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. Transactions on Robotics 31(5): 1147-1163 (2015)

Deep Learning for Computer Vision

Yantao Shen, Tong Xiao, Shuai Yi, Dapeng Chen, Xiaogang Wang and Hongsheng Li. Person Re-Identification With Deep Kronecker-Product Matching and Group-Shuffling Random Walk. Transactions on Pattern Analysis and Machine Intelligence 43(5): 1649-1665 (2021)
Yawei Li, Kai Zhang, Jiezhang Cao, Radu Timofte, Luc Van Gool. LocalViT: Bringing Locality to Vision Transformers. arXiv 2021
Kuan Zhu, Haiyun Guo, Shiliang Zhang, Yaowei Wang, Gaopan Huang, Honglin Qiao, Jing Liu, Jinqiao Wang, Ming Tang. AAformer: Auto-Aligned Transformer for Person Re-Identification. arXiv 2021
Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv 2020
Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron and Ren Ng. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. NeurIPS 2020
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou. Training data-efficient image transformers & distillation through attention. arXiv 2020
Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya and Jonathon Shlens. Stand-Alone Self-Attention in Vision Models. NeurIPS 2019
Ohad Fried, Ayush Tewari, Michael Zollhöfer, Adam Finkelstein, Eli Shechtman, Dan B Goldman, Kyle Genova, Zeyu Jin, Christian Theobalt, Maneesh Agrawala. Text-based Editing of Talking-head Video. ACM Transactions on Graphics 38(4) (2019)
Lynton Ardizzone, Carsten Lüth, Jakob Kruse, Carsten Rother, Ullrich Köthe. Guided Image Generation with Conditional Invertible Neural Networks. arXiv 2019
Rosanne Liu, Joel Lehman, Piero Molino, Felipe Petroski Such, Eric Frank, Alex Sergeev and Jason Yosinski. An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution. NeurIPS 2018
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou and Alexei A. Efros. Image-to-Image Translation with Conditional Adversarial Networks. CVPR 2017
Haesol Park and Kyoung Mu Lee. Look Wider to Match Image Patches With Convolutional Neural Networks. Signal Processing Letters 24(12): 1788-1792 (2017)
Tero Karras, Samuli Laine and Timo Aila. A Style-Based Generator Architecture for Generative Adversarial Networks. CVPR 2019
Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun. Deep Residual Learning for Image Recognition. CVPR 2016
Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi and Wanli Ouyang. FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction. NeurIPS 2018
Shaoqing Ren, Kaiming He, Ross Girshick and Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NeurIPS 2015