Machine learning for image and video understanding

Contact: Justin Dauwels

Human intelligence can easily discover object components, learn abstract concepts, and reason about relationships between entities in natural scenes, while artificial intelligence still lags behind. This theme aims to bridge this gap in practical applications by incorporating prior domain knowledge. Specific research topics include object detection, person tracking, 3D scene reconstruction, spatiotemporal radar image prediction, representation learning, and image and video generation. Instead of considering synthetic images or simple, manually cleaned datasets, this research theme focuses on more challenging real-world data such as room interiors, office buildings, and precipitation images, to name just a few, where occlusion, incomplete observations, or extreme values may appear. To cope with these challenges, we will explore the interplay of flexible deep neural networks with multi-view information fusion approaches, neural radiance field models, extreme value theory, and probabilistic graphical models. We collaborate closely with a variety of research institutes (e.g., NTU) and companies (e.g., DeepMind, Philips, TÜV SÜD, Nexans).

Projects under this theme

Neural Radiance Fields for 3D Reconstruction of Scenes from Images

This project aims to synthesize novel views of a scene given only a sparse set of input views.

Nowcasting of Extreme Rainfall

This project explores nowcasting of extreme rainfall with deep generative models and extreme value theory.

Machine Learning for Optimizing Workflow in the Operating Room

This project addresses object detection, person tracking, and camera calibration in scenes with severe occlusion.