Keynote Speakers

Prof. Rama Chellappa
University of Maryland
Learning Along the Edge of Deep Networks

BIO
Prof. Rama Chellappa is a Distinguished University Professor, a Minta Martin Professor of Engineering and Chair of the ECE department at the University of Maryland. His current research interests span many areas in image processing, computer vision, machine learning and pattern recognition. Prof. Chellappa is a recipient of an NSF Presidential Young Investigator Award and four IBM Faculty Development Awards. He received the K.S. Fu Prize from the International Association for Pattern Recognition (IAPR). He is a recipient of the Society, Technical Achievement and Meritorious Service Awards from the IEEE Signal Processing Society. He also received the Technical Achievement and Meritorious Service Awards from the IEEE Computer Society. Recently, he received the inaugural Leadership Award from the IEEE Biometrics Council. At UMD, he received college and university level recognitions for research, teaching, innovation and mentoring of undergraduate students. In 2010, he was recognized as an Outstanding ECE by Purdue University. He received the Distinguished Alumni Award from the Indian Institute of Science in 2016. Prof. Chellappa served as the Editor-in-Chief of PAMI. He is a Golden Core Member of the IEEE Computer Society, served as a Distinguished Lecturer of the IEEE Signal Processing Society and as the President of the IEEE Biometrics Council. He is a Fellow of IEEE, IAPR, OSA, AAAS, ACM and AAAI and holds six patents.
Abstract
While Deep Convolutional Neural Networks (DCNNs) have achieved impressive results on many detection and classification tasks (for example, unconstrained face detection, verification and recognition), it is still unclear why they perform so well and how to properly design them. It is widely recognized that training deep networks requires an abundance of training samples. These training samples need to be lossless, perfectly labeled, and balanced across the various classes. The generalization performance of the designed networks and their robustness to adversarial examples also need to be improved. In this talk, we analyze each of these individual conditions to understand their effects on the performance of deep networks and present mitigation strategies when the ideal conditions are not met.
First, we investigate the relationship between the performance of a convolutional neural network (CNN), its depth, and the size of its training set and present performance bounds on CNNs with respect to the network parameters and the size of the available training dataset. Next, we consider the task of adaptively finding optimal training subsets which will be iteratively presented to the DCNN. We present convex optimization methods, based on an objective criterion and a quantitative measure of the current performance of the classifier, to efficiently identify informative samples to train on. Then we present Defense-GAN, a new strategy that leverages the expressive capability of generative models to defend DCNNs against adversarial attacks. Defense-GAN can be used with any classification model and does not modify the classifier structure or training procedure. It can also be used as a defense against any attack as it does not assume knowledge of the process for generating the adversarial examples. We will also present an approach, based on the GAN framework, for training a DCNN using compressed data. Finally, to address generalization to unlabeled test data and robustness to adversarial samples, we propose an approach that leverages unsupervised data to bring the source and target distributions closer in a learned joint feature space. This is accomplished by inducing a symbiotic relationship between the learned embedding and a generative adversarial network. We demonstrate the impact of the analyses discussed above on a variety of reconstruction and classification problems.

Prof. Sven Dickinson
University of Toronto
The Role of Symmetry in Human and Computer Vision

BIO
Sven Dickinson received the B.A.Sc. degree in Systems Design Engineering from the University of Waterloo, in 1983, and the M.S. and Ph.D. degrees in Computer Science from the University of Maryland, in 1988 and 1991, respectively. He is currently Professor of the Department of Computer Science at the University of Toronto, where he has served as Chair (2010-2015), Acting Chair (2008-2009), and Vice Chair (2003-2006). From 1995-2000, he was Assistant Professor of Computer Science at Rutgers University, where he held a joint appointment in the Rutgers Center for Cognitive Science (RuCCS) and membership in the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS). From 1994-1995, he was a Research Assistant Professor in the Rutgers Center for Cognitive Science, and from 1991-1994, a Research Associate at the Artificial Intelligence Laboratory, University of Toronto. He has held affiliations with the MIT Media Laboratory (Visiting Scientist, 1992-1994), the University of Toronto (Visiting Assistant Professor, 1994-1997), the Computer Vision Laboratory of the Center for Automation Research at the University of Maryland (Assistant Research Scientist, 1993-1994, Visiting Assistant Professor, 1994-1997), and the University of California, Santa Barbara (Visiting Professor, 2010-2011, 2015-2016). Prior to his academic career, he worked in the computer vision industry, designing image processing systems for Grinnell Systems Inc., San Jose, CA, 1983-1984, and optical character recognition systems for DEST, Inc., Milpitas, CA, 1984-1985.
Dr. Dickinson's research interests revolve around the problem of shape perception in computer vision and, more recently, human vision. Much of his recent work focuses on perceptual grouping and its role in image segmentation and shape recovery. He has introduced numerous qualitative shape representations, and their basis in symmetry provides a focus for his perceptual grouping research. His interest in multiscale, parts-based shape representations, and their common abstraction as hierarchical graphs, has motivated his research in inexact graph indexing and matching -- key problems in object recognition, another broad focus of his research. His research has also explored many problems related to object recognition, including object tracking, vision-based navigation, content-based image retrieval, language-vision integration, and image/model abstraction.
In 1996, Dr. Dickinson received the NSF CAREER award for his work in generic object recognition, and in 2002, received the Government of Ontario Premier's Research Excellence Award (PREA), also for his work in generic object recognition. In 2012, he received the Lifetime Research Achievement Award from the Canadian Image Processing and Pattern Recognition Society (CIPPRS). In an effort to bring together researchers from human and computer vision, he was co-chair of the 1997, 1999, 2004, and 2007 IEEE International Workshops on Generic Object Recognition (or Object Categorization), which culminated in the interdisciplinary volume, Object Categorization: Computer and Human Vision Perspectives, in 2009, and was co-chair of the 2008, 2009, 2010, and 2011 International Workshops on Shape Perception in Human and Computer Vision, which culminated in the interdisciplinary volume, Shape Perception in Human and Computer Vision: An Interdisciplinary Perspective, in 2013. He was General Co-Chair of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and currently serves or has served on the editorial boards of the journals: IEEE Transactions on Pattern Analysis and Machine Intelligence; International Journal of Computer Vision; Computer Vision and Image Understanding; Image and Vision Computing; Graphical Models; Pattern Recognition Letters; IET Computer Vision; and the Journal of Electronic Imaging. In 2017, he became Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He has also been co-editor of the Synthesis Lectures on Computer Vision from Morgan & Claypool Publishers since its inauguration in 2009.
Abstract
Symmetry is one of the most ubiquitous regularities in our natural world. For almost 100 years, human vision researchers have studied how the human vision system has evolved to exploit this powerful regularity as a basis for grouping image features and, for almost 50 years, as a basis for how the human vision system might encode the shape of an object. While computer vision is a much younger discipline, the trajectory is similar, with symmetry playing a major role in both perceptual grouping and object representation. After briefly reviewing some of the milestones in symmetry-based perceptual grouping and object representation/recognition in both human and computer vision, I will articulate some of the research challenges. I will then briefly describe some of our recent efforts to address these challenges, including the detection of symmetry in complex imagery and understanding the role of symmetry in human scene perception.

Prof. Shaogang (Sean) Gong
Queen Mary University of London
People Search In Large Scale Videos

BIO
Prof. Gong is Professor of Visual Computation at Queen Mary University of London, an elected Fellow of the Institution of Electrical Engineers, a Fellow of the British Computer Society, and a member of the UK Computing Research Committee. He served on the Steering Panel of the UK Government Chief Scientific Advisor's Science Review.
Prof. Gong's early interest was in information theory & measurement, and he received a B.Sc. from the University of Electronic Science and Technology of China in 1985. Gong's B.Sc. thesis project was on biomedical image analysis, which gave him the opportunity to develop a wider interest in robotics. This led Gong to pursue a doctorate in computer vision under the supervision of Mike Brady at Keble College Oxford and the Oxford Robotics Group in 1986. Brady introduced Gong to differential geometry in computer vision and the work of Ellen Hildreth at the MIT AI Lab on computing optic flow. During that time, Gong met David Murray, who was on sabbatical at Oxford from GEC Hirst. Murray introduced Gong to the extensive work by Murray and Bernard Buxton at the GEC Hirst Centre on structure-from-motion for autonomous guided vehicle navigation. Gong received his D.Phil. from Oxford in 1989 with a thesis on computing optic flow by second-order geometrical analysis of Hessian derivatives with wave-diffusion propagation. Gong received a Queen's Research Scientist Award in 1987, and was a Royal Society Research Fellow in 1987 and 1988 and a GEC-sponsored Oxford research fellow in 1989.
Abstract
The amount of video data from urban environments is growing exponentially from 24/7 operating infrastructure cameras, online social media sources, self-driving cars, and smart city intelligent transportation systems, with 1.4 trillion hours of CCTV video in 2017 and growing to 3.3 trillion hours by 2020. The scale and diversity of these videos make it very difficult to filter and extract useful information in a timely manner. Finding people and searching for the same individuals against a large population of unknowns in urban spaces pose a significant challenge to computer vision and machine learning. Established techniques such as face recognition, although successful for document verification in controlled environments and on smart phones, perform poorly for people search in unstructured videos of wide-field views due to low resolution, motion blur, and a lack of detectable facial imagery in unconstrained scenes. In contrast to face recognition, person re-identification considers pedestrian whole-body appearance matching by exploring clothing characteristics and body-part attributes from arbitrary views. In the past decade, significant progress has been made on person re-identification for matching people in increasingly larger scale benchmarks. However, such progress relies heavily on supervised learning with strong assumptions that both model training and testing data are sampled from the same domain, and that pair-wise labelled training data are available, exhaustively sampled for every camera pair in each domain. Such assumptions render most existing techniques unscalable to large scale videos from an unknown number of unknown sources.
In this talk, I will focus on recent progress in advancing unsupervised person re-identification for people search in large scale videos, addressing the problems of visual attention deep learning, joint attribute-identity domain transfer deep learning, imbalanced attribute deep learning, unsupervised deep learning of space-time correlations, and mutual learning in multi-scale matching.
