Google is proud to be a Platinum Sponsor of the European Conference on Computer Vision (ECCV 2022), a premier forum for the dissemination of research in computer vision and machine learning (ML). This year, ECCV 2022 will be held as a hybrid event, in person in Tel Aviv, Israel, with virtual attendance as an option. Google has a strong presence at this year’s conference with over 60 accepted publications and active involvement in a number of workshops and tutorials. We look forward to sharing some of our extensive research and expanding our partnership with the broader ML research community.
Registered for ECCV 2022? We hope you’ll visit our on-site or virtual booths to learn more about the research we’re presenting at ECCV 2022, including several demos and opportunities to connect with our researchers. Learn more about Google’s research being presented at ECCV 2022 below (Google affiliations in bold).
Organizing Committee
Program Chairs include: Moustapha Cissé
Awards Paper Committee: Todd Zickler
Area Chairs include: Ayan Chakrabarti, Tali Dekel, Alireza Fathi, Vittorio Ferrari, David Fleet, Dilip Krishnan, Michael Rubinstein, Cordelia Schmid, Deqing Sun, Federico Tombari, Jasper Uijlings, Ming-Hsuan Yang, Todd Zickler
Accepted Publications
NeuMesh: Learning Disentangled Neural Mesh-Based Implicit Field for Geometry and Texture Editing
Bangbang Yang, Chong Bao, Junyi Zeng, Hujun Bao, Yinda Zhang, Zhaopeng Cui, Guofeng Zhang
Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks
Zihang Zou, Boqing Gong, Liqiang Wang
Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris N. Metaxas
Waymo Open Dataset: Panoramic Video Panoptic Segmentation
Jieru Mei, Alex Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar
PRIF: Primary Ray-Based Implicit Function
Brandon Yushan Feng, Yinda Zhang, Danhang Tang, Ruofei Du, Amitabh Varshney
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
Boyan Jiang, Xinlin Ren, Mingsong Dou, Xiangyang Xue, Yanwei Fu, Yinda Zhang
k-Means Mask Transformer (see blog post)
Qihang Yu*, Siyuan Qiao, Maxwell D Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen
MaxViT: Multi-Axis Vision Transformer (see blog post)
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs
Yanyan Li, Federico Tombari
RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation
Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji
GOCA: Guided Online Cluster Assignment for Self-Supervised Video Representation Learning
Huseyin Coskun, Alireza Zareian, Joshua L Moore, Federico Tombari, Chen Wang
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin*
Adaptive Transformers for Robust Few-Shot Cross-Domain Face Anti-spoofing
Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang
DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning
Zifeng Wang*, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister
BLT: Bidirectional Layout Transformer for Controllable Layout Generation
Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma
Learning Visibility for Robust Dense Human Body Estimation
Chun-Han Yao, Jimei Yang, Duygu Ceylan, Yi Zhou, Yang Zhou, Ming-Hsuan Yang
Are Vision Transformers Robust to Patch Perturbations?
Jindong Gu, Volker Tresp, Yao Qin
PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds
Zhaoqi Leng, Shuyang Cheng, Ben Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov
Structure and Motion from Casual Videos
Zhoutong Zhang, Forrester Cole, Zhengqi Li, Noah Snavely, Michael Rubinstein, William T. Freeman
PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map
Chenfeng Xu, Tian Li, Chen Tang, Lingfeng Sun, Kurt Keutzer, Masayoshi Tomizuka, Alireza Fathi, Wei Zhan
Novel Class Discovery Without Forgetting
Joseph K J, Sujoy Paul, Gaurav Aggarwal, Soma Biswas, Piyush Rai, Kai Han, Vineeth N Balasubramanian
Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas
PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks
Nan Ding, Xi Chen, Tomer Levinboim, Soravit Changpinyo, Radu Soricut
InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images
Zhengqi Li, Qianqian Wang*, Noah Snavely, Angjoo Kanazawa*
Generalizable Patch-Based Neural Rendering (see blog post)
Mohammed Suhail*, Carlos Esteves, Leonid Sigal, Ameesh Makadia
LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds
Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov
The Missing Link: Finding Label Relations Across Datasets
Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
Learning Instance-Specific Adaptation for Cross-Domain Segmentation
Yuliang Zou, Zizhao Zhang, Chun-Liang Li, Han Zhang, Tomas Pfister, Jia-Bin Huang
Learning Audio-Video Modalities from Image Captions
Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid
TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency
Medhini Narasimhan*, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid
On Label Granularity and Object Localization
Elijah Cole, Kimberly Wilber, Grant Van Horn, Xuan Yang, Marco Fornoni, Pietro Perona, Serge Belongie, Andrew Howard, Oisin Mac Aodha
Disentangling Architecture and Training for Optical Flow
Deqing Sun, Charles Herrmann, Fitsum Reda, Michael Rubinstein, David J. Fleet, William T. Freeman
NewsStories: Illustrating Articles with Visual Summaries
Reuben Tan, Bryan Plummer, Kate Saenko, J.P. Lewis, Avneesh Sud, Thomas Leung
Improving GANs for Long-Tailed Data Through Group Spectral Regularization
Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, Venkatesh Babu Radhakrishnan
Planes vs. Chairs: Category-Guided 3D Shape Learning Without Any 3D Cues
Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James Rehg
A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch
Patsorn Sangkloy, Wittawat Jitkrittum, Diyi Yang, James Hays
Learned Monocular Depth Priors in Visual-Inertial Initialization
Yunwen Zhou, Abhishek Kar, Eric L. Turner, Adarsh Kowdle, Chao Guo, Ryan DuToit, Konstantine Tsotsos
How Stable are Transferability Metrics Evaluations?
Andrea Agostinelli, Michal Pandy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari
Data-Free Neural Architecture Search via Recursive Label Calibration
Zechun Liu*, Zhiqiang Shen, Yun Long, Eric Xing, Kwang-Ting Cheng, Chas H. Leichner
Fast and High Quality Image Denoising via Malleable Convolution
Yifan Jiang*, Bartlomiej Wronski, Ben Mildenhall, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue
Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation
Jogendra Nath Kundu, Suvaansh Bhambri, Akshay R Kulkarni, Hiran Sarkar, Varun Jampani, Venkatesh Babu Radhakrishnan
Learning Online Multi-Sensor Depth Fusion
Erik Sandström, Martin R. Oswald, Suryansh Kumar, Silvan Weder, Fisher Yu, Cristian Sminchisescu, Luc Van Gool
Hierarchical Semantic Regularization of Latent Spaces in StyleGANs
Tejan Karmali, Rishubh Parihar, Susmit Agrawal, Harsh Rangwani, Varun Jampani, Maneesh K Singh, Venkatesh Babu Radhakrishnan
RayTran: 3D Pose Estimation and Shape Reconstruction of Multiple Objects from Videos with Ray-Traced Transformers
Michał J Tyszkiewicz, Kevis-Kokitsi Maninis, Stefan Popov, Vittorio Ferrari
Neural Video Compression Using GANs for Detail Synthesis and Propagation
Fabian Mentzer, Eirikur Agustsson, Johannes Ballé, David Minnen, Nick Johnston, George Toderici
Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
Grant Van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, Serge Belongie
Implicit Neural Representations for Image Compression
Yannick Strümpler, Janis Postels, Ren Yang, Luc Van Gool, Federico Tombari
3D Compositional Zero-Shot Learning with DeCompositional Consensus
Muhammad Ferjad Naeem, Evin Pınar Örnek, Yongqin Xian, Luc Van Gool, Federico Tombari
FindIt: Generalized Localization with Natural Language Queries (see blog post)
Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova
A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation
Wuyang Chen*, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou
Improved Masked Image Generation with Token-Critic
Jose Lezama, Huiwen Chang, Lu Jiang, Irfan Essa
Learning Discriminative Shrinkage Deep Networks for Image Deconvolution
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien, Ming-Hsuan Yang
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis*, Scott Wisdom, Tal Remez, John Hershey
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer, Alexey Gritsenko, Austin C Stone, Maxim Neumann, Dirk Weißenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality
Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf
Video Question Answering with Iterative Video-Text Co-tokenization (see blog post)
AJ Piergiovanni, Kairo Morton*, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova
Class-Agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Fahad Shahbaz Khan, Rao Muhammad Anwer, Ming-Hsuan Yang
FILM: Frame Interpolation for Large Motion (see blog post)
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Compositional Human-Scene Interaction Synthesis with Semantic Control
Kaifeng Zhao, Shaofei Wang, Yan Zhang, Thabo Beeler, Siyu Tang
Workshops
LatinX in AI
Mentors include: José Lezama
Keynote Speakers include: Andre Araujo
AI for Creative Video Editing and Understanding
Keynote Speakers include: Tali Dekel, Negar Rostamzadeh
Learning With Limited and Imperfect Data (L2ID)
Invited Speakers include: Xiuye Gu
Organizing Committee includes: Sadeep Jayasumana
International Challenge on Compositional and Multimodal Perception (CAMP)
Program Committee includes: Edward Vendrow
Self-Supervised Learning: What is Next?
Invited Speakers include: Mathilde Caron, Arsha Nagrani
Organizers include: Andrew Zisserman
3rd Workshop on Adversarial Robustness In the Real World
Invited Speakers include: Ekin Dogus Cubuk
Organizers include: Xinyun Chen, Alexander Robey, Nataniel Ruiz, Yutong Bai
AV4D: Visual Learning of Sounds in Spaces
Invited Speakers include: John Hershey
Challenge on Mobile Intelligent Photography and Imaging (MIPI)
Invited Speakers include: Peyman Milanfar
Robust Vision Challenge 2022
Organizing Committee includes: Alina Kuznetsova
Computer Vision in the Wild
Challenge Organizers include: Yi-Ting Chen, Ye Xia
Invited Speakers include: Yin Cui, Yongqin Xian, Neil Houlsby
Self-Supervised Learning for Next-Generation Industry-Level Autonomous Driving (SSLAD)
Organizers include: Fisher Yu
Responsible Computer Vision
Organizing Committee includes: Been Kim
Invited Speakers include: Emily Denton
Cross-Modal Human-Robot Interaction
Invited Speakers include: Peter Anderson
ISIC Skin Image Analysis
Organizing Committee includes: Yuan Liu
Steering Committee includes: Yuan Liu, Dale Webster
Invited Speakers include: Yuan Liu
Observing and Understanding Hands in Action
Sponsored by Google
Autonomous Vehicle Vision (AVVision)
Speakers include: Fisher Yu
Visual Perception for Navigation in Human Environments: The JackRabbot Human Body Pose Dataset and Benchmark
Organizers include: Edward Vendrow
Language for 3D Scenes
Invited Speakers include: Jason Baldridge
Organizers include: Leonidas Guibas
Designing and Evaluating Computer Perception Systems (CoPe)
Organizers include: Andrew Zisserman
Learning To Generate 3D Shapes and Scenes
Panelists include: Pete Florence
Advances in Image Manipulation
Program Committee includes: George Toderici, Ming-Hsuan Yang
TiE: Text in Everything
Challenge Organizers include: Shangbang Long, Siyang Qin
Invited Speakers include: Tali Dekel, Aishwarya Agrawal
Instance-Level Recognition
Organizing Committee: Andre Araujo, Bingyi Cao, Tobias Weyand
Invited Speakers include: Mathilde Caron
What Is Motion For?
Organizing Committee: Deqing Sun, Fitsum Reda, Charles Herrmann
Invited Speakers include: Tali Dekel
Neural Geometry and Rendering: Advances and the Common Objects in 3D Challenge
Invited Speakers include: Ben Mildenhall
Visual Object-Oriented Learning Meets Interaction: Discovery, Representations, and Applications
Invited Speakers include: Klaus Greff, Thomas Kipf
Organizing Committee includes: Leonidas Guibas
Vision with Biased or Scarce Data (VBSD)
Program Committee includes: Yizhou Wang
Multiple Object Tracking and Segmentation in Complex Environments
Invited Speakers include: Xingyi Zhou, Fisher Yu
3rd Visual Inductive Priors for Data-Efficient Deep Learning Workshop
Organizing Committee includes: Ekin Dogus Cubuk
DeeperAction: Detailed Video Action Understanding and Anomaly Recognition
Advisors include: Rahul Sukthankar
Sign Language Understanding Workshop and Sign Language Recognition, Translation & Production Challenge
Organizing Committee includes: Andrew Zisserman
Speakers include: Andrew Zisserman
Ego4D: First-Person Multi-Modal Video Understanding
Invited Speakers include: Michal Irani
AI-Enabled Medical Image Analysis: Digital Pathology & Radiology/COVID19
Program Chairs include: Po-Hsuan Cameron Chen
Workshop Partner: Google Health
Visual Object Tracking Challenge (VOT 2022)
Technical Committee includes: Christoph Mayer
Assistive Computer Vision and Robotics
Technical Committee includes: Maja Mataric
Human Body, Hands, and Activities from Egocentric and Multi-View Cameras
Organizers include: Francis Engelmann
Frontiers of Monocular 3D Perception: Implicit x Explicit
Panelists include: Pete Florence
Tutorials
Self-Supervised Representation Learning in Computer Vision
Invited Speakers include: Ting Chen
Neural Volumetric Rendering for Computer Vision
Organizers include: Ben Mildenhall, Pratul Srinivasan, Jon Barron
Presenters include: Ben Mildenhall, Pratul Srinivasan
New Frontiers in Efficient Neural Architecture Search!
Speakers include: Ruochen Wang
*Work done while at Google.
Posted by Shaina Mehta, Program Manager, Google