Posted by Zizhao Zhang, Software Engineer, Google Cloud

In visual understanding, the Vision Transformer (ViT) and its variants have recently received significant attention due to their superior performance on many core visual applications, such as image classification, object detection, and…