University of Toronto | Mar 2021 - Apr 2021 | Research, team of three
We simulated the COVID-19 diagnostic process by classifying chest X-ray images as either virus-infected or normal lungs, using PyTorch, OpenCV, and Pandas.
My contributions:
- Fine-tuned two ImageNet-21k pre-trained models on the Chest X-Ray dataset: a CNN (BiT-M-R50x1) and a hybrid Vision Transformer (R50+ViT-B/16), as sketched after this list
- Demonstrated that, although Vision Transformers lack the inductive biases inherent in CNNs and do not generalize well when trained on small datasets, a Vision Transformer pre-trained at sufficient scale can outperform a state-of-the-art CNN on this classification subtask
- Applied Gradient-weighted Class Activation Mapping (Grad-CAM) to inspect the BiT model's activation heatmaps (see the Grad-CAM sketch after this list), and Activation Maximization to analyze and compare the two models' decision policies
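A minimal PyTorch fine-tuning sketch for the first bullet, not the project's actual training script: it assumes the chest X-ray data is arranged in ImageFolder directories and that the timm model identifiers shown match the installed timm version (both are assumptions, not taken from this project).

```python
# Fine-tuning sketch: swap an ImageNet-21k backbone's head for a 2-class head
# and train on a folder-structured chest X-ray dataset (paths are hypothetical).
import timm
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# BiT-M R50x1 (CNN); the commented line is the R50+ViT-B/16 hybrid.
# Exact identifiers vary between timm versions -- treat these as assumptions.
model = timm.create_model("resnetv2_50x1_bitm", pretrained=True, num_classes=2)
# model = timm.create_model("vit_base_r50_s16_224_in21k", pretrained=True, num_classes=2)
model = model.to(device)

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # X-rays are single-channel
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("chest_xray/train", transform=transform)  # hypothetical path
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```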
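A minimal Grad-CAM sketch for the CNN (BiT) model from the third bullet, reusing `model`, `transform`, and `device` from the block above. The hooked layer (`model.stages[-1]`) and the input file name are illustrative assumptions about the backbone's internals, not the project's code.

```python
# Grad-CAM sketch: weight the last conv stage's activations by the
# spatial mean of their gradients, then ReLU and normalize into a heatmap.
import torch
from PIL import Image

activations, gradients = {}, {}

def save_activation(module, inp, out):
    activations["value"] = out.detach()

def save_gradient(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()

# Hook the final convolutional stage (attribute name assumed for timm's ResNetV2).
target_layer = model.stages[-1]
target_layer.register_forward_hook(save_activation)
target_layer.register_full_backward_hook(save_gradient)

model.eval()
image = transform(Image.open("example_xray.png").convert("RGB")).unsqueeze(0).to(device)
logits = model(image)
pred = logits.argmax(dim=1).item()
logits[0, pred].backward()  # gradients of the predicted class score

weights = gradients["value"].mean(dim=(2, 3), keepdim=True)      # channel importance
cam = torch.relu((weights * activations["value"]).sum(dim=1)).squeeze(0)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
heatmap = cam.cpu().numpy()  # upsample and overlay on the X-ray, e.g. with OpenCV
```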