Vision Transformer for Chest X-Ray Classification

University of Toronto | Mar 2021 - Apr 2021 | Research, team of three

We simulated COVID-19 diagnosis by classifying chest X-ray images as either virus-infected or normal lungs, using PyTorch, OpenCV, and Pandas.

My contributions:

  1. Fine-tuned two ImageNet-21k pre-trained models, a Vision Transformer (R50+ViT-B/16) and a CNN (BiT-M-R50x1), on the Chest X-Ray dataset

  2. Demonstrated that for this classification subtask, although Vision Transformers lack the inductive biases inherent in CNNs and generalize poorly when trained on small datasets, a Vision Transformer pre-trained at sufficient scale can outperform the state-of-the-art CNN model

  3. Used Gradient-weighted Class Activation Mapping (Grad-CAM) to inspect the activation heatmaps of the BiT model, and Activation Maximization to analyze and compare the two models’ decision policies
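
For readers unfamiliar with Grad-CAM, the heatmap in item 3 comes from weighting each feature map of a convolutional layer by the spatial average of the class-score gradient for that map, summing the weighted maps, and applying a ReLU. The following is a minimal illustrative sketch in plain Python, not the project's PyTorch code; the function name `grad_cam` and the toy inputs are hypothetical, and in practice the activations and gradients would be captured from the model with hooks.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations: K feature maps, each an H x W nested list (layer output).
    gradients:   K maps of the same shape (d class_score / d activation).
    Returns an H x W nested list scaled to [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average pooling of each gradient map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for weight, act_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act_map[i][j]

    # ReLU keeps only regions that push the class score up.
    cam = [[max(value, 0.0) for value in row] for row in cam]

    # Scale to [0, 1] for display; leave an all-zero map unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps with made-up gradients.
toy_activations = [[[1, 2], [3, 4]], [[4, 3], [2, 1]]]
toy_gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

The resulting map is typically upsampled to the input resolution and overlaid on the X-ray to show which regions drove the prediction.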

Ziyi Zhou

Research Engineer | Samsung Research America (SRA)