University of Toronto | Mar 2021 - Apr 2021 | Research, team of three
We simulated the COVID-19 diagnostic process by classifying chest X-ray images as either virus-infected or normal lungs, using PyTorch, OpenCV, and Pandas.
My contributions:
- Fine-tuned two ImageNet-21k pre-trained models on the Chest X-Ray dataset: a CNN (BiT-M-R50x1) and a hybrid Vision Transformer (R50+ViT-B/16), as sketched after this list
- Demonstrated that, although Vision Transformers lack the inductive biases inherent in CNNs and do not generalize well when trained on small datasets, a Vision Transformer pre-trained at sufficient scale can outperform a state-of-the-art CNN on this classification subtask
- Applied Gradient-weighted Class Activation Mapping (Grad-CAM) to inspect the BiT model's activation heatmaps (see the Grad-CAM sketch after this list), and Activation Maximization to analyze and compare the two models' decision policies
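A minimal PyTorch fine-tuning sketch for the first bullet, not the project's actual training script: it assumes the chest X-ray data is arranged in ImageFolder directories and that the timm model identifiers shown match the installed timm version (both are assumptions, not taken from this project).

```python
# Fine-tuning sketch: swap an ImageNet-21k backbone's head for a 2-class head
# and train on a folder-structured chest X-ray dataset (paths are hypothetical).
import timm
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# BiT-M R50x1 (CNN); the commented line is the R50+ViT-B/16 hybrid.
# Exact identifiers vary between timm versions -- treat these as assumptions.
model = timm.create_model("resnetv2_50x1_bitm", pretrained=True, num_classes=2)
# model = timm.create_model("vit_base_r50_s16_224_in21k", pretrained=True, num_classes=2)
model = model.to(device)

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # X-rays are single-channel
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("chest_xray/train", transform=transform)  # hypothetical path
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```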
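A minimal Grad-CAM sketch for the CNN (BiT) model from the third bullet, reusing `model`, `transform`, and `device` from the block above. The hooked layer (`model.stages[-1]`) and the input file name are illustrative assumptions about the backbone's internals, not the project's code.

```python
# Grad-CAM sketch: weight the last conv stage's activations by the
# spatial mean of their gradients, then ReLU and normalize into a heatmap.
import torch
from PIL import Image

activations, gradients = {}, {}

def save_activation(module, inp, out):
    activations["value"] = out.detach()

def save_gradient(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()

# Hook the final convolutional stage (attribute name assumed for timm's ResNetV2).
target_layer = model.stages[-1]
target_layer.register_forward_hook(save_activation)
target_layer.register_full_backward_hook(save_gradient)

model.eval()
image = transform(Image.open("example_xray.png").convert("RGB")).unsqueeze(0).to(device)
logits = model(image)
pred = logits.argmax(dim=1).item()
logits[0, pred].backward()  # gradients of the predicted class score

weights = gradients["value"].mean(dim=(2, 3), keepdim=True)      # channel importance
cam = torch.relu((weights * activations["value"]).sum(dim=1)).squeeze(0)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
heatmap = cam.cpu().numpy()  # upsample and overlay on the X-ray, e.g. with OpenCV
```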