Author: Oganes Ashikyan, MD Contact: firstname.lastname@example.org
Transfer learning is a popular technique in current radiological artificial intelligence (AI) articles that investigate a variety of conditions. This technique relies on pre-trained deep neural networks (ResNet, DenseNet, Inception, etc.) to train computers to classify various radiological images. Humans evaluate radiological images taking into account numerous background data, and they both consciously and subconsciously evaluate numerous features on the images. In classification problems, whether a human reader judges a finding to be aggressive or non-aggressive depends in part on its shape and contour. It is unknown exactly which features deep learning AI uses when it classifies radiological images. Surprisingly, there is very little scientific information available that describes AI’s accuracy in classifying simple geometric shapes, even though it is relatively easy to test the accuracy of pre-trained nets in differentiating such shapes.
Purpose: To investigate whether transfer learning using pre-trained deep neural network ResNet50 can accurately classify simple geometric shapes.
Randomly sized and randomly colored circles and rectangles were generated on 224 x 224 pixel, 3-color channel images using a simple Python script. 100 circles and 100 rectangles were created as the input dataset. Using the usual transfer learning techniques and the pre-trained ResNet50 deep learning network, a new net was trained to classify images into circle and rectangle categories. The input dataset was split into training (70%) and validation (30%) datasets, and the usual dataset augmentation techniques were applied to the training dataset. The new net was tested on an additional 100 randomly generated test circles and 100 randomly generated test rectangles. A consumer Intel Core i5 CPU computer without a high-powered GPU was used in this project.
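The generation and split steps above can be sketched as follows. This is a minimal sketch, not the original script: the NumPy-based drawing, the helper names `random_shape_image` and `make_dataset`, the uniform background color, the size range of the shapes, and the fixed random seed are all assumptions made for illustration.

```python
import numpy as np

SIZE = 224  # image side in pixels, matching the 224 x 224 inputs


def random_shape_image(kind, rng):
    """Return a (224, 224, 3) uint8 image with one random circle or rectangle."""
    # Assumption: a uniformly colored random background.
    img = np.empty((SIZE, SIZE, 3), dtype=np.uint8)
    img[:] = rng.integers(0, 256, size=3, dtype=np.uint8)
    color = rng.integers(0, 256, size=3, dtype=np.uint8)

    # Random center anywhere in the image; the shape may run past the border.
    cy, cx = rng.integers(0, SIZE, size=2)
    yy, xx = np.ogrid[:SIZE, :SIZE]

    if kind == "circle":
        r = rng.integers(10, 80)  # hypothetical radius range
        mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
    else:
        h, w = rng.integers(10, 80, size=2)  # hypothetical half-height, half-width
        mask = (np.abs(yy - cy) <= h) & (np.abs(xx - cx) <= w)

    img[mask] = color
    return img


def make_dataset(n_per_class, rng):
    """Build n_per_class circles (label 0) and n_per_class rectangles (label 1)."""
    images, labels = [], []
    for label, kind in enumerate(("circle", "rectangle")):
        for _ in range(n_per_class):
            images.append(random_shape_image(kind, rng))
            labels.append(label)
    return np.stack(images), np.array(labels)


rng = np.random.default_rng(0)  # fixed seed only so the sketch is repeatable
X, y = make_dataset(100, rng)

# Shuffle, then split 70% training / 30% validation.
order = rng.permutation(len(X))
n_train = len(X) * 7 // 10
train_idx, val_idx = order[:n_train], order[n_train:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
```

In this sketch, 200 input images yield 140 training and 60 validation images. Because centers are drawn anywhere in the image, some shapes are cut off by the image boundary, and a shape can occasionally land on a similarly colored background.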
The re-trained ResNet50 achieved close to 100% accuracy and low loss after 4 min of training on a single Intel Core i5 CPU (Figure 1). The complete transfer learning re-training time was 16 min and 30 sec. Notice that while smoothed training/validation accuracy was recorded at 100%, there were occasional cases where accuracy dropped to 90% for individual images.
Examples of how the re-trained net classified four random validation images are provided in Figure 2.
During testing on the additional 200 images, the network achieved an overall 98% accuracy (Figure 3).
Ok, what is going on here? Overall, the accuracy was not bad at 98%, considering the training was done on only 200 images. But how can the powerful ResNet50, which achieves pretty good results in classifying cats and dogs and keyboards and bicycles, drop its accuracy by 2% when faced with the simple task of differentiating a perfectly clean rectangle from a circle in the testing dataset? Let’s dig deeper. Remember the “occasional cases” of accuracy dropping to 90% during training, noted in the Results section? This “occasional” misclassification or uncertain classification persists in testing too (Figure 4).
Here is another example that shows a correctly classified small circle, but with a tiny fraction of uncertainty (Figure 5). The black and white plots in Figure 5 demonstrate what the net “sees” during three selected convolutions. One way to think of convolutions is as “image filters.” Note how the last image only “sees” the top part of the circle, for example. All of these convolutions, which can number in the hundreds in very deep networks, contribute to the final decision made by the network.
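To make the “image filter” view concrete, here is a minimal sketch of a single convolution applied to a synthetic bright disk. It is an illustration only: the hand-written Sobel-style kernel, the helper name `conv2d_valid`, and the 32 x 32 test image are assumptions, whereas the filters inside ResNet50 are learned during training and are not hand-designed like this one.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` without padding and return the response map."""
    # As in deep learning libraries, the kernel is not flipped (cross-correlation).
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


# Synthetic input: a bright disk (1.0) on a dark background (0.0).
size, center, radius = 32, 16, 8
yy, xx = np.ogrid[:size, :size]
disk = ((yy - center) ** 2 + (xx - center) ** 2 <= radius ** 2).astype(float)

# Hand-written kernel that responds where the image gets brighter going downward.
kernel = np.array([[-1.0, -2.0, -1.0],
                   [0.0, 0.0, 0.0],
                   [1.0, 2.0, 1.0]])

# A ReLU keeps only the positive responses, as in a typical convolutional block.
top_edge = np.maximum(conv2d_valid(disk, kernel), 0.0)
```

With this particular kernel and the ReLU, only the upper edge of the disk produces a non-zero response; the lower edge and the interior are suppressed. A single filter can therefore “see” only part of a shape, and a network combines many such partial views.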
Radiologists commonly ask what AI actually “sees.” It is possible to start understanding what AI “sees” by looking at how different parts of an image contribute to the final decision by the net. A technique called “Class Activation Mapping” allows generation of “maps” that show which parts of the image contributed the most to the final decision. The following three figures (Figures 6-8) encode the regions that contributed the most in red and the rest in yellow to green. There is zero contribution from completely black areas.
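The core computation behind a class activation map can be sketched in a few lines. The tool used to produce Figures 6-8 is not stated, so this is only a sketch under assumptions: the helper names `class_activation_map` and `upsample_nearest` are hypothetical, the clipping of negative values and the scaling to [0, 1] are display choices, and the tiny hand-built feature maps stand in for real network activations (for a standard ResNet50 with 224 x 224 inputs, the last convolutional output is 7 x 7 x 2048).

```python
import numpy as np


def class_activation_map(feature_maps, class_weights):
    """Weight each final-layer feature map by its class weight and sum them.

    feature_maps:  (H, W, K) activations of the last convolutional layer
    class_weights: (K,) weights linking the K pooled features to one class
    Returns an (H, W) map scaled to [0, 1].
    """
    cam = np.tensordot(feature_maps, class_weights, axes=([2], [0]))
    cam = np.maximum(cam, 0.0)  # display choice: keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam


def upsample_nearest(cam, factor):
    """Enlarge the coarse map by repeating each cell `factor` times in both axes."""
    return np.kron(cam, np.ones((factor, factor)))


# Stand-in activations: 7 x 7 grid, 4 feature maps, one strongly active cell.
features = np.zeros((7, 7, 4))
features[2, 3, 1] = 5.0
weights = np.array([0.0, 2.0, 0.0, 0.0])  # hypothetical weights for the "circle" class

cam = class_activation_map(features, weights)
heatmap = upsample_nearest(cam, 32)  # 7 * 32 = 224, matching the input image size
```

The enlarged map is then overlaid on the input image with a color scale, so that cells with a value near 1 mark the regions that contributed the most and cells with a value of 0 mark regions that contributed nothing.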
So, it seems that for this simple task the re-trained ResNet50 actually does pick up shape features. It also puts more weight on places where there is an abrupt change in the shape. Notice how the parts of the circle circumference that abruptly stop at the image boundary contribute more to the final decision in Figure 8. The net was only 77% certain that the example in Figure 8 was a circle.
If we look at this same circle by sampling three specific convolutions (Figure 9), in the same fashion as in Figure 5, we realize that the net only has usable information in one of them (the lighter inverse image). The dark original circle blended with the background in the other two convolutions.
There are obviously some limitations to this project. Only 200 images were used for training, 100 in each category. It would be easy to redo this project with larger datasets. However, clinical projects usually rely on small numbers of images as well. Since a couple of percentage points can be lost in a simple two-shape classification problem, accuracy in radiological AI projects can easily be affected to a greater degree when thousands of features contribute to any given real classification task.
Another difference from real-life radiology datasets is the availability of color input images in this project. The accuracy achieved by this project may be somewhat different if the same project were conducted on greyscale images only. When greyscale images are used in radiology projects, there are two ways to approach the implementation. No large publicly available nets exist at this time that were trained on radiological greyscale images. To use publicly available nets like ResNet, radiology researchers need to modify their images to be suitable as inputs into nets that were pre-trained on color images. The greyscale image has to be effectively “cloned” into three channels to allow use of public pre-trained nets. One way to look at it is that 2/3 of the input is “wasted” as duplicated information. The other approach is to train your own net. This requires a large input dataset, which is not always possible, and a long commitment of time to training, testing, and optimizing.
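The “cloning” step is simple to express in code. The following is a minimal sketch assuming NumPy arrays; the helper name `grey_to_three_channels` and the synthetic gradient image that stands in for a radiograph are hypothetical.

```python
import numpy as np


def grey_to_three_channels(grey):
    """Clone an (H, W) greyscale image into an (H, W, 3) array with identical channels."""
    return np.repeat(grey[:, :, np.newaxis], 3, axis=2)


# Synthetic stand-in for a greyscale radiograph: a horizontal intensity ramp.
grey = np.tile(np.linspace(0, 255, 224).astype(np.uint8), (224, 1))

rgb_like = grey_to_three_channels(grey)
```

The result has the 224 x 224 x 3 shape that nets pre-trained on color images expect, but all three channels carry the same values, which is the duplicated information described above.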
In summary, a ResNet50 re-trained with the transfer learning technique on a small input dataset achieved very high accuracy in differentiating a circle from a rectangle. The accuracy dropped by 2% on additional new testing cases.