COV-ADSX: An Automated Detection System using X-ray Images, Deep Learning, and XGBoost for COVID-19

Since the outbreak of the COVID-19 pandemic, researchers have been looking for different ways to diagnose the disease, and these efforts have led to a variety of solutions. One of the common methods of detecting infected people is chest radiography. In this paper, an Automated Detection System using X-ray images (COV-ADSX) is proposed, which employs a deep neural network and XGBoost to detect COVID-19. COV-ADSX was implemented using the Django web framework. The user uploads an X-ray image and views the COVID-19 detection result together with a heatmap of the image, which helps the expert evaluate the chest area more accurately.


Introduction
The COVID-19 virus was first reported in Wuhan, China, in late 2019 and spread rapidly throughout the world [1][2][3][4]. The symptoms of people with COVID-19 include fever, cough, sore throat, headache, fatigue, muscle ache, and shortness of breath [5,6]. As mentioned, one of the common methods of detecting infected people is chest radiography. According to previous studies [10][11][12], X-ray images of patients infected with COVID-19 contain important and useful information for detecting this virus. These images are therefore employed in the software introduced in this paper (i.e., COV-ADSX). COV-ADSX uses the algorithm proposed by Nasiri and Hasani [13], in which image features are extracted using the DenseNet169 [14] Deep Neural Network (DNN) and the extracted features are given as input to the XGBoost algorithm to perform the classification. In the method proposed by Nasiri and Hasani, a pre-trained DNN is employed without any further training of the network. COV-ADSX receives an X-ray image of a person's chest and uses the DNN to extract its features. It then gives the extracted features to the trained XGBoost model to determine whether the person is infected with COVID-19.

XGBoost
Extreme Gradient Boosting (XGBoost) is a gradient boosting algorithm enhanced for efficiency, versatility, and scalability [15][16][17]. In recent years, XGBoost has been widely used by researchers, and it has shown impressive performance in a variety of Machine Learning (ML) challenges [18,19]. XGBoost was proposed by Chen and Guestrin [20] as an ensemble algorithm based on gradient boosted decision trees [21]. Using a regularized objective function helps XGBoost reduce the complexity of the model and prevent overfitting [22][23][24].
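For reference, the regularized objective mentioned above can be written in the standard form given by Chen and Guestrin [20]. The notation is theirs and is not defined elsewhere in this paper: l is a differentiable loss, f_k is the k-th tree, T is the number of leaves of a tree, w is its vector of leaf weights, and gamma and lambda are regularization coefficients.

```latex
\mathcal{L}(\phi) = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}
```

The first sum measures how well the ensemble fits the training labels. The second penalizes trees with many leaves or large leaf weights, which is the mechanism that limits model complexity.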

Software features
COV-ADSX is based on deep learning and the XGBoost algorithm. In this software, deep learning is applied in the context of a web application. Because of the strength of the Python language in deep learning, Python and the Django web framework [25] were employed to implement COV-ADSX. The implemented software uses Python version 3.8.8 and Django version 3.2.8.
COV-ADSX includes four main steps: (1) receiving the user's image and sending it to the deep learning model (Fig. 1); (2) extracting the features of the image using DenseNet169; (3) giving the extracted features as input to the XGBoost algorithm and performing the classification, i.e., detecting whether the person has been infected with COVID-19 or not; (4) using the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm [26] to specify the decision area on a heatmap and displaying it to the user (Fig. 2).
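The four steps above can be sketched as a small pipeline with pluggable stages. This is a minimal illustration and not the COV-ADSX source code. The class names, the 0.5 threshold, the label strings, and the three toy stand-in functions are all assumptions made for the example. In the real software the stages would be DenseNet169, the trained XGBoost model, and Grad-CAM.

```python
import math
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]    # grayscale pixel rows (hypothetical representation)
Features = List[float]
Heatmap = List[List[float]]


@dataclass
class DetectionResult:
    label: str
    probability: float
    heatmap: Heatmap


@dataclass
class DetectionPipeline:
    """Chains steps 2-4; step 1 (the upload) supplies the image to run()."""
    extract_features: Callable[[Image], Features]   # step 2: DenseNet169 in the paper
    predict_proba: Callable[[Features], float]      # step 3: trained XGBoost in the paper
    explain: Callable[[Image], Heatmap]             # step 4: Grad-CAM in the paper
    threshold: float = 0.5                          # assumed decision cut-off

    def run(self, image: Image) -> DetectionResult:
        if not image or not image[0]:
            raise ValueError("empty image")
        features = self.extract_features(image)
        probability = self.predict_proba(features)
        label = "COVID-19" if probability >= self.threshold else "non-COVID-19"
        return DetectionResult(label, probability, self.explain(image))


# Toy stand-ins so the sketch runs without any ML library (not real models).
def toy_features(image: Image) -> Features:
    return [sum(row) / len(row) for row in image]      # mean intensity per row


def toy_classifier(features: Features) -> float:
    score = sum(features) / len(features)
    return 1.0 / (1.0 + math.exp(-10.0 * (score - 0.5)))  # logistic squashing


def toy_heatmap(image: Image) -> Heatmap:
    peak = max(max(row) for row in image)
    return [[p / peak if peak else 0.0 for p in row] for row in image]


pipeline = DetectionPipeline(toy_features, toy_classifier, toy_heatmap)
result = pipeline.run([[0.9, 0.8], [0.7, 1.0]])
print(result.label, round(result.probability, 3))
```

Keeping each stage behind a plain callable means the feature extractor, classifier, or explanation method can be swapped without touching the code that handles the upload.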
In the first step, the user uploads his/her radiographic image. The image received from the user is then passed to the deep learning model to extract its features. Note that the deep learning model's weights are saved as a file and loaded into the software. The extracted features are then passed to the XGBoost algorithm to perform the detection. The XGBoost algorithm was trained beforehand, and its trained model is employed in COV-ADSX. The software is fast and displays the result to the user within at most 10 s. According to the results obtained in [13], the accuracy of the model used in COV-ADSX on the ChestX-ray8 [27] dataset was 98.23%. After the image is classified by the XGBoost algorithm, the Grad-CAM algorithm is used in the last step to show the decision area on a heatmap. Heatmaps are very important to radiologists since they help them examine the chest area more accurately.
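The core computation of Grad-CAM [26] can be illustrated with a small, dependency-free sketch. It assumes that the activations of the last convolutional layer and the gradients of the class score with respect to them are already available. How COV-ADSX obtains these is not described here. The function name and the final rescaling to [0, 1] for display are assumptions of this example.

```python
from typing import List

FeatureMap = List[List[float]]


def grad_cam(activations: List[FeatureMap], gradients: List[FeatureMap]) -> FeatureMap:
    """Combine K feature maps A^k into one heatmap, ReLU(sum_k alpha_k * A^k).

    activations[k][i][j] is A^k at position (i, j); gradients[k][i][j] is the
    derivative of the class score with respect to that activation.
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must be non-empty and aligned")
    rows, cols = len(activations[0]), len(activations[0][0])

    # alpha_k: global average of the gradients over each feature map.
    alphas = [sum(sum(row) for row in g) / (rows * cols) for g in gradients]

    # Weighted sum of the feature maps.
    heat = [[0.0] * cols for _ in range(rows)]
    for alpha, fmap in zip(alphas, activations):
        for i in range(rows):
            for j in range(cols):
                heat[i][j] += alpha * fmap[i][j]

    # ReLU keeps only regions with a positive influence on the class score.
    heat = [[max(0.0, v) for v in row] for row in heat]

    # Rescale to [0, 1] for display (an assumption, not part of the formula).
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat


# Two 2x2 feature maps: the first supports the class, the second opposes it.
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))  # [[0.5, 0.0], [0.0, 1.0]]
```

In practice the resulting low-resolution map is upsampled to the size of the input image and overlaid on it, which yields the heatmap shown to the user.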

Impact overview
As mentioned in the previous sections, COV-ADSX can diagnose people infected with the COVID-19 virus using radiographic images. COV-ADSX has a very simple user interface and can be used by all classes of society. Anyone who has his/her chest X-ray (CXR) image can use this software to check whether he/she is infected with COVID-19. As can be seen in Fig. 1, to use the software, the user only needs to upload his/her radiographic image, and the results are then displayed to him/her (Fig. 2).
COV-ADSX provides useful information to medical specialists, especially radiologists. Its first advantage is that it diagnoses COVID-19 infection in a very short time and with high accuracy, whereas a PCR test normally takes much longer. In addition, it displays a heatmap that shows the expert which parts of the input radiographic image the deep learning model focuses on most. COV-ADSX can thus serve as an initial screening tool.
COV-ADSX can be connected to third-party applications in hospitals and medical centers. When a patient goes to the hospital and a CXR image is taken, the image is automatically given as input to COV-ADSX, and the software determines whether the person is infected with COVID-19. The COVID-19 test result is then recorded in the patient file. Moreover, the decision area specified on the heatmap is also stored in the patient file and shown to the medical specialist.
To further evaluate the performance of COV-ADSX, it was tested on a limited number of patients at Shafa Hospital of Semnan (Iran). The experimental results indicated that COV-ADSX achieves reliable results in a real environment.
Since COV-ADSX uses XGBoost for classification and XGBoost implements parallel processing [15], COV-ADSX has a lower computational cost than other ML methods. Furthermore, XGBoost has better performance than other ML techniques like Random Forest (RF) [28] and Support Vector Machine (SVM) [29], which leads to the high accuracy of COV-ADSX. The limitations and drawbacks of COV-ADSX include the use of a limited number of COVID-19 X-ray images (i.e., 125 samples) for the training of the DNN model, which may reduce the generalization of the software. Moreover, since XGBoost has many hyperparameters that need to be tuned, its tuning process is challenging, and overfitting is possible if the parameters are not tuned properly. Other ML methods like RF and SVM have fewer hyperparameters and are easier to tune. Another limitation of COV-ADSX is that it only accepts CXR images as input and does not support other radiography images like chest CT. This feature can be added in a future version of the software.
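To give a feel for why tuning XGBoost is challenging, the sketch below enumerates a small grid over seven of its hyperparameters. The parameter names are those exposed by the XGBoost scikit-learn interface. The candidate values are illustrative assumptions and are not the settings used in COV-ADSX or in [13].

```python
from itertools import product
from typing import Dict, Iterator, List

# Illustrative candidate values only; not the configuration used in the paper.
search_space: Dict[str, List[float]] = {
    "n_estimators": [100, 300, 500],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "gamma": [0.0, 1.0],
    "reg_lambda": [0.1, 1.0, 10.0],
}


def grid(space: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield every combination of candidate values as a parameter dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def grid_size(space: Dict[str, List[float]]) -> int:
    """Number of configurations an exhaustive search would have to train."""
    size = 1
    for candidates in space.values():
        size *= len(candidates)
    return size


print(grid_size(search_space))  # 1458
print(next(grid(search_space)))
```

Even with two or three candidates per parameter, an exhaustive search already needs 1458 training runs, and each run is usually repeated under cross-validation. Randomized or sequential search strategies are common ways to keep this cost manageable.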

Conclusion
In this paper, COV-ADSX is introduced. COV-ADSX can diagnose people infected with COVID-19 using radiographic images of a person's chest, a task that is impossible for a non-specialist. Moreover, chest radiography is the most accessible type of imaging and can easily be performed in most medical centers. The deep learning model used in COV-ADSX has an accuracy of 98.23% on the ChestX-ray8 dataset. Since COV-ADSX does not require DNN training, it has a high execution speed. Displaying the heatmap of the decision area to the user is another benefit of this software that can help radiologists better diagnose COVID-19.