Developing a hybrid algorithm to detect brain tumors from MRI images

Abstract

Background

Image processing technologies have been developed over the past two decades to help clinicians diagnose tumors from medical images. Computer-aided diagnosis (CAD) systems have been reported to increase clinicians' detection rate of positive cases by 10% and have been integrated into many medical imaging systems and technologies. This study aimed to develop a hybrid algorithm to help doctors detect brain tumors in magnetic resonance imaging (MRI) images.

Results

We reached a detection accuracy of 96.7% on the test images and designed a computer application that lets the user load an image and, if a tumor exists, shows its location, along with several additional display features.

Conclusions

This approach can be improved by using different segmentation techniques, extracting additional features, or using other classifiers.

Background

Brain cancer is considered one of the most dangerous and most common types of cancer. Hence, research has focused on improving the quality of brain images acquired non-invasively, namely by magnetic resonance imaging (MRI). MRI works by exciting the protons of the body's water molecules with radio waves; the protons respond to this stimulation by emitting radio-frequency energy [1].

Along with the significant development in the medical image processing field, image processing algorithms have become an essential part of the MRI device software, from simple operations such as contrast control, edge detection, and gray-level transformations, to image segmentation, classification, and brain image diagnosis [2].

In the past few years, several important studies have been conducted in this field. In 2019, a study entitled “Brain tumor classification using MRI images with K-nearest neighbor method” was published [3]. It detected brain tumors and classified them into three types using watershed segmentation and the K-nearest neighbor (KNN) classification algorithm, but the accuracy it reached was 89%, which is not sufficient.

This was followed in 2020 by a study entitled “Detection and classification of brain tumor using support vector machine-based GUI” [4]. It relied on the wavelet transform to extract features and used principal component analysis (PCA) to reduce their number. A graphical user interface (GUI) was also designed to display the processing results. One disadvantage of this study is that no accuracy was calculated to measure its success. In addition, the GUI displays the values of the extracted features, which do not mean much to the user.

The same year, a study entitled “Semantic segmentation of brain tumor MRI images and SVM classification using GLCM features” was published [5]. This study also used the watershed segmentation technique, extracted gray level co-occurrence matrix (GLCM) features, and then compared the classification results of six support vector machine (SVM) classifiers. The highest classification accuracy, 93%, was achieved by both the linear SVM and the quadratic SVM. One disadvantage of this study is that it used only 36 images for training, which is not an adequate number. Employing additional features, such as shape features, could also have raised the accuracy further.

The above research papers represent the baseline we attempted to outperform. We did so with a method that combines several algorithms and achieved higher accuracy than each of these studies.

The primary motivation for this research is the very large number of medical images, which take considerable time and effort to diagnose, and the occasional inability of the clinician to identify every suspicious area in an image. In addition, previous studies did not reach a satisfactory result. We therefore designed a hybrid algorithm, relying on a database of 150 cross-sectional MRI images of the brain.

We followed a methodology that consists of two main stages:

  1. Classifying magnetic resonance images of the brain into images with or without a tumor and displaying the tumor if it exists.

    This stage consists of several steps: preprocessing and enhancement, segmentation, feature extraction, and classification in a hybrid manner based on the combined results of three classifiers. This method reached an accuracy of 96.7%.

  2. Designing and programming a graphical user interface and a standalone application using MATLAB 2018a.

The importance of this research lies in its ability to process a large number of images in a short time, reducing the burden on the doctor. It may also support more accurate diagnoses by highlighting areas that are not readily visible to the naked eye, and the software can be used to train medical students to read images and identify suspicious areas.

Methods

Database

The dataset analyzed during the study is a standard dataset, publicly available on Kaggle at https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection [6]. It is organized into two subfolders: the “YES” subfolder, which contains 93 brain images with tumors, and the “NO” subfolder, which contains 57 healthy brain images.

We randomly divided the dataset into 120 training images (80%) and 30 test images (20%).
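The random 120/30 split can be sketched as follows. This is a minimal standard-library Python illustration, not the authors' MATLAB code; the file names and the fixed seed are assumptions introduced here so the split is repeatable.

```python
import random

def split_dataset(paths, n_train, seed=0):
    """Shuffle a list of image paths and split it into train and test parts."""
    rng = random.Random(seed)   # fixed seed (an assumption) for a repeatable split
    shuffled = list(paths)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names standing in for the 93 "YES" and 57 "NO" images.
images = [f"yes/{i}.jpg" for i in range(93)] + [f"no/{i}.jpg" for i in range(57)]
train, test = split_dataset(images, n_train=120)
print(len(train), len(test))  # 120 30
```

A plain shuffle does not guarantee that the YES/NO ratio is the same in both parts; a stratified split would be one way to enforce that, though the text does not say whether one was used.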

Work stages

The process begins with preprocessing and image enhancement, then cropping of the tumor area, followed by feature extraction and classification, and ends with the design of a standalone application to display the results. The flow chart in Fig. 1 shows the stages of work.

Fig. 1 The proposed method flow chart

Preprocessing and enhancement

Images entered by the user may vary in resolution, clarity, size, and color. Before processing, therefore, some adjustments are needed to standardize image quality and obtain better results in the subsequent stages.

First, we converted the image to grayscale and resized it to 300 × 300 pixels to standardize the images in terms of resolution and processing time.

We then noticed that the black background occupies a large part of the image and carries no useful information, which adds unnecessary computational load to the subsequent stages. We therefore cropped away the parts of the image outside the rectangle containing the useful information, using vertical and horizontal projections.
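One way to read the projection-based cropping is sketched below in plain Python: a row or column is kept only if it contains at least one pixel brighter than a background cutoff. The cutoff value and the function name `crop_to_content` are assumptions for illustration; the text does not state the exact rule used.

```python
def crop_to_content(image, threshold=10):
    """Crop a grayscale image (list of rows) to the bounding box of its content.

    The horizontal and vertical projections count, per row and per column,
    the pixels brighter than `threshold` (an assumed background cutoff).
    """
    row_proj = [sum(1 for p in row if p > threshold) for row in image]
    col_proj = [sum(1 for row in image if row[c] > threshold)
                for c in range(len(image[0]))]
    rows = [i for i, v in enumerate(row_proj) if v > 0]
    cols = [j for j, v in enumerate(col_proj) if v > 0]
    if not rows:                      # completely dark image: nothing to crop
        return image
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]

image = [
    [0, 0,   0,  0, 0],
    [0, 0,  90, 80, 0],
    [0, 70, 120, 0, 0],
    [0, 0,   0,  0, 0],
]
cropped = crop_to_content(image)
print(cropped)  # [[0, 90, 80], [70, 120, 0]]
```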

Afterward, we filtered the image using two types of filtering:

  • the median filter, which acts as a smoothing filter to suppress noise in the image [7].

  • unsharp masking, which acts as a boosting filter to reinforce and clarify the details in the image [8].
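The two filters can be illustrated with a small pure-Python sketch. The 3 × 3 window, the mean blur used inside the unsharp mask, and the `amount` weight are assumptions chosen for illustration; the text does not give the parameters actually used.

```python
from statistics import median

def window_filter(img, reducer, k=3):
    """Apply `reducer` to every k x k neighborhood (clipped at the borders)."""
    h, w, r = len(img), len(img[0]), k // 2
    return [[reducer([img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))])
             for x in range(w)]
            for y in range(h)]

def median_filter(img, k=3):
    """Replace each pixel with the median of its neighborhood."""
    return window_filter(img, median, k)

def unsharp_mask(img, amount=1.0, k=3):
    """sharpened = original + amount * (original - blurred), clipped to 0..255."""
    blurred = window_filter(img, lambda v: sum(v) / len(v), k)
    return [[min(255, max(0, round(p + amount * (p - b))))
             for p, b in zip(row, brow)]
            for row, brow in zip(img, blurred)]

noisy = [
    [10, 10, 10],
    [10, 255, 10],   # a single impulse ("salt") pixel
    [10, 10, 10],
]
print(median_filter(noisy)[1][1])           # 10: the impulse is removed
print(unsharp_mask([[50, 50, 150, 150]]))   # [[50, 17, 183, 150]]: the edge is steeper
```

The second output shows the characteristic overshoot of unsharp masking on both sides of an edge, which is what makes details look crisper.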

Segmentation

This stage aims to isolate the region of interest, i.e., the tumor area, so that features can be extracted from it for later classification. We achieved this in two steps:

  • Skull stripping:

This stage consists of four basic steps:

  • Thresholding using the average gray-level value method [7, 8].

  • Using the Connected Components algorithm to keep only the largest component (the brain tissue) [8].

  • Performing a closing operation on the image to fill all the holes [7, 8].

  • Retrieving original pixel values.
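The skull-stripping steps can be sketched in plain Python as follows. The morphological closing step is omitted for brevity, and 4-connectivity is an assumption; the function names are introduced here for illustration only.

```python
from collections import deque

def mean_threshold(img):
    """Binarize using the average gray level as the threshold."""
    pixels = [p for row in img for p in row]
    t = sum(pixels) / len(pixels)
    return [[p > t for p in row] for row in img]

def largest_component(mask):
    """Keep only the largest 4-connected foreground region of a boolean mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue, region = deque([(sy, sx)]), []
            while queue:                      # breadth-first flood fill
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    keep = [[False] * w for _ in range(h)]
    for y, x in best:
        keep[y][x] = True
    return keep

def apply_mask(img, mask):
    """Restore the original gray levels inside the mask; zero elsewhere."""
    return [[p if m else 0 for p, m in zip(row, mrow)]
            for row, mrow in zip(img, mask)]

img = [
    [200, 0,   0,   0, 0],   # isolated bright pixel: a small non-brain fragment
    [0,   0, 150, 160, 0],
    [0,   0, 170, 180, 0],
    [0,   0,   0,   0, 0],
]
brain = apply_mask(img, largest_component(mean_threshold(img)))
print(brain[0][0], brain[1][2])  # 0 150
```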

  • Segmenting tumor area:

This step is one of the most complicated steps of the extraction process because the tumor area overlaps with the brain tissue.

This stage also consists of four basic steps:

  • Gamma transformation to increase the contrast between the tumor and the brain tissue [8].

  • Thresholding using Otsu’s method to automatically binarize the Gamma image [9].

  • Morphological operations to delete all the unwanted parts and fill the gaps in the image.

  • Retrieving original pixel values.
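The gamma transformation and Otsu thresholding steps might look like the sketch below (plain Python, 8-bit gray levels). The gamma value of 2.0 is an assumption, since the text does not state the value used, and the morphological clean-up step is left out.

```python
def gamma_transform(img, gamma):
    """s = 255 * (r / 255) ** gamma; gamma > 1 darkens mid-tones more than highlights."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

def otsu_threshold(img):
    """Return the level t maximizing between-class variance (classes: <= t and > t)."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below t
    sum_bg = 0    # sum of their gray levels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

img = [
    [40,  60,  40],
    [60, 200, 220],
    [40, 220, 200],
]
enhanced = gamma_transform(img, gamma=2.0)   # gamma value is an assumption
t = otsu_threshold(enhanced)
mask = [[p > t for p in row] for row in enhanced]
tumor = [[p if m else 0 for p, m in zip(row, mrow)]   # retrieve original values
         for row, mrow in zip(img, mask)]
print(tumor)  # [[0, 0, 0], [0, 200, 220], [0, 220, 200]]
```

In this toy image the gamma step widens the gap between the darker "tissue" levels and the brighter "tumor" levels before Otsu's method picks the split.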

Feature extraction

After the acquisition of the tumor area, different elements of the image must be represented as a set of features to use in classification and diagnosis.

We extracted the following features:

Texture features [10, 11]:

  (a) GLCM features: contrast, correlation, energy, and homogeneity.

  (b) Mean.

  (c) Standard deviation.

  (d) Entropy.

  (e) Root Mean Square (RMS).

  (f) Variance.

  (g) Smoothness.

  (h) Skewness.

  (i) Kurtosis.

  (j) Local Binary Patterns (LBP) (10 features).
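To make the four GLCM features concrete, the sketch below builds a co-occurrence matrix for one pixel offset and derives contrast, correlation, energy, and homogeneity from it using their usual textbook definitions. The symmetric counting, the horizontal offset, and the quantization to four gray levels are assumptions for illustration, not settings reported in the text.

```python
import math

def glcm(img, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for offset (dx, dy)."""
    m = [[0.0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                a, b = img[y][x], img[ny][nx]
                m[a][b] += 1
                m[b][a] += 1          # count each pair in both directions
    total = sum(map(sum, m))
    return [[v / total for v in row] for row in m]

def glcm_features(p):
    """Contrast, correlation, energy, and homogeneity of a normalized GLCM."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, j, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for i, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Correlation is undefined for a constant image, hence the guard.
        "correlation": cov / (sd_i * sd_j) if sd_i and sd_j else float("nan"),
        "energy": sum(v * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

# A 4 x 4 image already quantized to gray levels 0..3.
img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
f = glcm_features(glcm(img, levels=4))
print(round(f["contrast"], 3), round(f["energy"], 3))  # 0.583 0.146
```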

Shape features:

  (a) Major Axis Length.

  (b) Solidity.

  (c) Perimeter.
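The first-order texture statistics listed in this section (mean, standard deviation, entropy, RMS, variance, smoothness, skewness, and kurtosis) can be computed from the gray levels of the segmented region as in the sketch below. The population (divide-by-n) normalization and the smoothness formula R = 1 − 1/(1 + σ²) are common textbook choices assumed here; the text does not state which variants were used.

```python
import math

def first_order_features(pixels):
    """First-order statistics of a flat list of gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n      # population variance
    std = math.sqrt(variance)
    rms = math.sqrt(sum(p * p for p in pixels) / n)
    # Standardized third and fourth moments; set to 0 for a flat region.
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0
    # Shannon entropy of the gray-level histogram, in bits.
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    # Smoothness R = 1 - 1 / (1 + variance): one common definition (assumed).
    smoothness = 1 - 1 / (1 + variance)
    return {
        "mean": mean, "std": std, "variance": variance, "rms": rms,
        "skewness": skewness, "kurtosis": kurtosis,
        "entropy": entropy, "smoothness": smoothness,
    }

features = first_order_features([10, 10, 20, 20])
print(features["mean"], features["variance"], features["entropy"])  # 15.0 25.0 1.0
```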

Classification

At this stage, we used the features extracted in the previous step to classify the image as one with a tumor (“YES”) or one without a tumor (“NO”).

We have trained three different classification algorithms:

  • Medium Gaussian SVM (MG-SVM) model.

  • Fine KNN model.

  • Cosine KNN model.

We noticed, however, that the images one classifier gets wrong can differ from the images another gets wrong. For instance, MG-SVM may misclassify an image that the other two classify correctly. Based on this observation, we devised a hybrid algorithm that relies on the three classifiers together: each image is classified by all three, and the final result is decided by majority vote, since it is unlikely that all or most of the classifiers will get the same image wrong. For example, if MG-SVM classifies an image as “NO” while Fine KNN and Cosine KNN classify it as “YES”, then the final result is “YES”.
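The majority-vote rule itself is simple enough to show in a few lines. The sketch below assumes each classifier has already produced a “YES” or “NO” label; the dictionary of votes mirrors the example in the text.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by most classifiers (an odd count avoids ties)."""
    label, _ = Counter(predictions).most_common(1)[0]
    return label

votes = {"MG-SVM": "NO", "Fine KNN": "YES", "Cosine KNN": "YES"}
print(majority_vote(votes.values()))  # YES
```

With three voters and two classes, a tie is impossible, so the final label is always well defined.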

Standalone application

We designed a GUI using App Designer (a tool bundled with MATLAB 2018a) and programmed it in an object-oriented style.

This interface enables the user to enter an image, and display the tumor if it exists, the result of each processing stage, the classification decision of each classifier, and the final classification result, in addition to other display features like contrast and brightness. We then converted this interface into a stand-alone application, which can be transferred, installed, and used on any computer, even if it does not have an installed version of MATLAB.

Evaluating the classification

After applying all stages, from enhancement and segmentation to feature extraction and classification, we had to test the accuracy of the hybrid classifier and compare it with the accuracy of each classifier separately. We achieved this by drawing a Confusion Matrix for each classifier, from which we derived several values that reflected the success and effectiveness of the classification. But before doing so, some basic concepts needed to be clarified:

  • True Positive (TP): images of an injured brain that have been classified as injured.

  • False Positive (FP): images of a healthy brain that have been classified as injured.

  • True Negative (TN): images of a healthy brain that have been classified as healthy.

  • False Negative (FN): images of an injured brain that have been classified as healthy [12].

Building on these concepts, we define the most significant values derived from the Confusion Matrix:

  • Sensitivity: the classifier's ability to identify disease states, calculated as the number of injured images that were classified as injured, over the total number of injured images [12], i.e.:

    $$\text{Sensitivity}=\frac{TP}{TP+FN}$$

  • Specificity: the classifier's ability to identify healthy states, calculated as the number of healthy images that were classified as healthy, over the total number of healthy images [12], i.e.:

    $$\text{Specificity}=\frac{TN}{TN+FP}$$

  • Positive Predictive Value (PPV): the number of injured images that were classified as injured, over the total number of images that were classified as injured [13], i.e.:

    $$\text{PPV}=\frac{TP}{TP+FP}$$

  • Negative Predictive Value (NPV): the number of healthy images that were classified as healthy, over the total number of images that were classified as healthy [13], i.e.:

    $$\text{NPV}=\frac{TN}{TN+FN}$$

  • Accuracy: the classifier's ability to classify correctly, calculated as the number of correctly classified images, over the total number of images [14], i.e.:

    $$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}$$
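The five values above follow directly from the four confusion-matrix counts, as the short sketch below shows. The labels in the example are made up for illustration and are not the study's results.

```python
def confusion_counts(y_true, y_pred, positive="YES"):
    """Count (TP, FP, TN, FN) for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Evaluation values defined above (assumes no denominator is zero)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Made-up labels for illustration only; these are not the paper's data.
y_true = ["YES", "YES", "YES", "NO", "NO", "NO"]
y_pred = ["YES", "YES", "NO", "NO", "NO", "YES"]
counts = confusion_counts(y_true, y_pred)
print(counts)  # (2, 1, 2, 1)
scores = metrics(*counts)
```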

Results

We will review the resulting images after applying each stage of the algorithm to an image from the database. In Fig. 2, we can see an original image from the database, and in Fig. 3, we may observe the result of the preprocessing and enhancement stage, which mainly includes the process of eliminating the black background and filtering.

Fig. 2 The original image

Fig. 3 The result of the 1st processing stage

Figure 4 illustrates the result of the skull-stripping process. This step matters because the gray levels of the skull and the tumor are very similar, so removing the skull pixels reduces the error.

Fig. 4 The result of skull stripping

Figure 5 depicts the result of segmenting the tumor after thresholding using Otsu’s method.

Fig. 5 The result of tumor segmentation

Figures 6 and 7 show the designed GUI. The interface enables the user to display the result of each work stage and the classification result of each classifier separately, in addition to the final classification result. It also provides a set of additional features such as tumor encircling and brightness/contrast control.

Fig. 6 The graphical user interface (GUI)

Fig. 7 The graphical user interface (GUI)

Discussion

After completing all the work stages, we evaluated the accuracy of the results by calculating the training and testing accuracy of each model separately:

  1. MG-SVM model: training accuracy of 92.3% and testing accuracy of 93.3%.

  2. Fine KNN model: training accuracy of 92.3% and testing accuracy of 90%.

  3. Cosine KNN model: training accuracy of 88.5% and testing accuracy of 90%.

By comparison, the proposed hybrid model reached a testing accuracy of 96.7%.

We plotted the confusion matrix of each classifier to calculate the essential values needed to evaluate the results (Fig. 8, Table 1).

Fig. 8 The hybrid classifier’s confusion matrix

Table 1 Values extracted from confusion matrices

Table 1 lists the values extracted from the Confusion Matrix of each classifier separately, along with those of the hybrid classifier.

The table shows that the final accuracy of the hybrid classifier is higher than the accuracy of each individual classifier, which indicates that the hybrid method improves on its constituent classifiers.

Conclusions

Because image processing is such a broad field, there is always room for development and improvement. We therefore intend to continue developing this work, whether by adopting better segmentation methods, extracting more features, or training additional classifiers such as Decision Trees and Logistic Regression and comparing the results.

The algorithm can also be developed to diagnose the type of tumor, according to the available datasets.

In the end, despite all the difficulties, we believe that we have been able to reach a good result and an innovative algorithm that has not been implemented before, hoping that it will be the beginning of better achievements and results in the future.

Availability of data and materials

The dataset used was downloaded from Kaggle: https://www.kaggle.com/navoneel/brain-mri-images-for-brain-tumor-detection.

Abbreviations

MRI:

Magnetic resonance imaging

CAD:

Computer-aided diagnosis

GLCM:

Gray level co-occurrence matrix

GUI:

Graphical user interface

PCA:

Principal component analysis

SVM:

Support vector machine

KNN:

K-nearest neighbor

RMS:

Root mean square

LBP:

Local binary patterns

MG-SVM:

Medium Gaussian SVM

TP:

True positive

TN:

True negative

FP:

False positive

FN:

False negative

PPV:

Positive predictive value

NPV:

Negative predictive value

References

  1. Brown MA, Semelka RC (2011) MRI: basic principles and applications. Wiley, Chichester

  2. Vijayakumar C, Gharpure DC (2011) Development of image-processing software for automatic segmentation of brain tumors in MR images. J Med Phys 36(3):147. https://doi.org/10.4103/0971-6203.83481

  3. Ramdlon R, Kusumaningtys E, Karlita T (2019) Brain tumor classification using MRI images with K-Nearest neighbor method. In: Presented at international electronics symposium (IES), Indonesia

  4. Khan IU, Akhtar S, Khan Sh (2020) Detection and classification of brain tumor using support vector machine based GUI. In: Presented at 7th international conference on signal processing and integrated networks, Spain

  5. Hussain A, Khunteta A (2020) Semantic segmentation of Brain tumor from MRI images and SVM classification using GLCM features. In: Presented at the second international conference on inventive research in computing applications, India

  6. Chakrabarty KN. https://www.kaggle.com/datasets/navoneel/brain-mri-images-for-brain-tumor-detection. Accessed 10 Mar 2021

  7. Gonzalez RC, Woods RE (2018) Digital image processing. Pearson, New York, NY

  8. Ammar M (2013) Medical image display and processing systems. Damascus University, Damascus

  9. Greensted A (2010) Otsu thresholding. The Lab Book Pages. Dr Andrew Greensted. http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html. Accessed 20 Aug 2021

  10. Saad G, Khadour A, Kanafani Q (2016) ANN and Adaboost application for automatic detection of microcalcifications in breast cancer. Egypt J Radiol Nucl Med 47(4):1803–1814

  11. Huang FH (2014) Research on classification of remote sensing image based on SVM including textural features. Appl Mech Mater 543–547:2559–2565.

  12. Marshall WJ et al (2020) Chapter 3: the interpretation of biochemical data. In: Clinical biochemistry: metabolic and clinical aspects. Churchill Livingstone/Elsevier, Edinburgh

  13. Monaghan TF et al (2021) Foundational statistical principles in medical research: Sensitivity, specificity, positive predictive value, and negative predictive value. Medicina 57(5):503–509

  14. Salman J, Saad G, Suliman M (2021) Using deep learning algorithms and computer vision in detecting human brain tumor. Tartous Univ J Res Sci Stud 5(10):1–17

Acknowledgements

The authors express their sincere appreciation to Tishreen University for their assistance.

Funding

No funding was obtained for this study.

Author information

Contributions

LB and SB: all steps of the study, from data acquisition, algorithm development, and GUI programming to manuscript drafting and approval of the final version. GS: supervision of all steps of the work, direction of the segmentation and feature extraction stages, critical revision of the manuscript, and approval of the final version. AS: supervision of all work stages and approval of the final version. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Luna Bitar.

Ethics declarations

Ethics approval and consent to participate

Our research was approved as a graduation project by the Department of Biomedical Engineering at Tishreen University.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Saad, G., Suliman, A., Bitar, L. et al. Developing a hybrid algorithm to detect brain tumors from MRI images. Egypt J Radiol Nucl Med 54, 14 (2023). https://doi.org/10.1186/s43055-023-00962-w

  • DOI: https://doi.org/10.1186/s43055-023-00962-w

Keywords

  • MRI
  • CAD
  • GLCM
  • SVM
  • KNN
  • Otsu’s method