Low-dose CT radiomics features-based neural networks predict lymphoma types

Abstract

Background

Fluorodeoxyglucose positron emission tomography (PET)–computed tomography (CT) is preferred for pretreatment staging and treatment planning in patients with lymphoma. This study aimed to train and validate neural networks (NN) for predicting lymphoma types using low-dose CT radiomics.

Results

Only 119 radiomics features were stable in the intraclass correlation coefficient and coefficient of variation analyses. Highly collinear features identified with the variance inflation factor were eliminated (n = 56). Twenty-four features were selected with the least absolute shrinkage and selection operator regression for network training. The NN used to differentiate Hodgkin lymphoma from non-Hodgkin lymphoma had 75.76% predictive accuracy in the validation set and an area under the curve (AUC) of 0.73 (95% CI 0.55–0.91). The NN used to differentiate B-cell lymphoma from T-cell lymphoma had 78.79% predictive accuracy and an AUC of 0.81 (95% CI 0.63–0.99).

Conclusions

In this study, which used the low-dose CT images of PET–CT scans, the predictions of the neural networks were near the lower bound of acceptable performance for discriminating Hodgkin from non-Hodgkin lymphoma and B-cell from T-cell lymphoma.

Background

Lymphomas are a heterogeneous group of malignancies affecting the lymphoid system [1]. The imaging findings of lymphoma vary because of the different behaviors of its subtypes, its relatively low incidence, and the variety of organs that may be involved. The final diagnosis is made through pathology. Tissue evaluation first differentiates between Hodgkin lymphoma (HL) and non-Hodgkin lymphoma (NHL), and then between B-cell lymphoma (BCL) and T-cell lymphoma (TCL). Afterward, different types of characteristic translocations are investigated on the basis of the cytogenetic and molecular evaluation results [2]. F-18 fluorodeoxyglucose (FDG) positron emission tomography–computed tomography (PET–CT) is preferred for pretreatment staging and treatment planning in patients with lymphoma.

Radiomics is a developing tool that converts radiological images into multidimensional data suitable for data screening, making them more useful for diagnostic purposes [3]. It uses computer-assisted techniques to reveal quantitative information that the naked eye cannot distinguish in the obtained radiological images, including detailed quantitative tissue feature parameters [4]. Texture analysis is a statistical method used in the quantitative evaluation of tissue images [5]. The radiomics workflow consists of steps such as obtaining consecutive images, determining the region of interest, preprocessing, and extracting and evaluating parameters [6].

This study aimed to train and validate neural networks (NN) for predicting lymphoma types using low-dose CT radiomics obtained from FDG PET–CT.

Methods

Ethical considerations

This retrospective model-development study was conducted after approval by the university's local ethics committee, which waived the requirement for written informed consent. The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. The STARD statement was followed for reporting [7].

Patient selection

Among the 330 patients whose lymphoma diagnosis was confirmed by biopsy between January 2014 and June 2020, 110 patients who underwent FDG PET–CT scan were included.

The low-dose CT data from the FDG PET–CT imaging of the cases were evaluated with radiomics by two independent observers to differentiate HL from NHL and BCL from TCL. The pathology results of the cases served as the “ground truth.”

Imaging from the vertex to the upper thigh was performed on a Philips Gemini TF 64-slice PET–CT scanner after at least 8 h of fasting, using the FDG radiopharmaceutical and a CT setting of 60 mAs.

Inclusion criteria

Patients who were diagnosed with lymphoma in our institution between January 2014 and June 2020 and who underwent PET–CT examination before treatment were included in the study, regardless of gender or age.

Exclusion criteria

Patients who underwent FDG PET–CT at another center, patients diagnosed at another center, patients with inaccessible imaging data or insufficient imaging quality, and patients with cutaneous lymphoma or without significant lymph node involvement suitable for segmentation were excluded from the study. The exclusions are presented in Fig. 1.

Fig. 1

Flowchart of the study. Patients excluded from the study are shown with the reasons for their exclusion

Segmentation of the target lymph nodes and extracting radiomics features

In each patient, the lymph node with the highest maximum standardized uptake value (SUVmax) was chosen as the target lymph node and was segmented on each axial slice of the low-dose CT images. The target lymph nodes were involved cervical, axillary, paraaortic, mesenteric, pelvic, and inguinal nodes. Resampling (1.0 × 1.0 × 1.0 mm) and normalization (μ ± 3σ) were performed as described in the literature [8]. Two independent observers segmented the target lymph node from the axial slices using the “Segment Editor” module in the 3D Slicer software (open-source imaging software for Mac OS X). Afterward, the Radiomics module of 3D Slicer, which is based on the “PyRadiomics” platform, was used to obtain radiomics data. All feature classes available in this module were included. Wavelet-based filters were utilized so that parameters other than shape could be studied from both the original image and eight wavelet decompositions (HLL, LHL, LHH, LLH, HLH, HHH, HHL, and LLL). A total of 851 parameters were recorded from each segmentation for each of the two observers.
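As a quick consistency check, the total of 851 can be reproduced with simple arithmetic. The sketch below assumes that the 14 shape features are computed once and that the same set of non-shape features is computed on nine images (the original plus eight wavelet decompositions). The per-image count of 93 is inferred from these totals and is not stated in the text.

```python
# Assumed breakdown of the 851 features (inferred, not stated by the authors).
SHAPE_FEATURES = 14        # shape features are computed once, from the mask
NON_SHAPE_PER_IMAGE = 93   # assumed number of non-shape features per image
WAVELET_DECOMPOSITIONS = ["HLL", "LHL", "LHH", "LLH", "HLH", "HHH", "HHL", "LLL"]

# The original image plus the eight wavelet decompositions.
image_types = ["original"] + WAVELET_DECOMPOSITIONS

total_features = SHAPE_FEATURES + NON_SHAPE_PER_IMAGE * len(image_types)
print(total_features)  # 851
```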

The pathology data were binary-coded for Hodgkin’s lymphoma (positive/negative) and B-cell lymphoma (positive/negative).

Statistical evaluation

The statistical analysis in this study was performed using Statistica (Version 13.5.0, TIBCO Software, Palo Alto, CA) and 3D Slicer (open-source imaging software, version 4.10.2 for Mac OS X).

Inter-observer segmentation overlap evaluation

The Dice coefficient was used for segmentation overlap analysis. Agreement was rated “weak” if the Dice coefficient was below 0.50, “moderate” if it was 0.50–0.74, “good” if it was 0.75–0.89, and “excellent” if it was ≥ 0.90 [9].
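The overlap measure and its categories can be sketched in a few lines. This is a minimal illustration that treats each segmentation as a set of voxel coordinates; the function names and the toy masks are hypothetical and are not taken from the study's software.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks are treated as perfect agreement
    return 2 * len(a & b) / (len(a) + len(b))


def dice_category(dice):
    """Map a Dice value to the categories used in the text."""
    if dice < 0.50:
        return "weak"
    if dice < 0.75:
        return "moderate"
    if dice < 0.90:
        return "good"
    return "excellent"


# Toy example: two 10 x 10 voxel masks, the second shifted by two voxels.
observer_1 = {(x, y, 0) for x in range(10) for y in range(10)}
observer_2 = {(x, y, 0) for x in range(2, 12) for y in range(10)}

score = dice_coefficient(observer_1, observer_2)
print(score, dice_category(score))  # 0.8 good
```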

Radiomics feature stability evaluation

Stable features are repeatable and have low variance. This study tested repeatability at the segmentation level with the Dice coefficient. The inter-observer reliability of the radiomics features was evaluated with intraclass correlation coefficient (ICC) analysis [10]. Reliability was considered “good” if the ICC value was 0.75–0.89 and “excellent” if it was > 0.89. Features with “good” or “excellent” reliability were retained. Radiomics features with “moderate” or “weak” reliability (ICC < 0.75) were eliminated from the study [10]. Radiomics features with a coefficient of variation (CoV) > 0.15 were also eliminated, as low-variance radiomics features are more reproducible [11].
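The two stability filters can be illustrated as follows. The text does not state which ICC form was used or over which values the CoV was computed, so this sketch assumes a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)) and a CoV defined as standard deviation divided by the absolute mean. The function names and toy numbers are hypothetical.

```python
from statistics import mean, stdev


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per case and one column per observer."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)              # between-case mean square
    ms_cols = ss_cols / (k - 1)              # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    return stdev(values) / abs(mean(values))


def is_stable(icc, cov, icc_min=0.75, cov_max=0.15):
    """Apply the two thresholds described in the text."""
    return icc >= icc_min and cov <= cov_max


# Toy feature measured by two observers on four cases.
noisy_feature = [[1, 2], [2, 2], [3, 4], [4, 3]]
icc_value = icc_2_1(noisy_feature)
cov_value = coefficient_of_variation([10, 11, 9, 10])
print(round(icc_value, 2), round(cov_value, 3), is_stable(icc_value, cov_value))
```

In this toy case the ICC falls below 0.75, so the feature would be dropped even though its CoV is low.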

Radiomics feature selection process

Radiomics features that were found to be highly collinear in the variance inflation factor (VIF) analysis were eliminated [12]. The remaining features were carried forward to the next step of the feature selection process.
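For readers unfamiliar with the VIF, the following sketch shows one way to compute it: the VIF of feature j equals 1 / (1 - R_j²), which is also the j-th diagonal element of the inverse of the feature correlation matrix. The text does not report the cut-off that was applied, so none is assumed here. The helper names and toy data are hypothetical.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long feature columns."""
    centered = []
    for column in columns:
        center = mean(column)
        centered.append([value - center for value in column])
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular (perfect collinearity).
    """
    size = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each feature: the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two uncorrelated toy features: VIF is 1 for both.
independent = variance_inflation_factors([[1, -1, 1, -1], [1, 1, -1, -1]])
# Two nearly proportional toy features: VIF is very large for both.
collinear = variance_inflation_factors(
    [[1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]]
)
print(independent, collinear)
```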

The least absolute shrinkage and selection operator (LASSO) method was proposed by Tibshirani [13]. LASSO was used to select the features most related to the outcome for the model. In this study, the data were analyzed separately for HL versus NHL and for BCL versus TCL with the LASSO Regression plugin. Fivefold cross-validation was used to tune the L1 penalty.
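The way an L1 penalty removes weak predictors can be shown with a small coordinate-descent sketch. This is not the plugin used in the study: it is a least-squares LASSO without an intercept, assuming centered features and a centered target, and it takes the penalty as given instead of tuning it by cross-validation. All names and numbers are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink a value toward zero by `penalty`, returning 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, sweeps=200):
    """Minimize (1 / 2n) * ||y - Xb||^2 + penalty * ||b||_1 by coordinate descent."""
    n, p = len(rows), len(rows[0])
    coefs = [0.0] * p
    residual = list(targets)  # equals y - Xb while all coefficients are zero
    for _ in range(sweeps):
        for j in range(p):
            column = [row[j] for row in rows]
            scale = sum(x * x for x in column) / n
            if scale == 0.0:
                continue  # a constant (all-zero after centering) feature
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(x * (r + x * coefs[j]) for x, r in zip(column, residual)) / n
            updated = soft_threshold(rho, penalty) / scale
            delta = updated - coefs[j]
            residual = [r - x * delta for x, r in zip(column, residual)]
            coefs[j] = updated
    return coefs


# Toy data: the target depends strongly on feature 0 and weakly on feature 1.
rows = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
targets = [2 * a + 0.2 * b for a, b in rows]

coefs = lasso_coordinate_descent(rows, targets, penalty=0.5)
print(coefs)  # the weak feature is shrunk exactly to zero
```

With this penalty the strong feature keeps a shrunken coefficient (about 1.6 instead of 2) and the weak feature is excluded, which is the selection behavior the text relies on.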

Artificial neural networks structure

The radiomics features selected by the LASSO regression were used to build neural networks for binary classification tasks (in a one-vs-all fashion). For each analysis, the software randomly sampled the data into three groups (“train,” 60%; “test,” 10%; and “validation,” 30%). The test set was used for hyper-parameter tuning with an early stopping algorithm, and the validation set was used as a holdout set. Multilayer perceptron (MLP) and radial basis function (RBF) networks were trained. These networks had two hidden layers and two bias neurons. The pipeline of the study is shown in Fig. 2. Graphs of coefficients against log lambda are provided for feature selection with LASSO regression (Additional files 1 and 2).
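The 60/10/30 split of the 110 cases can be sketched as below. The sampling routine of the statistical software is not described in the text, so the shuffling method, the seed, and the function name are assumptions made for illustration only.

```python
import random


def split_indices(n_cases, train_fraction=0.6, test_fraction=0.1, seed=0):
    """Randomly partition case indices into train, test, and validation sets."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(n_cases * train_fraction)
    n_test = round(n_cases * test_fraction)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # the remaining cases (about 30%)
    return train, test, validation


train, test, validation = split_indices(110)
print(len(train), len(test), len(validation))  # 66 11 33
```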

Fig. 2

The pipeline of the study. The lymph node with the highest SUVmax value in the whole body was selected. Two observers segmented the lesions. The inter-observer agreement of the segmentations was evaluated with the Dice coefficient (mean Dice: 0.807). The inter-observer agreement of the features was evaluated with the ICC, and features with an ICC < 0.75 were eliminated. Then, LASSO regression was performed for the final selection of features. Neural networks were trained and validated

Results

Demographic results

Of all the cases in this study, 13 (11.8%) were children (< 18 years), 69 (62.7%) were adults (18–65 years), and 28 (25.5%) were elderly (> 65 years). The overall mean age was 50.75 ± 20.86 years; the mean age for the children was 11.92 ± 2.81 years, that for the adults was 48.43 ± 12.51 years, and that for the elderly was 74.50 ± 5.87 years. As for the gender, 42 (38.2%) were female and 68 (61.8%) were male. The mean age of the females was 48.33 ± 21.30 years and that of the males was 52.25 ± 20.60 years. The detailed demographic data are presented in Table 1.

Table 1 Patient’s Demographics

Patients’ pathology results

Seventy-one patients (64.6%) were diagnosed with NHL and 39 (35.4%) with HL. Of the patients diagnosed with NHL, 60 (54.6% of all patients) were diagnosed with BCL and 11 (10%) with TCL. Eighteen cases (16.4%) of HL were nodular sclerosis, 14 (12.7%) were mixed cellularity, 2 (1.8%) were lymphocyte-depleted, and only 1 (0.9%) was lymphocyte-rich. The subtype of the remaining four patients with HL was not specified in the pathology report (Table 2).

Table 2 Pathology distribution

Segmentation overlap evaluation and inter-observer reliability results

Only 2 (1.8%) of the segmentations had a Dice coefficient < 0.50, 15 (13.6%) were in the “moderate” range, and 13 (11.8%) were in the “excellent” range. The remaining 80 (72.7%) segmentations were in the “good” range. The mean Dice coefficient for all segmentations (n = 110) was 0.81 ± 0.09 (95% CI 0.79–0.82).

The inter-observer reliability (ICC) of the 14 “shape” features ranged from 0.45 to 0.95. Only 33 (33.5%) of the features extracted from the original images had “good” or “excellent” inter-observer reliability, whereas among the wavelet-filtered images the proportion of features with “good” or “excellent” reliability was highest for the HLH filter (94%). Out of a total of 851 radiomics features, 250 (29.38%) had an ICC < 0.75, showing that most radiomics features were reproducible between observers. However, many of the remaining 601 features showed high variance in the CoV analysis and were eliminated (n = 482; 56.64% of all features).

Radiomics feature selection process results

After the ICC and CoV analyses, 119 radiomics features remained stable. Fifty-six features were eliminated due to high collinearity. Finally, among the remaining radiomics features, those most related to the outcomes were selected for the two neural network types with LASSO regression (Table 3).
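The feature-elimination counts reported above can be verified with simple arithmetic. The figure of 63 candidate features entering LASSO is derived here from the reported numbers and is not stated explicitly in the text.

```python
total_features = 851
low_icc = 250    # eliminated for ICC < 0.75
high_cov = 482   # eliminated for CoV > 0.15
high_vif = 56    # eliminated for collinearity

after_icc = total_features - low_icc
stable = after_icc - high_cov
lasso_candidates = stable - high_vif  # derived, not reported by the authors

print(after_icc, stable, lasso_candidates)  # 601 119 63
print(round(100 * low_icc / total_features, 2))   # 29.38
print(round(100 * high_cov / total_features, 2))  # 56.64
```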

Table 3 Selected predictor radiomics features analyzed by LASSO regression

Hodgkin/non-Hodgkin lymphoma type prediction results

The neural networks used to differentiate Hodgkin lymphoma from non-Hodgkin lymphoma were trained with 22 selected radiomics features together with age and gender information. The selected network was trained with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm and reached its most accurate results in the seventh cycle. In this network, the error function was sum of squares (“SOS”), the hidden activation function was “Logistic,” and the output activation function was “Identity.” Training set accuracy was 74.24%, and test set accuracy was 90.91%. The selected neural network had 75.76% predictive accuracy and an AUC of 0.73 (95% CI 0.55–0.91) in the validation set.
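For context on the AUC figure, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher network score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The scores are invented toy values, not the study's outputs, and the function name is hypothetical.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores: 3 positive cases and 4 negative cases.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2, 0.1]

auc = auc_from_scores(positives, negatives)
print(round(auc, 3))  # 0.917 (11 of 12 pairs ranked correctly)
```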

B-cell/T-cell lymphoma type prediction results

The neural networks used to differentiate B-cell lymphoma from T-cell lymphoma were trained with 22 selected radiomics features together with age and gender information. The selected network was trained with the BFGS algorithm and reached its most accurate results in the fifth cycle. In this network, the error function was “Entropy,” the hidden activation function was “Logistic,” and the output activation function was “Softmax.” Training set accuracy was 71.21%, and test set accuracy was 100%. The selected neural network had 78.79% predictive accuracy and an AUC of 0.81 (95% CI 0.63–0.99) in the validation set.
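The reported accuracies can be translated into whole-patient counts. The sketch below assumes that the 60/10/30 split of the 110 cases yields sets of 66, 11, and 33 patients; these set sizes are inferred from the split percentages and are not reported directly. Under that assumption, every reported percentage corresponds to an integer number of correctly classified patients.

```python
# Assumed set sizes for 110 cases split 60% / 10% / 30% (inferred, not reported).
SET_SIZES = {"train": 66, "test": 11, "validation": 33}

# Accuracies (%) as reported in the text for each network.
REPORTED = {
    "HL vs NHL": {"train": 74.24, "test": 90.91, "validation": 75.76},
    "BCL vs TCL": {"train": 71.21, "test": 100.0, "validation": 78.79},
}


def correct_count(accuracy_percent, set_size):
    """Nearest whole number of correctly classified patients."""
    return round(accuracy_percent / 100 * set_size)


counts = {
    task: {name: correct_count(acc, SET_SIZES[name]) for name, acc in sets.items()}
    for task, sets in REPORTED.items()
}

for task, by_set in counts.items():
    for name, count in by_set.items():
        # Recompute the percentage from the integer count as a consistency check.
        recomputed = round(100 * count / SET_SIZES[name], 2)
        print(task, name, f"{count}/{SET_SIZES[name]}", recomputed)
```

For example, a validation accuracy of 75.76% corresponds to 25 of 33 patients, and 78.79% corresponds to 26 of 33, which shows how coarse the accuracy estimates are at this sample size.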

Discussion

In this study, we investigated whether HL-NHL and BCL-TCL differentiation is possible with radiomics analysis of the low-dose CT images from FDG PET–CT. According to our findings, artificial neural networks could be applied to predict lymphoma type. We do not expect these and similar models to replace biopsy any time soon. Instead, these models may be expected to help triage patients as the workload increases and to provide a supportive second opinion for differential diagnosis.

A prior study investigated the role of mean density values in PET–CT of patients with lymphoma [14]. All the cases had a histopathological diagnosis and were classified as HL or NHL. The density of the malignant lesions was shown to be statistically significantly higher than that of the benign lesions. In addition, 20 HU was determined to be the cut-off for the differentiation of malignant and benign lesions [14]. This study showed that mean density values of segmentations are not accurate, reliable, and reproducible enough. Similar to their approach, we utilized CT-based features. However, radiomics enables sophisticated image analysis, which provides additional diagnostic and prognostic power [15].

Parvez et al. investigated the effects of metabolic tumor parameters and radiomics features on treatment response and survival in aggressive BCL in FDG PET–CT [16]. PET–CT images taken for BCL staging were included in their study, and the effects of the whole-body metabolic tumor volume and radiomics features on disease-free survival and overall survival were investigated. A correlation was found between the whole-body tumor volume and the treatment response. The texture features were found to be insufficient for predicting the treatment response, but it was found that they could be successfully used in predicting the presence of residual mass and survival [16]. Therapy responses and lymphoma outcomes could be evaluated using PET/CT modalities [17]. Milgrom et al. investigated the possibility of using PET radiomics in predicting refractory mediastinal HL [18]. The PET images of patients diagnosed with stage 1 and 2 HL were evaluated in their study. Variables such as metabolic tumor volume and total lesion glycolysis were also evaluated with a machine learning model. Using the five factors with the highest predictive power, the model predicted refractory disease with an AUC of 95.2%. This model may make it possible to devise a personalized treatment plan by stratifying early-stage HL. PET radiomics were performed in that study, and the evaluation included parameters related to FDG uptake [18]. Unlike the aforementioned studies, our study aimed to reach a pathological diagnosis through CT radiomics. In conclusion, it was shown in this study that artificial neural networks could be successful when they use only radiomics features, and that adding age information could enable good predictions. It was also shown that radiomics features-based models could be successfully used in BCL-TCL differentiation.

In 2020, Ou et al. investigated whether SUV and radiomics features in the PET–CT images of patients could differentiate breast cancer from breast lymphoma [19]. Their results showed that using SUV and radiomics features in PET images could differentiate breast cancer from lymphoma, with AUC values of 0.806 and 0.891. Few studies have addressed this topic, and the published radiomics studies have focused on differentiating lymphoma from other pathologies, including breast cancer and glioblastoma [17, 19]. Mayerhoefer et al. reported that central nervous system lymphoma can be differentiated from other tumors, such as glioblastoma, through functional imaging using radiomics features, and that prognostic predictions can be made [20]. They reported that standardization is needed in image reconstruction, post-processing, and segmentation so that the studies conducted on this subject can be compared with each other, and that this subject is open to study. Our study showed that the CT radiomics features from the PET–CT images of lymphomas, which are very heterogeneous diseases in terms of radiomics features, could be used in differentiating HL from NHL and BCL from TCL. The correlation of the radiomics features with histological subgroups or molecular-level markers can be investigated in larger patient groups, as there have been limited studies on this subject.

One recent study on the use of radiomics for diagnostic purposes was the systematic review conducted by Wang et al. in 2020 [17]. The data obtained in that study showed that radiomics could be a diagnostic and prognostic indicator for lymphoma. In addition, it emphasized the importance of well-designed studies in which lesion selection, segmentation, the effect of pathological patterns, and similar factors are better evaluated [17].

Study limitations

The main limitation of our study was that HL, NHL, and their subtypes were not equally distributed. The fact that PET data were not evaluated through radiomics and the relatively small number of patients (given that the data were divided into three sets: training 60%, test 10%, and validation 30%) were also limitations of our study. Increasing the sample size may yield more generalizable results. In addition, we did not categorize the patients according to their weight, which may affect the outcomes of the study by altering the received dose.

Conclusions

Histopathological examination is still considered the most valuable diagnostic method for lymphoma. HL-NHL and BCL-TCL differentiation showed acceptable performance with low-dose CT radiomics features from FDG PET–CT imaging. In the near future, these tools will not replace biopsy; however, they may be used for determining a patient's priority for diagnosis, reporting, and treatment.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Abbreviations

AUC: Area under the curve
BCL: B-cell lymphoma
CoV: Coefficient of variation
CT: Computed tomography
FDG: Fluorodeoxyglucose
HL: Hodgkin lymphoma
ICC: Intraclass correlation coefficient
PET–CT: Positron emission tomography–computed tomography
TCL: T-cell lymphoma
LASSO: Least absolute shrinkage and selection operator
MLP: Multilayer perceptron
NHL: Non-Hodgkin lymphoma
NN: Neural network
SUVmax: Maximum standardized uptake value
VIF: Variance inflation factor

References

  1. Frampas E (2013) Lymphomas: basic points that radiologists should know. Diagn Interv Imaging 94:131–144. https://doi.org/10.1016/j.diii.2012.11.006

  2. Wang L, Qin W, Huo Y-J et al (2020) Advances in targeted therapy for malignant lymphoma. Signal Transduct Target Ther. https://doi.org/10.1038/s41392-020-0113-2

  3. Gillies RJ, Kinahan PE, Hricak H (2016) Radiomics: images are more than pictures, they are data. Radiology 278:563–577. https://doi.org/10.1148/radiol.2015151169

  4. Kumar V, Gu Y, Basu S et al (2012) Radiomics: the process and the challenges. Magn Reson Imaging 30:1234–1248. https://doi.org/10.1016/j.mri.2012.06.010

  5. Kassner A, Thornhill RE (2010) Texture analysis: a review of neurologic MR imaging applications. Am J Neuroradiol 31:809–816. https://doi.org/10.3174/ajnr.a2061

  6. Larroza A, López-Lereu MP, Monmeneu JV et al (2018) Texture analysis of cardiac cine magnetic resonance imaging to detect nonviable segments in patients with chronic myocardial infarction. Med Phys 45:1471–1480. https://doi.org/10.1002/mp.12783

  7. Bossuyt PM, Reitsma JB, Bruns DE et al (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Radiology 277:826–832. https://doi.org/10.1148/radiol.2015151516

  8. The image biomarker standardisation initiative - IBSI 0.0.1dev documentation. https://ibsi.readthedocs.io/en/latest/index.html

  9. Parsad NM (2018) Deep Learning in Medical Imaging V. In: Medium. https://medium.datadriveninvestor.com/deep-learning-in-medical-imaging-3c1008431aaf

  10. Koo TK, Li MY (2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 15:155–163. https://doi.org/10.1016/j.jcm.2016.02.012

  11. Alberich-Bayarri A, Sourbron S, Golay X et al (2020) ESR statement on the validation of imaging biomarkers. Insights Imaging. https://doi.org/10.1186/s13244-020-00872-9

  12. Kim JH (2019) Multicollinearity and misleading statistical results. Korean J Anesthesiol 72:558–569. https://doi.org/10.4097/kja.19087

  13. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc Ser B 58:267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

  14. Flechsig P, Walker C, Kratochwil C et al (2017) Role of CT density in PET/CT-based assessment of lymphoma. Mol Imag Biol 20:641–649. https://doi.org/10.1007/s11307-017-1155-x

  15. Lambin P, Leijenaar RTH, Deist TM et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749–762. https://doi.org/10.1038/nrclinonc.2017.141

  16. Parvez A, Tau N, Hussey D et al (2018) 18F-FDG PET/CT metabolic tumor parameters and radiomics features in aggressive non-Hodgkin’s lymphoma as predictors of treatment outcome and survival. Ann Nucl Med 32:410–416. https://doi.org/10.1007/s12149-018-1260-1

  17. Wang H, Zhou Y, Li L et al (2020) Current status and quality of radiomics studies in lymphoma: a systematic review. Eur Radiol 30:6228–6240. https://doi.org/10.1007/s00330-020-06927-1

  18. Milgrom SA, Elhalawani H, Lee J et al (2019) A PET radiomics model to predict refractory mediastinal Hodgkin lymphoma. Sci Rep. https://doi.org/10.1038/s41598-018-37197-z

  19. Ou X, Zhang J, Wang J et al (2019) Radiomics based on 18F-FDG PET/CT could differentiate breast carcinoma from breast lymphoma using machine-learning approach: a preliminary study. Cancer Med 9:496–506. https://doi.org/10.1002/cam4.2711

  20. Mayerhoefer ME, Umutlu L, Schöder H (2021) Functional imaging using radiomic features in assessment of lymphoma. Methods 188:105–111. https://doi.org/10.1016/j.ymeth.2020.06.020

Acknowledgements

We would like to thank all members of our department for their support during research.

Funding

The authors declared that this study received no financial support.

Author information

Contributions

HE was involved in concept; HE, MBE contributed to design; BA, İC were involved in supervision; HE, AB, MBE, MA, MTT contributed to data collection and processing; AB, MBE were involved in analysis and/or interpretation; HE, MA contributed to writing.

Corresponding author

Correspondence to Aysenur Buz Yaşar.

Ethics declarations

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/ or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Written informed consent was waived by local ethics committee.

Consent for publication

We give our consent for the publication of identifiable details, which can include images and/or case history and/or details within the text to be published in the Journal.

Competing interests

No competing interests were declared by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Graph of coefficients against log lambda.

Additional file 2.

Graph of coefficients against log lambda.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Erturk, H., Eser, M.B., Buz Yaşar, A. et al. Low-dose CT radiomics features-based neural networks predict lymphoma types. Egypt J Radiol Nucl Med 54, 135 (2023). https://doi.org/10.1186/s43055-023-01084-z

  • DOI: https://doi.org/10.1186/s43055-023-01084-z
