Preoperative CT-based radiomics for diagnosing muscle invasion of bladder cancer

The primary objective of this study was to develop a computed tomography (CT)-based radiomics method to predict muscle invasion in bladder cancer (BCa) before surgery. A total of 269 patients with bladder cancer were divided into a training group (n = 188) and a validation group (n = 81). Radiomics features were extracted from the CT images of each patient. The least absolute shrinkage and selection operator (LASSO) technique was used to develop a radiomics signature. Logistic regression (LR), support vector machine (SVM), decision tree (DT), and artificial neural network (ANN) models were then applied to differentiate between non-muscle-invasive bladder cancer (NMIBC) and muscle-invasive bladder cancer (MIBC). Their performance was assessed using the area under the receiver operating characteristic curve (AUC), together with accuracy, specificity, and sensitivity. The radiomics signature predicted muscle invasion successfully. A total of 1036 radiomics features were extracted for the 269 patients, of which 16 were selected as the best predictors. The ANN classifier had the best performance, with an AUC of 0.950 in the validation set. The current work used machine learning and radiomics techniques to construct a prediction model for muscle invasion in bladder cancer. The ANN model produced promising results that may be useful in clinical diagnosis or therapy planning.


Background
Bladder cancer is considered to be the fourth most prevalent type of cancer. More than 60% of bladder cancer cases and 50% (0.17 million) of bladder cancer-associated deaths in 2017 occurred in developing countries [1]. Muscle invasion is often the primary deciding factor in clinical treatment [2]. Muscle-invasive bladder cancer (MIBC) is distinguished by tumors that have progressed into the bladder muscle. This type of bladder cancer has a higher risk of spreading to lymph nodes and other organs.
Non-muscle-invasive bladder cancers (NMIBC) comprise non-invasive papillary tumors, carcinoma in situ (CIS), and lamina propria-invading papillary tumors [3][4][5]. There are substantial differences between MIBC and NMIBC regarding therapy, prognosis, and management. If the primary tumor has not progressed beyond the muscle wall, a patient may potentially be treated with transurethral resection of the bladder tumor (TURBT) alone.
On the other hand, if a primary tumor invades beyond the muscle wall, the patient may need more extensive systemic therapy, which is more likely to succeed [6]. Additionally, transurethral resection of BCa runs a high risk of under-staging the tumor for the following reasons [7]: 1. Sampling error: The pathologist examines only a small portion of the tissue obtained through TURBT. Therefore, if the sampled area does not contain muscle-invasive cancer cells, the pathologist might miss the presence of muscle-invasive cancer elsewhere in the bladder. 2. Incomplete resection: The surgeon may leave behind some cancerous tissue during TURBT, which could lead to tumor recurrence and progression. 3. Artifacts and distortion: The histopathological evaluation of TURBT specimens may be affected by various artifacts such as thermal injury, cautery artifacts, crush artifacts, or tangential sectioning. Additionally, distortion of the biopsy specimen can occur during processing, making it difficult to interpret the histological findings accurately.
Thus, it is crucial to perform a precise pre-treatment evaluation of the degree of muscle invasion by the BCa to identify the most appropriate treatment for the individual patient. Cystoscopy and histological tissue biopsy or analysis of resection specimens are considered the standard methods for clinical staging and diagnosis of bladder cancer. However, biopsy findings may not represent the whole tumor, and alternative methods of diagnosis are needed.
In recent years, precision medicine has expanded the potential of standard digital imaging by combining it with big data, and this is one of the new frontiers in cancer treatment. Currently, computed tomography urography (CTU) is the most often-used imaging technique for patients undergoing initial evaluation or suspected of having bladder tumors [8]. This imaging technique allows a quick and comprehensive assessment of the urinary tract in a single examination. As a result, CTU has widely replaced intravenous urography (IVU) in recent years.
Magnetic resonance imaging (MRI) plays an important role in diagnosing and staging bladder cancer [9]. MRI can provide detailed images of the bladder and surrounding tissues, allowing doctors to assess the extent of the tumor and determine its stage. MRI is particularly useful for detecting muscle invasion, which is an important factor in determining the stage of bladder cancer. It can also help identify any nearby lymph node involvement or metastasis to distant organs.
Multiparametric MRI (mpMRI), which combines different MRI techniques to provide a more comprehensive assessment, has shown promise in improving the accuracy of diagnosing and staging bladder cancer. In some cases, mpMRI may also be used to guide biopsies or other interventions.
Radiomics, a field that has made significant progress in clinical practice, aims to extract quantitative features from the region of interest (ROI), potentially beyond the perception of the human eye, and to identify specific features that may facilitate prediction [10]. Therefore, the purpose of the current work was to develop CT-based radiomics models for the early detection of BCa muscle invasion.

Methods
Figure 1 presents the workflow followed in the current study.

Patient selection
The ethics committee of the Affiliated Hospital of Nantong University approved the present study and waived the need for patients to sign informed consent. All methods were carried out following relevant guidelines and regulations.
Patients with pathologically verified BCa who had undergone surgical excision at two different institutions between January 2014 and September 2019 were enrolled. The training set consisted of 188 patients treated at the Affiliated Hospital of Nantong University (China). The validation set consisted of 81 patients treated concurrently at Affiliated Yancheng No. 1 People's Hospital of Nanjing University Medical School (China).
The inclusion criteria were as follows: (1) For BCa, the availability of pathological data; (2) for lesion evaluation, complete CT images of all four phases of the CTU examination; and (3) CT examinations performed within 30 days before surgery.
The exclusion criteria were: (1) poor image quality of the CT or preoperative CTU; (2) lesions with hard-to-define edges; (3) immunotherapy or chemotherapy received before the CT examination; and (4) missing clinicopathological data.

Acquisition of CT images
Preoperative CTU was performed on all enrolled patients in both centers using a similar protocol but different equipment. After a 4-6 h fast, patients were instructed to drink roughly 1000 ml of water 45 min before the scan and to refrain from urinating until the scan was complete. All CTU examinations were performed on either a Philips Brilliance iCT or a Somatom Force CT scanner, both of which have 256 detectors.
Images were acquired in all phases. The scanning parameters were as follows: tube voltage of 120 kV with automated tube current modulation; collimation of 128 × 0.6 mm; section thickness of 0.625-1 mm; and section interval of 0.625-1 mm.

Preprocessing of the image
Before feature extraction, the images were resampled using a voxel-size resampling method to establish an isotropic dataset. The voxel size was normalized to 1.0 × 1.0 × 1.0 mm³, and the pixel values were rescaled to the range 0 to 1. This enabled comparisons between image data obtained from different patients and scanners. Laplacian of Gaussian (LoG) and wavelet decomposition (WAV) filters were used to derive additional features. All textural, histogram, and intensity features were subjected to these preprocessing steps [11].
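As a rough illustration of the two preprocessing steps, the sketch below computes the grid shape a volume would have after resampling to 1.0 mm isotropic voxels and rescales intensities to the range 0 to 1. The function names, the example spacing, and the use of min-max scaling are assumptions made for illustration; the study does not state which interpolation or rescaling method was used.

```python
def resampled_shape(shape, spacing_mm, target_mm=1.0):
    """Voxels per axis after resampling to isotropic voxels of size target_mm."""
    return tuple(max(1, round(n * s / target_mm)) for n, s in zip(shape, spacing_mm))


def minmax_rescale(values):
    """Linearly map intensities onto 0..1; a constant image maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical volume: 100 x 100 x 40 voxels with 0.5 x 0.5 x 2.0 mm spacing.
print(resampled_shape((100, 100, 40), (0.5, 0.5, 2.0)))
# Hypothetical intensities in Hounsfield units.
print(minmax_rescale([-100.0, 0.0, 100.0, 300.0]))
```

In practice the resampling itself would be delegated to an imaging library; only the bookkeeping is shown here.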

Human evaluation
One senior radiologist read all CTU images and assessed muscle invasion by visual inspection on a five-point Likert scale. Scores of 1, 2, 3, 4, and 5 points were used as threshold values.

Segmentation of ROI
The tumor's region of interest (ROI) was manually segmented in three dimensions using the ITK-SNAP tool (http://www.itksnap.org). To compute the Kappa coefficient, two radiologists inspected all of the images and manually drew the ROI, slice by slice, throughout the whole tumor and slightly around the visible perimeter of the lesion (Fig. 2). The accurate segmentation of bladder tumors in CT exams relies on a combination of technical expertise and clinical judgment: 1. The location, size, number, and contour of the tumors were carefully observed, and the ROI was delineated along the boundary of the tumor area as closely as possible. 2. When areas of necrosis or cystic change appeared within the tumor, these areas were also included in the delineated ROI to ensure that the internal information of the lesion was not missed. 3. In MIBC, only the affected portion was segmented; in NMIBC, the whole tumor was segmented while avoiding the bladder wall. 4. When bladder cancer extended beyond the outer layer, the surrounding tissues and organs were included within the ROI. The average segmentation time was 10 min.

Fig. 1 Step-wise presentation of the workflow in the present study

Fig. 2 Manual segmentation of tumor contours on a CTU image

Radiomics extraction of features
Following the segmentation of the ROI, feature extraction was carried out using the Python program Pyradiomics (version 2.2.0, available at http://www.radiomics.io/pyradiomics.html) [12]. Feature extraction took 20-30 min on average. Inter-reader agreement for CT features was assessed by percent (%) of concordant cases and Kappa of agreement, with 95% confidence intervals (CI). Conventionally, a value of Kappa lower than 0.20 is considered poor agreement, values between 0.21 and 0.40 fair, between 0.41 and 0.60 moderate, between 0.61 and 0.80 substantial, and values greater than 0.80 are considered almost perfect agreement [13]. This demonstrated that the features acquired by different radiologists were not notably dissimilar from one another. The extracted features were considered relatively consistent when the Kappa coefficient was > 0.60.
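The agreement statistic described above can be sketched as follows for two readers assigning categorical labels to the same cases. This is a minimal illustration of Cohen's Kappa and of the conventional interpretation bands quoted in the text; the function names and the example ratings are hypothetical, and the study's own computation on radiomics features may have differed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map Kappa to the conventional bands quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings from two readers on ten cases (1 = invasive, 0 = not).
reader_1 = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
reader_2 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3), agreement_band(kappa))
```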

Feature selection and radiomic model construction
The training set features most strongly associated with muscle invasion were identified using LASSO regression. To avoid over-fitting, tenfold cross-validation was also carried out [14]. LASSO is a regression technique that can shrink the coefficients of features unrelated to muscle invasion to exactly zero. Therefore, only features with non-zero coefficients were retained. The optimal tuning parameter was chosen using the minimum criterion (the minimum lambda) [15]. Feature selection was fast, taking about 10 s on average. The radiomic signature was established as a linear combination of the selected features, each weighted by its non-zero coefficient.
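The two ideas in this paragraph, shrinkage of coefficients to exactly zero and a signature formed as a weighted sum, can be sketched as below. The soft-thresholding operator is the coordinate-wise update used by LASSO solvers, shown here in isolation rather than as a full fit. The feature names, coefficient values, and penalty are hypothetical and are not taken from the study.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def rad_score(features, coefficients, intercept=0.0):
    """Linear radiomics signature over the features with non-zero coefficients."""
    return intercept + sum(
        coef * features[name] for name, coef in coefficients.items() if coef != 0.0
    )


# Hypothetical unpenalized coefficients for three standardized features.
raw = {"glcm_contrast": 0.90, "glrlm_lrlge": -0.35, "shape_sphericity": 0.05}
lam = 0.10  # hypothetical penalty strength
shrunk = {name: soft_threshold(c, lam) for name, c in raw.items()}
selected = {name: c for name, c in shrunk.items() if c != 0.0}

# Hypothetical standardized feature values for one patient.
patient = {"glcm_contrast": 1.2, "glrlm_lrlge": -0.4, "shape_sphericity": 2.0}
print(sorted(selected))
print(round(rad_score(patient, selected), 3))
```

The weakest feature drops out of the signature entirely, which is the behaviour that makes LASSO usable for feature selection.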

Model evaluation
The best parameter configuration for each model was found using the training set and tenfold cross-validation. The four models were then tested on the validation set to evaluate their performance. The ability of each classifier to make accurate predictions was assessed using the ROC curve, and their respective AUCs were determined for comparison. Sensitivity, specificity, and accuracy were calculated for both sets at the cutoff determined by the Youden index [10]. The radiomics classifier with the highest AUC was considered the best candidate for the radiomics signature. Decision curve analysis (DCA) was also used to assess the clinical relevance of the best model.
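The evaluation metrics named above can be sketched in a few lines. The example computes the AUC as the probability that a positive case receives a higher score than a negative one, then scans the observed scores for the cutoff that maximizes the Youden index J = sensitivity + specificity - 1. Scores above the cutoff are treated as positive, matching the high-risk rule reported later in the paper; the function names and example data are hypothetical.

```python
def auc_rank(scores, labels):
    """AUC as P(positive score > negative score), with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_at(scores, labels, cutoff):
    """Counts (tp, fp, tn, fn) when scores above the cutoff are called positive."""
    tp = sum(s > cutoff and y == 1 for s, y in zip(scores, labels))
    fp = sum(s > cutoff and y == 0 for s, y in zip(scores, labels))
    tn = sum(s <= cutoff and y == 0 for s, y in zip(scores, labels))
    fn = sum(s <= cutoff and y == 1 for s, y in zip(scores, labels))
    return tp, fp, tn, fn


def youden_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity, accuracy) at the maximum J."""
    if 0 not in labels or 1 not in labels:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_at(scores, labels, cutoff)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec, (tp + tn) / len(labels))
    return best


# Hypothetical model outputs and true labels (1 = MIBC, 0 = NMIBC).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(auc_rank(scores, labels))
print(youden_cutoff(scores, labels))
```

When several cutoffs share the same J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.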

Statistical analysis
R software version 4.1.2 (Vienna, Austria) was used for all statistical analyses. The non-normally distributed characteristics were analyzed using the Mann-Whitney U test, and the categorical variables were compared using either the chi-squared test or Fisher's test. The ideal cutoff values were determined using ROC curve analysis based on the highest Youden index value. DeLong's test was applied to compare AUCs. LASSO regression was conducted using the 'glmnet' package, the 'pROC' package was used to plot the ROC curves, and the 'rmda' package was used for DCA. The data were regarded as statistically significant if the two-sided P value was less than 0.05.

Results

Clinicopathological data
The study cohort's clinical characteristics are shown in Table 1. Muscle-invasive bladder cancer was present in 48.4% (91/188) and 46.9% (38/81) of patients in the training and validation sets, respectively. No statistically significant differences were observed between the two study groups regarding bladder cancer invasion (P = 0.9271). Moreover, there were no statistically significant variations in the age distribution of patients between the two groups (P = 0.9332). The findings revealed that the baseline characteristics of the two groups were not significantly different.
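The comparison of invasion rates between the two sets can be approximated from the reported counts. The sketch below runs a chi-squared test on the 2 × 2 table with Yates continuity correction, which is an assumption: the paper names the chi-squared test but not the variant, although continuity correction is the default for 2 × 2 tables in R's chisq.test. The NMIBC counts (97 and 43) are derived from the reported totals. Under these assumptions the result should land close to the reported P value.

```python
import math


def chi2_2x2_yates(a, b, c, d):
    """Yates-corrected chi-squared statistic and P value (1 df) for [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = max(0.0, abs(observed[i][j] - expected) - 0.5)
            stat += diff * diff / expected
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Training set: 91 MIBC, 97 NMIBC (n = 188); validation set: 38 MIBC, 43 NMIBC (n = 81).
stat, p = chi2_2x2_yates(91, 97, 38, 43)
print(round(stat, 4), round(p, 4))
```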

Human evaluation
For the training and validation sets, the accuracy of the human evaluation was calculated to be 0.667 and 0.727, with specificities of 0.588 and 0.643 and sensitivities of 0.727 and 0.789, respectively, using "3" as the threshold value. Figure 3 shows the ROC curve of the 5-point Likert scale scores in the validation set.

Feature extraction and selection
A total of 1036 features were extracted from both original and filtered images. These features were then placed into subgroups, including first-order statistics, shape-based features, and texture features such as the grey-level co-occurrence matrix (GLCM), grey-level run length matrix (GLRLM), and grey-level size zone matrix (GLSZM). Details of the radiomic features are available in Additional file 1: Supplementary Material S1.
The mean Kappa coefficient of these features was 0.783 (95% CI 0.667-0.886), indicating that the inter-observer repeatability of the process was satisfactory. Therefore, the initial feature extraction outcomes served as the basis for all subsequent statistical analyses. Using the training data, the LASSO regression model reduced the number of radiomics features from 1036 to 16 (Fig. 4).
Figure 5 shows the descriptions of the 16 selected features by Shapley additive explanation (SHAP) analysis.

Model building and evaluation
The 16 radiomics features were used to develop the diagnostic model for the preoperative prediction of muscle invasion in BCa using the training set. Four predictive models were built: LR, ANN, SVM, and DT. We performed grid-search cross-validation to identify the optimal parameters for each of these machine-learning algorithms. After that, the validation set was used to test the models. We then selected the most suitable algorithm for developing the final classification model based on how well the models performed.
Figure 6 shows the ROC curves of four different CT-based classifiers in the validation set. Using the selected radiomics features, the ANN model achieved the best performance with an AUC of 0.950 (95% CI 0.889-0.998) (Table 2). According to the results of the DeLong test, the ANN model performed better than the other models for the preoperative prediction of muscle invasion. In the ANN model, using the optimal cut-off value of 0.730 selected by the Youden J index applied to the validation set, patients were classified into low-risk (n = 43; ≤ 0.730) and high-risk (n = 38; > 0.730) groups. The two groups differed significantly (P < 0.05). The sensitivity, specificity, and accuracy of the ANN model were 0.933, 0.986, and 0.968, respectively, in the training set and 0.897, 0.928, and 0.913, respectively, in the validation set. Thus, human evaluation was inferior to the ANN model in the validation set, with lower accuracy (0.727 vs. 0.913), sensitivity (0.789 vs. 0.897), and specificity (0.643 vs. 0.928).

Clinical applications
The radiomics model's DCA is shown in Fig. 7. The y-axis represents the net benefit, and the x-axis represents the threshold probability. The DCA showed that the ANN model, which delivered a net benefit over the "treat-all" or "treat-none" strategy at a threshold probability > 4.40%, had the highest clinical net benefit in the validation set.
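For readers unfamiliar with decision curves, the quantity plotted on the y-axis can be sketched as follows. At a threshold probability pt, the net benefit of a model is TP/n - FP/n × pt/(1 - pt), the "treat-all" strategy treats everyone, and "treat-none" has a net benefit of zero. This is a minimal sketch with hypothetical predictions, not the 'rmda' implementation used in the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(p >= threshold and y == 1 for p, y in zip(probs, labels))
    fp = sum(p >= threshold and y == 0 for p, y in zip(probs, labels))
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is treated regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and true labels (1 = MIBC, 0 = NMIBC).
probs = [0.9, 0.8, 0.75, 0.3, 0.2, 0.1, 0.6, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
for pt in (0.25, 0.5, 0.75):
    print(pt, round(net_benefit(probs, labels, pt), 3),
          round(net_benefit_treat_all(labels, pt), 3))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero.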

Discussion
Bladder cancer is a type of cancer that affects the urinary system. Infiltration of the muscle layer must be determined before diagnosis and treatment [16]. Many studies in recent decades have been conducted to develop reliable and predictive techniques for distinguishing MIBC from NMIBC. However, cystoscopic biopsy carries risks of urethral infection, injury, and implantation metastasis [17]. The main advantage of transurethral resection (TUR) is its 100% specificity because, once muscle invasion is found in a TUR specimen, it is established regardless of the pathological result at radical cystectomy (RC) [18]. However, 27-51% of tumors diagnosed as NMIBC by TUR are subsequently upstaged to MIBC [19]. Few studies have explicitly investigated muscle invasion in people with bladder cancer. As a result, this study aimed to establish a radiomics signature that accurately predicts muscle invasion in patients by utilizing the most effective radiomics features.

Fig. 5 The Shapley value of each feature
A three-dimensional (3D) ROI was used in this study, which captures more of the tumor than a two-dimensional (2D) ROI by considering all available slices. We then extracted 1036 radiomic features from 269 patients. We performed LASSO regression to eliminate redundancy and finally identified 16 optimal features to avoid overfitting. Four radiomic models were then developed using CT scans as the data source, and these models were used to predict muscle invasion by bladder cancer. Comparison of the AUC values from the ROC curves showed that the ANN model had the best diagnostic performance for predicting BCa muscle invasion non-invasively and preoperatively. The AUC of the model was 0.950 (95% CI 0.889-0.998) when applied to the validation set.
We then used DCA to assess the clinical importance of each model. According to the decision curve, the ANN model was superior to the others in predicting the presence of muscle invasion.
CT has its limitations due to its low soft-tissue resolution. Multiparametric MRI (mpMRI) can provide images with high spatial and contrast resolution, depict regional anatomic structures, and identify the layers of the urinary bladder, contributing to a reduction in staging errors. Additionally, to standardize the scanning and reporting requirements for mpMRI, the Vesical Imaging Reporting And Data System (VI-RADS) was introduced in 2018 [20].
MRI-based modalities provide significant benefits compared with CT. Firstly, patients are not exposed to ionizing radiation; thus, various sequences can be acquired without concern regarding the effects of long-term radiation exposure. Secondly, each sequence can provide new biological and anatomic information that might improve the classification algorithm's accuracy [21]. Conventional T1- and T2-weighted imaging and more sophisticated sequences such as dynamic contrast-enhanced (DCE) imaging and diffusion-weighted imaging (DWI) have shown consistent results for assessing BCa muscle invasion [22,23].
However, due to the high cost of MRI, it is not as widely utilized in China as CT. CT urography is still the primary imaging method to diagnose bladder cancer in China. The current study employed CT texture analysis (CTTA) to determine BCa muscle invasion.
Consistent with a previous study [25], the results of the present study showed that a CT-based radiomics signature could be used to predict BCa muscle invasion before surgery. However, the discrimination reported in that study was not as strong as in the current study: in its internal validation set, the accuracy, sensitivity, specificity, and AUC were 0.796, 0.733, 0.810, and 0.861, respectively, while in its external validation set, they were 0.747, 0.710, 0.773, and 0.791, respectively.
The 16 texture features are also related to the extent of bladder cancer. For example, Long Run Low Gray-Level Emphasis (LRLGE) provides information about the spatial distribution of runs of consecutive pixels with the same gray level, in one or more directions, in 2 or 3 dimensions [26].
The originality of this work can be summarized as follows: (1) This study explored the capacity of a deep learning (DL) model based on CT images to determine the status of muscle invasiveness of BCa before surgery. (2) We used four machine learning algorithms to develop four predictive models simultaneously. The one with the highest AUC and net benefit was selected using discriminant analysis. (3) Unlike other studies that used cross-validation or single-centre validation, this study used an external validation cohort enrolled from a different hospital, which allowed us to investigate the generalizability of the ANN model.
There are still some limitations to the present study. First, the sample size was relatively small, potentially leading to bias. Second, the study was retrospective and patient selection was not randomized, which may have biased the results. Third, the tumor boundary was manually drawn, and interference due to the partial volume effect could not be avoided altogether. Finally, it is impossible to completely avoid the impact of beam-hardening artefacts or the contrast enhancement effects caused by the various CTU procedures.

Conclusions
In this study, four prediction models for assessing muscle invasion in patients with bladder cancer were constructed using radiomics and machine learning algorithms. The ANN model outperformed all other analyzed models. This method has significant implications for personalized medicine and may assist urologists and radiologists in differentiating between MIBC and NMIBC with greater accuracy. In future work, a more accurate and comprehensive prediction model will be developed from the key findings of the current study by incorporating biochemical and clinical markers.

Fig. 3 Five-point Likert scale ROC curve in the validation set

Fig. 4 The LASSO regression model was used on the training data set to select radiomics features. a To tune the parameter (λ) of the LASSO model, a tenfold CV procedure was carried out 50 times. Dotted vertical lines are drawn at the optimal values according to the minimum criterion and the 1-standard-error (1-SE) criterion. A λ value of 0.0876, with log (λ) = −2.435, was selected. b LASSO coefficient profiles of the 1036 features, plotted against the log (λ) sequence. A vertical line is drawn at the value selected using tenfold cross-validation, where the optimal λ resulted in 16 non-zero coefficients

Fig. 7 Decision curve analysis in the validation set (n = 81). The y-axis represents the net benefit, and the x-axis shows the threshold probability

Table 1
The clinical features and incidence (%) of muscle-invasive bladder cancer in the training (n = 188) and validation (n = 81) sets
# Non-parametric two-sample Wilcoxon test for continuous variables; chi-squared test for categorical variables

Table 2
Diagnostic precision of various models in the validation set of invasive lesions. The ANN's utility resulted in the best diagnostic performance with fewer false positives and negatives, which might be considered more acceptable in future clinical practice.