Reference | Methodology specifications | Parameters studied | Main results and conclusions |
---|---|---|---|
Mackin [20] | CCR phantom; NSCLC patient; 2 CT vendors; IBEX radiomics software | mAs (250–30); noise; kVp | The pattern of feature variation in the GE and Toshiba CT scan units is similar. Smoothing does not affect the feature results. Noise is the main factor in changing features. Among mAs, kVp, and pitch, mAs seems to have the most significant impact on radiomics reproducibility. |
Midya [21] | Anthropomorphic phantom (consisting of five simulated tissue types); in-house radiomics software | mAs (50, 100, 200, 300, 400, and 500 mA); noise index (NI) levels (12, 14, 16, 18, and 20); reconstruction (from FBP (0% ASIR) to 100% ASIR in increments of 10%) | Noise, tube current, and reconstruction algorithm significantly affect the reproducibility of radiomics results. Reducing image noise increases the reproducibility of features. The FBP algorithm yields the highest reproducibility. As the weight of the ASIR algorithm increased from 0 to 100%, the number of reproducible features decreased because the image noise gradually increased. 8–25% of features are reproducible across mAs variation. 19–25% of features are reproducible across noise variation. |
Berenguer [22] | Anthropomorphic pelvic phantom; CCR phantom; IBEX radiomics software | Test–retest (5 CT vendors); pitch; reconstruction kernel; mAs; kVp; FOV | The more limited the range of variation of the scan parameters, the higher the reproducibility. In intra-scanner analysis, changing the kernel has the most impact on reproducibility and changing the pitch has the least. In test–retest, about 91% of features were reproducible. In intra-scanner analysis, about 89% of features were reproducible when the pitch factor changed, and 43% were reproducible when the reconstruction algorithm changed. In inter-scanner analysis, about 85% of features were reproducible in wood (heterogeneous), and only 15% were reproducible in polyurethane (homogeneous). Overall, ten of 177 features remained reproducible after changing all parameters. |
Buch [23] | In-house phantom; two different CT brands; in-house radiomics software | kVp (80–140); mAs (80–140); pitch; section thickness (0.625, 1.25, 2.5, 5 mm); acquisition mode (axial vs. helical) | Changing kVp and mA has less impact on reproducibility than changing other scan parameters. The features change significantly when the pitch, acquisition mode, or section thickness changes. |
Fave [24] | 20 NSCLC patients; IBEX radiomics software | mA (100, 150, 200, 250); kVp (80, 100, 120, 140); 2D vs. 3D; respiratory phase (10 phases) | Changing the tube voltage (kVp) has little impact on feature values, while changing the mA leads to significant changes. Reproducibility in intra-patient studies is higher than in inter-patient studies. Adding Gaussian noise to the images does not change the feature values. When mA changes, about 43% of features (10 of 23) remain reproducible. When the respiratory phase (motion) changes, about 65% of features (15 of 23) remain reproducible. When the dimensionality of ROI segmentation (2D vs. 3D) changes, about 65% of features (15 of 23) remain reproducible. |
Gao [25] | 105 pulmonary patients; Pyradiomics radiomics software | Dose (low-dose vs. conventional-dose CT); CTDIvol of LDCT ~ 2 mGy; CTDIvol of CDCT ~ 12 mGy | When the radiation dose changed, 45% of features extracted from solid nodules and 35% from ground-glass nodules remained reproducible. |
Li [26] | CCR phantom; IBEX radiomics software | Inter-CT vs. intra-CT; mAs; kVp; pitch; FOV; kernel; slice thickness | Reproducibility depends on the structure and texture of the material. Parameters related to image resolution, such as FOV, slice thickness, and kernel, have a more significant impact on reproducibility than scanning parameters (mAs, kVp, pitch). The reproducibility of radiomics features depends on the noise level. Test–retest showed ICC > 0.9. Shape features had the highest reproducibility (94% of features were reproducible). Even in the least reproducible case, 14% of the features were still stable. Changing the kernel (from bone to standard) significantly affects the reproducibility of features. |
Larue [27] | CCR phantom; in-house radiomics software | Inter-scanner; slice thickness; gray-level discretization (bin widths ranging from 5 to 50 Hounsfield units with a step size of 5 HU); voxel resampling (resampling into voxel sizes of 1 × 1 × 3 mm³ using cubic, linear, and nearest-neighbor interpolation) | CT scanner, slice thickness, and bin width affected radiomics feature values. No impact of radiation exposure was observed. Resampling images before feature extraction decreases the variability of radiomics features. 'GLRLM – RLN' feature values at 1.5 mm and 3 mm slice thickness were more similar after resampling, which was not the case for the 'GLSZM – SAE' feature values. The test–retest analysis demonstrated that the feature 'GLRLM – RLN' is reproducible (CCC > 0.85). |
Mackin [28] | 20 NSCLC patients; CCR phantom; IBEX radiomics software | Inter-scanner (17 CT units); patient vs. phantom | For some features, variability was large relative to the inter-patient variation in the NSCLC tumors. The variability in radiomics features extracted from CT images of the phantom was comparable in size to the variability observed in the same features extracted from CT images of NSCLC tumors. The reproducibility of radiomics features across different CT vendors is low, but reproducibility is higher across different scanners from the same vendor. |
Ibrahim [29] | 338 HCC patients with arterial and portal venous phases; RadiomiX radiomics software | Inter-scanner (9 CT units) | About 25% of the features were reproducible across the inter-scanner study. About 28% of features (42 of 167) are reproducible between the arterial and portal venous imaging phases. ComBat harmonization improved reproducibility by only 1%. |
Caramella [30] | Phantom; LIFEx radiomics software | Inter-scanner (8 CT units) | About 23% of features (8 of 34) exhibited high reproducibility. |
Zwanenburg [31] | Patient; phantom; in-house radiomics software | Multicenter study (25 research teams) | The Image Biomarker Standardization Initiative produced and validated a set of consensus-based reference values for radiomics features. 15% of features have good to excellent reproducibility in a validation dataset between patient and phantom. 46% of features were reproducible in test–retest. |
Balagurunathan [32] | 32 NSCLC patients; in-house radiomics software | Manual vs. automatic segmentation; 2D vs. 3D | About 22% of features (48 of 219) were reproducible across segmentation methods (2D vs. 3D and manual vs. automatic) with CCC > 0.9, and 13% (29 features) were reproducible with CCC > 0.95. |
Fave 2015 [33] | CCR phantom and NSCLC patient; IBEX radiomics software | Inter-scanner vs. intra-scanner (19 CBCT units on linear accelerators); noise (scatter); motion (or ROI identification) | About 54% of the features (37 of 68) were reproducible in the intra-scanner test, but none were reproducible in the inter-scanner test. No feature can be reliably measured if the tumor motion is greater than 1 cm. With 4 mm of motion, 12 features from the entire volume and 14 from the center-slice measurements were reproducible. Almost all features changed significantly when scatter material was added around the phantom. For the dense cork, 23 features passed in the thoracic scans and 11 in the head scans when the differences between one and two layers of scatter were compared. |
Lorena Escudero Sanchez [19] | 43 HCC patients; Pyradiomics radiomics software | Gray levels (8, 16, 32, 64, 128, and 256); slice thickness (2 mm vs. 5 mm) | Feature values depend on slice thickness. Slice thickness does not affect the ROI segmentation. The optimal number of gray levels for high reproducibility is between 32 and 64. |
Shafiq‐ul‐Hassan [34] | CCR phantom; in-house radiomics software | Inter-scanner (8 CT units); slice thickness; FOV; pixel size (0.39 to 0.98 mm); resampling (to a voxel size of 1 × 1 × 2 mm³ using linear interpolation); gray levels (16, 32, 64, 128, and 256) | 70% of features (150 of 213) were reproducible across voxel size variation. Resampling and normalizing feature values by voxel size can improve reproducibility (resampling increases reproducibility up to 80%). Seventeen texture features depended on the number of gray levels. This dependency can also be removed or reduced by normalizing the number of gray levels used. |
Mackin [35] | Lung cancer patients; phantom; IBEX radiomics software | Resampling; filtering (with Butterworth) | Resampling and low-pass filtering of CT images could correct much of the variability in features due to inconsistent image pixel sizes. This correction may also reduce the variability introduced by other CT scan acquisition parameters. This correction reduces the dependence of features on pixel size from 80% to 10%. |
Solomon [36] | 20 patients; in-house radiomics software | Reconstruction algorithm (MBIR and ASIR vs. FBP); radiation dose | Among the 23 imaging features assessed, radiation dose significantly affected 5, 3, and 4 of the features for liver lesions, lung nodules, and renal stones, respectively. ASIR reconstruction significantly affected 3, 1, and 1 features for liver lesions, lung nodules, and renal stones, respectively. MBIR reconstruction significantly affected 9, 11, and 15 features for liver lesions, lung nodules, and renal stones, respectively. |
Kim [37] | 42 patients with lung tumors (contrast-enhanced CT scans); in-house radiomics software | Reconstruction algorithm (FBP vs. iterative); ROI segmentation (inter-reader vs. intra-reader) | About 40% of features (6 of 15) were reproducible among reconstruction algorithms. Inter-reader variability was more significant than intra-reader or inter-reconstruction algorithm variability in 9 features. Inter-reconstruction algorithm variability was more significant than inter-reader variability for entropy, homogeneity, and GLCM-based features. |
Meyer [38] | 75 liver patients; Radiomics version 1.0.9 radiomics software | Radiation dose levels; section thicknesses; kernels; reconstruction algorithm | About 11% of features (12 of 106) were reproducible for any variation of the different technical parameters. Reconstructed section thickness had the most considerable impact on reproducibility (only 12% of features were stable). Reconstruction kernel had a minor impact on reproducibility (53% of features were stable). Inter-reader variability induced by the ROI segmentation was significantly higher than that induced by the reconstruction algorithm. Number of reproducible radiomics features by parameter: kernels = 56 (52.8%); section thicknesses = 42 (39.6%); radiation dose levels = 22 (20.8%); reconstruction algorithm = 13 (12.2%). |
Huang lan He [39] | 240 patients with a solitary pulmonary nodule (SPN); in-house radiomics software | Reconstruction slice thickness; convolution kernel; contrast enhancement (non-contrast vs. contrast CT) | NECT-based radiomics demonstrated better discrimination and classification capability than CECT in both primaries. The thin-slice (1.25 mm) CT-based radiomics signature had better diagnostic performance than the thick-slice (5 mm) one. The standard convolution kernel-based radiomics signature had better diagnostic performance than the lung convolution kernel-based one. A radiomics signature based on non-contrast, thin-slice, standard convolution kernel CT was more informative for the differential diagnosis of SPN. |
Muenzfeld [40] | 48 prostate cancer patients; Pyradiomics radiomics software | Kernel (two soft tissue kernels and one bone kernel) | 11 of 86 features (12.7%) were highly reproducible, with CCC ≥ 0.85. Applying the sharp-edge kernel also impaired reproducibility for most first-order features. The bone kernel resulted in lower overall reproducibility than both soft tissue kernels. |
Haarburger [1] | Patients with liver, kidney, or lung lesions; Pyradiomics radiomics software | Manual vs. automatic segmentation | Manual and automated segmentation approaches were highly correlated, with a Pearson correlation coefficient of r = 0.921. Features found to be unstable based on human annotations were also found to be unstable based on automated annotations. When a feature exhibited high reproducibility (i.e., ICC > 0.9) on one lesion, it also achieved high ICCs on others. |
Zwanenburg [41] | 31 NSCLC patients; 19 H&N SCC patients; in-house radiomics software | Added perturbations: noise addition (N); translation (T); rotation (R); volume growth/shrinkage (V); supervoxel-based contour randomization (C) | Reproducibility under image perturbations (N, T, R, V, C) was higher for NSCLC CT images. ICCs for H&N SCC were generally lower. |
J Kalpathy-Cramer [42] | Patients with lung disease; in-house radiomics software | Manual vs. automatic segmentation | 68% of features were reproducible across segmentations with CCC > 0.75. |
Kelahan LC [43] | – | Segmentation | Inter-reader reproducibility depends on the ROI size. Groups of "large" and "small" lesions show different inter-reader reproducibility. |
Ying Li [44] | Lung phantom; in-house radiomics software | mAs (25, 100, or 200); pitch (0.9 or 1.2); slice thickness (0.75, 1.5, or 3 mm); reconstruction kernel (medium or detail); gray-level range (3 ranges); gray-level bin size (11 sizes) | For the three gray-level ranges, 50% (44/88) of features were reproducible. For the 11 gray-level bin sizes, 33.3% (24/72) of features were reproducible. Feature calculation parameters (range and bin size) may have a greater influence than imaging parameters (effective dose, pitch, slice thickness, and filter) on the reproducibility of CT radiomics features. |
Jensen [46] | Homogeneous phantom; Pyradiomics software | Sphere-shaped ROIs of diameters 4, 8, and 16 mm, and 4, 8, and 16 pixels | 70 CT-derived features were significantly different between ROI sizes. Many features showed significant differences, and only a few showed excellent agreement across varying ROI sizes. |
Jensen [47] | Homogeneous phantom; Pyradiomics software | Sphere-shaped ROIs of diameters 4, 8, and 16 mm; parametric maps with a fixed voxel size of 4 mm³ | Fifty-five conventionally extracted and 8 parametric map-based features were significantly different between the VOI sizes. Only 3 of 93 parametric map-based features showed excellent agreement across varying ROI sizes. |
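
Several studies in the table report the share of features that pass a concordance threshold (for example CCC > 0.85, CCC > 0.9, or CCC > 0.95). As a minimal sketch of how such a percentage could be computed, the Python snippet below implements Lin's concordance correlation coefficient per feature and counts the features above a cutoff. The function names, the array layout (rows are repeated measurements, columns are features), and the toy values are assumptions for illustration only; they are not taken from any of the cited studies, and each study may have used a different estimator or software.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two measurement series.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using population (ddof=0) variance and covariance.
    Undefined (division by zero) if both series are constant with equal means.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    cov_xy = ((x - mean_x) * (y - mean_y)).mean()
    return 2.0 * cov_xy / (x.var() + y.var() + (mean_x - mean_y) ** 2)


def reproducible_fraction(test, retest, threshold=0.85):
    """Return per-feature CCC values and the fraction of features above threshold.

    test, retest: arrays of shape (n_samples, n_features) holding the same
    features extracted under two conditions (hypothetical layout).
    """
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    ccc = np.array([
        concordance_correlation(test[:, j], retest[:, j])
        for j in range(test.shape[1])
    ])
    return ccc, float((ccc > threshold).mean())


# Toy data (illustrative only): feature 0 is identical in both scans,
# feature 1 is shifted by a constant offset in the second scan.
test_scan = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
retest_scan = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])

ccc_values, fraction = reproducible_fraction(test_scan, retest_scan, threshold=0.85)
print(ccc_values, fraction)
```

Because the CCC penalizes a systematic offset as well as poor correlation, the shifted feature in the toy data falls below the cutoff even though it is perfectly correlated with the first scan. The choice of threshold matters: the studies above use cutoffs ranging from 0.75 to 0.95, so their reported percentages are not directly comparable.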