Conventional ultrasound, color Doppler, TI-RADS, and shear wave elastography for thyroid nodule differentiation: a study of efficacy compared with the histopathology results

Although only a minority of thyroid nodules are malignant, invasive diagnostic procedures are usually warranted. This prospective study aims to assess the diagnostic performance of ultrasound (US) criteria, the TI-RADS score, and shear wave elastography (SWE) in differentiating benign from malignant thyroid nodules, as a potential surrogate for invasive procedures. Ninety-nine patients with thyroid nodules (79 females and 20 males; mean age 45.9 ± 7.7 years, range 30–69 years) were enrolled and underwent conventional ultrasound, color Doppler, TI-RADS scoring, and SWE; the findings were correlated with the histopathological results. SWE elasticity indices (EIs) were significantly higher, and color Doppler signals significantly more frequent, in malignant nodules than in benign ones (p < 0.05). Combining TI-RADS with SWE, or TI-RADS with color Doppler imaging, gave better sensitivity for the detection of malignancy. Elasticity indices showed high diagnostic performance, approaching that of the histopathological results. Combined SWE, color Doppler, and TI-RADS, as a sum of findings, could effectively differentiate between benign and malignant thyroid nodules, and offered a non-invasive tool for accurate risk stratification of malignant nodules.


Background
Thyroid nodules are commonly encountered in clinical and radiological practice; however, only a minority (less than 5%) are truly malignant and need further evaluation [1-5].
In the past, thyroid nodules were assessed by clinical examination, ultrasound, and radioisotope scanning. Nonetheless, cytological assessment by fine-needle aspiration cytology (FNAC) or biopsy is often needed to exclude malignancy [3,4].
Clinical examination alone cannot give a final diagnosis, so an ultrasound (US) examination is usually requested, and several US criteria have been employed to categorize suspicious nodules. A scoring system known as the thyroid imaging reporting and data system (TI-RADS) has been validated, widely used, and updated for this purpose [3,5,6].
Elastography, which includes strain elastography and shear wave elastography (SWE), has been added as an ultrasound-based technique that gives an objective assessment of tissue stiffness in organs such as the liver and in breast lesions; nevertheless, the diagnostic accuracy of thyroid elastography in detecting malignancy is still controversial in the literature [5,6].
Recently, functional imaging such as diffusion-weighted MRI (DW-MRI) has been investigated for assessing thyroid malignancies and their recurrences, with fairly good reported results. However, MRI has limitations: it is contraindicated in claustrophobic patients and in those with cardiac pacemakers, and it is more expensive than US [7,8].
Some studies have also investigated SWE as a potential surrogate for invasive procedures (such as FNAC) in the evaluation of soft thyroid nodules [9].
This prospective study aims to assess the diagnostic performance of the US criteria, the TI-RADS score, and SWE in differentiating benign from malignant thyroid nodules in patients referred for FNAC, and to correlate the imaging patterns and the measured values of the lesions with their histopathological results as the reference standard.

Subjects
This study enrolled 99 patients with thyroid nodules who were referred to the US or intervention units of our institute from the outpatient clinics and inpatient departments between January 2019 and August 2020. There were 79 females and 20 males, with a mean age of 45.92 ± 7.7 years (range 30-69 years). The study was conducted in accordance with the ethical guidelines of, and was approved by, the Research Ethics Committee of our institute (reference number: Code D-24-2019; date of approval 13-07-2019). All participants were informed of the details and gave their written informed consent.
All of our patients were subjected to the following:
Conventional ultrasound and shear wave elastography
B-mode and color Doppler examinations were performed in the conventional manner to assess the number, size, composition, echogenicity, shape, margin, and calcification of the nodules (Table 1). The nodules were then scored according to the ACR TI-RADS criteria [10]. In terms of vascularity, the examined nodules were classified as avascular nodules (0), nodules with peripheral vascularity (1a), nodules with internal vascularity (1b), and nodules with both peripheral and internal vascularity (1c) (Table 1).
A TOSHIBA Aplio 500 machine (equipped with a 7.5-MHz linear probe) was used.
Shear wave examination of the target nodules was subsequently performed by the same operator (who had 3 years of experience with US and elastography techniques) using the same US machine and probe. After the lesion was identified, the transducer was held in a stable perpendicular position without pressure for 3 s to minimize compression artifacts.
Shear wave mode was applied over the B-mode image. A color signal box of appropriate size was displayed as a colored area, where softer areas were blue, and the harder areas were red. Whenever the cine loop was stable by showing parallel lines or parallel circles (free of dot artifacts or zigzag lines), we froze the image and started the interrogation process.
Quantitative elastographic assessment was performed using a suitable region of interest (ROI) placed in the stiffest region of the nodule while avoiding cystic components, visible calcifications, and the surrounding blood vessels.
The average SWE value in the selected ROI was recorded in kilopascals (kPa) for each lesion. A second ROI of appropriate size was placed in the normal thyroid parenchyma or on the sternocleidomastoid muscle to obtain the elastic ratio (ER), which is the ratio of the mean stiffness of the lesion to that of the normal parenchyma. The process was repeated (at least three successive measurements) for each nodule to choose the best SWE image. If a large nodule was present, multiple measurements were taken in different regions.
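The ER arithmetic described above can be sketched in a few lines. This is only an illustration: the function names and sample readings are hypothetical, the 2.6 cutoff is the value reported later in the Results, and treating a ratio equal to the cutoff as positive is an assumption, since the text does not say whether the comparison is strict.

```python
def elastic_ratio(lesion_mean_kpa, reference_mean_kpa):
    """Ratio of the lesion ROI mean stiffness to the reference ROI mean stiffness."""
    if reference_mean_kpa <= 0:
        raise ValueError("reference stiffness must be positive")
    return lesion_mean_kpa / reference_mean_kpa


def exceeds_cutoff(ratio, cutoff=2.6):
    """Flag a nodule as SWE-positive; '>=' is an assumed convention."""
    return ratio >= cutoff


# Hypothetical ROI means in kPa (not taken from the study data).
er = elastic_ratio(60.0, 20.0)
print(f"ER = {er:.2f}, SWE-positive: {exceeds_cutoff(er)}")
```

Both ROI means must be in the same unit (kPa here), so the ratio itself is dimensionless.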
Ultrasound-guided fine-needle aspiration and cytological examination (FNAC)
The suspicious nodules were localized and aspirated under US guidance using a 20-22G needle after sterilization and without local anesthesia. At least five slides were obtained for cytological analysis. The samples were immediately smeared and fixed in 95% ethanol. The slides were stained, examined, and analyzed by an expert cytopathologist using the six-tiered Bethesda diagnostic system [11].
Thyroid surgery was tailored to the cytological and clinical diagnoses. It was performed in only 72 cases: patients with suspicious nodules (Bethesda 4 and 5), patients with benign nodules (Bethesda 2) that caused compressive or clinical symptoms, and patients with cytological results of undetermined significance (Bethesda 3).

Statistical methods and data analysis
Data management and analysis were performed using the Statistical Package for the Social Sciences (SPSS) version 25 for Windows [12]. Numerical data were presented as median and range or as mean and standard deviation. Categorical data were summarized as numbers and percentages. Numerical variables were compared using Student's unpaired t test for parametric data or the Mann-Whitney U test for non-parametric data. Categorical variables were compared using the Chi-square test or Fisher's exact test.
The diagnostic performance of the SWE elasticity indices (EIs) and TI-RADS scores was assessed by receiver operating characteristic (ROC) curve analysis, which was used to predict malignancy and to determine the optimal cutoff values of the SWE indices and TI-RADS scores. A p value of less than 0.05 was considered significant.
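To make the ROC step concrete, the following is a minimal sketch of how an area under the curve and a cutoff can be derived from a continuous index. It is not the SPSS procedure used in the study: the AUC is computed in its Mann-Whitney form, the cutoff is chosen by maximizing Youden's J (sensitivity + specificity - 1), which is an assumed criterion, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a malignant case scores higher than a
    benign one, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J,
    calling a case positive when score >= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical elasticity ratios; 1 = malignant, 0 = benign.
scores = [1.1, 1.8, 2.0, 2.4, 2.9, 2.7, 3.5, 4.2]
labels = [0, 0, 0, 0, 0, 1, 1, 1]
print(roc_auc(scores, labels))
print(best_cutoff(scores, labels))
```

Other cutoff criteria exist, for example requiring a minimum NPV or PPV, and they can select a different threshold from the same curve.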

Results
The obtained results could be binned into six items as follows:
Analysis of conventional ultrasound and color Doppler characteristics of the nodules (Table 1)
We found no significant correlation between the size, number, or site of the nodules and the possibility of malignancy (p value > 0.05).
We found that hypoechogenicity or marked hypoechogenicity was the most significant single B-mode ultrasound criterion for malignancy; however, TI-RADS had a higher sensitivity for the detection of malignancy than any single US criterion.
Color Doppler played an important role in the detection of malignancy: the presence of intranodular vascularity (type 1b) (Table 1) was more specific to malignant nodules, but its absence cannot exclude malignancy (Figs. 5 and 6).

Shear wave elastography of the nodules (Tables 2, 3, and 4)
We found that E mean and ER of SWE were significantly higher in malignant nodules than in benign nodules (p < 0.001). Compared with the other SWE parameters, ER with the optimal cutoff value set at 2.6 had the highest area under the curve (AUC) value (96.3%; 95% CI 93.3-99.4%), with a diagnostic sensitivity and specificity of 90.9 and 89.4%, respectively.
Cytological ± histopathological analysis of the nodules (Tables 5, 6, and 7) (Fig. 1a and b)
From the combined results of FNAC and surgery, we had 66 benign nodules (Figs. 2 and 3), consisting of 41 nodules of benign nodular goiter, 10 nodules of chronic lymphocytic thyroiditis (Fig. 4), and 15 nodules of follicular adenoma. In contrast, there were 33 malignant nodules, consisting of 24 nodules of papillary carcinoma (Figs. 5 and 6), five nodules of follicular carcinoma, three nodules of medullary carcinoma (Fig. 7), and only one nodule of anaplastic undifferentiated carcinoma.
The malignancy rate and TI-RADS scores in the nodules (Table 8)
Our statistics showed a positive relationship between the malignancy rate and the TI-RADS score: the malignancy rate increased with higher TI-RADS scores.
The diagnostic performance of TI-RADS, color Doppler and SWE separately and in combinations (Table 9)
In the "parallel method", the combined result was considered positive when one or both methods (SWE or TI-RADS) were positive for malignancy, and negative only when both methods were negative.
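The parallel rule amounts to a logical OR of the two test results. The sketch below applies it to a small set of hypothetical nodules (not study data) to show its typical effect: sensitivity rises because either method can catch a malignant nodule, and specificity falls because a false positive from either method counts. The function and variable names are illustrative.

```python
def parallel_positive(swe_positive, tirads_positive):
    """Parallel rule: positive when one or both methods are positive."""
    return swe_positive or tirads_positive


def sens_spec(calls, truths):
    """Sensitivity and specificity of boolean calls against boolean truth."""
    tp = sum(c and t for c, t in zip(calls, truths))
    tn = sum((not c) and (not t) for c, t in zip(calls, truths))
    n_pos = sum(truths)
    n_neg = len(truths) - n_pos
    return tp / n_pos, tn / n_neg


# Hypothetical nodules: (SWE positive, TI-RADS positive, malignant on histopathology).
nodules = [
    (True, True, True), (True, False, True), (False, True, True), (False, False, True),
    (False, False, False), (False, False, False), (False, False, False),
    (True, False, False), (False, True, False), (False, False, False),
]
truths = [m for _, _, m in nodules]
swe_only = sens_spec([s for s, _, _ in nodules], truths)
tirads_only = sens_spec([t for _, t, _ in nodules], truths)
combined = sens_spec([parallel_positive(s, t) for s, t, _ in nodules], truths)
print("SWE alone:", swe_only)
print("TI-RADS alone:", tirads_only)
print("Parallel:", combined)
```

In this toy set each method alone detects two of the four malignant nodules, while the parallel combination detects three, at the cost of two false positives instead of one.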

Discussion
The prevalence of thyroid nodules has necessitated the differentiation between benign and malignant ones. As clinical examination cannot provide a definitive diagnosis, ultrasonography and radioisotope scanning have been employed to select the nodules that should be further assessed by histopathology [9]. The sonographic appearance of a suspicious nodule can predict the need for its histopathological evaluation, and US elastography can provide an objective assessment of tissue stiffness [6].
In this context, our results identified hypoechogenicity or marked hypoechogenicity as the most predictive B-mode US feature of malignancy, with the highest sum of sensitivity and specificity (66.6 and 90.9%, respectively). This differs from Zhao et al. [13], who described micro-calcification as the most predictive US feature of malignancy (85% sensitivity and 75.6% specificity); the difference may be explained by the different sample sizes, as they had a larger sample (313 patients) and a relatively larger number of malignant nodules (194).
From the above, and in concordance with the Sibos study [14], no single US criterion carries sufficiently high accuracy for distinguishing the nodules, but combining multiple criteria can increase sensitivity and specificity. Horvath et al. [15] introduced TI-RADS for risk stratification of thyroid malignancy, reporting a sensitivity of 88% and a negative predictive value (NPV) of 88%. Russ et al. [16] also reported high sensitivity (95.7%) and NPV (99.7%) for diagnosing thyroid malignancy by TI-RADS. Tessler et al. [10] later updated the TI-RADS scoring; this updated version was used in our study, in which the pooled sensitivity and NPV of TI-RADS were 84.8 and 91.6%, respectively.
The ROC curves in our study indicated that the cutoff value of ACR TI-RADS was TR4, and the AUC was 0.84 (95% CI 0.754-0.907), with a diagnostic sensitivity and specificity of 84.8% and 83.3%, respectively. This was concordant with Zhang et al. [17], whose cutoff value for ACR TI-RADS was TR5, with an AUC of 0.864 (95% CI 0.879-0.934), a sensitivity of 81.4%, and a specificity of 84.8%. Xu et al. [18] also reported a similar cutoff point for malignancy by ACR TI-RADS, which was more than TR4 (80.6% sensitivity, 78.4% specificity, and 79.6% accuracy of the average value).
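For readers who want to trace how such accuracy measures relate to one another, the sketch below computes them from a 2×2 table. The counts are an assumption: they are back-calculated from the 33 malignant and 66 benign nodules and the TI-RADS sensitivity and specificity quoted above, and are not quoted from the study's tables. The function name is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Positive likelihood ratio; unbounded when there are no false positives.
        "plr": sens / (1 - spec) if fp > 0 else float("inf"),
    }


# Assumed counts: 28 of 33 malignant nodules called positive,
# 55 of 66 benign nodules called negative.
metrics = diagnostic_metrics(tp=28, fp=11, fn=5, tn=55)
for name in ("sensitivity", "specificity", "npv"):
    print(f"{name}: {metrics[name]:.1%}")
```

With these assumed counts, sensitivity and specificity round to the reported 84.8% and 83.3%, and the NPV comes out at about 91.7%, in line with the 91.6% reported for TI-RADS.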
Using color Doppler, we noticed that the presence of intra-nodular vascularity (type 1b) was associated with malignancy, but peri-nodular vascularity or an avascular nodule could not exclude it, so color Doppler alone had a limited role in differentiating thyroid nodules. However, in agreement with Manoj et al. [19], we found that the 1b pattern was the cutoff for suggesting malignancy, with 54.5% sensitivity, 100% specificity, and 81.4% NPV.
Shear wave elastography is an elastography technique that has achieved high sensitivity and specificity in the evaluation of thyroid nodules and can reduce unnecessary invasive procedures [9,20]. We found that EIs were significantly higher in malignant thyroid nodules than in benign ones (p < 0.0001); this agrees with the meta-analysis by Peiliang et al. (84.3% sensitivity and 88.4% specificity) [21].
Hye et al. [22] reported higher EIs in thyroid carcinoma than in benign nodules, with E mean showing a diagnostic specificity of 86.4% and a positive likelihood ratio (PLR) of 4.2; this is comparable with our study, in which E mean had a diagnostic specificity of 83.3% and a PLR of 5.09.
Among the tested SWE EIs, we selected the elasticity ratio (ER) as the best parameter because it had the highest sensitivity and specificity in differentiating malignant from benign thyroid nodules. The ER cutoff value of 2.6 had the highest AUC value (96.3%; 95% CI 93.3-99.4%); it had a sensitivity, specificity, positive predictive value (PPV), and NPV of 90.9, 89.4, 81, and 85.1%, respectively, and a PLR of 8.5.
Although our findings match those of Veyrieres et al. [23], Bhatia et al. [24], Sebag et al. [25], and Kim et al. [26], to our knowledge the most accurate SWE cutoff value has not yet been unified. The differences between studies may be attributed to the choice of different standards: we selected the best cutoff value of ER (which was 2.6), whereas they used the best parameter that gave an NPV or a PPV of at least 80%.
In line with this approach to selecting the best cutoff value, many studies published in recent years are concordant with ours, including Liu et al. [27], Park et al. [28], Zhao et al. [13], and Zhang et al. [17], who also found that EIs are significantly higher in malignant nodules, with high accuracy measures.
Our study adds two salient findings: combining SWE with TI-RADS and, to a certain extent, combining color Doppler with TI-RADS increased the diagnostic performance in differentiating thyroid nodules. Moreover, SWE (ER) together with TI-RADS gave higher accuracy measures than TI-RADS alone; such a combination of TI-RADS and EIs can therefore minimize unnecessary surgery or biopsy in suspicious thyroid nodules. This was concordant with the studies of Park et al. [28], Zhao et al. [13], and Xu et al. [18], in which the combined use of TI-RADS findings and EIs increased the accuracy measures.
However, sensitivity rose from 84.8% with TI-RADS alone to 90.9% with TI-RADS and SWE combined in "parallel", whereas specificity fell from 83.3% to 74.2%. This follows from how positive cases were defined: in the parallel method, a result was considered positive when one or both methods were positive, and negative only when both methods were negative. A malignant nodule missed by one method can thus be caught by the other, which raises sensitivity, but a false positive from either method counts as positive, which lowers specificity.
Finally, our results suggest that SWE and TI-RADS are complementary. TI-RADS can compensate for the limitations of SWE (which may be disturbed by macro-calcifications and carotid artery pulsations), and SWE can compensate for those of TI-RADS, which is influenced by operator dependence and interobserver variability. Thus, we suggest using SWE as a complementary tool for thyroid nodules with a TI-RADS score greater than or equal to 4, rather than using each method separately, to avoid unnecessary invasive diagnostic procedures in suspicious nodules.
This work has some limitations. First, most malignant nodules (24/33) were papillary carcinomas, only three were medullary carcinomas, and most benign nodules were nodular goiters; other pathological types, such as Hurthle cell thyroid carcinoma and primary thyroid lymphoma, were not included, so their parameters were not examined in this study.
Second, this is a single-center experience, so the data need to be validated in prospective multicenter studies that include nonspecialized operators, to reduce selection bias.
Third, isthmic lesions pose a technical difficulty: the isthmus is small relative to the thyroid lobes, whereas the ROI should cover the whole nodule together with sufficient surrounding parenchyma. Isthmic lesions were seen in 10 of our cases; most were "isthmi-lobar" in location, so the surrounding thyroid parenchyma helped to avoid this limitation. In purely isthmic lesions, breath-holding, cessation of swallowing, and the application of a copious amount of gel are of paramount importance, as is repeating the process, with at least three successive measurements taken for each nodule to choose the best SWE image.
Fourth, inter- and intra-observer variability and operator dependency are a concern; these were reduced by conducting the examinations on the same US machine, at the same settings as the conventional US examination, and by the same operator under the direct supervision of an experienced senior.
Last, selection bias may exist because the patients included in our study were scheduled for US-guided FNAC for suspicious thyroid nodules with US features (TR ≥ 3); this may decrease the diagnostic performance of TI-RADS and cause false-negative cytologic results.
Finally, it would be valuable to see solid, conclusive results that can change practice: research on SWE has been giving promising results for a long time, yet SWE is not included in any known thyroid scoring system so far.

Conclusion
Elasticity indices showed a high diagnostic value that is comparable to the histopathological results. Combined SWE, color Doppler, and TI-RADS could effectively complement each other, as a sum of findings, in differentiating thyroid nodules. Furthermore, the combined method could be used as a simple and non-invasive tool that accurately stratifies the risk of malignancy and serves as a surrogate for invasive diagnostic procedures.
Abbreviations
EIs: Elasticity indices; E mean: The mean elasticity index of the stiffest portion of the nodule; ER: The ratio of the mean elasticity index of the lesion to that of the normal parenchyma or subjacent sternocleidomastoid muscle; FNAC: Fine-needle aspiration cytology; SWE: Shear wave elastography; TI-RADS: Thyroid imaging reporting and data system