Artificial intelligence development for detecting prostate cancer in MRI

Artificial intelligence (AI) is a recent advance in machine learning that is increasingly used to help radiologists, especially when they work under arduous conditions. Microsoft Corporation offers a free-trial service called Custom Vision for developing AI for images. This study included 161 prostate cancer images with 189 lesions from 52 patients. The 160-tag iteration presented the best performance: precision 20.0%, recall 6.3%, mean average precision (M.A.P.) 13.1%, and prediction rate 31.58%. The performance after 1-h training was better than after quick training but did not differ from that after 2-h training. Health personnel can easily develop AI for the detection of prostate cancer lesions in MRI. However, further development of the AI is required, and its results should be interpreted together with a radiologist.


Background
Prostate cancer is the 4th most common cancer in Thai men with 6467 new cases in 2018 or 7.6% of all new cancer cases in Thai men [1]. Prostate cancer is the 2nd most common cancer in both incidence and mortality of men worldwide [2,3].
Several companies have developed computer-aided detection and diagnosis (CAD) for radiology since the late 1960s, but real development and systematic research began in the early 1980s [4]. Artificial intelligence (AI), a recent advance in machine learning, may improve CAD for radiology in clinical practice. In general, AI tasks include automated detection, localization of suspicious lesions, automated diagnostic classification, and prediction of the aggressiveness of cancer from prostate multi-parametric MRI [5][6][7]. Although AI has recently raised concern that machines may replace humans in the near future, such fears have recurred among radiologists since the first development of CAD. Nowadays, CAD and AI have proven their supporting role for radiologists, especially under arduous conditions [8][9][10].
Microsoft Corporation introduced its cloud platform, Azure, which supplies over 100 services; some are free-trial and some are always free. Machine learning was the feature-based approach to AI before the advent of deep learning (DL), which is now the main approach for developing AI for medical imaging. Given the budget-constrained situation in the authors' hospital, an attempt was made to develop AI for detecting cancer lesions in MRI with "custom vision", one of the free-trial services from Azure. The aim was to test the feasibility of using this service; therefore, this is a pilot study conducted solely by clinicians with some guidance from one computer scientist.

Methods
This study was approved by the institutional Ethics Committee for Human Research based on the Declaration of Helsinki and the ICH Good Clinical Practice Guidelines. No informed consent was needed because this was a retrospective study of images stored in the hospital PACS database.
Training was divided into 5 iterations with datasets of 30, 60, 100, 130, and 160 lesions. The images were uploaded, and every lesion was manually tagged to help train the object detector. If an image had 3 PCa lesions, it added 3 tags to the dataset. After each 1-h training, the AI was evaluated on a testing dataset of 10 different images that were not included in the training dataset. The testing dataset comprised 19 PCa lesions.
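The tag counting described above can be sketched in a few lines. This is only an illustration: the `Box` and `TaggedImage` types, the normalized box layout, the toy image names, and the idea of building an iteration by taking images until a tag target is reached are all assumptions for the example, not details reported by the study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    # Hypothetical normalized bounding box (fractions of image width/height).
    left: float
    top: float
    width: float
    height: float


@dataclass
class TaggedImage:
    name: str
    lesions: List[Box] = field(default_factory=list)


def count_tags(images: List[TaggedImage]) -> int:
    """Each tagged lesion counts as one tag, so 3 lesions on one image add 3 tags."""
    return sum(len(img.lesions) for img in images)


def take_until(images: List[TaggedImage], target_tags: int) -> List[TaggedImage]:
    """Return the shortest prefix of images whose tag count reaches target_tags.

    Assumption: iterations are built by adding whole images in order.
    """
    subset: List[TaggedImage] = []
    total = 0
    for img in images:
        if total >= target_tags:
            break
        subset.append(img)
        total += len(img.lesions)
    return subset


# Toy data (hypothetical): three images carrying 1, 3, and 2 lesions.
images = [
    TaggedImage("img_001", [Box(0.40, 0.50, 0.10, 0.12)]),
    TaggedImage("img_002", [Box(0.30, 0.45, 0.08, 0.08),
                            Box(0.55, 0.50, 0.07, 0.09),
                            Box(0.45, 0.60, 0.06, 0.06)]),
    TaggedImage("img_003", [Box(0.35, 0.55, 0.10, 0.10),
                            Box(0.60, 0.48, 0.05, 0.07)]),
]

total_tags = count_tags(images)
subset = take_until(images, target_tags=4)
print(total_tags, [img.name for img in subset], count_tags(subset))
```

Counting tags rather than images is what makes a "160-tag" dataset smaller than 160 images whenever some images carry several lesions.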
After the training process, the system presented the "Performance Per Tag" as 3 values: precision, recall, and mean average precision (M.A.P.). The clinical performance of this AI is presented as the number and percentage of correct detections in each of the 5 training iterations.
The duration of training is another factor that should affect AI performance. One-hour training was used as the standard training process, as mentioned above. "Quick training" and "2-h training" iterations were then run for comparison.
Table 1 The "Performance Per Tag" of 5 iterations
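Precision and recall, two of the values reported for this detector, follow directly from counts of true positives, false positives, and false negatives. The sketch below shows the arithmetic only. The counts used (1 true positive, 4 false positives, 15 false negatives) are hypothetical, chosen because they land close to the reported 20.0% and 6.3%; the study does not report the underlying counts, and the service's internal evaluation is not reproduced here.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Return (precision, recall) as fractions in [0, 1].

    precision = TP / (TP + FP): share of predicted lesions that are real.
    recall    = TP / (TP + FN): share of real lesions that were predicted.
    """
    predicted = tp + fp
    actual = tp + fn
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, NOT taken from the study.
p, r = precision_recall(tp=1, fp=4, fn=15)
print(f"precision = {p * 100:.2f}%, recall = {r * 100:.2f}%")
```

With these illustrative counts the precision is exactly 20% and the recall is 1/16 = 6.25%, which is near the reported 6.3%.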

Results
This study included 161 prostate cancer images with 189 PCa lesions from 52 patients. The "Performance Per Tag" values of the 5 iterations with 30, 60, 100, 130, and 160 tags are presented in Table 1. Ten images with 19 PCa lesions were tested in each iteration. A false-positive prediction from the 60-tag iteration is shown in Fig. 1. The 100-tag iteration showed true-positive predictions in Figs. 2 and 3; however, only one of the three PCa lesions was predicted in Fig. 3. The clinical performance of each training iteration on the same testing dataset (10 images with 19 PCa lesions) is presented in Table 2.
The "Performance Per Tag" improved from the quick training iteration to the 1-h training iteration, but the 2-h training iteration showed the same values as the 1-h training iteration (Table 3). The clinical performance showed the same pattern as the "Performance Per Tag" (Table 4).
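The clinical performance is a simple proportion of correctly detected lesions in the 19-lesion testing dataset. As a minimal sketch, the reported best prediction rate of 31.58% is consistent with 6 correct detections out of 19; the count of 6 is inferred from the percentage and is not stated in the text, and the function name is hypothetical.

```python
def prediction_rate(correct: int, total: int) -> float:
    """Percentage of test lesions that were detected correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


TEST_LESIONS = 19  # size of the testing dataset reported in the study

# 6 correct detections is an inferred value, not a reported one.
rate = prediction_rate(6, TEST_LESIONS)
print(f"{rate:.2f}%")  # prints 31.58%
```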

Discussion
Artificial intelligence (AI) is developed from computer algorithms to simulate intelligent behavior that is capable of learning, reasoning, problem-solving, and self-developing. One of the more sophisticated sets of algorithms is often referred to as deep learning (DL), which developed from machine learning (ML). ML is the ability of an AI to extract information from raw data and to learn from experience [11][12][13][14]. Microsoft Corporation provides a free-trial service called "custom vision" that health care personnel can use to develop AI in their daily practice, especially in radiology. This free-trial service, however, can be regarded as ML-level, while DL needs some additional programming, so DL was not included in this study.
In theory, more learning yields better AI performance, so the "Performance Per Tag" should improve gradually across the 30, 60, 100, 130, and 160-tag iterations. Although the 160-tag iteration showed the best performance values, the other iterations showed inconsistent values. The clinical performance improved gradually across the 30, 100, 130, and 160-tag iterations, except for the 130-tag iteration, which showed worse results than the 100-tag iteration. Many discrete tag varieties, each with few tag patterns, may have confused the AI in the 130-tag iteration. With more tag patterns, the AI achieved better clinical performance, with the best prediction rate at 31.58%. The duration of training should also affect performance, as more sophisticated learning needs more time. The 1-h training model performed better than the quick training model. The 2-h training model, however, performed no differently from the 1-h training model. With only 160 tags, 1 h appeared to be enough for the AI to learn every pattern thoroughly, and one more hour added nothing. If more images were uploaded, 2-h training might improve AI performance.
The accuracy and speed of CAD/AI systems depend on how their algorithms register data and how the system has been trained to learn, which affect calculation times [10]. The accuracy of prostate cancer detection using a CAD/AI system (43%) was comparable to that of standard ultrasound-guided biopsy (40%) [15]. Our study used discrete images to train an AI system that was less sophisticated than such a CAD/AI system, so it was not surprising to obtain low precision (20%) and recall (6.3%); this recall means that only 6.3 of every 100 positive cases would be detected correctly. Additional studies with larger datasets are needed to improve the performance and impact of this system. Besides radiologists, other clinicians should use the AI system with the utmost caution.
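The reading of the recall value can be written out as a short worked equation. The symbols TP, FP, and FN (true positives, false positives, false negatives) are the standard ones and are introduced here for illustration; the figure of 100 positive cases is the hypothetical population used in the text above.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}
\]
\[
TP = \text{Recall} \times (TP + FN) = 0.063 \times 100 = 6.3
\]
```

By the same reasoning, a precision of 20% means that about 1 in 5 predicted lesions corresponds to a true lesion.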

Conclusion
Health personnel can easily develop AI for the detection of PCa lesions in T2W MRI. The AI correctly predicted nearly one third of PCa lesions after training with only 160 tagged lesions and the free-trial service. However, further development of the AI is required, and its results should be interpreted together with a radiologist.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Availability of data and materials
All data and materials in this study are available on request.

Declarations
Ethics approval and consent to participate
This study was approved by the Khon Kaen University Ethics Committee for Human Research based on the Declaration of Helsinki and the ICH Good Clinical Practice Guidelines with reference number HE621497. No informed consent was needed because this was a retrospective study.

Table 2 The clinical performance of 5 training datasets (columns: Test/train, 30 tags, 60 tags, 100 tags, 130 tags, 160 tags)
Table 3 The "Performance Per Tag"

Aphinives and Aphinives, Egyptian Journal of Radiology and Nuclear Medicine (2021) 52:87