Artificial intelligence (AI) uses computer algorithms to simulate intelligent behavior such as learning, reasoning, problem-solving, and self-improvement. One of the more sophisticated families of algorithms is deep learning (DL), which developed from machine learning (ML). ML is the ability of an AI system to extract information from raw data and to learn from experience [11,12,13,14]. Microsoft Corporation provides a free-trial service called “custom vision” that health care personnel can use to develop AI for their daily practice, especially in radiology. This free-trial service, however, can be regarded as operating at the ML level, whereas DL requires additional programming. DL was therefore not included in this study.
In theory, more training yields better AI performance, so the “Performance Per Tag” should improve steadily across the 30-, 60-, 100-, 130-, and 160-tag iterations. Although the 160-tag iteration showed the best performance values, the other iterations showed inconsistent values. Clinical performance improved across the 30-, 100-, 130-, and 160-tag iterations, except that the 130-tag iteration performed worse than the 100-tag iteration. A possible explanation is that the 130-tag iteration contained many discrete tag varieties, each with only a few tag patterns, which may have confused the AI. With more tag patterns, the AI achieved better clinical performance, with a best prediction rate of 31.58%.
Training duration should also affect performance, because more sophisticated learning needs more time. The 1-h training model performed better than the quick training model. The 2-h training model, however, performed no differently from the 1-h training model. With only 160 tags, 1 h appeared to be enough for the AI to learn every pattern thoroughly, and an additional hour yielded no further gain. If more images were uploaded, 2-h training might improve AI performance.
The accuracy and speed of CAD/AI systems depend on how their algorithms register data and how the system has been trained, both of which affect calculation times [10]. The accuracy of prostate cancer detection using a CAD/AI system (43%) was comparable to that of standard ultrasound-guided biopsy (40%) [15]. Our study trained the AI system on discrete images, an approach less sophisticated than a CAD/AI system, so the low precision (20%) and recall (6.3%) were not surprising. A recall of 6.3% means that only about 6 of every 100 positive cases would be detected correctly. Additional studies with large data sets are needed to improve the performance and impact of this system. Besides radiologists, other clinicians should use the AI system with the utmost caution.
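To make the meaning of these two metrics concrete, the following minimal sketch computes precision and recall from their standard definitions. The function names and the counts of true positives, false positives, and false negatives are hypothetical: they were chosen only so that the results fall close to the reported values and are not the counts from this study.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of cases flagged positive by the AI that are truly positive."""
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive cases that the AI flagged as positive."""
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical counts for illustration only (NOT data from this study).
tp, fp, fn = 1, 4, 15

p = precision(tp, fp)  # 1 / (1 + 4)
r = recall(tp, fn)     # 1 / (1 + 15)
print(f"precision = {p:.1%}, recall = {r:.2%}")
# prints "precision = 20.0%, recall = 6.25%"
```

Under these assumed counts, four of every five positive calls would be false alarms, and 15 of 16 true positive cases would be missed, which illustrates why low recall is the more serious limitation for a detection task.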