Lancet Digit Health. 2019 Nov;1(7):e353-e362. doi: 10.1016/S2589-7500(19)30159-1. Epub 2019 Oct 17.
BACKGROUND: Current lung cancer screening guidelines use the mean diameter, volume, or density of the largest lung nodule on the prior computed tomography (CT) scan, or the appearance of a new nodule, to determine the timing of the next CT. We aimed to develop a more accurate screening protocol by estimating the 3-year lung cancer risk after two screening CTs using deep machine learning (ML) applied to radiologist CT readings and other universally available clinical information.
METHODS: A deep ML algorithm was developed using data from 25,097 participants in the National Lung Screening Trial who had received at least two CT screenings up to two years apart. Double-blinded validation was performed using 2,294 participants from the Pan-Canadian Early Detection of Lung Cancer Study (PanCan). The performance of the ML score in predicting lung cancer incidence was compared with that of Lung-RADS and volume doubling time using time-dependent ROC analysis. An exploratory analysis was performed to identify individuals with aggressive cancers and higher mortality rates.
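As a rough illustration of what a time-dependent AUC measures, the sketch below computes a cumulative/dynamic AUC at fixed horizons in plain Python. It is not the authors' analysis: the function name, the toy scores, and the diagnosis times are all made up for illustration, and it assumes every participant is followed past the horizon, so no censoring weights are applied.

```python
def cumulative_dynamic_auc(scores, event_times, horizon):
    """AUC at `horizon`: cases are diagnosed by `horizon`, controls are not.

    event_times[i] is the time of diagnosis in years, or None if no cancer
    was observed. Simplifying assumption: complete follow-up past `horizon`,
    so censoring is ignored (a real analysis would need to handle it).
    """
    pairs = list(zip(scores, event_times))
    cases = [s for s, t in pairs if t is not None and t <= horizon]
    controls = [s for s, t in pairs if t is None or t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    # Probability that a random case outscores a random control; ties count half.
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: six participants, three diagnosed during follow-up.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
event_times = [0.5, 1.5, 2.5, None, None, None]
auc_by_year = {h: cumulative_dynamic_auc(scores, event_times, h) for h in (1, 2, 3)}
```

In this toy example the AUC drops at the 3-year horizon because the third cancer occurs in a participant with a low score, which mirrors the general pattern of discrimination declining as the horizon lengthens.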
FINDINGS: In the PanCan validation cohort, the ML score showed excellent discrimination, with 1-, 2-, and 3-year time-dependent AUC values for cancer diagnosis of 0·968±0·013, 0·946±0·013, and 0·899±0·017, respectively. Although the high ML score cohort included only 10% of the PanCan sample, it identified 94%, 85%, and 71% of incident and interval lung cancers diagnosed within 1, 2, and 3 years, respectively, after the second screening CT. Furthermore, individuals with a high ML score had significantly higher mortality rates (HR=16·07, p<0·001) than those at lower risk.
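The "share of cancers identified by the high-score cohort" figures can be read as a capture rate: among cancers diagnosed within a given horizon, the fraction that fall in the top-scoring group. The sketch below shows one way to compute it. It assumes the high-score cohort is defined as the top 10% of participants ranked by score, which the abstract does not spell out, and the function name and data are hypothetical.

```python
def capture_rate(scores, event_times, horizon, top_fraction=0.10):
    """Fraction of cancers diagnosed by `horizon` that sit in the top-scoring group.

    Assumption: the high-score group is the top `top_fraction` of participants
    by score. event_times[i] is years to diagnosis, or None if no cancer.
    """
    n_high = max(1, round(top_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    high = set(ranked[:n_high])
    cases = [i for i, t in enumerate(event_times) if t is not None and t <= horizon]
    if not cases:
        raise ValueError("no cancers diagnosed within this horizon")
    return sum(1 for i in cases if i in high) / len(cases)


# Hypothetical toy cohort of 20 participants; the top 10% is the 2 highest scores.
scores = [i / 20 for i in range(20)]
event_times = [None] * 20
event_times[19] = 0.5   # highest score, diagnosed in year 1
event_times[18] = 1.5   # second-highest score, diagnosed in year 2
event_times[10] = 2.5   # mid-range score, diagnosed in year 3
rates = {h: capture_rate(scores, event_times, h) for h in (1, 2, 3)}
```

Here the capture rate falls from 1.0 to 2/3 at the 3-year horizon because the later cancer arises outside the high-score group.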
INTERPRETATION: An ML tool that recognizes patterns in both temporal and spatial changes, as well as synergy among changes in nodule and non-nodule features, may be used to accurately guide clinical management after the next scheduled repeat screening CT.