Lasso logistic regression to derive workflow-specific algorithm performance requirements as demonstrated for head and neck cancer deformable image registration in adaptive radiation therapy

Head and Neck Cancer
25/06/2020

Weppler S, et al. Phys Med Biol 2020.

ABSTRACT

PURPOSE: As automation in radiation oncology becomes more common, it is important to determine which algorithms are equivalent for a given workflow. Often, algorithm comparisons are performed in isolation; however, clinical context can provide valuable insight into the importance of algorithm features and error magnification in subsequent workflow steps. We propose a strategy for deriving workflow-specific algorithm performance requirements.

METHODS: We considered two independent workflows, each indicating whether radiotherapy treatment replanning was needed, for 15 head and neck cancer patients (15 planning CTs, 105 on-unit CBCTs). Each workflow was based on a different deformable image registration (DIR) algorithm. Differences in DIR output were assessed using three sets of QA metrics: (1) conventional, (2) workflow-specific, and (3) a combination of (1) and (2). For a given set of algorithm metrics, lasso logistic regression modeled the probability of discrepant replan indications. Varying the minimum probability needed to predict a workflow discrepancy produced receiver operating characteristic (ROC) curves. ROC curves were compared using sensitivity, specificity, and the area under the curve (AUC). A heuristic was then used to derive simple algorithm performance requirements.
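The modeling step described above can be illustrated with a minimal sketch of L1-penalized (lasso) logistic regression fitted by proximal gradient descent. This is not the authors' implementation: the function names, the penalty weight `lam`, the step size, and the synthetic "metric" data are all assumptions made for illustration, and the abstract does not state which solver or software was used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, amount):
    # Proximal operator of the L1 penalty: shrink toward zero by `amount`.
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_lasso_logistic(X, y, lam=0.05, step=0.5, n_iter=500):
    """Minimise mean log-loss + lam * ||w||_1 by proximal gradient descent.

    The intercept b is not penalised. All defaults are illustrative.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then L1 shrinkage.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(X, w, b):
    # Modeled probability of a discrepant replan indication per case.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Synthetic stand-in data (NOT study data): 4 hypothetical QA metrics,
# of which only metric 0 influences whether the workflows disagree.
random.seed(0)
X, y = [], []
for _ in range(200):
    row = [random.gauss(0.0, 1.0) for _ in range(4)]
    label = 1 if random.random() < sigmoid(2.0 * row[0] - 0.5) else 0
    X.append(row)
    y.append(label)

w, b = fit_lasso_logistic(X, y, lam=0.05)
probs = predict_proba(X, w, b)
```

The L1 penalty drives the coefficients of uninformative metrics toward exactly zero, which is what makes the fitted model usable as a short list of candidate performance requirements. Thresholding `probs` at different minimum probabilities then yields the points of an ROC curve.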

RESULTS: Including workflow-specific QA metrics raised the AUC to 0.85, from 0.70 with conventional metrics alone. The derived algorithm performance requirements had a high sensitivity of 0.80, which is beneficial for replan assessments, and a specificity of 0.57. This was an improvement over a naïve application of conventional QA criteria, which had a sensitivity of 0.57 and a specificity of 0.68. In addition, the algorithm performance requirements suggested practical refinements of conventional QA tolerances, identified where auxiliary workflow processes should be standardized, and may be used to prioritize structures for manual review.
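For readers less familiar with the reported quantities, the sketch below shows one way sensitivity, specificity, and AUC can be computed from predicted probabilities by sweeping the decision threshold. The labels and probabilities are made-up toy values, not study data, and the helper names are hypothetical.

```python
def confusion_rates(labels, probs, threshold):
    """Sensitivity and specificity when predicting 1 if prob >= threshold."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(labels, probs):
    """(false positive rate, true positive rate) pairs, one per threshold."""
    points = [(0.0, 0.0)]  # threshold above every probability
    for t in sorted(set(probs), reverse=True):
        sens, spec = confusion_rates(labels, probs, t)
        points.append((1.0 - spec, sens))
    return points


def auc(points):
    """Trapezoidal area under an ROC curve given as ordered (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: 1 = workflows disagree on the replan indication.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

sens, spec = confusion_rates(toy_labels, toy_probs, threshold=0.5)
area = auc(roc_points(toy_labels, toy_probs))
```

Lowering the threshold trades specificity for sensitivity, which is the trade-off visible in the reported requirements (sensitivity 0.80, specificity 0.57) relative to the conventional criteria (0.57 and 0.68).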

CONCLUSIONS: Our algorithm performance requirements outperformed current comparison recommendations and provided practical means for ensuring workflow equivalence. This strategy may aid in trial credentialing, algorithm development, and streamlining expert adjustment of workflow output.