Med Phys. 2020 Sep 22. doi: 10.1002/mp.14467. Online ahead of print.
PURPOSE: To develop a tool for the automatic contouring of clinical target volumes (CTVs) and normal tissues for radiotherapy treatment planning in cervical cancer patients.
METHODS: An auto-contouring tool based on convolutional neural networks (CNNs) was developed to delineate 3 cervical CTVs and 11 normal structures (7 organs at risk [OARs] and 4 bony structures) for cervical cancer treatment. The tool is intended for use with the Radiation Planning Assistant, a web-based automatic plan generation system. A total of 2254 retrospective clinical computed tomography (CT) scans from a single cancer center and 210 CT scans from a segmentation challenge were used to train and validate the tool. Its accuracy was evaluated on 140 internal CT scans by calculating the Sørensen-Dice similarity coefficient (DSC), mean surface distance, and Hausdorff distance between the automatically generated contours and physician-drawn contours. In addition, a radiation oncologist scored the automatically generated contours on 30 external CT scans from three South African hospitals.
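The three agreement metrics named above can be illustrated with a minimal sketch. This is not the authors' evaluation code. The abstract does not say how surfaces were extracted, whether the mean surface distance was symmetric, or whether the Hausdorff distance was the maximum or a percentile, so the choices below are assumptions: surfaces are 4-connected boundary voxels, the mean surface distance is averaged over both directions, and the Hausdorff distance is the maximum. All function names are hypothetical, and the example uses small 2D masks stored as sets of (row, column) indices rather than 3D CT volumes.

```python
from math import dist


def dice(a, b):
    """Sørensen-Dice coefficient between two sets of voxel indices."""
    if not a and not b:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels of the mask with at least one 4-connected neighbour outside it."""
    return {
        (r, c)
        for (r, c) in mask
        if any(n not in mask for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)))
    }


def directed_distances(src, dst, spacing):
    """Distance from each voxel in src to its nearest voxel in dst, in physical units."""
    sy, sx = spacing
    return [
        min(dist((r * sy, c * sx), (q * sy, p * sx)) for (q, p) in dst)
        for (r, c) in src
    ]


def surface_metrics(a, b, spacing=(1.0, 1.0)):
    """Return (symmetric mean surface distance, maximum Hausdorff distance).

    Assumes both masks are non-empty. Brute force, so only suitable for small masks.
    """
    sa, sb = surface(a), surface(b)
    d_ab = directed_distances(sa, sb, spacing)
    d_ba = directed_distances(sb, sa, spacing)
    msd = (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
    hd = max(max(d_ab), max(d_ba))
    return msd, hd


def rectangle(r0, r1, c0, c1):
    """Filled rectangular mask covering rows r0..r1-1 and columns c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Toy example: a 4x4 "automatic" contour and a "manual" contour shifted by one voxel.
auto = rectangle(0, 4, 0, 4)
manual = rectangle(0, 4, 1, 5)

dsc = dice(auto, manual)                   # 0.75
msd, hd = surface_metrics(auto, manual)    # 0.5 and 1.0 in voxel units
print(dsc, msd, hd)
```

Passing the voxel spacing in centimeters, for example `spacing=(0.1, 0.1)`, would yield distances in the same units as those reported in the results. For full 3D volumes, a distance-transform-based implementation would be far more efficient than this brute-force search.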
RESULTS: The average DSC, mean surface distance, and Hausdorff distance of our CNN-based tool were 0.86/0.19 cm/2.02 cm for the primary CTV, 0.81/0.21 cm/2.09 cm for the nodal CTV, 0.76/0.27 cm/2.00 cm for the para-aortic nodal (PAN) CTV, 0.89/0.11 cm/1.07 cm for the bladder, 0.81/0.18 cm/1.66 cm for the rectum, 0.90/0.06 cm/0.65 cm for the spinal cord, 0.94/0.06 cm/0.60 cm for the left femur, 0.93/0.07 cm/0.66 cm for the right femur, 0.94/0.08 cm/0.76 cm for the left kidney, 0.95/0.07 cm/0.84 cm for the right kidney, 0.93/0.05 cm/1.06 cm for the pelvic bone, 0.91/0.07 cm/1.25 cm for the sacrum, 0.91/0.07 cm/0.53 cm for the L4 vertebral body, and 0.90/0.08 cm/0.68 cm for the L5 vertebral body. On average, 80% of the CTV contours, 97% of the organ-at-risk contours, and 98% of the bony structure contours in the external test dataset were clinically acceptable based on physician review.
CONCLUSIONS: Our CNN-based auto-contouring tool performed well on both the internal dataset (average DSC of 0.76 to 0.95 across structures) and the external dataset, where 80% to 98% of contours were clinically acceptable on physician review.