Automated Heuristic Optimization of Prostate VMAT Treatment Planning
1. Introduction
Plan generation is generally a manual procedure in which the planner tries to guide the treatment planning system (TPS) towards a favorable plan. Treatment planning has, however, become increasingly complex over the years, particularly regarding the number of organs at risk (OARs) that are included in the optimization of Volumetric Modulated Arc Therapy (VMAT) treatment plans. VMAT planning is complex because it involves a number of dynamic delivery parameters that need to be sequenced with each other. Several VMAT optimization algorithms are in use [1] [2] [3] to choose the most suitable dose distribution. These algorithms employ cost functions to measure the plan quality and to select a good solution. However, the quality of the final plan depends on the planner’s skills and experience [4] [5] [6] and on the time allotted. It is therefore essential to develop automatic planning tools in order to reduce the workload, to produce more consistent plans and to optimize plan quality.
Various approaches have been developed for the automatic generation of optimal treatment plans, such as the lexicographic-inspired approach [7] based on the fulfillment of prioritized clinical goals, already implemented in the Plan Explorer module of RayStation (RaySearch Laboratories AB, Stockholm, Sweden), atlas-based planning [8] [9], ideal dose distribution estimation [10] [11] and the progressive optimization algorithm used in Pinnacle3 Auto-Planning (Philips Medical Systems, Fitchburg, WI) [12] [13] [14] [15].
Genetic algorithms (GAs) [16] [17] are commonly used to generate high-quality solutions for optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection. In radiotherapy treatment planning, GAs have been used to optimize both beam weights and importance factors for three-dimensional forward planning [18] [19] and to find an optimal solution in inverse planning, in terms of selecting the optimal plan or beam orientation [20] [21] [22]. The aim of this work was to develop a script using a GA to create a good quality treatment plan with the RayStation TPS. The script was retrospectively applied to a sample of 10 VMAT prostate patients and compared both with a commercial automatic planning solution and with a manual one.
2. Materials and Methods
2.1. Genetic Algorithms
Genetic algorithms are heuristic optimization algorithms inspired by the principle of natural selection and biological evolution. GAs are local and stochastic search methods based on a biological metaphor: they work on a population of potential solutions and apply the principle of survival of the fittest, so that the population evolves towards the most suitable solution to the problem. A fitness function (FF) is defined to evaluate the solutions. At each generation, new sets of solutions are created by applying the genetic operators (mutation, recombination by crossover) to a subset of selected members of the population. The process leads to an evolution that best fits the environment and to the most suitable set of solutions for solving the underlying problem.
In this study a genetic algorithm-based script was developed for prostate tumor plans. The script was implemented in the RayStation treatment planning system (r.v.5.99) using Python code. Two different clinical prescriptions were considered: 78 Gy prescribed to the planning target volume (PTV) in 39 fractions (GROUP 1) and a simultaneous integrated boost (SIB) (70.2 Gy to the prostate bed and 61.1 Gy to the seminal vesicles) in 26 fractions (GROUP 2). The script automatically creates beams, prescription doses and auxiliary regions of interest (ROIs), and optimizes the doses to the PTV and OARs according to the genetic algorithm. All plans were generated with the VMAT technique and planned for an Elekta Axesse™ linear accelerator (4 mm leaf width at isocenter).
The chromosomes of the algorithm were the max equivalent uniform dose (MaxEUD) functions, which were applied to the ROIs created by subtraction between rectum and PTV (Rectum1) and between bladder and PTV (Bladder1).
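The ROI subtraction described above can be illustrated with a minimal sketch. It assumes each ROI is represented as a set of voxel indices; this is a toy representation chosen for illustration, not the RayStation scripting interface, and the function names `subtract_roi` and `overlap_percentage` are hypothetical.

```python
def subtract_roi(organ_voxels, ptv_voxels):
    """Return the organ voxels lying outside the PTV (e.g. Rectum1, Bladder1)."""
    return organ_voxels - ptv_voxels


def overlap_percentage(organ_voxels, ptv_voxels):
    """Percentage of the original organ volume that overlaps the PTV."""
    if not organ_voxels:
        return 0.0
    return 100.0 * len(organ_voxels & ptv_voxels) / len(organ_voxels)


# Toy 1-D "anatomy": integer indices stand in for 3-D voxel coordinates.
rectum = set(range(0, 10))   # 10 voxels
ptv = set(range(8, 20))      # overlaps the last 2 rectum voxels

rectum1 = subtract_roi(rectum, ptv)
rectum_overlap = overlap_percentage(rectum, ptv)
```

With equal-sized voxels, the number of voxels in a set is proportional to the ROI volume, so the same two quantities (volume and overlap percentage) can be read directly from the sets.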
Figure 1 shows the workflow of the genetic algorithm. An initial value of MaxEUD was selected by a third-degree polynomial function of two variables (the volume of the ROI and the percentage of overlapping volume between the original ROI and the PTV). The function coefficients were derived from a baseline dataset of 50 patients already planned at the University of Turin radiotherapy clinic. Starting from these values, the initial range of MaxEUD was created by adding and subtracting 10 Gy ([MaxEUD ± 10 Gy]). Ten couples of Rectum1 and Bladder1 MaxEUD values were generated randomly ([Rectum1 MaxEUD; Bladder1 MaxEUD]). For each couple, a plan was generated with 20 optimization iterations and scored by the FF; coverage of the PTV was ensured by a series of optimization parameters with the highest weight inside the objective function. The fitness function was defined as
(1)
where CI and PD are, respectively, the conformity index (calculated as the ratio between the ROI volume covered by the isodose and the total isodose volume) and the prescribed dose to the PTV, while rAD, bAD, lfAD and rfAD are the average doses to the rectum, bladder, left femoral head and right femoral head, respectively. The plan with the best FF (the lowest value) was saved and archived. Eight couples were then randomly selected from the initial ten; a randomly selected gender (female or male) and a random number n in [0; 1] were assigned to each of them. If n was less than 0.1 (the mutation probability), one of the two elements of the couple was randomly changed within the initial range. The eight couples then evolved by crossover, creating eight new couples. Two new random couples were added to these eight (for a total of ten) and ten more plans were generated from the new couples. This workflow was repeated four times (for a total of 5 cycles). At the end, five plans were selected from the best couples ([Rectum1 MaxEUD; Bladder1 MaxEUD]). The best plan in terms of FF was selected and the final plan was calculated (0.3 cm dose grid and 40 optimization iterations).
Figure 1. Workflow of the genetic algorithm of the script.
We selected ten initial couples and 5 cycles as a compromise between total calculation time (approximately half an hour on a PC with an Intel Xeon E5-2630 v3 CPU @ 2.4 GHz and 64 GB of RAM) and plan quality.
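The workflow described in this section can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the script used in the study: `score_plan` is a hypothetical callable standing in for the TPS optimization plus the fitness function (whose expression is not reproduced here), genders are balanced by shuffling and splitting the eight parents in half, and crossover is assumed to swap the Bladder1 value between a female and a male couple. The population size, number of cycles, mutation probability and ±10 Gy range are taken from the text.

```python
import random

N_COUPLES = 10        # couples per cycle (from the text)
N_PARENTS = 8         # couples selected for crossover (from the text)
N_CYCLES = 5          # total number of cycles (from the text)
P_MUTATION = 0.1      # mutation probability (from the text)
HALF_RANGE_GY = 10.0  # initial range is [MaxEUD - 10 Gy, MaxEUD + 10 Gy]


def random_couple(ranges, rng):
    """Draw one [Rectum1 MaxEUD, Bladder1 MaxEUD] couple inside the ranges."""
    return [rng.uniform(lo, hi) for lo, hi in ranges]


def next_generation(couples, ranges, rng):
    """Select eight couples, mutate, cross over, then add two random couples."""
    parents = [list(c) for c in rng.sample(couples, N_PARENTS)]
    for couple in parents:
        if rng.random() < P_MUTATION:
            gene = rng.randrange(2)  # mutate the rectum or the bladder value
            couple[gene] = rng.uniform(*ranges[gene])
    # Assumption: balanced genders and a value-swapping crossover.
    rng.shuffle(parents)
    half = N_PARENTS // 2
    females, males = parents[:half], parents[half:]
    children = []
    for f, m in zip(females, males):
        children.append([f[0], m[1]])
        children.append([m[0], f[1]])
    newcomers = [random_couple(ranges, rng)
                 for _ in range(N_COUPLES - N_PARENTS)]
    return children + newcomers


def run_ga(initial_max_eud, score_plan, seed=0):
    """Return the best (score, couple) overall and the best one of each cycle."""
    rng = random.Random(seed)
    ranges = [(max(0.0, v - HALF_RANGE_GY), v + HALF_RANGE_GY)
              for v in initial_max_eud]
    couples = [random_couple(ranges, rng) for _ in range(N_COUPLES)]
    best_per_cycle = []
    for cycle in range(N_CYCLES):
        scored = [(score_plan(c), c) for c in couples]
        best_per_cycle.append(min(scored, key=lambda item: item[0]))
        if cycle < N_CYCLES - 1:
            couples = next_generation(couples, ranges, rng)
    best = min(best_per_cycle, key=lambda item: item[0])
    return best, best_per_cycle


# Hypothetical stand-in score (lower is better), NOT the study's fitness function.
def toy_score(couple):
    return couple[0] + couple[1]


best, history = run_ga([40.0, 45.0], toy_score)
```

In a real script, `score_plan` would trigger the 20-iteration optimization for the given couple and evaluate the fitness function on the resulting dose, which is where almost all of the calculation time is spent.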
2.2. Plans Evaluation and Comparison
Ten patients were selected (five patients for GROUP 1 and five patients for GROUP 2, Table 1) and three plans were generated for each of them: the first automatic plan was created by the GA, the second automatic plan by the Auto-Planning module (AP) of Pinnacle3 and the third, manual plan by the Monaco treatment planning system (M) (v.5.0, Elekta AB, Stockholm, Sweden).
Table 1. Demographic and clinical characteristics of the patients involved in the study.
The plans were evaluated with a total score (TS) generated by PlanIQ software (Sun Nuclear Corp, Melbourne, FL), which measures treatment plan quality with metrics that reflect clinical goals in terms of the CI of the PTV and the clinical constraints of the OARs, by means of the quantitative Plan Quality Metric (PQM) formalism described in [6] [14]. Table 2 reports the objectives for the target and OARs and the total maximum achievable score for both groups; these scores derive from an adaptation of the prostate protocol proposed by PlanIQ and of our clinical constraints. All plans were loaded on the same platform (RayStation) and an experienced Radiation Oncologist was asked to assign a clinical score (CS) ranging from 0 to 5 to each plan; a score of at least 3 was required to consider a plan acceptable for delivery. Three evaluations of each plan were performed at different times and the average value was taken as reference, thereby accounting for the intra-evaluator variability. An ANOVA test between GA, M and AP was carried out for all parameters. When the p-value was significant, a Fisher-Hayter test for paired samples (GA and M, AP and M, GA and AP) was applied at the 5% significance level.
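The scoring and comparison steps can be sketched as follows. This is a minimal illustration that assumes a plain one-way ANOVA (the exact ANOVA design used in the study is not specified here) and computes only the F statistic; the p-value and the Fisher-Hayter post-hoc test are not shown. The function names and all numbers are hypothetical and are not study data.

```python
from statistics import mean

ACCEPTANCE_THRESHOLD = 3.0  # minimum clinical score for an acceptable plan


def clinical_score(evaluations):
    """Average the repeated evaluations of one plan and flag acceptability."""
    cs = mean(evaluations)
    return cs, cs >= ACCEPTANCE_THRESHOLD


def one_way_anova_f(groups):
    """Return the F statistic and its degrees of freedom (between, within)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative numbers only: three evaluations of one plan,
# and one score per patient for three planning methods.
cs, acceptable = clinical_score([3, 4, 4])
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The F statistic would then be compared with the critical value of the F distribution for the returned degrees of freedom at the chosen significance level.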
3. Results
All plans were compared in terms of conformity index and of the different OAR constraints, given that target coverage and homogeneity were achieved by all methods as recommended by ICRU 83. Table 3 reports the PlanIQ scores, the clinical scores and the results of the statistical analysis. Figure 2 and Figure 3 show the PlanIQ scores of some relevant metrics for both groups of patients. Figure 4 shows examples of the axial, coronal and sagittal dose distributions of the three plans for one of the GROUP 2 patients.
Table 2. Objectives and corresponding score of PlanIQ functions for PTV and OARs.
In GROUP 1, Auto-Planning had the highest total score (TS) for four patients out of five, with a minimal difference from GA (less than 3%), while for patient 2 GA had the highest value; the mean TS values were 150.6 ± 30.7, 146.3 ± 36.1 and 137.4 ± 35.7 for AP, GA and M respectively, with no significant difference. In terms of CS, the highest value was attributed to GA in four patients out of five.
For the second group of patients (GROUP 2), TS values were higher for GA in three out of five patients, whereas for patient 3 GA had the lowest value among the planning systems. In particular, this low value is mainly due to V58Gy, which exceeds the acceptable value. In this case too, CS was able to detect the difference, giving the lowest value to GA. The mean TS values were 163.5 ± 16.8, 163.4 ± 24.7 and 162.9 ± 16.6 for AP, GA and M respectively. No significant differences were reported by the ANOVA test for any of the considered parameters, including CS, in either group.
Table 3. PlanIQ scores, clinical score and results of statistical analysis obtained for plans generated with the genetic algorithm-based script of RayStation (GA), Monaco TPS (M) and Pinnacle3 Auto-Planning module (AP) for both groups.
Figure 2. Total Score (a), PTV conformity index (b), average dose (AD) of bladder (c) and rectum (d) for the first group of patients (GROUP 1) obtained with the genetic algorithm-based script of RayStation (GA), Monaco TPS (M) and Pinnacle3 Auto-Planning module (AP).
Figure 3. Total Score (a), PTV conformity index (b), average dose (AD) of bladder (c) and rectum (d) for the second group of patients (GROUP 2) obtained with the genetic algorithm-based script of RayStation (GA), Monaco TPS (M) and Pinnacle3 Auto-Planning module (AP).
Figure 4. Isodoses and DVHs representing a single SIB treatment plan for GA (a), Pinnacle3 Auto-Planning (b) and Monaco (c).
4. Discussion
The automation of inverse treatment planning optimization in radiation oncology has become an active research topic in recent years. The aim of our study was to develop a fully automated process for VMAT prostate treatment planning, based on a stochastic search for an optimal solution in terms of tumor control and sparing of normal tissue. The scripting capabilities of RayStation facilitate the development of a clinic-specific approach to automatic planning. Python scripts are capable of reading and entering all dosimetric data, thus enabling the user to produce high-quality plans automatically; however, a plan quality assessment should be carried out in order to compare the quality of the manually generated plans with that of the automatic plans.
In this study, manual plans were created and, given the seven-year experience with the Monaco TPS in the department, their overall quality is expected to be sufficiently high for carrying out a comparison. We also decided to use an evaluation version of a commercial system, Pinnacle3 Auto-Planning, equipped with a progressive optimization algorithm. It was employed to produce the best deliverable treatment plan for each patient. Other authors have observed that Auto-Planning software appears to be a useful tool for increasing the overall quality of the treatments and reducing the inter-observer variation present in manually created plans [12] [15]. It is important to note the inherent complexity of a plan comparison study, due to the different plan optimization strategies within any department. With PlanIQ software, we were able to obtain an independent score by means of quantitative scorecards, which makes it an objective evaluation tool. However, different results may be obtained with different metrics reflecting other specific clinical goals; therefore, every result of a planning comparison must be evaluated taking into account the uncertainty related to the metric. We also added a clinical evaluation of each plan by a Radiation Oncologist to make the comparison more robust.
GROUP 1, with a high dose (78 Gy) also prescribed to the seminal vesicles, may be considered more challenging than GROUP 2 because lower values of TS are obtained. This was done intentionally, to stress the optimization algorithms as much as possible. TS differences between the manual and automatic plans are more pronounced for GROUP 1 than for GROUP 2. Patient 3 was the most challenging patient in terms of anatomical intersection between target and organs at risk; indeed, the TS values are the lowest for both GROUP 1 and GROUP 2. It is interesting to note the agreement of CS with TS for the best plan in patient 3 as well as for the worst plan in patient 4, where the M solution has a TS 20% lower than AP and GA. Also, for GROUP 2, the worst performance of GA in patient 3 in terms of TS (15% lower than AP and M) was detected by CS, giving more robustness to the PQM evaluation.
From a statistical point of view, the plans of all the platforms involved did not present significant differences; in the context of a relatively simple optimization problem, as a prostate plan may be considered, the genetic algorithm satisfied the “no difference” intent compared to two well-known, robust commercial solutions represented by AP and M. From a clinical point of view, it was considered the best solution in 8 patients out of 10; the Radiation Oncologist involved had a very good impression in terms of the shape of the isodoses as well as of the visual and numerical inspection of the DVH, which is the current way of deciding the best plan in our clinical routine.
However, more Radiation Oncologists should be involved, both in the definition of different PQMs and in the evaluation of the quality of the plans, in order to estimate the uncertainty related to the evaluation process.
5. Conclusion
This preliminary study shows that it is feasible to use the heuristic approach to generate good quality plans in the context of VMAT prostate treatment planning, with no significant difference compared to Pinnacle3 Auto-Planning, a commercial solution that provides very good results in terms of plan quality and robust automation. Studies are underway to determine whether genetic algorithms can be used for other sites, as they appear to be a promising tool for automated inverse planning.