Comparison of Multimodality Image-Based Volumes in Preclinical Tumor Models Using In-Air Micro-CT Image Volume as Reference Tumor Volume

Purpose: Changes in tumor volume are used for therapy response monitoring in preclinical studies. Unlike prior studies, this article introduces in-air micro-computed tomography (micro-CT) image volume as reference tumor volume in rodent tumor models. Tumor volumes determined using imaging modalities such as magnetic resonance imaging (MRI), micro-CT and ultrasound (US), and with an external caliper, are compared with the reference tumor volume.
Materials and Methods: In vivo MR, US and micro-CT imaging was performed 4, 6, 9, 11 and 13 days after tumor cell inoculation into nude rats. On the day of the imaging study, in vivo caliper measurements were also made. After in vivo imaging, tumors were excised, followed by in-air micro-CT imaging and ex vivo caliper measurements of the excised tumors. The in-air micro-CT image volume of excised tumors was taken as the reference tumor volume. Tumor volumes were then calculated using the formula V = (π/6) × a × b × c, where a, b and c are the maximum diameters in three perpendicular dimensions determined by the three imaging modalities and the caliper, and compared with the reference tumor volume by linear regression analysis as well as Bland-Altman plots.
Results: The correlation coefficients (R²) of the regression lines for in vivo tumor volumes measured by the three imaging modalities were 0.9939, 0.9669 and 0.9806 for MRI, US and micro-CT, respectively. For caliper measurements, the coefficients were 0.9274 and 0.9819 for in vivo and ex vivo caliper, respectively. In Bland-Altman plots, the average tumor volume difference from the reference tumor volume (bias) was significant for caliper and micro-CT, but not for MRI and US.
Conclusion: Using the in-air micro-CT image volume as reference tumor volume, tumor volume measured by MRI was the most accurate among the three imaging modalities. In vivo caliper volume measurements were unreliable, while ex vivo caliper measurements reduced errors.


Introduction
The treatment efficacy of anti-tumor drugs is often assessed by measuring changes in tumor size, as tumor size is thought to correlate with the number of viable tumor cells remaining in a tumor mass [1]. Radiologic assessment of tumor size has been extensively performed using one-dimensional measurement (RECIST) or area calculated from two linear dimensions (WHO criteria), which are the only United States Food and Drug Administration (US FDA) approved imaging biomarkers [2] [3]. Yet one- or two-dimensional measurements often mathematically misrepresent the change in tumor size by disregarding the second and/or third dimensions, leading to biased measurement results [4]. For more accurate assessment of tumor burden, volume can be calculated from two or three linear dimensions using a variety of equations [5] or other volumetric techniques [4] [6] [7]. Recently, a National Institutes of Health (NIH) study introduced combined density and volume assessment criteria to accurately reflect tumor burden and to predict cancer survival by using automated density and volume application (ADaVA) [8].
Unlike in the clinic, most preclinical anti-tumor drug studies use the caliper technique for in vivo volume measurement of subcutaneous tumors since this technique is convenient, cost-effective and non-invasive [9] [10]. However, the technique readily overestimates tumor size due to skin thickness, hair, edema and obesity [7] [11]-[13]. Inconsistent measurements by insufficiently trained personnel can also cause variable errors. In addition, available tumor volume equations do not necessarily represent actual tumor shape. Moreover, deeply located tumors cannot be measured using this technique. Thus, it has been proposed that the weight of an excised tumor is the most consistent and reproducible reflection of its volume (correlation coefficient ≈ 1.0000) [1] [7] [10]. Yet determining a more accurate tumor volume from the weight would require accurate density information for the particular tumor type, even though density does not vary greatly between one tumor type and another [7] [10].
Recently, Fullerton et al. confirmed the high accuracy of in-air micro-computed tomography (micro-CT) image volume (i.e. the volume from a micro-CT image of a sample surrounded by air), obtained by a voxel counting method, using National Institute of Standards and Technology (NIST) traceable cylindrical phantoms [14]. Micro-CT resolution is typically at the micron level [15]. When an excised tumor is placed in a high-contrast radiographic background such as air, tumor size on the order of ten microns can be readily measured. This high contrast between tumor and air allows for accurate voxel counting when calculating tumor volume in the micro-CT images [14]. In addition, in comparison to magnetic resonance imaging (MRI), micro-CT is more cost-effective [16] and allows for much faster scan times, which can prevent dehydration of excised tumors. Foster and Ford used a similar technique (algorithm-based volume calculation of excised tumors submerged in ethyl alcohol, exploiting the contrast between tumors and ethyl alcohol) to avoid inconsistency in contouring tumors [17], and their study supports Fullerton et al.'s methodology. Other researchers have used micro-CT to determine the image volume of tumors after processing for histology [16]-[18]. However, histologic processing is known to decrease the linear dimensions of tissue by 10% to 15% and volume by 27% to 38% [18]-[20].
Imaging modalities such as ultrasound (US), micro-CT, MRI or micro-Positron Emission Tomography (micro-PET) have often been used to monitor chemotherapeutic response to cancer treatment in preclinical tumor models [7] [10] [21]-[24]. Jensen et al. compared tumor volume by micro-CT with micro-PET and caliper, and showed that micro-CT was more accurate than micro-PET or caliper [7]. These results match Ishimori et al.'s findings on sequential CT scanning versus caliper measurements [12]. Ayers et al. reported that US imaging was better than caliper for tumor volume measurement [10]. Validation studies by Martiniova et al. demonstrated that the accuracy of T2-weighted MRI volume measurements was within about 8% [22]. In the present study, unlike the prior studies, the in-air micro-CT image volume of excised tumors was calculated as the reference tumor volume using the methodology previously shown by Fullerton et al. [14]. Image-based volumes from three commonly used imaging modalities (US, micro-CT and MRI) were then quantified and compared against this reference tumor volume in rodent tumor models. In addition, we used the same data set to compare tumor volume determined by the caliper method with the reference tumor volume.

Tumor Cell Inoculation
The animal studies were conducted in accordance with federal guidelines and approved by our institutional animal care and use committee. For all procedures the rats were anesthetized with isoflurane inhalation (2%) in 100% oxygen. Twenty male rnu/rnu nude rats (3 - 4 weeks old, 75 - 100 g; Harlan, Indianapolis, IN) were inoculated subcutaneously with squamous cell carcinoma (SCC4) tumor cells (ATCC, Manassas, VA) in 0.20 ml of saline on the dorsum at the level of the neck as previously described by Bao et al. (2006) [25]. One rat failed to form a tumor, giving a tumor take rate of 95%. The nineteen tumor-bearing rats were divided into five groups (each group had 4 rats except for group 4, which had 3 rats). Groups 1, 2, 3, 4 and 5 were studied 4, 6, 9, 11 and 13 days after tumor cell inoculation, respectively, to provide a range of tumor sizes based on SCC4 tumor growth characteristics determined by our laboratory. A workflow of the studies is shown in Figure 1.

Caliper Measurement of In Vivo Tumors
When each tumor group reached the determined day after inoculation, the tumor imaging measurement studies were initiated. Immediately before MR image acquisition, in vivo tumor dimensions were determined under anesthesia using a NIST traceable digital caliper (Fisher Scientific, Pittsburgh, PA) with an accuracy of 0.02 mm and a resolution of 0.01 mm by measuring the length (the longest dimension), width (the dimension perpendicular to the length in the plane parallel to the skin surface), and height of each tumor [5]. Then tumor volume was calculated using the formula V = (π/6) × length × width × height.
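The ellipsoid formula given in the abstract, V = (π/6) × a × b × c, can be sketched in a few lines. This is a minimal illustration only: the function name and the example diameters are hypothetical and are not data from this study.

```python
import math


def ellipsoid_volume(a_mm, b_mm, c_mm):
    """Volume (mm^3) of an ellipsoid from three perpendicular diameters (mm).

    Implements V = (pi / 6) * a * b * c, the formula stated in the abstract.
    """
    if min(a_mm, b_mm, c_mm) <= 0:
        raise ValueError("diameters must be positive")
    return math.pi / 6.0 * a_mm * b_mm * c_mm


# Hypothetical caliper readings (length, width, height) in mm.
volume_mm3 = ellipsoid_volume(10.0, 8.0, 6.0)
print(f"{volume_mm3:.1f} mm^3")
```

For three equal diameters the formula reduces to the volume of a sphere, which is a convenient sanity check on any implementation.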

US Image Acquisition of In Vivo Tumors
US imaging of tumor-bearing rats was performed under anesthesia using the Vevo 770 unit with a 35 MHz transducer (Visualsonics Inc., Toronto, ON, Canada). Following hair removal, using B-mode, the plane with the greatest diameter (width) of each tumor in the transverse plane (the plane perpendicular to the length of the rat) was selected by moving the transducer, and the width and depth of each tumor were measured. The transducer was then rotated by 90° and the plane with the greatest diameter (length) of each tumor was selected to measure the length of each tumor. Raw data were recorded and stored in the US system. Average scan time per rat was 10 min.

Micro-CT Image Acquisition of In Vivo Tumors
Micro-CT imaging of tumor-bearing rats was performed using the FLEX X-O micro-CT unit (Gamma Medica-Ideas (now Trifoil Imaging), Northridge, CA). Under anesthesia, CT projection images of the rats were acquired in fly mode as raw data using 75 kVp tube voltage, 370 µA tube current, 256 X-ray projections equally distributed over a 360° image acquisition angle, and a 1024 × 1024 square image matrix (resolution: 160 µm × 160 µm × 160 µm). Average scan time per rat was 2 min. The 256 X-ray projections were reconstructed to generate a 512 × 512 × 512 three-dimensional (3D) CT data set using the COBRA software (EXXIM Computing Corporation, Pleasanton, CA), and the 3D CT data were converted to the DICOM format using Volumetric Image Visualization Identification and Display (VIVID) software (Gamma Medica-Ideas, Northridge, CA).

Weighing, Caliper Measurement and In-Air Micro-CT Image Acquisition of Excised (Ex Vivo) Tumors
After in vivo micro-CT image acquisition, anesthetized rats were sacrificed by cervical dislocation and tumors were excised from the rats and trimmed of surrounding tissues. Tumors were immediately weighed on an Ohaus GA 2000 electronic balance (Ohaus Corp., Pine Brook, NJ). The precision and accuracy of the balance were confirmed by three repeated measurements of a 1.00000 g NIST traceable standard mass, giving 1.0003 ± 0.0001 g (all empirical values reported as mean ± standard deviation). Then caliper measurements of excised tumors were made in three dimensions and volumes were calculated.
In-air micro-CT images of excised tumors were obtained using the FLEX X-O micro-CT unit. Excised tumors were mounted on a radiolucent Styrofoam holder of 5 cm × 7 cm × 2.5 cm (Figure 2(a)). The tumor and the Styrofoam holder were completely sealed using clear plastic wrap (Anchor Packing Co., Fenton, MO) to prevent dehydration of excised tumors during micro-CT scans. Then micro-CT images of excised tumors were acquired in DICOM format using the same imaging protocols as described for in vivo tumors. The time between tumor excision and completion of the micro-CT scan was less than 10 min, and it was assumed that no tissue dehydration occurred during that time.

Volume Calculation of Excised Tumors on In-Air Micro-CT Image as Reference Tumor Volume
In-air micro-CT image volume of the excised tumors was calculated using an automatic voxel counting method in Analyze software (Version 6.0, Mayo Clinic, Rochester, MN). First, window width and level were set as follows. Second, a region of interest (ROI) surrounding the tumor was defined. Third, the number of voxels inside the ROI was automatically counted and volume was calculated by multiplying the number of voxels by the voxel size. In-air micro-CT image volume of excised tumors was defined as "Reference Tumor Volume". Then mean density for excised tumors in all groups was calculated by plotting mass as a function of in-air micro-CT image volume of excised tumors using GraphPad Prism 5 software (Version 5.04, GraphPad Software Inc., San Diego, CA).
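The voxel counting step (number of voxels inside the ROI multiplied by the voxel size) can be illustrated as follows. This is a sketch under stated assumptions, not the Analyze software's implementation: the binary mask, the function name and the toy data are hypothetical, and the 0.16 mm isotropic voxel is taken from the resolution quoted for the micro-CT protocol.

```python
def voxel_count_volume(mask, voxel_size_mm):
    """Volume (mm^3) of the voxels flagged 1 in a 3D nested-list mask.

    mask          -- nested lists indexed [slice][row][column], values 0 or 1
    voxel_size_mm -- (dx, dy, dz) edge lengths of one voxel in mm
    """
    dx, dy, dz = voxel_size_mm
    n_voxels = sum(flag for plane in mask for row in plane for flag in row)
    return n_voxels * dx * dy * dz


# Toy 3 x 3 x 3 mask containing a 2 x 2 x 2 block of "tumor" voxels
# (hypothetical data; a real mask would come from the ROI in the image).
mask = [
    [[1 if (i < 2 and j < 2 and k < 2) else 0 for k in range(3)]
     for j in range(3)]
    for i in range(3)
]

# Assumes the 160 um isotropic voxel quoted for the micro-CT acquisition.
volume_mm3 = voxel_count_volume(mask, (0.16, 0.16, 0.16))
print(f"{volume_mm3:.6f} mm^3")
```

Because the result scales with the cube of the voxel edge, the voxel dimensions should be read from the image header rather than assumed when applying this to real data.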

Volume Measurement of In Vivo Tumors on US, Micro-CT and MR Images
Tumor size was measured in three perpendicular dimensions on raw data images saved during US scanning. The longest diameter (width) and the longest perpendicular diameter (depth) of each tumor were measured in the transverse plane perpendicular to the length of the rats using a measurement tool in Vevo 770 v.2.2.3 software. In the sagittal plane parallel to the length of the rats, the longest diameter (length) perpendicular to the two diameters was measured.
Tumor size was measured in three perpendicular dimensions on MR and micro-CT images using a Full Width at Half Maximum (FWHM) method in ImageJ software (Version 1.42q, National Institutes of Health, Bethesda, MD). ROIs (area: about 1.00 cm²) were designated on tumor and background (soft tissue next to tumor) and the mean signal intensities (SI_tumor and SI_background) were recorded. Window width and level were set to SI_tumor − SI_background and (SI_background + SI_tumor)/2, respectively. In the transverse plane perpendicular to the length of the rats, the longest diameter (width) and the longest perpendicular diameter (depth) of tumors were measured. Likewise, the longest diameter (length) perpendicular to the two diameters was measured in the sagittal plane. In the micro-CT images of tumor-bearing rats, the tumor margins were estimated by contouring the tumors, and the longest perpendicular diameters (width and depth) within the margins were measured accordingly. US, MR and micro-CT image-based volumes were calculated for tumors in each group using the same equation as used for caliper-based volume.
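The FWHM window settings described above can be written out explicitly. In the sketch below, the window width and level follow the definitions in the text; measuring a diameter by counting samples at or above the half-maximum level along a line profile is an illustrative assumption (the measurements in this study were made in ImageJ), and all names and numbers are hypothetical. It also assumes the tumor is brighter than the background.

```python
def fwhm_window(si_tumor, si_background):
    """Return (window width, window level) for the FWHM display setting."""
    width = si_tumor - si_background
    level = (si_background + si_tumor) / 2.0
    return width, level


def diameter_from_profile(profile, level, pixel_mm):
    """Longest contiguous run of samples at or above `level`, in mm."""
    longest = run = 0
    for value in profile:
        run = run + 1 if value >= level else 0
        longest = max(longest, run)
    return longest * pixel_mm


# Hypothetical mean ROI intensities and a hypothetical line profile.
width, level = fwhm_window(si_tumor=200.0, si_background=100.0)
profile = [100, 110, 160, 200, 205, 190, 155, 120, 100]
diameter_mm = diameter_from_profile(profile, level, pixel_mm=0.2)
print(width, level, diameter_mm)
```

With these settings the display window spans exactly from the background mean to the tumor mean, so the window level coincides with the half-maximum threshold.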

Statistical Analysis
Statistical analysis was performed using GraphPad Prism 5 software. In vivo tumor volumes determined by caliper, MRI, US and micro-CT were compared with reference tumor volume by linear regression plots. In addition, caliper measurements of in vivo tumors and caliper measurements of excised (ex vivo) tumors were compared using a Student's t-test to determine whether there was a statistically significant difference between the two caliper measurements. A p value less than 0.05 was considered statistically significant.
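For readers who want to reproduce this kind of comparison outside GraphPad Prism, an ordinary least-squares fit of measured volume against reference volume can be sketched as below. The function name and the example volumes are hypothetical and are not the study data.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical reference and measured volumes in mm^3.
reference = [50.0, 150.0, 300.0, 600.0, 900.0]
measured = [60.0, 170.0, 310.0, 650.0, 980.0]
slope, intercept, r_squared = linear_fit(reference, measured)
print(f"slope={slope:.3f} intercept={intercept:.2f} R^2={r_squared:.4f}")
```

A slope above 1 indicates that the modality overestimates proportionally to tumor size, while a non-zero intercept indicates a constant offset.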
Agreement between the caliper-based, MR, US and micro-CT image-based volumes and the reference tumor volume was also analyzed by Bland-Altman plots (plots of volume difference against mean of volumes), where the central line (mean of volume difference) and the other two lines (±2 × standard deviation) indicate the bias and the limits of agreement (LoA), respectively [7] [26]. The 95% confidence interval (CI) on the bias was calculated to determine whether the bias was significant: if 0 was excluded from the 95% CI, the bias was considered significant [7].
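The Bland-Altman quantities defined above (bias, limits of agreement at ±2 standard deviations, and a 95% CI on the bias) can be computed as in the following sketch. Basing the CI on the t distribution, with the critical value passed in by the caller, is an assumption; the text does not state how the interval was obtained. The example data are hypothetical.

```python
import math


def bland_altman(measured, reference, t_crit):
    """Bias, limits of agreement and CI on the bias for paired volumes.

    t_crit is the two-sided 95% critical value of the t distribution for
    n - 1 degrees of freedom (assumed method; e.g. 3.182 for n = 4).
    """
    diffs = [m - r for m, r in zip(measured, reference)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    loa = (bias - 2.0 * sd, bias + 2.0 * sd)      # limits of agreement
    half_width = t_crit * sd / math.sqrt(n)       # CI half-width on the bias
    ci = (bias - half_width, bias + half_width)
    significant = not (ci[0] <= 0.0 <= ci[1])     # 0 excluded from the CI
    return {"bias": bias, "sd": sd, "loa": loa, "ci": ci,
            "significant": significant}


# Hypothetical volumes in mm^3 (four tumors, so 3 degrees of freedom).
result = bland_altman([12.0, 14.0, 16.0, 18.0], [10.0, 10.0, 10.0, 10.0],
                      t_crit=3.182)
print(result)
```

The limits of agreement describe the spread of individual differences, whereas the CI describes the uncertainty of the mean difference, so a modality can have non-significant bias and still show wide limits of agreement.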

Multimodality Imaging of In Vivo Tumors for Image-Based Volume
In vivo tumor images for four rats in group 3 (average tumor volume: 319.68 mm³) acquired by MRI, US and micro-CT are shown in Figure 3. Tumors appeared with higher signal than surrounding tissues in MR and US images (Figure 3(a) and Figure 3(b)). The contrast between tumors and surrounding tissues was adequate to measure tumor size without contrast agent. CT images (Figure 3(c)) had poor soft tissue contrast, as expected, and tumor margins were estimated for segmentation of tumor from adjacent soft tissue.

In-Air Micro-CT Image Volume as Alternative Gold Standard
Excellent contrast between tumor and air in the in-air micro-CT images allowed for accurate voxel counting in the tumor mass (Figure 2(b)). The calculated in-air micro-CT image volumes (i.e. micro-CT image-based volumes of ex vivo tumors), used as reference tumor volumes, are listed in Table 1. Mass (g) was plotted as a function of the in-air micro-CT image volume (cm³) of excised tumors in all groups, along with a linear regression line, in Figure 4. Mean density of excised tumors was 1.079 g/cm³ from the slope of the regression line.
Linear regression results against the reference tumor volume are shown in Figure 5 and Table 2. For in vivo caliper versus reference tumor volume, the best-fit line was Y = (1.056 ± 0.072) × X + (42.96 ± 29.25) (R² = 0.9274; p < 0.0001) (Figure 5(a) and Table 2). Similarly, the best-fit lines for MRI, US, micro-CT and ex vivo caliper are tabulated in Table 2. Of all the modalities, MRI-measured volume was the most strongly correlated with the reference tumor volume (R² = 0.9939) and in vivo caliper-measured volume the least (R² = 0.9274) (Table 2).
The two (in vivo and ex vivo) caliper measurements were significantly different (p < 0.05) in a paired t-test. Each caliper-measured tumor volume was compared with the reference tumor volume as shown in Figure 5(a) and Figure 5(e), and both in vivo and ex vivo caliper-measured tumor volumes were overestimated (by about 6% from the slopes of the regression lines).
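The paired comparison reported above can be illustrated with the paired t statistic, computed from the per-tumor differences between the two caliper measurements. This is a sketch only: the function name and the example volumes are hypothetical, and converting the statistic to a p value requires a t distribution table or statistics package, which is not shown here.

```python
import math


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical in vivo and ex vivo caliper volumes (mm^3) for four tumors.
in_vivo = [12.0, 14.0, 16.0, 18.0]
ex_vivo = [10.0, 10.0, 10.0, 10.0]
t_value, dof = paired_t_statistic(in_vivo, ex_vivo)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The statistic is then compared with the two-sided critical value for the chosen significance level (0.05 in this study) and the given degrees of freedom.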
Figure 6 depicts Bland-Altman plots of volumes measured by in vivo caliper, MRI, US, micro-CT and ex vivo caliper versus reference tumor volume. The mean volume difference between in vivo caliper and reference tumor volume (bias) was 59.32 mm³ (95% CI on difference: 54.76 to 99.09 mm³; LoA: −114.00 to 232.70 mm³) (Table 3). Therefore, in vivo caliper had significant bias when compared with reference tumor volume. Similarly, the mean volume differences for MRI, US, micro-CT and ex vivo caliper are shown in Table 3. Significant bias was observed for in vivo caliper, micro-CT and ex vivo caliper measurements, and non-significant bias for MRI and US measurements, compared with the reference tumor volume (Table 3).

Multimodality Imaging of In Vivo Solid Tumors for Image-Based Volume
The 7T MRI unit used in the present study provided high spatial resolution (0.20 mm × 0.28 mm × 0.31 mm), high signal-to-noise ratio (SNR) and good soft tissue contrast, as shown in Figure 3(a), and allowed the clearest definition of tumor borders among the three imaging modalities. Field of view (FOV) and matrix size determine spatial resolution, while factors such as slice thickness, FOV, size of the image matrix, number of acquisitions, scan parameters (TE, TR and flip angle) and magnetic field strength determine the SNR. These factors were optimized for better visualization of the tumors in T2-weighted images, and the high magnetic field strength (7T) improved the SNR. Similarly, the 35 MHz US transducer used in this study provided high resolution and relatively good soft tissue contrast, which facilitated visualization of tumor boundaries (Figure 3(b)). However, FOV decreases with frequency, and high frequency limits the size of the tumor that can be scanned (Table 1). In most cases, CT contrast agent is required to better visualize tumor borders in CT images. Even with the high resolution of the micro-CT (0.16 mm × 0.16 mm × 0.16 mm) used in this study, it was still difficult to differentiate tumors from surrounding soft tissues in the non-contrast-enhanced images (Figure 3(c)).

In-Air Micro-CT Image Volume as Alternative Gold Standard
In this study, the density (1.079 g/cm³) of SCC4 tumor was obtained from accurate measurements of mass and in-air micro-CT image volume (Figure 4). If we had simply assumed that the density was 1.00 g/cm³, as reported in several studies [7] [10], the tumor volumes derived from mass would have been overestimated by about 8% (mass/1.00 versus mass/1.079). To our knowledge, the accuracy of in-air micro-CT image volume had not previously been validated for tumor studies and it has not been used before as a reference/true tumor volume. Fullerton et al., in the same research group, initially showed that in-air micro-CT image volume is accurate (~0.6%) [14], and we adopted their method. For this study, we assumed that no dehydration of excised tumors occurred during the time (less than 10 min) between tumor excision and completion of the micro-CT scans. However, in future studies excised tumors should be weighed immediately after excision and again after completion of the micro-CT scans to determine whether this assumption is acceptable. One limitation to widespread adoption of the in-air micro-CT method for tumor volume measurement is the limited availability and additional cost of micro-CT scanners and of appropriate voxel counting software, compared with the caliper method.

Comparison of Caliper-, MR, US and CT Image-Based Volume with Reference Tumor Volume
There are several ways to measure tumor volume in animal studies, as discussed earlier. Often diameters in two or three dimensions are linearly measured and volume is calculated using an appropriate tumor volume equation [5]. Nowadays volume measuring software is commonly available and volume can be readily calculated by different techniques [4] [6] [7]. In this study, we assumed an ellipsoidal tumor shape, which is known to best represent the actual tumor [27], and the linear method allowed us to keep consistency among the four different modalities (three imaging modalities and a caliper) because a volumetric technique cannot be applied to the caliper method. The results of this study give pharmaceutical laboratory users an indication of the comparative accuracy of the available imaging methods when designing studies of new cancer therapy protocols. Hence, it is anticipated that use of the methods demonstrated here will assist animal imaging laboratories in providing more consistent results for costly drug studies and thereby reduce the number of animals needed to achieve their goals.
Caliper measurements of in vivo tumors are known to be inaccurate and unreliable [7] [12]. In the present study, caliper measurements of in vivo tumors overestimated the tumor size (Table 1 and Table 2), a finding similar to that reported by Shoma et al. [13]. Average % deviations from the reference tumor volume ranged from 7.32% to 151.78% (Table 1). Jensen et al. also showed that an external caliper overestimated subcutaneous tumor volume (slope of linear regression line: 1.24, R² = 0.75) compared with a reference tumor volume, which was the tumor mass multiplied by a tumor density of 1.00 g/cm³. Their results match our in vivo caliper measurements, which show overestimation and systematic bias (Table 2 and Table 3). Percent deviations in Table 1 include both random and systematic errors. The majority of random errors are human errors, and experienced observers can reduce the magnitude of these errors to a certain extent. Systematic errors could be due to several reasons mentioned in the Introduction, such as interference from hair, skin, fat and edema. If known skin thickness (range of 500 - 1500 µm) were subtracted [28], errors would be reduced. Although the ellipsoid equation is reported to reproduce the actual tumor volume best, it also caused systematic errors in this study because tumor shape was assumed to be ellipsoidal, whereas most tumors, especially those with large dimensions, are irregular in shape [27].
The average deviations in caliper measurements of ex vivo tumor volumes were smaller for each group (20.50%, 2.67%, −4.45%, −0.29% and 7.29%, in order) compared with in vivo tumor caliper measurements (Table 1). Nonetheless, ex vivo tumor size was still overestimated in three groups (groups 1, 2 and 5) (Figure 5(e) and Table 1, Table 2) and bias (the mean of tumor volume difference from reference tumor volume) was significant in Bland-Altman plots (Table 3). These results confirm that caliper measurements are not accurate regardless of skin and/or hair, even when the observer measures the longest lengths of ex vivo tumors in all three dimensions. They also support the observation of Ayers et al. that the tumor volume formula does not exactly represent actual tumor shape [10]. The t-test results indicate a lack of repeatability between in vivo and ex vivo tumor volume measurements. Nonetheless, the caliper technique is still commonly used because it can rapidly estimate tumor volume or tumor growth rate [27].
Of the in vivo modalities tested, MRI-measured tumor volume most closely corresponded to the reference tumor volume (slope = 1.021; R² = 0.9939; p < 0.001) based on the linear regression analysis. Average volumes of three groups (groups 2, 3 and 5) were underestimated compared with reference tumor volume (Table 1). The Bland-Altman plot also showed non-significant bias (Table 3). In this study, the soft tissue contrast provided by the 7T MRI allowed an observer to differentiate tumors better than in the other imaging modalities, which could reduce errors in volume measurement. The remaining errors again comprise random and systematic errors. As with caliper measurements, random errors are related to human error. Firstly, although the MRI provided a very thin slice thickness (0.31 mm), the observer could miss a tumor border located between two slices, and the tumor size would then be underestimated. Using thicker slices would not improve delineation of the tumor border. Secondly, rats were positioned parallel to the bed in the MRI bore, but the longest dimension of the tumors was not necessarily parallel to the axis of the image. This positioning could also have caused underestimation of tumor size. Volume measurements by a volumetric technique could resolve this issue. Thirdly, even with high spatial resolution (0.20 mm × 0.28 mm × 0.31 mm), defining the exact tumor boundary against the surrounding soft tissues is sometimes challenging and still subject to errors. Kawano et al. reported that small tumors less than 1.5 mm in diameter were difficult to measure even in a 9.4T MRI [29]. If available, an MRI with a stronger magnetic field would provide higher resolution to measure small tumors despite the increased spatial distortion [29]. MRI systematic errors arise for similar reasons as for caliper measurements. First, the volume equation could create errors to a certain degree; in the same fashion, better volumetric techniques could improve the accuracy of volume measurements. Second, different imaging parameters or imaging sequences could affect image quality and, consequently, tumor measurement accuracy. Martiniova et al. demonstrated that volume measurements by a 3T MRI with a resolution of 0.156 mm × 0.156 mm × 0.5 mm were within about 8% of pathology specimens [22]. However, they did not describe how the volumes were measured [22]. Our study showed better results (2.1%) (Table 2). Kawano et al. demonstrated that MRI correlated well with caliper measurements in mouse models (r = 0.944) [29]. In their study, MR measurement of tumor volumes was made using ImageJ software and caliper measurements at necropsy were made using the same formula as used in this study. Abou-Elkacem et al. showed that ex vivo tumor volumes by caliper correlated significantly with in vivo tumor volumes by a 3T MRI in mouse models (R² = 0.96, p < 0.05) [30]. In their study, the tumor volumes were calculated using the formula π/6 × length × width² for caliper and π/6 × length × width × depth for MRI, assuming an ellipsoidal shape. Similarly, our study showed that the correlation coefficients (r) of the regression lines of MRI versus caliper measurements were 0.9027 and 0.9725 for in vivo and ex vivo caliper, respectively. MRI is known to provide excellent soft tissue contrast without ionizing radiation. Nonetheless, MRI has high infrastructure costs and is therefore not available in every institution. MRI also requires preparation and lengthy scanning times, which can lead to motion artifacts in the images [8].
In our US measurements, tumor volumes in groups 1 and 3 were overestimated while those in the other groups were underestimated, and the slope (0.862) of the linear regression plot showed overall underestimation (R² = 0.9669, p < 0.0001) (Table 1 and Table 2). Still, bias was not significant in the Bland-Altman plot (Table 3). The sources of error can be divided into random errors and systematic errors as follows. In this study, a 2D plane with a maximum diameter (width) was identified during scanning and the maximum perpendicular diameters (width and depth) were measured in the plane based on WHO criteria. The same method was used for length. However, it was difficult to ensure that the longest diameters were identified during scanning, as described by Graham et al. [21]. Also, the diameter perpendicular to the maximum diameter in one dimension was not necessarily the longest in the other dimension in the plane. The longest diameter in each dimension (width, length and depth) should be measured separately to reduce these large random errors. Acquiring the 3D data set and calculating tumor volume offline afterwards would allow more time and control and would therefore result in more accurate volume measurements. A random error was also caused by inaccurate definition of tumor margins in the US images. Although the 35 MHz transducer offered high resolution (axial resolution: 0.11 mm; lateral resolution: 0.05 mm), it was still challenging to define the boundaries of the tumor, as seen in Figure 3(b), due to reduced echogenicity at the edges, and this estimation of the tumor boundaries could cause errors. The highest deviations were observed in the smallest tumor group. Another random error was underestimation of the tumor diameters for some tumors in the largest tumor group (group 5), which extended beyond the limited field of view provided by the high frequency US. Negative % errors in most tumors in group 5 were attributed to this. For large tumors extending outside the field of view, a lower frequency transducer should be used or the tumor size should be limited to fit the field of view. Assuming a defined (ellipsoidal) tumor shape could cause systematic errors. Other volume methods or techniques that better define tumor shape need to be tested to see whether errors would be reduced. Ayers et al. reported that US-measured volumes exhibit significantly more accuracy, precision and reproducibility than manual standard caliper measurements in subcutaneous tumors of nude mouse models [10]. In their study, true tumor volume was calculated from mass assuming a density of 1.00 g/cm³. Tumor volumes were then calculated from caliper measurements using the formula 1/2 × a × b², where a and b are the two longest perpendicular dimensions in the a-b plane of each tumor. US image-based volume was determined by segmenting the in-plane tumor and multiplying it by the inter-slice spacing. Faustino-Rocha et al. presented a significant correlation between US and caliper volume measurements in rat models using the formula 1/2 × width² × length for caliper measurements and (4/3)π × (length/2)² × (depth/2) for US [31]. In the present study, average US tumor volumes were closer to the reference tumor volume than average in vivo tumor volumes measured by a caliper (Table 1). US and caliper volume measurements correlated fairly well, and the correlation coefficients (r) of the regression lines of US versus caliper measurements were 0.8998 and 0.9374 for in vivo and ex vivo caliper, respectively. As displayed in Figure 3(b) and described in other studies, high frequency US provides relatively good soft tissue contrast, which facilitates accurate and precise volume measurements, although slightly reduced echogenicity at the tumor edges was observed. Hastie et al. reported low levels of intra-observer variability (≈10% to 18%) in high frequency US [32].
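The studies cited in this discussion use several different diameter-based volume formulas, which is one reason their results are not directly comparable. The sketch below evaluates those formulas, as quoted in the text, for the same hypothetical set of diameters. The function names and dimensions are illustrative assumptions, and mapping a and b in 1/2 × a × b² to length and width follows the length × width² form quoted elsewhere in the text.

```python
import math


def ellipsoid_lwd(length, width, depth):
    """pi/6 * length * width * depth (three measured diameters)."""
    return math.pi / 6.0 * length * width * depth


def ellipsoid_lw2(length, width):
    """pi/6 * length * width^2 (depth assumed equal to width)."""
    return math.pi / 6.0 * length * width ** 2


def half_lw2(length, width):
    """1/2 * length * width^2 (common caliper approximation)."""
    return 0.5 * length * width ** 2


def us_l2d(length, depth):
    """(4/3) * pi * (length/2)^2 * (depth/2), as quoted for US."""
    return 4.0 / 3.0 * math.pi * (length / 2.0) ** 2 * (depth / 2.0)


# Hypothetical tumor diameters in mm.
length, width, depth = 10.0, 8.0, 6.0
volumes = {
    "pi/6 * l * w * d": ellipsoid_lwd(length, width, depth),
    "pi/6 * l * w^2": ellipsoid_lw2(length, width),
    "1/2 * l * w^2": half_lw2(length, width),
    "(4/3)pi (l/2)^2 (d/2)": us_l2d(length, depth),
}
for name, value in volumes.items():
    print(f"{name:24s} {value:7.1f} mm^3")
```

For a flattened tumor such as this example, the two-diameter formulas give noticeably larger volumes than the three-diameter ellipsoid formula, because they substitute a longer diameter for the depth.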
CT images have inherently poor soft tissue contrast and thus it can be quite challenging to define tumor boundaries with this modality. As a result, tumor volume can be overestimated or underestimated. Contrast-enhanced micro-CT may improve the sensitivity in detecting tumors and is effective especially for small tumors [33] [34]. In the present study, without contrast agent, micro-CT tumor volumes were overestimated (Figure 5(d) and Table 1, Table 2) and had significant bias in a Bland-Altman plot, although the linear regression plot showed a fairly strong correlation (R² = 0.9806) with the reference tumor volume. Major errors were due to poor soft tissue contrast even though tumors were located subcutaneously. The assumption of an ellipsoidal tumor shape in calculating tumor volumes also caused a systematic error. The acquisition parameters for the micro-CT scanner used in this study were similar to those used predominantly by cancer researchers to co-register an anatomical CT image to either a PET or SPECT image. Although the resolution (160 µm) was not as high as could be achieved by some commercial dedicated micro-CT scanners for small animal studies, the parameters used were typical for our laboratory and a good compromise between resolution, throughput, acquisition time and radiation dose, all of which must be considered when designing small animal imaging studies. Jensen et al.'s study contradicted our results: the slope of the regression line for micro-CT was 1.01 and there was no significant bias [7]. Their study demonstrated that micro-CT was more accurate than caliper using a reference tumor volume of mass multiplied by a density of 1.00 g/cm³ [7]. For caliper measurements, the ellipsoidal formula was 1/2 × length × width², and for micro-CT, tumor volume was generated by summation of voxels [7]. Ishimori et al. also showed results similar to Jensen et al.'s: micro-CT was better than caliper for tumor volume measurement [12]. The same formula (1/2 × length × width²) was used for caliper tumor volume determination, and segmentation of the in-plane tumor multiplied by inter-slice spacing was used for micro-CT [12]. In the present study, as with US imaging, average micro-CT volumes were closer to the reference tumor volume than average in vivo caliper volumes (Table 1). Abou-Elkacem et al. showed a good correlation (R² = 0.90; p < 0.05) between MRI volume and micro-CT volume for large tumors, although non-contrast micro-CT was less sensitive than MRI [30]. In comparison, our study showed a correlation coefficient (R²) of 0.9670 between these two modalities. Prajapati et al. compared ex vivo micro-CT based virtual histology volume with in vivo MRI-based volume and demonstrated a strong correlation between the two datasets (R = 0.998) [16]. A primary advantage of CT scans is that they provide a rapid assessment of tumor burden, although there is a concern that a high accumulated radiation dose from repeated CT scans might interfere with tumor growth [30]. Boone et al. reported the mean dose to a mouse per scan from a micro-CT scanner, and the dose might not be negligible (>0.2 Gy per scan) depending on scan parameters and animal size [35]. However, Foster and Ford concluded that the radiation dose from longitudinal micro-CT imaging does not inhibit tumor growth [17]. In this study, tumors in each group were excised immediately after in vivo micro-CT scans, followed by ex vivo caliper measurements and in-air micro-CT scans, and thus radiation dose interference with tumor growth, if any, was not a concern.

Future Directions and Recommendations
Future studies will need to be performed as follows. First, better 3D volume image analysis techniques will need to be implemented for more accurate tumor volume measurements in MR, US and CT images. The methods used in this study provide a rapid and simple approach for an observer who does not have access to software or tools for 3D volume measurement. Graham et al. showed that in US, there were large differences in measured tumor volumes between formula based and 3D segmented methods [21]. The mean percent difference for B16F1 liver metastases was −8.8% ± 23.5% (range, −90.1% to 53.2%). For the formula based method, an ellipsoidal shape was assumed and the formula V = (π/6) × width × length × depth was used. Cheung et al. showed that a 3D method in US was more accurate and reliable than a formula based method for volume determination of regular and irregular shaped phantoms [36]. There was a weaker correlation between 2D image-based volume (the same method as our study) and 3D segmented volume (cross-sectional area multiplied by slice thickness) than between 3D image-based volume (volume calculated from the maximum three diameters measured in 3D reconstructed images) and 3D segmented volume, for which the correlation coefficient was 0.9813 (p < 0.0001) for tumors greater than 1 mm³ [31]. Intra-observer and inter-observer variabilities were also lower for the 3D segmentation method. However, slice thickness may introduce minor errors in volume estimation [36]. Second, intra-observer and inter-observer measurements in each modality need to be made, as performed in other studies, to estimate both accuracy and precision. A volume measurement comparison between trained and untrained personnel could also be made. Lastly, validation procedures for accurate volume measurement need to be developed. Currently available US FDA approved imaging biomarkers are RECIST and WHO criteria. However, as mentioned in the Introduction, volume and density criteria are under development and these new methods could be the next surrogate for anti-tumor drug testing. Lee et al. developed quality assurance (QA) procedures for RECIST and WHO criteria [2] [3]. Once volume and density criteria are fully developed, QA procedures for volume criteria will also need to be established.
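The difference between a formula based volume and a segmented volume (cross-sectional area multiplied by slice thickness) can be illustrated with a minimal sketch. It uses a synthetic sphere rather than tumor data, and all function names, the 10 mm diameter and the slice thicknesses are assumptions made for this example only.

```python
import math


def ellipsoid_volume(width_mm, length_mm, depth_mm):
    """Formula based volume: V = (pi/6) * width * length * depth."""
    return math.pi / 6.0 * width_mm * length_mm * depth_mm


def segmented_volume(slice_areas_mm2, slice_thickness_mm):
    """Segmented volume: sum of cross-sectional areas times slice thickness."""
    return sum(slice_areas_mm2) * slice_thickness_mm


def sphere_slice_areas(diameter_mm, slice_thickness_mm):
    """Cross-sectional areas of a synthetic sphere sampled at slice centers."""
    radius = diameter_mm / 2.0
    n_slices = int(round(diameter_mm / slice_thickness_mm))
    areas = []
    for i in range(n_slices):
        z = -radius + (i + 0.5) * slice_thickness_mm  # center of slice i
        areas.append(math.pi * max(radius ** 2 - z ** 2, 0.0))
    return areas


diameter = 10.0  # mm, hypothetical
formula_v = ellipsoid_volume(diameter, diameter, diameter)

for thickness in (0.5, 2.5):  # mm, thin versus thick slices
    seg_v = segmented_volume(sphere_slice_areas(diameter, thickness), thickness)
    diff_pct = 100.0 * (seg_v - formula_v) / formula_v
    print(f"thickness {thickness} mm: segmented {seg_v:.1f} mm^3, "
          f"difference from formula {diff_pct:+.2f}%")
```

For a perfect sphere the two methods agree closely when slices are thin, and the discrepancy grows as slices get thicker. For irregular tumors the ellipsoid assumption itself adds error, which this synthetic example does not capture.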
In this study, rodent tumor volumes measured by four different modalities (caliper, MRI, US and micro-CT) and the deviations from the reference tumor volume that inexperienced or untrained personnel could achieve were presented. Unlike most studies depicting tumor growth curves in rodent models, the focus of this study was on the volumes of tumors grouped according to determined time-points after tumor inoculation. The results could be a good reference for tumor volume measurement and its errors in multiple modalities. Each modality has its own pros and cons, and therefore it is reasonable to conclude that accuracy, cost, time and other factors have to be considered when users select an appropriate modality [30]. Before performing tumor volume measurements preclinically or even clinically, it is recommended that an observer perform validation testing in the chosen modality with the same imaging protocols and measurement methods, using a standardized phantom as shown in Lee et al. for QA purposes [2] [3].

Conclusion
This study showed the possibility of using in-air micro-CT image volume as a reference tumor volume in rodent studies. As expected, caliper tumor volume measurements were unreliable. Of the preclinical imaging modalities tested, tumor volumes measured by MRI most closely corresponded to the reference tumor volume since the 7T MRI provided excellent soft tissue contrast, high spatial resolution and high SNR.

Figure 1. Workflow for tumor volume measurement in the SCC4 rodent tumor model.

Figure 2. (a) An excised tumor mounted on a Styrofoam holder and (b) in-air micro-CT image of the excised tumor after the adjustment of window width and level. The image slice corresponding to the axial middle of the tumor was selected and a small (10 pixel × 10 pixel) region of interest (ROI) was placed at the center of the tumor. The mean signal intensity of the ROI (SI_tumor) was then recorded. A similar ROI was placed over air (background) adjacent to the tumor and the mean background signal intensity (SI_background) was recorded. The image display window level was then set to SI_tumor and the window width to the difference SI_tumor − SI_background for all micro-CT images (Figure 2(b)). Second, an ROI surrounding the tumor was defined. Third, the number of voxels inside the ROI was automatically counted and the volume was calculated by multiplying the number of voxels by the voxel size. The in-air micro-CT image volume of excised tumors was defined as the "Reference Tumor Volume". The mean density for excised tumors in all groups was then calculated by plotting mass as a function of in-air micro-CT image volume of excised tumors using GraphPad Prism 5 software (Version 5.04, GraphPad Software Inc., San Diego, CA).
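The windowing and voxel counting steps described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: it assumes that a voxel counts as tumor when its intensity is at or above the lower edge of the display window (level minus half the width), and it uses a hypothetical isotropic voxel edge of 0.16 mm and a toy image stored as nested lists.

```python
def display_window(si_tumor, si_background):
    """Window level = mean tumor intensity; width = tumor minus background.

    Returns the (lower, upper) intensity bounds of the display window.
    """
    level = si_tumor
    width = si_tumor - si_background
    return level - width / 2.0, level + width / 2.0


def voxel_count_volume(image, lower_bound, voxel_volume_mm3):
    """Count voxels at or above lower_bound and multiply by the voxel volume.

    `image` is a 3D nested list indexed as image[slice][row][column].
    The thresholding rule is an assumption made for this sketch.
    """
    count = sum(
        1
        for image_slice in image
        for row in image_slice
        for value in row
        if value >= lower_bound
    )
    return count * voxel_volume_mm3


# Toy 2-slice image with hypothetical intensities (air near 0, tumor near 1000).
toy_image = [
    [[0, 900, 0], [1000, 1100, 950], [0, 400, 0]],
    [[0, 0, 0], [0, 1000, 0], [0, 0, 0]],
]

lower, upper = display_window(si_tumor=1000.0, si_background=0.0)
voxel_volume = 0.16 ** 3  # mm^3, assuming isotropic 0.16 mm voxels
volume = voxel_count_volume(toy_image, lower, voxel_volume)
print(f"window: [{lower}, {upper}], volume: {volume:.5f} mm^3")
```

Because the excised tumor is surrounded by air, the tumor and background intensities are well separated, so a simple intensity rule of this kind is plausible for in-air images in a way that it would not be for in vivo micro-CT.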

Figure 3. (a) MR, (b) US and (c) micro-CT images of in vivo tumors for four rats (#9-12 listed in Table 1) in group 3. The images were acquired 9 days after tumor cell inoculation. Estimated tumor margins in micro-CT images are denoted by yellow contours.

Figure 5. Linear regression plots for tumor volume determined by (a) caliper in vivo, (b) MRI, (c) US, (d) micro-CT, and (e) caliper ex vivo against reference tumor volume. Caliper in vivo and caliper ex vivo denote caliper measurements of in vivo and ex vivo tumors, respectively. Reference tumor volume was determined using in-air micro-CT imaging.

Figure 6. Bland-Altman plots of (a) caliper in vivo, (b) MRI, (c) US, (d) micro-CT and (e) caliper ex vivo for tumor volume measurements. The mean indicates the bias and mean ± 2SD indicates the limits of agreement (LoA).
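The bias and limits of agreement named in this caption can be computed with a few lines. The sketch below assumes that the difference is taken as measured volume minus reference volume and that SD is the sample standard deviation; the volumes are hypothetical and the function name is illustrative.

```python
import statistics


def bland_altman(measured, reference):
    """Return (bias, sd, lower_loa, upper_loa) for paired volume measurements.

    bias = mean of (measured - reference); LoA = bias +/- 2 * SD,
    following the mean +/- 2SD convention stated in the caption.
    """
    if len(measured) != len(reference):
        raise ValueError("measured and reference must have the same length")
    differences = [m - r for m, r in zip(measured, reference)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample SD (n - 1 denominator)
    return bias, sd, bias - 2.0 * sd, bias + 2.0 * sd


# Hypothetical volumes in mm^3 (not data from this study).
measured_mm3 = [110.0, 190.0, 310.0]
reference_mm3 = [100.0, 200.0, 300.0]

bias, sd, lower_loa, upper_loa = bland_altman(measured_mm3, reference_mm3)
print(f"bias {bias:.2f} mm^3, LoA {lower_loa:.2f} to {upper_loa:.2f} mm^3")
```

Judging whether the bias is significant additionally requires a confidence interval for the mean difference, which depends on a t-distribution quantile and is not computed here.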

Table 1. Tumor volumes (% deviation from reference tumor volume) measured using various modalities for each rat. In-air micro-CT image volume of ex vivo tumors is the reference tumor volume.
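The percent deviation reported in this table can be expressed as a short calculation. The sign convention (positive when the measured volume exceeds the reference) and the use of signed rather than absolute deviations in the group mean are assumptions of this sketch, as are the function names and example values.

```python
def percent_deviation(measured_mm3, reference_mm3):
    """Percent deviation of a measured volume from the reference volume."""
    if reference_mm3 <= 0:
        raise ValueError("reference volume must be positive")
    return 100.0 * (measured_mm3 - reference_mm3) / reference_mm3


def mean_percent_deviation(pairs):
    """Mean signed percent deviation over (measured, reference) pairs."""
    deviations = [percent_deviation(m, r) for m, r in pairs]
    return sum(deviations) / len(deviations)


# Hypothetical (measured, reference) volumes in mm^3 for one group.
group = [(120.0, 100.0), (90.0, 100.0)]
print(f"mean deviation: {mean_percent_deviation(group):+.1f}%")
```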

Comparison of Caliper-, MR, US and CT Image-Based Volume with Reference Tumor Volume
Reference tumor volumes from in-air micro-CT images ranged from 16.99 mm³ to 1039.52 mm³ with a median of 253.39 mm³. The means for the reference tumor volume in each group were 25.53 mm³, 34.65 mm³, 297.31 mm³, 442.04 mm³ and 693.45 mm³. In vivo caliper measured volumes had the highest mean deviations except for group 3. Ex vivo caliper measured volumes were much more accurate, as expected. MRI had the lowest mean deviations among the three imaging modalities except for group 2. The smallest tumor volume group (group 1) had the highest deviations in all measured modalities. Tumor volumes measured by in vivo and ex vivo caliper, MRI, US and micro-CT all correlated (R² ≥ 0.9274; p < 0.0001) with reference tumor volume (Figure 3 … in order). Table 1 summarizes the in vivo tumor volumes measured by caliper, MRI, US and micro-CT, and ex vivo tumor volumes measured by caliper and micro-CT in each group, as well as the % deviations from reference tumor volumes.

Table 2. Linear regression plot results for tumor volume determined by four methods (caliper, MRI, US and micro-CT). A slope is significantly non-zero when p < 0.05.
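The slope and R² values summarized in this table come from ordinary least squares regression of measured volume against reference volume. The study used GraphPad Prism; the sketch below is a plain Python equivalent of the fit only, with hypothetical data and an illustrative function name, and it does not compute the p-value for the slope.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical reference (x) and measured (y) volumes in mm^3.
reference_mm3 = [25.0, 100.0, 300.0, 450.0, 700.0]
measured_mm3 = [30.0, 112.0, 318.0, 470.0, 735.0]

slope, intercept, r2 = linear_fit(reference_mm3, measured_mm3)
print(f"slope {slope:.3f}, intercept {intercept:.2f} mm^3, R^2 {r2:.4f}")
```

A slope above 1 with a high R² would indicate a method that tracks the reference closely but overestimates it proportionally, which is why correlation alone is complemented by Bland-Altman analysis.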

Table 3. Bland-Altman plot results for tumor volume determined by four methods (caliper, MRI, US and micro-CT). Bias is the average of tumor volume difference from reference tumor volume. If 0 is within the 95% confidence interval (CI), bias is not significant. LoA means 95% limits of agreement.