Height/Length and Weight Growth Curves and Growth References of Children Aged 0-7 in Chongqing by GAMLSS

Aim: Using the generalised additive model for location, scale and shape (GAMLSS), we constructed height (length) and weight standards and percentile curves for children, in order to describe growth patterns in Chongqing and to provide a reference for paediatric clinical and preventive health care in West China. Methods: Data were collected from the health clinic dataset of children aged 0-7 years in the Department of Child Health Care of a Third-Class A hospital in Chongqing from 2010 to 2017. Using GAMLSS with a four-parameter response distribution D(μ, σ, ν, τ), taken here to be the Box-Cox t (BCT) distribution, and with P-spline and cubic spline functions as smoothers, we obtained age-specific standard deviation unit values of height (length) and weight and drew percentile charts. Results: Based on the principle of minimum AIC and SBC values, the BCT distribution was chosen for the model. Fitted parameters for height (length) and weight of boys and girls, age-specific standard deviation tables and percentile curves were obtained. Conclusions: This study explores the use of the GAMLSS method to establish reference ranges for children's growth and development in China. It can supplement the "Reference standard for growth and development of children under 7 years of age in China" with data for the Chongqing area, and it provides a reference for the rational diagnosis of short stature and malnutrition in children in Chongqing.


Introduction
Height and weight are the most important indicators of children's growth, nutrition and health, and the growth curve is the most useful tool for growth monitoring and evaluation [1]. The data behind The Guidelines for Child Growth and Development revised by WHO in 2009 came from a multi-centre study on the physical growth of children conducted from 1997 to 2003, involving six countries: Brazil, Ghana, India, Norway, Oman and the United States [2]. In 2009, the National Health Planning Committee of China issued the "Reference standard for growth and development of children under 7 years of age in China" (hereinafter referred to as the "2009 Standards"), whose standardised growth curves were plotted mainly from the data of the "Investigation on Physical Development of Children under 7 Years Old in Nine Cities of China in 2005" (data from the cities of Beijing, Harbin, Xi'an, Shanghai, Nanjing, Wuhan, Guangzhou, Fuzhou and Kunming) [3]. Chongqing was not among these nine cities, so it is necessary to draw growth curves suited to the growth and development characteristics of children in Chongqing. Since growth data are usually not normally distributed [4], the GAMLSS (generalised additive model for location, scale and shape) model was used in this study to fit the height (length) and weight data of children in Chongqing and to describe the present state of their growth and development. Height (length) and weight standards and percentile curves for children in Chongqing were then drawn to provide more precise and practical references for clinical health care and scientific research.

Data Sources
The subjects were healthy boys and girls aged 0-7 years in Chongqing. To keep the curves continuous at the upper age boundary, samples of children aged 7-8 years were added when drawing the curves. The data were collected from the outpatient department of child health care of a Third-Class A hospital in Chongqing from 2010 to 2017. The respondents came from 26 districts, 8 counties and 4 autonomous counties in Chongqing, giving full coverage of the administrative region. All data were entered by professional medical staff, registration forms were entered into the electronic database system, and data quality was checked regularly. Quality control and analysis of the aggregated data were also carried out to eliminate unqualified records. The age of each respondent was obtained by subtracting the registered date of birth from the actual date of data collection, so age was treated as a continuous variable. Length was measured in the lying position before 3 years of age, and height was measured in the standing position after 3 years of age.
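The age calculation and the switch from length to height described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function names, the 365.25-day year and the treatment of a child aged exactly 3 years as a standing measurement are assumptions.

```python
from datetime import date

def age_in_years(birth, visit):
    """Continuous age: days between birth and visit, divided by 365.25 (assumed)."""
    if visit < birth:
        raise ValueError("visit date precedes date of birth")
    return (visit - birth).days / 365.25

def measurement_type(age_years):
    """Recumbent length before 3 years, standing height from 3 years on (assumed cut-off)."""
    return "length" if age_years < 3.0 else "height"

# Hypothetical child, born 1 March 2014 and measured 1 September 2016.
example_age = age_in_years(date(2014, 3, 1), date(2016, 9, 1))
print(round(example_age, 2), measurement_type(example_age))
```

Keeping age continuous, rather than rounding to months, matches the way the smoothers treat age as a continuous covariate.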

Research Object
Inclusion criteria: 1) permanent resident of Chongqing; 2) diagnosed by a physician as having no abnormalities and no previous history of disease; 3) full-term birth (>28 weeks), singleton; 4) not admitted to the department of neonatology or rehabilitation after birth, and no dystocia.
Exclusion criteria: 1) missing values in age, height (length) or weight; 2) height (length) or weight below −4 or above +4 standard deviations; 3) registered age inconsistent with the age at consultation.
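The first two exclusion criteria can be illustrated with a short sketch. The record layout (dictionaries with `age`, `height` and `weight` keys) and the function names are hypothetical, and the sketch assumes that the standard deviation is the sample standard deviation of whichever group of records is passed in (in practice, one sex and age group at a time); the text does not state the reference used for the ±4 SD rule.

```python
from statistics import mean, stdev

def drop_missing(records):
    """Exclusion 1: remove records lacking age, height (length) or weight."""
    keys = ("age", "height", "weight")
    return [r for r in records if all(r.get(k) is not None for k in keys)]

def drop_beyond_4sd(records, key):
    """Exclusion 2: remove records lying more than 4 SD from the group mean.

    Assumption: mean and SD are computed from the records passed in.
    """
    values = [r[key] for r in records]
    if len(values) < 2:
        return list(records)
    m, s = mean(values), stdev(values)
    if s == 0:
        return list(records)  # no spread, nothing can be an outlier
    return [r for r in records if abs(r[key] - m) <= 4 * s]
```

A typical use would be `drop_beyond_4sd(drop_missing(group), "height")`, applied again with `"weight"`.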

Methods
Drawing growth curve charts: In this study, the gamlss package (version 5.1-2) in R version 3.4.4 was used for statistical analysis [5]. Applying the generalised additive model for location, scale and shape (GAMLSS) proposed by Rigby and Stasinopoulos in 2005, we tried various combinations of P-spline [6] and cubic spline (CS) [7] functions as smoothers, and adopted the combination of P-spline smoothing for the parameter μ and CS smoothing for the parameter σ to establish the function and percentile curves of height (length) against age. The adequacy of the model was judged by worm plots [8] and residual tests, and several commonly used distributions (such as BCT) were fitted. The optimal model was selected according to the principle of minimum AIC (Akaike information criterion) [9] and SBC (Schwarz's Bayesian criterion) [10]. Finally, Q-Q plots [11] were used to check the fit to the data.
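The selection rule above can be made concrete with a small sketch. It uses the standard definitions AIC = −2 logL + 2·df and SBC = −2 logL + log(n)·df; ranking by SBC first and breaking ties by AIC is an assumption made here for illustration, as are the model names and the numbers, which are invented and are not results from this study.

```python
import math

def aic(log_likelihood, df):
    """Akaike information criterion: -2 logL + 2 * df."""
    return -2.0 * log_likelihood + 2.0 * df

def sbc(log_likelihood, df, n):
    """Schwarz Bayesian criterion: -2 logL + log(n) * df."""
    return -2.0 * log_likelihood + math.log(n) * df

def select_model(candidates, n):
    """Return the name with the smallest SBC, breaking ties by AIC.

    candidates maps a model name to (log_likelihood, effective_df).
    """
    def key(name):
        ll, df = candidates[name]
        return (sbc(ll, df, n), aic(ll, df))
    return min(candidates, key=key)

# Made-up values for illustration only; not results from the study.
candidates = {"model_a": (-5230.4, 9.0), "model_b": (-5228.9, 14.0)}
chosen = select_model(candidates, n=2000)
print(chosen)
```

Because SBC penalises each degree of freedom by log(n) rather than 2, it favours smoother, more parsimonious fits than AIC when the sample is large.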
Definition of GAMLSS model [5]: GAMLSS builds on the LMS (lambda, mu, sigma) method, with a specific distribution D(μ, σ, ν, τ) [10] and distributions such as the Box-Cox-Cole-Green (BCCG) [12] and Box-Cox-power-exponential (BCPE) [13]. Four parameters, namely the mean, standard deviation, skewness and kurtosis, are taken into account, so that differences in the distribution across age groups can be modelled and the curve shape is smoother. GAMLSS provides a very general class of models for univariate response variables within a unified and consistent framework, and it allows all parameters of the response distribution to be modelled [5]. Let $y = (y_1, y_2, \ldots, y_n)^{T}$ be the vector of observations of the response variable. For $k = 1, 2, \ldots, p$, let $g_k(\cdot)$ be a known monotone link function that relates the distribution parameter $\theta_k$ to explanatory variables and random effects through an additive model:
$$g_k(\theta_k) = \eta_k = X_k \beta_k + \sum_{j=1}^{J_k} Z_{jk} \gamma_{jk}$$
where $X_k \beta_k$ is the parametric part and each $Z_{jk} \gamma_{jk}$ is an additive random-effect or smoothing term.
[...] interpolation format and multi-resolution hierarchical structure. Marx [14] proposed the P-spline method, which controls the smoothness of the function by adding a penalty term on the B-spline coefficients [15]. With plain B-splines, the number and location of the knots are not known in advance, and placing a suitable number of knots is a complex nonlinear optimisation problem. To avoid singularities or folding effects in the deformation field, an additional regularisation term is needed [16]. Hence, we used the P-spline as the smoothing method for the median in the GAMLSS model, with its penalty serving as the additional regularisation term that avoids singular points or folding effects.
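The LMS idea on which the model definition builds can be illustrated with the standard Box-Cox (Cole and Green) transformation, which corresponds to the three-parameter BCCG case. The sketch below is not the four-parameter BCT model chosen in this study, where the transformed value follows a t distribution with τ degrees of freedom; the function names and parameter values are hypothetical.

```python
import math

def lms_zscore(y, L, M, S):
    """Z score of a measurement y given skewness L, median M and coefficient of variation S."""
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M and S must be positive")
    if abs(L) < 1e-12:
        return math.log(y / M) / S  # limiting case L -> 0
    return ((y / M) ** L - 1.0) / (L * S)

def lms_value(z, L, M, S):
    """Inverse transformation: the measurement that corresponds to z score z."""
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    base = 1.0 + L * S * z
    if base <= 0:
        raise ValueError("z is outside the range supported by these parameters")
    return M * base ** (1.0 / L)

# Hypothetical parameters at one age, for illustration only.
L, M, S = -0.3, 10.0, 0.12
minus_2sd, plus_2sd = lms_value(-2.0, L, M, S), lms_value(2.0, L, M, S)
```

Evaluating `lms_value` at z = −3, −2, ..., 3 for each age gives one row of a standard deviation table, and at the normal quantiles of the 3rd to 97th percentiles gives the corresponding percentile curve points.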
Durrleman's CS function [7] is one of the most commonly used non-parametric curve smoothing methods, and it is more flexible and smoother than a general exponential model. Considering that the standard deviation is more likely than the median to grow exponentially with age, the cubic spline function was used to fit the standard deviation. Generally, cubic splines with three knots (3 degrees of freedom) can represent the data adequately.

Standard Charts
Tables 1-4 show the standard deviation unit values of height (length) and weight for boys and girls by age (−3SD, −2SD, −1SD, median, 1SD, 2SD, 3SD). Figures 1-4 show the percentile curves of height (length) and weight for boys and girls by age (3rd, 10th, 25th, 50th, 75th, 90th and 97th percentiles). As the Q-Q plots showed that the data in each group did not conform to the normal distribution and the height and weight data of boys and girls were skewed, a Box-Cox transformation was needed. The worm plot (Figure 5; only the worm plot for boys' height is shown due to space constraints) was used to evaluate the normality assumption of the GAMLSS model by examining, across continuous age, the difference between the Z scores of height (length) and weight (the degree of freedom of the Z score is 23) and their expected values under normality. In this study, since the Z scores were satisfactory, the BCT model was suitable for the height (length) and weight of boys and girls. Therefore, the standard deviation unit values obtained can be compared with those of the 2009 standard. The specific values and comparison results are shown in Tables 1-4.
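The coordinates of a worm plot (a detrended normal Q-Q plot) can be computed as in the sketch below, assuming the model's normalised residuals (Z scores) are already available as a list. The plotting position (i − 0.5)/n is one common convention and is an assumption here, as is the function name.

```python
from statistics import NormalDist

def worm_points(residuals):
    """Return (theoretical quantile, deviation) pairs for a worm plot.

    Each sorted residual is compared with the standard normal quantile at
    plotting position (i - 0.5) / n; the deviation is residual minus quantile.
    """
    n = len(residuals)
    std_normal = NormalDist()
    points = []
    for i, r in enumerate(sorted(residuals), start=1):
        q = std_normal.inv_cdf((i - 0.5) / n)
        points.append((q, r - q))
    return points
```

If the fitted distribution is adequate, the deviations scatter around zero with no systematic shape; a vertical shift, slope, parabola or S-shape in the worm points to a misfit in the mean, variance, skewness or kurtosis respectively.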

Fitting Characteristics
The reference range of this study is suitable for children aged 0-7 years. Figure 1 and Figure 2 show that the normal range of height of boys and girls was relatively concentrated at birth and gradually diverged with age. The curves were generally smooth and a few had inflection points. Both boys and girls had growth peaks between 0 and 3 years of age, where the curves were approximately exponential in shape. After 3 years of age, height increased steadily and the curves rose linearly. Figure 3 and Figure 4 show that the weight curves were generally smooth, that inflection points existed, and that the curves fluctuated considerably in the age range of 4-7 years. This may be due to sampling error: as age increased, the sample size of each group decreased, which affected the stability of the curves. Nevertheless, each group met the sample size requirements of the GAMLSS model, so the results are still credible. Both boys and girls had a peak in weight gain between 0 and 1 year of age, where the curves were approximately exponential in shape. After one year of age, body weight increased steadily and the curves showed linear growth.

Comparisons with Domestic Standards
Compared with the 2009 standard, the distribution and growth rate of each age group were approximately the same. The height curves also had inflection points at the age of 3, and the same held for the body weight curves. Comparatively speaking, the curves of this study are more convergent and the range of percentiles under the 2009 standard is larger. As can be seen from Table 1, except for the fitted value at 3SD, the standard deviation unit values of boys' height in this study are all smaller than the corresponding values of the 2009 standard, indicating that the height of normal children aged 0-7 in Chongqing is generally lower than the national average of the 2005 survey. The gap between the median of this study and that of the 2009 standard was between 1 and 2.84 cm; the −3SD value increased gradually in the 0-2 age group, but was 1.8-3.95 cm lower than the 2009 standard in the 2-3 age group; between the ages of 3 and 7 the gap gradually widened, reaching 7.15 cm at the age of 7. The difference between girls' height and the 2009 standard was generally smaller than that of boys, and the difference between the median of this study and that of the 2009 standard was between 1.1 and 1.95 cm; the distribution of birth length was basically consistent with the 2009 standard, and the difference gradually increased after the age of three. For boys' weight, the difference between the standard deviation unit values of this study and those of the 2009 standard was sometimes positive and sometimes negative; it was small in the 0-18 month period, ranging from −0.42 to 0.62. In the 21-33 month period, the SD values of boys' weight exceeded the 2009 standard, and after 36 months the difference increased with the percentile. Moreover, the difference between the median and the 2009 standard ranged from −4.74 to 1.42, and the 3SD values were higher overall than those of the 2009 standard.
Compared with the 2009 standard curves, the gap in girls' weight between −2SD and −3SD was smaller, within ±1.7, and the gap at −3SD was relatively larger, with a maximum of 3.02. Overall, the largest differences between the standard deviation unit values of this study and those of the 2009 standard were in height, especially boys' height, and the smallest were in girls' weight.

Methodological Significance
Children's growth and development is a continuous and complex dynamic process. With deeper research into children's growth patterns and with progress in statistics and computing, the methods for constructing growth curves have become more advanced and standardised [3]. WHO and many countries have developed standard growth curves for children, but most of them used the mature LMS method to fit percentile curves. When WHO revised the WHO Guidelines for Children's Growth and Development in 2009 [2], the research group had hoped to explore the application of GAMLSS in standard-setting on the basis of the LMS method, but did not achieve this. Meanwhile, many large-scale international studies, such as the Child and Adolescent Health Survey Project (KiGGS) [17] in Germany in 2011, the Development Group of Fourth Growth Curve (FDGS) [18] in the Netherlands in 2000, and the Maternal and Child Health Project in Barcelona, Spain, in 2009 [19], have begun to focus on the effective application of GAMLSS in reference standards for the growth and development of children and adolescents. Biological data often depart from normality, in which case the percentiles estimated by the LMS method may deviate; GAMLSS additionally models kurtosis and is better suited to large samples and skewed data, which can further improve the model residuals and smooth the curve shape [20]. Furthermore, the GAMLSS method can produce population-based reference curves and tables from age-specific statistical models that correct for skewness in biological data, can provide percentiles or Z scores, and can assess the accuracy of extreme percentiles by calculating 95% confidence intervals. In summary, the GAMLSS method is suitable for drawing reference curves of height (length) and weight of children. This study is an exploration of using GAMLSS to establish reference ranges in the field of children's growth and development in China.
The on-site data collection process of this study was in line with the relevant national standards for child health, and the inclusion and exclusion criteria were strictly formulated and implemented. Outliers in the database were cleaned repeatedly to control bias as far as possible and to ensure the reliability of the data sources. During fitting, different combinations of functions were used to refit the GAMLSS object, and the combined smoothing method of P-spline (for the parameter μ) and CS (for the parameter σ) was found to give the best smoothness. Moreover, among the four commonly used GAMLSS distributions considered (TF, BCT, BCPE and BCCG), the AIC and SBC values of BCT were relatively optimal. It should be noted that in this study the absolute values of AIC and SBC were large for every model, which is related to the large amount of data and the presence of noise. Therefore, in subsequent studies, repeated sampling from the large dataset and repeated modelling may be considered in order to obtain a more stable and consistent model.

Clinical Significance
A recent study reported that the diagnostic rate of short stature through standard clinical evaluation is only 1.3%, so clinicians need more diagnostic tools [21]. However, children's height is significantly affected by genetic and environmental factors and differs between populations in different countries and regions, which limits the application of evaluation charts from one group to other groups [22]. In addition, the physical development investigation of children in nine cities of China did not include sample data from Chongqing. Although only one hospital was involved in this research, it is a national Third-Class A comprehensive children's hospital and the largest professional institution with the widest coverage in Chongqing; its sample covers 26 districts, 8 counties and 4 autonomous counties, achieving full coverage of the administrative region of Chongqing. Moreover, the data covered a long period, the process of data collection and preservation was mature and standardised, and the sex ratio of children in the study sample was relatively balanced. Furthermore, more than 95% of the records came from physical examinations of normal children (this study only included data from physical examinations of normal children), and the inclusion and exclusion criteria were strictly controlled, so the data in this study are representative. In conclusion, this study supplements the reference range of height (length) and weight of children in Chongqing and provides a basis for the reasonable diagnosis of short stature and malnutrition.