Volatile Compounds Selection via Quantile Correlation and Composite Quantile Correlation: A Whiting Case Study

The freshness and quality indices of whiting (Merlangius merlangus), which are influenced by a large number of volatile chemical compounds, are analyzed here in order to select the most relevant compounds as predictors of these indices. The selection was performed with recent statistical variable selection methods, namely robust model-free feature screening based on quantile correlation and composite quantile correlation. On the one hand, the compounds 2-Methyl-1-butanol, 3-Methyl-1-butanol, Ethanol, Trimethylamine, 3-Methyl butanal, 2-Methyl-1-propanol, Ethyl acetate, 1-Butanol and 2,3-Butanedione were identified as major predictors of the freshness index. On the other hand, the compounds 3-Methyl-1-butanol, 2-Methyl-1-butanol, Ethanol, 3-Methyl butanal, 3-Hydroxy-2-butanone, 1-Butanol, 2,3-Butanedione, 3-Pentanol, 3-Pentanone and 2-Methyl-1-propanol were identified as major predictors of the quality index.


Introduction
Fish freshness is a key attribute of the quality of fish, which is a highly perishable product. The fishing industry is an important contributor to many economies in the world. Smell is one of the senses consumers use to determine the freshness of fish. Indeed, the volatilome of fish changes rapidly with the product's degree of freshness, which is why sensory analyses are used by consumers and industrialists to assess fish quality. The key volatile compounds that contribute to this characteristic odor can then be measured and used as quality indicators [1] [2] [3]. These characteristic aromatic volatile compounds are generated by different biological pathways, including lipid autoxidation and the action of spoilage organisms and autolytic enzymes.
Recently, Duflos et al. [1] studied the spoilage of whiting at five stages of ice storage by comparing the analysis of volatile compounds obtained by solid phase microextraction (SPME) coupled with gas chromatography/mass spectrometry (GC/MS) with two sensory analysis methods. Two separate steps of multidimensional statistical analysis were used to identify volatile compounds and characterize fish freshness as assessed by two different indices. In the first step, control charts were used to monitor the daily progression of the freshness and spoilage indices. The second step begins by reducing the dimension of the data set (excluding the two index variables) to two principal components via Principal Component Analysis (PCA).
Then, a hierarchical clustering approach and a heuristic variable selection were used to cluster the fish samples into three classes and to identify the volatile compounds that characterize each class. However, the indices (or response variables) were not directly taken into account in the latter procedure.
Recently, Sidi et al. [4] applied stability selection and randomization techniques in $L_1$-norm penalized quantile regression to the same data set. These approaches highlighted the volatile compounds that are most relevant for evaluating fish freshness throughout storage and are therefore assumed to have a greater influence on fish freshness and quality.
Using penalized quantile regression approaches on the whiting data set is motivated by the fact that consumers generally face different categories of fish freshness. The interest of the quantile regression approach lies in its ability to provide a model for each level of quality. More details on quantile regression and penalized quantile regression can be found in references [5]-[12].
This paper uses the approach of Ma and Zhang [13] to select a reduced subset of volatile compounds that can explain whiting spoilage during storage. This approach provides a robust, model-free feature screening based on the quantile correlation proposed by Li et al. [14].
The remainder of the paper is organized as follows: the methodology is briefly presented in Section 2 and Section 3 is dedicated to the experimental framework. Finally, the results are discussed in Section 4, followed by concluding remarks in Section 5.

Methodology
This section is dedicated to the following methods: Quantile Correlation, Sure Independence Screening via Quantile Correlation, Composite Quantile Correlation and Sure Independence Screening via Composite Quantile Correlation.

Quantile Correlation
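As advocated in Li et al. [14], quantile correlation is a novel measure used to examine the linear relationship between any two random variables $Y$ and $X$ for a given quantile $\tau \in (0, 1)$. It is defined as

$$\mathrm{qcor}_\tau\{Y, X\} = \frac{\mathrm{qcov}_\tau\{Y, X\}}{\sqrt{\mathrm{var}\{\psi_\tau(Y - Q_{\tau,Y})\}\,\mathrm{var}(X)}} = \frac{E\{\psi_\tau(Y - Q_{\tau,Y})(X - E X)\}}{\sqrt{(\tau - \tau^2)\,\sigma_X^2}},$$

where $Q_{\tau,Y}$ is the $\tau$th (unconditional) quantile of $Y$, $\psi_\tau(w) = \tau - I(w < 0)$ and $\sigma_X^2 = \mathrm{var}(X)$.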
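A sample version of the quantile correlation of Li et al. [14] can be sketched in plain Python. The sketch assumes the usual plug-in form: the expectation is replaced by a sample mean, $Q_{\tau,Y}$ by an empirical quantile, and the scaling factor is $\sqrt{(\tau - \tau^2)\,\hat\sigma_X^2}$ with $\psi_\tau(w) = \tau - I(w < 0)$. The function names and the empirical quantile convention are illustrative assumptions, not part of the original study.

```python
import math


def sample_quantile(values, tau):
    """Empirical tau-th quantile: smallest order statistic whose
    empirical CDF is at least tau (one convention among several)."""
    ordered = sorted(values)
    index = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[index]


def quantile_correlation(y, x, tau):
    """Plug-in estimate of qcor_tau{Y, X} for 0 < tau < 1.

    Assumes x is not constant (otherwise the denominator is zero).
    """
    n = len(y)
    q = sample_quantile(y, tau)
    x_bar = sum(x) / n
    # psi_tau(w) = tau - I(w < 0), evaluated at w = y_i - q
    psi = [tau - (1.0 if yi - q < 0 else 0.0) for yi in y]
    # Sample quantile covariance: mean of psi_i * (x_i - x_bar)
    qcov = sum(p * (xi - x_bar) for p, xi in zip(psi, x)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    return qcov / math.sqrt((tau - tau ** 2) * var_x)


# Toy usage: a response and one predictor that increase together.
y_toy = [float(v) for v in range(1, 11)]
x_toy = [2.0 * v for v in y_toy]
print(quantile_correlation(y_toy, x_toy, 0.5))
```

Because the numerator is linear in the centered predictor, this estimate is unchanged when the predictor is shifted or rescaled by a positive factor and changes sign when it is negated. Unlike the Pearson correlation, it is not symmetric in its two arguments.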

Sure Independence Screening via Quantile Correlation
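Consider $Y$ as the dependent variable and $X_1, \dots, X_p$ as the independent variables. The Sure Independence Screening via Quantile Correlation method selects the first $d$ independent variables with the largest $\hat{\omega}_k$, where $\omega_k$ is the screening utility of $X_k$ derived from the quantile correlation $\mathrm{qcor}_\tau\{Y, X_k\}$ and $\hat{\omega}_k$ is the sample estimate of $\omega_k$.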
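The screening step can be illustrated with a short Python sketch. It assumes that predictors are ranked by the magnitude of their sample quantile correlation with the response (ranking by the absolute value or by the square gives the same order). The function names and the toy data are hypothetical and unrelated to the whiting data set.

```python
import math


def qcor(y, x, tau):
    """Plug-in sample quantile correlation of y with x at level tau."""
    n = len(y)
    q = sorted(y)[max(math.ceil(tau * n) - 1, 0)]  # empirical tau-quantile
    x_bar = sum(x) / n
    # (yi < q) is True/False, so tau - (yi < q) equals tau - I(yi - q < 0)
    qcov = sum((tau - (yi < q)) * (xi - x_bar) for yi, xi in zip(y, x)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    return qcov / math.sqrt((tau - tau ** 2) * var_x)


def qc_screen(y, predictors, tau, d):
    """Return the names of the d predictors with the largest |qcor|.

    predictors maps a predictor name to its list of observed values.
    """
    utilities = {name: abs(qcor(y, col, tau)) for name, col in predictors.items()}
    ranked = sorted(utilities, key=utilities.get, reverse=True)
    return ranked[:d]


# Toy data: one predictor tracks the response, the other two do not.
y_toy = [float(v) for v in range(1, 11)]
predictors_toy = {
    "signal": [2.0 * v for v in y_toy],
    "noise_a": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0],
    "noise_b": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0],
}
print(qc_screen(y_toy, predictors_toy, tau=0.5, d=2))
```

Each predictor is scored on its own, so the cost grows linearly with the number of predictors and the procedure remains usable when predictors outnumber observations.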

Composite Quantile Correlation
Composite Quantile Correlation (CQC) is motivated by the fact that the quantile correlation defined above, computed at a single quantile level, cannot characterize the entire relationship between $X$ and $Y$. So, the composite quantile correlation is defined by:

Sure Independence Screening via Composite Quantile Correlation
The CQC screening is based on the vector of utilities $\omega_k$, $k = 1, \dots, p$. Submodels are selected based on decreasing values of $\omega_k$. Furthermore, as advocated by Ma and Zhang [13], when using screening techniques, the number of selected variables is often set to $n - 1$ or to the integer part of $n/\log(n)$.

Materials and Methods
The sample preparation and sensory evaluation methods are briefly presented in this section. More details about the full experimental procedure can be found in [1].

Sample Preparation
As described by Duflos et al. [1], the sample is based on two different catches of 20 and 15 fish, respectively. These catches were stored in crushed ice at 4 °C in self-draining polystyrene boxes for 7 days. Fresh crushed ice was added daily. Sensory evaluation and volatile analysis were performed on seven different fish on each of days 1, 2, 3, 4 and 7.

Sensory Evaluation
According to Duflos et al. [1], two methods were used for the sensory evaluation of fish.
These methods lead to freshness and quality indices which represent two response variables for our selection process.

Results and Discussion
The empirical results of the analysis of freshness and quality indices influenced by a great number of volatile compounds are presented below.
The sample size and the number of predictors (volatile compounds) are $n = 35$ and $p = 55$, respectively. So, the number of predictors is higher than the sample size.
In order to perform variable selection, screening methods are applied to the whiting data set using the QC-SIS package available for R.
The tuning parameter $d$, which sets the number of covariates retained for each response variable, can be set to $n - 1 = 34$ or to the integer part of $n/\log(n)$, which is 9. The results for $d = 9$ are presented in Table 1 and Table 2. Furthermore, Figure 1 and Figure 2 display the Pearson correlation matrix through bivariate scatter plots for each index with the corresponding selected compounds. These figures were produced with the Performance Analytics package available for R.
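The two candidate screening sizes can be checked with a few lines of Python. The sketch assumes the natural logarithm, which is the reading consistent with $d = 9$ for $n = 35$. The function name is hypothetical.

```python
import math


def candidate_screening_sizes(n):
    """Return (n - 1, integer part of n / log(n)) using the natural log."""
    return n - 1, math.floor(n / math.log(n))


n = 35  # sample size of the whiting data set
d_large, d_small = candidate_screening_sizes(n)
print(d_large, d_small)  # prints "34 9"
```

With $n = 35$, $n/\log(n) \approx 9.84$, so the smaller choice keeps 9 of the 55 compounds, whereas $n - 1$ would keep 34.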

Results for Freshness Index
According to Table 1, the Quantile Correlation and Composite Quantile Correlation Sure Independence Screening methods select the same subset of volatile compounds for the freshness index. These compounds have previously been identified as spoilage markers.
The previous seven compounds were included in the eight compounds (with limonene) that characterized category 2/3 (intermediate category between freshness and spoilage) in Duflos et al. [1].
Moreover, compounds Trimethylamine and 1-Butanol were not identified as correlated to the first principal component axis in Duflos et al. [1].

Results for Quality Index
According to Table 2, Quantile Correlation and Composite Quantile Correlation Sure [...] in Duflos et al. [1].

Concluding Remarks
Sure Independence Screening via Quantile Correlation and Composite Quantile Correlation methods highlighted relevant volatile compounds influencing the freshness and quality indices during whiting storage.
For future investigation of the whiting data, it would be interesting to explore the following issues: 1) simultaneous model selection in multiple quantile regression [11]; 2) selection of groups of highly correlated compounds [8]; 3) quantile regression models and inference processes based on [15] and [16].

Table 1.
Selected volatile compounds ranked by decreasing weights for freshness index.

Table 2.
Selected volatile compounds ranked by decreasing weights for quality index.