Modelling Animal Activity as Curves: An Approach Using Wavelet-Based Functional Data Analysis

Temporal activity patterns in animals emerge from complex interactions between choices made by organisms as responses to biotic interactions and challenges posed by external factors. Temporal activity is an inherently continuous process, even though it is recorded as a time series. The discreteness of the data set is due to data-acquisition limitations rather than to a truly discrete underlying phenomenon. Therefore, curves are a natural representation for such high-frequency data. Here, we model temporal activity data as curves by integrating wavelets and functional data analysis, which allows hypotheses to be tested on curves rather than on scalar and vector-valued data. Temporal activity data were obtained experimentally for males and females of a small-bodied marsupial and modelled with wavelets under both independent and identically distributed errors and dependent errors. The null hypothesis of no difference in temporal activity pattern between male and female curves was tested with functional analysis of variance (FANOVA). The null hypothesis was rejected by FANOVA, and we discuss the differences in temporal activity curves between males and females in terms of ecological and life-history attributes of the reference species. We also performed a numerical analysis that sheds light on the regularity properties of the wavelet bases used and on the thresholding parameters.


Introduction
Temporal activity of animals is the result of behavioral responses to external (environmental) factors, such as availability of food resources and predation risk, and the internal states of individuals, such as nutritional condition, aversion to risk and reproductive drive [1] [2] [3] [4]. The constraints imposed by external factors and internal states dictate that animal temporal activity pattern in terrestrial, aquatic, and aerial environments is interrupted by pauses, turns, and changes in speed [5] [6] [7] [8] [9]. Temporal activity pattern is punctuated by pauses and changes in speed, temporally autocorrelated, and localized in nature [3] [9] [10]. Therefore, ecological and behavioral inference on patterns and processes of animal temporal activity is based on data with inherent autocorrelation and localization properties [3] [11].
Animal temporal activity is recorded as a series of time points, but one can successfully argue that the discreteness of the data set is due to a technological limitation (data-acquisition capabilities) rather than a true underlying discrete nature of the phenomenon itself. Thus, as opposed to the standard statistical cases in which the observations are numbers (scalars) or vectors [18], high-frequency data such as temporal activity records are continuous curves (functional data). Wavelets are functions that are able to represent a signal in a time series in both the time domain and the scale domain, at large and small scales [18]. Such decomposition into time-scale space allows the identification of the dominant modes of variability and of how these modes vary with time [19]. Wavelets have become the method of choice to deal with such data because wavelet transforms are particularly suited to handle data that are high dimensional, autocorrelated, and localized [3] [18] [20] [21]. Whereas wavelet transforms have been recently used to characterize autocorrelative properties of animal temporal activity [3] [15] [22], wavelets have not yet been explored to model animal temporal activity as curves. In fact, whenever wavelets are used, their statistical analysis is based on statistical tests designed for scalar or vector random variables and not for functional data [23].
The versatility of wavelets is breathtaking. They have been used in problems as widely disconnected as resampling time series of surrogate data derived from random cascades on dyadic trees [24] and establishing a connection between discrete wavelet transforms and entanglement renormalization for quantum systems on the lattice [25]. Important applications of wavelets are also found in preserving motion discontinuities along the edges of weak textures and dealing with rotations that exist in image sequences [26], the combined analysis of thermal and visible light images of plants to detect early disease with high accuracy [27], and early detection of melanomas from images of boundary irregularities of skin lesions [28]. Wavelets are also useful for high-speed detection of transient high impedance faults and power-quality disturbances [29], for feature extraction, discriminant analysis, and classification rules as crucial issues for face recognition [30], and in the development of a full-fledged theory of multiresolution signal decomposition via wavelet representation [31].
In this note, we use wavelets to model temporal activity data as curves in the context of functional data analysis (FDA [32] [33] [34] [35]). One may argue the case for treating these temporal activity data as vector-valued data. Even though this may be possible, we can advance two main reasons why treating the data as curves is more appropriate in practice. First, the vector-based and the functional data paradigms interpret the augmentation of sample sizes in two opposing ways. The former understands new observations as an elongation of the time interval in which data are being provided. For example, if the original time points are 0.1, 0.2, 0.3, 0.4, 0.5, a new data point will happen at some time larger than 0.5, say 0.6. On the other hand, functional data analysis is based on an enhanced capacity for finer sampling. In the same example, instead of augmenting the original sampling scope from [0.1, 0.5] to [0.1, 0.6], one captures more points on the same interval [0.1, 0.5], say 0.15. This increasing sampling capacity is sometimes referred to as micro dynamics, in opposition to the macro dynamics of vector-based analysis. Because animals may live in a limited space in nature, or may be confined to a laboratory setting, and have a finite lifetime, increasing sample size by increasing space (or time) is less desirable than increasing the sampling rate (timewise or spatially). This refinement can be easily related to infinitesimal calculus, so functions are more appropriate than their finite-dimensional vector counterparts.
Wavelets are therefore used to model temporal activity data as curves in the context of FDA, which deals with functional responses such as those arising when units are observed over time. FDA theory provides the means for testing hypotheses based on curves rather than on scalar and vector-valued data [18] [36]. In particular, we will use functional analysis of variance (FANOVA), which is the equivalent of ANOVA for functional data [37]. In FANOVA for two functional samples with a common covariance function, one wishes to test the null hypothesis of equality of mean functions versus the alternative hypothesis that the mean functions are different [37]. Our approach is exemplified by activity data obtained experimentally for a small neotropical marsupial, Gracilinanus microtarsus. We chose this species as a reference system because relevant ecological data are currently available, including demography and life history [38]-[43]. Survival rate for females remains fairly constant in the pre- and post-mating periods, whereas male survival rate decreases significantly in the post-mating period due to stress from aggressive interactions between males [41] [44]. Adult male gracile opossums (30 - 45 g) are much heavier than females (20 - 30 g [41] [42]). Body size has a positive, linear association with the area traversed by an individual in its normal activities of food gathering, mating and caring for the young [45] [46] [47]. As expected, the area traversed in nature is larger and more variable in male than in female gracile opossums [48]. Therefore, we expect that activity patterns will differ between male and female gracile opossums. Specifically, we will test the null hypothesis of homogeneity of mean curves of activity patterns from male and female gracile opossums. To the best of our knowledge, this is the first attempt to integrate wavelets to model curves derived from activity data with functional data analysis in the framework of functional analysis of variance.

Theory
Ramsay et al. (2005) [36] provide the reader with a generalized framework for the analysis of functional data, which basically depends on regularity conditions of the underlying curve. Wavelets are especially built to provide regular estimates through multiscale shrinkage [18]. We refer to Kist and Pinheiro (2015) [20] for a detailed development of the wavelet functional data analysis for dependent errors.
Wavelets are basically elements of some specially built basis of the space of square-integrable functions. This means the following. If $f$ is a function such that $\int f^2(t)\,dt < \infty$ and $\{\phi_{0k}, \psi_{jk}\}$ is a wavelet basis for the space of square-integrable functions, there are constants $\{c_{0k}\}$ and $\{d_{jk}\}$ such that $f$ can be written as:

$$f(t) = \sum_{k} c_{0k}\,\phi_{0k}(t) + \sum_{j \ge 0} \sum_{k} d_{jk}\,\psi_{jk}(t).$$

There are many different wavelet bases. Their main characteristic is that all elements of a wavelet basis have the same form, being different only in location and/or dilation. Therefore, wavelet analysis is immediately equipped with a fast transform algorithm.
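The decomposition of a sampled curve into approximation and detail coefficients, and the fast transform that produces it, can be illustrated with a minimal sketch. It assumes the Haar basis, chosen only because it fits in a few lines of standard-library Python; it is not one of the bases used in this study (Symmlets, Coiflets, Daubechies), and the function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math


def haar_dwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns (approx, details): approx holds the single coarsest
    approximation coefficient, details[0] is the coarsest detail level
    and details[-1] the finest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    s = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Differences carry the local (detail) information at this scale.
        details.insert(0, [(a - b) / s for a, b in pairs])
        # Scaled sums carry the smooth part to the next, coarser scale.
        approx = [(a + b) / s for a, b in pairs]
    return approx, details


def haar_idwt(approx, details):
    """Invert haar_dwt, rebuilding the signal from coarse to fine."""
    current = list(approx)
    s = math.sqrt(2.0)
    for level in details:
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        current = rebuilt
    return current
```

Each pass halves the number of approximation coefficients, so the whole transform costs a number of operations proportional to the signal length. Because the basis is orthonormal, the sum of squared coefficients equals the sum of squared signal values.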
In what follows, we give a general presentation of this procedure as needed for the animal temporal activity data set.
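Each observation is composed of $K$ time-point evaluations of a function of the form

$$dY_{gl}(t) = f_g(t)\,dt + \epsilon\,dW_{gl}(t), \quad t \in [0, 1],$$

where $\epsilon$ is the diffusion parameter and $\{W_{gl}(t) : t \in [0, 1]\}$ are either independent standard Brownian motions (the independent and identically distributed case) or independent Continuous Time Autoregressive Moving Average (CTARMA) processes [20] (the dependent case), for $g = 1, \ldots, G$.

Suppose $f$ belongs to a convenient Besov space [20]. We write the non-linear wavelet estimators as

$$\hat{f}_g(t) = \sum_{k} \hat{c}_{g,0k}\,\phi_{0k}(t) + \sum_{j} \sum_{k} \hat{d}^{\,\mathrm{thr}}_{g,jk}\,\psi_{jk}(t).$$

One can write the $L_2$-norm as

$$\|f\|_2^2 = \sum_{k} c_{0k}^2 + \sum_{j} \sum_{k} d_{jk}^2,$$

where $c_{0k} = \int f(t)\,\phi_{0k}(t)\,dt$ and $d_{jk} = \int f(t)\,\psi_{jk}(t)\,dt$.

The dual relationship between Besov spaces and wavelet bases leads to a natural change in the hypotheses being tested. Instead of testing $H_0$ vs $H_1$, a slight formal change is made, as proposed in [49] for independent errors. A two-step procedure is developed by Kist and Pinheiro (2015) [20] for dependent errors.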
The test statistic is compared to the cut-off point and the decision (reject $H_0$ or not) can then be made.
Although these tests and hypotheses are mathematically different from the aforementioned hypotheses, for all applied purposes they yield the same interpretation, as follows. Whenever the empirical evidence leads to the rejection of $H_0$, one can conclude that the data provide statistical evidence that at least two of the functions $f_1, \ldots, f_G$ are different. For instance, in our case $G = 2$, and $f_1$ and $f_2$ are the underlying functional behavior of female and male specimens, respectively. Thus, rejecting $H_0$ means that the average time-curves of female and male behavior are statistically different. On the other hand, whenever $H_0$ is not rejected, one understands that there is not enough empirical evidence that the groups have different underlying functions. Again for our case, this can be interpreted as the data not providing statistically significant evidence that male and female specimens differ in their temporal activity behavior.
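The decision rule (compare a statistic with a cut-off, then reject or not) can be made concrete with a deliberately simplified stand-in. The sketch below is not the wavelet-domain FANOVA test of [20] or [49]: it swaps in a permutation test on the squared $L_2$ distance between the two group mean curves, assuming curves sampled on a common grid over $[0, 1]$. All function names are hypothetical.

```python
import random


def mean_curve(curves):
    """Pointwise mean of curves sampled on a common grid."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def l2_distance_sq(f, g):
    """Riemann approximation of the squared L2 distance on [0, 1]."""
    return sum((a - b) ** 2 for a, b in zip(f, g)) / len(f)


def permutation_test(group1, group2, n_perm=999, seed=0):
    """Permutation p-value for H0: both groups share one mean curve.

    Returns (observed statistic, p-value). Group labels are reshuffled
    n_perm times and the statistic is recomputed under each relabelling.
    """
    rng = random.Random(seed)
    observed = l2_distance_sq(mean_curve(group1), mean_curve(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = l2_distance_sq(mean_curve(pooled[:n1]), mean_curve(pooled[n1:]))
        if stat >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)
```

A small p-value plays the same role as a test statistic exceeding its cut-off point: the data are taken as evidence that the group mean curves differ.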

Materials and Methods
The specimens of the gracile mouse opossum (G. microtarsus) were live trapped in a savannah-like habitat in the city of Mogi-Guaçu in the state of São Paulo using Sherman live-traps (dimensions 7.5 × 9.0 × 23.5 cm) baited with banana and peanut butter. Individuals captured were marked with a numbered ear tag and their sex and age were recorded. In the laboratory, gracile mouse opossums were housed individually in acrylic boxes (44 cm width × 33 cm length × 20 cm height) with ad libitum access to food (commercial cat and dog chow) and water, and kept under artificial conditions of light (12-h light/dark cycle, light on at 7:00 A.M.) and temperature ( ).
To assess spontaneous locomotor activity of the gracile mouse opossum we used an automated motor activity monitor (Acti-Track v2.7.10, PanLab, S.L. Instrument, Barcelona, Spain [50]). The apparatus consists of a transparent Perspex box (45 × 45 cm base, 35 cm height) connected to a photoelectric cell, and locomotor activity is detected by light beam breaks. Thirty-two infrared beam breaks, 16 each on perpendicular walls, were mounted 3 cm above the box frame floor and connected to an interface (LE 8811, LSI Letica Scientific Instruments, Barcelona, Spain), and data were sent to a computer. Thence, locomotor activity was assessed as a rate of light beam breaks during the period of the experiment [51]. This means that whenever the mouse opossum was moving vertically or horizontally the rate of light breaks would be recorded as the activity variable. Hence, the higher the rate of light beam breaks, the higher the activity of the individual.
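How a stream of beam-break events becomes an activity variable can be sketched as follows. This is an illustration only, under the assumption that each break is available as a timestamp in seconds from the start of the recording; it does not describe the internal processing of the Acti-Track software, and the function name `beam_break_rate` is hypothetical.

```python
def beam_break_rate(break_times, duration_s, bin_s=1.0):
    """Convert beam-break timestamps (seconds) into a rate per time bin.

    Returns one value per bin: the number of breaks in the bin divided
    by the bin width, i.e. breaks per second.
    """
    n_bins = int(duration_s // bin_s)
    counts = [0] * n_bins
    for t in break_times:
        k = int(t // bin_s)
        if 0 <= k < n_bins:  # ignore events outside the recording window
            counts[k] += 1
    return [c / bin_s for c in counts]


# Six breaks over four seconds: busy first and last seconds, idle third.
rates = beam_break_rate([0.2, 0.7, 1.5, 3.1, 3.2, 3.9], duration_s=4)
```

With one-second bins, a 12-hour recording yields one activity value per second for each individual.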
At the beginning of the experiment, each gracile mouse opossum was placed in the Perspex box and allowed to freely explore for 24 hours to habituate the individuals before conducting the experiments.Testing in the actimeter was done in an isolated room between 18:00 and 06:00, which corresponds to the activity of gracile mouse opossums in the wild.After each experiment, the Perspex box was carefully cleaned with a 5% ethanol cleaning solution.
Permission for animal collection was provided by SISBIO (Sistema de Autorização de Informação em Biodiversidade). Animal housing and experimental procedures were approved by Comissão de Ética no Uso de Animais, Universidade Estadual de Campinas.
Experimental settings such as those used in our study provide relevant information not only for the study of animal temporal activity pattern, but also for several areas of ecology and behavior, including, for example, the association between social and sexual preferences and genetic variation at microsatellite loci [52], modulation of vocalization by hormones [53], and the link between heritable neuroendocrine variation and male sexual behavior [54].
The data analysis was performed as follows. Twelve hours were selected from the continuous observed curves. Three families of wavelet bases were employed on the data: Symmlets, Coiflets and Daubechies. Preliminary analyses led to one smoothness parameter for each family: Symmlets 8, Coiflets 3 and Daubechies 6.
These are different wavelet bases. This means the aforementioned functions $\phi_{0k}$ and $\psi_{jk}$ depend on the basis chosen.
The data set was composed of temporal activity curves from 12 hours of data acquisition, in which data were taken every second for 6 males and 7 females of G. microtarsus. This data set was analyzed using the proposed wavelet model, for which there were $G = 2$ groups: females and males. We then estimated $f_1$ and $f_2$, with time rescaled to $[0, 1]$. One can notice the differences between $f_1$ and $f_2$ (see Equations (1) and (2) above). Moreover, the proposed dependent estimators are much more regularized than the previously available wavelet estimators.

Results
Numerical results did not differ much among the bases chosen (shown in Figure 1). However, the visual results for Coiflets were in general coarser than those for the other two bases. Daubechies' bases were interesting because of their theoretical and numerical properties, whereas Symmlets were the bases that relate most closely to Daubechies and look the most symmetrical. We should point out that, apart from the Haar basis, there are no orthonormal wavelet bases that are both compactly supported and symmetrical. Either the robust median absolute deviation (MAD) or the standard deviation may be employed to estimate the noise variability. Standard deviation is in general superior to MAD, since the latter yields less regular estimated curves [20]. The choice of the wavelet basis is usually quite unimportant. The use of any such basis leads to the same inferential results. Some local characteristics of the estimated curves are highlighted or shadowed by each basis, but the results are the same.
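The two noise-scale estimates and their effect on thresholding can be sketched in a few lines. The sketch assumes the usual normal-consistency constant 0.6745 for MAD and the universal threshold $\sigma\sqrt{2\log n}$; the text does not state which thresholding rule was used, so both are assumptions made for illustration, and all function names are hypothetical.

```python
import math
import statistics


def noise_scale_mad(detail_coeffs):
    """Robust noise scale: median absolute deviation over 0.6745."""
    med = statistics.median(detail_coeffs)
    abs_dev = [abs(d - med) for d in detail_coeffs]
    return statistics.median(abs_dev) / 0.6745


def noise_scale_sd(detail_coeffs):
    """Non-robust noise scale: sample standard deviation."""
    return statistics.stdev(detail_coeffs)


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 log n) (assumed rule)."""
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(coeffs, lam):
    """Shrink every coefficient towards zero by lam."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]


def hard_threshold(coeffs, lam):
    """Keep coefficients larger than lam in magnitude, zero the rest."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]
```

When a level contains a few large coefficients, the standard deviation is inflated by them while MAD is not. The resulting larger threshold removes more detail, which is one plausible reading of why the standard deviation led to more regular curves here.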
The data comprised a total of 43,200 observations for each specimen, that is, one observation every second for 12 hours of experiment. The curves were estimated for each sex, and several choices of wavelet basis and/or thresholding were employed. The individual autocorrelation estimates varied from 0.38 to 0.66, whilst the overall autocorrelation estimates for females and males were 0.46 and 0.55, respectively. The maximum difference between any estimates for the same data set given by the choice of the wavelet basis was not greater than 0.02.
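As a point of reference for the figures above, the sketch below checks the observation count and computes a plain sample lag-1 autocorrelation. This simple estimator is an assumption used for illustration; the autocorrelation estimates reported here come from the wavelet-based procedure of [20], which is not reproduced. The function name is hypothetical.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    return num / denom


# One observation per second for 12 hours.
n_obs = 12 * 60 * 60

# A steadily increasing series is positively autocorrelated,
# an alternating one negatively.
r_trend = lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0])
r_alternating = lag1_autocorrelation([1.0, -1.0, 1.0, -1.0])
```

Values between 0.38 and 0.66, as found for the individual curves, indicate that consecutive observations are far from independent, which motivates the dependent-error model.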
One should note that, as expected, the estimated curve with independent errors was much less regular than its dependent counterpart. This holds for each pair of curves with a fixed wavelet basis and thresholding procedure. Levels 6 - 9 were thresholded to estimate $\rho$, while levels 8 - 12 were thresholded in the final curve estimate for either independent or dependent models. Figure 1 displays the preliminary results for the three families of bases. From this, Coiflets do present visually less appealing results.
The FANOVA test results using Daubechies 6 and Symmlets 8, and the aforementioned thresholding configuration, were all favourable to the inequality of $f_1$ and $f_2$. For db6, the test statistics were [...].
An extensive numerical study was performed varying the regularity of the Daubechies and Symmlets bases and the thresholding parameters. The general results were that:
1) Test results were robust with respect to the choice of the wavelet basis and thresholding parameters.
2) The chosen basis was the least relevant aspect of the procedure, both in its inferential (and numerical) aspects and in the visual characteristics of the estimates.
3) The choice of the thresholded levels bears importance for the test results: one should understand that the differences between males and females were statistically significant for any estimation procedure, but the value of the test statistics and the cut-off points changed considerably with the thresholded levels.
4) The choice of the thresholded levels influenced the visual characteristics of the estimated curves.
5) Standard deviation was superior to MAD.

Discussion
In this study, we used wavelets to model activity data as curves and functional data analysis, in particular functional analysis of variance, to test the hypothesis that the mean activity curves differed between males and females of the small marsupial G. microtarsus. We rejected the null hypothesis by FANOVA, showing that, despite the common laboratory environment, male and female G. microtarsus did differ in their temporal activity pattern. The statistical differences in the temporal activity curves may be attributable to endogenous factors associated with the sexes. In fact, male and female G. microtarsus differ in many aspects of their life history and ecology [55]. Male and female G. microtarsus differ most remarkably in their demographics. Survival rate for females is constant during the pre- and post-mating periods. On the other hand, survival rate in males decreases sharply in the post-mating period and is significantly lower than that of females [41]. The decrease in survival rate in males is probably explained by post-mating mortality associated with stress that results from aggressive behavior and fighting between males during the mating period [41] [44]. This striking difference in life history is possibly associated with differences in activity between males and females.
Other aspects of the ecology and life history of G. microtarsus may also play a role in the observed statistical difference between male and female curves as revealed by FANOVA. Sexual dimorphism is very pronounced, with adult males (30 - 45 g) being heavier than adult females (20 - 30 g [41] [42]). The dimorphism in body size arises from males growing much faster than females, as inferred from a Gompertz growth model [48]. Body size has a positive, linear association with home range size, which is the area traversed by an individual in its normal activities of food gathering, mating and caring for the young [45] [46] [47]. In fact, observations in nature showed that home-range size in G. microtarsus was larger and more variable in males (0.14 ± 0.18 ha) than in females (0.12 ± 0.09 ha [48]).
Whereas wavelets have long been applied to problems in behavior and ecology (e.g. [9] [56]), the use of functional data analysis is very promising in this context. For example, functional principal components analysis (FPCA) was recently used to investigate the relationship between a prey species, the sandeel, and its predator, the black-legged kittiwake, in a dynamic marine ecosystem [57]. The authors demonstrated that FPCA was a useful tool to assess spatio-temporal patterns in natural ecosystems, and their study revealed the fine-scale details of the interaction between environmental factors, prey behavior and predator foraging behavior. Just recently, [58] developed a Bayesian model to study animal movement patterns at different temporal scales within the context of functional data analysis. They applied this model to estimate movement paths and associated movement descriptors of Canada lynx reintroduced to Colorado. In this application, B-splines were used, but the model was general enough to incorporate other basis functions such as Fourier series and wavelets. The approach by [58] seems extremely promising to reveal details of animal movements with important implications for population dynamics.
We believe that functional data analysis, when applied in conjunction with wavelets to model curves derived from temporal activity data, has the potential to provide an important methodological breakthrough in the study of ecology and animal behavior. One final observation regarding the applicability of wavelet techniques such as those proposed here for animal temporal activity data concerns regular sampling. Some studies of important phenomena do not allow for equally spaced data registers. Several options are available for adapting the proposed procedure to these cases. Among them, padding [21], lifted wavelets, and other second-generation wavelets can be employed [59].
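Of the options listed, padding is the simplest to illustrate. The sketch below assumes its most common use, extending a series to the next power of two so that a standard dyadic wavelet transform applies; whether [21] uses it in exactly this way is not stated here. The two padding modes and all function names are hypothetical choices made for illustration.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def pad_to_dyadic(x, mode="zero"):
    """Extend a series to dyadic length.

    mode="zero" appends zeros; mode="symmetric" appends a mirror image
    of the end of the series, which avoids an artificial jump at the
    boundary.
    """
    extra = next_power_of_two(len(x)) - len(x)
    if mode == "zero":
        tail = [0.0] * extra
    elif mode == "symmetric":
        tail = list(reversed(x))[:extra]
    else:
        raise ValueError("mode must be 'zero' or 'symmetric'")
    return list(x) + tail
```

For a series of 43,200 observations, the next dyadic length is 65,536, so roughly a third of the padded series would be artificial. This is one reason lifted or other second-generation wavelets may be preferable when the sampling grid itself is irregular.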
marães Martins for critical comments that improved the quality of the manuscript.
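Here $\hat{c}_{g,0k}$ is the estimated approximation coefficient for the $g$-th group, 0-th scale and $k$-th position, while $\hat{d}^{\,\mathrm{thr}}_{g,jk}$ is the thresholded wavelet coefficient for the $g$-th group, $j$-th scale and $k$-th position. The hypotheses of interest are $H_0: f_1 = \cdots = f_G$ versus $H_1: f_g \neq f_{g'}$ for at least one pair $g \neq g'$.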

The functions $\phi_{0k}$ and $\psi_{jk}$ will be different if we use each of these bases. We could write [...].

Figure 1 shows three estimators for each gender, based on three different wavelet bases: Symmlets 8, Coiflets 3, and Daubechies 6. The data are shown in gray. The respective estimators for the FANOVA model based on independent errors are shown in blue, and the estimators for the FANOVA model based on dependent errors are shown in red. With no loss of generality, time is transformed to $[0, 1]$.

Figure 1. Temporal activity curves analyzed using the proposed wavelet model. Curves obtained for three estimators for each gender, each one based on a different wavelet basis: Symmlets 8, Coiflets 3, and Daubechies 6. Data are shown in gray; the respective estimators for the FANOVA model based on independent errors are shown in blue; and the estimators for the FANOVA model based on dependent errors are shown in red. (a) Regularized wavelet mean 12-hour temporal activity curve of female Gracilinanus microtarsus. The wavelet filter is the Symmlet 8, and the regularizing parameters are $j_s = 5$ and $J = 9$.

The test statistics were compared against respective critical values of 10.67 and 5.78.