Performance Appraisal Criteria of Coaches According to the Age Group of the Athletes and the Level of Sports Competition

The purpose of our study was to analyse how coach performance appraisal frequency and criteria vary according to the age group of the athletes and the level of sports competition. To that end we surveyed a sample of 223 coaches of voluntary sports clubs in Madeira, Portugal, using an individual and anonymous questionnaire. Although in general coach performance appraisal was not treated as a systematic and structured process by the sports clubs, we did find significant differences between coaches of young athletes and coaches of adults, and between coaches at the regional level and coaches at the national/international level, with respect to the importance of sports results as a coach evaluation criterion. The study suggests applying structured practices in coach performance appraisal and a different approach to assessment according to the age group and specific context of the competition.

Performance appraisal is a complex process, and the criteria should be adjusted to the context of each coach so that the evaluation is correct, fair, and fruitful (O'Boyle, 2014). MacLean (2001) states that it is impossible to manage an organization without reliable information about the details and the real performance of staff. This is the only way to find fair and substantiated answers that can lead to the modification and improvement of work processes. For example, for the coaches of young athletes, performance appraisal is expected to take into account criteria that are not limited to team and athlete performance in competitions. It is essential to include criteria related to indicators of learning and development of the athletes' social skills, and also to their improvement in successive competitions. Skill level has a significant effect on young athletes' participation in competition. In order to reach an advanced skill level, athletes need to invest time and effort, and competition becomes a means to advance in the rankings (Koehn, 2012). Mallett and Côté (2006) recall that there are factors that do not depend on the coach's performance but can influence the athletes' performance, and highlight the following: the athletes' effort, punctuality and attendance, their diet, phase of puberty and stage of physical development, and sports injuries.
In order to improve the performance appraisal process of coaches, scales and models for evaluating coach performance have been suggested since the late twentieth century. For example, MacLean and Chelladurai (1995) proposed a model for evaluating the performance of Canadian college coaches based on six categories: sports scores of athletes/team; sports results achieved by the coach; skills directly related to the coach role (planning, guiding and evaluating training and competitions); behaviors indirectly related to the position of coach (e.g. selection of sport talent and recruiting new players); ability to secure financial stability, understanding the mission and regulations of the sports club; and interpersonal skills and good public relations (namely cooperating with parents of athletes, partnerships and communicating with potential partners, etc.). However, their model does not specify how much weight should be assigned to each category in a particular sports context.
With the aim of assessing the coaches' work during training and competition, regardless of the competitive level and the athletes' age, Côté, Yardley, Hay, Segwick and Baker (1999) developed a Coaching Behavior Scale for Sport (CBSS) that consists of six dimensions: physical training and planning; technical skills; personal rapport; goal setting; mental preparation; and negative rapport. This classification scale aims to help athletes express their opinions about the performance of their coaches. Another tool, the Coaching Efficacy Scale, was developed by Feltz, Chase, Moritz and Sullivan (1999) and aims to assess coach effectiveness. Compared to the Coaching Behavior Scale for Sport (Côté, Yardley, Hay, Segwick, & Baker, 1999), the main difference is that the assessment is carried out by the coaches themselves. This research has guided the evaluation process by proposing different categories on which the evaluator should focus. However, the process has to take into account the context in which the coach works. A coach who works with young athletes cannot be evaluated by the same standards as a coach of adult athletes. Moreover, the competition level at which the coach acts is a hidden variable that needs to be considered.
The preceding paragraph gains relevance in light of the Portuguese national coach training plan itself (IDP, 2010), which distinguishes four coaching levels. Level 1 corresponds to coaches at the start of their career, who may only work with athletes in a learning stage, and whose range of intervention must be adjusted to the specific needs of children and young people in a process of physical, intellectual, and social growth. Youth coaches thus have specific and unique attributions, since their role is highly important in developing the attitudes, self-esteem and psychosocial maturation of the athletes (Liukkonen, Laasco, & Telama, 1996). Instruction sustained by support and encouragement, together with silent monitoring, turns out to be a recurrent behaviour in youth coaches (Smith & Cushion, 2006). Leadership appears to be a further variable to consider since, as Duarte (2004) shows, coaches whose athletes display a greater degree of satisfaction and better performances employ various leadership styles, adjusted to the needs and to the achievement of the goals of their athletes/teams. On the other hand, coaches training adult athletes, particularly those competing at the highest level, are more concerned with the development of the athletes' ability to succeed nationally or internationally (Ramirez, 2002). Cunningham and Dixon (2003), noting that the assessment scales and models used in the past did not take into account the multi-level nature of sports organizations, developed a model that combines the most conservative theories with the most recent ones. Currently, performance appraisal is more centered on the development of human skills which are essential to the success of the organization.
The proposed model advocates the interdependence between the performance of the coach and the performance of the athlete and comprises six assessment dimensions: the team sports results, the team academic results, the ethical behavior, the financial responsibility, the recruitment quality and the athlete satisfaction.
The research so far shows that it is difficult to adopt a scale or a model to assess coaches without considering their specific context. Researchers argue that the role of the manager must be distinguished according to the context in which they are involved. For example, significant differences have been reported between youngsters and adults participating in sport competitions (Côté, Young, North, & Duffy, 2007; Feltz, Hepler, Roman, & Paiement, 2009), as well as between national and international competitions (Barber & Eckrich, 1998) [...]
2) To determine whether the age group of the athletes is associated with the practice of coach appraisal in sports organizations.
3) To determine whether the sports competition level is associated with the practice of coach appraisal.
4) To determine whether performance appraisal criteria are different for the coaches of young athletes and the coaches of adult athletes.
5) To determine whether coach performance appraisal criteria applied at the local/regional competition level are different from those at the national or international competitive level.

Methods
The analytical model underlying this research (Figure 1) was set up to answer the previously mentioned research objectives. Considering variables identified as having explanatory value in the literature on sports coach performance appraisal, we have regarded as independent variables the age group of the athletes and the sports competition level, and as dependent variables coach appraisal and performance appraisal criteria. We thus formulated the following research hypotheses concerning the relations between these variables.
Hypothesis 1: appraisal would be the most frequent for coaches with adult athletes only, followed by those with adult and young athletes, and least frequent for coaches of young athletes only.
Hypothesis 2: appraisal would be more frequent at the national/international level than at the regional level.
Hypotheses 1 and 2 are supported by the literature. Soares, Antunes and Rodrigues (2011) remark that the coaches of older teams are more committed to performance appraisal than the coaches of younger teams, in the same way that athletes and teams at more competitive levels are subject to more scrutiny from stakeholders. It thus seems plausible that appraisal will be valued differently for coaches whose athletes/teams compete at a regional (lower) level than for those whose teams take part in national/international level competitions.
Hypothesis 3: performance appraisal criteria would weigh more in the performance appraisal of coaches with adult athletes only than in the appraisal of coaches with young athletes only.
Hypothesis 4: performance appraisal criteria would be more important at the national or international competitive level than at the local/regional competition level.
Hypotheses 3 and 4 follow from performance appraisal models (Grote, 2002; Taylor & McGraw, 2006), which assume that appraisal criteria are related to the worker's functions. Thus if different roles are distinguished for coaches according to the target groups with which they work, i.e. athletes in different age groups or different competitive levels (IDP, 2010), then they must be appraised by different criteria.
[...] proportion with 95% confidence.
Since our aim was to study coaching in general, rather than in any particular sport, we attempted to make the sample reflect the representation of each sport [...]. Participation was requested in two ways: by making contact with the coaches in person, and by email.
The coaches were aged between 20 and 67 years old (mean = 34.3 years), 86.1% (192) were male and 13.9% (31) were female. The age groups of the athletes trained by these coaches were: under 11 years old (children), 11 to 13 years old (beginners), 14 to 15 years old (young), and 16 to 18 years old (junior).
These four categories we call "young only", and together they included 128 coaches (58% of replies). 27 coaches (12%) were involved in training adult athletes only, and 66 (30%) were involved in the training of both young and adult athletes. Two coaches did not reply to the survey question about the age group of the athletes they were training.
Regarding the sport competition level, 113 coaches (50.9%) participated in regional competitions and 109 (49.1%) participated in national or international competitions. At the regional level, coaches of young athletes represented 80.5%, while at the national and international level they represented 33%.
The study was approved by the scientific committee of the Department of Sport Sciences at the University of Madeira. Confidentiality of data for scientific purposes and anonymity of participants were guaranteed. All participants were volunteers, and informed consent was obtained from each of them.

Instrument
As part of our survey, a questionnaire based on the dimensions, models, and scales for the appraisal of coach performance found in the literature (MacLean & Chelladurai, 1995; Feltz, Chase, Moritz, & Sullivan, 1999; Cunningham & Dixon, 2003; Horn, 2002; Soares, Antunes, & Rodrigues, 2011) was developed and applied to the 127 coaches in our sample who stated that they had been evaluated prior to or at the start of the sports season. Table 1 lists the items in our questionnaire and the corresponding references in the literature. Coaches were asked to rate how important each criterion had been in their performance evaluation, on a 5-point Likert scale ranging from 1 "Not important at all" to 5 "Extremely important".

Item | Reference
[...] | [...] (1999)
10. Observation of the opponent strengths and weaknesses | MacLean and Chelladurai (1995)
11. Updating of knowledge through courses and training | Soares, Antunes and Rodrigues (2011)
12. Complying with the club rules and regulations | MacLean and Chelladurai (1995); Cunningham and Dixon (2003)
13. Attendance of the athletes in training and competitions | MacLean and Chelladurai (1995)
14. Punctuality of the coach | MacLean and Chelladurai (1995)
15. Contribution to the valuation of the club through public relation activities | MacLean and Chelladurai (1995)
16. Knowledge about the social phenomena regarding sports | MacLean and Chelladurai (1995)
17. Ability to influence the athletes' learning | Côté, Yardley, Hay, Segwick and Baker (1999)
18. The athletes' satisfaction towards the coach | Cunningham and Dixon (2003)
19. Ability to attract new athletes | MacLean and Chelladurai (1995); Cunningham and Dixon (2003)
20. Attendance of the coach at the practices | Soares, Antunes and Rodrigues (2011)
21. Ability to make loyal athletes in the sports club | Soares, Antunes and Rodrigues (2011)
22. Commitment and motivation of the coach | Soares, Antunes and Rodrigues (2011)
23. Responsibility of the coach | Soares, Antunes and Rodrigues (2011)

Before the survey was applied, it was subjected to content validation aimed at assessing the relevance and clarity of the questions, as well as the terminology. This was done by three experts with experience in the field of coach behavior analysis and by two academics who are involved in sports coaching. The main changes and corrections suggested by the experts and other reviewers were the following:
1) Changing the header, since in an early version of the questionnaire it included the aim of the survey but not the target population;
2) Adding a note at the start explaining what was meant by performance appraisal: "Performance appraisal should be understood as any actions by the evaluators (for instance: sports managers-presidents and/or directors, managers, general coordinators and/or sport coordinators) consisting of observing, assessing, ranking, following, or controlling the coach's activity, regardless of whether formal processes/instruments or meetings were used or not".
3) Placing the questions related to demographic data at the end of the survey rather than at the start, since such questions do not require as much reflection, thus being less subject to fatigue and lessened concentration.
After this phase, a pilot study was conducted with 27 coaches from different sports and different competition levels to verify that the questions were clear and understandable. The participants understood all questions and had no doubts when replying to the survey.
DOI: 10.4236/ape.2020.104032 (Advances in Physical Education)

Data analysis
The data were analysed with the IBM Statistical Package for the Social Sciences (SPSS) version 25. We considered the significance level α = 0.05 in all statistical tests. One-sided tests were used in the cases when previous research suggested a given direction for the group differences, as specified in the Methods section. The frequency of coach performance appraisal at the regional level and at the national/international level was compared with the Pearson chi-squared test. To compare the age groups with respect to the prevalence of coach appraisal, we used the one-sided Mantel-Haenszel linear association test, since we expected the frequencies to rise from the "young only" age group to the "young and adult" group and from the latter to the "adult" group.
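As an illustration of the Pearson chi-squared comparison between competition levels, the following minimal Python sketch computes the statistic and its p-value for a 2 x 2 table. It is not the SPSS procedure used in the study. The function name is ours, and the cell counts are hypothetical: they were chosen only to be compatible with the marginal totals reported for the sample (113 regional and 109 national/international coaches; 127 evaluated and 95 not evaluated), and are not the observed cell frequencies.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    table = [[a, b], [c, d]]; no continuity correction is applied.
    Returns (statistic, p-value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, col in enumerate(col_totals):
            expected = r * col / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (assumption, not the study's data).
# Rows: regional, national/international. Columns: evaluated, not evaluated.
counts = [[62, 51], [65, 44]]
stat, p_value = pearson_chi2_2x2(counts)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

With counts of this kind, a difference of about five percentage points in appraisal frequency between levels is far from significant at α = 0.05, which is in line with the qualitative finding reported in the Results.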
We further examined the relation between competition level and coach appraisal frequency while controlling for athlete age group, with the Cochran and Mantel-Haenszel tests (Armitage & Berry, 1994) of conditional independence.
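A test of conditional independence of this kind can be sketched as follows, assuming one 2 x 2 table (competition level by appraisal) per athlete age group. This is a minimal illustration of the Mantel-Haenszel statistic with an optional continuity correction, not a reproduction of the SPSS output; the function name and all counts are hypothetical.

```python
import math


def mantel_haenszel(strata, correction=True):
    """Mantel-Haenszel chi-squared test of conditional independence.

    strata is a list of 2x2 tables [[a, b], [c, d]], one per stratum.
    Returns (statistic, p-value) for 1 degree of freedom.
    """
    diff = 0.0  # sum over strata of observed minus expected count in cell a
    var = 0.0   # sum over strata of the hypergeometric variance of cell a
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        r1, r2 = a + b, c + d
        c1, c2 = a + c, b + d
        diff += a - r1 * c1 / n
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    numerator = abs(diff) - (0.5 if correction else 0.0)
    stat = max(numerator, 0.0) ** 2 / var
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical strata (assumption): young only, young and adult, adult only.
# Rows: regional, national/international. Columns: evaluated, not evaluated.
strata = [
    [[10, 10], [10, 10]],
    [[12, 8], [8, 12]],
    [[6, 4], [5, 5]],
]
stat, p_value = mantel_haenszel(strata)
print(f"MH chi2 = {stat:.3f}, p = {p_value:.3f}")
```

Stratifying by age group matters here because young-athlete coaches are much more common at the regional level, so a crude comparison of levels could confound the two variables.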
Turning to the analysis of the performance evaluation criteria, we began by reducing the dimensionality of the 23 item rankings in our questionnaire by performing a factor analysis on the correlation matrix, with factor extraction by principal components, varimax rotation with Kaiser normalization, and missing data deleted pairwise. The number of factors retained was determined according to the Kaiser criterion, and factor scores were estimated by Bartlett's method (details below, under the heading "Exploratory factor analysis").
We then compared coach performance criteria between the regional and the national/international levels with parallel Student t-tests on the extracted factors.
Although the sample was sizeable (n = 118), we were concerned that the normality condition was rejected by the Kolmogorov-Smirnov test for three of the five factors, and that some left-skewness and outliers were noticeable in these, so we replicated the comparison with the Mann-Whitney-Wilcoxon rank-sum test.
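For readers unfamiliar with the rank-sum approach, the sketch below computes the Mann-Whitney U statistic with midranks for ties and a tie-corrected normal approximation for the two-sided p-value. It is a simplified illustration under our own assumptions (SPSS may also report exact p-values for small samples); the function name and the example scores are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Mann-Whitney-Wilcoxon rank-sum test.

    Returns (U for sample x, z, two-sided p-value), using midranks
    and a normal approximation with tie correction.
    """
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    # Map each distinct value to the list of 1-based positions it occupies.
    positions = {}
    for idx, value in enumerate(sorted(pooled), start=1):
        positions.setdefault(value, []).append(idx)
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}
    u = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    var_u = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p_value


# Hypothetical factor scores (assumption, not the study's data).
regional = [-0.8, -0.3, 0.1, -1.2, 0.4, -0.5]
national = [0.6, 0.2, 1.1, -0.1, 0.9, 0.7]
u, z, p_value = mann_whitney_u(regional, national)
print(f"U = {u}, z = {z:.3f}, p = {p_value:.3f}")
```

Because the test uses only ranks, the left-skewness and outliers noted above affect it far less than they affect the t-test.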
We also compared the coach performance criteria among athlete age groups, using ANOVA multiple comparisons with the Bonferroni adjustment, as well as using the nonparametric Jonckheere-Terpstra pairwise comparisons (by the Dunn method) available in SPSS.
To determine whether the athlete age groups differed with respect to the coach performance appraisal criteria, we performed parallel one-way analyses of variance (ANOVAs) on the factor scores. Although the sample was sizeable (n = 118), normality was rejected by the Kolmogorov-Smirnov test for three of the five factors, which exhibited some left-skewness and outliers, so we performed the Kruskal-Wallis rank-sum test as well.
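The Kruskal-Wallis statistic used as the nonparametric counterpart of the one-way ANOVA can be sketched in a few lines. The function below is an illustrative implementation with a tie correction, and the p-value formula applies only to the case of three groups (two degrees of freedom), as in this study; the function name and the example scores are hypothetical.

```python
import math


def kruskal_wallis(groups):
    """Kruskal-Wallis rank-sum test with tie correction.

    groups is a list of samples. Returns (H, degrees of freedom).
    """
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    # Map each distinct value to the list of 1-based positions it occupies.
    positions = {}
    for idx, value in enumerate(sorted(pooled), start=1):
        positions.setdefault(value, []).append(idx)
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    h /= 1 - ties / (n ** 3 - n)
    return h, len(groups) - 1


# Hypothetical factor scores for the three age groups (assumption).
young_only = [-0.9, -0.4, -0.2, 0.1, -1.1]
young_and_adult = [0.3, 0.0, 0.8, -0.3, 0.5]
adult_only = [0.9, 0.4, 1.2, 0.6, 0.2]
h, df = kruskal_wallis([young_only, young_and_adult, adult_only])
# For a chi-squared distribution with 2 degrees of freedom, P(X > h) = exp(-h / 2).
p_value = math.exp(-h / 2)
print(f"H = {h:.3f}, df = {df}, p = {p_value:.3f}")
```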

Exploratory factor analysis
The number of replies to individual items in the questionnaire ranged from 123 to 126, and the number of nonresponses ranged from 1 to 4. One of the coaches who had been evaluated prior to or at the start of the sports season did not reply to any item. Modal item rankings ranged from 3 = "important" to 5 = "extremely important".
Except for item 1, Pearson correlations between items were positive and mod- [...] The Kaiser-Meyer-Olkin statistic was high, 0.866. Individual measures of sampling adequacy were 0.453 for item 1, and between 0.740 and 0.946 for the remaining items. Bartlett's test of sphericity was significant with p < 0.001 and χ²(253) = 2182.8, but we must caution that many of the item rankings had markedly left-skewed distributions.
To facilitate data analysis and interpretation, and to make our results comparable with others in the literature, we conducted a factor analysis on the correlation matrix by the principal components method with varimax rotation. Missing responses were deleted pairwise. The rotation converged in 7 iterations.
Using the Kaiser criterion, five factors were extracted. Although the latter two comprise only two items each, and the Cronbach alphas for the corresponding items are only moderate to low, we argue that the factors we obtained can be meaningfully interpreted as tentatively described in Table 2, and are consistent with other findings in the literature. Factor scores were estimated by Bartlett's method and used in the subsequent analyses which we report in the following section.
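The Kaiser criterion retains the components of the correlation matrix whose eigenvalues exceed 1. The following sketch illustrates only that retention rule, on a hypothetical 3 x 3 correlation matrix; it does not perform varimax rotation or Bartlett factor scoring, and it is not the SPSS routine used in the study. Eigenvalues are obtained with a plain cyclic Jacobi iteration so that the example needs no external library; the function names are ours.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def kaiser_count(eigenvalues):
    """Number of components retained under the Kaiser criterion."""
    return sum(1 for value in eigenvalues if value > 1.0)


# Hypothetical correlation matrix for three items (assumption).
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalues = jacobi_eigenvalues(corr)
print(eigenvalues, kaiser_count(eigenvalues))
```

Since the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue above 1 means the component accounts for more variance than a single standardized item.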
The reliability of the items associated with each of the factors extracted was assessed with Cronbach's alpha and with the average inter-item correlation.
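Both reliability measures can be computed directly from the item ratings. The sketch below assumes complete responses (no missing values) and uses the usual definition of Cronbach's alpha based on sample variances; the function names and the ratings are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items is a list of k lists, each holding one item's ratings
    for the same respondents in the same order.
    """
    k = len(items)
    totals = [sum(ratings) for ratings in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_inter_item_r(items):
    """Average Pearson correlation over all pairs of items."""
    return mean(pearson_r(a, b) for a, b in combinations(items, 2))


# Hypothetical 5-point ratings from five coaches on two items (assumption).
item_a = [4, 5, 3, 4, 5]
item_b = [4, 4, 3, 5, 5]
print(round(cronbach_alpha([item_a, item_b]), 3))
print(round(mean_inter_item_r([item_a, item_b]), 3))
```

For factors with only two items, as with the last two factors here, alpha tends to be low even when the items correlate reasonably well, which is why the average inter-item correlation is a useful complement.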

Frequency of Coach Performance Appraisal
Of the 223 coaches questioned, 43% (95) replied that their performance had not been assessed before the season, 57% (127) replied they had been evaluated, and one did not reply. The results of the study show that the practice of coach performance appraisal in sports clubs is weak and rather unstructured. This result may jeopardize the administrative and strategic objectives pursued by sports clubs, such as making decisions about rewards or training programs for coaches.
Moreover, it can also undermine the motivation and commitment of coaches, because without evaluative feedback, human resources become discouraged. Consequently, 95 coaches of the initial sample could not provide information that would allow us to characterize and analyze the functioning of the evaluation process, in particular regarding the evaluation criteria used.

Relations between Age Group of Athletes, Competition Level, and the Frequency of Coach Performance Appraisal
For coaches at the regional level of competition, the frequency of evaluation [...] Table 3). At the national/international level, the prevalence of evaluation was also similar between coaches of young athletes only and coaches of young and adult athletes. However, coaches in the "adult only" group stood out from the others, with nearly 10% more evaluations than those in the other groups of the same competition level. The overall difference between the frequency of evaluation for national/international and for regional coaches was only 5%, even though we are aware that coaches at the national and international competition levels are subject to a practical performance evaluation which is more demanding than that of coaches in the regional competitions.
Even though the numbers appear to be suggestive, differences between competition levels and between athlete age groups with respect to coach appraisal failed to reach statistical significance in our sample. Treating appraisal as the dependent variable, competition level as the independent variable, and age group as a stratification variable, we found that the strata can be regarded as

Ranking of Each Factor in Coach Performance Appraisal
According to the rating scale used (1 = "Not important at all" and 5 = "Extremely important"), the category "Attendance and punctuality of the coach" was the one with the highest average item ranking (mean = 4.4). Conversely, and curiously, sports results were the criterion least valued by the coaches in their own performance appraisal (mean = 3.5).

Comparison of Athlete Age Groups with Respect to Coach Performance Criteria
We performed a one-way analysis of variance on the factor scores to determine whether the coach performance appraisal criteria differed between the three athlete age groups. Because some of the factor scores showed left-skewness and some outliers, we complemented the results of the ANOVA with a Kruskal-Wallis test. According to both tests, differences in coach appraisal criteria among athlete age groups are only significant with regard to factor V, "sports results and scouting" (see Table 5). We examined the factor "sports results and scouting" further by making pairwise comparisons, both parametrically with the Bonferroni adjustment and with the nonparametric Jonckheere-Terpstra pairwise comparisons, as shown in Table 6. Since we expected that sports results would be more important in the older age groups, we report one-sided test p-values. Both tests lead to the conclusion that the criterion "sports results and scouting" is given significantly less weight in the evaluation of coaches training only young athletes than in the evaluation of coaches training only adult athletes or a mix of adult and young athletes.
Table 6. Multiple comparisons between athlete age groups with respect to the coach appraisal criterion "sports results and scouting".

Comparison of Competition Levels with Respect to Coach Performance Criteria
To compare the coach performance appraisal criteria in the two competitive levels, regional and national/international, we applied the t test to the factor scores (see Table 7). Again, only the differences with respect to sports results and scouting were statistically significant at the 5% level, scoring higher for the national/international competition level than for the regional competition level.

Discussion
A surprising result concerns the large number of coaches who had not been evaluated (43%). According to theories of human resources management, which hold that performance appraisal of staff is essential to the development of the organization, the results of the research are quite discouraging. However, it is important to recall that this assessment is conducted in a sports context, so we admit that the results are not so surprising. Studies show that there is discrimination in the practice of human resources performance appraisal in sports organizations (Gilbert & Trudel, 2004; Lin, 2009). Barber and Eckrich (1998), in a study which surveyed sport managers responsible for evaluating basketball and cross-country coaches (National Collegiate Athletic Association), found that, despite the importance attached to coach performance appraisal, one in five sports departments in this association did not apply a formal evaluation system. Moreover, many of these evaluations were essentially based on personal and subjective impressions obtained during competitions throughout the season. In this context, different factors can explain the results of this study: first, the high cost of implementing an assessment process (Bennice, 1990), and second, the fear of reciprocal reaction between the assessors and the assessed (MacLean, 2001). Finally, board directors, who have considerable influence in decision-making (Ferkins, Shilbury, & McDonald, 2005), are somewhat resistant to implementing professional practices and show little interest in conducting a regular and structured system of human resources evaluation. Perhaps for this reason, the sports clubs studied present a very weak system of management (Taylor, Doherty, & McGraw, 2008).
We also know that there are latent conflicts between volunteer board members and the role of the professional sports manager (Amis, Slack, & Berrett, 1995; Soares, Correia, & Rosado, 2010) and this is part of a complex sport governance process (Kikulis, 2000).
We recall that when we tested whether the age group of the athletes or the competitive level were associated with the presence of coach evaluation no significant association was found, which seems to underscore the point made in the previous paragraph.
The category of evaluation that was most valued in this study (attendance and punctuality of the coach) differs from that found in similar studies, such as the study by Surujlal and Singh (2006). In that study the most prominent category was "strategy", that is, the capacity of the coach to use the available techniques and tactics effectively and efficiently. The overvaluation of this category in relation to the others by the board and coaches can be explained by the relation between the strategic options and the sport results obtained in competitions. The difference between the ranking of sports results in our study and in other studies could be explained by the fact that many of the coaches surveyed work with younger athletes. At these levels the criteria are mostly linked to pedagogic skills, social skills and differentiation in training according to the athletes' learning stage. These were considered more important than the categories of strategy and sport competition tactics.
Yet it is curious that in a previous study by Soares, Antunes and Rodrigues (2011), which surveyed the evaluators from the clubs where the coaches in our sample were working, the performance criterion "sports results" ranked second, preceded by leadership and motivation skills, and making loyal athletes in the sports club.
In a comparative analysis of the coach's evaluation criteria according to the age group with which the coach works, there were no statistically significant differences except for the category "Sports results and scouting" (p < 0.001). This result is relevant if we consider the trends in models of sports training for young people. These tend to reject traditional models that replicate the practices developed with senior athletes for the purpose of obtaining short-term sports scores (Bailey et al., 2011). The guidelines of the National Program of Coaches (IDP, 2010), as regards the skills and knowledge of youth coaches and of coaches at senior levels, suggest the need for differentiation in terms of the evaluation criteria. Moreover, the results of this study point to significantly different purposes for coaches of youth teams and coaches of adults. In this sense, the evaluation criteria should also be distinguished.
As for the competitive level at which coaches are involved, it showed no significant effect, except once more for the category "sports results and scouting" (p = 0.001). Barber and Eckrich (1998) found a meaningful relationship between the competitive level of the athletes and the evaluation criteria used for coaches. The authors show that the category "success of the program", related to championships or positive scores achieved by the athletes/team, was more valued at more advanced competitive levels (1st Division) than in lower-level competitions (2nd and 3rd Divisions). Thus, the valuation of sports scores and rankings can be regarded as an important element in the performance appraisal of coaches of adult athletes.
Often the success of coaches is based on predefined performance results, such as a linear relationship between organizational objectives and the wins of the athletes. Thus, the comparison between the performance of coaches through tangible parameters, such as sports results achieved by athletes, is highly valued (Mclean & Mallett, 2012). This view is shared by the fans and, above all, by organizations whose coaches participate at the highest level of sport competition (Rynne, Mallett, & Tinning, 2006).
In fact, the achievement of positive sports results is essential for athletes to continue competing at higher levels. If athletes do not win competitions or trophies, clubs may fail to attract media attention. Accordingly, they may lose financial and logistical support from their sponsors, calling into question their sustainability. According to Vallerand and Losier (1999), this pressure on coaches may lead to a decrease in intrinsic motivation and thus to negative cognitive, affective and behavioral outcomes.
Finally, it is not surprising that sports results are more relevant for the performance appraisal of coaches of adults (p < 0.001) and of coaches at higher levels of sport competition (p = 0.001). Adult athletes are the ones who compete at the higher levels, which offer more social status and recognition, and achieving positive and quantitative results/records is essential for athletes to continue competing at these levels. If athletes/teams do not win, clubs go down in the league, which makes participation in the main sport competitions impossible.

Conclusion
The results suggest that structured practices should be introduced in coach performance appraisal in sports clubs, and that the assessment process should be differentiated according to the age group and level of sport competition. The criteria adopted in the assessment should take into account the specific context of the coach's work.
Performance appraisal should be based above all on criteria aligned with the goals set out for each particular coach, which means that it should allow sports clubs to make administrative decisions (progressions and rewards) or strategic ones (coach training) justified by the "actual" results of their performance. In response to the scarcity of literature concerning the present subject, more studies are required, as well as a research methodology to identify and clarify the coach performance appraisal process. To that end new research may be carried out, for instance research in which the perceptions of the evaluators (sports directors, managers, coordinators) and the perceptions of the coaches are directly compared, so that one could determine how far performance appraisal and the underlying cognitive, social, emotional and political relationships contribute to a change in the actual coach performance.
It is our view that the present work may serve to alert sports regulators to this need, so that they may develop a coach performance appraisal model with guidelines (both general and specific) to help sports clubs set up their own systems of evaluation. Any such proposal should promote the development of coaching skills, focusing on diagnosing the causes of poor performance, and not just on assessing and justifying whether a given objective has been met. An example is the model developed by the Coaching Association of Canada. Such a model could be incorporated into the Portuguese National Plan of Coach Training, which we find lacking in this respect.