Empirical Analysis of the Processes Driving Data Analytics and Their Outcomes

Abstract

How does data analytics increase firm performance? Which organizational processes of data analytics matter for outcomes? This study examines these questions empirically using the RIETI Data Analytics Survey on data utilization and finds the following three implications. First, strategies for data analytics promote the data analytics competencies of the organization and its human resources, and these competencies improve the quality of datasets, resulting in higher firm performance. Second, non-listed firms and non-manufacturing firms are more likely to achieve outcomes from data analytics than listed firms and manufacturers. Third, strategies for data analytics tend to improve productivity, existing products, and customer relations, but not new product/service development.

Share and Cite:

Kanama, D. (2023) Empirical Analysis of the Processes Driving Data Analytics and Their Outcomes. Open Journal of Business and Management, 11, 1034-1052. doi: 10.4236/ojbm.2023.113057.

1. Introduction

Interest in the relationship between data analytics and firm performance has grown rapidly since the 2010s.1 The reason is that there are many cases where data analytics has been highly effective in key innovation and marketing activities, including product development, sales channel development, and customer management. Data analytics also contributes to improving internal operational efficiency, such as in human resource management and decision-making (e.g., Ghasemaghaei et al., 2018). Almost inevitably, many firms worldwide have launched strategic initiatives on data analytics (Mikalef et al., 2019).

However, when it comes to data analytics capabilities and the initiatives based on those capabilities, there is still a strong sense of uncertainty as to what pathways will ultimately lead to sales and profit (Wamba et al., 2019; Mikalef et al., 2019). This leaves corporate executives asking themselves: is our data analytics capability lacking; is our organizational structure flawed; or is the problem with the data itself? As a result, they may not be able to find appropriate solutions and may be forced to operate inefficiently without obtaining sufficient benefits from their data investments.

In light of the above issues, the purpose of this study is to quantitatively analyze the relationship between the formulation of a strategy for data analytics by a firm and the subsequent penetration and results of data analytics in that firm. The paper examines this research question using the Research Institute of Economy, Trade and Industry’s 2020 Questionnaire Survey on Data Analytics (hereinafter, RIETI Data Analytics Survey). This survey was conducted as part of a research project by the RIETI, titled “Research on Systems and Management for Global Promotion of Data and AI Analytics—Toward Establishment of a Global Data Supply Chain.”

In most firms, a vision (the image of what the firm wants to be, the direction it aims to take, etc.) and a mission (action guidelines for the entire firm, etc.) are set as upper-level management concepts, and lower-level strategies (management strategy, business strategy, intellectual property strategy, etc.) are then formulated as measures to realize them. In this study, we are interested in strategies for data analytics and organizational responses that are at a level lower than these strategies. In other words, we assume that a strategy for data analytics will be formulated and that, in accordance with that strategy, the organization will be reorganized and developed, knowledge will be acquired, human resources will be trained, and their capabilities will be built as needed. Progress on these initiatives will then lead to the achievement of specific business objectives.

The RIETI Data Analytics Survey used in this study asked for responses on a five-point Likert scale regarding the progress on strategies and the management related to data analytics. We also obtained information on each firm’s performance results. This method allowed us to statistically analyze the relationship between strategy formulation and outcomes. Further, we proposed two related estimation models and six hypotheses and verified them using multiple regression analysis.

This article is organized as follows. Section 2 develops the hypotheses with a brief literature review, generating six hypotheses and the two estimation models shown in Figure 1 and Figure 2. Section 3 discusses the dependent, independent, and control variables and presents the basic estimation equation. Section 4 reports the results of the multiple regression analyses in five tables and examines the six hypotheses. Section 5 concludes with a summary of the results, their academic contributions, and future challenges.

2. Prior Research, Hypotheses Development, and Estimation Models

Here, we review the empirical research on the relationship between data analytics and firm performance in recent years, and then generate our estimation models and hypotheses.

Many prior studies suggest that the effectiveness of data analytics can only be realized when it is deeply integrated into the business strategy (e.g., Wamba et al., 2019; Wamba et al., 2017). In particular, the commitment of management is important, as it ensures that an organizational structure for evidence-based decision-making is established and in-house data analytics literacy is improved (Coluccia et al., 2020; Santoro et al., 2021). This leads to an improvement in the skills and knowledge of the personnel responsible for data analytics within the organization (Ghasemaghaei et al., 2018).

Similarly, according to Ferraris et al. (2019), data analytics is most effective when combined with in-house knowledge management. They cite the importance of strategically considering what type of knowledge should be supplemented and strengthened within the company and how it should be used to guide optimal decision-making before proceeding with the deployment of internal systems within the organization.

As described above, previous research shows that interest in the relationship between data analytics and firm performance has been strong since the 2010s. However, when it comes to data analytics capabilities and initiatives based on those capabilities, there is still a strong sense of uncertainty as to what pathways will ultimately lead to sales and profit (Wamba et al., 2019; Mikalef et al., 2019). As a result, firm executives may be unable to identify appropriate processes for data analytics and utilization and may fail to obtain sufficient benefits from their data investments.

In this regard, Ren et al. (2017) examine the relationship of two distinct aspects of data analytics with firm performance: the organizational deployment of information systems to handle the data (system reliability, convenience, flexibility, speed of response to requests, control of personal information, etc.); and the quality of the data itself (completeness, accuracy, compatibility with various data formats, etc.). Their results show that both aspects have a positive impact on firm performance, but the organizational deployment of information systems has a greater effect, which in turn encourages the improvement of the quality of the data itself.

Based on the above discussion, we propose the following two hypotheses.

Hypothesis 1: The formulation of a strategy for data analytics will lead to the deployment of a data analytics system and the acquisition of relevant skills and knowledge.

Hypothesis 2: The deployment of a system for data analytics and the acquisition of relevant skills and knowledge will lead to the qualitative improvement of the data.

In addition, the promotion of data analytics strategies improves the performance of the firm in the market. For example, according to Raguseo and Vitari (2018), promoting data analytics improves a firm’s position against competitors in the markets it participates in. In particular, it has a strong positive effect on customer satisfaction. Similarly, data analytics has the potential to make a firm’s position more advantageous, particularly in highly competitive industries (Müller et al., 2018; Song et al., 2018).

Song et al. (2018) examine data analytics and firm performance in the B2C online market. Their results show that the promotion of data analytics is particularly effective in highly competitive industries, especially for firms with a wide variety of products.

Based on the above discussion, we posit two more hypotheses.

Hypothesis 3: The qualitative improvement of data will lead to the realization of positive outcomes from data analytics.

Hypothesis 4: The formulation of a strategy for data analytics will lead to positive outcomes from data analytics.

The model shown in Figure 1 summarizes the previous research and the proposed hypotheses. Specifically, the hypotheses are structured according to the following process: formulating a data analytics strategy leads to system deployment and the acquisition of skills and knowledge, which in turn leads to the qualitative improvement of the data. The ultimate result of this process will be positive outcomes in business activities.
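One hedged way to read the process in estimation model (1) is as a sequence of regressions, sketched below. The variable names and the coefficient symbols $a$, $b$, $c$, and $d$ are illustrative shorthand introduced here, not notation taken from the survey or the figures, and the control variables are collected in a generic term.

```latex
\begin{align*}
\mathrm{system}_i  &= a_0 + a_1\,\mathrm{strategy}_i + \mathrm{controls}_i + u_i && \text{(Hypothesis 1)}\\
\mathrm{skills}_i  &= b_0 + b_1\,\mathrm{strategy}_i + \mathrm{controls}_i + v_i && \text{(Hypothesis 1)}\\
\mathrm{quality}_i &= c_0 + c_1\,\mathrm{system}_i + c_2\,\mathrm{skills}_i + \mathrm{controls}_i + w_i && \text{(Hypothesis 2)}\\
\mathrm{outcome}_i &= d_0 + d_1\,\mathrm{quality}_i + d_2\,\mathrm{strategy}_i + \mathrm{controls}_i + e_i && \text{(Hypotheses 3 and 4)}
\end{align*}
```

Under this reading, each hypothesis corresponds to a positive sign on the relevant slope: $a_1$ and $b_1$ for Hypothesis 1, $c_1$ and $c_2$ for Hypothesis 2, $d_1$ for Hypothesis 3, and $d_2$ for Hypothesis 4.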

Hitherto, we have developed a set of hypotheses and an estimation model based on the assumption that changes in the external environment will lead to the formulation of a strategy to promote data analytics, followed by the deployment of an organizational structure and the acquisition of skills and knowledge on data analytics. We also assume that the quality of the data will improve as a result of the progress of these factors, ultimately leading to better firm performance.

However, under this structure, it is impossible to determine what business activities data analytics is effective for and, conversely, what business activities it is not effective for. In some firms, specific data analytics objectives may only become clear after the organizational structure for data analytics, skills and knowledge, data quality, etc. are in place. We therefore modified Hypotheses 3 and 4 in Figure 1 to consider this process. The revised estimation model is shown in Figure 2. Specifically, in the part of Figure 1 that refers to outcomes, we replace “outcomes” with “objectives” (Hypothesis 5) and attempt to ascertain the status of achievement of the objectives (Hypothesis 6).

Figure 1. Estimation model (1) (Created by author).

Figure 2. Estimation model (2) (Created by author).

Hypothesis 5: Progress in the formulation of a strategy for data analytics, deployment of a system, acquisition of skills and knowledge, and qualitative improvement of the data will lead to the clarification of the specific objectives of data analytics.

Hypothesis 6: Progress in the formulation of a strategy for data analytics, deployment of a system, acquisition of skills and knowledge, and qualitative improvement of the data will lead to positive outcomes from data analytics by meeting the specific objectives of the data analytics.

3. Variables and Estimation Equation

Variables

In estimation models (1) and (2), the dependent and independent variables are related in complex ways and vary depending on the hypothesis. Therefore, we first identify these variables, then the control variables, and finally the estimation equation.

1) Dependent and independent variables

The RIETI Data Analytics Survey included a question asking how much progress the responding firms had made in their data analytics efforts as of the end of March 2020.2 This question covered 19 initiatives related to data analytics, and responses were given on a five-point scale: “strongly agree,” “agree,” “neutral,” “disagree,” and “strongly disagree.” The 19 items were divided into four categories: “Strategy and policy (formulation of data analytics strategy)” (three items), “Implementation systems and tools (deployment of systems)” (six items), “Skills, knowledge, and talent (acquisition of skills and knowledge)” (four items), and “Data (qualitative improvement of data)” (six items).

The responses to these four item groups were set as the dependent or independent variables for Hypotheses 1 and 2. Specifically, “data analytics strategy” was only an independent variable, while “deployment of systems” and “acquisition of skills and knowledge” were set as dependent variables in Hypothesis 1 and independent variables in Hypothesis 2. In addition, “qualitative improvement of data” was set as a dependent variable for Hypothesis 2.
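As a minimal sketch of how the four item groups could be turned into regression variables, the Python snippet below scores one firm's 19 answers. Several things here are assumptions rather than details given in the survey description: the numeric coding of the response options (1 to 5), the use of a simple mean within each group, the ordering of the items, and the short group names.

```python
from statistics import mean

# Assumed numeric coding of the five response options (not stated in the paper).
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

# Item counts per category as reported in the text: 3 + 6 + 4 + 6 = 19.
# The short keys are hypothetical labels for the four item groups.
CATEGORY_SIZES = {
    "strategy": 3,
    "systems": 6,
    "skills": 4,
    "data": 6,
}


def category_scores(responses):
    """Split 19 ordered Likert answers into the four groups and average each.

    Assumes the answers arrive in category order (strategy, systems,
    skills, data); averaging is an illustrative choice, not the paper's.
    """
    if len(responses) != sum(CATEGORY_SIZES.values()):
        raise ValueError("expected 19 responses")
    scores = {}
    start = 0
    for name, size in CATEGORY_SIZES.items():
        chunk = responses[start:start + size]
        scores[name] = mean(LIKERT[answer] for answer in chunk)
        start += size
    return scores


# Hypothetical answers for one firm, for illustration only.
example = (
    ["agree"] * 3
    + ["neutral"] * 6
    + ["strongly agree"] * 4
    + ["disagree"] * 6
)
print(category_scores(example))
```

A group score of this kind can then serve as a dependent variable in one hypothesis and as an independent variable in the next, as described above.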

In Hypotheses 3 and 4, all four groups were set as independent variables, while the results of the responses regarding the outcomes of data analytics to date were set as dependent variables. For the questions about the outcomes of data analytics, the responding firms answered using a five-point scale, ranging from “No results have been achieved yet” to “Concrete results (sales, cost reductions, etc.) have been achieved in multiple businesses.”

For the dependent variables of Hypotheses 5 and 6, we used the results of a survey conducted in fiscal year 2019 on the objectives of the responding firms’ data analytics and the status of their achievements. The respondents were asked to respond to each of eight items, including “organizational reform and management strategy formulation” and “human resource development and capacity building,” on a four-point scale (“not an objective,” (if an objective) “did not achieve the objective,” “achieved the objective,” or “achieved or exceeded the objective”). For Hypothesis 5, we applied binomial logistic regression with “objective or not” as the dependent variable. Furthermore, for Hypothesis 6, we set up an ordinal logit with “the extent to which the objective was achieved” as the dependent variable. Since eight items were prepared, estimates were made for each one.
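The recoding implied by this description can be sketched as follows. The function names are hypothetical, and treating firms for which an item is not an objective as missing in the Hypothesis 6 target is an assumption made for illustration, since the text does not state how such firms were handled.

```python
# The four response options quoted in the text.
NOT_OBJECTIVE = "not an objective"
LEVELS = [
    "did not achieve the objective",
    "achieved the objective",
    "achieved or exceeded the objective",
]


def is_objective(answer):
    """Binary target for Hypothesis 5: 1 if the item was an objective, else 0."""
    if answer != NOT_OBJECTIVE and answer not in LEVELS:
        raise ValueError(f"unknown response: {answer!r}")
    return 0 if answer == NOT_OBJECTIVE else 1


def achievement_level(answer):
    """Ordinal target for Hypothesis 6: 0, 1, or 2 in increasing achievement.

    Returns None when the item was not an objective (assumed here to be
    excluded from the ordinal model).
    """
    if answer == NOT_OBJECTIVE:
        return None
    return LEVELS.index(answer)


# Hypothetical answers to one of the eight items from four firms.
answers = [
    "not an objective",
    "did not achieve the objective",
    "achieved the objective",
    "achieved or exceeded the objective",
]
print([is_objective(a) for a in answers])
print([achievement_level(a) for a in answers])
```

Because each of the eight items is recoded separately, this yields eight binary targets for the binomial logistic regressions and eight ordinal targets for the ordinal logit models.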

2) Control variables

Factors that could affect the dependent variable were net sales, R&D expenditures (total), industry (manufacturing industry dummy), whether the company is listed (listed dummy), number of patents held in Japan, and whether there is a person in charge of data analytics company-wide (person in charge dummy).

The RIETI Data Analytics Survey also asked the responding firms whether they had received data held by other institutions and whether they had implemented projects under a partnership agreement. We added this as a control variable based on the assumption that firms with this experience are more likely to advance their data analytics strategies and human resource development on an extrinsic basis.

The above variables are summarized in Table 1.

3) Estimation equation

The basic estimation equation is expressed as follows:

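$$\mathrm{performance}_i = \alpha + \beta \times \mathrm{data}_i + \sum_{k=1}^{K} \gamma_k X_{k,i} + \varepsilon_i \quad (1)$$

Subscript $i$ indexes the responding firms in the dataset, and $X_{k,i}$ denotes the $k$-th of the $K$ control variables. $\alpha$, $\beta$, and $\gamma_k$ are the parameters to be estimated, and $\varepsilon_i$ is the error term, distributed as $N(0, \delta_\varepsilon^2)$. The dependent variable performance is as described above. The method of analysis is ordinary least squares (OLS) regression.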
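To make Equation (1) concrete, the following dependency-free Python sketch estimates the intercept, the coefficient on the data analytics variable, and the control coefficients by solving the normal equations. It is an illustration under stated assumptions, not the estimation code used in the study: the function names are hypothetical, the data are synthetic, and in practice a statistics package would normally be used instead.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(performance, data, controls):
    """Least-squares estimates of alpha, beta, and the gamma_k in Equation (1)."""
    # Each design row is [1, data_i, X_1i, ..., X_Ki].
    rows = [[1.0, d] + list(x) for d, x in zip(data, controls)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, performance)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    return {"alpha": coef[0], "beta": coef[1], "gamma": coef[2:]}


# Synthetic illustration only; these numbers are not survey data.
rng = random.Random(0)
n = 200
data_scores = [float(rng.randint(1, 5)) for _ in range(n)]
controls = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
performance = [
    1.0 + 0.5 * d + 0.3 * x[0] - 0.2 * x[1] + rng.gauss(0, 0.1)
    for d, x in zip(data_scores, controls)
]
estimates = fit_ols(performance, data_scores, controls)
print(estimates)
```

With the small noise level chosen here, the printed estimates should lie close to the values used to generate the synthetic data (1.0, 0.5, 0.3, and -0.2).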

4. Results

Table 1 shows the estimation results for Hypothesis 1. Models 1-1 and 1-2 present the results with system deployment as the dependent variable and Models 1-3 and 1-4 the results with skill and knowledge acquisition as the dependent variable. The four statistical significance levels are indicated in the notes of the table.

The formulation of strategies for data analytics significantly contributes to the deployment of systems and the acquisition of skills and knowledge. However, the significance levels and coefficients are higher for firms that put forth strategies to collaborate with external organizations than for those opting for a company-wide collaborative strategy.

Among the control variables, negative statistical significance is seen for some of the listed and manufacturing industry dummies. In other words, it is easier for unlisted firms than listed firms and for non-manufacturing firms than for manufacturing firms to deploy data analytics systems and acquire skills and knowledge.

Next, the results of testing Hypothesis 2 are shown in Table 2. The statistical significance levels show that the greater the progress in system deployment and the acquisition of skills and knowledge, the greater the qualitative improvement of the data. System deployment, in particular, has a significant impact. The listed and manufacturing industry dummies show a tendency similar to that seen for Hypothesis 1.

Next, the verification results of Hypotheses 3 and 4 are shown in Table 3. These results are key for understanding whether data analytics leads to positive outcomes. They show that progress in the formulation of a strategy for data analytics, the deployment of a system, the acquisition of skills and knowledge, and the qualitative improvement of the data tends to improve the outcomes of data analytics.

Interestingly, the results of Model 3-6 show that, when all four item groups are entered into the estimation equation, only data improvement is statistically significant, at the 0.1% level. The magnitude of the coefficients suggests that the outcome of data analytics in a firm may be greatly influenced by how efficiently the process proceeds from one step to the next, from strategy formulation through the qualitative improvement of the data. Given that many firms are currently halfway through this process, continued efforts are important.
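A stylized illustration may help in reading this result. Assume, purely for exposition, a strictly sequential chain without control variables, in which strategy affects outcomes only through system deployment and data quality; the symbols $\kappa$, $\lambda$, and $\theta$ are hypothetical path coefficients, not estimates from the tables.

```latex
\begin{align*}
\mathrm{system}_i  &= \kappa\,\mathrm{strategy}_i + u_i,\\
\mathrm{quality}_i &= \lambda\,\mathrm{system}_i + w_i,\\
\mathrm{outcome}_i &= \theta\,\mathrm{quality}_i + e_i\\
\Rightarrow\quad \mathrm{outcome}_i &= \theta\lambda\kappa\,\mathrm{strategy}_i
  + \left(\theta\lambda\,u_i + \theta\,w_i + e_i\right).
\end{align*}
```

In such a chain, strategy has a total effect of $\theta\lambda\kappa$ on outcomes, yet its coefficient is zero once data quality is included in the same regression, because quality already carries the upstream effects. A weak link at any step shrinks the product, which is one way to interpret the importance of moving efficiently from one step to the next.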

Table 1. Estimation results for Hypothesis 1.

Table 2. Estimation results for Hypothesis 2.

Table 3. Estimation results for Hypotheses 3 and 4.

Next, we examine the results of testing Hypothesis 5. As noted earlier, there are eight dependent variables; hence, the results of each estimation are comparable. At first glance, the statistically significant relationships appear complicated. The relationships that stand out are those between “data analytics strategy” and “organizational reform and management strategy formulation,” and between “acquisition of skills, knowledge, and talent” and “human resource development and capacity-building.” There is a statistically significant relationship between these strongly related items, and the results confirm the reliability of the estimation. However, none of the relationships reached a high significance level.

Finally, we present the results of testing Hypothesis 6. While the hypothesis is somewhat complicated, some of the results are clearer than those for the other hypotheses.

The variables to focus on here are “productivity improvement and efficiency,” “improvement of existing products,” and “CRM.” All these dependent variables are business objectives primarily targeted by existing businesses or existing customers. The introduction of strategies for data analytics and the qualitative improvement of data contribute strongly to the achievement of those objectives.

By contrast, the coefficient on “development of new products” is relatively small and statistically insignificant. These results suggest that, as of fiscal year 2019, corporate efforts for data analytics do not contribute much to new product development, but are effective in improving productivity, products, and customer relationships in existing businesses.

5. Conclusion and Future Challenges

5.1. Summary of the Results and Their Academic Contributions

As previous studies indicate, interest in the relationship between data analytics and firm performance has grown rapidly since the 2010s. However, when it comes to data analytics capabilities and initiatives based on those capabilities, there is still a strong sense of uncertainty as to what pathways will ultimately lead to sales and profit (Wamba et al., 2019; Mikalef et al., 2019). Hence, we used the RIETI Data Analytics Survey to examine the relationship between the formulation of a strategy for data analytics in a firm and the subsequent penetration and results of data analytics in that firm. As a result, we clarified the following three points:

First, in the two estimation models, the formulation of a strategy for data analytics leads to the deployment of a data analytics system and the acquisition of the relevant skills and knowledge. In addition, the deployment of data analytics systems and the acquisition of skills and knowledge lead to the qualitative improvement of the data, which ultimately leads to the achievement of positive outcomes from data analytics.

This is consistent with many prior studies (e.g., Wamba et al., 2019; Wamba et al., 2017) suggesting that data analytics is most effective when it is deeply integrated into business strategy. It is also consistent to a certain extent with previous studies that report that commitment, especially at the management level, is important; specifically, when this commitment is secured, systems are successfully deployed within the organization and employees’ skills and knowledge related to data use are improved (e.g., Coluccia et al., 2020; Santoro et al., 2021; Ghasemaghaei et al., 2018).

Building on this literature, the study makes a new contribution: to connect the effects of data analytics to firm performance, it is necessary to go through the process indicated in the estimation models. This can be confirmed by the fact that, as shown in Model 3-6 in Table 4, as well as in Table 5, it is difficult to achieve significant effects by only promoting the deployment of a data analytics system.

Second, it is easier for unlisted firms than listed firms and for non-manufacturing firms than for manufacturing firms to deploy data analytics systems and acquire skills and knowledge. This means that listed firms in the manufacturing industry are the least likely to reap the benefits of data analytics. We could not find any clear prior research on this issue, which made it difficult to interpret. Intuitively, one would imagine that the large size of listed manufacturing companies, their complex decision-making structures, and the fact that they have many independent business units could hinder the penetration of processes throughout a company. However, additional research would be needed to confirm this intuition.

Finally, the corporate efforts for data analytics do not contribute much to new product development, but are effective in improving the productivity, products, and customer relationships of existing businesses. This point is easy to understand for many businesses. Data analytics is still in its infancy and the idea of “start with what you can do first” is still prevalent in this field. Therefore, it is easier to experiment with data analytics to refine and improve existing businesses with clear issues than to develop new products that are strongly exploratory. Conversely, it would not be an exaggeration to say that, if it becomes possible to utilize data analytics in new product development, this would mean that the penetration of data analytics has progressed considerably.

5.2. Future Research Directions

Finally, we discuss future research directions for this line of research, which is still in its early stages, meaning there are many issues to explore. The most significant issues are as follows.

The first point is to examine the process that leads to positive outcomes from data analytics. The two processes shown in this paper were constructed by incorporating new knowledge into limited previous research, meaning many other processes are possible. This naturally includes processes in the opposite direction; that is, if progress is made in the qualitative improvement of the data, the deployment of in-house systems, and the securing of human resources with skills and knowledge, this will lead to the formulation of a company-wide strategy for data analytics. This reverse process may be more natural in the Japanese manufacturing industry, which still has a bottom-up culture. If this is the case, it is natural that this hypothesis could not have been developed by previous studies in top European and American journals. This calls for a redesign of our study that uses the dataset for a hypothesis-discovery approach rather than a hypothesis-testing one.

Table 4. Estimation results for Hypothesis 5.

Table 5. Estimation results for Hypothesis 6.

The second point is to deepen and refine the interpretation of the estimation results. Few qualitative studies supplement the interpretation of the estimation results in this study. Therefore, it is not possible to ascertain whether there is a discrepancy between our statistical analysis and the actual perceptions in the field and, if so, what discussions are lacking. To translate the results of this study into not only academic but also practical contributions, it is necessary to incorporate activities such as interviews and workshops on an ongoing basis.

Third, although this paper adopts a hypothesis-testing approach, it is primarily an attempt to understand the actual situation under investigation. Furthermore, the estimation results should have been subjected to a robustness check to enhance their credibility. Therefore, to make a deeper contribution to the literature, it would be necessary to determine what academic and theoretical issues are at stake based on prior theoretical research and the underlying concepts, and to conduct more elaborate hypothesis development and testing.

NOTES

1 In the literature, the use and application of data is typically referred to as “data analytics,” and an organization’s ability to use and apply data is referred to as its “data analytics competency” or “data analytics capability.” The term “big data analytics” is sometimes used, but an attempt to define the distinction between normal datasets and “big data” is rarely made. Therefore, in this paper, we use the term “data analytics” generically, without addressing whether a firm’s datasets qualify as “big data” or not.

2 For detailed descriptive statistics, see Watanabe et al. (2021).

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Coluccia, D., Dabic, M., Del Giudice, M., Fontana, S., & Solimene, S. (2020). R&D Innovation Indicator and Its Effects on the Market: An Empirical Assessment from a Financial Perspective. Journal of Business Research, 119, 259-271.
https://doi.org/10.1016/j.jbusres.2019.04.015
[2] Ferraris, A., Mazzoleni, A., Devalle, A., & Couturier, J. (2019). Big Data Analytics Capabilities and Knowledge Management: Impact on Firm Performance. Management Decision, 57, 1923-1936.
https://doi.org/10.1108/MD-07-2018-0825
[3] Ghasemaghaei, M., Ebrahimi, S., & Hassanein, K. (2018). Data Analytics Competency for Improving Firm Decision Making Performance. The Journal of Strategic Information Systems, 27, 101-113.
https://doi.org/10.1016/j.jsis.2017.10.001
[4] Mikalef, P., Boura, M., Lekakos, G., & Krogstie, J. (2019). Big Data Analytics and Firm Performance: Findings from a Mixed-Method Approach. Journal of Business Research, 98, 261-276.
https://doi.org/10.1016/j.jbusres.2019.01.044
[5] Müller, O., Fay, M., & vom Brocke, J. (2018). The Effect of Big Data and Analytics on Firm Performance: An Econometric Analysis Considering Industry Characteristics. Journal of Management Information Systems, 35, 488-509.
https://doi.org/10.1080/07421222.2018.1451955
[6] Raguseo, E., & Vitari, C. (2018). Investments in Big Data Analytics and Firm Performance: An Empirical Investigation of Direct and Mediating Effects. International Journal of Production Research, 56, 5206-5221.
https://doi.org/10.1080/00207543.2018.1427900
[7] Ren, S. J., Wamba, S. F., Akter, S., Dubey, R., & Childe, S. J. (2017). Modelling Quality Dynamics, Business Value and Firm Performance in a Big Data Analytics Environment. International Journal of Production Research, 55, 5011-5026.
https://doi.org/10.1080/00207543.2016.1154209
[8] Santoro, G., Thrassou, A., Bresciani, S., & Del Giudice, M. (2021). Do Knowledge Management and Dynamic Capabilities Affect Ambidextrous Entrepreneurial Intensity and Firms’ Performance? IEEE Transactions on Engineering Management, 68, 378-386.
https://doi.org/10.1109/TEM.2019.2907874
[9] Song, P., Zheng, C., Zhang, C., & Yu, X. (2018). Data Analytics and Firm Performance: An Empirical Study in an Online B2C Platform. Information & Management, 55, 633-642.
https://doi.org/10.1016/j.im.2018.01.004
[10] Wamba, S. F., Akter, S., Trinchera, L., & De Bourmont, M. (2019). Turning Information Quality into Firm Performance in the Big Data Economy. Management Decision, 57, 1756-1783.
https://doi.org/10.1108/MD-04-2018-0394
[11] Wamba, S. F., Gunasekaran, A., Akter, S., Ren, S. J., Dubey, R., & Childe, S. J. (2017). Big Data Analytics and Firm Performance: Effects of Dynamic Capabilities. Journal of Business Research, 70, 356-365.
https://doi.org/10.1016/j.jbusres.2016.08.009
[12] Watanabe, T., Hirai, Y., Yoshioka-Kobayashi, T., Kanama, D., Tatsumoto, H., Furuya, M., & Naganuma, M. (2021). Management and Utilization of Data Generated in Firms: Understanding the Actual Situation Using Questionnaire Surveys. RIETI Discussion Paper Series 21-E-017.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.