COVID-19 Times: Impact on Information Generation and Data Sharing

At the beginning of 2020, human activities were interrupted by a new virus, identified as SARS-CoV-2, which causes COVID-19 disease. The scientific area was no exception: for a certain period, researchers around the world were forced to leave their laboratories and work remotely. There was a global necessity for finding alternatives focused on generating knowledge and publishing data, so repositories of scientific information, such as databases, represented strong support. In the specific case of life sciences, different strategies allowed rapid compilation of data and its sharing worldwide. Therefore, in this work, the impact of the SARS-CoV-2 pandemic on the amount of peer-reviewed and published papers during COVID-19 times was analyzed along with the role of databases. Our results pointed out that an increase in the number of papers belonging to different knowledge fields took place, with the medical field being the most significant. On the other hand, the complete genome of the new virus was sequenced, and repositories were created with sufficient data for monitoring, preventing, and controlling its

A survey conducted by Korbel and Stegle [4], directed to the scientific community in Canada, France, Italy, Germany, Spain, the UK, and the USA, showed a loss of between 1 and 6 months of work due to laboratory closures, with a notable difference between wet and dry labs. While wet laboratory researchers reported a 73% loss in productivity, the decrease in dry laboratories was only 31% [4]. Wet labs experienced the larger decrease because of frequent interruptions, whereas dry labs were able to continue their work through remote access to domestic or institutional servers and the use of provisional or permanent software licenses. As with distance education and remote work in general, researchers who set up a workplace at home faced the same challenges as any other household, for example, low connectivity, a limited number of computers, or sharing space with family members. The abrupt slowdown of research activities led to the search for new strategies; nevertheless, despite the technological innovations developed during the past decades, there are few tools for carrying out scientific work remotely. In this scenario, databases represent solid support for this task, and in some cases the sole source, as will be described in the following sections.

Relevance of Databases and Scientific Repositories: The Specific Case of Biological Databases
Scientists conduct thorough searches before, during, and after doing research in order to design experimental strategies, formulate hypotheses, attain significant results, and derive robust analyses from them. Access to academic electronic databases, defined as organized collections of related data that allow retrieval of scientific information, plays a crucial role in generating further knowledge. This is common practice whether or not an extraordinary situation, such as the one experienced throughout 2020 and 2021, is taking place. However, as stated above, a major fraction of the scientific community was compelled to change its paradigm for doing research, relying on the experiments completed so far or on digital information repositories. Internet or online research has played a fast-developing and highly transformative role in how communication, information, and networking technologies influence the conduct of studies and projects [5]. During the COVID-19 pandemic, this form of doing research, or of keeping up with it, represented the only alternative for many scientists.
At present, virtually all research projects within the life science fields rely to some extent on the use of computers and specialized software, which have become essential parts of the toolkit for the biological area [6]. New technologies have allowed the accumulation of biological data at unprecedented rates. High-throughput data is derived from nucleic acid and protein sequencing, omics, flow cytometry, gene expression analysis, molecular screening, and phylogenetics, among others [7]. Biological databases integrate massive amounts of omics-derived information (genomics, transcriptomics, proteomics, secretomics, metabolomics, and glycomics), which turns them into indispensable elements for wet and dry lab scientists in their efforts to share, retrieve, use, analyze, and correlate these data. Table 1 summarizes the types of biological databases based on several criteria.
According to Zou et al. [8], biological databases can be classified by 1) their diverse aims, 2) the types of data encompassed, or 3) their curatorial levels and methods. Likewise, Bhatt et al. [7] mentioned four major criteria used for their classification: 1) data source, 2) data types, 3) database design, and 4) composite databases. Due to their significance, each year since 1999 the Nucleic Acids Research journal reviews new biological databases, eliminates discontinued URLs, and updates an existing compilation of them. This resource can be freely consulted at https://www.oxfordjournals.org/nar/database/c/, rendering a total of 1637 databases for the early 2021 issue [9] and 1645 for the following one [10], divided into the 15 categories shown in Figure 1. The largest number of databases falls within fields focused on genomes, followed by data regarding sequences (nucleotides, RNA, and proteins). A smaller number of   [12], this raises concerns, mainly about the publishing quality and validity.
The aim of our study centered on analyzing the impact of lockdown, i.e., re- First, we looked into the overall quantity of published papers, followed by a more specific analysis based on categorizing the research according to its scientific field. In order to validate the findings for both the overall published scientific literature and the papers categorized by topic, the results of our search were subjected to statistical validation. For this task, an analysis of variance (ANOVA), followed by a post hoc Tukey test only when means were found to be significantly different, was performed using R software [13]. A notable difference was observed between the figures returned by Scopus and PubMed and those returned by ProQuest, probably attributable to their respective search algorithms. Therefore, the raw data were normalized by calculating the annual increment rate, i.e., the ratio between the number of publications in a given year and the number in the prior year. In addition, this transformation made it possible to assess the normality of the data with a Shapiro-Wilk test, as required for the ANOVA and Tukey tests. The shift was expressed as n-fold, employing the relation $TN_{\mathrm{year}\,n} / TN_{\mathrm{year}\,n-1}$, in which $TN_{\mathrm{year}\,n}$ represents the number of articles for the year of study and $TN_{\mathrm{year}\,n-1}$ the peer-reviewed material from the previous one. As an illustration, the average hits for overall published material in 2019 and 2020 were, in round numbers, 1,836,000 and 2,089,000, respectively. Hence, dividing $TN_{2020}$ by $TN_{2019}$ gives a 1.14-fold shift.
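The n-fold normalization described above can be sketched in a few lines of Python. This is only an illustration of the ratio, not the authors' R workflow; the function names `fold_change` and `annual_fold_changes` are hypothetical, and the only data used are the two rounded averages quoted in the text.

```python
def fold_change(tn_current, tn_previous):
    """Annual increment rate: TN for the study year divided by TN for the prior year."""
    if tn_previous <= 0:
        raise ValueError("previous-year count must be positive")
    return tn_current / tn_previous


def annual_fold_changes(counts_by_year):
    """Map each year to its n-fold shift relative to the immediately preceding year."""
    years = sorted(counts_by_year)
    return {
        year: fold_change(counts_by_year[year], counts_by_year[prev])
        for prev, year in zip(years, years[1:])
        if year - prev == 1  # skip gaps: only consecutive years are compared
    }


# Rounded averages quoted in the text for 2019 and 2020.
counts = {2019: 1_836_000, 2020: 2_089_000}
shifts = annual_fold_changes(counts)
print(round(shifts[2020], 2))  # 1.14
```

Working with ratios rather than raw hit counts puts engines that return very different absolute numbers on a comparable scale, which is the reason given in the text for the normalization.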
Continuous growth in the amount of scientific literature has been registered during the past years ( Figure 2  on areas concerning medicine and engineering. Although publications have expanded with respect to the years before the pandemic in every area except agricultural and biological sciences, we confirmed that the real increment in 2020 was clearly observed in the medical field. Areas concerning social sciences also exhibited higher growth in the quantity of published material during 2020 compared to previous years, while materials science continued its positive tendency. On the other hand, research dealing with engineering, physics and astronomy, computer science, chemistry, and mathematics had also exhibited continuous growth in the number of published papers until 2019. In the past year, all these areas suffered a negative impact of varying degrees. Topics related to biochemistry, genetics, and molecular biology did not show a clear tendency, apparently remaining unchanged during the years studied. Finally, as mentioned above, papers focused on agricultural and biological sciences contrasted with the other disciplines. While 2016 marked a notable interest in this area, a pronounced decrease in the number of publications on this matter was observed afterwards. ANOVA indicated no significant difference in annual means for any specific research area except medicine. Therefore, only significant results derived from the statistical analysis will be subsequently described, i.e., peer-reviewed literature not restricted by topic and the medical field. However, all data derived from our search can be accessed upon request. An average shift value ranging from 1.01 to 1.05-fold was calculated for the years before the global spread of SARS-CoV-2 (2015-2019) for all publication types.
Nevertheless, this finding notably contrasted with the 1.14-fold increase in the number of papers between 2019 and 2020 (Figure 3(a)). Research focused on medicine also showed a marked increment in published material for the same period (2020/2019), exhibiting a mean shift of 1.16-fold compared to prior years (Figure 3(b)). Based on the post hoc statistical tests, it was concluded that the health emergency experienced throughout 2020, the year  i.e., the tendency exhibited in the number of publications via peer-reviewing.

The Role of COVID-19 in the Increase of Published Material
Since its appearance, and due to its impact on humankind, COVID-19 and its causative agent (SARS-CoV-2) have attracted rapidly growing interest. A clear example was the emergence of biological databases such as LitCovid, which reached more than 85,000 publications on the subject by December 31, 2020 [14]. In the latest Nucleic Acids Research review issue [10], seven new databases inspired by the COVID-19 pandemic were added:  [16]. Therefore, the next aim of our study centered on determining how much attention topics regarding SARS-CoV-2 drew from scientists, reviewers, and editors.
Using the same three search engines (PubMed Central, SciVerse Scopus, and ProQuest), we looked into the total number of articles released between January 1 and December 31, 2020, that contained the terms "COVID-19", "SARS-CoV-2", one or the other ("COVID-19" OR "SARS-CoV-2"), and both ("COVID-19" AND "SARS-CoV-2") (Figure 4). "COVID-19" was more popular than "SARS-CoV-2", as reflected by a usage an order of magnitude higher than that of the virus name. In accordance with LitCovid (https://www.ncbi.nlm.nih.gov/research/coronavirus/) regarding the amount of material focusing on this topic, it was found that no more than 4% of the published research was related to it. Thus, despite the remote work imposed by the quarantine, this outcome highlighted the great effort made by researchers to publish work related to different knowledge fields. This finding was reinforced by the 1.14-fold increase in publications calculated in this study when comparing 2019 and 2020. Therefore, in spite of the attention drawn by COVID-19 and SARS-CoV-2, the growth in the number of articles in 2020 (14%) cannot be entirely attributed to them. On the other hand, when the analysis was restricted to literature belonging to the medical field, as expected, the percentage of papers containing "COVID-19" or both ("COVID-19"/"SARS-CoV-2") was higher (8%). Therefore, unlike the case of overall research material, the specific focus on medicine reflects a marked influence, accounting for 50% of the positive shift (1.16-fold), shown in Figure 3

The Year 2021: Vaccines, Reopening, and Variants
Despite the numerous efforts on many battlefields against COVID-19, this respiratory disease remains the world's leading health problem. Nucleocapsid protein (N) and Spike protein (S) are considered to be highly immunogenic and have been studied using bioinformatics to find potentially immunodominant regions [19].   [20], and since mid-2021, labs have been arduously working on variant-specific boosters.
After contrasting vaccination campaigns around the globe, reopening became a hot topic, mainly due to its economic and social impacts. Businesses were compelled to implement strategies that guaranteed the health and safety of their customers, primarily by limiting capacity. During the past year, a gradual and partial return to everyday activities represented a worldwide issue. Research centers were no exception: their staff came back to work, but under restricted operating conditions.
In view of this scenario, we once again searched for the scientific production achieved between January 1 and December 31, 2021, as described above. To our surprise, the ProQuest engine yielded results that contrasted greatly with our former findings, which we presume is due to a modification in its search algorithm. Therefore, for the analysis corresponding to 2021, it was decided to employ only hits derived from PubMed and Scopus and to compare them to the data from 2020, likewise excluding the ProQuest figures. Statistical validation was performed by a Student's t-test with the aid of R software [13]. After assessing the normality of the data, the means of the different years were compared in order to find a significant difference. A two-sided test was used first, with a difference between means as the alternative hypothesis. Next, only significant cases were further subjected to a one-sided test in order to determine whether the mean of scientific production from 2019 to 2020 was greater than that attained from 2020 to 2021.
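As a rough illustration of the comparison just described, the sketch below computes a Student's t statistic from two groups of fold-change values using only the Python standard library. Several things here are assumptions: the pooled-variance form is used (R's `t.test` applies the Welch correction unless `var.equal = TRUE`, and the text does not say which option was chosen), the function name `student_t` is hypothetical, and the sample values are invented for the example and are not the study's data. The p-value lookup against the t distribution is deliberately left out.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Return the pooled-variance (Student's) t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical fold-change values (one per search engine), NOT the study's data.
shift_2019_to_2020 = [1.15, 1.13]
shift_2020_to_2021 = [1.02, 1.04]

t_stat, df = student_t(shift_2019_to_2020, shift_2020_to_2021)
# Two-sided alternative: the means differ, so |t_stat| is compared with the critical value.
# One-sided alternative: the 2019-2020 mean is greater, which additionally requires t_stat > 0.
print(f"t = {t_stat:.2f}, df = {df}")
```

The same statistic serves both steps; only the rejection region changes, which is why the one-sided test is run only on cases already found significant in the two-sided test.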
A very slight increment was observed for 2021 in the total number of publications (Figure 5(a)); nevertheless, when the search was directed to specific scientific fields, only medicine and the areas of biochemistry, genetics, and molecular biology exhibited appreciable differences between the studied years. Published material in the medical field showed a 1.11-fold shift from 2020 to 2021, while topics belonging to biochemistry, genetics, and molecular biology decreased by a factor of 0.6 (Figure 5(b)).
Differences were found only for the overall number of publications and the medical field, with averages significantly greater for material published during 2020 (Figure 6). Despite the result observed for papers belonging to biochemistry, genetics, and molecular biology, the t-test showed that the differences were not statistically significant, which might be attributable to the high standard deviations calculated for them. Once again, this analysis reinforced our previous findings pointing to the role of COVID-19 in the generation of information.
Nowadays, tracking SARS-CoV-2 mutations as they arise is feasible thanks to genomic surveillance. Scientists are monitoring genetic changes over time on a scale never seen before; in the last year, more than 360,000 SARS-CoV-2 genomes have been sequenced and stored in GISAID [21]. Many variants have been detected, but only those that have represented a public health problem have been studied. Therefore, only the following variants, related to outbreaks and high viral loads, will be described. In September 2020, a SARS-CoV-2 variant was detected in England and was called B.1.1.7 (Alpha) [22]. Moreover, in October, the variant 501Y.V2 (Beta) was detected in South Africa [23]. A little later, in December, the P.1 (Gamma) strain emerged in Brazil [24], and the B.1.617.2 (Delta) led to an increase in the number of cases in India [25].  1.427/429). In a recent study, it was found that this variant was able to cause more severe disease and higher viral loads [26]. Currently, the Omicron variant (B.1.1.529) has spread worldwide, with considerable impact due to its high contagiousness and its mutations in the receptor-binding domain (RBD) of the Spike (S) protein; therefore, further investigation of the effectiveness of vaccines against this variant is needed [27]. Sequence analysis of SARS-CoV-2 variants has demonstrated the emergence of mutations in the Spike protein, which represents the main target for vaccine generation. At present, a key question is whether COVID-19 vaccines will be able to protect against disease caused by new and upcoming variants. The efficiency of vaccines against known variants has been demonstrated [28]; still, the development of immunotherapies effective against these variants is urgently needed [29].
Certainly, the presence of the new strains has been an arduous challenge to face; nevertheless, the utility of databases has allowed multidisciplinary work (doctors, chemists, biologists, mathematicians, engineers, computer scientists, etc.) to face this adversity. As pointed out by Zhou and Wang [30], research focused on the design of new vaccines requires hard endeavors within laboratories (wet and dry) and clinical trials, which in turn rely on the genomic monitoring of viruses for variant detection.

A Deeper Look into Published Material: Subtopics Analysis from the Specific Knowledge Fields Formerly Described
The years 2019, 2020, and 2021 were selected for this section, and our search was based on the subcategories included in Scopus for each of the scientific fields previously analyzed. As described above, data were collected from PubMed and Scopus, limiting the query to the specific subtopic. The comparison between years was expressed as n-fold, as calculated before, and the statistical validation was accomplished by a Student's t-test using R software [13]. Despite the number of subareas analyzed, which can be consulted in the supplementary material, significant differences were extremely limited and only related to the medical field (Figure 7). Within this scientific area, "endocrinology" and "virology" ex- As stated before, the increase in medical publications was expected due to the health situation experienced during the past two years. Except for the case of virology, in which the increment is attributable to the health situation caused by COVID-19, the differences found within other subfields of knowledge cannot be explained by our analysis. In this matter, we reviewed hits for 2019, the year in which the COVID-19 health emergency started, and for the subsequent years of its worldwide spread (2020 and 2021) to determine the variation in topics of medical interest. The search was performed on PubMed and Scopus using the term "Medicine" and then separated by publication year. The subtopics found in these first queries were grouped in the three databases using the following command: (TITLE-ABS-KEY (medicine)) AND (subtopic) AND (LIMIT-TO (PUBYEAR, year)). The results of our search are presented in Figure 8. Our findings showed that the keywords "age", "infections", "mortality", "public health", and "risk factors" exhibited a mild increase during 2020. On the other hand, keywords such as "coronaviruses", "COVID-19", and "pandemic" increased significantly during this same year, confirming again that much of the published material focused on SARS-CoV-2.
Other important health issues such as "cancer" recovered importance after the year marked by the global pandemic, while "vaccines" attracted a lot of attention due to the health, economic, political, and social pressures lived throughout 2021. Once more, these results confirmed the effect of COVID-19 on the research carried out through 2020, and at the same time, for subsequent years, the diversification on other topics.
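The search command quoted earlier in this section can be expanded mechanically for every subtopic and year. The sketch below only builds the query strings from that template; it does not contact Scopus or PubMed. The helper name `build_queries` is hypothetical, and how each keyword was quoted or escaped in the original searches is an assumption.

```python
# Template copied from the text, with the two placeholders made explicit.
QUERY_TEMPLATE = (
    "(TITLE-ABS-KEY (medicine)) AND ({subtopic}) AND (LIMIT-TO (PUBYEAR, {year}))"
)


def build_queries(subtopics, years):
    """Return one query string per (subtopic, year) pair."""
    return {
        (subtopic, year): QUERY_TEMPLATE.format(subtopic=subtopic, year=year)
        for subtopic in subtopics
        for year in years
    }


# Keywords and years named in the text; passing the keywords unquoted is an assumption.
queries = build_queries(["COVID-19", "vaccines"], [2019, 2020, 2021])
print(queries[("vaccines", 2021)])
```

Generating the strings from a single template keeps the queries identical across years, so that differences in hit counts reflect the literature rather than variations in how each search was typed.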

Final Comments: Information Sources, Availability, Data Sharing, and Joint Efforts
With the spread of the internet and information technologies, humanity is currently living in an era referred to as the "information age". This period is marked by complexity and change, challenging our capacity to create, store, retrieve, manage, and communicate data. In this scenario, the term "information literacy" is defined as "the set of integrated abilities encompassing the reflective discovery of information, the understanding of how information is produced and valued, and the use of information in creating new knowledge and participating ethically in communities of learning" [31]. According to UNESCO [32], information literacy enables people to interpret and make informed judgments as users of information sources, as well as to become data producers. In a digital era, it is essential to have skills that allow the use of information technologies (computer-based data systems) in order to perform these tasks. Even though the concept has traditionally been associated with an academic environment, information literacy has a broader scope in individual and social contexts, reflected in good decision-making in education, health, and politics, among other areas. The need to be informed is crucial in our modern society, as shown by the increasing number of people around the world looking for data. In this context, the use of computers and, in past years, of smaller gadgets such as smartphones or, more recently, smartwatches with an internet connection and a search engine has been fundamental. Parallel to this need lies the speed of acquiring information, so user-friendly devices with ever larger and more powerful capacities are continually being designed. Therefore, access to data has been facilitated and enhanced by technology to a previously inconceivable degree in a wide variety of areas, for example, business, education, entertainment, fitness, research, shopping, and even traveling.
In the information age, social media (blogs, podcasts, social networking, and streaming) exert a profound impact on people's perceptions. Unfortunately, this way of being informed sometimes lacks trustworthiness and can be misleading. Regarding the present coronavirus health emergency, an "infodemic" arose, reflected by a daily exponential growth of information on COVID-19. A serious issue with this phenomenon is the dissemination of fake news, which in turn influences the population's awareness of the virus and the government response to it. Whether verified or not, the data available from traditional media (radio, TV, and printed information) or social media negatively influenced people's minds and their awareness of SARS-CoV-2. Consequently, the rapid spread of news concerning COVID-19 infections and deaths caused anxiety, panic, stigma, mistrust, and rumormongering among the general population [33]. In order to counteract this global threat, WHO, UN, UNICEF, UNESCO, UNAIDS, ITU, UN Global Pulse, and IFRC [34] published a joint statement calling on the Member States and other stakeholders (media, social media platforms, researchers, technologists, civil society leaders, and influencers) to promote the spread of accurate data and to combat infodemics. In addition, social media platforms such as Facebook, Reddit, Twitter, and YouTube were compelled to adopt countermeasures against mis- and disinformation regarding COVID-19, notably: 1) flagging, 2) fact checking by third parties, 3) reducing the visibility of, or deleting, bot or offender accounts, and 4) adding visible links or redirecting to reliable information [35]. In this regard, Tornberg et al. [36] assessed the dissemination of COVID-19-related data, i.e., articles, across social media by alternative metrics. Still, there is more to overcome, as the vaccination phase against the COVID-19 pandemic has faced contradictory opinions with their respective misleading judgements.
sharing of data represented, and still does, an excellent alternative for attaining information literacy on this topic, labor supported by the existence of databases.
Many international academic journals directed efforts toward the quick review and publication of scientific papers on COVID-19, and various websites have been created to assemble and disclose articles on this disease, thus reducing public panic and providing real-time guidance [33]. The attention drawn by this disease was assessed via Scopus, analyzing the number of citations regarding this topic, as well as other impact indexes, such as "Field-Weighted Impact Factor" and "Field Weighted View Impact" (Figure 9). It was observed that 2020 marked a clear tendency in the interests of the scientific community, with COVID-19 as a hot topic. Figure 9 shows that COVID-19 citations were concentrated in 2020, the year in which the pandemic had the largest impact in terms of global restrictions. In 2021 and 2022, scientists were able to work under more relaxed conditions, although the effect of SARS-CoV-2 continues to play a significant role in global  Undoubtedly, COVID-19 revolutionized our lifestyles and continues to do so, as new regulations for fighting its spread are constantly being implemented. For example, negative COVID tests are required before travelers are allowed to enter some countries, and in some cases people are asked to show vaccination certificates in order to access public places. In the case of Mexico, sanitary filters have been installed at the entrance of enclosed places such as cafeterias and restaurants, health centers and hospitals, gyms, shopping centers, supermarkets, and schools and education centers, among others. Even though control measures tend to relax, wearing a face mask indoors continues to be mandatory. Therefore, daily human activities were, and still are, strongly impacted by SARS-CoV-2, as pointed out by Popouet's study in Africa, which focused on areas such as the economy, food security and household income, financial balances, and unemployment [37].
As pointed out above, during the lockdown period, research centers and universities were compelled to close, and work had to be done remotely. Different strategies were adopted by the scientific community in order to prevent a pause in the generation of knowledge and the sharing of data. In addition to the fight against COVID-19, another important issue was represented by infodemics, so a second battle was fought within the informatics area. In this work, we turned our attention to databases, as repositories of reliable information, in order to assess the effect of the pandemic on the amount of material published during COVID-19 times. It was found that, in spite of the global closure, a considerable increase in the number of publications from different scientific fields took place, with medicine being the most notable and significant one, especially for topics regarding "COVID-19" and "SARS-CoV-2". This was reinforced by the subtopic analysis, in which notable differences were observed in comparison to other knowledge areas. Even though other knowledge areas exhibited no significance in the statistical tests, the year 2020 might be considered a very productive period, suggesting the usefulness of information repositories, while laboratories were closed, for the generation of publications not directly related to the pandemic, as demonstrated by our review.
Since the beginning of the pandemic, general interest has focused on its effects on various fields. Publishing during COVID-19 times represented quite a challenge and became the subject of several research papers. A study analogous to this work was conducted by Sepúlveda-Vildósola et al. [38] during the first months of the worldwide closure (March-April, 2020), in which they addressed scientific material related to COVID-19, highlighting: 1) the number of new articles published daily focused on specific fields of knowledge, 2) the journals with the most papers dealing with the pandemic, 3) the languages used for writing those documents, 4) the countries of origin, and 5) the type of publication. Nevertheless, despite certain similarities, our review represents the first report in which the impact of COVID-19 was studied on a much broader time scale, addressing overall publications and specifically categorizing them by the area of knowledge they belonged to. This time interval included the five years before the pandemic (2015-2019), the year marked by the spread of the disease around the globe (2020), and the subsequent one, characterized by the advent of hope in the form of vaccines. Certainly, due to the health emergency, it could be expected that the medical field drew almost all the attention of the scientific and non-scientific community; with our study, this hypothesis was confirmed and statistically validated.
Databases have made it possible to collect and analyze data from more than 207 countries on 6 continents [20]. In addition, international exchange has improved the monitoring, prevention, and control of the spread of the virus. In 2020, the year marked by the beginning of the pandemic, data generation and data sharing were important allies in the fight against COVID-19, and during 2021 their support continued to be crucial. To illustrate this, we conducted a search, using the same engines described before, focused on research papers containing the term "COVID-19". Surprisingly, the same number of articles was published in 2021 as in the previous year. When the search was narrowed down by including the term "vaccine", a 3-fold increase in the number of publications was observed in 2021 compared to 2020. Nowadays, it is remarkable how technology has contributed to speeding up, at previously inconceivable rates, the process of information generation, data sharing, and accessibility. In COVID-19 times, an outstanding example is the research and experimentation behind vaccine production; considering the traditional time for vaccine development, 2020-2021 could be considered a scientific record. Finally, new challenges concerning SARS-CoV-2 are arising, such as the presence of novel strains; nevertheless, the utility of databases has allowed inter- and multidisciplinary collaboration in order to face them.