Short-Term Memory Capacity across Time and Language Estimated from Ancient and Modern Literary Texts. Study-Case: New Testament Translations
1. Short-Term Memory and Literary Texts
The aim of this paper is to study the short-term memory (STM) capacity of ancient readers of the Greek New Testament and of readers of its translations to Latin and to modern languages. For modelling the STM capacity, we consider the number of words between any two contiguous interpunctions, termed “words interval” and indicated by IP [1] - [6] . This parameter can reveal, as we show, whether the population of readers of a given translation overlaps, as far as the STM capacity is concerned, with the population of readers of Greek and other languages. In other words, the study reveals how many translations a reader (supposed to be able to understand any language equally well) could read by engaging his/her STM.
The parameter IP varies in the same range as the STM capacity, given by Miller’s 7 ± 2 law [7] , a range that includes 95% of all cases. For words, namely data that can be restricted (i.e., “compressed”) by chunking, it seems that the average value in Miller’s range is not 7 but 5 to 6 [7] .
As discussed in [1] , very likely the two ranges are deeply related because interpunctions organize small portions of more complex arguments (which make a sentence) in short chunks of text, which represent the natural STM input [8] - [19] . Moreover, IP, plotted against the number of words per sentence, PF, tends to approach a horizontal asymptote as PF increases [1] - [6] . The writer, therefore, maybe unconsciously, introduces interpunctions as sentences get longer because he/she acts also as a reader, thereby limiting IP approximately to Miller’s range.
These findings can be explained, at least empirically, according to how our mind is thought to memorize “chunks” of information in the STM. When we start reading a sentence, our mind tries to predict its full meaning from what it has already read. Only when an interpunction is found can our mind better understand the meaning of the text. The longer and more twisted the sentence is, the longer the ideas remain deferred until the mind can establish its meaning from all its words (i.e., from all the word intervals contained in the sentence), with the result that the text is less readable. Readability is traditionally measured by readability formulae [20] - [29] mainly according to the length of sentences, neglecting the STM capacity required to read the sentence.
To overcome this shortcoming, in Reference [6] we have proposed a universal readability formula—applicable to any alphabetical language—which includes the STM capacity modeled by the word interval IP.
By considering IP, we can perform experiments with ancient readers, otherwise impossible, by studying the literary works they used to read, for example the texts belonging to Greek and Latin Literatures. These “experiments” can reveal unexpected similarity and dependence between texts, because they consider four deep-language parameters [1] not consciously controlled by writers: the number of words per sentence, PF; the word interval, IP; the number of characters per word, CP; and the number of interpunctions per sentence, MF.
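As a rough illustration of how the four deep-language parameters could be computed from a plain-text sample, the following minimal Python sketch counts words, sentences and interpunctions. The sets of sentence-ending marks and of interpunctions, the word-matching rule and the function name `deep_language_parameters` are assumptions made for this example; they are not the procedure of References [1] - [6].

```python
import re

# Assumed mark sets (not taken from the paper): sentence-ending marks
# are a subset of all interpunctions.
SENTENCE_END = ".!?"
INTERPUNCTIONS = ".!?,;:"


def deep_language_parameters(text):
    """Return (PF, IP, CP, MF) for a plain-text sample.

    PF: words per sentence, IP: words per interpunction (word interval),
    CP: characters per word, MF: interpunctions per sentence.
    """
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    n_words = len(words)
    n_chars = sum(len(w) for w in words)
    n_sentences = sum(text.count(mark) for mark in SENTENCE_END)
    n_interpunctions = sum(text.count(mark) for mark in INTERPUNCTIONS)
    if n_words == 0 or n_sentences == 0:
        raise ValueError("the sample needs at least one word and one sentence")
    return (n_words / n_sentences,
            n_words / n_interpunctions,
            n_chars / n_words,
            n_interpunctions / n_sentences)


sample = ("In the beginning was the Word, and the Word was with God, "
          "and the Word was God. The same was in the beginning with God.")
pf, ip, cp, mf = deep_language_parameters(sample)
print(f"PF={pf:.2f} IP={ip:.2f} CP={cp:.2f} MF={mf:.2f}")
```

With these definitions IP equals PF divided by MF, so longer sentences keep IP bounded only if they contain proportionally more interpunctions.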
After this introduction, Section 2 reports the mean values of IP for the languages/translations of the New Testament considered; Section 3 recalls and models the probability density function of IP; Section 4 discusses the probability of overlap and defines the overlap index; Section 5 defines the population of universal readers of the New Testament and Section 6 reports some final remarks and proposes future research work. Appendix A and Appendix B report the detailed numerical results used in the paper.
2. The New Testament from Greek to Latin and to Modern Languages
We study the statistical characteristics of IP by considering a large selection of the New Testament (NT) books written in Greek—namely the Gospels according to Matthew, Mark, Luke, John, the Book of Acts, the Epistle to the Romans, the Book of Revelation (Apocalypse), for a total of 155 chapters, according to the traditional subdivision of the original Greek texts—and their translation to Latin and 35 modern languages. A similar study could be done, of course, with other alphabetical texts.
The rationale for studying NT translations is their great importance for many scholars of multiple disciplines, besides their personal value for many readers. These translations, although very rarely verbatim, strictly respect the subdivision in chapters and verses of the Greek texts as they are fixed today (see Reference [30] for how interpunctions were introduced in the original scriptio continua). Therefore, they can be studied at least at these two different levels (chapters and verses), by comparing how a deep-language variable, like IP, varies from translation to translation [3] [5] . Notice that in this paper “translation” is indistinguishable from “language”, because we deal with only one translation per language; language, however, plays only one of the roles in translation, the addressed audience being another one [1] - [6] . A “real translation”, the one we always read, is never “ideal”, i.e. it never maintains all deep-language mathematical characteristics of the original text [2] .
For our analysis, as done in References [3] [30] , we have chosen the chapter level because the amount of text is sufficiently large to assess reliable statistics. Therefore, for each translation/language we have 155 samples of IP, one per chapter, for a total database of 155 × 37 = 5735 samples, sufficiently large to give reliable statistical results.
Like in all our previous studies, samples were statistically weighted with the fraction of the total words. For example, in Matthew (28 chapters, 18,121 total words in Greek) each sample (chapter) does not weigh 1/28 = 0.0357 but the number of its words divided by the total words: Chapter 5, for example, is made of 824 words, therefore its weight is 824/18,121 = 0.0455. This choice is mandatory to avoid that a short chapter (or, in general, a short text) affects the statistical results like a long one. After this processing we have obtained the mean values and the standard deviations of IP reported in Table 1 for the languages/translations considered (studied also in Reference [3] for other issues), subdivided in language families. Notice that in all languages the lists of names reported in Matthew 1.1 - 1.17 and in Luke 3.23 - 3.38 (genealogy of Jesus of Nazareth) have been deleted so as not to bias the statistics of linguistic variables [3] .
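The word-count weighting described above can be sketched as follows. This is a minimal illustration, assuming that the weighted standard deviation is computed with the same weights as the mean; the function name `word_weighted_stats` and the sample values are hypothetical, and only the figures 824 and 18,121 come from the text.

```python
import math


def word_weighted_stats(values, word_counts):
    """Mean and standard deviation of per-chapter values, each chapter
    weighted by its share of the total number of words."""
    total_words = sum(word_counts)
    weights = [n / total_words for n in word_counts]
    mean = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(variance)


# Weight of Matthew 5 in Greek, using the word counts quoted in the text.
print(round(824 / 18121, 4))

# Hypothetical per-chapter IP values and word counts, for illustration only.
ip_values = [5.0, 7.0]
chapter_words = [100, 300]
print(word_weighted_stats(ip_values, chapter_words))
```

In the hypothetical example the longer chapter carries three quarters of the weight, so the weighted mean moves towards its value instead of sitting midway between the two.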
Figure 1 shows the mean value and ±2-standard deviation bounds of IP. At first glance, we can notice a large spread. However, all values are within Miller’s range 7 ± 2.
![]()
Table 1. Mean value and standard deviation of IP for the indicated translation and language family of the New Testament books (Matthew, Mark, Luke, John, Acts, Epistle to the Romans, Apocalypse), calculated from 155 samples. Notice that the lists of names reported in Matthew 1.1 - 1.17 and in Luke 3.23 - 3.38 (genealogy of Jesus of Nazareth) have been deleted so as not to bias the statistics of linguistic variables [3] . Samples were statistically weighted with the fraction of the total words. The source of the texts considered is reported in Reference [3] .
![]()
Figure 1. Mean value of IP for the indicated language in abscissa, see Table 1. The continuous cyan line refers to the overall mean value 6.03, the two cyan dashed lines to the ±2-standard deviation bounds (95% of the samples in Miller’s range).
The overall mean value is 6.03, close to 6.56 found in seven centuries of Italian Literature [1] (a further confirmation that IP is centered about the mean value predicted when memorizing words [7] ), and the overall standard deviation (i.e., the square root of the sum of the mean variance and the variance of the mean [31] ) is 1.56. Therefore, by considering 2 standard deviations (which corresponds to considering 95% of the samples in Miller’s range), we get 6.03 ± 2 × 1.56, hence the range 2.91 - 9.15, reported in Figure 1. Notice, however, that the lower bound 2.91 is smaller than the value we should expect because, as we show in the next section, the probability density function of IP is skewed to the right, i.e., it is not symmetrical.
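The two steps above (combining the per-language statistics and forming the ±2-standard deviation range) can be sketched as follows. This is a minimal illustration, assuming equal weights for all languages and population (not sample) variances; the function names are hypothetical, and only the values 6.03 and 1.56 come from the text.

```python
import math


def overall_std(means, stds):
    """Square root of (mean of the variances + variance of the means)."""
    n = len(means)
    mean_variance = sum(s ** 2 for s in stds) / n
    grand_mean = sum(means) / n
    variance_of_mean = sum((m - grand_mean) ** 2 for m in means) / n
    return math.sqrt(mean_variance + variance_of_mean)


def two_sigma_range(mean, std):
    """Symmetric range mean - 2*std, mean + 2*std."""
    return mean - 2 * std, mean + 2 * std


# Range quoted in the text, from the overall mean and standard deviation.
low, high = two_sigma_range(6.03, 1.56)
print(round(low, 2), round(high, 2))

# Hypothetical per-language means and standard deviations.
print(overall_std([5.0, 7.0], [1.0, 1.0]))
```

The symmetric range is only a first approximation here, since it ignores the right skew of the density discussed in the next section.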
For our analysis, directed to study and compare the STM capacity of ancient and modern readers of the New Testament (study case), we need to recall, in the next section, how to model the probability density function of IP.
3. Probability Density Function of IP
Given the experimental mean value and standard deviation of IP, like those reported in Table 1, in Reference [1] we have shown that the experimental probability density function can be modelled with a log-normal with three parameters:
(1)
where the constants are given by [31] :
(2)
(3)
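Since Equations (1)-(3) are not reproduced in this text, the following Python sketch shows one common way to build a three-parameter log-normal from a mean value and a standard deviation. It assumes that the third parameter is a shift equal to 1 (the smallest possible word interval) and that the two constants are obtained by matching the mean and the variance; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def lognormal_constants(mean, std, shift=1.0):
    """Match a shifted log-normal to a given mean and standard deviation."""
    sigma2 = math.log(1.0 + (std / (mean - shift)) ** 2)
    mu = math.log(mean - shift) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def lognormal_pdf(x, mu, sigma, shift=1.0):
    """Density of the shifted log-normal; zero at or below the shift."""
    if x <= shift:
        return 0.0
    z = (math.log(x - shift) - mu) / sigma
    norm = (x - shift) * sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * z * z) / norm


# Overall mean and standard deviation quoted in Section 2.
mu, sigma = lognormal_constants(6.03, 1.56)
print(mu, sigma, lognormal_pdf(6.0, mu, sigma))
```

With this choice of constants the model reproduces the input mean and standard deviation exactly, while its density is skewed to the right and vanishes below the shift.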
Figures 2-4 show, as examples, the probability density functions of IP for Ukrainian, Russian, Greek, English, Latin, Italian, Spanish and French. We can see that some densities largely overlap each other, like Greek and English, or Italian and Spanish, while others overlap only slightly, like Ukrainian and Russian, or Greek and Latin.
![]()
Figure 2. Probability density function of IP for Ukrainian (green line), Russian (black line), Greek (red line), English (blue line). The vertical magenta lines give the thresholds to be used in Equation (4) for the indicated populations. Other thresholds can be drawn, such as those between English and Russian or English and Ukrainian.
![]()
Figure 3. Probability density function of IP for Latin (blue line), Italian (green line), Spanish (yellow line), Greek (red line). The vertical magenta lines give the thresholds to be used in Equation (4) for the indicated populations. Other thresholds can be drawn, such as those between Spanish and Latin or Spanish and Greek.
![]()
Figure 4. Probability density function of IP for English (green line) and French (black line). The vertical magenta line gives the threshold to be used in Equation (4).
How can we compare the STM capacity of the readers of a language/translation to that of the readers of another language/translation? Since IP seems to be a reliable estimate of the STM capacity, its probability density function defines a population of readers according to their STM capacity. This is very important because we can do some experiments even with ancient readers by considering the texts they used to read.
In the next section, we propose a way of comparing probability density functions like those shown in Figures 2-4, by measuring the probability of overlap of readers (i.e., readers who can read both texts) and by defining an “overlap index”.
4. Probability of Readers’ Overlap and Overlap Index
Let us assume that readers can read (and understand, of course) any alphabetical language. These readers represent mankind because we study their STM capacity through the word interval IP. Can we “measure” how many readers of text j can potentially read text k, either written in the same language or in another language? What is the minimum percentage of readers who can read both, according to the probability density function of IP of the two texts? We study this issue by first defining the minimum average probability of overlap and then the overlap index.
A mathematical analysis of a similar problem [3] shows that the minimum average probability of overlap pO between the populations of readers of text j and text k is given by:
(4)
This probability is interpreted as the percentage of readers who can theoretically read both texts because they share the same STM capacity.
In Equation (4), the two integrands are the log-normal probability density functions of IP for readers of text j and for readers of text k, like those shown in Figures 2-4. The decision threshold is given by the intersection of these two densities, and it sets the integral limits in Equation (4), as shown in Figures 2-4 with the magenta lines.
Let us study the range of pO. At its minimum, pO = 0, there is no overlap between the two densities; their mean values are centered at −∞ and +∞, respectively, or the two densities have collapsed to Dirac delta functions. In other words, the two populations of readers are disjoint (mutually exclusive). At its maximum, the two densities are identical, i.e. text j and text k coincide (e.g., this almost occurs in the cases of Greek versus English, Italian versus Spanish, or English versus French, see Figures 2-4). In conclusion, at the lower bound the two populations of readers do not overlap; at the upper bound they fully overlap because the two densities coincide.
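The behaviour at the two extremes can be checked numerically with a short Python sketch. It estimates the area common to two densities, i.e. the integral of the smaller of the two, which equals the sum of the two tail areas on either side of the threshold when the densities cross only once. Whether Equation (4) contains additional factors cannot be recovered from this text, so the quantity computed here is an assumption, as are the shift equal to 1, the integration limits and all function names and sample values.

```python
import math


def shifted_lognormal(mean, std, shift=1.0):
    """Return a density function matched to the given mean and std."""
    sigma2 = math.log(1.0 + (std / (mean - shift)) ** 2)
    mu = math.log(mean - shift) - sigma2 / 2.0
    sigma = math.sqrt(sigma2)

    def pdf(x):
        if x <= shift:
            return 0.0
        z = (math.log(x - shift) - mu) / sigma
        norm = (x - shift) * sigma * math.sqrt(2.0 * math.pi)
        return math.exp(-0.5 * z * z) / norm

    return pdf


def overlap_area(pdf_j, pdf_k, lower=1.0, upper=60.0, step=0.01):
    """Midpoint-rule estimate of the integral of min(pdf_j, pdf_k)."""
    n = int(round((upper - lower) / step))
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * step
        total += min(pdf_j(x), pdf_k(x))
    return total * step


# Hypothetical means and standard deviations, for illustration only.
text_j = shifted_lognormal(6.0, 1.5)
text_k = shifted_lognormal(7.0, 1.8)
print(overlap_area(text_j, text_k))
```

Integrating the pointwise minimum avoids locating the threshold explicitly and remains valid if two densities happen to cross more than once.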
Table A1 of Appendix A reports all values of pO for the languages listed in Table 1, for example those for Ukrainian and Russian (Figure 2), Greek and English (Figure 2), Greek and Latin (Figure 3), Italian and Spanish (Figure 3), and English and French (Figure 4). In other words, Greek and English readers, as well as Italian and Spanish readers, can be confused because very likely they share the same STM capacity.
We define the overlap index IO as:
(5)
In Equation (5), 0 ≤ IO ≤ 100; IO = 0 means non-overlapping (mutually exclusive) populations, IO = 100 means totally overlapping populations.
Figure 5 shows the probability distribution of exceeding a given IO, calculated from Table A1. It seems that a uniform probability density function, given by the −1 slope line, fits the data well. Notice that, according to this uniform model, IO > 90 with probability 0.1 (10% of the cases).
According to the Theory of Communication [32] , if a probability distribution is defined in the finite interval [a, b] ([0, 100] in our case) then the uniform distribution is the one with maximum entropy in this interval. This seems to be the case for the overlap probability and the derived overlap index, as Figure 5 shows. In other words, the common subset of readers who can theoretically read both texts can assume any value between 0% and 100%.
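A comparison like that of Figure 5 can be sketched in Python by setting the empirical probability of exceeding a threshold against the value predicted by a uniform distribution on [0, 100]. The overlap indices below are hypothetical values chosen for illustration, not entries of Table A1, and the function names are assumptions.

```python
def exceedance(values, threshold):
    """Fraction of the values strictly greater than the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def uniform_exceedance(threshold, low=0.0, high=100.0):
    """Probability that a uniform variable on [low, high] exceeds threshold."""
    clipped = min(max(threshold, low), high)
    return (high - clipped) / (high - low)


# Hypothetical overlap indices (percent), for illustration only.
io_values = [12.0, 35.5, 48.0, 61.0, 77.5, 93.0]
for x in (10, 50, 90):
    print(x, exceedance(io_values, x), uniform_exceedance(x))
```

For the uniform model the exceedance probability decreases linearly with the threshold, which is the −1 slope line when both axes are normalized to the interval [0, 1].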
![]()
Figure 5. Probability distribution function of exceeding the overlap index IO in abscissa. The red line refers to a uniform distribution.
Figure 6 shows the scatterplot of IO calculated by comparing the population of Greek readers, assumed to be the reference population, to readers of all the other languages; or the readers of French (reference language) or English (reference language) to all the other languages.
In these examples, the strong correlation between the values that assume Greek as reference language (scatterplot with red circles) and those that assume French (black circles) or English (green circles) as reference languages is evident.
Figure 7 shows the scatterplots and regression lines of IP in two languages, for several cases. For example, Greek, French and English readers can be confused with one another, while this is not possible with Greek and Spanish readers. Table B1 of Appendix B reports all values of the correlation coefficient rO.
An interesting parameter, linked to the correlation coefficient rO, is the coefficient of variation [29] (more commonly termed the coefficient of determination):
$R = r_O^2$ (6)
The coefficient of variation R gives the fraction of the total variance of the dependent variable y accounted for by the regression line, and 1 − R the proportion not accounted for. In other words, if rO = ±1, then R = 1: the regression line tells all the story linking y to x because there is no scattering, hence the relationship between y and x is deterministic.
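The link between the correlation coefficient and the variance accounted for by the regression line can be verified numerically. The sketch below assumes that R is the square of the correlation coefficient, which is consistent with its description as the fraction of variance explained by a least-squares line; the data and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def explained_fraction(x, y):
    """1 - SSE/SST for the least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1.0 - sse / sst


# Hypothetical IP values of the same chapters in two languages.
ip_x = [4.8, 5.5, 6.1, 7.0, 7.9]
ip_y = [5.0, 5.9, 6.0, 7.4, 8.1]
r = pearson_r(ip_x, ip_y)
print(r, r ** 2, explained_fraction(ip_x, ip_y))
```

The last two printed values agree up to rounding, which shows that squaring the correlation coefficient and measuring the variance explained by the fitted line are two routes to the same number.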
Figure 8 shows the probability distribution function of exceeding a given value R. We can see that R > 0.95 with probability less than 0.1 (10% of the cases); for these latter cases, therefore, at least 95% of the variance of the samples of y is due to the regression line linking it to x. Table 2 lists, as examples, some of these cases, obtained by reading in Table B1 (Appendix B) only the cases of positive correlation coefficients, rO > 0. We can notice that belonging to a language family makes little difference, although some populations can be confused more than others, like in the cases of Italian and Spanish.
![]()
Figure 6. Scatterplot of the overlap index IO versus language by assuming as reference language Greek (red circles), French (black circles) and English (green circles).
![]()
Figure 7. Scatterplot of IP and regression line of IP in two languages, for several cases. French (y) versus Greek (x) (red circles), English versus Greek (blue upward triangles), English versus French (cyan downward triangles), Spanish versus Greek (green circles).
![]()
Figure 8. Probability distribution function of exceeding the coefficient of variation R in abscissa.
In the next section we define a “universal” reader of the New Testament.
![]()
Table 2. Reference languages with a high coefficient of variation R. Data taken from Table B1 (Appendix B) only in the cases of positive correlation coefficients, rO > 0.
![]()
Figure 9. Probability density function of IP for the Universal Reader (Un, cyan line), German (Ge, magenta line), English (En, blue line) and Greek (Gr, black line).
![]()
Figure 10. Overlap index IO of the probability density function of IP of the languages in abscissa with the probability density function of the universal reader. The mean is 62.55%, the median is 69.20%.
5. Universal Reader of the New Testament
As mentioned in Section 2, the overall mean value of the data reported in Table 1 is 6.03 and the overall standard deviation is 1.56. Figure 9 shows the corresponding log-normal probability density function compared to that of some specific languages. This model can be considered as the probability density function of a population of “universal” readers who can read, as far as the STM capacity is concerned, any NT translation.
Figure 10 shows the overlap index IO calculated by comparing the probability density function of IP of the universal reader with the probability density function of the language in abscissa. More than 50% of the languages overlap with the universal reader with IO above the mean value of 62.55%, since the median is 69.20%.
6. Final Remarks and Future Work
We have studied the short-term memory (STM) capacity of the ancient readers of the original New Testament written in Greek, and of readers of its translations to Latin and to modern languages. A similar study could be done with other alphabetical texts belonging to any literature.
For modelling the STM capacity, we have considered the number of words between two contiguous interpunctions, namely the “words interval” IP, because this parameter seems to describe how the human mind memorizes “chunks” of information in the STM.
Since IP can be calculated for any alphabetical text, we can perform experiments with ancient readers—otherwise impossible—by studying the literary works they used to read. These “experiments” can reveal unexpected similarity and dependence between texts, because they consider parameters not consciously controlled by writers, either ancient or modern.
The “experiments” done have compared the STM capacity of the readers of a language/translation to that of the readers of another language/translation, by measuring the probability of overlap of two languages/populations of readers and by defining an “overlap index”. For example, Greek and English readers, as well as Italian and Spanish readers, can be confused because they practically share the same probability distribution of IP. The experimental values reported in the large tables of Appendix A and Appendix B give details on the other languages.
We have also defined a population of universal readers, namely readers who can read (and understand) the New Testament in any language. We have found that more than 50% of the languages overlap with the universal reader with an overlap index above 62.55%, the mean value over all languages (the median is 69.20%).
Future work is vast, with many research tracks, because alphabetical Literatures are very large and many experiments such as those reported in this paper can be done, according to specific purposes, such as comparing authors, translations or even texts written by artificial intelligence tools.
Appendix A. Values of the Probability of Overlap pO
![]()
Table A1. Values of the probability of overlap pO for the indicated languages. The languages indicated in the first row are the reference languages, the languages indicated in the first column are the dependent languages. For example, if Greek is the reference language, Latin overlaps for 16.31% of the readers and French overlaps for 96.56%. Of course, the table is symmetric because of the definition of pO.
Appendix B. Values of the Correlation Coefficient rO
![]()
Table B1. Values of the correlation coefficient rO for the indicated languages. The languages indicated in the first row are the reference languages, the languages indicated in the first column are the dependent languages. For example, if Greek is the reference language, the table gives its correlation coefficient with Latin and with French. Of course, the table is symmetric because of the definition of rO.