Multiple Communication Channels in Literary Texts

Abstract

The statistical theory of language translation is used to compare how a literary character speaks to different audiences by diversifying two important linguistic communication channels: the “sentences channel” and the “interpunctions channel”. The theory can “measure” how the author shapes a character speaking to different audiences by modulating deep-language parameters. To show its power, we have applied the theory to the literary corpus of Maria Valtorta, an Italian mystic of the twentieth century. The likeness index IL, ranging from 0 to 1, makes it possible to “measure” how similar two linguistic channels are, and therefore whether a character speaks to different audiences in the same way. A 6-dB difference between the signal-to-noise ratios of two channels already gives IL ≈ 0.5, a threshold below which the two channels depend very little on each other, implying that the character addresses different audiences differently. In conclusion, multiple linguistic channels can describe the “fine tuning” that a literary author uses to diversify characters or to distinguish the behavior of the same character in different situations. The theory can be applied to literary corpora written in any alphabetical language.

Matricciani, E. (2022) Multiple Communication Channels in Literary Texts. Open Journal of Statistics, 12, 486-520. doi: 10.4236/ojs.2022.124030.

1. Linguistic Communication Channels

Any language can communicate—across space and time—personal and intimate thoughts, stories and knowledge through literary texts (fiction), essays and scientific texts.

In recent papers [1] [2] [3], we have developed a general statistical theory on the translation of literary texts, based on communication theory, which involves linguistic stochastic variables and suitably defined communication channels. By “translation” we mean not only the conversion of a text from one language to another—what is properly understood, of course, as translation—but also how some linguistic parameters of a text are related to those of another text in the same language. In the theory, therefore, “translation” also refers to the case in which a text is compared (metaphorically “translated”) with another text, whatever the language of the two texts.

In the literature, most studies on relationships between texts concern translation, because of the importance of automatic (i.e., machine) translation. Translation transfers meaning from one set of sequential symbols into another set of sequential symbols, and was first studied as a language-learning methodology or as part of comparative literature. Over time, the interdisciplinarity and specialization of the subject increased, and theories and models were imported from other disciplines [4] [5]. References [6] - [12] report results not based on mathematical analysis of texts, as we have done [1] [2] [3]. When a mathematical approach is used, as in References [13] - [25], most of these studies concern neither the aspects of Shannon’s communication theory [26], nor the fundamental connection that some linguistic variables have with the reader’s reading ability and short-term memory capacity [1] [2] [3]. In fact, these studies are mainly concerned with automatic translations, not with the response of human readers. Very often they refer to only one linguistic variable, e.g. phrases [24]. As stated in [25], statistical automatic translation is a process in which the text to be translated is “decoded” by eliminating the noise, adjusting lexical and syntactic divergences to reveal the intended message. In our theory, on the contrary, what we define as “noise”—given by quantitative differences between the source text (input) and the translated text (output)—must not be eliminated, because it makes the translation readable and matched to the reader’s short-term memory capacity, a connection never considered in the mentioned references.

Since the 1950s, automatic approaches to text translation have been developed and have now reached a level at which machine translations are of practical use. References [10] - [44] are a small sample of the vast literature on machine translation, all characterized by the same paradigm.

However, as machine translation becomes very popular, its quality also becomes increasingly critical, and human evaluation and intervention are often necessary to arrive at an acceptable text quality. Of course, human evaluation can only be done by experts; it is therefore an expensive and time-consuming activity. To avoid this cost, it is necessary to develop mathematical algorithms which approximate human judgment [27]. The theory developed in [1] [2] [3] and the advances presented in the present paper can set some benchmarks for assessing translation quality.

The variables considered in the theory are: the total number of words W, sentences S and interpunctions I; the readability index G, defined for any alphabetical language; the number of words nW, sentences nS and interpunctions nI per chapter (or any chosen subdivision of a literary text, large enough to provide reliable statistics, e.g. a few hundred words); the number of characters per word CP, words per sentence PF, words per interpunction IP (this parameter, also called “word interval”, is linked to the short-term memory capacity of readers [1] [2] [3]) and interpunctions per sentence MF (this parameter also gives the number of word intervals IP contained in a sentence).
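As an illustration, the per-chunk parameters can be computed with a naive tokenizer. This is a minimal sketch, not the paper's exact counting pipeline: the regular expressions below are our own simplification of what counts as a word, a sentence boundary and an interpunction.

```python
import re

def chunk_parameters(text):
    """Compute CP, PF, IP and MF for one text block (naive tokenization)."""
    words = re.findall(r"[A-Za-zÀ-ÿ']+", text)       # word tokens
    sentences = re.findall(r"[.!?]+", text)          # sentence-ending mark runs
    interpunctions = re.findall(r"[.,;:!?]", text)   # all punctuation marks
    n_w, n_s, n_i = len(words), len(sentences), len(interpunctions)
    return {
        "C_P": sum(len(w) for w in words) / n_w,  # characters per word
        "P_F": n_w / n_s,                         # words per sentence
        "I_P": n_w / n_i,                         # words per interpunction (word interval)
        "M_F": n_i / n_s,                         # interpunctions per sentence
    }

p = chunk_parameters("He spoke plainly, briefly, kindly. They listened; they understood.")
```

Note that MF = PF/IP holds by construction, consistent with MF being the number of word intervals contained in a sentence.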

To study the chaotic data that emerge, the theory compares a text (the reference, or input text) to another text (output), with a complex communication channel—made of several parallel channels, two of which are considered in the present paper—in which both input and output are affected by “noise”, i.e. by different scattering of the data around an average relationship, a regression line in the theory.

In [3] we have shown how much the mutual mathematical relationships of texts in a language are saved or lost in translating them into another language. To make objective comparisons, we have defined the likeness index IL, based on probability and communication theory of noisy digital channels.

We have shown (e.g., see Section 4 of [3]) that two linguistic variables—e.g. nS and nW, or MF and nS—can be linearly linked by regression lines. This is a general feature of texts. For example, if we consider the regression line linking nS to nW in a reference text and that found in another text (e.g., written in the same language), it is possible to link nS of the first text to nS of the second text with another regression line, without explicitly calculating its parameters (slope and correlation coefficient) from the samples, because the mathematical problem has the same structure as the theory developed in Section 11 of [2]. The theory, of course, does not consider the meaning of texts.
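The slope and correlation coefficient of such a regression line can be estimated from per-chapter samples. The sketch below uses illustrative numbers, not Valtorta data, and fits the line through the origin by least squares:

```python
import numpy as np

# Per-chapter samples of an independent variable x (e.g. n_W) and a
# dependent variable y (e.g. n_S); illustrative numbers only.
x = np.array([120.0, 250.0, 310.0, 150.0, 400.0, 275.0])
y = np.array([8.0, 15.0, 20.0, 9.0, 27.0, 17.0])

m = np.sum(x * y) / np.sum(x * x)   # slope of y = m x (least squares through origin)
r = np.corrcoef(x, y)[0, 1]         # correlation coefficient measuring the scattering
```

The slope m carries the average relationship of Equation (1); the departure of r from 1 is exactly the “noise” that the channel theory keeps track of.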

In the present paper, we apply the theory to compare how a literary character speaks to different audiences by diversifying and adjusting two important communication channels, namely the “sentences channel” and the “interpunctions channel”. In other words, we study how an author shapes a main character’s speaking to different audiences by modulating some of the linguistic parameters mentioned above. To show the possibilities and usefulness of the theory, we apply it to a large and relevant literary corpus written by an Italian mystic of the twentieth century, Maria Valtorta, whose texts (in Italian) have been studied with a multidisciplinary approach in recent years [45] [46] [47] [48]. A similar approach can, of course, be applied to any other literary corpus written in any alphabetical language.

After this introduction, Section 2 recalls the fundamental relationships present in linguistic communication channels; Section 3 reports some biographical data on Maria Valtorta and her literary corpus; Section 4 recalls and applies a useful vectors plane of linguistic variables; Section 5 defines the theoretical signal-to-noise ratio in literary communication channels; Section 6 discusses the experimental signal-to-noise ratio obtained with Monte Carlo simulations; Section 7 uses the likeness index and the symmetry index to compare channels; finally Section 8 summarizes the main points of the paper and draws a conclusion. Appendices A and B report full data banks useful for assessing some relationships studied in the main text.

2. Fundamental Relationships in Linguistic Communication Channels

In this section we recall the general theory of linguistic channels. In a text, an independent (reference) variable x (e.g., nW) and a dependent variable y (e.g., nS) can be related by the regression line passing through the origin of the axes:

y = m x (1)

In Equation (1) m is the slope of the line.

Let us consider two different texts Y_k and Y_j, e.g. the sermons that a character, in the literary fiction, addresses to audience k and to audience j. For these texts, we can write more general linear relationships, which account for the scattering of the data—measured by the correlation coefficients r_k and r_j, respectively, not considered in Equation (1)—around the average values (measured by the slopes m_k and m_j):

y_k = m_k x + n_k (2a)

y_j = m_j x + n_j (2b)

As is known, the linear model Equation (1) connects x and y only on the average (through m), while the linear model Equation (2) introduces additive “noise” through the stochastic variables n_k and n_j, with zero mean value [1] [2] [3]. The noise is due to the correlation coefficient |r| < 1, not considered in Equation (1).

We can compare two texts by eliminating x; in other words, we compare the output variable y for the same value of the input variable x. In the example just mentioned, we can compare the number of sentences in two texts—for an equal number of words—by considering not only the average relationship, Equation (1), but also the scattering of the data, measured by their correlation, Equation (2). We refer to this communication channel as the “sentences channel”.

If the linear relationship is between the number of interpunctions per sentence y = MF and the number of sentences x = nS, then by eliminating nS we get the linear relationship between MF,k of the first text and MF,j of the second text. We refer to this communication channel as the “interpunctions channel”. Notice that, because MF is also the number of word intervals IP [1] contained in a sentence, and IP is linked to short-term memory capacity [1] [2] [3], this channel describes how the character addresses the short-term memory of the two audiences.

By eliminating x from Equation (2), we get the linear relationship between the number of sentences (or interpunctions) in text Y_k (now the reference, input text) and the number of sentences (or interpunctions) in text Y_j (now the output text):

y_j = (m_j/m_k) y_k − (m_j/m_k) n_k + n_j (3)

Compared to the new reference text Y_k, the slope m_jk is given by:

m_jk = m_j/m_k (4)

The noise source that produces the new correlation coefficient between Y_k and Y_j is given by:

n_jk = −(m_j/m_k) n_k + n_j = −m_jk n_k + n_j (5)

The “regression noise-to-signal ratio” R_m, due to |m_jk| ≠ 1, of the new channel is given by [2]:

R_m = (m_jk − 1)^2 (6)

The unknown correlation coefficient r_jk between y_j and y_k is given by [49]:

r_jk = cos|arccos(r_j) − arccos(r_k)| (7)

The “correlation noise-to-signal ratio” R_r, due to |r_jk| < 1, of the new channel from text Y_k to text Y_j is given by [2]:

R_r = [(1 − r_jk^2)/r_jk^2] m_jk^2 (8)

Because the two noise sources are disjoint and additive, the total noise-to-signal ratio of the channel connecting text Y_k to text Y_j, for a given stochastic variable, is given by [2]:

R = (m_jk − 1)^2 + [(1 − r_jk^2)/r_jk^2] m_jk^2 (9)

Notice that Equation (9) can be represented graphically [2]. Finally, the total signal-to-noise ratio is given by:

Γ = 1/R (10a)

Γ_dB = 10 × log10(Γ) (10b)

Of course, we expect—and it is so in the following—that no channel can yield |r_jk| = 1 and |m_jk| = 1, and therefore Γ_dB = ∞, a case referred to as the ideal channel, unless a text is compared with itself (self-comparison, self-channel). In practice, we always find |r_jk| < 1 and |m_jk| ≠ 1. The slope m_jk measures the multiplicative “bias” of the dependent variable compared to the independent variable; the correlation coefficient r_jk measures how “precise” the linear best fit is.

In conclusion, the slope m_jk is the source of the regression noise, and the correlation coefficient r_jk is the source of the correlation noise of the channel. Before proceeding with the study, in the next Section we sketch the biography of Maria Valtorta and introduce her literary corpus.
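Equations (4) and (6)-(10) can be wired into a single function. The sketch below is ours (function name and the illustrative slope/correlation values are assumptions, not Table 2 data):

```python
import math

def gamma_db(m_k, r_k, m_j, r_j):
    """Total signal-to-noise ratio (dB) of the channel from text Y_k to Y_j,
    following Equations (4) and (6)-(10)."""
    m_jk = m_j / m_k                                       # Eq. (4)
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))  # Eq. (7)
    R_m = (m_jk - 1.0) ** 2                                # Eq. (6), regression noise
    R_r = (1.0 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2        # Eq. (8), correlation noise
    R = R_m + R_r                                          # Eq. (9)
    return 10.0 * math.log10(1.0 / R)                      # Eq. (10)
```

Note that swapping input and output gives a different value, i.e. the channel is asymmetric, as discussed later in the paper.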

3. Maria Valtorta and Her Literary Corpus

Maria Valtorta (1897-1961) was an Italian mystic writer active in the years of World War II. Her voluminous literary work—based, as she claims, on mystical visions, whose assessment is of course beyond science and our investigation—contains a detailed life of Jesus Christ. A rigorous scientific analysis of her literary corpus on Jesus’ life—narrated in her main work Il Vangelo come mi è stato rivelato (The Gospel as revealed to me, in the following EMV), published in 10 volumes [50]—has revealed the presence of many data on facts and events allegedly occurring 2000 years ago in Palestine, well beyond her knowledge, culture and skills [45] [46] [47]. She reports, in real time, what she sees and hears during many mystical visions—as she claims—in a period lasting several years [48]. She mentions towns, villages, buildings and palaces, Roman roads, mountain tracks, the river Jordan, ports of the Mediterranean, lakes (Tiberias, ancient Meron), creeks, mountains and hills, trees and flowers, fragrances and perfumes, dresses, food, weather, sceneries and monuments of Palestine at Jesus’ times, a geographical area she never visited.

Bedridden since 1934 because she was paralyzed below the waist, she wrote on a small stand, sitting in bed with her shoulders supported by pillows, in Viareggio (Tuscany), during World War II and the few following years. In spite of a complete lack of any data possibly available in her times, every time some of the data she reports have been checked, they have proved unexpectedly correct, sometimes even anticipating what scholars would find years after her writings [47] [50] [51] [52].

She wrote, in Italian, 13,193 pages in 122 school notebooks [53], without making any correction, with a set of fountain pens always filled with ink because she did not know when the alleged visions would come. In these notebooks there are not only the events now published in the EMV, but also many other mystical writings, as she intercalated pages describing the events of Jesus’ life with many pages on various topics, including dictations and monologues addressed to her by the alleged Jesus (texts referred to below as Jesus says) or by other heavenly persons. In the following we drop the adjective “alleged”, although we always imply it throughout the paper, because it is not our duty, or task, to declare or establish that her “visions” were real: this is beyond the realm of science.

In this voluminous literary corpus, the character Jesus addresses different audiences: he speaks to friends and disciples, tells parables, delivers extempore sermons to people, and preaches in synagogues and at the Temple in Jerusalem. The character delivers two well-organized and coherent series of sermons: at a locality named Clear Water, in the Jordan River Valley, and at a locality that Maria Valtorta describes in great detail and which closely resembles the Horns of Hattin (Galilee). Some of the content spoken at the Horns of Hattin is reported in the gospel according to Matthew (Mt, 5), and universally known as the Sermon on the Mount, although the “Sermon” reported in the EMV lasts a week, not a single day [46].

Table 1 reports the average values of the linguistic parameters in the indicated texts attributed to Jesus, mostly extracted from the EMV, and already studied (except Jesus says) for their setting, topics and duration in [46].

We first study the averages of the linguistic variables reported in Table 1 by using a vector representation [1] [2] [3] [54], which gives an overall view of how “close” the texts are.

4. The Vectors Plane of Linguistic Variables

The linguistic averages of Table 1 can be used to assess how “close” texts are in a Cartesian plane, by using a graphical tool which effectively compares different literary texts seen as vectors, a representation discussed in detail in [1] [2] [3] and briefly recalled here.

Let us consider the following six vectors with the indicated components: R1 = (CP, PF), R2 = (MF, PF), R3 = (IP, PF), R4 = (CP, MF), R5 = (IP, MF), R6 = (IP, CP), and their resulting vector:

R = Σ_{k=1}^{6} Rk (11)

The choice of which parameter represents the component in the abscissa and ordinate Cartesian axes is not important. Once the choice is made, the numerical results will depend on it, but not the relative comparisons and general conclusions.
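The resulting vector of Equation (11) and the normalization used in Figure 1 can be sketched as follows. The average values below are illustrative, not taken from Table 1, and the helper name `normalize` is ours:

```python
# Resulting vector of Equation (11) from the six component vectors
# R1 = (CP, PF), ..., R6 = (IP, CP); illustrative averages only.
CP, PF, IP, MF = 4.5, 16.3, 6.2, 2.6

vectors = [(CP, PF), (MF, PF), (IP, PF), (CP, MF), (IP, MF), (IP, CP)]
R = (sum(v[0] for v in vectors), sum(v[1] for v in vectors))

def normalize(R, R_cw, R_js):
    """Map a resulting vector so that Clear Water falls at (0, 0)
    and Jesus says at (1, 1), as in Figure 1."""
    return tuple((R[i] - R_cw[i]) / (R_js[i] - R_cw[i]) for i in range(2))
```

With this mapping, any third text lands somewhere in the plane spanned by the two reference texts, which is what makes the visual comparison of Figure 1 possible.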

Figure 1 shows the resulting vector (11) for the texts listed in Table 1, normalized to the coordinates of Clear Water (CW, located at the origin, coordinates (0, 0)) and of Jesus says (JS, located at (1, 1)). As already observed [46], we can notice, very clearly, that the data concerning the sermons

Table 1. Total number of words and sentences in the texts referred to the indicated audiences (the number in parentheses is the number of text subdivisions considered in calculating averages and regression lines) and average number of: characters per word (CP), words per sentence (PF), words per punctuation mark (interpunction)—which coincides with the word interval (IP) [1]—and punctuation marks per sentence (MF), which is also the number of word intervals contained in a sentence.

at Clear Water (delivered in 14 days) and at the Horns of Hattin (delivered in 7 days) are displaced from the other texts. They seem to belong to a set of data with different linguistic statistics. This striking difference underlines the peculiarity of these two coordinated and apparently planned series of sermons, compared to the other extempore sermons. Notice also that Jesus says is very much displaced from all the other texts. It seems that the character Jesus speaks quite differently to a modern listener (i.e., Maria Valtorta) than when he speaks to people of his (alleged) own historical time. The clear distinction of Jesus says from the other texts will be further analyzed below.

Now, if Maria Valtorta’s claim could be accepted—i.e. she had visions of Jesus’ public life events and received Jesus’ dictations and monologues—the differences just underlined would not be surprising because, in this case, Jesus would be a real person living in his times when he speaks to people, and a contemporary person when he speaks to Maria Valtorta. However, because we, as scientists, are not allowed to accept her claim, we must therefore conclude that she is a very capable writer, because she distinguishes audiences, settings and topics in which the character Jesus acts.

Besides the vector analysis shown in Figure 1, in the next Section we study

Figure 1. Coordinates x and y of the resulting vector (11) of each literary text, normalized to the coordinates of the sermons at Clear Water and of the dictations addressed to Maria Valtorta (Jesus says), by assuming Clear Water as the origin, coordinates (0, 0) (CW), and Jesus says located at (1, 1) (JS). P: Parables; D: Disciples; PP: People; S: Synagogues; T: Temple; CW: Clear Water; HA: Horns of Hattin.

some communication channels linked to specific linguistic variables, such as sentences and interpunctions.

5. Theoretical Signal-to-Noise Ratio in Literary Communication Channels

In this Section we study how sentences and interpunctions build specific communication channels in a literary text, and calculate their signal-to-noise ratio defined in Section 2.

To apply the theory of Section 2, we need the slope m and the correlation coefficient r of the regression line between: (a) the number of sentences nS and the number of words nW, to study the “sentences channel”; (b) the number of interpunctions per sentence MF and the number of sentences nS, to study the “interpunctions channel”.

Table 2 reports the slope m and the correlation coefficient r of the regression line for the indicated texts. For example, in Hattin, if nW = 100, then text blocks of 100 words contain on average nS = 6.61 sentences and 2.1903 × 6.61 = 14.48 interpunctions (punctuation marks).
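The worked example for Hattin can be verified directly from the slopes quoted in the text (the slope 0.0661 sentences per word is inferred from nS = 6.61 at nW = 100):

```python
# Hattin slopes quoted in the text: sentences per word (n_S vs n_W)
# and interpunctions per sentence (vs n_S).
m_sentences, m_interp = 0.0661, 2.1903

n_W = 100
n_S = m_sentences * n_W   # average sentences in a 100-word block
n_I = m_interp * n_S      # average interpunctions in the same block
```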

Figures 2-7 show, for some texts, the scatterplots and their regression lines. By looking at these figures, we can see at a glance which texts have very similar regression lines. It is more difficult to see whether the scattering of the data is similar or not.

For example, in Figure 3 the regression lines of Disciples (cyan) and People (magenta) coincide. In other words, a given number of words contains, on the average, the same number of sentences in both texts. Therefore, the character Jesus addresses the two audiences with sentences of about the same length, on the average; see also the average values of PF in Table 1, PF = 16.30 and PF = 17.10, respectively. The correlation coefficients are very similar, r = 0.9462 and r = 0.9397. According to the theory of Section 2, the signal-to-noise ratio of the

Table 2. Line slope m and correlation coefficient r of the regression lines between the indicated variables, in the texts listed. Correlation coefficients are reported with 4 decimal digits because some coefficients differ only from the third digit onwards.

Figure 2. Scatterplots and regression line between words (independent variable) and sentences (dependent variable) in the following texts: Hattin (blue squares and blue line); Clear Water (red squares and red line); Synagogues (black circles and black line).

Figure 3. Scatterplots and regression line between words (independent variable) and sentences (dependent variable) in the following texts: Disciples (cyan circles and cyan line); People (magenta circles and magenta line); Jesus says (black dots and black line). Notice that the regression lines of Disciples (cyan) and People (magenta) coincide.

Figure 4. Scatterplots and regression line between words (independent variable) and sentences (dependent variable) in the following texts: Hattin (blue squares and blue line); Parables (green circles and green line); Temple (magenta circles and magenta line).

Figure 5. Scatterplots and regression line between sentences (independent variable) and punctuation marks (interpunctions, dependent variable) in the following texts: Hattin (blue squares and blue line); Clear Water (red squares and red line); Synagogues (black circles and black line).

Figure 6. Scatterplots and regression line between sentences (independent variable) and punctuation marks (interpunctions, dependent variable) in the following texts: Hattin (blue squares and blue line); Parables (green circles and green line); Temple (magenta triangles and magenta line). Notice that the regression lines of Hattin (blue) and Parables (green) coincide.

Figure 7. Scatterplots and regression line between sentences (independent variable) and punctuation marks (dependent values) in the following texts: Disciples (cyan circles and cyan line); People (magenta circles and magenta line); Jesus says (black dots and black line). Notice that the three regression lines practically coincide.

sentences channel obtainable—i.e. the channel that transfers (translates) the number of sentences of the input text into the number of sentences of the output text—should be quite large, as we will show below (Table A4).

Similar results can be found in the scatterplots of interpunctions versus sentences. For example, in Figure 7 we can notice that the regression lines of Disciples (cyan), People (magenta) and Jesus says (black) practically coincide. But the correlation coefficients are quite different: r = 0.9419 in Jesus says (Table 2, rightmost column) against r = 0.9586 in Disciples and r = 0.9567 in People.

Regression lines, however, describe only one aspect of the relationship, namely the average values—recall that average values, such as those shown in Table 1, belong to the regression line—and do not show the other aspect, namely the scattering of the data, which may not be the same even when two regression lines almost coincide. The theory of linguistic channels recalled in Section 2, on the contrary, by considering both slopes and correlation coefficients, provides a reliable tool for comparing two sets of data, each described by the linear relationship of Equation (2), according, for example, to the signal-to-noise ratio recalled in Section 2 or to the Shannon channel capacity [2].

Let us calculate the theoretical signal-to-noise ratios obtained in the sentences and interpunctions channels according to Section 2. Table 3 (sentences channel) and Table 4 (interpunctions channel) report the theoretical signal-to-noise ratio Γ_th,dB (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line.

For example, in the sentences channel (Table 3), from Parables (input) to Hattin (output) we read Γ_th,dB = 17.38 dB—54.7 in linear units—and Γ_th,dB = 16.48 dB—44.5 in linear units—in the reverse channel from Hattin (input) to Parables (output), showing asymmetry, a characteristic of linguistic communication channels [2] [3]. In the interpunctions channel (Table 4), from Parables

Table 3. Sentences channel. Theoretical signal-to-noise ratio Γ_th,dB (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Parables and the output is Clear Water, Γ_th,dB = 20.95 dB.

Table 4. Interpunctions channel. Theoretical signal-to-noise ratio Γ_th,dB (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Parables and the output is Clear Water, Γ_th,dB = 22.57 dB.

(input) to Hattin (output), Γ_th,dB = 25.78 dB—378.4 in linear units—and Γ_th,dB = 26.13 dB—410.2 in linear units—in the reverse channel from Hattin (input) to Parables (output). Notice the large Γ_th,dB ≈ 40.5 dB (11,220 in linear units) in the interpunctions channels Disciples ↔ People (Table 4).
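The dB values and their linear counterparts quoted above can be cross-checked by inverting Equation (10b):

```python
def db_to_linear(gamma_db):
    """Invert Equation (10b): Gamma = 10^(Gamma_dB / 10)."""
    return 10.0 ** (gamma_db / 10.0)

# dB -> linear pairs quoted in the text for the Parables/Hattin channels
checks = {17.38: 54.7, 16.48: 44.5, 25.78: 378.4, 26.13: 410.2}
```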

Besides the asymmetry of the channels, these results say, for example, that the two texts in the channels Hattin ↔ Parables are more similar in the interpunctions channel than in the sentences channel. In other words, the regression lines and the scattering (i.e., the “noise”) are more similar when the scatterplots of the interpunctions channels are compared—if they were explicitly available—than when the scatterplots of the sentences channels are compared. Therefore, the theory of linguistic channels can finely describe differences that, to a first approximation—such as the average values just reported and the regression line—would be largely lost. In conclusion, multiple linguistic channels can describe the “fine tuning” that a literary author can use to distinguish characters, or the same character in different situations, as Maria Valtorta does.

However, as discussed in [3], an important issue here arises because of the different sample size used in calculating the regression line parameters listed in Table 2. In the next Section, we recall this issue and show how to deal with it.

6. Experimental Signal-to-Noise Ratio

Because of the different sample sizes used in calculating the regression parameters listed in Table 2, the slope m and the correlation coefficient r of a regression line, being stochastic variables, are characterized by average values (those reported in Table 2) and standard deviations, which depend on the sample size [55]. The theory would yield improved estimates of Γ_th,dB, of course, if the sample size were larger. With a small sample size, the standard deviations of m and r can give too large a variation in the Γ_dB predicted by the theory—see the sensitivity of this parameter to the slope m and the correlation coefficient r in [3]. Only Jesus says is based on a relatively large sample size, 302 couples in the scatterplot. To avoid this inaccuracy—due to the small sample size from which the regression lines are calculated, not to the theory of Section 2—in [3] we have defined, used and discussed a “renormalization” based on Monte Carlo simulations, whose results we consider as “experimental”.

Now, we first recall the steps of the Monte Carlo simulation to be performed, and then we report the results concerning the sentences channel and the interpunctions channel.

6.1. Monte Carlo Simulations

For example, let us take Hattin as the output text and the others, in turn, as input texts. The steps of the Monte Carlo simulation, for example in the sentences channel, are the following:

1) Generate 7 independent numbers (the number of texts—i.e. sermons—in Hattin) from a discrete uniform probability distribution in the range 1 to 7, with replacement—i.e., a sermon can be selected more than once.

2) “Write” another possible “Hattin” with 7 new sermons, e.g. the sequence 2; 1; 6; …, hence take sermon 2, followed by sermon 1, sermon 6, etc., up to seven sermons. The text of a sermon can appear twice (with probability 1/7^2), three times (with probability 1/7^3), et cetera, and the new Hattin can contain, on the average, a number of words greater or smaller than the original text (the differences are small and do not affect the statistical results).

3) Calculate the parameters m_j and r_j of the regression line between words (independent variable) and sentences (dependent variable) in the new Hattin.

4) Compare m_j and r_j of the new Hattin (output, dependent text) with any other text (input, independent text, with m_k and r_k values listed in Table 2), in the cross-channels so defined, including the original Hattin (self-channel).

5) Calculate m_jk, r_jk and Γ_dB of the cross-channels (linking sentences to sentences), according to the theory of Section 2.

6) Consider the values of Γ_dB so obtained as “experimental” results Γ_dB,ex, to be compared to the theoretical results of Section 5. Notice that it is not necessary to also generate new Clear Water texts, et cetera, because we compare the experimental results to the theoretical results; the input m_k and r_k must therefore be the same, i.e. those of Clear Water, et cetera. A new Clear Water, et cetera, is generated in the reverse channel.

7) Repeat steps 1 to 6 many times (we did it 5000 times).

Besides the usefulness of the simulation as a “renormalization” tool, shown in [3], the new sermons obtained in step (2) might have been “pronounced” by Jesus on the same occasion, because they maintain the statistical relationships between the linguistic variables of the original sermons.

In conclusion, the Monte Carlo simulation should take care of the inaccuracy in estimating slope and correlation coefficient due to a small sample size.
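The seven steps above can be sketched as a bootstrap over a text's subdivisions. This is our own minimal implementation, with an assumed function name and synthetic (x, y) pairs standing for, e.g., (nW, nS) per sermon; it is not the paper's exact code:

```python
import random
import numpy as np

def bootstrap_channel(samples_k, samples_j, n_runs=5000, seed=1):
    """Monte Carlo 'renormalization' of steps 1-7: resample the output text's
    subdivisions with replacement, refit the regression line through the
    origin, and collect experimental Gamma_dB values against a fixed input
    text. samples_* are lists of (x, y) pairs, one per subdivision."""
    rng = random.Random(seed)
    x_k, y_k = np.array(samples_k, dtype=float).T
    m_k = np.sum(x_k * y_k) / np.sum(x_k * x_k)            # input slope
    r_k = np.corrcoef(x_k, y_k)[0, 1]                      # input correlation
    gammas = []
    for _ in range(n_runs):
        draw = [samples_j[rng.randrange(len(samples_j))]   # steps 1-2: resample
                for _ in samples_j]
        x_j, y_j = np.array(draw, dtype=float).T
        m_j = np.sum(x_j * y_j) / np.sum(x_j * x_j)        # step 3: refit
        r_j = min(float(np.corrcoef(x_j, y_j)[0, 1]), 1.0)
        if np.isnan(r_j):                                  # skip degenerate draws
            continue
        m_jk = m_j / m_k                                   # step 5: Eq. (4)
        r_jk = np.cos(abs(np.arccos(r_j) - np.arccos(r_k)))   # Eq. (7)
        R = (m_jk - 1) ** 2 + (1 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2  # Eq. (9)
        if R > 0:
            gammas.append(10 * np.log10(1 / R))            # step 6: Eq. (10)
    return float(np.mean(gammas)), float(np.std(gammas))   # step 7: statistics
```

Running the function with the same text as input and output approximates the self-channel of step (4): the resampled slopes and correlations stay close to the originals, so Γ_dB,ex is large but finite, which is the essence of the renormalization.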

6.2. Sentences Channel

For the sentences channel, Table 5 shows the results for Hattin. Appendix A reports the results for all other texts.

The results in Table 5 clearly show the impact of m_jk and r_jk on Γ_dB,ex. For example, although r_jk = 0.9796 in Temple → Hattin is very close to r_jk = 0.9778 in Jesus says → Hattin, the average Γ_dB,ex are quite different, namely Γ_dB,ex = 8.25 dB (i.e., 6.7 in linear units) in Temple → Hattin and Γ_dB,ex = 12.18 dB (16.5 in linear units) in Jesus says → Hattin, the difference being mainly due to the different slopes: m_jk = 1.317 in Temple → Hattin (i.e., 100 sentences in Temple are “translated” into 131.7 sentences in Hattin for the same number of words) and m_jk = 1.172 in Jesus says → Hattin. Similar observations can be made when slopes are very close but correlation coefficients are not.

Figure 8 and Figure 9 show the scatterplots between Γ_dB,ex and Γ_dB,th for all texts. From them we can notice that Γ_dB,ex and Γ_dB,th agree quite well up to about 20 - 25 dB, beyond which saturation occurs, a trend also shown in [3]. In other words, we can be confident in the reliability of Γ_dB,th up to about 20 - 25 dB. For larger values, Γ_dB,th can still be reliable, but in this case a deeper statistical assessment of the sample size of the input and output texts would be necessary.

Figure 8 and Figure 9 also show channels that coincide. For example, in Figure 8, lower panel, rightmost figure, the channels Hattin → Parables, People → Parables and Disciples → Parables coincide. In this figure, we can also notice that the channel with the largest Γ_dB,ex and Γ_dB,th is Clear Water → Parables. In other words, for sentences, Clear Water is the text closest to Parables, therefore stating that the character Jesus addresses the two audiences similarly. This result is due to the combination of slope and correlation coefficient.

Table 5. Sentences channel. Theoretical Γ_dB,th and experimental (Monte Carlo) Γ_dB,ex in the indicated cross-channels, obtained by assuming Hattin as output text. The standard deviations are shown in parentheses. Hattin Γ_dB,ex refers to its self-channel. The average values and standard deviations of m_jk and r_jk refer to the estimated regression lines between the number of sentences in Hattin (output, dependent variable) and the number of sentences in the indicated texts (input, independent variable). We report 4 decimal digits in correlation coefficients because some values differ only from the third digit.

Figure 8. Sentences channels. The title refers to the input text. Scatterplots between the average Γ_dB,ex (Monte Carlo) and Γ_dB,th. Hattin (blue square); Clear Water (red square); Temple (black triangle); Parables (green circle); Disciples (cyan circle); Synagogues (red circle); People (magenta circle); Jesus says (black triangle).

The higher Γ_dB,ex, the more similar the texts are, as for example People and Disciples in Figure 9 (upper panel, leftmost figure; lower panel, leftmost figure).

In Section 7, we objectively compare channels and texts according to the likeness index I_L, defined in [3].

6.3. Interpunctions Channel

Table 6 shows the results for Hattin in the interpunctions channel. Figure 10 and Figure 11 show the scatterplots between Γ_dB,ex and Γ_dB,th for all texts (Appendix B reports the tables for the other texts). We can notice, for example, that Hattin (Figure 10, upper panel, left) and Clear Water (Figure 10, upper panel, right) are the texts closest to Parables (green circles).

As in the sentences channel, also in the interpunctions channel Γ_dB,ex and Γ_dB,th agree quite well up to about 20 - 25 dB, beyond which saturation occurs, as is clearly shown in Figure 11, upper (Disciples) and lower (People) panels, left.

Notice that in general both Γ_dB,ex and Γ_dB,th tend to be larger than those in the sentences channel. Because this channel is connected with the word interval I_P, and therefore with the short-term memory capacity [1] [2], this result may highlight the fact that most audiences are addressed by distributing the interpunctions within a sentence in a similar way, except for Temple (I_P = 7.22) and Jesus says (I_P = 7.59); see Table 1.

Figure 9. Sentences channels. The title refers to the input text. Scatterplots between the average Γ_dB,ex (Monte Carlo) and Γ_dB,th. Hattin (blue square); Clear Water (red square); Temple (black triangle); Parables (green circle); Disciples (cyan circle); Synagogues (red circle); People (magenta circle); Jesus says (black triangle).

Table 6. Interpunctions channel. Theoretical Γ_dB,th and experimental (Monte Carlo) Γ_dB,ex in the indicated cross-channels, obtained by assuming Hattin as output text. The standard deviations are shown in parentheses. Hattin Γ_dB,ex refers to its self-channel. The average values and standard deviations of m_jk and r_jk refer to the estimated regression lines between the number of interpunctions in Hattin (output, dependent variable) and the number of interpunctions in the indicated texts (input, independent variable). We report 4 decimal digits in correlation coefficients because some values differ only from the third digit.

Figure 10. Interpunctions channels. The title refers to the input text. Scatterplots between the average Γ_dB,ex (Monte Carlo) and Γ_dB,th. Hattin (blue square); Clear Water (red square); Temple (black triangle); Parables (green circle); Disciples (cyan circle); Synagogues (red circle); People (magenta circle); Jesus says (black triangle).

Figure 11. Interpunctions channels. The title refers to the input text. Scatterplots between the average Γ_dB,ex (Monte Carlo) and Γ_dB,th. Hattin (blue square); Clear Water (red square); Temple (black triangle); Parables (green circle); Disciples (cyan circle); Synagogues (red circle); People (magenta circle); Jesus says (black triangle).

In Section 7, we objectively compare channels and texts according to the likeness index I_L, defined in [3].

7. Likeness Index and Symmetry Index

The likeness index I_L is based on probability theory and makes it possible to “measure” how similar a linguistic communication channel is to another channel. In other words, the likeness index measures how much a text can be “mistaken”, mathematically, for another text, e.g., Hattin for Clear Water, by studying self- and cross-channels and their signal-to-noise ratios Γ_dB,ex, whose probability density functions are modelled as Gaussian, with the average values and standard deviations reported in Table 5 and Table 6. The probability problem is binary because a decision must be taken between two alternatives; its theory is fully developed in [3].

The likeness index is bounded in the range 0 ≤ I_L ≤ 1; I_L = 0 means totally independent texts, I_L = 1 means totally dependent texts.
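The exact construction of I_L is developed in [3] and not reproduced here. The following is only a minimal sketch consistent with the Gaussian modelling and the bounds above: it treats the Γ_dB,ex of two channels as equal-variance Gaussian densities and takes twice the error probability of the equal-prior binary decision with the midpoint threshold, so identical densities give 1 and well-separated densities give values near 0. The equal-variance simplification and the factor of 2 are assumptions of this sketch, not the definition in [3].

```python
from math import erf, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def likeness_sketch(mu1_db: float, mu2_db: float, sigma_db: float) -> float:
    """Illustrative likeness index for two equal-variance Gaussian densities
    of Gamma_dB,ex: twice the error probability of the equal-prior binary
    decision with the midpoint threshold (an assumption of this sketch)."""
    delta = abs(mu1_db - mu2_db)
    return 2.0 * norm_cdf(-delta / (2.0 * sigma_db))

print(likeness_sketch(20.0, 20.0, 1.5))            # identical channels -> 1.0
print(round(likeness_sketch(20.0, 14.0, 3.0), 3))  # 6 dB apart -> 0.317
```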

Although I_L depends on both the average value and the standard deviation of Γ_dB,ex, a first assessment can be seen in Figure 12, which shows I_L versus the difference between Γ_dB,ex of the self-channel (usually the largest value) and Γ_dB,ex in a cross-channel (smaller value). Clearly, as the difference between the two Γ_dB,ex increases, I_L rapidly decreases. The scattering of the values in Figure 12 is due to different standard deviations. A 6-dB difference (i.e., in linear units, Γ_dB,ex of the self-channel is 4 times larger than Γ_dB,ex of a cross-channel) already gives I_L ≈ 0.5, which we can assume as a threshold below which two texts depend very little on each other. We report next the full results concerning the sentences channels and the interpunctions channels.

Figure 12. Scatterplot of the likeness index I_L versus the difference between Γ_dB,ex of the self-channel (large value) and Γ_dB,ex of the cross-channel (smaller value) in the sentences channels (blue circles) and interpunctions channels (red circles), for all texts.

7.1. Sentences Channel

Table 7 reports I_L between the indicated texts in the sentences channels. For example, in the channel Parables → Hattin, I_L = 0.886, while in the reverse channel Hattin → Parables, I_L = 0.384, a large asymmetry. The largest values are in the practically symmetrical channel People ↔ Disciples, with I_L = 0.990 and I_L = 0.968.

Let us discuss the results in more detail. Consider, for example, the channels to Hattin (column Hattin of Table 7). We see that Hattin is very similar to Parables (I_L = 0.886), People (I_L = 0.854) and Disciples (I_L = 0.837), and fairly similar to Clear Water (I_L = 0.690). This means that in every new Hattin simulated in step 2 of the Monte Carlo algorithm of Section 6.1, the regression line between sentences and words is very similar to that of the input text Parables, People, Disciples or Clear Water, so that the theory of Section 2 produces, in the end, these large values of I_L. In other words, Parables, People and Disciples in particular are, with a large confidence measured by I_L, “contained” in Hattin. Notice that the reverse is not true because of the large asymmetry: Parables (I_L = 0.384), People (I_L = 0.584), Disciples (I_L = 0.354) and Clear Water (I_L = 0.181).

Table 7. Likeness index I_L between the indicated texts, sentences channel. The text in the first line indicates the output text; the text in the first column indicates the input text. For example, in the channel Parables → Hattin, I_L = 0.886, while in the channel Hattin → Parables, I_L = 0.384.

Other interesting observations can be made:

1) People contains Disciples and vice versa. The two sets are, practically, the same set of data and can be fused together.

2) Clear Water barely contains Parables but not vice versa. Clear Water does not contain any other text.

3) Jesus says contains Temple but not vice versa. The “modern” Jesus includes the “ancient” Jesus but not vice versa. The ancient Jesus (column Jesus says) does not speak as the “modern” Jesus does.

4) Hattin “contains”, as already mentioned, all extempore sermons/speeches delivered to audiences made of unpredictable listeners (Parables, People and Disciples), but it does not contain Temple or Synagogues. In other words, in these institutional sites Jesus seems to speak differently than at the Horns of Hattin, where he presents his Manifesto [46]. Because Clear Water came before Hattin (see the alleged chronology in [46]), there seems to be a significant change in the oratory and statistical characteristics of the sermons delivered on the two occasions, the latter (Hattin) being the model followed later by the character Jesus on other occasions.

7.2. Symmetry Index

As mentioned above, asymmetry is typical of most linguistic channels. Therefore, it is useful to define a new parameter, the symmetry index IS, linked to the likeness index by the relationship:

I_S = 1 − |I_L,jk − I_L,kj| / (I_L,jk + I_L,kj) (12)

In Equation (12), I_L,jk refers to the channel k → j (e.g., Parables → Hattin) and I_L,kj refers to the reverse channel j → k (e.g., Hattin → Parables).

It can be shown that the symmetry index defined in Equation (12) is bounded in the range 0 ≤ I_S ≤ 1 [56]; I_S = 0 means no symmetry, I_S = 1 means total symmetry.
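Equation (12) is immediate to compute; the sketch below transcribes it directly. As a check, the likeness indices quoted in Section 7.1 for Parables → Hattin (0.886) and Hattin → Parables (0.384) reproduce the I_S = 0.605 reported in Table 8.

```python
def symmetry_index(il_jk: float, il_kj: float) -> float:
    """Symmetry index I_S of Equation (12): one minus the absolute
    difference of the two directional likeness indices, normalized
    by their sum. Equal indices give 1 (total symmetry); if one
    direction's index is 0, the result is 0 (no symmetry)."""
    return 1.0 - abs(il_jk - il_kj) / (il_jk + il_kj)

# Likeness indices quoted in Section 7.1:
# Parables -> Hattin: 0.886; Hattin -> Parables: 0.384
print(round(symmetry_index(0.886, 0.384), 3))  # 0.605, as in Table 8
```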

Table 8 shows this index for all texts. As anticipated, the most symmetrical channel is People ↔ Disciples. The least symmetrical one is Jesus says ↔ Clear Water, therefore confirming that the modern character Jesus speaks differently than the alleged ancient Jesus.

7.3. Interpunctions Channel

Table 9 reports the likeness index I_L between the indicated texts in the interpunctions channels. We can notice that the largest I_L is found in the channel People ↔ Disciples, with I_L = 0.997 and I_L = 0.993, therefore confirming that People contains Disciples and vice versa also in this linguistic channel. In other words, as already observed, the character Jesus does not distinguish the two audiences.

Table 8. Symmetry index I_S between the indicated texts, sentences channel. For example, in the channel Parables ↔ Hattin, I_S = 0.605. The most symmetrical channel is Disciples ↔ People, I_S = 0.989. The most asymmetrical channels are Jesus says ↔ Hattin and Jesus says ↔ Temple, I_S = 0.017.

Table 9. Likeness index I_L between the indicated texts, interpunctions channel. The text in the first line indicates the output text; the text in the first column indicates the input text. For example, in the channel Parables → Hattin, I_L = 0.825, while in the channel Hattin → Parables, I_L = 0.780.

In general, the likeness index of the interpunctions channels is lower than that in the sentences channel. Other observations are:

1) Hattin contains, with decreasing values, Parables (I_L = 0.825), Disciples (I_L = 0.705) and People (I_L = 0.651), but not vice versa.

2) Jesus says contains People (I_L = 0.870) and Disciples (I_L = 0.842), but not vice versa.

Because the interpunctions channel concerns the word interval I_P contained in the same number of sentences, the sermons delivered to different audiences have significantly different sentence lengths, as we can notice in the average values of P_F reported in Table 1. The “fine tuning” due to the linguistic channel describes more clearly the impact of this parameter.

Finally, Table 10 shows the symmetry index I_S of Equation (12) for all texts. Again, the most symmetrical channel is People ↔ Disciples; the least symmetrical one is Jesus says ↔ Temple, therefore confirming that the modern character Jesus speaks differently than the ancient Jesus.

8. Conclusions

We have applied the theory developed in [1] [2] [3] and recalled in Section 2, based on regression lines, to compare how a literary character speaks to different audiences by diversifying and adjusting two important linguistic communication channels, namely the “sentences channel” and the “interpunctions channel”. The theory can “measure” how an author shapes a character speaking to different audiences by modulating mainly deep-language parameters.

To show the power of the theory, we have applied it to the great literary corpus written by Maria Valtorta, an Italian mystic of the XX century. In this voluminous corpus, the character Jesus addresses different audiences (friends, disciples, people) and delivers extempore or planned sermons.

Because the estimates of the slope and the correlation coefficient of a regression line, on which the theory is based, depend on the sample size, we have used a “renormalization” based on Monte Carlo simulations [3], and considered its results concerning the signal-to-noise ratio of the channels as “experimental”.

The likeness index I_L, ranging between 0 and 1, defined in [3] and based on probability theory, makes it possible to “measure” how similar a linguistic communication channel is to another channel, i.e., it measures how much a text can be “mistaken”, mathematically, for another text by studying self- and cross-channels and their signal-to-noise ratios.

Table 10. Symmetry index I_S between the indicated texts, interpunctions channel. For example, in the channel Parables ↔ Hattin, I_S = 0.972. The most symmetrical channel is Disciples ↔ People, I_S = 0.998. The most asymmetrical channel is Jesus says ↔ Temple, I_S = 0.011.

Although I_L depends on both the average value and the standard deviation of the experimental signal-to-noise ratio Γ_dB,ex, a first assessment is given by the difference between Γ_dB,ex of the self-channel (usually the largest value) and Γ_dB,ex in a cross-channel (smaller value). As this difference increases, I_L rapidly decreases. A 6-dB difference already gives I_L ≈ 0.5, which can be assumed as a threshold below which two texts depend very little on each other.

As discussed in [2] [3], asymmetry is typical of most linguistic channels. The symmetry index I_S defined in the paper ranges between 0 and 1. In very few channels I_S ≈ 1, indicating that the character Jesus addresses the two audiences as if they were indistinguishable. In most channels I_S is well below 1, indicating that Jesus addresses the two audiences quite differently.

In conclusion, multiple linguistic channels can describe the “fine tuning” that a literary author can use to distinguish characters or the same character in different situations, as Maria Valtorta did. Of course, a similar approach can be used to study any literary corpus written in an alphabetical language.

Appendix A

In this Appendix we report the full data bank of the experimental (Monte Carlo) averages and standard deviations of Γ_dB,ex in the indicated cross-channels, obtained after 5000 simulations in the sentences channels. The standard deviations are shown in parentheses. The average values and standard deviations of m_jk and r_jk refer to the calculated regression lines between the number of sentences in the output text indicated in the table caption (dependent variable) and the number of sentences in the input texts indicated in column 1 (independent variable). Correlation coefficients are reported with 4 decimal digits because some values differ only from the third digit (Tables A1-A7).

Table A1. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Clear Water as output text. The standard deviations are shown in parentheses.

Table A2. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Temple as output text. The standard deviations are shown in parentheses.

Table A3. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Parables as output text. The standard deviations are shown in parentheses.

Table A4. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Disciples as output text. The standard deviations are shown in parentheses.

Table A5. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Synagogues as output text. The standard deviations are shown in parentheses.

Table A6. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming People as output text. The standard deviations are shown in parentheses.

Table A7. Sentences channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Jesus says as output text. The standard deviations are shown in parentheses.

Appendix B

In this Appendix we report the full data bank of the experimental (Monte Carlo) averages and standard deviations of Γ_dB,ex in the indicated cross-channels, obtained after 5000 simulations in the interpunctions channels. The standard deviations are shown in parentheses. The average values and standard deviations of m_jk and r_jk refer to the calculated regression lines between the number of interpunctions in the output text indicated in the table caption (dependent variable) and the number of interpunctions in the input texts indicated in column 1 (independent variable). Correlation coefficients are reported with 4 decimal digits because some values differ only from the third digit (Tables B1-B7).

Table B1. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Clear Water as output text. The standard deviations are shown in parentheses.

Table B2. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Temple as output text. The standard deviations are shown in parentheses.

Table B3. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Parables as output text. The standard deviations are shown in parentheses.

Table B4. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Disciples as output text. The standard deviations are shown in parentheses.

Table B5. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Synagogues as output text. The standard deviations are shown in parentheses.

Table B6. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming People as output text. The standard deviations are shown in parentheses.

Table B7. Interpunctions channel. Experimental (Monte Carlo) average and standard deviation of Γ_dB,ex (dB), average and standard deviation of the slope m_jk and correlation coefficient r_jk of the texts indicated in column 1 (input), obtained by assuming Jesus says as output text. The standard deviations are shown in parentheses.

Conflicts of Interest

The author declares no conflicts of interest.

References

[1] Matricciani, E. (2019) Deep Language Statistics of Italian throughout Seven Centuries of Literature and Empirical Connections with Miller’s 7 ∓ 2 Law and Short-Term Memory. Open Journal of Statistics, 9, 373-406.
https://doi.org/10.4236/ojs.2019.93026
[2] Matricciani, E. (2020) A Statistical Theory of Language Translation Based on Communication Theory. Open Journal of Statistics, 10, 936-997.
https://doi.org/10.4236/ojs.2020.106055
[3] Matricciani, E. (2022) Linguistic Mathematical Relationships Saved or Lost in Translating Texts: Extension of the Statistical Theory of Translation and Its Application to the New Testament. Information, 13, Article No. 20.
https://doi.org/10.3390/info13010020
[4] Catford, J.C. (1965) A Linguistic Theory of Translation. An Essay in Applied Linguistics. Oxford University Press, Oxford.
[5] Munday, J. (2008) Introducing Translation Studies: Theories and Applications, 2nd Edition, Routledge, London.
[6] Proshina, Z. (2008) Theory of Translation, 3rd Edition, Far Eastern University Press, Manila.
[7] Trosberg, A. (2000) Discourse Analysis as Part of Translator Training. Current Issues in Language and Society, 7, 185-228.
https://doi.org/10.1080/13520520009615581
[8] Tymoczko, M. (1999) Translation in a Post-Colonial Context: Early Irish Literature in English Translation. St Jerome, Manchester.
[9] Warren, R. (Ed.) (1989) The Art of Translation: Voices from the Field. North-Eastern University Press, Boston.
[10] Williams, I. (2007) A Corpus-Based Study of the Verb Observar in English-Spanish Translations of Biomedical Research Articles. Target, 19, 85-103.
https://doi.org/10.1075/target.19.1.06wil
[11] Wilss, W. (1996) Knowledge and Skills in Translator Behaviour. John Benjamins, Amsterdam and Philadelphia.
https://doi.org/10.1075/btl.15
[12] Wolf, M. and Fukari, A. (Eds.) (2007) Constructing a Sociology of Translation. John Benjamins, Amsterdam and Philadelphia.
https://doi.org/10.1075/btl.74
[13] Gamallo, P., Pichel, J.R. and Alegria, I. (2020) Measuring Language Distance of Isolated European Languages. Information, 11, Article No. 181.
https://doi.org/10.3390/info11040181
[14] Barbançon, F., Evans, S., Nakhleh, L., Ringe, D. and Warnow, T. (2013) An Experimental Study Comparing Linguistic Phylogenetic Reconstruction Methods. Diachronica, 30, 143-170.
https://doi.org/10.1075/dia.30.2.01bar
[15] Bakker, D., Muller, A., Velupillai, V., Wichmann, S., Brown, C.H., Brown, P., Egorov, D., Mailhammer, R., Grant, A. and Holman, E.W. (2009) Adding Typology to Lexicostatistics: A Combined Approach to Language Classification. Linguistic Typology, 13, 169-181.
https://doi.org/10.1515/LITY.2009.009
[16] Petroni, F. and Serva, M. (2010) Measures of Lexical Distance between Languages. Physica A: Statistical Mechanics and Its Applications, 389, 2280-2283.
https://doi.org/10.1016/j.physa.2010.02.004
[17] Carling, G., Larsson, F., Cathcart, C., Johansson, N., Holmer, A., Round, E. and Verhoeven, R. (2018) Diachronic Atlas of Comparative Linguistics (DiACL)—A Database for Ancient Language Typology. PLOS ONE, 13, Article ID: e0205313.
https://doi.org/10.1371/journal.pone.0205313
[18] Gao, Y., Liang, W., Shi, Y. and Huang, Q. (2014) Comparison of Directed and Weighted Co-Occurrence Networks of Six Languages. Physica A: Statistical Mechanics and its Applications, 393, 579-589.
https://doi.org/10.1016/j.physa.2013.08.075
[19] Liu, H. and Cong, J. (2013) Language Clustering with Word Co-Occurrence Networks Based on Parallel Texts. Chinese Science Bulletin, 58, 1139-1144.
https://doi.org/10.1007/s11434-013-5711-8
[20] Gamallo, P., Pichel, J.R. and Alegria, I. (2017) From Language Identification to Language Distance. Physica A: Statistical Mechanics and its Applications, 484, 152-162.
https://doi.org/10.1016/j.physa.2017.05.011
[21] Campos, J., Otero, P. and Loinaz, I. (2020) Measuring Diachronic Language Distance Using Perplexity: Application to English, Portuguese, and Spanish. Natural Language Engineering, 26, 433-454.
[22] Eder, M. (2015) Visualization in Stylometry: Cluster Analysis Using Networks. Digital Scholarship in the Humanities, 32, 50-64.
https://doi.org/10.1093/llc/fqv061
[23] Brown, P.F., Cocke, J., Della Pietra, A., Della Pietra, V.J., Jelinek, F., Lafferty, J.D., Mercer, R.L. and Roossin, P.S. (1990) A Statistical Approach to Machine Translation. Computational Linguistics, 16, 79-85.
[24] Koehn, P., Och, F.J. and Marcu, D. (2003) Statistical Phrase-Based Translation. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Edmonton, 27 May-1 June 2003, 48-54.
https://doi.org/10.3115/1073445.1073462
[25] Carl, M. and Schaeffer, M. (2017) Sketch of a Noisy Channel Model for the Translation Process. In: Hansen-Schirra, S., Czulo, O. and Hofmann, S., Eds., Empirical Modelling of Translation and Interpreting, Language Science Press, Berlin, 71-116.
[26] Shannon, C.E. (1948) A Mathematical Theory of Communication. The Bell System Technical Journal, 27, 379-423, 623-656.
https://doi.org/10.1002/j.1538-7305.1948.tb00917.x
[27] Elmakias, I. and Vilenchik, D. (2021) An Oblivious Approach to Machine Translation Quality Estimation. Mathematics, 9, Article No. 2090.
https://doi.org/10.3390/math9172090
[28] Lavie, A. and Agarwal, A. (2007) Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments. Proceedings of the Second Workshop on Statistical Machine Translation, Prague, 23 June 2007, 228-231.
https://doi.org/10.3115/1626355.1626389
[29] Banchs, R. and Li, H. (2011) AM-FM: A Semantic Framework for Translation Quality Assessment. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Vol. 2, Portland, 19-24 June 2011, 153-158.
[30] Forcada, M., Ginestí-Rosell, M., Nordfalk, J., O’Regan, J., Ortiz-Rojas, S., Pérez-Ortiz, J., Sánchez-Martínez, F., Ramírez-Sánchez, G. and Tyers, F. (2011) Apertium: A Free/Open-Source Platform for Rule-Based Machine Translation. Machine Translation, 25, 127-144.
https://doi.org/10.1007/s10590-011-9090-0
[31] Buck, C. (2012) Black Box Features for the WMT 2012 Quality Estimation Shared Task. Proceedings of the 7th Workshop on Statistical Machine Translation, Montreal, 7-8 June 2012, 91-95.
[32] Assaf, D., Newman, Y., Choen, Y., Argamon, S., Howard, N., Last, M., Frieder, O. and Koppel, M. (2013) Why “Dark Thoughts” Aren’t Really Dark: A Novel Algorithm for Metaphor Identification. Proceedings of the 2013 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain, Singapore, 16-19 April 2013, 60-65.
https://doi.org/10.1109/CCMB.2013.6609166
[33] Graham, Y. (2015) Improving Evaluation of Machine Translation Quality Estimation. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, 26-31 July 2015, 1804-1813.
https://doi.org/10.3115/v1/P15-1174
[34] Espla-Gomis, M., Sanchez-Martinez, F. and Forcada, M.L. (2015) UAlacant Word-Level Machine Translation Quality Estimation System at WMT 2015. Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisboa, 17-18 September 2015, 309-315.
https://doi.org/10.18653/v1/W15-3036
[35] Costa-jussà, M.R. and Fonollosa, J.A. (2015) Latest Trends in Hybrid Machine Translation and Its Applications. Computer Speech & Language, 32, 3-10.
https://doi.org/10.1016/j.csl.2014.11.001
[36] Kreutzer, J., Schamoni, S. and Riezler, S. (2015) QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation. Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisboa, 17-18 September 2015, 316-322.
https://doi.org/10.18653/v1/W15-3037
[37] Specia, L., Paetzold, G. and Scarton, C. (2015) Multi-level Translation Quality Prediction with QuEst++. Proceedings of the ACL-IJCNLP 2015 System Demonstrations, Beijing, 26-31 July 2015, 115-120.
https://doi.org/10.3115/v1/P15-4020
[38] Banchs, R.E., D’Haro, L.F. and Li, H. (2015) Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23, 472-482.
https://doi.org/10.1109/TASLP.2015.2405751
[39] Martins, A.F.T., Junczys-Dowmunt, M., Kepler, F.N., Astudillo, R., Hokamp, C., Grundkiewicz, R. (2017) Pushing the Limits of Quality Estimation. Transactions of the Association for Computational Linguistics, 5, 205-218.
https://doi.org/10.1162/tacl_a_00056
[40] Kim, H., Jung, H.Y., Kwon, H., Lee, J.H. and Na, S.H. (2017) Predictor-Estimator: Neural Quality Estimation Based on Target Word Prediction for Machine Translation. ACM Transactions on Asian and Low-Resource Language Information Processing, 17, Article No. 3.
https://doi.org/10.1145/3109480
[41] Kepler, F., Trénous, J., Treviso, M., Vera, M. and Martins, A.F.T. (2019) OpenKiwi: An Open Source Framework for Quality Estimation. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Florence, 28 July-2 August 2019, 117-122.
https://doi.org/10.18653/v1/P19-3020
[42] D’Haro, L., Banchs, R., Hori, C. and Li, H. (2018) Automatic Evaluation of End-to-End Dialog Systems with Adequacy-Fluency Metrics. Computer Speech & Language, 55, 200-215.
https://doi.org/10.1016/j.csl.2018.12.004
[43] Yankovskaya, E., Tättar, A. and Fishel, M. (2018) Quality Estimation with Force-Decoded Attention and Cross-lingual Embeddings. Proceedings of the Third Conference on Machine Translation: Shared Task Papers, Belgium, 31 October-1 November 2018, 816-821.
https://doi.org/10.18653/v1/W18-6466
[44] Yankovskaya, E., Tättar, A. and Fishel, M. (2019) Quality Estimation and Translation Metrics via Pre-Trained Word and Sentence Embeddings. Proceedings of the Fourth Conference on Machine Translation, Florence, 1-2 August 2019, 101-105.
https://doi.org/10.18653/v1/W19-5410
[45] Matricciani, E. and De Caro, L. (2017) Literary Fiction or Ancient Astronomical and Meteorological Observations in the Work of Maria Valtorta? Religions, 8, Article No. 110.
https://doi.org/10.3390/rel8060110
[46] Matricciani, E. and De Caro, L. (2020) Jesus Christ’s Speeches in Maria Valtorta’s Mystical Writings: Setting, Topics, Duration and Deep-Language Mathematical Analysis. J, 3, 100-123.
https://doi.org/10.3390/j3010010
[47] De Caro, L., La Greca, F. and Matricciani, E. (2020) Saint Peter’s First Burial Site According to Maria Valtorta’s Mystical Writings, Checked against the Archeology of Rome in the I Century. J, 3, 366-400.
https://doi.org/10.3390/j3040029
[48] Matricciani, E. (2022) The Temporal Making of a Great Literary Corpus by a XX-Century Mystic: Statistics of Daily Words and Writing Time. Open Journal of Statistics, 12, 155-167.
https://doi.org/10.4236/ojs.2022.122010
[49] Lindgren, B.W. (1968) Statistical Theory. 2nd Edition, MacMillan Company, New York.
[50] Valtorta, M. (2001) Il Vangelo come mi è stato rivelato. Centro Editoriale Valtortiano, Isola del Liri.
[51] La Greca, F. (2019) Gesù e il mondo greco-romano nell’Opera di Maria Valtorta. Centro Editoriale Valtortiano, Isola del Liri.
[52] De Caro, L., La Greca, F. and Matricciani, E. (2021) Hidden and Coherent Chronology of Jesus’ Life in the Literary Work of Maria Valtorta. SCIREA Journal of Sociology, 5, 477-529.
https://doi.org/10.54647/sociology84718
[53] Pisani, E. (2010) Catalogo dei quaderni autografi di Maria Valtorta. Centro Editoriale Valtortiano, Isola del Liri.
[54] Matricciani, E. and De Caro, L. (2018) A Mathematical Analysis of Maria Valtorta’s Mystical Writings. Religions, 9, Article No. 373.
https://doi.org/10.3390/rel9110373
[55] Papoulis, A. (1990) Probability & Statistics. Prentice Hall, Hoboken.
[56] Abramowitz, M. and Stegun, I.A. (1972) Handbook of Mathematical Functions. 9th Edition, Dover Publications, New York, 11.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.