CONCLUSION: THE DNA OF THE COMMEDIA
For the benefit of textual scholars who may not be au fait with molecular biology and the terminology of genes and DNA, I will attempt here to offer a fuller account of the ways in which identical forces operate in the two fields of genetic replication (the mechanism by which life is created through successive generations) and scribal copying (the mechanism by which texts were disseminated before the invention of printing), and show why computer programmes devised to analyse data in the first lend themselves to the same task in the second.
DNA and a literary text both consist of linear information conveyed by an alphabet: in this structural sense they are identical rather than merely analogous. DNA is text, a sequence or string of information conveyed by what is for all intents and purposes an alphabet of four letters; that information string is replicated (copied) in procreation. A gene is a text (a section of the DNA sequence); it can be read by a molecular biologist and conveys meaning; substitutions or omissions can leave the meaning unchanged, or alter the meaning of the text, or render it meaningless. (See below.)
The textual, linear nature of the DNA message is of course independent of the notational system. It happens that the letters of the DNA alphabet used by geneticists to label nucleotides are by convention letters from the Roman alphabet, but any form of notation (symbols, even colours) would serve equally well to convey the informational content. There is a true homology here which operates at two levels: a) between the verbal text (of the Commedia in this case) and the DNA text (written in bases on a molecule); and b) between DNA replication and scribal copying, i.e. descent with variation in both cases. The crucial structural element is linearity, and sequence is what conveys the meaning.
Genes are texts composed of words (technically, codons). Each word or codon is three letters long, each letter representing a nucleotide: this is what is known as the triplet genetic code. (It would surely have delighted Dante, had he known it, that life itself is informed by a pattern of three-in-oneness, just as the poem he wrote and the metrical scheme he devised to write it embody that same pattern of three-in-one, which itself reflects the triune god in whose image he believed the world to have been created.) Each word in the genetic text can have (does have) spelling variants which do not affect meaning.
The words (or codons, or triplets) which make up a gene are composed of any permutation or combination of three of the four letters A C G and T. These letters designate the nucleotides: Adenine, Cytosine, Guanine, Thymine. Thus, for example, we might have a gene which reads ATG.AAT.TCG.GGC.…..
Codons specify amino acids, the building blocks of proteins. Thus the nucleotide triplet TTA codes for leucine; the triplet CAT codes for histidine; the triplet GGG codes for glycine, and so on. The order of the amino acids in the protein specifies its structure and thereby its function. Thus the order (sequence) of codons in the gene is translated into the specific function of a protein – the ‘meaning’ of the gene text. A chromosome is a string of genes – a library of books, say 1000 books – with each gene a text in the sequence.
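For readers who find a worked example helpful, the reading of a gene as a sequence of three-letter words can be sketched in a few lines of Python. This is an illustration only: the table below is a deliberately partial extract from the standard genetic code, and the function names (split_codons, translate) are my own and belong to no existing programme.

```python
# A partial codon table: only the triplets needed for this illustration,
# taken from the standard genetic code (DNA notation, three-letter
# amino-acid abbreviations).
CODON_TABLE = {
    "ATG": "Met",  # methionine
    "AAT": "Asn",  # asparagine
    "TCG": "Ser",  # serine
    "GGC": "Gly",  # glycine
    "GGG": "Gly",  # glycine (a 'spelling variant' of GGC)
    "TTA": "Leu",  # leucine
}


def split_codons(dna):
    """Cut a DNA string into consecutive three-letter words.

    Any incomplete word left over at the end is dropped.
    """
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Return the amino acid for each codon, or '?' if it is not in the table."""
    return [CODON_TABLE.get(codon, "?") for codon in split_codons(dna)]


# The opening of the sample gene quoted above: ATG.AAT.TCG.GGC
print(split_codons("ATGAATTCGGGC"))  # ['ATG', 'AAT', 'TCG', 'GGC']
print(translate("ATGAATTCGGGC"))     # ['Met', 'Asn', 'Ser', 'Gly']
```

The point of the sketch is simply that the 'meaning' is recovered by reading the string in order, three letters at a time.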
Genetic replication (the copying of genes) is subject to mutation, i.e. genetic change. Change can occur within any one or more of the sequences of three letters which make up the codons of which the gene is formed.
Drift occurs when there is a change in the ‘spelling’ of the codon (so AAT might become AAC, both of which code for asparagine) but there is no change in the protein coded for, i.e. no change in the significance of the codon. The sense of the gene remains the same. This is the exact equivalent of a spelling or formal variant in a verbal text: a small change with no effect on meaning (so a scribe might write abysso instead of abisso, with no change in the sense of the word or the phrase which contains it; he might write de lo or dello, again with no change in meaning). So in both cases – DNA and literary text – this kind of copying error makes no change to meaning. Drift is the name for ‘silent’ mutations within the gene. The textual scholar talks of spelling and formal variants.
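The distinction between a silent change and a significant one can be made concrete with a small sketch. The codon assignments below come from the standard genetic code; the function name classify_substitution and the labels it returns are hypothetical, chosen only for this illustration.

```python
# A partial codon table (standard genetic code), limited to the
# triplets used in the examples below.
CODON_TABLE = {
    "AAT": "Asn",  # asparagine
    "AAC": "Asn",  # asparagine, alternative 'spelling'
    "GGC": "Gly",  # glycine
    "GGG": "Gly",  # glycine, alternative 'spelling'
    "GAT": "Asp",  # aspartic acid
}


def classify_substitution(original, copy):
    """Compare a codon with its copy and say whether the sense survives."""
    if original == copy:
        return "identical"
    if CODON_TABLE[original] == CODON_TABLE[copy]:
        return "silent"           # spelling differs, amino acid does not
    return "meaning changed"      # a different amino acid is specified


print(classify_substitution("AAT", "AAC"))  # silent
print(classify_substitution("GGC", "GGG"))  # silent
print(classify_substitution("AAT", "GAT"))  # meaning changed
```

A textual scholar performs the same classification when deciding that abysso and abisso are the same word differently spelled.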
More significant mutations come in various guises. There are mutations which make sense but whose sense is altered from the original meaning, though perhaps only slightly; and there are mutations which substantially alter the sense of the ensuing text. The text may still make sense but mean something quite different; but equally it may not make any sense at all, the change turning the sense to nonsense. These mutations may be substitutions (as with drift, but now with significant consequences); or they may be omissions or insertions.
We can illustrate these various kinds and degrees of mutation with textual examples involving substitution: a scribe might substitute the word viso for volto: the word fits grammatically (and in the case of a poetic text, metrically) and there is no change in meaning. This is a variante di lettura, a variant reading: neither variant is self-evidently right or wrong. Or the scribe might substitute the word corpo for volto: the word still fits grammatically (and metrically), but the meaning is altered: it may make sense in context, but equally it may no longer make sense. Or again he might substitute voglio for volto: the substitution of a verb for a noun no longer fits grammatically and the phrase will almost certainly no longer make any kind of sense.
There are some differences between DNA text and verbal text, but they are not relevant when thinking about the replication process. Large portions of the DNA text are ‘junk’ (insignificant in evolutionary terms): verbal texts have no equivalent for ‘junk’. Because of this, with DNA there is a reading frame: the geneticist must start reading at the right point so that the sequence of triplets is meaningful. If one starts reading at the wrong point, there is no significant pattern of codons, and one fails to identify or pick out the gene (to discover a gene is precisely to identify a meaningful stretch of DNA, to read the text correctly). There are therefore DNA sequences that have framing, structural and regulatory significance (compare covers, frontispiece, blurb, spine, index of a book), but no sense implications as regards the main text.
With the addition or omission of nucleotides in a DNA sequence, the result is likely to be nonsense simply because the reading frame is lost. With verbal texts the effects of omissions and additions will depend on various factors: on size, on context, on whether the structure remains grammatically intact, as it will for example if the lost word is an adverb, but probably will not if it is a verb. But often the result will be nonsense, and an editor is alerted to the possibility of omission precisely because the text at a given point fails to yield a satisfactory sense.
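The loss of the reading frame is easily demonstrated. In the sketch below (function names are my own, and the sequence is the opening of the sample gene used earlier), replacing one letter disturbs a single word, whereas dropping one letter disturbs every word that follows it.

```python
def split_codons(dna):
    """Cut a DNA string into three-letter words, dropping an incomplete tail."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def words_disturbed(original, copy):
    """Count codons that differ position by position, plus any lost or gained."""
    a, b = split_codons(original), split_codons(copy)
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return differing + abs(len(a) - len(b))


original = "ATGAATTCGGGC"                          # ATG.AAT.TCG.GGC
substituted = original[:3] + "G" + original[4:]    # fourth letter A becomes G
deleted = original[:3] + original[4:]              # fourth letter omitted

print(split_codons(substituted))  # ['ATG', 'GAT', 'TCG', 'GGC']
print(split_codons(deleted))      # ['ATG', 'ATT', 'CGG']
print(words_disturbed(original, substituted))  # 1
print(words_disturbed(original, deleted))      # 3
```

In a verbal text, by contrast, word boundaries survive an omission, which is why the loss of an adverb may leave the sentence intact.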
The two processes we are considering – genetic replication effected by biological systems and scribal copying effected by human agency – have inherent sources of error which are strictly analogous, but which are less intuitively apparent than the obvious parallels outlined up to this point.
i. DNA has inherent slippage or, more precisely, replicability, i.e. there are inherent qualities in the text that interfere with accurate transmission. This is equivalent to eyeskip (saut du même au même, salto per omoioteleuto) in a verbal text; but whereas eyeskip usually generates an omission, as the eye typically slides from one word to the same word a line or two below, in genetic replication the slippage tends to be in the opposite direction (back up the page, as it were), with replication rather than omission being the outcome. (This is not unknown in textual transmission, but much rarer: in the textual tradition of Dante’s Monarchia, for example, which survives in twenty manuscripts, there are just one or two cases of replication caused by eyeskip but hundreds of cases of omission.) Repeating elements in the DNA become more numerous with time, and long repetitive sequences are common in ‘junk’.
ii. There can be a particular stretch of DNA which is infective and mobile. In text terms, the equivalent phenomenon is resonance and its effect on scribal memory. A scribe remembers a resonant phrase, and introduces it in place of a somewhat similar but not identical phrase elsewhere in the text. Scribal memory creates a transposable element and moves text about, as when various copyists of the Commedia at Inf. vii 11 replace the phrase vuolsi ne l’alto là with vuolsi così colà famously enunciated on two earlier occasions in the poem (Inf. iii 95 and Inf. v 23).
iii. There is a DNA equivalent of contamination (the scourge of textual editors) in the form of lateral gene transfer: bits of text are moved by viruses between organisms which are not closely related enough to have that information in common by heredity. This is exactly what happens when variants are introduced by lateral transmission in a contaminated manuscript tradition.
iv. Genetic recombination creates a hybrid text in much the same way as a copyist switching exemplar halfway through the transcription process creates a hybrid text. Indeed the creation of a new living creature is the creation of a hybrid text (although clearly in an infinitely more intricate and complicated way than the simple switch of exemplar for a scribe).
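The first of these sources of error, eyeskip and its mirror image, can be modelled very simply. The sketch below uses an invented English sentence rather than any attested reading, and the function names are hypothetical; it assumes only that the copyist's eye jumps between two occurrences of the same word, forwards in one case and backwards in the other.

```python
def occurrences(words, key):
    """Return the positions of the first two occurrences of key."""
    first = words.index(key)
    second = words.index(key, first + 1)
    return first, second


def eyeskip_omission(words, key):
    """The eye jumps forward from the first occurrence to the second:
    everything in between is lost."""
    first, second = occurrences(words, key)
    return words[:first] + words[second:]


def eyeskip_repetition(words, key):
    """The eye jumps back from the second occurrence to the first:
    the intervening passage is copied twice."""
    first, second = occurrences(words, key)
    return words[:second] + words[first:]


# An invented exemplar, for illustration only.
exemplar = "the scribe copies the line and then copies the next line too".split()

print(" ".join(eyeskip_omission(exemplar, "copies")))
# the scribe copies the next line too
print(" ".join(eyeskip_repetition(exemplar, "copies")))
# the scribe copies the line and then copies the line and then copies the next line too
```

The same repeated element triggers both outcomes; what differs is the direction of the slip, forward in the typical scribal case and backward in the typical genetic one.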
A final point. Historically, evolutionary biologists have had a problem with convergence (when two species independently develop the same morphology through genetic mutation). Textual scholars are dealing with a similar phenomenon when they are confronted with polygenetic error (also known as convergent error): a change in the text which may arise independently in unrelated copies and which cannot therefore be used as proof of descent from a common ancestor. Although convergence operates for biologists at the level of gross morphology, and for textual scholars at the level of text, the parallel is striking. The ‘spectre of convergence’ is as problematic for evolutionary biologists as contamination is for textual critics.
Genetic replication and scribal transmission are dissimilar only in the value placed on their outcomes or end-products. In the animal kingdom genetic replication is the engine of evolution, which is often thought of in terms of progress, and is at the very least morally neutral: it does not involve value judgments. Scribal copying over many generations commonly involves degradation and loss of quality as the author’s original is eroded in the course of transmission. The main drive of the textual scholar is recovery: to move backwards to the lost original and to reconstruct the text from which the later imperfect copies with their various mutations descend. This is the only significant difference between the two processes, and this difference does not involve the mechanism of change or possible ways of analysing it.
If we look at the history of phylogeny and cladistics, the discipline was transformed by the discovery of the structure of DNA in 1953, and has continued to be transformed by advances in the understanding of genetics since that time. Previously, living creatures were assigned to phyla based on gross morphology: biologists have always had a wealth of gross morphological features to examine and analyse. Dragon-flies, birds and bats all have wings: do they have a common winged ancestor? (The answer of course is no: this is a classic example of convergent evolution.) Traditionally biologists looked at the meta-level rather than at the text itself, because the text was not available. Now that DNA analysis is possible, they are using the textual level to check and verify hypotheses elaborated on the basis of gross morphological features. There have been striking case-histories of phylogenetic reassignment after long dispute based on gross morphology.
Take the case of the marsupial wolf (thylacine), extinct in Australia for some time. Zoologists trying to classify it from morphology argued over whether it was more closely related to (i) extinct carnivorous marsupials in South America that came to Australia when the two continents were geographically connected, or (ii) carnivorous marsupials in Australia (such as the Tasmanian Devil) which happened to evolve to look rather like the South American ones. (The thylacine and the South American borhyaenids uniquely share certain dental and pelvic traits.) Eventually, DNA was extracted from museum specimens and sequence data obtained: this showed that the correct explanation was the second one.
This is precisely the kind of problem faced by textual scholars analysing the transmission history of the Commedia. In the electronic Commedia project we have tackled a specific instance of disagreement about the assigning of an individual to a particular branch of a tree. We have two rival hypotheses about the relationship of ms. Rb to other early surviving manuscripts. Petrocchi and Sanguineti elaborated their respective hypotheses on the basis of their judgment of the significance of certain features in surviving copies of the poem, Petrocchi concluding that ms. Rb belongs in the β family while Sanguineti believes it to be a member of the α family. We have noted how small the number of readings is on which Sanguineti bases his stemmatic hypothesis: basing a cladistic hypothesis on a single variant or small group of variants is always problematic, whether in biology or in textual studies. Depending on the choice of features highlighted, and the significance attached to them, persuasive arguments can be made for very different hypotheses. This is quintessentially an exercise of iudicium, of fallible human judgment. The new genetic science forgets about morphological features in biology and just takes the DNA text, with sometimes surprising but always conclusive results.
The preceding pages have established a parallel between the two copying systems of genetic replication and manuscript transmission, and pointed to the very significant ways in which descent with variation underlies these two apparently unrelated areas of scientific investigation. It has been shown conclusively in biology over the last few decades that the best (most constructive) approach is the one which considers all the data. Until the possibility of DNA analysis became a reality, phylogenetic trees were tugged about for years as some people argued that certain features were of decisive importance, only to be contradicted by others who highlighted other features and made an equally persuasive case. It has been found that to get the right answer one needs to plug in all the data. Any form of selection or weighting of the data involves the operation of subjective human judgment, meaning different people will produce different results, and the disagreement between them will be unresolvable.
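What it means to consider all the data, without selection or weighting, can be suggested by a toy calculation. The three witnesses below (W1, W2, W3) and the readings assigned to them are invented for the purpose, borrowing word-forms mentioned earlier in this essay; they do not represent any real manuscript. The sketch merely counts every point of disagreement between each pair of witnesses. It is not the method used by phylogenetic programmes, which apply far more sophisticated analyses to data of this kind.

```python
from itertools import combinations

# Invented witnesses and readings, for illustration only: one reading
# per witness at each of three points of variation.
witnesses = {
    "W1": ["volto", "abisso", "de lo"],
    "W2": ["viso", "abisso", "dello"],
    "W3": ["viso", "abysso", "dello"],
}


def disagreements(a, b):
    """Count the points of variation at which two witnesses differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_table(ws):
    """Compare every pair of witnesses, using every reading, unweighted."""
    return {
        (p, q): disagreements(ws[p], ws[q])
        for p, q in combinations(sorted(ws), 2)
    }


for pair, count in distance_table(witnesses).items():
    print(pair, count)
# ('W1', 'W2') 2
# ('W1', 'W3') 3
# ('W2', 'W3') 1
```

No reading has been privileged and none discarded, so two scholars running the same calculation on the same collation must arrive at the same figures.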
For textual scholars to use the programmes designed by evolutionary biologists is to piggy-back on a huge body of established research, and apply it in a new area. It is to be hoped that more textual scholars will feel able to adopt this new approach in the coming years, not in place of tried and tested philological methods which remain valid, indeed indispensable, in so many areas involved in the production of critical editions of medieval texts, but as a supplementary methodology able to resolve disputes about manuscript relationships where traditional means have proved unable to do so.
 Richard H. Thomas, Walter Schaffner, Allan C. Wilson & Svante Pääbo, DNA phylogeny of the extinct marsupial wolf, in Nature, vol. 340, 10th August 1989, 465-67.