Methodology and practice

This document explains how the online HTML texts, intended for use with a plain Web browser, are generated from the TEI master copies.

Representation of the markup

HTML cannot adequately represent all the nuances of markup encoded in the original copies. It is therefore necessary to make some typographical judgments about how best to retain as much of the original value as possible without overloading the display.

To see the original TEI file with all the markup preserved, you can use a real SGML browser such as Panorama or MultiDoc Pro, for which stylesheets and resource files have been developed. In the meantime, the following representations have been adopted:

Abbreviations (<abbr>)
Additions (short) (<add>)
Additions (long) (<addspan>)
The Section symbol in curly braces {§} marks the start of a longer addition (one that spans other markup). The end of the addition is marked by another Section symbol.
Additional names (<an>)
Unmarked, but names in general are in bold.
Apparatus criticus (<app>)
Parallel readings are hyperlinked. Where a lemma is given, this is highlighted, and clicking on it reveals the readings. Where no lemma was given, the last reading is given in the text, and a Section symbol (§) is highlighted, providing access to the other readings.
Corrections (<corr>)
Deletions (<del>)
Expansions (<ex>)
Forenames (<fn>)
Unmarked, but names in general are in bold
Foreign (<frn>)
Gaps (<gap>)
[Reasons italicised in brackets]
Group names (<gn>)
Unmarked, but names in general are in bold
Scribal hands (<hand> and <handshift>)
If identified, the identity is in bold; the name of the hand is enclosed in [brackets] followed by the name of the scribe, if known.
Highlighting (<hi>)
Blinking colour has been used to highlight examples of this encoding.
Linebreaks (<LB>)
The line number starts the line, followed by a single closing bracket: ]
Lemmata (<lem>)
See Apparatus criticus
Line groups (<lg>)
Stanzas are numbered where appropriate. Lines within stanzas are numbered only if encoded with exogenous numbering
Milestones (<mls>)
Foliation is given in {curly braces}
Name links (<nk>)
Particles are not separately highlighted
Numbers (<num>)
Numbers are not given any typographic distinction
Organisation names (<on>)
Page breaks (<pb>)
Horizontal line across the screen, page number in small type on the right.
Personal names (<ps>)
Placenames (<pn>)
Readings (<rdg>)
See Apparatus criticus
Regularisation (<reg>)
Role names (<rn>)
Unmarked, but names in general are in bold
Sic (<sic>)
Surnames (<sn>)
Unmarked, but names in general are in bold
Supplied (<sup>)
Special terms (<term>)
Unclear (<uncl>)
Text is enclosed in (parentheses).
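
As a rough illustration of how a few of the purely textual conventions in the list above could be applied, the following Python sketch renders gaps, unclear text, milestones, line breaks, numbers and name links. The function name render_element, its parameters, and the use of <i> for italics are assumptions made for this example only; they do not describe the software actually used to generate the online texts.

```python
from html import escape


def render_element(tag, text="", reason="", line_no=None):
    """Return an HTML fragment for one encoded element (hypothetical helper)."""
    tag = tag.lower()  # SGML element names are treated case-insensitively here
    if tag == "gap":
        # Gaps: the reason, italicised, inside square brackets.
        return "[<i>" + escape(reason) + "</i>]"
    if tag == "uncl":
        # Unclear text is enclosed in parentheses.
        return "(" + escape(text) + ")"
    if tag == "mls":
        # Milestones: foliation is given in curly braces.
        return "{" + escape(text) + "}"
    if tag == "lb":
        # Line breaks: the line number, then a single closing bracket.
        return str(line_no) + "]" if line_no is not None else ""
    if tag in ("num", "nk"):
        # Numbers and name-link particles get no typographic distinction.
        return escape(text)
    # Every other element is outside the scope of this sketch.
    raise ValueError("no rule in this sketch for <" + tag + ">")


print(render_element("gap", reason="illegible"))
print(render_element("mls", "fo. 12a"))
```

Elements whose representation is purely typographic (bold, colour, hyperlinks) are deliberately left out, since the exact HTML used for them is not stated here.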

Character representation

Length marks

The ISO 8859-1 (Latin-1) character set is used by default, but this handles only one length mark, the fada or acute, and only over vowels. When XML software becomes available, Unicode (ISO 10646) will be used.

Editorial length marks given as macrons in the source texts are shown in the HTML copies online as acute accents on vowels, but with the character underlined. Macrons on non-vowels are shown by underlining only. A rare suspension dí (d&imacr;) is spelled out in full as didiu.
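
The macron rule can be sketched in Python as follows. This is an illustration only: it assumes the source text is available as a Unicode string in which macrons appear as (or decompose to) the combining macron U+0304, and that underlining is expressed with the HTML <u> element. The function name render_macrons is hypothetical, and neither assumption is confirmed by the conversion process itself.

```python
import unicodedata

MACRON = "\u0304"  # combining macron
ACUTE = "\u0301"   # combining acute accent
VOWELS = set("aeiouAEIOU")


def render_macrons(text):
    """Replace macrons with underlined acute vowels or underlined consonants."""
    # Decompose so that every macron becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(decomposed):
        ch = decomposed[i]
        if i + 1 < len(decomposed) and decomposed[i + 1] == MACRON:
            if ch in VOWELS:
                # Vowel with macron: acute accent, underlined.
                marked = unicodedata.normalize("NFC", ch + ACUTE)
            else:
                # Non-vowel with macron: underlining only.
                marked = ch
            out.append("<u>" + marked + "</u>")
            i += 2  # skip the base letter and its macron
        else:
            out.append(ch)
            i += 1
    # Recompose anything that was decomposed but not changed.
    return unicodedata.normalize("NFC", "".join(out))


print(render_macrons("m\u0101r"))
```

The sketch does not handle the rare suspension that is spelled out as didiu; that would need a separate word-level rule.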

Other diacritics

The séimhiú (dot-over) accent is rendered with a following `h' and underlined. The dot-under accent used to indicate elision is rendered with underlining alone.

The notum for uel, graphically identical with stroked-l, is rendered l^, L^.

E and e with an ogonek, which represent tall e (e-caudata), are rendered as plain E and e, but underlined.

The Insular Ampersand is rendered &.
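
The rules for the dot-over, dot-under, stroked-l and e-caudata characters could be sketched in Python along the following lines. The sketch assumes Unicode input, assumes that underlining is expressed with the HTML <u> element, and assumes that the added h is underlined together with its letter; none of this is confirmed by the description above, and the function name render_diacritics is hypothetical. The Insular Ampersand is not handled.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # combining dot above (séimhiú)
DOT_BELOW = "\u0323"  # combining dot below (elision)
OGONEK = "\u0328"     # combining ogonek (e-caudata)
STROKED_L = {"\u0142": "l^", "\u0141": "L^"}  # notum for uel


def render_diacritics(text):
    """Apply the substitutions for the diacritics other than length marks."""
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(decomposed):
        ch = decomposed[i]
        nxt = decomposed[i + 1] if i + 1 < len(decomposed) else ""
        if ch in STROKED_L:
            out.append(STROKED_L[ch])
            i += 1
        elif nxt == DOT_ABOVE:
            # Dot-over: a following h, underlined.
            out.append("<u>" + ch + "h</u>")
            i += 2
        elif nxt == DOT_BELOW:
            # Dot-under: underlining alone.
            out.append("<u>" + ch + "</u>")
            i += 2
        elif nxt == OGONEK and ch in "Ee":
            # E or e with ogonek: plain letter, underlined.
            out.append("<u>" + ch + "</u>")
            i += 2
        else:
            out.append(ch)
            i += 1
    # Recompose anything that was decomposed but not changed.
    return unicodedata.normalize("NFC", "".join(out))


print(render_diacritics("\u1e03"))
```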

If you have any suggestions about how the information could be better presented in HTML, please let us know.

