===================================================
===================================================

Peter Schreiner

Input conventions for the transliterated Sanskrit text of Ghera.n.da--Sa.mhit-a (GHS)

0. Source:

The text transliterated is based on a file which represents the text as printed in: Ghera.n.dasa.mhit-a : Sanskrit--deutsch. Ed. Peter Thomi. Wichtrach: Institut für Indologie, 1993 (ISBN 3 7187 0013 1). 274 pp.

This file has kindly been made accessible to Peter Schreiner by Peter Thomi, who has given his permission that his file be modified according to the input conventions explained in the following paragraphs and be made freely accessible to the scholarly community. Any user of the electronic version is requested to consult the printed version for the critical apparatus, editorial comments, translation, annotations and indexes.

1 Transliteration

1.1 Diacritical marks:

Punctuation marks are used to code diacritics. All diacritics are typed in front of the letter to which they belong (which imitates the traditional "layout" of typewriters, where accents etc. are placed on dead keys and need to be typed before the character is typed).

.  =  subscript dot (e..g. k.rta.h)
;  =  superscript dot (e..g. a;nga)
?  =  tilde (superscript) (e..g. praj?n-a)
-  =  superscript hyphen (macron) (e..g. -atm-a)
/  =  acute accent (superscript) (e..g. /s-astra)

Where, as in this document or in comments to variant readings within the text, the use of punctuation marks in their proper function has to alternate with their function as diacritical marks, their use as punctuation marks is distinguished by a following blank (or other signs of punctuation including parentheses and figures) or by doubling where a blank is not possible (e..g. in abbreviations). Thus, the dots in ".r.si" are diacritics, but the dot after ".r.si. " is a full stop (because of the following blank). Similarly, since no blank can be inserted after a hyphen, the actual hyphen is written by doubling it ("--").
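The diacritic coding described above can be sketched in a few lines of Python. This is only an illustration and not part of the TUSTEP workflow: the function name decode and the choice of Unicode combining characters are assumptions made for this sketch. It treats a mark as a diacritic only when a letter follows, treats a doubled mark as a single punctuation sign, and leaves sandhi markers (*, +), the da.n.da sign and all other characters untouched.

```python
import unicodedata

# Assumed mapping from the coding marks to Unicode combining characters.
COMBINING = {
    ".": "\u0323",  # subscript dot
    ";": "\u0307",  # superscript dot
    "?": "\u0303",  # tilde
    "-": "\u0304",  # macron
    "/": "\u0301",  # acute accent
}

def decode(text: str) -> str:
    """Turn coded diacritics into precomposed Unicode letters (sketch)."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        nxt = text[i + 1] if i + 1 < n else ""
        if ch in COMBINING:
            if nxt == ch:
                # Doubled mark: one literal punctuation sign.
                out.append(ch)
                i += 2
                continue
            if nxt.isalpha():
                # Mark typed in front of a letter: diacritic on that letter.
                out.append(nxt + COMBINING[ch])
                i += 2
                continue
        # Anything else (including a mark before a blank) is copied as is.
        out.append(ch)
        i += 1
    return unicodedata.normalize("NFC", "".join(out))

print(decode("Ghera.n.da--Sa.mhit-a"))
print(decode("k.rta.h, a;nga, praj?n-a, -atm-a, /s-astra"))
```

A mark followed by a blank, a figure or another sign of punctuation falls through to the last branch and is kept as ordinary punctuation, which mirrors the rule stated in the preceding paragraph.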
The rationale behind this disambiguation convention is a feature of TUSTEP, the Tübingen System of Text Processing Programs, which allows one to perform global exchanges for which "exceptions" can be formulated. Thus, the exchange of all full stops which are diacritical marks amounts to formalizing the task: "Exchange all periods, BUT NOT periods before blanks, before figures, or before other punctuation marks."

Da.n.da: The exclamation mark is used to represent the da.n.da (vertical bar, which serves as punctuation mark in Sanskrit texts); the quarter of a verse (p-ada) is also marked by a ! (i..e. vertical bar), if a P-ada--index is to be created.

Line division: In verses of anu.s.tubh--metre a new line begins after quarters 2 and 4. Longer metres (longer than the /sloka, that is) are typed in such a way that each p-ada gets a line of its own. In the introduction, new lines or beginnings of paragraphs are marked by a dollar sign.

Other markers: @ represents the asterisk in the printed edition which marks the editor's (Peter Thomi's) conjectures regarding the wording of the text. The exact place of this sign may differ slightly from where it is placed in the printed edition. Passages which in the printed text and//or in the file appear in parentheses are put in square brackets.

References: The full reference (chapter and verse) is given at the end of the verse to which it refers. The beginning of a reference is marked by a double bar and its end by a single bar. A new line always begins after a reference. In addition to the reference as given in the printed text, each sentence (i..e. line) is numbered in accordance with the internal reference system of TUSTEP. This additional reference is written between angle brackets; its first and second segments correspond to the textual reference (chapter, verse); the segment after the slash counts the lines within each verse.
Lines of the base text are numbered by hundreds, with the trailing zeros not being printed: 1.1/1(00), 1.1/2(00).

1.2 Sandhi

The "principle of transliteration" has been that the input format should reproduce the letters of the printed text as closely as possible, i..e. that one types what one sees. However, markers are added in the transliteration to what is printed (in Devan-agar-i) in order to mark sandhi changes. A sandhi change is defined with regard to the "pausa form" of a word, i..e. the form which the termination of a word would take at the end of a line or out of context (vigraha). Note that in the case of nouns this pausa form is normally not identical with the declensional stem which would be entered in a dictionary ("lemma").

Thus, consonants which have undergone a sandhi change in the text are marked by *. Similarly, final vowels which have changed due to sandhi are marked by * (e..g. -as-id* r-aj-a nalo* n-ama).

In the case of vowel sandhi the above--mentioned principle of transliteration suffers an exception: Vowel sandhi is dissolved and marked (e..g. na*asti, ca*eva). Similarly, avagraha is reconstituted, the originally omitted initial "a" being marked as sandhi vowel (e..g. devo* *api).

In some special cases the marking of sandhi has to be extended to include some disambiguating information:

-- to half--vowels which substitute for a long vowel the diacritic for "long vowel" (-) is added (e..g. devy-* api); (the disambiguating sign is "added", i..e. we write "y-*" rather than "-y*", since the minus sign is something which is not contained in the copied text but is an editorial addition by the copyist);

-- if final "a" in sandhi does not stand for "a.h" (with visarga), then the original vowel which has been substituted by the "a" is added (e..g. lokae* eva, where "loka eva" is printed, which is the sandhi form for "loke eva").

In the case of "double sandhi" the sandhi is marked by double ** (e..g. sa**eva in case of "saiva" instead of "sa eva").
There is one case of vowel sandhi across the p-ada separator (5.61/1) which is transliterated as follows: --u* !*u--

A blank is inserted between words wherever this is possible in transliteration (but not necessarily in Devan-agar-i), e..g. "hy* api", "nalo* *api".

1.3 Separation of compounds

Separation of compounds is marked by inserting + between the members of a compound (e..g. brahma+pur-a.na). In the case of sandhi, the + also functions as sandhi--marker, i..e. no additional sandhi--marker is added (e..g. tapo+vane, mah-a+-atmana.h). Separation of compounds is restricted to nominal compounds (including upapada--compounds like ura+ga, go+p-i) and does not include grammatical analysis. For details, special cases etc. see the introduction to Sanskrit Indices and Text of the Brahmapur-a.na, Wiesbaden 1987, p. xvi--xvii, by P. Schreiner and R. Söhnen.

1.4 Colophons

Colophons which are part of the printed edition are enclosed by double square brackets.

2 Textual analysis

The text transliterated according to the above conventions constitutes what we call the "input format" or "input version". On its basis two further versions are generated automatically, the so--called "text format" and the so--called "pausa format".

2.1 Text format

The text format represents the conventionally transliterated text without markers and with compounds and sandhis reconstituted. This version can be processed for output even in Devan-agar-i with programs which work on the basis of transliterated input (e..g. TeX). With metrical texts, this may serve as basis for metrical analysis.

2.2 Pausa format

The pausa format of the text is generated by changing all the characters marked by * (and + ) according to the sandhi rules of Sanskrit grammar. Each word appears in the phonetic form which it would assume at the end of a line (e..g. -adibhir*, -adibhi.s*, -adibhi/s*, -adibhis* all become -adibhi.h). Members of compounds are separated. This form is the basis for indexing.
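The pausa example given above (several marked finals all becoming visarga) can be illustrated with a minimal Python sketch. It is an assumption--laden illustration, not the TUSTEP procedure actually used: the function name pausa_visarga is hypothetical, and the sketch covers only the one rule shown in the example, namely a final r, s, .s or /s carrying the sandhi marker * being replaced by .h. All other sandhi rules, and the separation of compounds at +, are left out.

```python
import re

# Final r, s, .s or /s followed by the sandhi marker * (assumed scope:
# only the visarga case quoted in the example, nothing else).
MARKED_FINAL = re.compile(r"(?:\.s|/s|s|r)\*")

def pausa_visarga(word: str) -> str:
    """Replace a marked final r, s, .s or /s by visarga (.h)."""
    return MARKED_FINAL.sub(".h", word)

for form in ["-adibhir*", "-adibhi.s*", "-adibhi/s*", "-adibhis*"]:
    print(form, "->", pausa_visarga(form))
```

Unmarked words pass through unchanged, since the pattern matches only where the marker * is present.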
Comments and questions may be addressed to

Peter Schreiner
Abteilung für Indologie
Universität Zürich
Rämistr. 68
CH--8001 Zürich
Switzerland
email: pesch@indoger.unizh.ch

==================================================
==================================================