Tutorial on the TextTools.

How to format in Hebrew, Greek, Coptic, and Latin.


This ecourse will teach you how to use both the TextCoder and the TextDoctor to put together HTML documents ready to present on the web. I will lead you step by step through the various functions of both utilities.

The above links will open these tools in a separate window so that you can easily switch back and forth between them and this tutorial.

Many of the functions overlap between the two tools. Remember throughout that the main difference between the two is that the TextCoder is designed to format an entire text at once, while the TextDoctor is designed to format only those portions that you select.

Text and code.
Stretch and set.
Status box.
Show and hide breaks.
Reverse and filter text.
Apply font.
Restore text.
Language options.
Archaic and clean.
Count.
Code and recode.
Markup options.
Character buttons.
Table.
Display.
Copy and clear.
Embed.

Text and code.

The point of both utilities is to turn text into code, that is, into Unicode or HTML character references and markup code.

The TextCoder actually has two basic fields, one for each of those terms, text and code. You enter the text to be converted into the text field, and when you are finished operating on it the results will be posted in the code field. Note that, if you enter text typed in a legacy (non-Unicode) font into the text field, it will be transformed back into the keystrokes that were used to type it up in the first place. If you enter a Unicode text into the text field, many or even most of the characters will appear as they ought to.

The TextDoctor has only one field. You enter your text into it, and then progressively turn portions of it into code. So this single field could be referred to as either the text field or the code field.

Note that the text field on both utilities is accompanied by other smaller areas or boxes. The TextCoder has one text area above and another below the text field. These areas are used only for creating a table. No text entered into them will show up in your code unless you are creating a table of some kind. The TextDoctor has only the one text box below the text field. This box is the status box.

Stretch and set.

Both utilities have a stretch button. Its purpose is to adjust the length, top to bottom, of the text field. The default value, in rows, is given in a small text box next to the button. If you wish to add rows to or take rows away from the text field, simply enter a different value and click the button. Any text that has been entered into the field will not be lost, but the field will be changed in length.

Only the TextCoder has a set columns button. Next to the button are four radio buttons. Select how many columns you wish to use, then click the button. This feature is necessary for creating synoptic tables.

If you have selected two or more columns and intend to code a synopsis, it is imperative at this point to stretch the text fields until the scroll bar for each vanishes. The whole point of a synopsis is to match the rows up between the columns, which is impossible if the text must be scrolled.

Status box.

Only the TextDoctor has a status box. It appears just beneath the text field. Roll your cursor over each button and the status box will tell you the name or function of that button.

Show and hide breaks.

A break is a hard return in the text. Text fields have a property called word wrap, and this property disguises hard returns by creating a soft return at the right margin of the field when a text is too long to fit on one line.

The problem is that, for many purposes on these utilities, it is vital to have a hard return (not just a soft return) at the end of some lines or even every line. One such purpose is the creation of a synopsis. You must have a hard return at the end of each line or the final code will not necessarily keep the lines together correctly, thus spoiling the very reason for making a synopsis.

Marking up a text as paragraphs or as list items also requires hard breaks, but we shall get to marking up later.

Clicking the show breaks button will place a double pipe, ||, at every point in the text where a hard return is present. It will also turn the show breaks button into a hide breaks button. Click it again, then, to take out the double pipe markers once you have made certain that you have breaks where you need them.
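For readers who like to see such things spelled out, here is a minimal Python sketch of what showing and hiding breaks amounts to. It is not the code behind the utilities; the function names show_breaks and hide_breaks are my own inventions for illustration, and I am assuming that the marker sits immediately before each hard return.

```python
BREAK_MARKER = "||"  # the double pipe described above


def show_breaks(text: str) -> str:
    """Place the marker immediately before every hard return."""
    return text.replace("\n", BREAK_MARKER + "\n")


def hide_breaks(text: str) -> str:
    """Take the markers back out, leaving the hard returns in place."""
    return text.replace(BREAK_MARKER + "\n", "\n")


sample = "Try it out, line 1.\nTry it out, line 2."
marked = show_breaks(sample)
print(marked)
print(hide_breaks(marked) == sample)
```

Soft returns never appear in the string at all, which is why only the hard returns receive a marker.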

The TextCoder has another button called break text. Click on this button and a hard break will be added after every word in both the text and the code fields. It will turn a text like this...:

Try it out.

...into this...:

Try
it
out.

Why, you may ask, would anyone wish to do such a thing? Well, for a synopsis it may be useful, instead of working at having to create a break at the end of every line, to start with the breaks already in place. Then the trick becomes grouping the words that belong together rather than separating those that do not, which seems to me, at any rate, to be the easier of the two, especially if working in the narrow confines of three or four columns.
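The break text operation can be sketched in a line or two of Python. This is only an illustration under my own assumptions (the name break_text is hypothetical, and I assume that any run of whitespace separates two words), not the utility's actual code.

```python
def break_text(text: str) -> str:
    """Put a hard return after every word, dropping the original spacing."""
    return "\n".join(text.split())


print(break_text("Try it out."))
```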

Furthermore, it may also be useful, once your text is coded, to break the text in the code field in order to make it more manageable when you copy and paste it into your text editor. Each word will be rather long once coded into HTML or Unicode numerical references:

Text. Code.
אלהים &#1488;&#1500;&#1492;&#1497;&#1501;
θεος &#952;&#949;&#959;&#962;

The code field will wrap the text, but your text editor will not unless you set that option. I myself prefer to keep my code all in one column where I can read it, rather than let it spill out into unreadability across the right margin. This step is, of course, entirely optional.

Reverse and filter text.

The TextCoder alone has these two features.

The reverse text button does just that: it reverses the entire text, character by character, which can be more than just a diversion. Some Hebrew fonts, when rendered improperly, will come out backwards when you try to code them into Unicode. The remedy is to reverse the text before coding it.

The filter numbers button will remove all numbers from the text. It will also remove many accompanying marks such as periods (2.) or hyphens (3-). This feature is intended to remove the versification from a text.
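Both operations are simple enough to sketch in Python. The names reverse_text and filter_numbers are hypothetical, and the pattern below is only my guess at a reasonable rule (digits, an optional period or hyphen, and one following space); the utility may well remove more or fewer accompanying marks.

```python
import re


def reverse_text(text: str) -> str:
    """Reverse the whole text, character by character."""
    return text[::-1]


# Assumption: a verse number is a run of digits, optionally followed by
# a period or hyphen and a single whitespace character.
NUMBER_PATTERN = re.compile(r"\d+[.\-]?\s?")


def filter_numbers(text: str) -> str:
    """Strip numbers and the marks that accompany them."""
    return NUMBER_PATTERN.sub("", text)


print(reverse_text("abc"))
print(filter_numbers("1. In the beginning 2- God"))
```

A plain reversal of this kind suits keystroke text. On a text that is already in Unicode it would move any combining marks (vowel points, accents) to the wrong side of their base letters.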

Apply font.

Both utilities have an apply font button. When the time comes to turn keystrokes into Unicode, both the TextCoder and the TextDoctor follow a particular keymap of my own design for each language. Unless you have followed the prescribed keystrokes exactly, your text will not be properly turned into code.

The apply font button, however, will turn the keystrokes associated with a selected font into the keystrokes required by the utility. You do not even have to have any idea what my keymaps are. Simply use the keystrokes of your favorite font and apply it.

If you have selected more than one column on the TextCoder, incidentally, each column will have an apply font button. You could, then, apply one font in one column and another font in another. When you click the button, all columns will be affected, so for any columns that you wish to leave untouched do not select any font.
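In essence, applying a font is a character-for-character translation from one set of keystrokes to another. The Python sketch below shows the idea only. The mapping in it is entirely made up (it belongs to no real font and is not one of my keymaps), and apply_font is a hypothetical name.

```python
from typing import Optional

# Hypothetical mapping: on the left, the keystroke an imaginary font uses;
# on the right, the keystroke the utility would expect for the same letter.
EXAMPLE_FONT_MAP = {"v": "s", "j": "h"}


def apply_font(text: str, font_map: Optional[dict]) -> str:
    """Translate one font's keystrokes into the utility's keystrokes.

    With no font selected (None or an empty map) the text is left untouched.
    """
    if not font_map:
        return text
    # str.translate maps every character in a single pass, so a map that
    # swaps two keys does not undo itself halfway through.
    return text.translate(str.maketrans(font_map))


print(apply_font("qeov", EXAMPLE_FONT_MAP))
```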

Restore text.

Both utilities have a restore, or restore text, button. This button will undo the most recent change that you have made to the text. Simply typing or deleting text does not qualify as a change. A change is the clicking of a button.

Restoring the text goes back only one button-click, so be careful.
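A one-step restore of this kind can be pictured as a field that remembers only the text as it stood before the latest button-click. The following Python sketch assumes exactly that and nothing more; the class TextField and its methods are hypothetical names, not part of either utility.

```python
class TextField:
    """A text field with a single level of undo."""

    def __init__(self, text: str = "") -> None:
        self.text = text
        self._previous = text

    def click(self, operation) -> None:
        """Run a button's operation, first remembering the current text."""
        self._previous = self.text
        self.text = operation(self.text)

    def restore(self) -> None:
        """Go back one button-click, and no further."""
        self.text = self._previous


field = TextField("Try it out.")
field.click(str.upper)            # first button-click
field.click(lambda t: t[::-1])    # second button-click
field.restore()
print(field.text)                 # the text as it stood after the first click
```

Because only one earlier state is kept, restoring a second time does not reach back to the original text.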

Language options.

Neither the language nor the character set option will be applied until you click on the code or recode button.

On both the TextCoder and the TextDoctor there is a drop-down box in which you can select a language, and a set of radio buttons with which to select a language option. If working with more than one column on the TextCoder, each column will have its own drop-down box and set of options.

The options for each language are described in detail on the keymap for each language. Simply put, the lower the option (1 is the lowest, 4 the highest), the more different characters are allowed, and will be coded. Ancient languages can often be written in various ways (Hebrew with or without vowel points, for example, or Greek with or without accents and breathings). The lower-numbered options will allow for these special characters; the higher-numbered options will not.

Why not, you may ask, always use option 1 and thus allow for all possible characters? Why not cover all bases, as it were? The primary answer lies in the next section, archaic and clean.

A secondary answer, however, is that you may not wish all characters to be turned into letters in your chosen ancient language. You may wish to leave certain characters, such as brackets or various punctuation marks, exactly as they are, and the lower-numbered options will probably turn those marks into vowel points or accents or such. On the higher-numbered options these marks will be left alone.

Archaic and clean.

Neither the archaic nor the clean function will be applied until you click on the code or recode button.

The archaic function works closely with the character set option, and will archaicize the text in some way. It may, for instance, turn text of mixed lowercase and uppercase letters into all uppercase, or restrict punctuation to that found on inscriptions. Refer to the keymap for each language for exact information. The archaic function is described near the bottom of each keymap page.

The clean function filters out undesired textual marks. It is with the clean function that you may best use the various character set options. With this feature you can easily turn pointed Hebrew into unpointed Hebrew, polytonic Greek into unaccented Greek, or Coptic with supralinear strokes into Coptic without them.

Simply select a language and character set option and check the clean box, then click on the code or recode button, and the resulting code will be cleaned of any and all characters that do not appear in the keymap for the language and option that you have selected.

You will not wish to use this function with the code button if you are working on a Unicode text. Why not? Because it will clean out all the characters already coded in the ancient language! Instead, code without the clean function, then recode with it.
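To illustrate the idea of cleaning, here is a rough Python sketch that turns pointed Hebrew into unpointed Hebrew by keeping only an allowed set of characters. It rests on an assumption of my own, namely that cleaning amounts to such a filter; the allowed set below (the consonants U+05D0 through U+05EA, plus whitespace) is a stand-in for a real keymap, and the name clean is hypothetical.

```python
# Stand-in for a keymap: the Hebrew consonants, alef (U+05D0) to tav (U+05EA).
HEBREW_CONSONANTS = {chr(cp) for cp in range(0x05D0, 0x05EB)}


def clean(text: str, allowed: set) -> str:
    """Keep only allowed characters and whitespace; drop everything else."""
    return "".join(ch for ch in text if ch in allowed or ch.isspace())


# The first word of Genesis, with its vowel points and dagesh.
pointed = "\u05d1\u05b0\u05bc\u05e8\u05b5\u05d0\u05e9\u05c1\u05b4\u05d9\u05ea"
unpointed = clean(pointed, HEBREW_CONSONANTS)
print(unpointed)
```

In Unicode the vowel points are separate characters that follow their consonants, which is why a simple filter is enough to remove them.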

Count.

The count button on the TextCoder will count the number of characters, words, returns, and lines in the text. It will show these results in the code field.
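A count of this sort might be sketched in Python as follows. The definitions are my assumptions, since the utility's exact rules are not spelled out here: every character counts (returns included), words are separated by whitespace, returns are hard returns, and a nonempty text has one more line than it has returns. The name count_text is hypothetical.

```python
def count_text(text: str) -> dict:
    """Count characters, words, hard returns, and lines in a text."""
    returns = text.count("\n")
    return {
        "characters": len(text),            # assumption: returns are counted too
        "words": len(text.split()),
        "returns": returns,
        "lines": returns + 1 if text else 0,
    }


print(count_text("Try it out.\nTry it again."))
```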

Code and recode.

The code button takes whatever is in the text field, works on it in the ways that you have specified with your choice of language, character set, and archaic and clean functions, then gives you the results in the code field. The recode button, on the other hand, has nothing to do with the text field. It takes what is already in the code field and codes it more fully into Unicode than the code function has coded it. Hence the name recode.

Coding the text has no effect on characters that are already in Unicode. It affects only those keystrokes specified in the character set that you have chosen. Any character missing from the keymap will simply pass through the machine untouched.

This simple fact has interesting implications for correcting the texts that you are modifying. Suppose that you have a Unicode text with errors in it which you wish to correct. You copy and paste the text into the text field, and wish to change one ancient character into another. Deleting the incorrect character is easy enough, but your computer keyboard probably does not have the ancient character that you wish to put in its place. You can, then, type the appropriate keystroke from the keymap in its place instead.

Your text now has all ancient characters except one, the ASCII character that you are using to make the correction. This situation is fine. Just click on code.

Now the code field has all ancient characters except one, which has now been transformed into an HTML character reference for Unicode: &#, plus a number, plus ;. Now click on recode, and all the characters will be changed into those HTML character references. Coding and recoding work together.
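The interplay of code and recode in this correction scenario can be sketched in Python. Treat it as an illustration under stated assumptions rather than the real machinery: the four-key keymap is made up for the example (it is not my actual Greek keymap), and the function names are hypothetical.

```python
# Hypothetical keymap: ASCII keystroke -> Greek letter.
EXAMPLE_KEYMAP = {"q": "\u03b8", "e": "\u03b5", "o": "\u03bf", "s": "\u03c3"}


def to_reference(ch: str) -> str:
    """Write one character as a decimal HTML character reference."""
    return f"&#{ord(ch)};"


def code(text: str, keymap: dict) -> str:
    """Convert keystrokes found in the keymap; pass everything else through."""
    return "".join(to_reference(keymap[ch]) if ch in keymap else ch for ch in text)


def recode(code_text: str) -> str:
    """Convert every remaining non-ASCII character into a reference."""
    return "".join(to_reference(ch) if ord(ch) > 127 else ch for ch in code_text)


# A Unicode text (theta, epsilon, omicron) with one ASCII correction, "s".
text = "\u03b8\u03b5\u03bfs"
coded = code(text, EXAMPLE_KEYMAP)
print(coded)           # three Greek glyphs followed by &#963;
print(recode(coded))   # &#952;&#949;&#959;&#963;
```

The references produced by code are themselves plain ASCII, so recode passes over them and touches only the glyphs that remain.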

Markup options.

Markup options work on the code field only, not the text. On the TextCoder they work only with one column. Setting two or more columns causes the text to be marked up automatically as a table when the code button is pressed.

The TextCoder has three basic markup options: break, paragraph, and list.

The break option adds a break tag (<br/>) to the code in the code field at the end of every line (as defined by the presence of a hard return, if you recall), and a nonbreaking space marker and break tag (&#160;<br/>) for every skipped line.

The paragraph option adds a break tag (<br/>) to the end of every line with a hard return, and paragraph tags (<p></p>) at the appropriate positions for every skipped line.

The list option adds a break tag (<br/>) to the end of every line with a hard return, and line-item tags (<li></li>) at the appropriate positions for every skipped line. It also, of course, adds the ordered list tags (<ol> and </ol>) to the beginning and end of the text.

Furthermore, the paragraph and list options (but not the break option) will add identification values to each paragraph or line-item. The text boxes for these values are found immediately above the markup button. There are two such boxes. The first applies only if you fill it out, and applies identically to each element. The second applies automatically, must be a numerical value (the default is 1), and increments with each new paragraph or line-item. The two are separated by a dot, or period, within the identification value.

For example, if you enter chapter1 in the first box and leave the second box at 1, the first formatted paragraph or line-item will be marked up as follows...:

<p id= "chapter1.1" class= "english"></p>
<li id= "chapter1.1" class= "english"></li>

...the second as follows...:

<p id= "chapter1.2" class= "english"></p>
<li id= "chapter1.2" class= "english"></li>

...and so forth.

Notice that a class value (class= "english") is added to the tag. This value comes from the class box. It expects, of course, any class from your cascading style sheets. If you enter nothing in this box, then (as per my recommendations) the paragraph or line-item will be given the class name that matches the language option that you have chosen. (If you do not use classes, this class name will not affect your HTML at all. Leave the class in your code; you may use it later.)

Since the markup function operates on the code field, which may be chock-full of almost unreadable strings of HTML character references, it is important to know where the hard returns are, as well as the skipped lines (each of which is actually only a hard return immediately preceded by another hard return), in the text field before you code it into the code field.

These three functions will turn this text...:

Try it out, line 1.
 
Try it out, line 2.
Try it out, line 1.
 
Try it out, line 2.
Try it out, line 1.
 
Try it out, line 2.

...respectively into this code...:

Break.
Try it out, line 1.<br/>
&#160;<br/>
Try it out, line 2.

Paragraph.
<p>Try it out, line 1.</p>
 
<p>Try it out, line 2.</p>

List.
<li>Try it out, line 1.</li>
 
<li>Try it out, line 2.</li>

...except that the listed text will be introduced with an <ol> tag and concluded with an </ol> tag.
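The three markup options can be approximated in Python as below. This is a sketch under my own assumptions, not the TextCoder's source: the function names and parameters are hypothetical, a skipped line is taken to be an empty line between two hard returns, and the attribute spacing simply copies the samples given above.

```python
import re


def mark_up_breaks(code: str) -> str:
    """Break option: <br/> after each line, &#160;<br/> for each skipped line."""
    lines = code.split("\n")
    out = []
    for i, line in enumerate(lines):
        if line.strip() == "":
            out.append("&#160;<br/>")
        elif i == len(lines) - 1:
            out.append(line)              # no break tag after the final line
        else:
            out.append(line + "<br/>")
    return "\n".join(out)


def mark_up_blocks(code: str, tag: str, prefix: str = "", start: int = 1,
                   css_class: str = "english") -> str:
    """Paragraph or list option: wrap each block between skipped lines."""
    blocks = [b for b in re.split(r"\n[ \t]*\n", code) if b.strip()]
    out = []
    for number, block in enumerate(blocks, start):
        ident = f"{prefix}.{number}" if prefix else str(number)
        body = "<br/>\n".join(block.split("\n"))
        out.append(f'<{tag} id= "{ident}" class= "{css_class}">{body}</{tag}>')
    return "\n".join(out)


def mark_up_list(code: str, **options) -> str:
    """List option: line-items wrapped in an ordered list."""
    return "<ol>\n" + mark_up_blocks(code, "li", **options) + "\n</ol>"


sample = "Try it out, line 1.\n\nTry it out, line 2."
print(mark_up_breaks(sample))
print(mark_up_blocks(sample, "p", prefix="chapter1"))
print(mark_up_list(sample, prefix="chapter1"))
```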

As convenient as this function may be, there will be times when your text is not already nicely divided up with skipped lines. If your text is rather long, it could get tedious to go through it adding extra hard returns after each line.

There is, for this reason, a shortcut. No need to run through the text skipping lines; just show the breaks in the text, then mark it up.

This shortcut will turn this text, with its breaks shown as double pipes...:

Try it out, line 1.||
Try it out, line 2.
Try it out, line 1.||
Try it out, line 2.
Try it out, line 1.||
Try it out, line 2.

...into the exact same code:

Break.
Try it out, line 1.<br/>
&#160;<br/>
Try it out, line 2.

Paragraph.
<p>Try it out, line 1.</p>
 
<p>Try it out, line 2.</p>

List.
<li>Try it out, line 1.</li>
 
<li>Try it out, line 2.</li>

The TextDoctor has these same features, but they work, not on the entire text, but on the portion that you select with your cursor. The TextDoctor also adds another markup function, the blockquote button, which adds the <blockquote></blockquote> tags to the selected text, but only to the very beginning and the very end of the selection.

Furthermore, the TextDoctor features a host of markup buttons that format the selected text with standard HTML style tags...:

  • <span style= "font-size: XX"></span>, in which XX is a number in points.
  • <span style= "text-decoration: YYY"></span>, in which YYY is either underline, overline, or line-through.
  • <span style= "color: #ZZZZZZ"></span>, in which #ZZZZZZ is a hexadecimal color reference.

...or nonstyle tags:

  • <b></b>, boldfaced text.
  • <i></i>, italicized text.
  • <h1></h1> through <h6></h6>, headings.
  • <sup></sup> and <sub></sub>, superscripts and subscripts.

The TextDoctor also features an anchor button which can turn any selected word into a hyperlink. Just click the button, write the desired web address into the status box, select the desired text, then click the button again. Something akin to...:

<a href= "http://www.webaddress.com">hypertext</a>

...ought to be the result.
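All of the TextDoctor's markup buttons come down to wrapping the selected stretch of text in an opening and a closing tag. A minimal Python sketch of that idea follows; the names wrap_selection and anchor are hypothetical, and the selection is represented simply as a start and end position in the text.

```python
def wrap_selection(text: str, start: int, end: int,
                   open_tag: str, close_tag: str) -> str:
    """Surround text[start:end] with a pair of tags; leave the rest alone."""
    return text[:start] + open_tag + text[start:end] + close_tag + text[end:]


def anchor(text: str, start: int, end: int, address: str) -> str:
    """Turn the selection into a hyperlink to the given address."""
    return wrap_selection(text, start, end, f'<a href= "{address}">', "</a>")


sentence = "See the hypertext here."
start = sentence.index("hypertext")
end = start + len("hypertext")
print(anchor(sentence, start, end, "http://www.webaddress.com"))
print(wrap_selection(sentence, start, end, "<b>", "</b>"))
```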

Character buttons.

The TextDoctor offers several grids of individual character buttons. Scroll down to the radio buttons just underneath the words character buttons. Click on a language. A bank of buttons, each with its own ancient or foreign character, will appear. Roll your mouse over a button and read the status box for the Unicode number and name of that character. With your cursor in the text field, click on a button to add that character, or the code for that character, to your text.
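What a character button does can be imitated in Python with the standard unicodedata module. The sketch is mine, not the TextDoctor's: the names status_for and insert_at_cursor are hypothetical, and I am assuming that the Unicode number is shown in the usual U+ hexadecimal form.

```python
import unicodedata


def status_for(ch: str) -> str:
    """The Unicode number and name of a character, as a status line."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED CHARACTER')}"


def insert_at_cursor(text: str, cursor: int, ch: str, as_code: bool = False) -> str:
    """Insert the character, or its character reference, at the cursor."""
    piece = f"&#{ord(ch)};" if as_code else ch
    return text[:cursor] + piece + text[cursor:]


print(status_for("\u05d0"))
print(insert_at_cursor("ab", 1, "\u03b8", as_code=True))
```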

Table.

Only the TextCoder has a table function. With one column this function is optional. With two or more columns it is automatic.

The table options consist of...:

  • Table class: A class from your stylesheet. The default is data, which comes from my stylesheet.
  • Table row class: The default, again, is data.
  • Table datum class: Once more, the default is data.
  • Table border: A border of 2 is a frequent choice, but you may choose to have thicker or thinner lines.
  • Table caption: The word or phrase that will appear directly above or below your table. I have the code set up to display the caption below the table, but you can change this arrangement easily by changing the attribute align= "bottom" to align= "top" in the <caption></caption> tag toward the beginning of the resulting code. (You can accomplish the same thing by erasing the attribute, since the HTML default position for a caption is at the top.)

Again, if you do not use classes, any attributes and values added to your HTML tags will not affect anything.

With two or more columns you may enter a font-size, a font-family (or the name of a particular font), and a divide class for each column. If no divide class is entered, the default is the language option that you have chosen, as per my recommendations.
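To show how a synoptic table lines its rows up, here is a Python sketch that builds a table from one or more columns of text, matching them line for line. The exact shape of the TextCoder's output is not reproduced here; the tag layout, the function name build_table, and its parameters are assumptions, with the defaults taken from the options listed above. The per-column font and divide settings are left out.

```python
from itertools import zip_longest

EMPTY_CELL = "&#160;"  # a nonbreaking space keeps an empty cell from collapsing


def build_table(columns, table_class="data", row_class="data",
                datum_class="data", border=2, caption="",
                caption_align="bottom") -> str:
    """Build an HTML table with one cell per line of each column."""
    out = [f'<table class= "{table_class}" border= "{border}">']
    if caption:
        out.append(f'<caption align= "{caption_align}">{caption}</caption>')
    # Each hard return in a column starts a new row; short columns are padded.
    split_columns = [column.split("\n") for column in columns]
    for row in zip_longest(*split_columns, fillvalue=EMPTY_CELL):
        cells = "".join(
            f'<td class= "{datum_class}">{cell or EMPTY_CELL}</td>' for cell in row
        )
        out.append(f'<tr class= "{row_class}">{cells}</tr>')
    out.append("</table>")
    return "\n".join(out)


print(build_table(["line 1\nline 2", "line A"], caption="Synopsis"))
```

The sketch also shows why the hard returns matter for a synopsis: the rows are matched strictly by line, so a missing break in one column shifts everything beneath it.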

Display.

The display button operates on the code field. It will show what that code will look like when interpreted by a browser. Click the button, then scroll down the page until you reach the display box. Your HTML will be displayed for you.

Coding, encoding, or recoding, as well as marking up, will also display the code. You can click the display button after working directly on the code in the code field.

Copy and clear.

The copy button will copy the contents of the code field to the clipboard in your computer to be pasted elsewhere. The clear button will clear out the contents of all fields, whether text or code. If you clear these fields by mistake, the restore button will recover the text (but not the code).

Embed.

So far, you will have foreign or ancient material written up completely in escape sequences (in the format &#XXXX;). If you have a text editor that allows you to encode in Unicode instead of just ASCII or ANSI, you may prefer to change these escape sequences into the actual glyphs.

The embed button allows you to do just that. Once you have your code marked up exactly as you like it, click embed, then look at the display box. Instead of displaying how your work will appear in a browser, it will now display how it ought to appear in a text editor, HTML tags and all.

Since, however, the tags are now being displayed, they are no longer acting on their contents, and your ancient language characters may come out as empty boxes or other nondisplays.

The remedy is to write the name of a Unicode font (with the desired range) installed on your system in the table caption box before pressing the embed button. (Any table caption that you may have wished to use will already be part of your code.)

Now the display box will be displaying your ancient text in the proper font. Instead of copying the code from the code box, copy the displayed material from the display box (hold the left button down on your mouse and pass it over the text). Then paste it into your text editor, already set up for Unicode.
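If you would rather make the same conversion outside the utility, turning decimal character references back into glyphs takes only a few lines of Python. This is a sketch of the general technique, not of the embed button itself; the name embed is borrowed for convenience, and only references of the decimal form &#XXXX; are handled.

```python
import re

REFERENCE = re.compile(r"&#(\d+);")


def embed(code_text: str) -> str:
    """Replace each decimal character reference with the glyph it names."""
    return REFERENCE.sub(lambda match: chr(int(match.group(1))), code_text)


marked_up = '<p class= "greek">&#952;&#949;&#959;&#962;</p>'
print(embed(marked_up))
```

The HTML tags pass through untouched, so the result can be saved directly in a text editor set up for Unicode.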

Keymaps.

(Character sets.)

| Hebrew. | Greek. | Coptic. | Latin. |

If you have enjoyed or benefitted from this or any other ecourse on this site, or if you have suggestions for improvement, please leave feedback. If you spot any errors, please file them as errata.