Chapter 4: Writing Programs in Leo

Important note 1: This chapter tells how to use the noweb markup language.

Important note 2: This chapter tells how to use @file trees.

Overview
Sections and section definitions
Organizing @file trees 
More about directives

@color and @nocolor control syntax coloring
@comment sets comment delimiters
@delims sets sentinel delimiters in @file trees 
@first and @last allow leading and trailing lines in @file trees
@language specifies the target language
@path, @pagewidth and @tabwidth set preferences
@raw and @end_raw delimit raw text

Overview

This is the most important chapter because it describes everything you need to know to write programs with Leo. Unless otherwise mentioned, this chapter applies to all versions of Leo.  Newcomers to Leo should read Introducing Leo before reading further.

I strongly recommend that you look at an example of a Leo outline like LeoPy.leo while you read this documentation. It is much easier to use Leo than to describe all the rules in detail.  Looking at a real Leo outline will make many points clear that are difficult to describe succinctly in words.

Terminology: Directives are commands that appear in body text.  Directives start with an @ followed by a directive name, for example @color.  We often refer to outline nodes by the directives they contain.  For example, an @others node is a node containing an @others directive, an @ignore node is a node containing an @ignore directive, and so on.  Exception: an @file node is a node whose headline starts with @file.  We also speak of @file trees, trees whose root is an @file node, and so on.

The essentials of writing programs in Leo are as follows:

  1. A Leo program is composed of all the body text in an outline, or part of an outline.
  2. Body text is a sequence of sections.
  3. Sections may contain section references that refer to other sections by their name.
  4. Leo uses outline structure to control the scope of section definitions.
  5. Leo creates source files called derived files by expanding all section references in an @file tree.

Sections and Section References

There are two kinds of sections: code sections and doc sections. Code sections start with the @c directive. Within @file trees, the entire body text of a node is a code section by default, so the @c directive can be omitted if there are no doc sections in the body pane. Doc sections start with the @ directive (an @ followed by whitespace or a newline at the start of a line). Doc sections continue until the end of body text or until the next @c or @ directive.

The term "section" has two related meanings. Sections are syntactic units of text in the body pane. Leo writes the text of sections to derived files, so another meaning of "section" is "the text that is written to the derived file." Which meaning is intended should be clear from context.

Section references have the form <<section name>>. A section reference is any sequence of characters except newlines enclosed in << and >>. Leo ignores whitespace and case inside section names, so the following are equivalent:

<< Read file into s >>
<<READ File into S>>
<<readfileintos>>
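
As a rough illustration of this equivalence, the following Python sketch reduces a section reference to a comparison key by dropping whitespace and folding case. The helper name canonical_section_name is hypothetical; it mirrors the rule stated above and is not taken from Leo's own source.

```python
def canonical_section_name(reference):
    """Return a comparison key for a reference such as '<< Read file into s >>'.

    Hypothetical helper: whitespace and case are ignored, as described above.
    """
    name = reference.strip()
    if name.startswith("<<") and name.endswith(">>"):
        name = name[2:-2]              # drop the enclosing << and >>
    return "".join(name.split()).lower()


refs = ["<< Read file into s >>", "<<READ File into S>>", "<<readfileintos>>"]
keys = {canonical_section_name(r) for r in refs}   # one key for all three forms
```

All three spellings collapse to the same key, which is what makes them interchangeable as section names.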

Leo underlines section names whose definitions are not found in any descendant. Such sections are always invalid in @file trees, and may be valid in @root trees provided that the section is defined somewhere else.

Paired << and >> characters on the same line always denote a section name, even within comments and strings.  That is, << and >> characters that do not delimit a section name must be placed on separate lines.  If << and >> are not paired on a line, they are treated as literal << and >> characters.
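
The pairing rule can be sketched with a regular expression. This is only an approximation written for illustration (the names SECTION_REF and section_refs are hypothetical), but it shows why shift operators that happen to pair up on one line are read as a section reference.

```python
import re

# "<<", then the shortest run of characters other than newlines, then ">>".
SECTION_REF = re.compile(r"<<(.+?)>>")


def section_refs(line):
    """Return the raw names of all paired << ... >> spans found on one line."""
    return [m.group(1) for m in SECTION_REF.finditer(line)]
```

Under this sketch, a lone << or >> on a line yields no match, while a left shift followed by a right shift on the same line is reported as a (meaningless) section name. That is the case the rule above tells you to avoid by splitting the line.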

Body text may contain zero or more sections in any order.  A code section is named if the node's headline starts with <<section name>>.  Otherwise, the code section is unnamed. Body text that contains no @ or @c directive is considered to be a single unnamed code section. @ and @c directives terminate any previous section.  For example,

@ This method puts an open node sentinel for node v.
@c
def putOpenNodeSentinel(self,v):
    if v.isAtFileNode() and v != self.root:
        << issue an error message >>
    else:
        s = self.nodeSentinelText(v)
        self.putSentinel("@+node:" + s)
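
The way the @ and @c directives divide body text can be sketched in Python as follows. The function names are hypothetical and the code is a simplified reading of the rules above, not Leo's implementation; it also accepts @doc and @code, which are historical synonyms for @ and @c.

```python
def classify(line):
    """Return (kind, rest) if the line starts a section, else (None, line)."""
    if line.startswith("@"):
        parts = line.split(None, 1)
        word = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if word in ("@", "@doc"):
            return "doc", rest
        if word in ("@c", "@code"):
            return "code", rest
    return None, line


def split_sections(body):
    """Split body text into a list of (kind, text) pairs."""
    sections = []
    kind, lines = "code", []        # body text is a code section by default
    for line in body.splitlines():
        new_kind, text = classify(line)
        if new_kind is None:
            lines.append(line)
            continue
        if lines:                   # the directive terminates the previous section
            sections.append((kind, "\n".join(lines)))
        kind = new_kind
        lines = [text] if text else []
    if lines:
        sections.append((kind, "\n".join(lines)))
    return sections
```

Body text with no markup comes back as a single code section, which matches the default described above.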

An @c directive is optional at the start of body text, so:

@c
def hasChildren(self):
    return self.firstChild() != None

is equivalent to the following text without any markup:

def hasChildren(self):
    return self.firstChild() != None

The @c directive is needed here:

@ Returns true if the receiver has child nodes.
@c
def hasChildren(self):
    return self.firstChild() != None

However, we could replace the doc part with a comment, like this:

# Returns true if the receiver has child nodes.
def hasChildren(self):
    return self.firstChild() != None

The choice of using doc parts or comments is a matter of style.  Doc parts are convenient for longer comments that would be tedious to format by hand.

Minor note 1: For historical reasons, the @code directive is a synonym for the @c directive and the @doc directive is a synonym for the @ directive. Only the @c, @code, @ and @doc directives terminate sections.

Minor note 2: A line of the form:

@ %def identifiers

terminates the previous code section and indicates that the preceding code section defines the list of identifiers.  This list contains identifiers separated by whitespace; any sequence of non-white characters may be an identifier.

Organizing @file trees

Let's see how to organize typical outlines.  The only rules we must follow are:

A node that is not referenced is called an orphan node.  If an @file tree contains an orphan node or @ignore node, Leo issues a warning and does not write the derived file when saving an outline. Leo saves all the information in the @file tree in the .leo file, so nothing is lost; Leo will recreate the @file tree from the .leo file when reading the outline the next time.

We can avoid having to refer to sections by name by using the @others directive.  This can save a lot of work.

The @others directive refers to all unnamed sections.  An @file tree may contain more than one @others directive.  @others directives that descend from other @others directives refer only to unnamed nodes that descend from them.  The @others directive that occurs highest in the @file tree refers to all other unnamed nodes.

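There are two minor restrictions on the @others directive:

  1. No node may contain more than one @others directive.
  2. No named node may intervene between an unnamed node containing body text and an @others node.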

When saving an outline, Leo creates derived files from all changed @file trees by expanding references to named and unnamed sections:

When reading an outline, Leo recreates @file trees from derived files.  In particular, you may change derived files outside of Leo, and Leo will update the outline accordingly.  You may change derived files in any way provided that you don't change sentinel lines.  Sentinel lines are comment lines whose first character (after the comment delimiter) is @.  Leo represents outline structure in sentinel lines, so changing sentinel lines will corrupt the outline structure. (The appendix lists the format and meaning of all sentinel lines.)
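
A quick way to check that an external edit left the sentinels alone is to compare the sentinel lines before and after the edit. The sketch below assumes a Python derived file, so the comment delimiter is "#". The function names and the sample sentinel text are illustrative only, and tolerating leading whitespace before the delimiter is an assumption.

```python
def is_sentinel_line(line, delim="#"):
    """True if the line is a comment whose first character after delim is '@'."""
    return line.lstrip().startswith(delim + "@")


def sentinel_lines(text, delim="#"):
    """Return the sentinel lines of a derived file, in order."""
    return [line for line in text.splitlines() if is_sentinel_line(line, delim)]


def sentinels_unchanged(old_text, new_text, delim="#"):
    """True if an edit left every sentinel line exactly as it was."""
    return sentinel_lines(old_text, delim) == sentinel_lines(new_text, delim)
```

Ordinary comments are not sentinels under this test, because the character after the delimiter is not @.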

Some examples should make all this much clearer.

We shall use the following notation to discuss outline structure. Lines starting with + denote a node with body text;  lines starting with - denote a node without body text. All other lines denote body text associated with the preceding headline. Indentation indicates outline structure; child nodes are indented from their parents.

Using this notation, we would typically organize an outline that creates a C file as follows:

+ @file f.c
<< constants >>
<< declarations >>
@others
	+ << constants >>
	the constants
	+ << declarations >>
	the declarations
	+ function one
	function one
	+ function two
	function two
	...

The @file node generates the file f.c. The body text of the @file node is simply:

<< constants >>
<< declarations >>
@others

This body text ensures that the expansions of << constants >> and << declarations >> precede the expansions of all unnamed sections. It is good style to name those sections that must appear in a particular place in the derived file.
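
To make the expansion order concrete, here is a minimal Python sketch of how such an outline could be turned into derived-file text. Everything in it (the Node class, expand and the other helpers) is hypothetical and written for this illustration; it assumes each reference or @others directive sits on a line of its own, and it ignores sentinels and leading whitespace.

```python
import re

SECTION_REF = re.compile(r"<<(.+?)>>")


class Node:
    """A hypothetical outline node: headline, body text and child nodes."""

    def __init__(self, headline, body="", children=()):
        self.headline = headline
        self.body = body
        self.children = list(children)


def canonical(name):
    """Whitespace and case are ignored in section names."""
    return "".join(name.split()).lower()


def section_name(node):
    """Return the canonical name of a named node, or None if it is unnamed."""
    m = SECTION_REF.match(node.headline.strip())
    return canonical(m.group(1)) if m else None


def find_definition(node, name):
    """Search the descendants of node for the node defining section `name`."""
    for child in node.children:
        if section_name(child) == name:
            return child
        found = find_definition(child, name)
        if found is not None:
            return found
    return None


def unnamed_nodes(node):
    """Yield, in outline order, the unnamed nodes an @others in node refers to."""
    for child in node.children:
        if section_name(child) is None:
            yield child
            if "@others" not in child.body.split():
                yield from unnamed_nodes(child)


def expand(node):
    """Return the text of node with references and @others expanded."""
    out = []
    for line in node.body.splitlines():
        text = line.strip()
        ref = SECTION_REF.fullmatch(text)
        if text == "@others":
            parts = [expand(n) for n in unnamed_nodes(node)]
        elif ref:
            target = find_definition(node, canonical(ref.group(1)))
            if target is None:
                raise ValueError("undefined section: " + text)
            parts = [expand(target)]
        else:
            out.append(line)
            continue
        out.extend(p for p in parts if p)   # nodes without body text add nothing
    return "\n".join(out)
```

With the f.c outline above, the two named sections are emitted first because their references precede @others in the root's body, and the unnamed function nodes follow in outline order.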

The children of the @file node contain the definitions of all sections, both named and unnamed. Unnamed sections usually contain functions or methods. It is bad style for the meaning of a derived file to depend on the order of unnamed sections in the outline.

One could define a Python class as follows:

+ @file classX.py
<< imports for classX >>
class classX:
	@others
	+ << imports for classX >>
	+ method1
	+ method2
	...

The body text of the @file node is just:

<< imports for classX >>
class classX:
	@others

Indentation is significant in languages like Python. When Leo expands a section reference (or an @others directive), Leo adds the leading whitespace of the line containing the section reference to all lines of the reference's expansion. Body text therefore need not be indented beyond its natural indentation.
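
The whitespace rule can be sketched in a few lines of Python. The function name indent_expansion is hypothetical, and leaving blank lines unindented is an assumption of this sketch rather than documented behavior.

```python
def indent_expansion(reference_line, expansion):
    """Prefix each line of expansion with the leading whitespace of reference_line."""
    lead = reference_line[: len(reference_line) - len(reference_line.lstrip())]
    return "\n".join(lead + line if line else line
                     for line in expansion.splitlines())


# A method written at its natural indentation in its own node.
method = "def method1(self):\n    return 1"
# The @others line in the class node is indented by four spaces here.
derived = "class classX:\n" + indent_expansion("    @others", method)
```

The method keeps its own relative indentation and simply moves right by the indentation of the @others line.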

When two classes appear in the same file, a single @others directive does not suffice. However, no node may contain more than one @others directive.  To get around this limitation we can organize the outline as follows:

+ @file mainClass.py
<<mainClass imports>>
<<class mainClass>>
<<class helperClass>>
	+ <<mainClass imports>>
	+ <<class mainClass>>
	class mainClass:
		@others
		+ method1
		+ method2
		...
	+ <<class helperClass>>
	class helperClass:
		@others
		+ method1
		+ method2
		...

The body text of the @file node is:

<<mainClass imports>>
<<class mainClass>>
<<class helperClass>>

the body text of the <<class mainClass>> node is:

class mainClass:
	@others

and the body text of the <<class helperClass>> node is:

class helperClass:
	@others

The @others directive refers only to unnamed nodes in the descendants of the @others node, so the two @others directives refer to disjoint sets of unnamed sections.

Organizing nodes (nodes with no body text) do not affect the derived file in any way. In particular, such nodes never change indentation. Organizing nodes are often useful. For example,

+ @file classX.py
class classX:
	@others
	- xtors
		+ __init__
		+ __del__
	- getters
		+ getter1
		+ getter2
	- setters
		+ setter1
		...

The xtors, getters and setters nodes are organizing nodes. They exist merely to make the outline easier to understand.

As mentioned earlier, no named node may "intervene" between an unnamed node containing body text and an @others node.  For example, the following outline is invalid.

+ @file classX.py
class classX:
	@others
	- methods
		+ <<section one>>
			+ A

A is an unnamed node containing body text, and the <<section one>> node intervenes between Node A and the @others node.  This restriction arises from the way that Leo represents named sections in derived files.
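
One way to read this restriction is as a check over the outline. The Python sketch below is a hypothetical illustration of that reading and not Leo's own validation: it represents each node as a (headline, body, children) tuple and reports unnamed nodes with body text that can only be reached through a named node that has no @others directive of its own.

```python
def is_named(headline):
    """A node is named if its headline starts with a section name."""
    return headline.lstrip().startswith("<<")


def stranded_nodes(node, below_named=False):
    """Return headlines of unnamed nodes with body text cut off from @others."""
    _headline, _body, children = node
    stranded = []
    for child in children:
        c_head, c_body, _ = child
        named = is_named(c_head)
        if below_named and not named and c_body.strip():
            stranded.append(c_head)
        # A node with its own @others directive collects its unnamed descendants.
        has_others = "@others" in c_body.split()
        stranded.extend(
            stranded_nodes(child, (below_named or named) and not has_others))
    return stranded
```

Applied to the invalid outline above, the check reports node A. A named node such as << class mainClass >> that contains its own @others directive does not strand its children.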

The @ignore directive is not valid in @file trees because derived files contain primary data.  The .leo file also contains a copy of the information in the @file tree, but this information is used for backup and error recovery. As a result, everything in an @file tree must correspond to information in the derived file, and the @ignore directive may not appear in @file trees.

More about directives

The following sections discuss directives that have not been discussed previously.  Unless otherwise noted, these directives are valid anywhere in an outline.

@color and @nocolor control syntax coloring
@comment sets comment delimiters
@delims sets sentinel delimiters in @file trees 
@first and @last allow leading and trailing lines in @file trees
@language specifies the target language
@path, @pagewidth and @tabwidth set preferences
@verbose, @terse and @silent control comments in @root trees

@color and @nocolor control syntax coloring

Syntax coloring is on by default in all body text. Leo formats comments and documentation parts in red, directives and C keywords in blue, strings and character constants in gray and all other text in code parts in black.

The @nocolor directive disables syntax coloring for the body text in which it appears. No syntax coloring is done until an @color directive re-enables syntax coloring.

If a node contains neither the @color nor the @nocolor directive it may inherit the syntax coloring attribute from an ancestor. The nearest ancestor that contains exactly one of the @color or @nocolor directives will control the syntax coloring. Ambiguous nodes, nodes containing both the @color and @nocolor directives, never affect the coloring of their offspring.
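
The inheritance rule can be sketched as a walk up the ancestor chain. The function names below are hypothetical, and the sketch covers only a node that contains neither directive itself.

```python
def node_setting(body):
    """Return True or False if the body settles coloring, else None."""
    has_color = has_nocolor = False
    for line in body.splitlines():
        words = line.split()
        if line.startswith("@") and words:
            has_color = has_color or words[0] == "@color"
            has_nocolor = has_nocolor or words[0] == "@nocolor"
    if has_color == has_nocolor:
        return None         # neither directive, or an ambiguous node with both
    return has_color


def inherited_coloring(ancestor_bodies):
    """ancestor_bodies lists the body text of the ancestors, nearest first."""
    for body in ancestor_bodies:
        setting = node_setting(body)
        if setting is not None:
            return setting
    return True             # syntax coloring is on by default
```

An ambiguous ancestor is skipped in the same way as an ancestor with no directive, so the search continues to the next ancestor up the tree.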

Note: the @color and @nocolor directives do not affect the Tangle commands in any way. In particular, the Tangle commands will recognize section definitions as usual even after an @nocolor directive is seen.

@comment sets comment delimiters

Note: You should use the @language directive whenever possible. Untangle will not process an @root or @unit node if an @comment directive is in effect because Untangle can't be sure of properly parsing a derived file if the language of the derived file isn't known. It might be possible to assume some defaults in this case, but that is not done at present and is not a high priority.

By default, the Tangle commands produce C-language comments. Single-line comments generated during tangling start with ///, while documentation parts are surrounded by /* and */. The @comment directive allows you to use Tangle to produce shell and make files, as well as source code for other programming languages.

The @comment directive may be followed by zero to three delimiters, separated by whitespace. This directive sets the single-line comment delimiter and the opening and closing block comment delimiters as follows:

 @comment (no delim) restores the defaults to ///, /* and */
 @comment /// (one delim) sets the single-line comment and clears the other delimiters.
 @comment /* */ (two delims) sets the opening and closing block comment delimiters and clears the single-line comment.
 @comment /// /* */ (three delims) sets all three delimiters.

If only one delimiter is given, Leo does not write any documentation parts while tangling. If two delimiters are given, block-style comments are used instead of single-line comments.

For example, the @comment { } directive could be used to Tangle Pascal files.

The @comment directive is only recognized in @root, @unit or @file nodes, and the @comment directive must precede the first section name or @code directive. An @comment directive in the body text of an @unit directive specifies the current global defaults. An @comment directive in the body text of an @root directive affects comments generated for one root only. Comments in all other roots are governed by the global defaults.

New in leo.py 3.0: Leo will convert underscores in @comment directives to significant spaces. For example,

@comment REM_

causes the comment delimiter to be "REM " (Note the trailing space).
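
The delimiter rules for @comment, including the underscore conversion, can be summarized in a short Python sketch. The function name and the use of None for a cleared delimiter are choices made for this illustration only.

```python
def parse_comment_directive(line):
    """Return (single, block_open, block_close) for an @comment line."""
    words = line.split()
    if not words or words[0] != "@comment":
        raise ValueError("not an @comment directive: %r" % line)
    # Underscores stand for significant spaces.
    delims = [w.replace("_", " ") for w in words[1:]]
    if len(delims) == 0:
        return ("///", "/*", "*/")             # restore the defaults
    if len(delims) == 1:
        return (delims[0], None, None)         # single-line comments only
    if len(delims) == 2:
        return (None, delims[0], delims[1])    # block comments only
    if len(delims) == 3:
        return tuple(delims)
    raise ValueError("@comment takes at most three delimiters")
```

For example, the Pascal directive @comment { } yields block delimiters only, and @comment REM_ yields the single-line delimiter "REM " with its trailing space.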

@delims directive specifies comment delimiters in @file trees

The @delims directive changes the comment strings used to mark sentinel lines. This directive is often used to place JavaScript text inside XML or HTML files.

The @delims directive contains one or two delimiters, separated by whitespace. If only one delim is present it delimits single-line comments. If two delims are present they delimit block comments. The @delims directive can not be used to change the comment strings at the start of the derived file, that is, the comment strings for the @+leo sentinel and the initial @+body and @+node sentinels. The @delims directive inserts @@delims sentinels into the derived file. The new delimiter strings continue in effect until the next @@delims sentinel in the derived file or the end of the derived file.

Leo can not revert to previous delimiters automatically; you must change back to previous delimiters using another @delims directive. For example:

@delims /* */
JavaScript stuff
@delims <!-- -->
HTML stuff
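
Because nothing reverts automatically, it can help to trace which delimiters are in effect for each line. The following Python sketch does that for a list of body lines; the function name is hypothetical and the sketch says nothing about how Leo writes the @@delims sentinels themselves.

```python
def track_delims(body_lines, initial):
    """Pair each non-directive line with the sentinel delimiters in effect."""
    current = tuple(initial)
    result = []
    for line in body_lines:
        words = line.split()
        if words and words[0] == "@delims":
            if len(words) not in (2, 3):
                raise ValueError("@delims takes one or two delimiters")
            current = tuple(words[1:])   # stays in effect until the next @delims
        else:
            result.append((line, current))
    return result
```

In this trace, the HTML lines get HTML comment delimiters back only because a second @delims directive restores them explicitly.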

Adding, deleting or changing @@delims sentinels will destroy Leo's ability to read the derived file. Mistakes using the @delims directive have no effect on Leo, though such mistakes will thoroughly mess up a derived file as far as compilers, HTML renderers, etc. are concerned.

@first and @last directives allow leading and trailing lines in @file trees

The @first directive allows you to place lines at the very start of files derived from @file nodes. For example, the body text of @file spam.py might be:

@first #! /usr/bin/env python

The body text of @file foo.perl might be:

@first #!/usr/bin/perl

@first directives are recognized only at the start of the body text of @file nodes. No text may precede @first directives. More than one @first directive may exist, like this:

@first #! /usr/bin/env python
@first # more comments.

Similarly, @last directives are recognized only at the end of body text of @file nodes. No text may follow @last directives. More than one @last directive may exist.  Here is how a PHP file might be set up:

@first <?php
...
@last ?>
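
The placement rules for @first and @last can be sketched as a function that peels the directives off the two ends of the body text. The function name is hypothetical, and the sketch only separates the lines; it does not show how Leo arranges sentinels around them.

```python
def split_first_last(body):
    """Return (first_lines, middle_lines, last_lines) for an @file node's body."""
    lines = body.splitlines()
    first, last = [], []
    # @first directives count only at the very start of the body text.
    while lines and lines[0].startswith("@first "):
        first.append(lines.pop(0)[len("@first "):])
    # @last directives count only at the very end of the body text.
    while lines and lines[-1].startswith("@last "):
        last.insert(0, lines.pop()[len("@last "):])
    return first, lines, last
```

A directive that appears after other text (for @first) or before other text (for @last) is left in the middle part, which reflects the rule that such directives are not recognized there.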

@language specifies the target language

The @language directive specifies the comment delimiters and string types used by the Tangle and Untangle commands. This directive overrides the default specified in the Preferences panel. The form of this directive is

@language x

where x is one of the following: c, c++, html, java, objective-c, pascal, perl, perlpod, python and shell. Shell files have comments that start with #. Case is ignored in the language specifiers, but not in the @language itself. Thus, the following are equivalent:

@language html
@language HTML
@language hTmL

but the following is invalid:

@LANGUAGE html
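
The case rules can be sketched in Python as follows. The function name is hypothetical, and the set of languages is simply the list given above.

```python
KNOWN_LANGUAGES = {
    "c", "c++", "html", "java", "objective-c",
    "pascal", "perl", "perlpod", "python", "shell",
}


def parse_language_directive(line):
    """Return the lower-case language named by an @language line, else None."""
    words = line.split()
    # The directive name itself is matched exactly, so @LANGUAGE is rejected.
    if len(words) != 2 or words[0] != "@language":
        return None
    language = words[1].lower()     # case is ignored in the language specifier
    return language if language in KNOWN_LANGUAGES else None
```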

@path, @pagewidth and @tabwidth directives set preferences

The @path, @pagewidth and @tabwidth directives allow preferences to be set on a file-by-file basis: they override the corresponding defaults in the Preferences panel.

The @path directive specifies the directory to be used if an @file or @root directive does not specify a full path name. The form of the @path directive is @path filename, where filename is taken to be everything following @path to the end of the line.  If the filename in @file pathname or @root pathname is an absolute filename the location of the derived file is specified only by the filename. Otherwise, if the filename is a relative filename, the location of the derived file is relative to:

  1. the directory specified by the applicable @path directive, or
  2. the "Default Tangle Directory" in the Preferences panel if no @path directive is in effect, or
  3. the directory in which the .leo file resides if the .leo file has ever been saved.

An error occurs if no absolute path can be computed according to these rules, or if the filename or directory does not exist.
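
The lookup order can be sketched in Python as follows. The function and parameter names are hypothetical; leo_file is assumed to be None when the outline has never been saved, and the sketch does not check that the file or directory exists.

```python
import os


def derived_file_path(filename, path_directive=None,
                      default_tangle_dir=None, leo_file=None):
    """Return the location of a derived file according to the rules above."""
    if os.path.isabs(filename):
        return filename                          # absolute names stand alone
    if path_directive:
        directory = path_directive               # 1. applicable @path directive
    elif default_tangle_dir:
        directory = default_tangle_dir           # 2. Default Tangle Directory
    elif leo_file:
        directory = os.path.dirname(leo_file)    # 3. directory of the .leo file
    else:
        directory = None
    if not directory or not os.path.isabs(directory):
        raise ValueError("cannot compute an absolute path for " + filename)
    return os.path.join(directory, filename)
```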

The form of the @pagewidth directive is @pagewidth n, where n is a positive integer that indicates the width of tangled pages in columns. This setting only affects how Tangle outputs block comments.

The form of the @tabwidth directive is @tabwidth n, where n is a positive integer that indicates the width of tabs in spaces. This is used by Tangle to output leading whitespace.

@raw and @end_raw delimit raw text

New in leo.py 3.8: The @raw and @end_raw directives are valid only within @file trees. The @raw directive starts a section of "raw" text. The @end_raw directive ends such a section, as does the end of body text. No section references are recognized within "raw" text, and no additional leading whitespace is generated within "raw" text when writing the derived file.
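
As a sketch of how raw text is delimited, the following Python function tags each line of body text as raw or not. The function name is hypothetical, and the sketch assumes each directive appears on a line of its own.

```python
def mark_raw_lines(body):
    """Return (line, is_raw) pairs; the directive lines themselves are dropped."""
    in_raw = False
    marked = []
    for line in body.splitlines():
        word = line.strip()
        if not in_raw and word == "@raw":
            in_raw = True
        elif in_raw and word == "@end_raw":
            in_raw = False
        else:
            marked.append((line, in_raw))
    return marked       # raw text also ends at the end of body text
```

A writer built on this would skip section-reference expansion and added leading whitespace for every line tagged as raw.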