
Formatting information
An introduction to typesetting with LATEX

Copyright © 2010 Silmaril Consultants, version 5.5 (2011-05-30)

Chapter 3

Basic document structures

If the quick-start exercise in section 2.2 was enough to show you how a LATEX document works, then this is where you get the rest of the basic information. If you skipped Chapter 2 then be prepared to go back to some of the sections in it, because I'll be referring to things you might not have come across yet.

LATEX's approach to formatting is to aim for consistency. This means that as long as you identify each element of your document correctly, it will be typeset in the same way as all the other elements like it, so that you achieve a consistent finish with minimum effort.

Consistency helps make documents easier to read and understand, as well as making them more visually attractive. Consistency is also what publishers look for. They have a house style, and often a reputation to keep, so they rightly insist that if you do something a certain way once, you should do it the same way each time.

‘Elements’ are the component parts of a document: all the pieces which make up the whole. Almost everyone who reads books, newspapers, magazines, reports, articles, and other classes of documents will be familiar with the common structure of parts, chapters, sections, subsections, subsubsections, titles, subtitles, paragraphs, lists, tables, figures, and so on, even if they don't consciously think about it.

3.1 The Document Class Declaration

How to tell LATEX what type of document this is

In order to set things up correctly, LATEX needs to know up front what type of document you are going to be writing. There are probably lots of different types of document you deal with: in LATEX they are called ‘classes’ of documents — ‘class’ is just the computing science word for ‘type’.

To tell LATEX what class of document you are going to create, the first line of your file must identify it.¹ To start a report, for example, you would type the \documentclass command like this as the first line of your document:

\documentclass{report}

There are four built-in classes provided, and many others that you can download (some may already be installed for you):

report

for business, technical, legal, academic, or scientific reports; theses² and dissertations;

article

for white papers, magazine or journal articles, reviews, conference papers, or research notes;

book

for books, booklets, and whole journals;

letter

for letters.³

These default classes are quite broad, so they are easily customised by adding packages, which are the style and layout plug-ins that LATEX uses to let you automate formatting.

The article class in particular can be used (some would say ‘abused’) for almost any short piece of typesetting by simply omitting the titling and layout (see section 3.3) and adding the relevant packages.

The built-in classes are intended as starting-points, especially for drafts, and for compatibility when exchanging documents with other LATEX users, as they come built into every installation of LATEX and are therefore guaranteed to format identically everywhere. They are not intended as final-format publication-quality layouts and should never be used as such. For most other purposes, especially for publication, you use LATEX packages to extend these classes to do what you need:

  • The memoir class and the KOMA-Script bundle contain more sophisticated replacements for all the built-in classes, as well as additional ones;

  • Many academic and scientific publishers provide their own special class files for articles and books (on their Web sites for download);

  • Conference organisers may also provide class files for authors to write papers for presentations;

  • Many universities provide their own thesis document class files in order to ensure exact fulfillment of their formatting requirements;

  • Businesses and other organizations can provide their users with corporate classes on a central server and configure LATEX installations to look there first for packages, fonts, etc.

Books and journals are not usually printed on office-size paper. For draft purposes LATEX's layouts fit on the standard A4 or Letter stationery in your printer, but this makes them look odd: the margins are too wide, or the positioning is unusual, or the font size is too small, because the finished job will normally be trimmed to a different size entirely. Try trimming the margins of the PDF version of this book to make it 185mm by 235mm (the same as The LATEX Companion series) and you'll be amazed at how it changes the appearance!

These document classes are therefore adequate for drafts or for sending to a colleague to edit, but they are not designed for final-format publishing. For this you need a style file (package or class file) designed by the publisher to fit their series of publications (quite often based on the default classes, but looking very different). Many publishers provide these on their web sites for authors to download. Some are also available in the CTAN repository, along with packages and classes for other predefined formats such as university theses.
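As a sketch of what swapping classes looks like in practice: assuming the KOMA-Script bundle is installed, its scrreprt class can stand in for report just by changing the class name. The body text here is only a placeholder.

```latex
% Assumes the KOMA-Script bundle is installed.
% scrreprt is its counterpart to report; scrartcl and scrbook
% correspond to article and book.
\documentclass{scrreprt}

\begin{document}
Placeholder text for a draft report.
\end{document}
```

The memoir class is selected in the same way, with \documentclass{memoir}. A publisher's or university's class file would be named in the same position, using whatever name they supply.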

  1. Readers familiar with SGML, HTML, and XML will recognize the concept as similar to the Document Type Declaration (it's still called a ‘type’ there, not a ‘class’).
  2. Theses and dissertations require an Abstract, which is provided in the report class but not in the book class.
  3. The built-in letter class is rather idiosyncratic: there are much better ones you can use, which you will find in the memoir class and the KOMA-Script bundle.

3.1.1 Document class options

The default layouts are designed to fit as drafts on US Letter size paper.⁴ To create documents with the correct proportions for A4 paper, you need to specify the paper size in an optional argument in square brackets before the document class name, e.g.

\documentclass[a4paper]{report}

The two most common options are a4paper and letterpaper. However, many European distributions of TEX now come preset for A4 instead of Letter.⁵

The other default settings are: a) 10pt type (all document classes); b) two-sided printing for books, one-sided for reports, articles, and letters; c) a separate title page (books and reports only). These can be modified with the following document class options, which you add in the same set of square brackets, separated by commas:

11pt

to specify 11pt type (headings, footnotes, etc. get scaled up or down in proportion);

12pt

to specify 12pt type (again, headings get scaled to match);

oneside

to format one-sided printing for books and reports;

twoside

to format articles for two-sided printing;

titlepage

to force articles to have a separate title page;

draft

to make LATEX indicate hyphenation and justification problems with a small black box in the right-hand margin of the problem line, so that a human can locate them quickly. This option also sets graphics to print as an outline rectangle containing the name of the image, so that the draft prints quickly.

If you were using LATEX for a report to be in 12pt type on Letter paper, but printed one-sided in draft mode, you would use:

\documentclass[12pt,letterpaper,oneside,draft]{report}

The 10pt, 11pt, and 12pt settings cover between them probably 99% of all common document typesetting. There are extra options for other body type sizes in the extsizes bundle of document classes (extarticle, extbook, extreport, etc.). In addition there are hundreds of add-in packages which can automate other layout and formatting variants without you having to program anything by hand or even change your text.
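For a body size outside the built-in three, a minimal sketch using the extsizes bundle might look like this. It assumes the bundle is installed, and 14pt is chosen only as an illustration of one of the sizes it offers.

```latex
% Assumes the extsizes bundle is installed.
% extreport behaves like report but accepts extra type sizes.
\documentclass[14pt,a4paper]{extreport}

\begin{document}
Placeholder text for a large-print draft.
\end{document}
```

Apart from the class name and the size option, the rest of the document is written exactly as it would be for the report class.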

  Exercise 1. Create a new document

  1. Use your editor to create a new document.

  2. Type in a Document Class Declaration as shown above.

  3. Add a font size option if you wish.

  4. In North America, omit the a4paper option or change it to letterpaper.

  5. Save the file (make up a name) ensuring the name ends with .tex.

  4. Letter size is 8½″×11″, which is the trimmed size of the old Demi Quarto, still in use in North America. The other common US office size is ‘Legal’, which is 8½″×14″, a variant cutting close to the old Foolscap (8¼″×13¼″). ISO standard ‘A’, ‘B’, and ‘C’ paper sizes, used everywhere else, are still virtually unknown in most parts of North America.
  5. Note that the standard built-in document classes (book, article, report, or letter) only use the paper size to adjust the margins: they do not embed the paper size name in any PostScript or PDF output file. If you are using pdfLATEX, or intend creating PostScript output, and you want to change the default paper size, you must specify it both in the Document Class option and as an option to the geometry package (see the example ‘Read all about it’ in section 5.1.2), in order to ensure that the paper size name gets embedded correctly in the output, otherwise printers may select the wrong paper tray, or reject the job.
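The double specification described in the footnote on paper sizes might look like the sketch below. It assumes the geometry package is installed; note that geometry also recalculates the margins, so the layout will not be identical to the class default.

```latex
% Paper size given twice: once to the class (which adjusts the
% margins) and once to geometry (which records the size in the
% PDF or PostScript output).
\documentclass[a4paper]{report}
\usepackage[a4paper]{geometry}

\begin{document}
Placeholder text.
\end{document}
```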

3.2 The document environment

After the Document Class Declaration, the text of your document is enclosed between two commands which identify the beginning and end of the actual document (you would put your text where the dots are):

\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
...
\end{document}

The reason for marking off the beginning of your text is that LATEX allows you to insert extra setup specifications before it (where the blank line is in the example above: we'll be using this soon). The reason for marking off the end of your text is to provide a place for LATEX to be programmed to do extra stuff automatically at the end of the document, like making an index.

A useful side-effect of marking the end of the document text is that you can store comments or temporary text underneath the \end{document} in the knowledge that LATEX will never try to typeset them (they don't even need to be preceded by the % comment character).

...
\end{document}
Don't forget to get the extra chapter from Jim!

This \begin ...\end pair of commands is an example of a common LATEX structure called an environment. Environments enclose text which is to be handled in a particular way. All environments start with \begin{...} and end with \end{...} (putting the name of the environment in the curly braces).
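As an illustration of one environment nested inside another, here is a minimal sketch using quote, a standard environment chosen only as an example (it is not otherwise discussed in this chapter):

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
Ordinary text before the quotation.

\begin{quote}
Text inside the quote environment is set as an indented block.
\end{quote}

Ordinary text after the quotation.
\end{document}
```

Each \begin must be matched by an \end with the same name, and an inner environment must be closed before the one that encloses it.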

  Exercise 2. Adding the document environment

  1. Add the document environment to your file.

  2. Leave a blank line between the Document Class Declaration and the \begin{document} (you'll see why later).

  3. Save the file.

3.3 Titling

The first thing you put in the document environment is almost always the document title, the author's name, and the date (except in letters, which have a special set of commands for addressing). The title, author, and date are all examples of metadata or metainformation (information about information).

\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2009}
\maketitle

\end{document}

The \title, \author, and \date commands are (I hope) self-explanatory. You put the title, author name, and date in curly braces after the relevant command. The title and author are compulsory; if you omit the \date command, LATEX uses today's date by default.

You must always finish the metadata with the \maketitle command, which tells LATEX that it's complete and it can typeset the titling information at this point. If you omit \maketitle, the titling will never be typeset. This command is reprogrammable so you can alter the appearance of titles (like I did for the printed version of this document in Figure 3.1). It also means publishers can create new commands like \datesubmitted in their own document classes, in the knowledge that anything like that done before the \maketitle command will be honoured.

When this file is typeset, you get something like Figure 3.1 (I've cheated and done it in colour for fun — yours will be in black and white for the moment):


Figure 3.1: Titling information


  Exercise 3. Adding the metadata

  1. Add the \title, \author, \date, and \maketitle commands to your file.

  2. Use your own name, make up a title, and give a date.

The order of the first three commands is not important, but the \maketitle command must come last.

The double backslash (\\) is the LATEX command for a premature (forced) linebreak. LATEX normally decides by itself where to break lines, and it's usually right, but sometimes you need to cut a line short, like here, and start a new one. I could have left it out and just used a comma, so my name and my company would all appear on the one line, but I just decided that I wanted my company name on a separate line. Some publishers' document classes provide a special \affiliation command to put your company or institution name in instead. To give several authors in the built-in classes, separate them inside the \author command with \and; the double backslash only breaks lines within one author's entry.
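The variations mentioned above can be sketched together in one file. The second author and institution are made-up placeholders, and the empty \date{} is a standard way of printing no date at all rather than today's date.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

% The order of these three commands does not matter.
\date{}  % empty braces: no date is printed
\author{Peter Flynn, Silmaril Consultants  % comma: name and company on one line
   \and A. N. Other\\Example Institute}    % hypothetical second author
\title{Practical Typesetting}
\maketitle  % must come after the other three

\end{document}
```

In the built-in classes, \and starts a separate author block, while \\ breaks a line inside one block.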

The document isn't really ready for printing like this, but if you're really impatient, look at Chapter 4 to see how to typeset and display it.

3.4 Abstracts and summaries

In reports and articles it is normal for the author to provide a Summary or Abstract, in which you describe briefly what you have written about and explain its importance. Abstracts in articles are usually only a few paragraphs long. Summaries in reports or theses can run to several pages, depending on the length and complexity of the document or the readership it's aimed at.

In both cases the Abstract or Summary is optional (that is, LATEX doesn't force you to have one), but it's rare to omit it because readers want and expect it. In practice, of course, you go back and type the Abstract or Summary after having written the rest of the document, but for the sake of the example we'll jump the gun and type it now.

\documentclass[11pt,a4paper,oneside]{report}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
\renewcommand{\abstractname}{Summary}
\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2009}
\maketitle

\begin{abstract}
This document presents the basic concepts of 
typesetting in a form usable by non-specialists. It 
is aimed at those who find themselves (willingly or 
unwillingly) asked to undertake work previously sent 
out to a professional printer, and who are concerned 
that the quality of work (and thus their corporate 
æsthetic) does not suffer.
\end{abstract}

\end{document}

After the \maketitle you use the abstract environment, in which you simply type your Abstract or Summary, leaving a blank line between paragraphs if there's more than one (see section 3.6 for this convention).

In business and technical documents, the Abstract is often called a Management Summary, or Executive Summary, or Business Preview, or some similar phrase. LATEX lets you change the name associated with the abstract environment to any kind of title you want, using the \renewcommand command in your Preamble to give the command \abstractname a new value:

\renewcommand{\abstractname}{Key Points}

This does not change the name of the environment, only its printed title: you still use \begin{abstract} and \end{abstract}.

Notice how the name of the command you are renewing (\abstractname, in this case) goes in the first set of curly braces, and the new value you want it to have goes in the second set of curly braces (this is an example of a command with two arguments).
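The same two-argument pattern applies to the other fixed names that the report class prints. As a sketch, \contentsname and \chaptername are the standard commands holding the words ‘Contents’ and ‘Chapter’; the replacement wording here is only an illustration.

```latex
\documentclass[11pt,a4paper,oneside]{report}

% First argument: the command being redefined.
% Second argument: its new value.
\renewcommand{\abstractname}{Key Points}
\renewcommand{\contentsname}{Table of Contents}
\renewcommand{\chaptername}{Unit}

\begin{document}
\begin{abstract}
Placeholder summary text.
\end{abstract}
\end{document}
```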

If you look carefully at the example document, you'll see I sneakily added a few extra commands to the Preamble. We'll see later what these mean (Brownie points for working it out, though, if you read section 2.6).

  Exercise 4. Using an Abstract or Summary

  1. Add the \renewcommand as shown above to your Preamble.

    The Preamble is at the start of the document, in that gap after the \documentclass line but before the \begin{document} (remember I said we'd see what we left it blank for: see the panel ‘The Preamble’ in section 3.4).

  2. Add an abstract environment after the \maketitle and type in a paragraph or two of text.

  3. Save the file (no, I'm not paranoid, just careful).

3.5 Sections

In the body of your document, LATEX provides seven levels of division or sectioning for you to use in structuring your text. They are all optional: it is perfectly possible to write a document consisting solely of paragraphs of unstructured text. But even novels are normally divided into chapters, although short stories are often made up solely of paragraphs.

Chapters are only available in the book and report document classes, because they don't have any meaning in articles and letters. Parts are also undefined in letters.⁶

Depth  Division             Command          Notes
 -1    Part                 \part            Not in letters
  0    Chapter              \chapter         Books and reports
  1    Section              \section         Not in letters
  2    Subsection           \subsection      Not in letters
  3    Subsubsection        \subsubsection   Not in letters
  4    Titled paragraph     \paragraph       Not in letters
  5    Titled subparagraph  \subparagraph    Not in letters

In each case the title of the part, chapter, section, etc. goes in curly braces after the command. LATEX automatically calculates the correct numbering and prints the title in bold. You can turn section numbering off at a specific depth: details in section 3.5.1.

\section{New recruitment policies}
...
\subsection{Effect on staff turnover}
...
\chapter{Business plan 2010--2020}

There are packages to let you control the typeface, style, spacing, and appearance of section headings: it's much easier to use them than to try and reprogram the headings manually. Two of the most popular are section and sectsty.
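For instance, assuming the sectsty package is installed, its \allsectionsfont command applies one font change to every level of heading. The choice of sans-serif here is only an illustration.

```latex
\documentclass[11pt,a4paper,oneside]{report}
\usepackage{sectsty}
\allsectionsfont{\sffamily}  % set all headings in sans-serif

\begin{document}
\chapter{Business plan 2010--2020}
\section{New recruitment policies}
Placeholder text.
\end{document}
```

The headings themselves are typed exactly as before; only the Preamble changes.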

Headings also get put automatically into the Table of Contents, if you specify one (it's optional). But if you make manual styling changes to your heading, for example a very long title, or some special line-breaks or unusual font-play, this would appear in the Table of Contents as well, which you almost certainly don't want. LATEX allows you to give an optional extra version of the heading text which only gets used in the Table of Contents and any running heads, if they are in effect (see section 8.1.2). This optional alternative heading goes in [square brackets] before the curly braces:

\section[Effect on staff turnover]{An analysis of the effects 
of the revised corporate recruitment policies on staff 
turnover at divisional headquarters}

  Exercise 5. Start your document text

  1. Add a \chapter command after your Abstract or Summary, giving the title of your first chapter.

  2. If you're planning ahead, add a few more \chapter commands for subsequent chapters. Leave a few blank lines between them to make it easier to add paragraphs of text later.

  3. By now I shouldn't need to tell you what to do after making significant changes to your document file.

  6. It is arguable that chapters have no place in reports either, as these are conventionally divided into sections as the top-level division. LATEX assumes your reports have chapters, but this is only the default, and can be changed very simply (see section 9.6).

3.5.1 Section numbering

All document divisions get numbered automatically. Parts get Roman numerals (Part I, Part II, etc.); chapters and sections get decimal numbering like this document, and Appendixes (which are just a special case of chapters, and share the same structure) are lettered (A, B, C, etc.). You can easily change this default if you want some special scheme.

You can change the depth to which section numbering occurs, so you can turn it off selectively. In this document it is set to 3. If you only want parts, chapters, and sections numbered, not subsections or subsubsections etc., you can change the value of the secnumdepth counter using the \setcounter command, giving the depth value from the table in section 3.5:

\setcounter{secnumdepth}{1}

A related counter is tocdepth, which specifies what depth to take the Table of Contents to. It can be reset in exactly the same way as secnumdepth. The current setting for this document is 2.

\setcounter{tocdepth}{3}

To get a one-time (special case) unnumbered section heading which does not go into the Table of Contents, follow the command name with an asterisk before the opening curly brace:

\subsection*{Shopping List}

All the divisional commands from \part* to \subparagraph* have this ‘starred’ version which can be used in isolated circumstances for an unnumbered heading when the setting of secnumdepth would normally mean it would be numbered.
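Putting the two counters and the starred form together, a minimal sketch of a report might look like this. The depth values are chosen only for illustration, and the headings are borrowed from the earlier examples.

```latex
\documentclass[11pt,a4paper,oneside]{report}
\setcounter{secnumdepth}{1}  % number parts, chapters, and sections only
\setcounter{tocdepth}{1}     % list the same levels in the Table of Contents

\begin{document}
\tableofcontents

\chapter{Business plan 2010--2020}       % numbered
\section{New recruitment policies}       % numbered
\subsection{Effect on staff turnover}    % below secnumdepth: no number
\subsection*{Shopping List}              % starred: no number, not in the ToC
Placeholder text.
\end{document}
```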

3.6 Ordinary paragraphs

After section headings comes your text. Just type it and leave a blank line between paragraphs. That's all LATEX needs.

The blank line means ‘start a new paragraph here’: it does not (repeat: not) necessarily mean you get a blank line in the typeset output. Now read this paragraph again and again until that sinks in.

The spacing between paragraphs is an independently definable quantity, a dimension or length called \parskip. This is normally zero (no space between paragraphs, because that's how books are normally typeset), but you can easily set it to any size you want with the \setlength command in the Preamble:

\setlength{\parskip}{1cm}

This will set the space between paragraphs to 1cm. See section 2.7.1 for details of the various size units LATEX can use. Leaving multiple blank lines between paragraphs in your source document achieves nothing: all extra blank lines get ignored by LATEX because the space between paragraphs is controlled only by the value of \parskip.

White-space in LATEX can also be made flexible (what Leslie Lamport calls ‘rubber’ lengths). This means that values such as \parskip can have a default dimension plus an amount of expansion minus an amount of contraction. This is useful on pages in complex documents where not every page may be an exact number of fixed-height lines long, so some give-and-take in vertical space is useful. You specify this in a \setlength command like this:

\setlength{\parskip}{1cm plus4mm minus3mm}

Paragraph indentation can also be set with the \setlength command, although you would always make it a fixed size, never a flexible one, otherwise you would have very ragged-looking paragraphs.

\setlength{\parindent}{6mm}

By default, the first paragraph after a heading follows the standard Anglo-American publishers' practice of no indentation. Subsequent paragraphs are indented by the value of \parindent (default 18pt).⁷ You can change this in the same way as any other length.
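If you want to check which values are actually in force for your own class and type size, \the followed by the name of a length typesets its current value. A minimal sketch:

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
The paragraph indentation here is \the\parindent, and the
space between paragraphs is \the\parskip.
\end{document}
```

The values are printed in points, whatever units were used to set them.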

In the printed copy of this document, the paragraph indentation is set to 10pt and the space between paragraphs is set to 0pt. These values do not apply in the Web (HTML) version because not all browsers are capable of that fine a level of control, and because users can apply their own stylesheets regardless of what this document proposes.

  Exercise 6. Start typing!

  1. Type some paragraphs of text. Leave a blank line between each. Don't bother about line-wrapping or formatting — LATEX will take care of all that.

  2. If you're feeling adventurous, add a \section command with the title of a section within your first chapter, and continue typing paragraphs of text below that.

  3. Add one or more \setlength commands to your Preamble if you want to experiment with changing paragraph spacing and indentation.

To turn off indentation completely, set it to zero (but you still have to provide units: it's still a measure!).

\setlength{\parindent}{0in}

If you do this, though, and leave \parskip set to zero, your readers won't be able to tell easily where each paragraph begins! If you want to use the style of having no indentation with a space between paragraphs, use the parskip package, which does it for you (and makes adjustments to the spacing of lists and other structures which use paragraph spacing, so they don't get too far apart).
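A minimal sketch of the parskip approach, assuming the package is installed, needs only one line in the Preamble:

```latex
\documentclass[11pt,a4paper,oneside]{report}
\usepackage{parskip}  % no indentation; vertical space between paragraphs

\begin{document}
First paragraph of placeholder text.

Second paragraph, set off by space rather than by indentation.
\end{document}
```

With the package loaded there is no need to set \parindent or \parskip yourself.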

  7. Paragraph spacing and indentation are cultural settings. If you are typesetting in a language other than English, you should use the babel package, which alters many things, including the spacing and the naming of sections, to conform with the standards of different countries and languages.

3.7 Table of contents

All auto-numbered headings get entered in the Table of Contents (ToC) automatically. You don't have to print a ToC, but if you want to, just add the command \tableofcontents at the point where you want it printed (usually after the Abstract or Summary).

Entries for the ToC are recorded each time you process your document, and reproduced the next time you process it, so you need to re-run LATEX one extra time to ensure that all ToC page-number references are correctly resolved.

The commands \listoffigures and \listoftables work in exactly the same way as \tableofcontents to automatically list all your tables and figures. If you use them, they normally go after the \tableofcontents command.
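The usual order of these commands at the front of a report can be sketched as follows, reusing the titling example from section 3.3. The chapter and body text are placeholders.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2009}
\maketitle

\begin{abstract}
Placeholder summary text.
\end{abstract}

\tableofcontents
\listoffigures   % has no entries until the document contains figures
\listoftables    % has no entries until the document contains tables

\chapter{Business plan 2010--2020}
Placeholder text.

\end{document}
```

Process the file twice so that the entries recorded on the first run appear in the lists on the second.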

We've already seen in section 3.5 how to use the optional argument to the sectioning commands to add text to the ToC which is slightly different from the one printed in the body of the document. It is also possible to add extra lines to the ToC, to force extra or unnumbered section headings to be included.

  Exercise 7. Inserting the table of contents

  1. Go back and add a \tableofcontents command after the \end{abstract} command in your document.

  2. You guessed.

The \tableofcontents command normally shows only numbered section headings, and only down to the level defined by the tocdepth counter (see section 3.5.1), but you can add extra entries with the \addcontentsline command. For example if you use an unnumbered section heading command to start a preliminary piece of text like a Foreword or Preface, you can write:

\subsection*{Preface}
\addcontentsline{toc}{subsection}{Preface}

This will format an unnumbered ToC entry for ‘Preface’ in the ‘subsection’ style. You can use the same mechanism to add lines to the List of Figures or List of Tables by substituting lof or lot for toc.
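In a report or book, a Preface is more often an unnumbered chapter than an unnumbered subsection. The same pair of commands works at that level; this is only a sketch, with placeholder text:

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
\tableofcontents

\chapter*{Preface}                        % unnumbered chapter heading
\addcontentsline{toc}{chapter}{Preface}   % add it to the ToC by hand
Placeholder text for the preface.

\chapter{Business plan 2010--2020}
Placeholder text.
\end{document}
```

The second argument of \addcontentsline should name the same level as the heading it follows, so that the entry is formatted to match.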

There is also a command \addtocontents which lets you add any LATEX commands to the ToC file. For example, to add a horizontal rule and a 6pt gap, you could say \addtocontents{toc}{\par\hrule\vspace{6pt}} at the place where you want it to occur. You should probably only use this command once you know what you are doing.