
Formatting Information — An introduction to typesetting with LATEX

Chapter 5: Textual tools

Section 5.3: References and citations

This is one of the most powerful features of LATEX. As we mentioned when discussing Figures and Tables, you can label any point in a document with a name you make up, so that you can refer to it by that name from anywhere else in the document (or even from another document) and LATEX will always work out the right cross-reference number for you, no matter how much you edit the text or move it around.

As we will see later, a similar method is also used to cite documents for a bibliography or list of references, and there are packages to sort and format these in the correct style for different journals or publishers.

5.3.1 Cross-references

You label the place in your document you want to refer to by adding the command \label followed by a short name you make up, in curly braces: exactly as we did for labelling Figures and Tables in § 4.2.2.

\section{New Research}
\label{newstuff}
        

You can then refer to this place from anywhere in the same document with the command \ref followed by the name you used, eg

In section~\ref{newstuff} there is a list of recent 
projects.
	  

In section 13.5 there is a list of recent projects.

  • Note the use of the unbreakable space (~) between the \ref and the word before it. This prints a space but prevents the line ever breaking at that point, should it fall close to the end of a line when being typeset.

  • The \S command can be used if you want the section sign § instead of the word ‘section’ (there is also a \P command that produces the paragraph sign or pilcrow, ¶).
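As a minimal sketch of the two points above, the following complete document uses \S with an unbreakable space before \ref. The second section title and the filler sentence are invented for the illustration; the label newstuff is the one used earlier.

```latex
\documentclass{article}
\begin{document}

\section{New Research}
\label{newstuff}
Some text about the new research.

\section{Discussion}% hypothetical second section, only here to hold the reference
% \S prints the section sign; the ~ stops a line break before the number
In \S~\ref{newstuff} there is a list of recent projects.

\end{document}
```

The document needs to be processed twice before the number appears in place of the question marks.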

Labels MUST be unique (that is, each value MUST occur only once as a label within a single document), but you can have as many references to them as you like. If you are familiar with HTML, this is the same concept as the internal linking mechanism using # labels (or IDs in XHTML or HTML5).

Labels in normal text

If the label is in normal text, as above, the reference will give the current chapter/section/subsection number (depending on the current document class).

Labels in Tables or Figures

If the label is inside a Table or Figure, the reference provides the Table number or Figure number prefixed by the chapter number (remember that in Tables and Figures the \label command MUST come after the \caption command).

The \ref command does not produce the word ‘Figure’ or ‘Table’ for you: you have to type it yourself, or use the varioref package which automates it.

Labels in lists

A label in an item in an enumerated list will provide the item number. In other lists its value is null or undefined.

Labels elsewhere

If there is no apparent countable structure at the point in the document where you put the label (in a bulleted list, for example), the reference will be null or undefined.
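The cases above can be sketched in one small document. The label names step:collect and fig:placeholder, the list items, and the framed placeholder standing in for an image are all assumptions made up for this illustration.

```latex
\documentclass{article}
\begin{document}

\section{New Research}
\label{newstuff}% label in normal text: \ref gives the section number

\begin{enumerate}
  \item Collect the samples.\label{step:collect}% label in an enumerated item: \ref gives the item number
  \item Analyse them.
\end{enumerate}

\begin{figure}
  \centering
  \fbox{Placeholder for an image}
  \caption{A placeholder figure}
  \label{fig:placeholder}% after \caption, so \ref gives the figure number
\end{figure}

% The words `Section', `step' and `Figure' have to be typed by hand
Section~\ref{newstuff} describes the method, which starts at
step~\ref{step:collect} and is summarised in Figure~\ref{fig:placeholder}.

\end{document}
```

Moving the \label in the figure to before the \caption would make the reference pick up the section number instead, which is why the order matters.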

The command \pageref followed by any of your label values will provide the page number where the label occurred, instead of the reference number, regardless of the document structure. This makes it possible to refer to something by page number as well as by its \ref number, which is useful in very long documents like this one (varioref automates this too).
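A short sketch of a page reference follows, first written by hand with \pageref and then with the \vref command from varioref, which combines the two. The second section and its wording are invented for the example, and the \newpage is only there to force the reference onto a different page from the label.

```latex
\documentclass{article}
\usepackage{varioref}
\begin{document}

\section{New Research}
\label{newstuff}
Some text about the new research.

\newpage

\section{Summary}% hypothetical section on a later page
% By hand: section number from \ref, page number from \pageref
The projects are listed in section~\ref{newstuff} on page~\pageref{newstuff}.

% With varioref: \vref supplies the number and a suitable page reference
The projects are listed in section~\vref{newstuff}.

\end{document}
```

When the label and the reference fall on the same page, \vref leaves out the page reference.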

LATEX records the label values each time the document is processed, so the updated values will get used the next time the document is processed. You therefore need to process the document one extra time before final printing or viewing, if you have changed or added references, to make sure the values are correctly resolved. Most LATEX editors handle this automatically by typesetting the document twice when needed.

Unresolved references are printed as two question marks, and they also cause a warning message at the end of the log file. There’s never any harm in having \labels you don’t refer to, but using \ref when you don’t have a matching \label is an error, as is defining two labels with the same value.

5.3.2 Bibliographic references

The mechanism used for references to reading lists and bibliographies is very similar to that used for normal cross-references. Instead of using \ref you use \cite or one of the variants explained in § 5.3.2.3; and instead of \label, you attach a label value to each of the reference entries for the books, articles, reports, etc that you want to cite. You keep these reference entries in a bibliographic reference database that uses the BIBTEX data format (see § 5.3.2.2).

This does away with the time needed to maintain and format references each time you cite them, and dramatically improves accuracy. It means you only ever have to enter the bibliographic details of your references once, and you can then cite them in any document you write, and the ones you cite will get formatted automatically to the style you specify (eg Harvard, Oxford, IEEE, Vancouver, MLA, APA, etc).

5.3.2.1 Choosing between BIBTEX and biblatex

LATEX has two systems for doing citations and references, BIBTEX (old) and biblatex (new): both of them use the same file format, also called BIBTEX. Both support the numeric and abbreviated alphabetic style formats built into LATEX plus a very wide range of others.

BIBTEX

the older BIBTEX has been in use for several decades and is still specified in many older document classes, especially journal article and book publishing formats. While it will continue to work, it has several drawbacks:

  1. it doesn’t handle non-ASCII characters easily;

  2. nor does the sort-and-extract program it uses (also called bibtex);

  3. the style format files (.bst files) are written in its own rather strange and unique language, making it extremely difficult to modify them or write new ones;

  4. many of the style format files are old and out of date;

  5. the range of data fields in references is limited and also out of date.

biblatex

the newer biblatex system is now a well-established LATEX package to replace almost all of BIBTEX. The main advantages are:

  1. it works with UTF-8 characters, so non-ASCII, non-Latin, and other writing systems are handled natively, especially when using XƎLATEX.

  2. there is a new sort-and-extract program (biber) to replace bibtex, which also handles UTF-8 natively.

  3. the style format files are written entirely in LATEX syntax, and under active development, so updating or writing layout formats is much easier than with BIBTEX.

  4. it extends the number and type of data fields that you can use in references.

  5. it supports the popular author–year citation format natively, without the need for additional packages.

The only current drawback is that not all of the citation and reference formats supported in BIBTEX are yet available in biblatex, but the range is still wide, and modifications are usually easy to make.

The biblatex package with the biber program is therefore recommended, especially with XƎLATEX. As mentioned, a few classes and packages have not yet been rewritten and may still require BIBTEX style formats, and some less common style formats are not available for biblatex. However, the flexibility of the biblatex data model means altering or extending style formats to create your own is much easier than it is for BIBTEX, although this is not a task for the beginner.

Cheatsheet

Clea F Rees has written an excellent cheatsheet with virtually everything on it that you need for quick reference to using biblatex. This is downloadable as the package biblatex-cheatsheet from CTAN.

5.3.2.2 The BIBTEX file format

The format for BIBTEX files is used for both BIBTEX and biblatex, using either biber or bibtex. The file format is specified in the original BIBTEX documentation (look on your system for the file btxdoc.pdf). The biblatex package and its updated style formats provide many more fields and document types.

Each BIBTEX entry starts with an @ sign followed by the name of the type of document, followed by the whole entry in a single set of curly braces. The first value is the unique key (label) that you make up, followed by a comma:

@book{fg,
...
}
	  

Then comes each field (in any order), using the format:

fieldname = {value},
          

There MUST be a comma after each line of an entry except the last line:

@book{fg,
   title        = {{An Innkeeper's Diary}},
   author       = {John Fothergill},
   edition      = {3rd},
   publisher    = {Penguin},
   year         = 1929,
   address      = {London}
}
	  

Some TEX-sensitive editors have a BIBTEX mode which understands these entries and provides menus or templates for writing them. The rules are:

  • There MUST be a comma after each line of an entry except the last line;

  • There MUST NOT be a comma after the last field in the entry (eg after {London} in the example);

  • Some styles recapitalise the title when they format: to prevent this, enclose the title in double curly braces as in the example;

  • Also use extra curly braces to enclose multi-word surnames, otherwise only the last word will be used in the sort, and the others will be assumed to be forenames. For example, the British explorer Ranulph Twisleton Wykeham Fiennes can be sorted under T with author = {Ranulph {Twisleton Wykeham Fiennes}};

  • Multiple authors go in the one author field, separated by the word and (see example below)

  • Values which are purely numeric (eg years) may omit the curly braces;

  • Fields can occur in any order but the format must otherwise be strictly observed;

  • Fields which are not used do not have to be included (so if your editor automatically inserts them as blank or prefixed by OPT [optional], you can safely delete them as unused lines)

There is a required minimum set of fields for each of a dozen or so types of document: book, article (in a journal), article (in a collection), chapter (in a book), thesis, report, conference paper (in a Proceedings), etc, exactly as with all other reference management systems. These are all (entry types and entry fields) listed in detail in the biblatex documentation (Lehman, Kime, Boruvka & Wright, 2015, sections 2.1 & 2.2 p.8).

Here’s another example, this time for a book on how to write mathematics — note the multiple authors separated by and.

@book{mathwrite,
  author    = {Donald E Knuth and Tracy Larrabee and Paul M Roberts},
  title     = {{Mathematical Writing}},
  publisher = {Mathematical Association of America},
  address   = {Washington, DC},
  series    = {MAA Notes 14},
  isbn      = {0-88385-063-X},
  year      = {1989}
}
	  

Every reference in your reference database MUST have a unique key value (label or ID): you can make this up, just like you do with normal cross-references, but some bibliographic software automatically assigns a value, usually based on an abbreviation of the author and year. These keys are for your convenience in referencing: in normal circumstances your readers will not see them. You can see these labels in the right-hand-most column and at the bottom of the screenshot in Figure 5.1, and in the examples above. You use this label in your documents when you cite your references (see § 5.3.2.3).

There are many built-in options to the biblatex package for adjusting the citation and reference formats, only a few of which are covered here. Read the package documentation for details: it is possible to construct your own style simply by adjusting the settings, with no programming required (unlike the older BIBTEX styles, which are written in a programming language used nowhere else).

Many users keep their BIBTEX files in the same directory as their document[s], but it is also possible to tell LATEX and BIBTEX that they are in a different directory. This is a directory specified by the $BIBINPUTS shell or environment variable. On Unix & GNU/Linux systems (including Apple Macintosh OS X), and in TEX Live for Windows, this is your TEX installation’s texmf/bibtex/bib directory — the same one that old-style BIBTEX .bst style files are kept in — but you should use your personal TEX directory and create a subdirectory of the same name in there for your own .bib files. MiKTEX also uses the same $BIBINPUTS variable, but it is not set on installation: you need to set it using the Windows Systems Settings (see for example https://www.computerhope.com/issues/ch000549.htm).

5.3.2.3 Citation commands

The basic command is \cite, followed by the label of the entry in curly braces. You can cite several entries in one command: separate the labels with commas.

\cite{fg}
\cite{bull,davy,heller}
	  

For documents with many citations, use the Cite button or menu item in your bibliographic reference manager, which will insert the relevant command for you (you can see it activated for the TEXStudio editor in Figure 5.1).

How the citation appears is governed by two things:

  1. the reference format (style) you specify in the options to the biblatex package (see § 5.3.2.4) or in a \bibliographystyle command if you are using BIBTEX (see § 5.3.2.5) instead of biblatex;

  2. the type of citation command you use: \cite, \textcite, \parencite, \autocite, \footcite, etc, as shown below.

There are three built-in formats in biblatex:

1. authoryear

There are two basic types of author–year citation:

a. year in parentheses

used in phrases or sentences where the name of the author is part of the sentence, and the year is there to identify what is being cited; in biblatex this command is \textcite{fg}

...as has clearly been shown by Fothergill (1929).

This is sometimes called ‘author-as-noun’ citation.

b. whole citation in parentheses

used where the phrase or sentence is already complete, and the citation is being added in support: in biblatex this command is \parencite{fg}

...as we have already clearly shown (Fothergill, 1929).

Note that author–year format is not built into BIBTEX but can be done with the use of the natbib package and others.

2. numeric

This format is popular in some scientific disciplines and \cite produces just a number in square brackets [42]. The references at the end of the document may be numbered in order of reference or sorted by author.

3. alphabetic

This format is also popular in some scientific disciplines and \cite produces a three- or four-letter abbreviation of the author’s name and two digits of the year, all in square brackets [Fot29]. The references at the end of the document are listed with the abbreviated key value as their label. This format is also called ‘abbreviated’.

To direct your reader to a specific page or chapter, you can add a prefix and/or a suffix as optional arguments in square brackets before the label.

...as shown by \textcite[p 12]{mathwrite}.
	  

A prefix gets printed at the start of the citation and the suffix gets printed at the end, but all still within the parentheses, if any. As they are both optional arguments, and as suffixes are far more common than prefixes, when only one optional argument is given, it is assumed to be the suffix. The example here therefore produces:

...as shown by Knuth, Larrabee & Roberts (1989, p 12).
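When both optional arguments are given to a biblatex citation command, the first is the prefix and the second is the suffix. To give a prefix on its own, the second argument is left empty. The fragment below is a sketch that assumes a document already set up for biblatex with the mathwrite entry available; the word ‘see’ is just an illustrative prefix.

```latex
% Prefix and suffix: first optional argument, then second
...as others have noted \parencite[see][p 12]{mathwrite}.

% Prefix only: the second optional argument is present but empty
...as others have noted \parencite[see][]{mathwrite}.

% One optional argument only: it is taken as the suffix
...as shown by \textcite[p 12]{mathwrite}.
```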

If you are using bibtex instead of biblatex, you can only specify a suffix.

Footnoted citations are common in History and related disciplines, to the extent that scholars in these fields actually call their references ‘footnotes’, which is confusing to others. The command \footcite does these (see Table 5.1) but it is only relevant for author-year styles (in numeric style it just produces the number, which would be misleading).

There are many variant forms of the citation commands, either for specific styles like Harvard, IEEE, APA, MLA, etc; or for grammatical modifications like capitalising name prefixes, omitting the comma between name and year, or adding multiple notes; or for extracting specific fields from an entry (eg \titlecite). If you have requirements not met by the formats described here, you can find them in the documentation for the biblatex package.

Modern Language Association (MLA) citation is a special case, as it omits the year and instead REQUIRES the location of the citation within the document (eg the chapter, section, page, or line). It may include the title, if there would otherwise be ambiguity. The biblatex format for MLA citation handles the context-dependent formatting with the command \autocite; for BIBTEX there is an old version of MLA implemented in the mla and hum2 packages.

Table 5.1: Built-in citation style commands and formats (biblatex)

Style        Command          Result
authoryear   \parencite{fg}   (Fothergill, 1929)
authoryear   \textcite{fg}    Fothergill (1929)
authoryear   \footcite{fg}    ¹
numeric      \cite{fg}        [42]
alphabetic   \cite{fg}        [Fot29]
authoryear   \cite{fg}        Fothergill 1929

¹ Fothergill 1929.

If you are using BIBTEX instead of biblatex, the commands you can use are not standardised except for \cite. Instead, they depend on which style format you use; for example the popular natbib package, which implements author-year citation for the natural sciences, uses \citet and \citep instead of \textcite and \parencite.

Figure 5.1: JabRef displaying a file of references, ready to insert a citation of Fothergill’s book into a LATEX document being edited with TEXStudio


Your reference management software will have a display something like Figure 5.1 (details vary between systems, but they all do roughly the same job in roughly the same way), showing all your references with the data in the usual fields (title, author, date, etc).

Your BIBTEX file, which contains all your bibliographic data, can be saved or exported as a .bib file from most reference management software (JabRef uses this format natively). It looks like the examples in § 5.3.2.2. Your .bib file works with both biblatex and BIBTEX, but biblatex provides more field types and document types so that your references can be formatted more accurately.

If your bibliographic management software doesn’t save BIBTEX format directly, save your data in RIS format, then import the .ris file into JabRef and save it as a .bib file from there.

5.3.2.4 Setting up biblatex with biber

You set up your document with the following packages:

  1. the babel package with appropriate languages, even if you are only using one language. The default language is American English, so there are commands to map this to other language variants (the example below shows this for British English)

  2. the csquotes package, which automates the use of quotation marks around titles or not, depending on the type of reference;

  3. the biblatex package itself, specifying the biber program and the style of references you want, either numeric, alphabetic, or authoryear; or a publisher’s style (in this example I am using APA format); and any options for handling links like DOIs, URIs, and ISBNs;

  4. the language mapping command, if needed (see the documentation for the style you have chosen to find out if you need this)

  5. finally, the name of your BIBTEX file[s] (see the sidebar ‘Bibliographic reference databases’) with one or more \addbibresource commands.

\usepackage[frenchb,german,british]{babel}
\usepackage{csquotes}
\usepackage[backend=biber,doi=true,isbn=true,
  url=true,style=apa]{biblatex}
\DeclareLanguageMapping{british}{british-apa}
\addbibresource{myrefs.bib}
	  

At the end of your document you need to add the \printbibliography command (or elsewhere at the point in your document where you want the full list of references you have cited to be printed). See § 5.3.2.6 for details of how XƎLATEX produces the references.
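Putting the pieces together, here is a minimal sketch of a complete document. It makes several assumptions for the sake of being self-contained: it uses the built-in authoryear style instead of APA (so no language mapping is needed), it loads only British English, and it writes the Fothergill entry from § 5.3.2.2 into myrefs.bib with a filecontents environment, which you would not normally do if you already keep a separate .bib file.

```latex
% Write the example entry to myrefs.bib (only if that file does not already exist)
\begin{filecontents}{myrefs.bib}
@book{fg,
   title        = {{An Innkeeper's Diary}},
   author       = {John Fothergill},
   edition      = {3rd},
   publisher    = {Penguin},
   year         = 1929,
   address      = {London}
}
\end{filecontents}

\documentclass{article}
\usepackage[british]{babel}
\usepackage{csquotes}
% Built-in author-year style, processed with biber
\usepackage[backend=biber,style=authoryear]{biblatex}
\addbibresource{myrefs.bib}

\begin{document}

...as has clearly been shown by \textcite{fg}.

...as we have already clearly shown \parencite{fg}.

% The full list of cited references is printed here
\printbibliography

\end{document}
```

Changing style=authoryear to style=numeric or style=alphabetic switches the whole document to one of the other built-in formats without touching the text.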

Versions of biber and biblatex

One critically important point to note is that biblatex and biber are step-versioned; that is, each version of the biblatex package only works with a specific version of the biber program. There is a table of these dependencies in the biblatex documentation PDF. If you manually update biblatex for some reason (perhaps to make use of a new feature), you MUST also update your copy of biber to the correct version, and vice versa, otherwise you will not be able to produce a bibliography.

5.3.2.5 Using BIBTEX

The BIBTEX method is nowadays deprecated, partly because it does not handle UTF-8 character-encoding correctly, and partly because its unusual internal programming language makes it hard to extend.

The principles underlying BIBTEX are identical to biblatex: you create and maintain your BIBTEX file of references in exactly the same way, and you use the \cite command in the same way. But there is no basic package to load with options: instead, you specify the name of the style you want, using a \bibliographystyle command in your Preamble, plus any additional packages needed to help with the formatting. You then use the \bibliography command to give the name of your BIBTEX file (without the .bib extension) at the point you want the references to be printed.

\bibliographystyle{apsr}
\usepackage{natbib,har2nat}
...
\bibliography{myrefs}
	  

In this example the American Political Science Review (APSR) variant of the Harvard reference style has been selected (common in the political and economic sciences), which normally means numerical citation. The author, however, wants the Natural Sciences (author-year) form of citation. This is provided by the natbib package, which in turn requires the har2nat package to handle Harvard-style formatting. As explained above, the natbib package provides \citet for textual citation like Fothergill (1929), and \citep for parenthetic citation like (Fothergill, 1929).

There is an option on all the cite commands and variants to let you specify a suffix to the citation, so \citep[Foreword, p.13]{fg} produces (Fothergill, 1929, Foreword, p.13).
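For comparison, here is a minimal sketch of a complete BIBTEX-based document. Note that it swaps the apsr style and har2nat package of the example above for plainnat, a style distributed with natbib, purely to keep the sketch small. It assumes a file myrefs.bib containing the fg entry is in the same directory.

```latex
\documentclass{article}
\usepackage{natbib}
% plainnat is a substitute chosen for this sketch, not the apsr style discussed above
\bibliographystyle{plainnat}

\begin{document}

% Textual citation
...as has clearly been shown by \citet{fg}.

% Parenthetic citation with a suffix
...as we have already clearly shown \citep[Foreword, p.13]{fg}.

% File name given without the .bib extension; the references are printed here
\bibliography{myrefs}

\end{document}
```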

See § 5.3.2.6 for details of how PDFLATEX produces the references.

5.3.2.6 Producing the references

Because of the record⟶extract⟶format process (the same as used for cross-references), you will get a warning message about ‘unresolved references’ the first time you process your document after adding a new citation for a previously uncited work and running biber or bibtex. The bibtex program produces a bold ?? where the unresolved reference will be; biber produces the entry label in bold instead. This will disappear once LATEX has been run again, which is why most editors have a Build function to do the job for you.

Your LATEX editor’s Typeset or Build button or menu entry should therefore handle the business of running biber or bibtex for you. If not, here’s how to do it manually in a Command window:

For XƎLATEX with biber

Run XƎLATEX, then run biber to extract and sort the details from the BIBTEX file, and then run XƎLATEX again:

xelatex myreport
biber myreport
xelatex myreport
		
For PDFLATEX with bibtex

Run PDFLATEX, then run bibtex to extract and sort the details from the BIBTEX file, and then run PDFLATEX again twice (to resolve the references):

pdflatex myreport
bibtex myreport
pdflatex myreport
pdflatex myreport
		

In practice, authors tend to retypeset their documents from time to time during writing anyway, so they can keep an eye on the typographic progress of the document. Just click the Typeset or Build button after adding a new \cite command, and subsequent runs of LATEX will incrementally incorporate all references without you having to worry about it.

If you work from the command line, the latexmk script automates this, running bibtex or biber and re-running LATEX when needed.

  1. Be aware that in some disciplines where cross-references are not much used, the word ‘references’ may be used to mean ‘bibliographic references’. 

  2. This section is labelled normalxref, for example. 

  3. Thus I can refer here to the label at the start of this section as \ref{normalxref} and get the value ‘§ 5.3.1’.  

  4. It’s not clear how they refer to conventional footnotes, or if they even use them.