If you are planning on representing bird taxonomy using a relational database, flat files are a universal format accepted by all the major database systems. In a flat file, each record consists of a sequence of fixed-length fields.
When you run nomcompile3, it writes three files:
- The tree file represents the entire taxonomic tree, including subspecific forms from the alternate forms file. Its name is the same as that of the input files, except that it has the extension .tre. For example, if the input files are aou640.std and aou640.alt, the tree file will be called aou640.tre.
- The abbreviations file defines all the six-letter bird codes, and documents the English name from which each code was derived. This file has the extension .ab6.
- The collisions file describes every six-letter bird code that is invalid because two or more names would all abbreviate to that code. Its extension is .col.
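As a small illustration of the naming rule, the sketch below derives the three product file names from an input file name. The function name `product_file_names` is hypothetical and not part of nomcompile3; only the three extensions come from the text above.

```python
from pathlib import Path


def product_file_names(input_file: str) -> dict[str, str]:
    """Return the expected product file names for a given input file.

    Hypothetical helper: it only swaps the extension, as described above.
    """
    base = Path(input_file)
    return {
        "tree": str(base.with_suffix(".tre")),
        "abbreviations": str(base.with_suffix(".ab6")),
        "collisions": str(base.with_suffix(".col")),
    }


names = product_file_names("aou640.std")
print(names["tree"])           # aou640.tre
print(names["abbreviations"])  # aou640.ab6
print(names["collisions"])     # aou640.col
```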
The sections below describe the formats of these product files.
The tree file defines all the different scientific names used in the input. Here is the format of that file:
| Length | Contents |
|---|---|
| varies | The taxonomic key number. The exact format of this field depends on the content of the ranks file; see Section 7.1.1, “Taxonomic key numbers”. |
| 6 | If this taxon has a standard six-letter bird code, that code appears here; otherwise the field is blank. |
| 1 | For generally accepted forms, this field is blank. If the form is not in the main AOU Check-List, a question mark (?) appears here. |
| 36 | The next field is the scientific name of the group to which this form is referred, for example, Junco hyemalis. The field is aligned flush left and padded on the right with spaces. For forms not identified to species, the smallest containing taxon is used, e.g., Aves for “bird sp.” For subspecific forms defined in the alternate names file, this field contains the scientific name with a space and an integer appended. For example, in the line for the standard species Snow Goose, this line will have the value “ |
| 56 | The English name of the form appears next, aligned flush left and right-padded with spaces. For multi-word names, the generic part comes first, followed by a comma, one space, and the specific part. No underbar (“ Examples: Dunlin; Loon, Red-throated; grebe sp.; bird sp.; bird, large sp.; teal, Blue-winged x Cinnamon; Junco, (Gray-headed x Slate-colored) Dark-Eyed |
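The record layout in the table can be read with plain string slicing. The following is a minimal sketch, assuming the fields are concatenated in table order with no separators between them, and that the key is 11 digits wide (the width implied by a 2-2-1-2-2-2 ranks file). The names `TreeRecord` and `parse_tree_record`, and the sample record itself, are made up for illustration.

```python
from typing import NamedTuple


class TreeRecord(NamedTuple):
    key: str            # taxonomic key number (width depends on the ranks file)
    code: str           # six-letter bird code, or "" if blank
    questionable: bool  # True if the status field holds "?"
    scientific: str     # scientific name, padding removed
    english: str        # inverted English name, padding removed


def parse_tree_record(line: str, key_width: int) -> TreeRecord:
    """Slice one tree-file line into its five fields.

    Assumes no separator characters between fields.
    """
    line = line.rstrip("\n")
    fields = []
    pos = 0
    for width in (key_width, 6, 1, 36, 56):
        fields.append(line[pos:pos + width])
        pos += width
    key, code, status, scientific, english = fields
    return TreeRecord(key, code.strip(), status == "?",
                      scientific.rstrip(), english.rstrip())


# Illustrative record built by hand, not copied from a real .tre file.
sample = ("21243470100" + "daejun" + " "
          + "Junco hyemalis".ljust(36)
          + "Junco, Dark-eyed".ljust(56))
rec = parse_tree_record(sample, key_width=11)
print(rec.code, "|", rec.scientific, "|", rec.english)
# daejun | Junco hyemalis | Junco, Dark-eyed
```

If your ranks file uses different digit counts, pass the sum of those counts as `key_width`.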
The taxonomic key number can be used to sort records
into phylogenetic order, as defined by the AOU Check-List. It
contains one or more digits for each rank (except for
the root rank). The number of digits for each rank is
determined by the third column in the ranks file.
It is an extremely bad idea to use this number to represent a taxon for any purpose other than sorting. Not only is it spectacularly meaningless out of context, but any change to the input files will change all of the taxonomic key numbers.
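A minimal sketch of the intended use: sort on the key, but identify each taxon by something else. It assumes every key in one file has the same number of digits, so that plain string comparison matches numeric order. The sample keys and names are illustrative only.

```python
# (key, English name) pairs; sample values for illustration only.
records = [
    ("21243470101", "Junco, Gray-headed"),
    ("00000000000", "bird sp."),
    ("21243470100", "Junco, Dark-eyed"),
]

# Keys are equal-width digit strings, so lexicographic order is numeric order.
in_phylogenetic_order = sorted(records, key=lambda rec: rec[0])

for key, name in in_phylogenetic_order:
    print(name)
# bird sp.
# Junco, Dark-eyed
# Junco, Gray-headed
```

Anything stored long-term, such as a foreign key in another table, is better tied to the six-letter code or the name than to the key itself.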
For example, if your ranks file
looks like the example given above (2-digit order,
2-digit family, 1-digit subfamily, 2-digit genus, 2-digit
species, and 2-digit form), each taxonomic key number
would have these components:
- The two-digit serial number of the taxonomic order in which this form is placed, or “00” if the form is not placed into an order (e.g., “bird sp.”).
- The two-digit serial number of the taxonomic family within this order, or “00” for forms not placed within a specific family. Note that the sequence of families starts over at “01” again within each order.
- The one-digit serial number of the subfamily within the family, or “0” if the subfamily is unknown.
- The two-digit serial number of the genus within the family, or “00” if the genus is unknown.
- The two-digit serial number of the species within the genus, or “00” if the species is unknown.
- The two-digit serial number of the form within the species, or “00” if the form is unknown.
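The component layout above can be sketched as a small splitter. This assumes the 2-2-1-2-2-2 digit widths from the example ranks file; `RANK_WIDTHS` and `split_key` are hypothetical names, and the sample key is only an illustration.

```python
# Rank names and digit counts, matching the example ranks file described above.
RANK_WIDTHS = [
    ("order", 2), ("family", 2), ("subfamily", 1),
    ("genus", 2), ("species", 2), ("form", 2),
]


def split_key(key: str, ranks=RANK_WIDTHS) -> dict[str, int]:
    """Split a taxonomic key number into one serial number per rank."""
    expected = sum(width for _, width in ranks)
    if len(key) != expected or not key.isdigit():
        raise ValueError(f"expected {expected} digits, got {key!r}")
    parts = {}
    pos = 0
    for name, width in ranks:
        parts[name] = int(key[pos:pos + width])
        pos += width
    return parts


print(split_key("21243470100"))
# {'order': 21, 'family': 24, 'subfamily': 3, 'genus': 47, 'species': 1, 'form': 0}
```

A value of 0 in any component means that rank is unknown or not applicable for the form.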
For example, code daejun
(Dark-eyed Junco) might have a taxonomic key number of
“21 24 3 47 01 00”
(the spaces here are for clarity—they are not actually
present in the record). This key would mean that this
form is in the 21st order, and in the 24th family within
that order, the 3rd subfamily within that family, the
47th genus within that subfamily, and the first
species within that genus, and not in any known subform
of the species.
Other forms that are included within Dark-eyed Junco will
have keys “21 24 3 47 01 01”, “21 24 3 47 01
02”, and so on. Examples of such
forms include races such as Gray-headed Junco, hybrids
among the different races (e.g., “Gray-headed ×
Slate-colored Junco”), and obsolete names
(“Northern Junco”).
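Because all forms within a species share every digit except the final form field, they can be found by prefix matching. The sketch below assumes a two-digit form field at the end of the key; `forms_within` is a hypothetical helper, and the list of keys is invented for the example.

```python
def forms_within(species_key: str, keys: list[str], form_width: int = 2) -> list[str]:
    """Return the keys of forms that belong to the given species.

    Assumes the form number occupies the last `form_width` digits of the key.
    """
    prefix = species_key[:-form_width]
    return [k for k in keys if k.startswith(prefix) and k != species_key]


keys = ["21243470100", "21243470101", "21243470102", "21243470200"]
print(forms_within("21243470100", keys))
# ['21243470101', '21243470102']
```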
Note that the taxonomic key number can be used to deduce
relationships between form codes. For example, to find
out what genus a species is in, just construct a key
number that is the same as the species' key number, but
with its species number set to “00”. Continuing the example above,
suppose Gray-headed Junco has this key number:
21 24 3 47 01 01
Then we can deduce all the higher ranks by substituting zeroes in the appropriate fields:
| Key number | Meaning |
|---|---|
| 21 24 3 47 01 00 | The containing species, Junco hyemalis |
| 21 24 3 47 00 00 | The containing genus, Junco |
| 21 24 3 00 00 00 | The containing subfamily, Emberizinae |
| 21 24 0 00 00 00 | The containing family, Emberizidae |
| 21 00 0 00 00 00 | The containing order, Passeriformes |
| 00 00 0 00 00 00 | The containing class, Aves |
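The zero-substitution procedure can be sketched in a few lines. This assumes the 2-2-1-2-2-2 digit widths used in the example; `ancestor_keys` is a hypothetical name, not something nomcompile3 provides.

```python
# Digit counts per rank: order, family, subfamily, genus, species, form.
RANK_WIDTHS = [2, 2, 1, 2, 2, 2]


def ancestor_keys(key: str, widths=RANK_WIDTHS) -> list[str]:
    """List the keys of all containing taxa, from nearest to the root."""
    fields = []
    pos = 0
    for width in widths:
        fields.append(key[pos:pos + width])
        pos += width

    ancestors = []
    # Work from the lowest rank upward, zeroing one more field each time.
    for i in range(len(fields) - 1, -1, -1):
        if int(fields[i]) == 0:
            continue  # this rank is already unknown; nothing to strip
        fields[i] = "0" * widths[i]
        ancestors.append("".join(fields))
    return ancestors


for k in ancestor_keys("21243470101"):
    print(k)
# 21243470100
# 21243470000
# 21243000000
# 21240000000
# 21000000000
# 00000000000
```

Looking up each resulting key in the tree file then gives the species, genus, subfamily, family, order, and class in turn.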