This method converts the external XML file into an
et.ElementTree instance and validates that tree
against the schema. For details of the validation process,
see "Automated validation of input files" in
"Python XML processing with lxml".
# - - -   B i r d N o t e S e t . _ v a l i d a t e

    def _validate(self, fileName):
        """Build an XML tree and validate it against the schema.

          [ fileName is a string ->
              if (SCHEMA_RNG names a readable, well-formed RNG
              bird notes schema) and
              (fileName names a readable XML bird notes file
              that validates against that schema) ->
                return the root node of a document representing
                that bird notes file as an et.Element ]
        """
Before we can validate the notes file, we have to
translate the schema itself into an ElementTree.
        #-- 1 --
        # [ if SCHEMA_RNG names a readable, well-formed XML file ->
        #     schemaDoc  :=  a new et.ElementTree representing
        #                    that file
        #   else -> raise IOError ]
        try:
            schemaDoc = et.parse(SCHEMA_RNG)
        except et.XMLSyntaxError:
            raise IOError("Schema file '%s' is not "
                          "well-formed XML." % SCHEMA_RNG)
        except IOError as detail:
            raise IOError("Can't read schema file '%s': %s" %
                          (SCHEMA_RNG, str(detail)))
The next step is to convert the schema's tree into an
et.RelaxNG instance that knows how to
validate against that schema.
        #-- 2 --
        # [ if schemaDoc is a valid Relax NG schema ->
        #     schema  :=  an et.RelaxNG instance representing schemaDoc
        #   else -> raise IOError ]
        try:
            schema = et.RelaxNG(schemaDoc)
        except et.RelaxNGParseError as detail:
            raise IOError("File '%s' is not a valid "
                          "RNG schema: %s" % (SCHEMA_RNG, str(detail)))
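As a standalone illustration of this step, the following minimal sketch compiles two in-memory schemas. It assumes Python 3 with lxml installed. The names GOOD_RNG, BAD_RNG, and compile_schema are hypothetical and not part of the bird notes code, and the tiny schema is only a stand-in for the real one.

```python
from lxml import etree as et

# Hypothetical stand-in for the real bird notes schema.
GOOD_RNG = b"""\
<element name="notes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="bird"><text/></element>
  </zeroOrMore>
</element>"""

# Well-formed XML, but not a usable Relax NG schema:
# the inner element declares no content pattern.
BAD_RNG = b"""\
<element name="notes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="bird"/>
  </zeroOrMore>
</element>"""


def compile_schema(rng_bytes):
    """Return an et.RelaxNG for rng_bytes, or None if it is not a valid schema."""
    schema_doc = et.fromstring(rng_bytes)   # may raise et.XMLSyntaxError
    try:
        return et.RelaxNG(schema_doc)
    except et.RelaxNGParseError:
        return None


good_schema = compile_schema(GOOD_RNG)
bad_schema = compile_schema(BAD_RNG)
```

Both inputs parse as XML, so only the et.RelaxNG() constructor can tell them apart. That is why step 2 needs its own exception handler, separate from the one in step 1.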
Next we convert the bird notes file into a tree. The
et.parse() function reads the document and
turns it into an et.ElementTree. To find
the root element of an ElementTree, use
the .getroot() method.
If the file doesn't exist or is unreadable, we'll get an
IOError exception. If it exists but is not
well-formed, we get an et.XMLSyntaxError
exception.
        #-- 3 --
        # [ if fileName names a readable, well-formed XML file ->
        #     doc  :=  that file as an et.ElementTree
        #   else -> raise IOError ]
        try:
            doc = et.parse(fileName)
        except et.XMLSyntaxError:
            raise IOError("File '%s' is not well-formed XML." %
                          fileName)
        except IOError as detail:
            raise IOError("Can't read file '%s': %s" %
                          (fileName, str(detail)))
The schema.validate() method returns True if
a document validates, False otherwise. If validation fails, the
schema's .error_log attribute describes what went wrong. If it
succeeds, we can return the document's root element.
        #-- 4 --
        # [ if doc fails to validate against schema ->
        #     raise IOError
        #   else -> I ]
        if not schema.validate(doc):
            raise IOError("File '%s' is not a valid bird notes "
                          "file: %s" % (fileName, schema.error_log))

        #-- 5 --
        # [ return the root element of doc ]
        return doc.getroot()
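To see the validation result and the error log outside the method, here is a minimal sketch that uses in-memory documents. It assumes Python 3 with lxml installed. The schema and the names RNG, good, and bad are hypothetical stand-ins, not the real bird notes schema or files.

```python
from lxml import etree as et

# Hypothetical miniature schema: a <notes> element holding <bird> children.
RNG = b"""\
<element name="notes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="bird"><text/></element>
  </zeroOrMore>
</element>"""

schema = et.RelaxNG(et.fromstring(RNG))

good = et.fromstring(b"<notes><bird>robin</bird></notes>")
bad = et.fromstring(b"<notes><fish>trout</fish></notes>")

# validate() reports success or failure as a Boolean.
ok_good = schema.validate(good)
ok_bad = schema.validate(bad)

# After a failed validation, error_log holds one entry per problem found;
# each entry carries a human-readable message and a line number.
messages = [entry.message for entry in schema.error_log]
for entry in schema.error_log:
    print("line %d: %s" % (entry.line, entry.message))
```

The error log reflects the most recent validation, so this sketch reads it immediately after validating the bad document, just as _validate() does when it builds its exception message.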