# - - - h t m l S u b e l t
def htmlSubelt(node, rawEng, cssClass):
    '''Add rawEng to an et.Element
      [ (node is an et.Element) and
        (rawEng is a string) and
        (cssClass is a string or None) ->
          if rawEng contains invalid markup ->
            raise ValueError
          else if cssClass is None ->
            node  :=  node with content added representing rawEng,
                      normalized, using an 'i' element
          else ->
            node  :=  node with content added representing rawEng,
                      normalized, using a 'span' element with
                      class=(cssClass) ]
    '''
The purpose of this function is to assist in building XHTML
pages using the etbuilder module (see Section 4, “Imported modules”). The node argument is
some et.Element instance to which we will add the
English name rawEng.
Our first task is an error check. Two characters are replaced
in the generated XHTML: double-quotes and underbars. There must
be an even number of each. If all is well, we convert the
name to Unicode as required by the et module.
    #-- 1 --
    # [ if rawEng contains even numbers of double-quotes and even
    #   numbers of underbars ->
    #     uniEng  :=  rawEng converted to Unicode
    #   else -> raise ValueError ]
    if (rawEng.count('"') % 2) != 0:
        raise ValueError("Unbalanced double-quotes: '%s'" % rawEng)
    if (rawEng.count('_') % 2) != 0:
        raise ValueError("Unbalanced underbars: '%s'" % rawEng)
    uniEng = unicode(rawEng)
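The parity check above can be tried in isolation. The following is a minimal sketch, not part of the original program: the name check_balanced and its delimiters parameter are hypothetical, and it is written for Python 3, where strings are already Unicode and the unicode() conversion is unnecessary.

```python
def check_balanced(raw, delimiters=('"', '_')):
    """Return raw unchanged if every delimiter occurs an even number of times.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    for delim in delimiters:
        # An odd count means some opening delimiter was never closed.
        if raw.count(delim) % 2 != 0:
            raise ValueError("Unbalanced %r in: %r" % (delim, raw))
    return raw

# Two underbars and two double-quotes: accepted.
ok = check_balanced('_Larus_ "gull"')
```

Note that a parity check only counts delimiters. It does not detect pairs that are balanced but oddly placed, such as a quote that opens inside an italic run and closes outside it.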
The overall approach is to break rawEng into
chunks at underbar characters. In the resulting list of chunks,
the even-numbered ones are not italicized, so they can be added
to the parent node using the addText() function
from the etbuilder module. The odd-numbered
chunks are placed in a child element (either i or
span).
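To see why the even/odd rule works, here is a small Python 3 sketch of the splitting step alone. The function name classify_chunks is hypothetical and exists only for this illustration.

```python
def classify_chunks(text):
    """Split text at underbars and pair each chunk with an italic flag.

    Chunks at odd indexes (counting from zero) lie between a pair of
    underbars, so they are the italicized ones.
    """
    return [(k % 2 == 1, chunk) for k, chunk in enumerate(text.split('_'))]

pairs = classify_chunks('Common _Larus canus_ Gull')
# pairs == [(False, 'Common '), (True, 'Larus canus'), (False, ' Gull')]

# A leading underbar yields an empty first chunk, so the parity still holds.
leading = classify_chunks('_Larus_ sp.')
# leading == [(False, ''), (True, 'Larus'), (False, ' sp.')]
```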
We can't use Section 10.4, “class Htmler: State machine for HTML
markup” to convert
double-quotes into the paired versions, because the et module will escape the ampersand. The character
entity defined in Section 5.7, “LDQUO” would be written out with its
leading ampersand escaped as “&amp;”, so a browser
would display the entity's source text instead of a left double-quote.
So instead we'll just use the variable quotes as
a toggle to remember whether we're inside quotes or not.
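The escaping problem is easy to reproduce. This sketch assumes that the et module behaves like the standard library's xml.etree.ElementTree in this respect; the entity text '&#x201c;' is used here only as an example of a numeric character reference and is not taken from Section 5.7.

```python
import xml.etree.ElementTree as ET

para = ET.Element('p')

# Text that looks like a character reference is treated as plain text,
# so the serializer escapes its ampersand.
para.text = '&#x201c;quoted'
escaped = ET.tostring(para, encoding='unicode')
# escaped == '<p>&amp;#x201c;quoted</p>'

# Storing the actual Unicode character avoids the problem.
para.text = '\u201cquoted'
direct = ET.tostring(para, encoding='unicode')
# direct == '<p>\u201cquoted</p>'
```

This is why the function works with Unicode quote characters rather than with entity strings.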
    #-- 2 --
    # [ chunkList  :=  uniEng broken into pieces at '_' characters
    #   quotes  :=  False ]
    chunkList = uniEng.split(u'_')
    quotes = False
As we loop through the chunks, the enumerate()
function sets k to the index of each chunk and
chunk to its text.
    #-- 3 --
    # [ node  +:=  elements of chunkList, with odd-numbered elements
    #       marked up for italics using cssClass and even-numbered
    #       elements treated as plain text, and double-quotes
    #       converted to paired double quotes starting with the current
    #       state of htmler
    #   htmler  :=  htmler with its state modified to reflect the
    #       double-quotes and underbars in chunkList ]
    for k, chunk in enumerate(chunkList):
        #-- 3 body --
        # [ if k is even ->
        #     node  :=  node with chunk added as plain text
        #   else if cssClass is None ->
        #     node  :=  node with an 'i' element added containing
        #               chunk
        #   else ->
        #     node  :=  node with a 'span' element added with class
        #               (cssClass), containing chunk ]
For the logic that converts double-quotes to the paired form,
see Section 10.3, “uniQuotes(): Convert double-quotes to
Unicode paired form”.
        #-- 3.1 --
        # [ reChunk  :=  chunk with double-quotes converted to HTML
        #       starting with the state of htmler
        #   htmler  :=  htmler with its state modified to reflect the
        #       double-quotes in chunk ]
        reChunk = uniQuotes(chunk)
        #-- 3.2 --
        if (k % 2) == 0:
            addText(node, reChunk)
        elif cssClass is None:
            node.append ( E.i ( reChunk ) )
        else:
            node.append ( E.span ( CLASS(cssClass), reChunk ) )
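For readers without the etbuilder module at hand, the whole approach can be sketched in Python 3 using only the standard library. This is not the author's implementation: pair_quotes, append_text and html_subelt_sketch are hypothetical names, pair_quotes is a stand-in for uniQuotes() whose real signature is given in Section 10.3, and append_text only approximates what addText() is assumed to do.

```python
import xml.etree.ElementTree as ET

LDQ, RDQ = '\u201c', '\u201d'   # left and right double quotation marks

def pair_quotes(text, inside):
    """Replace each '"' by an opening or closing quote.

    inside is the toggle: True when we are already within quotes.
    Returns the converted text and the updated toggle.
    """
    out = []
    for ch in text:
        if ch == '"':
            out.append(RDQ if inside else LDQ)
            inside = not inside
        else:
            out.append(ch)
    return ''.join(out), inside

def append_text(node, text):
    """Add text after whatever content node already has."""
    if len(node):
        # Text following a child element lives in that child's tail.
        last = node[-1]
        last.tail = (last.tail or '') + text
    else:
        node.text = (node.text or '') + text

def html_subelt_sketch(node, raw, css_class=None):
    """Add raw to node, italicizing the text between underbar pairs."""
    if raw.count('"') % 2 or raw.count('_') % 2:
        raise ValueError("Unbalanced markup: %r" % raw)
    inside = False
    for k, chunk in enumerate(raw.split('_')):
        # Carry the quote toggle across chunk boundaries.
        text, inside = pair_quotes(chunk, inside)
        if k % 2 == 0:
            append_text(node, text)
        else:
            child = ET.SubElement(node, 'i' if css_class is None else 'span')
            if css_class is not None:
                child.set('class', css_class)
            child.text = text

td = ET.Element('td')
html_subelt_sketch(td, '"Mew" _Larus canus_ Gull')
result = ET.tostring(td, encoding='unicode')
# result == '<td>\u201cMew\u201d <i>Larus canus</i> Gull</td>'
```

Passing the toggle in and out of pair_quotes lets a quotation that opens in one chunk close in a later one, which is the role the quotes variable is described as playing above.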