XML Interchange Format
Contentment allows for easy import and export of individual assets or whole trees. This document contains notes on the format.
The smallest acceptable XML file is:
<?xml version="1.0" encoding="utf-8"?>
<Extract xmlns="https://xml.webcore.io/component/asset/1.0">
</Extract>
This describes an export that exported nothing. The root tag of valid Contentment XML must always be Extract.
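As a quick illustration of the root tag rule, the following sketch uses the standard library's ElementTree to check that a document has a namespaced Extract root. The helper name is_contentment_extract is hypothetical and not part of Contentment; the namespace is copied from the minimal file above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the minimal example document.
NAMESPACE = "https://xml.webcore.io/component/asset/1.0"

MINIMAL = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<Extract xmlns="https://xml.webcore.io/component/asset/1.0">\n'
    b'</Extract>'
)


def is_contentment_extract(source):
    """Hypothetical helper: True if the root tag is the namespaced Extract."""
    root = ET.fromstring(source)
    # ElementTree reports namespaced tags as "{namespace}localname".
    return root.tag == "{%s}Extract" % NAMESPACE
```

Passing MINIMAL to this function returns True, while any document with a different root tag or namespace returns False.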
All objects persisted in the database must support data interchange. There are legal reasons for this, and it also provides a convenient way to perform backups. Objects participating in this protocol must define an __xml__ method that accepts one named argument, recursive, defaulting to False. This method must return an iterable of the unicode XML fragments used to describe that object.
A Python example:
class Something:
    def __xml__(self, recursive=False):
        return ["<Something />"]
Template functions built using cinje are suitable for direct use:
# encoding: cinje

: def export obj, recursive=False
	<Something />
With the above, the following Python would be valid:
from template import export

class Something:
    __xml__ = export
An accessor property named as_xml is provided to retrieve the non-recursive XML representation, matching as_html, as_json, and friends.
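One way such an accessor could be layered on top of the __xml__ protocol is sketched below. The XMLExportable mixin is hypothetical, and joining the fragments into a single string is an assumption; the real as_xml may return something different.

```python
class XMLExportable:
    """Hypothetical mixin, not Contentment's actual base class."""

    def __xml__(self, recursive=False):
        raise NotImplementedError

    @property
    def as_xml(self):
        # Assumption: the accessor joins the non-recursive fragments
        # into one string.
        return "".join(self.__xml__(recursive=False))


class Something(XMLExportable):
    def __xml__(self, recursive=False):
        # Any iterable of fragments satisfies the protocol; a generator
        # works as well as a list.
        yield "<Something />"
```

With this in place, Something().as_xml evaluates to the single fragment produced by __xml__.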
The Asset base class defines the bulk of the export machinery for itself and its participating subclasses. The tag used is the name of the class. The smallest acceptable bare Asset is:
<Asset name="example">
	<title>Example Asset</title>
</Asset>
Attributes of Asset instances fall into three categories: simple, complex, or compound:
- Simple types are generally the fundamental ones (unicode text, numbers, etc.) that do not represent a container for other values. These are stored as attributes on the containing XML tag.
- Complex types are ones for which the value (really its class) has overridden export behaviour.
- Compound types represent containers for other values. Both complex and compound types are stored as discrete child tags.
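The three categories can be sketched roughly as follows. This is an illustration only: classify and opening_tag are hypothetical helpers, and treating "has an __xml__ method" as the marker for a complex type is an assumption rather than Contentment's documented test.

```python
from xml.sax.saxutils import quoteattr


def classify(value):
    """Hypothetical: sort a value into simple, complex, or compound."""
    if hasattr(value, "__xml__"):
        # Assumption: overridden export behaviour means an __xml__ method.
        return "complex"
    if isinstance(value, (dict, list, tuple, set)):
        return "compound"
    return "simple"


def opening_tag(tag, attributes):
    """Render only the simple values as XML attributes on the tag."""
    simple = sorted(
        (key, value) for key, value in attributes.items()
        if classify(value) == "simple"
    )
    rendered = "".join(
        " %s=%s" % (key, quoteattr(str(value))) for key, value in simple
    )
    return "<%s%s>" % (tag, rendered)
```

For an asset with a name and a translated title, only the name ends up on the opening tag; the title mapping is compound and would be written as child tags instead.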
Translated attributes are stored internally in a mapping, and as such represent a compound type. An example of this is the title of an Asset instance. These are encoded using a field-specific singular tag, which may differ from the attribute name where singular and plural forms differ, with the tag repeated for each language.
<title>This page could use some color.</title>
<title lang="en">This page could use some colour.</title>
<title lang="en-US">This page could use some color.</title>
<title lang="fr">Cette page pourrait utiliser certaines couleurs.</title>
As can be seen above, both region-free ISO 639-1 and region-specific IETF language tags can be used. There may be an instance of the tag without a language specified, but there must not be more than one; this would represent the ultimate default fallback, and would be used last if no better match could be found. If you do not use the translation machinery, you will only see single tags not tagged with a language.
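The fallback behaviour might look like the sketch below when reading an export. The function best_title is hypothetical, and the lookup order (exact tag, then the region-free language, then the untagged default) is an assumption; the text above only guarantees that the untagged value is used last.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<Asset name="example">
<title>This page could use some color.</title>
<title lang="en">This page could use some colour.</title>
<title lang="en-US">This page could use some color.</title>
<title lang="fr">Cette page pourrait utiliser certaines couleurs.</title>
</Asset>"""


def best_title(fragment, language):
    """Hypothetical lookup: pick the closest title for a language tag."""
    root = ET.fromstring(fragment)
    # Untagged titles are stored under the key None.
    titles = {el.get("lang"): el.text for el in root.findall("title")}

    candidates = [language]
    if "-" in language:
        # Assumed step: fall back from "en-GB" to the region-free "en".
        candidates.append(language.split("-", 1)[0])
    candidates.append(None)  # the ultimate default, used last

    for candidate in candidates:
        if candidate in titles:
            return titles[candidate]
    return None
```

Requesting "en-GB" against the example falls back to the "en" spelling, while an unlisted language such as "de" receives the untagged default.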
Metadata associated with an Asset instance via the properties accessor is stored via the Properties class, and represents a "complex" type. Properties may export data in two ways:
<property name="width" type="int">0</property>
<property name="title" separator=": " direction="ltr" />
Because metadata may be of a variety of types, the type must be included in the XML tag if the value is not a unicode string or dictionary. If the property is itself a dictionary, it must contain only basic unicode strings, and is given a simplified, empty tag with its entries encoded as XML attributes. (This, consequently, forbids use of name and type as metadata properties.)
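A minimal sketch of the two encodings follows. The function encode_property is hypothetical, and using the Python type's name as the type attribute is an assumption that happens to match the int example above; the actual Properties class may name types differently.

```python
from xml.sax.saxutils import escape, quoteattr


def encode_property(name, value):
    """Hypothetical encoder for a single metadata property."""
    if isinstance(value, dict):
        # Dictionary entries become attributes, so these keys would collide
        # with the attributes the tag already uses.
        if "name" in value or "type" in value:
            raise ValueError("'name' and 'type' are reserved")
        attributes = "".join(
            " %s=%s" % (key, quoteattr(item)) for key, item in value.items()
        )
        return "<property name=%s%s />" % (quoteattr(name), attributes)

    if isinstance(value, str):
        # Plain strings need no type annotation.
        return "<property name=%s>%s</property>" % (quoteattr(name), escape(value))

    # Assumption: the type attribute is the Python type name.
    return "<property name=%s type=%s>%s</property>" % (
        quoteattr(name), quoteattr(type(value).__name__), escape(str(value)))
```

Under these assumptions, encoding a width of 0 and a title dictionary reproduces the two example tags shown above.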
All Asset instances may contain child Asset instances. These are encoded after any other properties. The "path" of an Asset is determined by the combination of its name and the names of its parent elements.
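Path construction could be sketched as below. The Node class is a hypothetical stand-in for an Asset. The "/" separator and leading slash are assumptions, inferred from values such as /theme/part/header that appear in this document's page example, and whether the topmost asset's name is included is also assumed.

```python
class Node:
    """Hypothetical stand-in for an Asset with a name and a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    @property
    def path(self):
        # Walk up through the parents, collecting names along the way.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        # Assumption: names are joined with "/" and prefixed with one.
        return "/" + "/".join(reversed(parts))


theme = Node("theme")
part = Node("part", parent=theme)
header = Node("header", parent=part)
```

Here header.path combines its own name with those of its two ancestors.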
Pages are containers for layout and general site content. Each Page is an Asset containing a linear list of blocks. An example encoded page would be:
<Page name="terms">
	<title lang="en">Terms of Service</title>
	<ReferenceBlock target="/theme/part/header" />
	<TextBlock>
		<content lang="en"><![CDATA[Content would go here.]]></content>
	</TextBlock>
	<ReferenceBlock target="/theme/part/footer" />
</Page>
Notably, the Asset contributions towards the exported XML are everything except the series of Blocks. Blocks behave according to the Asset encoding rules with regard to which attributes to supply as XML attributes and which to populate as nested tags. In the above example, content is a translated attribute, but because TextBlock expects HTML content (which would otherwise require excessive escaping), it mandates wrapping those values in CDATA.
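CDATA wrapping has one well-known pitfall: the sequence ]]> cannot appear inside a CDATA section. The sketch below shows the usual workaround of splitting the content across two adjacent sections. The cdata helper is hypothetical; how TextBlock itself handles this case is not stated here.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Hypothetical helper: wrap text in CDATA, splitting any "]]>"."""
    # "]]>" would end the section early, so close the section after "]]"
    # and reopen a new one before ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def content_tag(html, lang):
    """Build a content tag like the one in the page example above."""
    return '<content lang="%s">%s</content>' % (lang, cdata(html))


def read_content(fragment):
    # Parsers merge adjacent CDATA sections back into a single text node.
    return ET.fromstring(fragment).text
```

HTML passed through content_tag and read back with read_content comes out unchanged, including markup and any embedded ]]> sequence, without entity escaping.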