Basing the data of a CMS or documentation initiative on XML isn’t a bad idea, as there
are many useful tools around to handle it. From a single XML
source, the documents can be converted to (X)HTML, to PDF using XSL-FO, to TeX using TeXML, and to many more formats.
However, XML is inconvenient to enter, even with advanced tools like
Emacs and its nxml-mode.
But there is a solution: most documents have only a simple
structure, and there are many ways of marking them up in plain ASCII,
for example:
- Markdown
- Textile
- Wiki-like syntaxes
- self-made formats
Ruby, my favorite implementation language (XSLT excluded, which has
many advantages in XML processing), supports all of these: Markdown
via BlueCloth, Textile via RedCloth, RDoc for a Wiki-like syntax, and
self-made formats can be implemented easily thanks to Ruby’s powerful
string processing features.
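To illustrate the last point, here is a minimal sketch of a self-made format built only on Ruby's string methods. The markup rules (`*bold*`, `_emphasis_`, blank lines between paragraphs) and the names `xml_escape` and `mini_markup_to_xhtml` are made up for this example; they are not part of BlueCloth, RedCloth, or RDoc.

```ruby
# Escape the characters that are special in XML text content.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Hypothetical mini-markup: *bold*, _emphasis_, and blank lines
# separating paragraphs. Returns a string of XHTML <p> elements.
def mini_markup_to_xhtml(text)
  text.strip.split(/\n\s*\n/).map do |para|
    html = xml_escape(para)
    html = html.gsub(/\*(.+?)\*/, '<strong>\1</strong>')
    html = html.gsub(/_(.+?)_/, '<em>\1</em>')
    "<p>#{html}</p>"
  end.join("\n")
end
```

Calling `mini_markup_to_xhtml("Some *bold* text.")` yields a single `<p>` element with a `<strong>` child. A real format would need more care (nesting, escaping of the markers themselves), but the core fits in a handful of `gsub` calls.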
Therefore, I propose an RFC 2822-like format to turn all these formats into
XML easily and to support metadata. Here is a document with some features:
Title: A sample document
Author: Christian Neukirchen <chneukirchen@yahoo.de>
Date: Tue, 20 Apr 2004 13:54:54 +0200
X-Comment: make better version

This is some example document of
marking up non XML data...
This will get transformed to something like this:
<document xmlns="..." xmlns:dc="...">
<head>
<dc:title>A sample document</dc:title>
<dc:creator>Christian Neukirchen
&lt;chneukirchen@yahoo.de&gt;</dc:creator>
<dc:date>2004-04-20</dc:date>
<comment>make better version</comment>
</head>
<body xmlns="...">
<p>This is some example document of
marking up non XML data...</p>
</body>
</document>
… or something like that. (Don’t treat the format as final, but it
will be something in that style.) By default, all backends will create
simple XHTML, which can easily be transformed to DocBook. The *Cloth
libraries allow embedding of arbitrary tags, so all
features of the XML workflow can be used (for example, generation of
a ToC, insertion of automatically generated data, etc.).
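The transformation itself could look roughly like the following Ruby sketch. It is only an illustration under several assumptions: the header block is separated from the body by a blank line (as in RFC 2822), folded header lines are not supported, the `HEADER_ELEMENTS` mapping and the stripping of the `X-` prefix are guesses based on the sample, and the body is simply wrapped in `<p>` elements instead of being passed to a Markdown or Textile backend. The namespace URIs stay as `...`, as in the example above. All method names are hypothetical.

```ruby
require "time"

# Escape the characters that are special in XML text content.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

# Assumed mapping from header names to Dublin Core element names.
HEADER_ELEMENTS = {
  "title"  => "dc:title",
  "author" => "dc:creator",
  "date"   => "dc:date",
}.freeze

# Split the source at the first blank line into a list of
# [name, value] header pairs and the body text.
def parse_document(source)
  head, body = source.split(/\n\s*\n/, 2)
  headers = head.lines.map do |line|
    name, value = line.chomp.split(/:\s*/, 2)
    [name.downcase, value.to_s]
  end
  [headers, body.to_s.strip]
end

# Build the XML document as a string.
def to_xml(source)
  headers, body = parse_document(source)
  head_xml = headers.map do |name, value|
    # Reduce the RFC 2822 date to an ISO 8601 calendar date.
    value = Time.rfc2822(value).strftime("%Y-%m-%d") if name == "date"
    # Unknown headers become plain elements; an "x-" prefix is dropped.
    element = HEADER_ELEMENTS.fetch(name) { name.sub(/\Ax-/, "") }
    "    <#{element}>#{xml_escape(value)}</#{element}>"
  end
  paragraphs = body.split(/\n\s*\n/).map do |para|
    "    <p>#{xml_escape(para)}</p>"
  end
  [
    '<document xmlns="..." xmlns:dc="...">',
    "  <head>", *head_xml, "  </head>",
    '  <body xmlns="...">', *paragraphs, "  </body>",
    "</document>",
  ].join("\n")
end
```

Feeding the sample document to `to_xml` produces output in the style shown above, with the e-mail address escaped as `&lt;…&gt;`. In the real system, the paragraph step would be replaced by whichever backend the document selects.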
More to follow…