What is XSweet?

XSweet is the suite of XSL scripts that produce “HTML typescript”. Its purpose is threefold: first, to transfer the content from MS Word to HTML; second, to carry over as much descriptive information as possible about the structural attributes of the content; and finally, to rationalise that structural information to assist further manual or automated structuring down the line. In other words: Transfer, Describe, Rationalise.

XSweet is a set of tools supporting data acquisition, editorial, and document production workflows, built on an XML stack with XML/HTML/CSS interfaces. We like the XML stack (XSLT in particular) for these purposes because it is well suited to encapsulating discrete processes in document transformation, providing performant, scalable, reusable and robust solutions in a ‘pluggable’ way. XSweet should “just work”, but it should also be adaptable.

Aims:

  • Conversion of MS Word “Office Open” XML (aka WordML) into “HTML typescript”
  • Arbitrary HTML / CSS mapping and munging (HTML tweak)
  • Validation services against ad-hoc (project based) schemas and constraint sets
  • Conversion from editorial system (Enhanced typescript) into structured targets (e.g. TEI, JATS/BITS)

Design constraints:

  • All open source (specifications and components)
  • Use of W3C XSLT 2.0 is acceptable
  • INK will provide a pipelining infrastructure, but pipelines need to be operational outside INK as well.
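The idea of small, pluggable steps that can be chained by any host (INK or otherwise) can be pictured with a minimal sketch. This is not XSweet code: XSweet's steps are XSLT transformations, while the example below uses plain Python functions and a toy input format purely for illustration. The names `run_pipeline`, `transfer`, `rationalise` and `convert`, the `<doc>`/`<para>` input elements and the "Heading1" style name are all assumptions made up for this sketch.

```python
from functools import reduce
from typing import Callable, Iterable
import xml.etree.ElementTree as ET

# A step takes a document tree and returns a (possibly new) tree.
Step = Callable[[ET.Element], ET.Element]


def run_pipeline(doc: ET.Element, steps: Iterable[Step]) -> ET.Element:
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda tree, step: step(tree), steps, doc)


def transfer(src: ET.Element) -> ET.Element:
    """Hypothetical step: copy each toy <para> into an HTML <p>,
    carrying its style name over as a class attribute."""
    body = ET.Element("body")
    for para in src.iter("para"):
        p = ET.SubElement(body, "p")
        p.text = para.text
        style = para.get("style")
        if style:
            p.set("class", style)
    return body


def rationalise(body: ET.Element) -> ET.Element:
    """Hypothetical step: promote paragraphs styled 'Heading1' to <h1>."""
    for p in list(body.iter("p")):
        if p.get("class") == "Heading1":
            p.tag = "h1"
    return body


def convert(source_xml: str) -> str:
    """Run the toy pipeline on an XML string and serialise the result."""
    doc = ET.fromstring(source_xml)
    result = run_pipeline(doc, [transfer, rationalise])
    return ET.tostring(result, encoding="unicode")


if __name__ == "__main__":
    sample = '<doc><para style="Heading1">Title</para><para>Text</para></doc>'
    print(convert(sample))
```

Because each step only depends on its input tree, the list passed to `run_pipeline` can be reordered, extended or run by a different host without changing the steps themselves, which is the property the pipelining constraint asks for.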