Reimagining Publishing

Kristen Ratan Jun 20, 2016

Scholarly journal and book publishing today is a convoluted, expensive and slow process that is mired in print paradigms. It can take months to years for finished research to be published and the final product lacks the interactivity and richness possible in the digital era.

Most research communication workflows operate as they did 10 or 15 years ago. Work is done in isolated, proprietary tools and formats and these are shuffled around between people using the equivalent of email, with the manuscript as an attachment. The manuscript is not made web-ready until just prior to publication.

Reimagining publishing means shifting to a collaborative web space with a digital-first process that places an HTML document at the center of a flexible set of tasks and actions. The Coko community is building many tools, including a sophisticated web editor and a flexible workflow engine that can be configured for many different content workflows and adjusted easily to fit changing process needs.

CKF Technology: PubSweet and INK

The Collaborative Knowledge Foundation (CKF, also known as Coko) is rapidly constructing an open source technology framework whose separate components can be assembled into different platforms and adopted by anyone seeking a research communication channel. In short, CKF is building the tools for creating many different platforms.

There are two main frameworks:

  1. PubSweet: a publication framework enabling knowledge management, creation, processing and dissemination. PubSweet includes a robust back end and a set of interoperable components that together comprise a flexible, configurable network capable of managing many content types. The initial use cases are in scholarly journal and book publishing. The components include tools for authoring and collaboration, editing and production, flexible and configurable workflow management and user dashboards, and administration interfaces.
  2. INK: an ingestion, conversion and syndication environment that converts content and data from one format to another, tags them with identifiers and normalizes metadata. Typical use cases include converting Word and other proprietary formats into highly structured formats such as HTML5, XML, and ePub, and outputting to syndicated services, the web and PDF. INK will add common identifiers such as DOIs and geolocation IDs and ensure compliance with standards for content and metadata.

Decoupling Architecture

Monolithic architectures are the dominant approach in the content management system (CMS) world. The CKF founders have experience with many monolithic publishing platforms and chose to design the PubSweet and INK architecture as a decoupled set of components that work with one or more frameworks. A decoupled architecture creates “complex systems from simple, independent, reusable components” (Constantine, Myers, Stevens). In the case of PubSweet, this means the user or organization can choose their desired components and link them together to meet their needs. With INK, users can easily build and customize recipes from modular, chainable steps, and add or build new steps as their needs arise.

  1. Product flexibility
    By using many of the same components, PubSweet can be assembled and customized to meet a wide variety of use cases. You could, for example, use PubSweet as a monograph production suite, a journal publishing system, or to create a wholly new form of communicating knowledge. This same product flexibility also eases the path into the rapidly changing future of publishing and knowledge production.
  2. Build only what you need
    If a component you need doesn’t exist, the good news is that you don’t have to rebuild the entire system; you only build the component you require. There are many advantages to this, including lowering the investment needed to customize the system, which, as Stevens, Myers, and Constantine pointed out in 1974, “…will become increasingly important as the cost of the programmer’s time continues to rise.”
  3. Innovate with new components
    Lastly, a decoupled architecture enables innovation. You can build innovative new components quickly and integrate them with existing PubSweet components. If, for example, you wanted to move beyond the manuscript as the main research object for scientific publishing, you could build a new type of content production interface and use it together with the existing dashboard, workflow and other components.

PubSweet: the platform to build new platforms

The decoupled nature of PubSweet reflects the growing realization in the software world that the one-size-fits-all approach to platforms doesn’t work anymore. Each individual or organization has different requirements, which reflect not only features but also how people work. PubSweet solves this issue by enabling you to assemble the solution you need and to build the platform you want from existing components.

For example, by combining PubSweet with simple components, many of which are shared, we can configure the framework to support a file QA, monograph production, or journal workflow.

Arrange components to create platforms for different use cases
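
As a rough illustration of this idea, the sketch below describes three platforms as different selections from one pool of components. It is written in Python for brevity, although PubSweet itself is JavaScript, and it does not use the PubSweet API. The component names and the Platform and shared_components helpers are hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Platform:
    """A platform is just a name plus the components chosen for it."""
    name: str
    components: Tuple[str, ...]


# Hypothetical component names, invented for this illustration only.
DASHBOARD = "dashboard"
WORKFLOW = "workflow"
EDITOR = "editor"
PEER_REVIEW = "peer-review"
FILE_CHECKS = "file-checks"

# Three use cases assembled from the same pool of components.
file_qa = Platform("file QA", (DASHBOARD, FILE_CHECKS))
monograph = Platform("monograph production", (DASHBOARD, WORKFLOW, EDITOR))
journal = Platform("journal workflow", (DASHBOARD, WORKFLOW, EDITOR, PEER_REVIEW))


def shared_components(*platforms: Platform) -> Set[str]:
    """Return the components that every given platform has in common."""
    if not platforms:
        return set()
    common = set(platforms[0].components)
    for platform in platforms[1:]:
        common &= set(platform.components)
    return common
```

In this sketch the journal platform differs from the monograph platform by a single extra component, which is the kind of reuse the decoupled approach is meant to allow.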

Component development is popular in many open source technology spheres, and there will be many people building components to meet a growing set of use cases.

CKF employs or partners with top developers who are building very advanced technologies in this field. For example, the Austrian-based Substance team has built superior content production interfaces, and CKF has partnered with Substance to develop an open source editing component for PubSweet. It is feature-rich and well suited to writing scholarly content, reflecting a deep understanding of the specific requirements for highly technical and scientific content and data.

Additionally, CKF is working with a project created by Nokome Bentley that creates data-driven documents.

PubSweet is written in JavaScript (Node) and very easy to extend. All code developed by CKF is open source under an MIT license.

INK: Ingest, conversion and more

In publishing, file formats cost a large amount of time and money. Unfortunately, documents are still mostly generated in MS Word with arbitrary formatting, and although docx is ostensibly XML, it often comes with embedded binary objects such as proprietary MathType equations. Currently publishers take in MS Word documents from authors and either attempt to convert them to more manageable file formats during the editorial process or, far more likely, simply send these files to external vendors to turn into XML, HTML, and PDF just prior to publication. This means that manuscripts move through the editorial and peer review processes in a locked format, as attachments. Data files are also attachments, frequently left unopened.

INK is primarily designed to tackle file transformation issues. It is a web-based job management service built around the notions of ‘steps’ and ‘recipes’. A step is a specific, reusable file transform. A recipe is a collection of steps to be executed in the described order. Steps and recipes can be called with an appropriate API request. Each step is a job resource that is managed and monitored by INK. The results of steps and recipes can be inspected and the steps then rerun, so that tweaks can be made to the transformation code to improve the result. Validation steps can be written and shared to validate the results against the target file format standards. Any set of steps and recipes may be built or customized by any organization for any type of document or workflow. Hence INK, like PubSweet, is very modular and can support a huge array of use cases.
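
The relationship between steps and recipes can be sketched in a few lines. The example below is a minimal illustration in Python, not INK's actual code or API (INK itself is a Ruby on Rails service). The step functions and the run_recipe helper are hypothetical. Each step is a reusable transform, the recipe is an ordered list of steps, and every intermediate result is kept so that it can be inspected, with the last step acting as a validation step.

```python
from html import escape
from typing import Callable, List, Tuple

# A step is modelled here as a function from input text to output text.
Step = Callable[[str], str]


def strip_trailing_whitespace(text: str) -> str:
    """Step: remove trailing spaces and tabs from every line."""
    return "\n".join(line.rstrip() for line in text.splitlines())


def wrap_paragraphs(text: str) -> str:
    """Step: turn blank-line-separated plain text into one <p> per line."""
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    return "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs if p)


def validate_paragraphs(html: str) -> str:
    """Validation step: fail loudly if any line is not a complete <p> element."""
    for line in html.splitlines():
        if not (line.startswith("<p>") and line.endswith("</p>")):
            raise ValueError(f"unexpected line in output: {line!r}")
    return html


def run_recipe(steps: List[Step], source: str) -> List[Tuple[str, str]]:
    """Run the steps in order, keeping each step's name and result for inspection."""
    results = []
    current = source
    for step in steps:
        current = step(current)
        results.append((step.__name__, current))
    return results


recipe = [strip_trailing_whitespace, wrap_paragraphs, validate_paragraphs]
results = run_recipe(recipe, "First paragraph.  \n\nSecond paragraph.\t")
final_output = results[-1][1]
```

Because every intermediate result is recorded, a single step can be adjusted and the recipe rerun without touching the other steps.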

INK is designed as an API web service that can be used by multiple clients. Many instances of PubSweet, for example, can send conversion requests to a single INK instance. Other software can also use INK’s simple API for conversion requests, with each client independently authenticated and each having its own organization settings and resources (fonts for embedding, etc).
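
The sketch below illustrates the multi-client idea in Python: one shared hub, several clients, each identified by its own token and each with its own organization settings. It is an in-memory stand-in only. The ConversionHub class, its methods, and the token and font values are assumptions made for illustration and do not describe INK's real API, and no actual file conversion is performed.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Organization:
    """Per-organization settings, such as fonts to embed in output files."""
    name: str
    fonts: List[str] = field(default_factory=list)


class ConversionHub:
    """Hypothetical single service instance shared by many clients."""

    def __init__(self) -> None:
        self._orgs_by_token: Dict[str, Organization] = {}

    def register(self, token: str, organization: Organization) -> None:
        """Associate an API token with one organization's settings."""
        self._orgs_by_token[token] = organization

    def convert(self, token: str, filename: str, target_format: str) -> Dict[str, object]:
        """Describe the conversion job that would run for this client."""
        organization = self._orgs_by_token.get(token)
        if organization is None:
            raise PermissionError("unknown API token")
        stem, _extension = os.path.splitext(filename)
        return {
            "organization": organization.name,
            "input": filename,
            "output": f"{stem}.{target_format}",
            "fonts": list(organization.fonts),
        }


hub = ConversionHub()
hub.register("token-a", Organization("Journal A", fonts=["Libertine"]))
hub.register("token-b", Organization("Press B"))

job_a = hub.convert("token-a", "chapter1.docx", "html")
job_b = hub.convert("token-b", "chapter1.docx", "epub")
```

The same input file yields different jobs for the two clients because each request is resolved against its own organization's settings.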

INK is a kind of typesetting hub. Send INK a file and a conversion request and INK sends back the appropriate output. However, we need INK to be more than a typesetting hub: the project itself must be a hub for modern typesetters, the file conversion experts who have had enough of hard-wiring flaky conversion pipelines every time they face a new type of conversion request. Hence INK will grow to include QA tools to automate the running and validating of conversion steps written for INK. INK will also become the central focus for a community to share file conversion code, best practices, hints and tips.

INK – conversion, normalization and enrichment

INK is also an enrichment tool. A file format is not just a file format; it is a container for research. When we have the file, we have the research, and so we can automate not just file format conversion but also the interpretation and improvement of the data contained within.

Hence INK can be used not just at the beginning or end of the research production process, but during the production of the research materials. Consider the production of a manuscript containing references to disease names or geolocations. While the paper is being written, the text could be sent through INK, analyzed against the appropriate lexicon and sent to the appropriate API-based databases, automatically linking terms in the manuscript to the appropriate entities. Hence INK can help authors produce and validate networked research objects while they are writing their research materials.
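
A minimal sketch of such an enrichment step follows, in Python. A small in-memory lexicon stands in for the remote API-based databases, and the example.org URLs, the LEXICON table and the link_entities function are all hypothetical. The input is assumed to be plain text rather than HTML.

```python
import re
from typing import Dict

# Hypothetical lexicon standing in for a remote, API-based entity database.
LEXICON: Dict[str, str] = {
    "malaria": "https://example.org/disease/malaria",
    "nairobi": "https://example.org/place/nairobi",
}


def link_entities(text: str, lexicon: Dict[str, str]) -> str:
    """Wrap each known term in a link to its entity, keeping the original casing."""
    if not lexicon:
        return text
    # Longest terms first, so a longer term wins over a shorter one it contains.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )

    def replace(match: "re.Match[str]") -> str:
        term = match.group(0)
        return f'<a href="{lexicon[term.lower()]}">{term}</a>'

    return pattern.sub(replace, text)


enriched = link_entities("Malaria cases rose in Nairobi.", LEXICON)
```

In a real deployment the lookup would be a request to an external service, and the step would need to cope with ambiguous terms and with text that already contains markup.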

INK can also be used as a conduit to content mining services, identification systems, file storage services, discovery services, data repositories, and more. INK might start in the relatively humble but problem-laden world of file conversions, but it is a service that could prove pivotal to enabling the next wave of research objects that are first class citizens in a networked data environment.

INK is written in Ruby on Rails and uses the React JavaScript library. Steps can be written in any language.


About the Author

Kristen Ratan

Co-Founder of Coko.
