Hi, I’m Wouter Janssens, Co-Founder and CEO of Digita. Welcome to the third episode of Digita's Tech Talks. In our first two episodes, we introduced the building blocks of the Solid ecosystem and talked about how the versatility of Solid pods increases interoperability beyond that of traditional data storage solutions, enabling a unified storage experience. But what exactly is interoperability, and how does Solid achieve it?
Interoperability is the ability of multiple heterogeneous systems to work together: to communicate with each other, to exchange data, to process this data, and to use the data in a meaningful way. Each of these steps represents a higher level of interoperability, and most systems only reach a certain point, beyond which human intervention or a custom-made connection is needed. Studies have estimated that the interoperability-related lack of efficiency in the US healthcare, automotive and construction industries costs well over 100 billion dollars per year.
The key to achieving long-lasting interoperability is the standardization of protocols, interfaces, data formats and terminology. While standards exist for each of these aspects, with Solid the W3C bundles a number of the most successful ones into a single, seamless specification that addresses each of these steps.
Let's take an in-depth look at each of the layers of interoperability Solid provides. The first, foundational layer, also called the technical layer, addresses machine-level interoperability, and encompasses the raw ability to communicate and exchange data. In the previous episodes, we already mentioned that Solid pods and apps communicate using HTTP, the Hypertext Transfer Protocol. Choosing this protocol determines the channel over which communication will take place, like writing a letter instead of calling or meeting in person. While it is good for interoperability to know the other side also communicates by letter, this is not enough to be sure they will accept the letter, or understand what we wrote. To make sure our communication arrives intact, without being tampered with, and the receiver can trust the letter to be a genuine message from us, Solid prescribes additional protocols for identity, authentication and authorisation. In future episodes, we'll go deeper into each of these, but for now, let's focus on the interoperability of WHAT we communicate.
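To make the letter analogy concrete, here is a minimal sketch of what such a "letter" looks like on the wire: the raw HTTP/1.1 GET request an app could send to read a resource from a pod. The pod host and resource path are hypothetical placeholders, and the request is only composed, not sent.

```python
def build_get_request(host: str, path: str) -> str:
    """Compose a plain HTTP/1.1 GET request message (the 'letter')."""
    return (
        f"GET {path} HTTP/1.1\r\n"   # request line: method, target, version
        f"Host: {host}\r\n"          # which pod the letter is addressed to
        "Connection: close\r\n"
        "\r\n"                       # blank line ends the headers
    )

# Hypothetical pod and resource, for illustration only.
request = build_get_request("pod.example.org", "/profile/card")
print(request)
```

Agreeing on HTTP fixes only this envelope format; everything the later layers add rides inside it as headers and body content.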
A second layer of interoperability is called syntactic interoperability, and is all about the formal or structural aspect of our message, the packaging of the content. If we write even the simplest message in a language that our communicating partner does not understand, we will get nowhere. Data formats are a way in which multiple systems can agree upon this aspect. By specifying HOW a message should be written, they make sure each side of the communication can parse the message to an internal representation, check the message for errors, and serialize the answer back to the shared syntax. Well-known examples of such data formats are CSV, XML and JSON.
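The parse, check, and serialize cycle described above can be sketched with JSON, one of the formats just mentioned. The message content is made up for illustration; the point is the round trip through a shared syntax.

```python
import json

# An incoming message in the agreed-upon syntax (JSON).
incoming = '{"sender": "alice", "subject": "hello"}'

# Parse the message into an internal representation, checking it for
# syntax errors along the way.
try:
    message = json.loads(incoming)
except json.JSONDecodeError as err:
    raise SystemExit(f"malformed message: {err}")

# Build an answer internally, then serialize it back to the shared syntax.
reply = {"to": message["sender"], "subject": f"Re: {message['subject']}"}
outgoing = json.dumps(reply)
print(outgoing)
```

Both sides can use completely different internal representations, as long as they meet in the agreed-upon format at the boundary.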
As we already demonstrated in the previous episode, Solid is a rather unique data ecosystem, which is capable of handling unstructured data as well as semi-structured and fully-structured data. How does this versatility go hand in hand with the structural requirements of syntactic interoperability? Well, Solid allows any kind of unstructured document to be stored, and uses HTTP's content negotiation mechanism to let systems discover the available data formats. The real syntactic power, however, lies in its use of RDF for structured data.
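The discovery step can be sketched as follows: the client lists the formats it accepts in HTTP's Accept header, and the server picks one it can actually provide. This toy version simply respects the client's ordering; real HTTP negotiation also weighs quality values (q-values), and the media types offered here are just examples.

```python
def negotiate(accept_header, available):
    """Pick the first media type from the Accept header that the
    server can provide, or None if there is no overlap."""
    # Split "type/subtype;q=..." entries and drop any parameters.
    preferences = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in preferences:
        if media_type in available:
            return media_type
    return None

# Formats this hypothetical pod can serve the resource in.
offered = ["text/turtle", "application/ld+json"]

print(negotiate("application/ld+json, text/turtle", offered))  # → application/ld+json
print(negotiate("text/csv", offered))                          # → None
```

When there is no overlap, a real server would answer with a 406 Not Acceptable status rather than returning nothing.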
RDF is a so-called abstract syntax. This means that it prescribes an ideal, mathematical structure to which the data must adhere, but for which there exist multiple concrete ways of writing this structured data down, both in common data formats like XML and JSON, and in purpose-built formats like Turtle and N3. Communicating systems can thus process the data in whichever format they want, because it can always be translated to another format, and parsed to the same abstract syntax. The latter is especially important, because RDF's abstract syntax also forms the foundation for the next level of interoperability.
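This can be illustrated with one abstract triple written in two concrete syntaxes: an N-Triples line and a JSON-LD-style dictionary. The toy extractors below handle only this one well-formed shape each (nothing close to real RDF parsers), and the example URIs are illustrative; the point is that both serializations reduce to the same (subject, predicate, object) structure.

```python
# The same statement, "alice knows bob", in two concrete syntaxes.
ntriples = ('<http://example.org/alice> '
            '<http://xmlns.com/foaf/0.1/knows> '
            '<http://example.org/bob> .')

jsonld = {
    "@id": "http://example.org/alice",
    "http://xmlns.com/foaf/0.1/knows": {"@id": "http://example.org/bob"},
}

def triple_from_ntriples(line):
    """Reduce one simple '<s> <p> <o> .' line to its abstract triple."""
    parts = line.rstrip(" .").split()
    return tuple(term.strip("<>") for term in parts)

def triple_from_jsonld(doc):
    """Reduce one single-property JSON-LD node to its abstract triple."""
    subject = doc["@id"]
    predicate, obj = next((k, v) for k, v in doc.items() if k != "@id")
    return (subject, predicate, obj["@id"])

# Both concrete forms yield the identical abstract triple.
print(triple_from_ntriples(ntriples) == triple_from_jsonld(jsonld))  # → True
```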
This third level is called semantic interoperability, and tries to align the meaning of the data. After all, what good is being able to communicate and parse a message, if your partner does not know how to interpret some or all of the words? To really understand each other, no system should use words in a different way than any other system. They should therefore form their messages using the same terminology; they should use the same vocabularies, or sets of words with an unambiguous, agreed-upon meaning.
RDF facilitates the interoperable use of such vocabularies, by inherently linking the syntactic expression of a word to its semantic reference. Remember how every RDF resource has its own URI? Well, every word in RDF is such a URI, and every URI has its own unique meaning. Not only is this practical, since vocabularies of URIs can easily be publicly shared — or even standardized — but it is also extremely self-contained: every piece of data automatically links to the meaningful content of every word it uses, becoming a completely independent, self-describing bundle of information.
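In practice, these vocabulary URIs are usually written as short prefixed names that expand to their full form. The sketch below uses two real, widely shared vocabularies (FOAF and Schema.org); the expansion helper itself is illustrative.

```python
# Two publicly shared vocabularies: a prefix mapped to its namespace URI.
VOCABULARIES = {
    "foaf": "http://xmlns.com/foaf/0.1/",  # Friend of a Friend
    "schema": "https://schema.org/",       # Schema.org
}

def expand(prefixed_name):
    """Expand a 'prefix:localName' word to its full, globally unique URI."""
    prefix, _, local = prefixed_name.partition(":")
    return VOCABULARIES[prefix] + local

print(expand("foaf:knows"))   # → http://xmlns.com/foaf/0.1/knows
print(expand("schema:name"))  # → https://schema.org/name
```

Because each expanded URI identifies exactly one documented concept, any system that dereferences or recognizes it interprets the word the same way.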
Within semantic interoperability, one further step can be taken, however: relying not only on vocabularies, but also on ontologies, which describe contextual semantics. After all, most of the time communicating systems not only talk about the same things, but also talk about them within a specific domain of application. Schema languages like RDFS (which stands for RDF Schema) and ontology languages like OWL (the Web Ontology Language) allow us to define conceptual and contextual relations between multiple context-independent words from our vocabularies. Since this additional information is itself expressed in RDF and identified by a URI, it can easily be linked to from a message.
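One flavour of such contextual relations can be sketched with rdfs:subClassOf: given a small class hierarchy, a system can infer type facts it was never explicitly told. The class names are hypothetical, and a real reasoner handles far more than this (property domains and ranges, OWL axioms, and so on).

```python
# A tiny rdfs:subClassOf hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "ex:Employee": "ex:Person",
    "ex:Person": "ex:Agent",
}

def infer_types(explicit_type):
    """Follow rdfs:subClassOf upwards to collect every implied rdf:type."""
    types = {explicit_type}
    current = explicit_type
    while current in SUBCLASS_OF:
        current = SUBCLASS_OF[current]
        types.add(current)
    return types

# Stating only that something is an ex:Employee implies two more types.
print(infer_types("ex:Employee"))
```

This is the payoff of the ontology layer: two systems sharing this schema agree not just on what each word means, but on how the concepts relate.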
As we progressed through the different levels of interoperability, it has become clear that the bulk of Solid's syntactic and semantic interoperability comes from the inclusion of RDF, while the underlying technical interoperability is achieved by combining a number of existing web standards centered around secure HTTP communication. Support for these increasing levels of interoperability leads to more efficient data processing and exchange within large decentralized ecosystems. By building on a number of successful existing standards, Solid provides a single point of reference for implementing such an efficient, highly interoperable data ecosystem.