Cory Doctorow argues that fighting the enshittification of the world involves making data and software more interoperable, in order to reduce the cost of switching away from centralized Big Tech platforms. But making data formats interoperable is often easier said than done, even when they are based on open standards. For instance, the YAML 1.2 spec is very complex for humans to understand, and many YAML parsers aren't fully spec-compliant either. And does your Nix tree-sitter grammar cause weird editor behavior too?
We argue that we need a more concise and ergonomic way to express the different syntaxes and data models of textual and binary data formats: a way that makes it easier to reason and talk about them, and maybe to come up with your own when needed. And, of course, a way to transparently convert between them for better interoperability based on your specific needs. Most existing approaches are implementation-defined (e.g. Pandoc's abstract syntax tree representation lives in Haskell code files) or otherwise tied to an ecosystem (e.g. JSON Schema). Can we properly unify them somehow, without causing the standards proliferation effect known from the famous XKCD comic [ https://xkcd.com/927/ ]?
In this talk, we propose an approach to interoperability that conceptually resembles the ideas behind LSP and LLVM: instead of writing many ad-hoc compatibility layers for individual pairs or small groups of formats, we give every textual and binary format a concise schema that bidirectionally maps it to a canonical AST representation. Every ecosystem then only needs to implement a single metamodel and schema parser to support all formats that have a schema available. Viewing textual and binary formats as mere representations of a canonical AST also makes it easy to create syntax extensions, preprocessors or a fully custom syntax in a language-independent way. Syntax does not matter; making all formats interoperable via canonical AST representations does. Let's do it!
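To make the hub-and-spoke idea concrete, here is a minimal Python sketch. It is not the schema language or metamodel presented in the talk: the `Format` type, the toy `kv` line format, and the flat string-to-string AST are all hypothetical and chosen only for illustration. Each format contributes one bidirectional mapping to the shared AST, and conversion between any two formats goes through that AST.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical canonical AST: a flat mapping of string keys to string values.
# A real metamodel would be far richer; this is only a stand-in.
Ast = Dict[str, str]


@dataclass(frozen=True)
class Format:
    """One bidirectional mapping between a concrete syntax and the AST."""
    parse: Callable[[str], Ast]
    render: Callable[[Ast], str]


def parse_kv(text: str) -> Ast:
    # Toy "key=value" line format, invented for this example.
    ast: Ast = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition("=")
            ast[key.strip()] = value.strip()
    return ast


def render_kv(ast: Ast) -> str:
    return "\n".join(f"{key}={value}" for key, value in ast.items())


# One entry per format: N formats need N mappings, not N*(N-1) converters.
FORMATS: Dict[str, Format] = {
    "json": Format(parse=json.loads,
                   render=lambda ast: json.dumps(ast, sort_keys=True)),
    "kv": Format(parse=parse_kv, render=render_kv),
}


def convert(text: str, src: str, dst: str) -> str:
    # Every conversion goes source syntax -> canonical AST -> target syntax.
    return FORMATS[dst].render(FORMATS[src].parse(text))
```

Calling `convert('{"name": "demo", "version": "1"}', "json", "kv")` yields two `key=value` lines. Adding a third format under this assumption means registering one more `Format` entry, after which it can be converted to and from both existing formats.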
Session materials will be available here: https://eh23.schmyntax.net/
This work is licensed under CC BY-NC 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/4.0/