Watch the talk on YouTube.
Download the slides for this talk (PDF, 722k).
I'm Andrew Plotkin. I've been an interactive fiction enthusiast since, oh, 1978. I've played a bunch of it, I've written a bunch of it, and I've worked on a bunch of the tech that supports it, which is what I'm going to talk about today.
To be clear, I will be talking about old-school parser-based interactive fiction. That is, games where you type commands and the game responds with text. The term "IF" has gotten very broad in the past few years, what with Twine, choice-based games, narrative indie games, and so on. Which are awesome! But this talk is about the Zork format.
In 1979, a bunch of geeks sat down in Cambridge to invent a virtual machine. They wanted to launch their game Zork on home computers. But there were a lot of home computers (Apple, TRS-80, more coming). The Z-machine format let them write a game once and run it everywhere.
So they set up Infocom. They use the Z-machine to ship thirty-five IF games, and then VGA graphics are invented and that's the end of Infocom. But, there are lots of Infocom fans on the Internet (the Internet is a thing now). So a group called the InfoTaskForce sits down to reverse-engineer the Z-machine and write an open-source interpreter, so that everyone can keep playing the Infocom game files.
This is where I show up. Around '93, I decided I wanted to write a nice Z-machine interpreter. GUI, proportional fonts, scroll bars, copy and paste. At the time I was most familiar with Unix and X-windows coding. So I took a terminal-window Z-machine engine called Zip, and turned it into an X-windows interpreter called XZip.
A couple of years later I got into MacOS programming, so I ported XZip over to the Mac. Same engine, new display code. I called that MaxZip, short for Mac-XZip, which was a terrible name decision. But not my last.
I also started writing IF games, using a language called Inform which targeted the Z-machine. But -- different talk.
The next project was a Mac port of TADS. TADS is a free IF development language like Inform, but it has its own virtual machine and interpreter. I ported that to Mac. I used roughly the same display code as MaxZip, but with the TADS engine underneath.
So here's what we've got. It's crying out to be modularized! Imagine an API which connects a display library to a VM engine. You compile one with the other and you have an interpreter. And then I could fill in that last X-windows-TADS square (which I never got around to doing).
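A minimal sketch of that modular idea, in Python rather than C for brevity. Every name here (`Display`, `ScriptedDisplay`, `echo_engine`, `run`) is hypothetical and invented for illustration; none of it is the real API. The point is only that the engine talks to the display through a small interface, so any engine can be paired with any display.

```python
from typing import Protocol


class Display(Protocol):
    """Hypothetical display-side interface (an assumption, not the real API)."""

    def put_text(self, text: str) -> None: ...
    def read_line(self) -> str: ...


class ScriptedDisplay:
    """A stand-in 'display library': feeds canned input and records output."""

    def __init__(self, commands):
        self.commands = list(commands)
        self.output = []

    def put_text(self, text):
        self.output.append(text)

    def read_line(self):
        # Return the next canned command, or "quit" when the script runs out.
        return self.commands.pop(0) if self.commands else "quit"


def echo_engine(display: Display) -> None:
    """A stand-in 'VM engine': it only touches the display via the interface."""
    display.put_text("Welcome.")
    while True:
        command = display.read_line()
        if command == "quit":
            display.put_text("Bye.")
            return
        display.put_text(f"You typed: {command}")


def run(engine, display):
    """Pairing any engine with any display gives you an 'interpreter'."""
    engine(display)


display = ScriptedDisplay(["look", "north"])
run(echo_engine, display)
print(display.output)
# ['Welcome.', 'You typed: look', 'You typed: north', 'Bye.']
```

Swapping `ScriptedDisplay` for a GUI or terminal class with the same two methods would leave `echo_engine` untouched, which is the whole appeal of filling in the grid one row or column at a time.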
And there were other IF engines like AGT, Hugo, Alan, ADRIFT. If I could pull them into this scheme, then porting them would get really easy.
At the same time -- this was around 1997 -- it was clear that Infocom's Z-machine was ageing out. It had a 16-bit memory map and other limits which were cramping ambitious Inform games.
So here's my grand plan. First I design the API. Then I write display library implementations for X-windows, MacOS, and terminal window. (Someone else can do Win95.) Then I design a new 32-bit IF VM and update the Inform compiler to target it. At the same time, everybody updates their IF engines to use the new API, and after that all IF technology is uniform, portable, and awesome.
So I did that.
I named the display library "Glk" and the VM "Glulx", because the whole IF community was on the Internet. We would never have to say these names out loud, so it didn't matter if they were hard to pronounce. Right?
There's a reason that Inform and TADS and all these other IF systems have the same basic interface: they're all trying to replicate Infocom. And Zork was Colossal Cave fanfic. So the interface goes back to mainframe terminals and teletypes. The game outputs a stream of text. Nice and abstract. You can do anything with it. Word-wrapping, proportional font, even text-to-speech.
Infocom added simple styles (like italics). They also added an optional status line, which was just a terminal window -- an addressable grid of fixed-width characters. Less abstract, but still easy to implement.
So when I designed Glk, I used that model, slightly generalized. You have one or more windows and each of them is either a text grid or a text stream. The windows can't overlap, because that's messy. Scrolling, wrapping, fonts and so on are entirely the interpreter's problem. So the model runs on everything from GUI systems down to a raw telnet stream. (For telnet, the game only gets a single text stream window. But it can cope with that.)
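To make the two window kinds concrete, here is a rough Python sketch. The class and method names (`GridWindow`, `StreamWindow`, `put_at`, `put`, `render`) are made up for this example and are not Glk's actual function names. The thing to notice is where the wrapping happens: the game hands over text, and the display side decides the line breaks.

```python
import textwrap


class GridWindow:
    """Text grid: an addressable rectangle of fixed-width character cells."""

    def __init__(self, width, height):
        self.width = width
        self.rows = [[" "] * width for _ in range(height)]

    def put_at(self, x, y, text):
        # Write characters starting at column x of row y, clipped at the right edge.
        for offset, ch in enumerate(text[: max(0, self.width - x)]):
            self.rows[y][x + offset] = ch

    def render(self):
        return ["".join(row) for row in self.rows]


class StreamWindow:
    """Text stream: the game only appends text; layout is the display's job."""

    def __init__(self):
        self.paragraphs = []

    def put(self, text):
        self.paragraphs.append(text)

    def render(self, width):
        # Word-wrapping is decided here, on the display side, not by the game.
        lines = []
        for paragraph in self.paragraphs:
            lines.extend(textwrap.wrap(paragraph, width) or [""])
        return lines


status = GridWindow(20, 1)
status.put_at(0, 0, "West of House")
status.put_at(15, 0, "0/1")

story = StreamWindow()
text = "You are standing in an open field west of a white house."
story.put(text)

print(status.render())
print(story.render(20))   # narrow display
print(story.render(40))   # wide display, same game output
```

The same `story` content renders at either width without the game knowing or caring, while the status line stays a fixed grid.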
So this model covers the least common capability set of the Z-machine, TADS, Hugo, and so forth. Great!
The problem is that everybody is ambitious; everybody wants to extend the common IF model and add graphics, fonts, sounds, video, smellovision, what have you. Infocom did this: they had the version-6 Z-machine, which added graphics and text layout -- but it broke the simple text stream. That was a gigantic pain in the ass. Then there's Multimedia TADS, which is HTML-like, and Hugo has its own graphics model, and so on.
So I didn't try to handle any of those in Glk. I came up with my own modest graphics model. My VM uses Glk directly to do graphics and sound and hyperlinks, but Glk can't emulate the graphics models of any of the other IF engines.
So we have a compromise. Glulx has decent multimedia support. All these other engines have Glk ports, which are super-portable, but limited to the common text UI.
Maybe this isn't very impressive. But "super-portable" is interesting, and I'll explain why. See, about ten years ago we got this radical notion: you should be able to play IF in a web browser, instead of having to download a weird-ass interpreter app.
This poses an interesting challenge for Glk, which, you may recall, is a C API. It's literally defined by a C header file. How do we run that on the web?
Or how about this. The C API is a bunch of function calls, but it's meant to be used in an event loop. You make a bunch of output calls, and then call the magic select function, which blocks and waits for input. But a web app already has an event loop. How about we just generate a JSON object representing all the output? And then accept a JSON object representing the next input. That's webbier.
We don't have to change the game structure. We can put the JSON layer on top of the JS API. So the game calls Glk functions, auto-generated JS functions, and those accumulate the JSON output object. When the game calls the magic select function, the JSON flies out and the interpreter sets up callbacks for input events.
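The accumulate-then-flush pattern can be sketched in a few lines. This is Python, not the JS layer described above, and the JSON field names (`type`, `content`, `window`, `text`, `value`) are assumptions chosen for the example, not the real wire format. It also cheats by using a blocking `receive` callable where a browser would register callbacks.

```python
import json


class JsonDisplay:
    """Collects output calls and emits one JSON update per input request."""

    def __init__(self, send, receive):
        self.send = send          # callable taking a JSON string (output)
        self.receive = receive    # callable returning a JSON string (input)
        self.pending = []         # output accumulated since the last select()

    def put_text(self, window, text):
        # Output calls do no I/O; they only record what should be displayed.
        self.pending.append({"window": window, "text": text})

    def select(self):
        # The one "magic" call: flush everything, then wait for an input event.
        self.send(json.dumps({"type": "update", "content": self.pending}))
        self.pending = []
        return json.loads(self.receive())


sent = []
inputs = iter(['{"type": "line", "value": "open mailbox"}'])
display = JsonDisplay(sent.append, lambda: next(inputs))

display.put_text("story", "West of House")
display.put_text("story", "There is a small mailbox here.")
event = display.select()

print(sent[0])
print(event["value"])
```

Two output calls produce exactly one JSON object, and nothing leaves the game until it asks for input.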
We haven't changed the UI model, but now it's JSON objects instead of function calls. Data is great! We can do all sorts of things with data. We can log the data objects, and then we get a transcript of the game session. We can clone the data objects to another machine, and then the game session is mirrored for other people to watch. We can do regression tests. Nothing is easier to test than a stream of JSON outputs.
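For instance, a regression test can be little more than a list comparison. The sketch below assumes a toy game function and a made-up update format (both hypothetical); a real harness would replay a saved log against an actual interpreter.

```python
import json


def toy_game(command):
    """Stand-in for a game turn: maps one input event to one output update."""
    replies = {"look": "You are in a maze.", "xyzzy": "Nothing happens."}
    text = replies.get(command["value"], "I don't understand.")
    return {"type": "update", "content": [{"window": "story", "text": text}]}


def record(game, commands):
    """Log a session as a list of canonical JSON strings, one per turn."""
    return [json.dumps(game(cmd), sort_keys=True) for cmd in commands]


def first_difference(expected, actual):
    """Return the index of the first turn that differs, or None if identical."""
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return index
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


commands = [{"type": "line", "value": v} for v in ("look", "xyzzy", "dance")]
golden = record(toy_game, commands)   # saved from a known-good run
replay = record(toy_game, commands)   # produced by the build under test
print(first_difference(golden, replay))
# None
```

Serializing with `sort_keys=True` keeps the comparison from tripping over key order, and the returned index tells you which turn to look at when something changes.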
This was so much fun that I went back and reimplemented the JSON layer in C. Now you can take one of the C engines, compile it with the JSON display library, and you get an interpreter which does JSON stdin/stdout. That's a handy component.
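The shape of such a component is roughly the loop below. This is a Python illustration, not the C implementation, and it assumes one JSON object per line, which is a simplification made for the example. The `toy_engine` function and the field names are hypothetical.

```python
import io
import json
import sys


def serve(handle_event, infile=sys.stdin, outfile=sys.stdout):
    """Read one JSON input event per line; write one JSON update per line."""
    for line in infile:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        update = handle_event(json.loads(line))
        outfile.write(json.dumps(update) + "\n")
        outfile.flush()  # a pipe reader should see each update promptly


def toy_engine(event):
    """Stand-in for a compiled engine; it just echoes the command."""
    return {"type": "update", "content": [{"text": "You typed: " + event["value"]}]}


# Drive the loop with in-memory streams instead of real pipes.
fake_in = io.StringIO(
    '{"type": "line", "value": "look"}\n\n{"type": "line", "value": "quit"}\n'
)
fake_out = io.StringIO()
serve(toy_engine, fake_in, fake_out)
replies = [json.loads(text) for text in fake_out.getvalue().splitlines()]
print(replies[0]["content"][0]["text"])
```

Because the only contract is text in and text out, the same process can sit behind a web server, a test script, or a chat bot without being recompiled.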
If I had more than ten minutes, I could go into the vast array of things I did wrong or failed to do right. Notably, I've been promising to integrate CSS stylesheets for about ten years now... Sorry.
But, on the up side, Skye used Glk to hack the Internet in Agents of SHIELD season 1. So that's cool.
Thanks for listening.