Why IF Save Files Break

A frequent complaint about Inform (and TADS, and I think all of the parser-IF development tools) is that save files aren't portable between versions. If you release an updated version of your game, even if it's only to fix textual typos, then everybody's save files break. Players have to decide whether to keep playing the old version of the game or start from scratch in the new version.

Why is this? Can't it be fixed?

(This is an extended version of a forum post I made in September on the subject.)

The short answer is: No, it can't be fixed. The slightly longer answer is: It can't be fixed without putting a large and very programmer-y burden on the game author. IF development tools aim to make life easier on the author, and so they rely on a save model which is absolutely painless, effortless, and reliable -- but not portable between game versions. The alternatives are painful and effortful.


There's an obvious solution which doesn't work. I should dispose of that first. It is: save the command history of the player, and play it back in the new game file.

This is great, except it's unreliable. Game versions sometimes change the game behavior. If the point of the update was to fix a bug, it may well be a bug that affects the responses to the player's commands. This sounds theoretical, but there are cases where you'd expect it. Say the player calls a door the "fancy" door, gets an error, and retries with plainer wording:

You don't see any such thing.
You walk into the bedroom.

The game author reads this, smacks herself on the head, and adds "FANCY" as a door synonym. Replay the command history and you now have:

You walk into the bedroom.
You walk back out of the bedroom.

The game then goes off the rails because the player is in the wrong room. This is a slightly contrived example, but I'd expect a lot of cases where small changes in parsing invalidate a command history.
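The divergence is easy to reproduce in a toy model. The sketch below is only an illustration: the commands ENTER FANCY DOOR and ENTER DOOR, the room names, and the `play` function are all made up, and the "parser" is just a vocabulary check.

```python
# Toy model: one door between two rooms, and a parser whose vocabulary
# differs between game versions. All names here are hypothetical.

def play(commands, door_words):
    """Replay typed commands and return the room the player ends up in."""
    room = "hallway"
    for command in commands:
        verb, *nouns = command.lower().split()
        # Reject the command if any noun word is unknown to this version.
        if verb != "enter" or not nouns or not all(w in door_words for w in nouns):
            continue  # "You don't see any such thing."
        # The door joins the two rooms, so a successful ENTER toggles location.
        room = "bedroom" if room == "hallway" else "hallway"
    return room

history = ["ENTER FANCY DOOR", "ENTER DOOR"]

v1_room = play(history, door_words={"door"})           # first command fails
v2_room = play(history, door_words={"door", "fancy"})  # both commands succeed
```

The same history leaves the player in the bedroom under the v1 vocabulary and back in the hallway under v2.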

Then there's a whole host of problems in replaying randomness, which you can try to cope with; replaying interaction with real-world input like Web activity and the real-time clock, which is harder; the general slowdown as games become longer... a lot of headaches. I lied when I said "This is great." It's actually pretty terrible.

You can improve it somewhat by storing a log of parsed actions rather than a log of typed commands, and then replaying those actions with a careful mechanism that validates every action in terms of game state and carries it out in such a way as to maintain valid game state. I think this could work -- but it triples the load on the author. For every action, you now have to write code to validate it when the player types it, validate it when it comes in from the action log, and carry it out (from either source). Current IF systems let the author mix up the validate and carry-out mechanisms, because forcing them to be separate is a giant headache.

(I believe TADS is better at separating them than Inform, but I don't believe it's good enough to make an action-log model easy.)
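To make the validate/carry-out split concrete, here is a minimal sketch of an action-log replay. It is not how Inform or TADS works; the `Action` type, the `HANDLERS` table, and the single TAKE action are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    verb: str
    noun: str

def validate_take(state, action):
    """Check the action against game state without changing anything."""
    return state["location"][action.noun] == state["location"]["player"]

def carry_out_take(state, action):
    """Apply the action; assumes validation has already passed."""
    state["location"][action.noun] = "player"

# Every action needs both halves, kept strictly separate.
HANDLERS = {"take": (validate_take, carry_out_take)}

def replay(state, log):
    """Replay parsed actions; return how many were applied before divergence."""
    for index, action in enumerate(log):
        validate, carry_out = HANDLERS[action.verb]
        if not validate(state, action):
            return index  # the log no longer fits this version of the game
        carry_out(state, action)
    return len(log)

state = {"location": {"player": "kitchen", "lamp": "kitchen", "diamond": "vault"}}
log = [Action("take", "lamp"), Action("take", "diamond")]
applied = replay(state, log)
```

Here the replay applies the first action and stops at the second, because the diamond is not in the player's room. Deciding what to do at that point is still the author's problem.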

A possible solution

Let's try to construct a compromise -- a save model which allows portable game files, at the cost of some extra work for the game author. This is going to be a very generalized approach. It's not Inform-specific. (In fact, Inform is organized in exactly the opposite way. I will get into Inform's particular hassles later.)

Imagine you have an IF game in some abstract sense. It's a bundle of information. It's not organized in a very regular way -- not like a database schema of regular tables. Some objects have common properties ("location") but some have object-specific properties, there are a bunch of ad-hoc global variables, etc.

To save the game state in a portable way, we have to give every bit of information a name and then save the big heap of named data. When we read the data back in with a new version, hopefully most of the names haven't changed -- they still exist in the new version and they mean the same thing. For the ones that have changed, we supply an upgrade routine that massages the data. (The routine will make the appropriate tweaks to the data as it's read in.)

Here's a simple example. Every object has a location, which is another object. This models a simple containment tree. (Skip over I7's complexity of different containment/support/incorporation relations.) So at startup time, we might have

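A sketch of what such a startup heap could look like, assuming a flat Python dict keyed by "OBJECT.property" names. The lamp, the key format, and the use of `None` for off-stage are made up for illustration; the `contents` helper is hypothetical too.

```python
# Hypothetical startup heap: every dynamic datum gets a permanent name.
# None stands for "off-stage" (no location).
heap = {
    "KITCHEN.location": None,      # rooms sit at the top of the tree
    "PLAYER.location": "KITCHEN",
    "LAMP.location": "PLAYER",     # carried by the player
    "DIAMOND.location": None,      # not in play at startup
}

def contents(heap, holder):
    """Names of the objects directly inside `holder`."""
    return sorted(
        key.split(".")[0]
        for key, value in heap.items()
        if key.endswith(".location") and value == holder
    )
```

The containment tree falls out of the location entries: `contents(heap, "KITCHEN")` finds the player, and `contents(heap, "PLAYER")` finds the lamp.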

(The diamond has not been created yet so it starts off-stage.)

We can easily throw some global variables into the heap:

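For instance, globals could share the namespace under a reserved prefix. This is a sketch with assumed names and values (the `GLOBAL.` prefix and a starting maximum score of 300 are not from any real system), using JSON as a stand-in save format.

```python
import json

# Hypothetical heap with object properties and globals side by side.
heap = {
    "PLAYER.location": "KITCHEN",
    "DIAMOND.location": None,
    "GLOBAL.score": 0,
    "GLOBAL.max_score": 300,   # assumed v1 value
}

def save(heap):
    """Serialize the named heap; sorted keys keep save files stable."""
    return json.dumps(heap, sort_keys=True)

def restore(text):
    """Read a named heap back in."""
    return json.loads(text)
```

Because everything is addressed by name, the save format does not care whether an entry is an object property or a global.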

In version 2, you decide to add a treasure, the emerald, which starts in the kitchen. Save files from version 1 aren't going to mention the emerald at all, of course. So you need a function (give it a conventional name like V1_TO_V2) which sets all the emerald properties when a v1 save file is loaded. It could also increase the max-score global to 350.
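A minimal sketch of such a routine, assuming the save is a dict of named values. The key names and the `GLOBAL.` prefix are illustrative assumptions.

```python
def v1_to_v2(heap):
    """Upgrade a v1 heap to v2 without modifying the original dict."""
    upgraded = dict(heap)
    # v1 saves never mention the emerald, so give it its v2 starting place.
    upgraded.setdefault("EMERALD.location", "KITCHEN")
    # The new treasure raises the maximum score.
    upgraded["GLOBAL.max_score"] = 350
    return upgraded

v1_save = {"PLAYER.location": "CELLAR", "GLOBAL.score": 40, "GLOBAL.max_score": 300}
v2_state = v1_to_v2(v1_save)
```

Everything the routine does not mention (the player's location, the current score) passes through untouched, which is the whole point of naming the data.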

If you decide to delete the diamond in v3, that might be trickier. If it vanishes from the player's inventory (or from the trophy cabinet), the V2_TO_V3 routine might have to appropriately decrease the score. But what if the player handed the diamond to the troll as a bridge toll? You could bring the troll back into play, but if the player is on the far side of the bridge, the game could wind up in an impossible state. There are various ways to work around this; you'll have to pick one.

(One option is always for the V2_TO_V3 routine to say "Sorry, this is too large a version jump, I can't restore v2 files." Current IF systems effectively always fall back on this option. We're trying to be nicer, but in practice we might wind up with some version jumps that are "save-safe" and some that aren't.)
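Chained together, the upgrade routines might look like the sketch below. It assumes the diamond is worth 50 points while held or in the trophy cabinet, and it takes the "refuse" option for the troll case. The key names, `UnsupportedSave`, and `load` are all hypothetical.

```python
class UnsupportedSave(Exception):
    """Raised when a save file cannot be upgraded safely."""

DIAMOND_POINTS = 50  # assumed score value of the diamond

def v1_to_v2(heap):
    upgraded = dict(heap)
    upgraded.setdefault("EMERALD.location", "KITCHEN")
    upgraded["GLOBAL.max_score"] = 350
    return upgraded

def v2_to_v3(heap):
    where = heap.get("DIAMOND.location")
    if where == "TROLL":
        # The bridge-toll case: refuse rather than guess.
        raise UnsupportedSave("Sorry, I can't restore this v2 file.")
    upgraded = dict(heap)
    upgraded.pop("DIAMOND.location", None)   # the diamond no longer exists
    if where in ("PLAYER", "TROPHY_CABINET"):
        upgraded["GLOBAL.score"] -= DIAMOND_POINTS
    return upgraded

UPGRADES = {1: v1_to_v2, 2: v2_to_v3}
CURRENT_VERSION = 3

def load(heap, saved_version):
    """Run every upgrade between the save's version and the current one."""
    for version in range(saved_version, CURRENT_VERSION):
        heap = UPGRADES[version](heap)
    return heap
```

A v1 save passes through both routines in order, so each routine only ever has to understand the version immediately before it.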

So this is a system. It's uglier than IF programming should be, because we're requiring the author to write these upgrade routines. But it's theoretically possible.

It's critical, though perhaps not obvious, that the author has to be involved in the process of naming objects (and properties, and globals). When the author implements the diamond, they'll have to decide (at that point) that it will be called DIAMOND in the save file, then and forevermore.

It's no use the compiler auto-assigning names OBJECT0, OBJECT1, OBJECT2, etc. Because first, that makes the V1_TO_V2 function hellishly obfuscated; and second, what if the author inserts a new object early in the game? It throws off all the other numbers; now the V1_TO_V2 has to be really big because it's reshuffling every single data bit. No author wants to write that. (And the compiler can't auto-generate it because, well, that's the problem we're trying to solve in the first place!)
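The renumbering problem can be seen in a few lines. This sketch assumes a compiler that numbers objects in source order; `auto_names` and the object list are hypothetical.

```python
def auto_names(objects_in_source_order):
    """Assign OBJECT0, OBJECT1, ... in the order objects appear in the source."""
    return {f"OBJECT{i}": name for i, name in enumerate(objects_in_source_order)}

v1_names = auto_names(["kitchen", "player", "diamond"])
# In v2 the author inserts the emerald early in the source.
v2_names = auto_names(["kitchen", "emerald", "player", "diamond"])

# Save-file keys that now refer to a different object than they did in v1.
shifted = sorted(key for key in v1_names if v2_names[key] != v1_names[key])
```

Every key after the insertion point now means something else, so a v1 save that says OBJECT1 is in the cellar would move the emerald instead of the player.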

Now presumably every object (and global, etc) already has a "source code name" that the author made up. (Slightly tricky in I7, but again, skip that for now!) So this naming problem is in a sense already solved. But this means that we're exposing the source code names in a way which the author isn't used to. In a typical IF program, if you decide to rename a global variable, you do your context-sensitive search-and-replace and it's done. But in this system, you have to also add a line to a V1_TO_V2 function. The fact that the variable changed names gets embedded in the program, and it has to stay there forever (or until you decide to stop supporting v1 save files).
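In code, that extra line might be an entry in a rename table that the upgrade routine applies. This is a sketch; the variable names are invented.

```python
# Hypothetical record of source-code renames made between v1 and v2.
# It has to stay in the program for as long as v1 saves are supported.
V1_TO_V2_RENAMES = {
    "GLOBAL.troll_mood": "GLOBAL.troll_anger",
}

def apply_renames(heap, renames):
    """Return a copy of the heap with old names mapped to new ones."""
    return {renames.get(key, key): value for key, value in heap.items()}

v1_save = {"GLOBAL.troll_mood": 3, "GLOBAL.score": 10}
v2_state = apply_renames(v1_save, V1_TO_V2_RENAMES)
```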

So complexity accumulates in unpleasant ways. One more thing to remember, one more potential bug.

Let's go back to substantive data changes (as opposed to renaming things). I've skipped an important distinction: static properties (which never change over the course of the game) versus dynamic properties (which do). Similarly, let's distinguish global constants from global variables.

Our life is much simpler if we don't store constants and static properties in the save file at all. They are fixed in the game file at startup, and the restore mechanism never touches them. So we can update them in v2 without any trouble at all! The player launches the v2 game: all the static properties have the correct v2 value. The player restores a v1 save file: the static properties are not touched, they still have the correct v2 values. Copacetic.

You can go a long way with static properties. For example, in my Inform games, I always treat "description" as static; I (almost) never change an object's description. If the object is mutable, I give it a description value like "[if in Kitchen]...[else]...[end if]". That's fine in the save system I'm describing. It's code, but Inform already treats code as static.

So you can imagine changing an object's description in v2 of a game. As long as you treat description as a static property, this will not require any V1_TO_V2 work; it's a guaranteed safe fix.

In fact you can imagine a large class of updates which only involve updates to static data. (Fixing textual typos, fixing logic errors within functions, changing constants, etc.) The system handles these very nicely! This is the domain within which an IF system could have safe, reliable version bumps with no extra work by the author.
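A sketch of the static/dynamic split, assuming static data lives only in the game file and the save holds dynamic data alone. The table names, the description text, and the JSON format are illustrative.

```python
import json

# Static data: part of the v2 game file, never written to a save.
STATIC = {"DIAMOND.description": "A diamond the size of a hen's egg."}

# Dynamic data: the only thing a save file contains.
DYNAMIC_DEFAULTS = {"DIAMOND.location": None, "PLAYER.location": "KITCHEN"}

def save(dynamic):
    return json.dumps(dynamic)

def restore(text):
    """Start from this version's defaults, then overlay the saved values."""
    dynamic = dict(DYNAMIC_DEFAULTS)
    dynamic.update(json.loads(text))
    return dynamic

def describe(name):
    """Descriptions always come from the running game file."""
    return STATIC[name + ".description"]

# A save written by v1, whose description text may have had a typo.
v1_save = '{"DIAMOND.location": "PLAYER", "PLAYER.location": "CELLAR"}'
state = restore(v1_save)
```

After the restore, the player and the diamond are where the v1 save left them, while `describe("DIAMOND")` returns the v2 text. No upgrade routine was involved.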

Let's imagine making Inform work this way

Unfortunately, Inform is in no way engineered for this sort of thing! It doesn't much try to distinguish between static and dynamic data. You can define global constants, but nearly everything else is implemented internally as dynamic data. (You can say "The weight of a treasure is always 12," for example -- but it's not implemented as a truly immutable property. And even though I conventionally treat descriptions as fixed, there's no way to tell Inform that this is so.)

I say "unfortunately", but really it's kind of a feature. If you did decide to change a description property halfway through the game, would you want an extra stumbling block? You'd want to just do it, write an apologetic comment to yourself, and move on. And this gets back to the complexity thing. This save scheme seems awesome until you realize that every time you add a damn property you have to think about its save-and-restore strategy. IF games are full of ad-hoc properties. The languages are designed to make that easy.

Inform, in particular, is downright efflorescent with saved state. Every time you write "[first time]...[only]" or "[one of]...[stopping]", bam, that's a new flag. Every time you write a rule "if examining foo for the third time", bam, that's a new counter. A relation is implicitly a property, unless it's a many-to-many relation, in which case it's an array. Grammar tables are stored as arrays; there's no built-in way to alter them during play, but the system allows the possibility.

The naming problem also rears its nasty, hydra-like heads. An object has a clear source-code name (except that it's not so clear, what with synonyms and optional adjectives and "the" and so on). But what about a grammar table entry? The flag associated with a "[first time]" substitution?

Having to think about save-and-restore problems for every one of these features is an almighty pain in the spinal cord. For a start, you have to come up with a permanent name for each of these supposed-to-be-automatic features. Then you have to think about how it might break between versions. And remember, even adding such a thing affects the reliability of your save files. It's nice to think, oh, I will only make safe changes this version -- but one little "[first time]...[only]" sneaks in...

In real life, bug fixes are messy. During _Hadean Lands_ development I had a silly bug -- "PUT ME IN BEAKER" sometimes succeeded. The fix was, logically, a purely static change: I reduced the scope of that rule from "things" (which included the player) to a smaller class. But the fix would still require a V1_TO_V2 rule because the player shouldn't be in the beaker! To really be reliable, I'd have to check that case and move the player out. (And then there's the case where you follow up by emptying the beaker, so the player winds up off-stage...)
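As a rough illustration of that repair (this is not code from _Hadean Lands_; the key names and the refuse-when-stuck policy are assumptions), an upgrade rule for the beaker bug might look like:

```python
class UnsupportedSave(Exception):
    """Raised when a v1 save cannot be repaired automatically."""

def v1_to_v2(heap):
    """Undo the effects of the PUT ME IN BEAKER bug in a v1 save."""
    upgraded = dict(heap)
    player_at = upgraded.get("PLAYER.location")
    if player_at is None:
        # The beaker was emptied with the player inside: nowhere to recover to.
        raise UnsupportedSave("player is off-stage")
    if player_at == "BEAKER":
        beaker_at = upgraded.get("BEAKER.location")
        if beaker_at in (None, "PLAYER"):
            raise UnsupportedSave("no sensible place to put the player")
        # Move the player out to wherever the beaker is.
        upgraded["PLAYER.location"] = beaker_at
    return upgraded
```

Even this small rule has three branches that would each need testing against real v1 saves.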

The average IF author is not willing to think about this stuff during game development. Nor, in most cases, are they likely to get it right. Authors won't test cross-version saves; if they do, they won't test them consistently or completely. If you screw it up, well, your game is buggier than before -- for players with v1 save files.

It's really hard to advertise a compiler feature as "works some of the time, as long as you don't do any of this set of things that is hard to explain." And that's why we tell players with v1 save files to throw them away.

No, really, it's not going to happen

Since it's never tried to do cross-version saving, Inform is chock full of design decisions that make the problem even harder. It strips source-code names out of the compiled game file. It generates data structures that cannot be analyzed at run-time. Glulx games don't even have objects per se; the compiler outputs a big array of bytes, some of which are code and some of which are data, and the interpreter doesn't care as long as it's internally consistent. (Within a given release!) This is great for compiler extensibility; an Inform compiler upgrade rarely requires an interpreter upgrade. But it means that every game release is an opaque snowball that can never be related to any other release.

To imagine even the simplest form of portable game files, we'd have to re-engineer Inform and its virtual machine from scratch. Which is not a bad idea! Glulx's low-level "dumb" design results in some annoying limitations. But it's a large engineering task and nobody is currently planning to tackle it.

Last updated Nov 1, 2014.
