The basics of Sokoban level formats for designing your own Sokoban levels

A somewhat difficult Sokoban level, play in progress.

Maybe you’ve played Sokoban, and you’ve wondered about how to make your own levels. I described the basics of the game in a previous article, and went over some thoughts about designing levels in another article.

From what I can tell, there are two different formats for representing Sokoban levels. One is somewhat free-form, the other is based on XML.

To design your own levels to play in Sokoban 3.0, you must write in the XML-based SLC format in your favorite plaintext editor. For Sokoban YASC, you can write the more freely structured SOK format in your favorite plaintext editor.

Or you can use the YASC Editor for a WYSIWYG design experience. And it helps you avoid common mistakes, like having more boxes than goals or completely forgetting to put in the player token.

But, as far as I can tell, YASC can’t export to the SLC format. So, if you want to distribute your own Sokoban levels in both formats, you have to learn how to read and write both formats. That’s what this article is about.

Before you get to that point, though, you must of course play your own levels. Make sure they have reasonable solutions. But it might also be a good idea to run your solutions through the YASC Optimizer.

If you’re anything like me, you’ll probably design a Sokoban level that you think is deliciously difficult, but when you run it through an optimizer, you find that it has a trivially simple solution.

When a level is way too easy, way too hard, or unsolvable, you go back to the drawing board, whether that’s the YASC Editor or your favorite plaintext editor.

YASC also includes the YASC Solver. In my opinion, the only legitimate use for a solver is when you suspect you’ve created an impossible level. On the other hand, if you know you’ve created an impossible level, there’s no need to take it through the solver; just go straight back to the drawing board.

The “free form” SOK format

SOK format files have a *.sok file extension. They’re essentially just human-readable text files. No markup, no markdown.

At least in the levels that come with Sokoban YASC, the files have what seems to be a standard preamble explaining the symbols for the level elements, and other details, including mentions of Sokoban variations.

There are actually two different sets of symbols for level elements, though in both of them “#” stands for a wall brick and “.” stands for an empty goal. Then “p” or “@” stands for a player token not on a goal, “b” or “$” stands for a box not on a goal.

In both sets of symbols, you can use spaces to represent floor tiles (that is, a space with no player token, box, goal or wall brick on it). You can also use “-” or “_” if you think that’ll make your meaning clearer.

Actually, the preamble says that

The first and the last non-empty square in each row must be a wall or a box on a goal. An empty interior row is written with at least one “-” or “_”.

To be honest with you, I’m not sure there’s ever a need to represent “an empty interior row.” But if the need arises, you know what to use.

The second set of symbols seems to be more widely used. The example level shown above, the one with the four boxes and the four goals, looks like this using the second set of symbols:

       ####
######## ##
# ###
# @$$ ## ..#
# $$ ## ..#
# ####
###########

A lot of level creators seem to really like making levels in which the player is on a goal and every box but one is also on a goal. Such levels are diabolically difficult when the only possible solutions require moving boxes out of goals before you can put the box that’s not already on a goal into a goal.

This next example can be solved without having to move every single box. In fact, two of the boxes don’t need to move at all.

Regardless of what solutions are possible, it’s necessary to convey, with a single character, that the player token or a box is on a goal.

For a player token that’s on a goal, you can use “P” or “+”, and for a box that’s on a goal, you can use “B” or “*”. This 5-box example would thus be:

      #####
##### #####
# #
# ### ### #
#### # # ####
# # * # #
# $ # *+* #
# # * # #
#### # # ####
# ### ### #
# #
##### #####
#####
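The two symbol sets, and the common mistakes mentioned earlier (more boxes than goals, no player token), can be sketched in a few lines of Python. This is only an illustration: the names FIRST_TO_SECOND, normalize_row and sanity_check are made up here and are not part of Sokoban YASC or any other program, and the check assumes ordinary Sokoban with exactly one player.

```python
# Translation from the first symbol set (letters) to the second (punctuation).
# Floor written as "-" or "_" becomes a plain space.
FIRST_TO_SECOND = str.maketrans({
    "p": "@",  # player, not on a goal
    "P": "+",  # player on a goal
    "b": "$",  # box, not on a goal
    "B": "*",  # box on a goal
    "-": " ",  # floor
    "_": " ",  # floor
})


def normalize_row(row):
    """Rewrite one board row using only the second symbol set."""
    return row.translate(FIRST_TO_SECOND)


def sanity_check(rows):
    """Return a list of problems found; an empty list means none were found."""
    text = "".join(normalize_row(row) for row in rows)
    players = text.count("@") + text.count("+")
    boxes = text.count("$") + text.count("*")
    goals = text.count(".") + text.count("+") + text.count("*")
    problems = []
    if players != 1:  # assumption: single-player Sokoban, no Multiban
        problems.append(f"expected 1 player, found {players}")
    if boxes > goals:
        problems.append(f"{boxes} boxes but only {goals} goals")
    return problems
```

Passing this check says nothing about whether a level is solvable; it only catches the slips that are easy to make in a plaintext editor.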

It makes sense to group similar levels into a single file. I won’t quote the whole preamble here. I get the sense it’s not required, but I haven’t tested that hunch.

If you don’t get the preamble off levels packaged with a Sokoban program, you can copy it off my GitHub. The link is for the SOK file of my “Illustrative Levels” collection. They’re all valid Sokoban levels, one or two of them might even be fun, but the main purpose is to illustrate certain common patterns in Sokoban level designs.

Immediately after the preamble, there’s some metadata about the collection as a whole. Here, for example, is the metadata for my Illustrative Levels:

Date Created: 2020-06-10  12:16:45
Date of Last Change: 2020-06-22 03:30:39
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
These levels are Copyright (c) by Alonso del Arte
E-mail: sokoban@mail.com
Original level website: https://github.com/Alonso-del-Arte/
Collection: Illustrative

Next, two blank lines, and the first level. This next example also comes from the Illustrative Levels set:

IllustrativeLevels, "Illustrative #01"    

#####
# ###
# #
# # # #
# # # #
#+$ # #
# #
#######

Author: Alonso del Arte
Email: sokoban@mail.com
Original level website: https://github.com/Alonso-del-Arte/
These levels are copyright (c) by Alonso del Arte and may be freely distributed for non-commercial use.
Author's note: Although these are valid Sokoban levels, each with at least one valid solution, the purpose of these levels is to illustrate certain facts about Sokoban, or certain arrangements of initial state in Sokoban levels, or situations that sometimes arise in the course of trying to solve a Sokoban level.
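If you ever want a program of your own to pick the board out of a level entry like that one, a rough heuristic is to keep only the lines made up entirely of level symbols. The following Python sketch does that. It’s a guess based on the examples in this article, not the parsing rule Sokoban YASC actually uses, and the names BOARD_CHARS, is_board_row and board_rows are invented for the example.

```python
# Every symbol from both symbol sets, plus the floor characters.
BOARD_CHARS = set("#@$.+*pbPB-_ ")


def is_board_row(line):
    """Guess whether a line of a SOK file belongs to a level board."""
    stripped = line.strip()
    if not stripped:
        return False
    if not set(stripped) <= BOARD_CHARS:
        return False
    # A normal row contains a wall; an empty interior row is only "-" or "_".
    return "#" in stripped or set(stripped) <= {"-", "_"}


def board_rows(entry):
    """Collect the board rows from the text of one level entry."""
    # Leading spaces are kept because they are part of the board layout.
    return [line.rstrip() for line in entry.splitlines() if is_board_row(line)]


entry = 'Demo, "Demo #01"\n\n#####\n#@$.#\n#####\n\nAuthor: Somebody\n'
rows = board_rows(entry)
```

A line of dashes inside an author’s note would fool this heuristic, so treat it as a starting point only.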

The levels in this and the other collections of mine are sorted by the number of boxes in each level. This is not at all required. You are free to sort levels in whatever order you like. The only expectation is that each level is at least slightly more difficult than the previous level.

If at first you don’t feel like writing an author’s note, I recommend putting in a short placeholder.

Recording solutions in the SOK format

The SOK preamble also specifies symbols to record solutions. In the case of a new level, even if you have a particular solution in mind, I think it’s best to leave this blank and let Sokoban YASC fill it in later.

Once you play and solve a level for the first time in Sokoban YASC, it will record your solution as the “Best Solution” even if it is woefully inefficient. I actually like to give a very bad solution as the first solution.

The symbols for non-push moves are quite straightforward: “u” for up, “d” for down, “l” for left and “r” for right. For pushes, the symbols are the same except capitalized: “U”, “D”, “L” and “R”.

The preamble says “push/pull.” If I’m understanding correctly, pulls are for reverse mode playing, which might be interesting. “Pusher changes” are relevant only for Multiban, which just doesn’t interest me.

When you improve your first solution, Sokoban YASC records the new solution, downgrading the previous solution. For example, in Illustrative №1, after the 9-push solution, I played the much more obvious and simpler single-push solution.

Best Solution 5/1                        
drruL
Solution 23/9
RdrUUUdddrruuuulLLulDDD
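The two numbers in each heading can be recomputed from the move string: the first counts every step, pushes included, and the second counts only the capitalized ones. Here’s a minimal Python sketch; the function name count_moves_and_pushes is made up for illustration.

```python
def count_moves_and_pushes(solution):
    """Return (moves, pushes) for a solution written with u/d/l/r and U/D/L/R."""
    # Ignore anything that is not a move symbol, such as stray whitespace.
    steps = [ch for ch in solution if ch in "udlrUDLR"]
    pushes = sum(1 for ch in steps if ch.isupper())
    return len(steps), pushes


best = count_moves_and_pushes("drruL")                       # (5, 1)
first = count_moves_and_pushes("RdrUUUdddrruuuulLLulDDD")    # (23, 9)
```

Both results agree with the headings “Best Solution 5/1” and “Solution 23/9” shown above.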

When the YASC Optimizer improves on a solution, it also records a bit more detail besides the moves. You should keep this in mind if you decide to write your own solution optimizer.

This next example is from the still easy level Illustrative №2. I intended it to have a 16-move, 4-push solution, but the YASC Optimizer was able to figure out a way to save two moves.

Best Solution 14/4 (YASO 2.142 Optimizer)
drrULrruulDrdL
Optimizer: YASO 2.142
Optimizer time: 00:00:00
Optimizer date: 2020-06-10 22:56:58
Optimizer metrics: Pushes, moves

Solution 16/4
drrULdruruulDrdL

Sokoban YASC uses a slightly different format when there is no single best solution but rather a move-optimal solution and a push-optimal solution. For Illustrative №6, the YASC Optimizer wasn’t able to find a solution with fewer moves than my 46-move, 10-push solution, but it was able to come up with a push-optimal solution.

Solution/Moves 46/10
urrdLDlddrUUlluurrrrrddlLDlddrUrUdlddrruruuLuL
Solution/Pushes 54/6 (YASO 2.142 Optimizer)
urrdLulldddrrUlluurrrrrddlLuulllddddrrrrUdlddrruruuLuL
Optimizer: YASO 2.142
Optimizer time: 00:00:00
Optimizer date: 2020-06-10 14:12:39
Optimizer metrics: Pushes, moves
Solution 50/10
urrdLullddrdrdrrULUUluurrdLulDlldddrRdrddrruruuLuL

The 50-move, 10-push solution is probably a downgraded solution. If there are more downgraded solutions, they appear next. You can look on GitHub if you care to see more of those for this particular level.
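If you decide to write your own tool that reads these solution records, the heading lines seen so far all follow one pattern: a kind of solution, then moves/pushes, then an optional note in parentheses. The Python sketch below parses just those four kinds of heading. It’s based only on the examples in this article, so Sokoban YASC may well write headings that it doesn’t cover, and the names HEADER and parse_solution_header are invented here.

```python
import re

# Longer alternatives come first so "Solution/Moves" is not read as "Solution".
HEADER = re.compile(
    r"^(Best Solution|Solution/Moves|Solution/Pushes|Solution)"
    r"\s+(\d+)/(\d+)(?:\s+\((.*)\))?\s*$"
)


def parse_solution_header(line):
    """Return a dict describing a solution heading, or None for other lines."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    kind, moves, pushes, note = match.groups()
    return {"kind": kind, "moves": int(moves), "pushes": int(pushes), "note": note}


optimized = parse_solution_header("Solution/Pushes 54/6 (YASO 2.142 Optimizer)")
downgraded = parse_solution_header("Solution 50/10")
other = parse_solution_header("Optimizer time: 00:00:00")
```

Here optimized carries the note “YASO 2.142 Optimizer”, downgraded has no note, and other is None because that line is not a heading.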

Sometimes you play a level again without improving the best solution or the move-optimal solution or the push-optimal solution. Such a solution gets recorded as well, as “SaveGame,” but it’s subject to replacement by a later solution that doesn’t improve on the better solutions.

For that reason I sometimes load the file into Notepad, replace “SaveGame” with “Solution” and try to place the solution where Sokoban YASC would place it if it were a normally downgraded solution.

The XML-based SLC format

The Sokoban Level Collection (SLC) format is XML, so it looks a lot like HTML, and it comes with a document type definition (DTD).

Indentation is not required, but if you’re just starting out with SLC, it’s probably a good idea to indent. Though maybe choose a small indent, like two spaces rather than four.

After the XML version declaration and the document type, we open the top level tag, SokobanLevels. Next we have metadata about the level collection. I don’t know if the SLC DTD specifies anything like the HTML head tag; if it does, the Title, Description and Email tags would probably go within it.

I haven’t yet gotten around to converting the Illustrative Levels SOK file to SLC, so for the following examples I will use my Seemingly Hard set.

These levels are quite easy, but they can be hard if you make your first move impulsively without thinking about it.

So here are the elements of the SLC file for the Seemingly Hard set that I have described so far:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE SokobanLevels SYSTEM "SokobanLev.dtd">
<SokobanLevels>
<Title>Seemingly Hard</Title>
<Description>These levels may seem difficult, but they are
actually quite easy if one stops for a little bit to ponder the
puzzle before making the first move. Otherwise it may be necessary
to restart the level to recover from a deadlock situation.
</Description>
<Email>sokoban@mail.com</Email>
</SokobanLevels>

Next, after the Email element, and before the closing tag for SokobanLevels, there should be the LevelCollection element.

  <LevelCollection Copyright="Alonso del Arte" MaxWidth="32"
MaxHeight="15">
</LevelCollection>

The MaxWidth and MaxHeight attributes should be the greatest width and height, respectively, of any level in the collection. I suspect that some of the Sokoban implementations using the SLC format don’t care about these two attributes, but it’s probably best to make sure they’re correct.
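Rather than counting characters by eye, you could compute those two values. Here’s a small Python sketch that assumes each level is held as a list of row strings; level_size and collection_size are names made up for this example.

```python
def level_size(rows):
    """Return (width, height) of one level given its rows of symbols."""
    # Trailing spaces are ignored so they do not inflate the width.
    width = max((len(row.rstrip()) for row in rows), default=0)
    return width, len(rows)


def collection_size(levels):
    """Return (MaxWidth, MaxHeight) over every level in a collection."""
    sizes = [level_size(rows) for rows in levels]
    max_width = max((w for w, _ in sizes), default=0)
    max_height = max((h for _, h in sizes), default=0)
    return max_width, max_height


levels = [
    ["#####", "#@$.#", "#####"],
    ["####", "#. #", "#$ #", "#@ #", "####"],
]
max_width, max_height = collection_size(levels)  # (5, 5)
```

Note that the widest level and the tallest level don’t have to be the same level.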

Next in the XML tree, there should be at least one Level element enclosed within the LevelCollection element. And within each Level element, there should be at least three L elements. I’m guessing “L” is for “line.”

And even if the DTD does not say there should be at least three L elements within each Level element, that requirement should follow from the principles of the game.

The L elements wrap the game elements, line by line. For those, I think you can use either of the two symbol sets from the SOK format. Here’s Seemingly Hard №1:

    <Level Id="Hard #1" Width="13" Height="9">
<L> ### ####</L>
<L> # #### #</L>
<L> # $ #</L>
<L> # ### #</L>
<L>## ### ####</L>
<L># @ #</L>
<L># ## # ## #</L>
<L>#.#### #</L>
<L>### ########</L>
</Level>
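A Level element like that one can also be generated from a list of board rows, which might be handy if you already have the level in SOK form. This Python sketch uses the standard library’s xml.etree.ElementTree. The function name build_level_element is made up, and I’m assuming the rows go into the L elements exactly as they are, without padding, which I haven’t checked against the DTD or against Sokoban 3.0.

```python
import xml.etree.ElementTree as ET


def build_level_element(level_id, rows):
    """Build a Level element with one L child per board row."""
    width = max(len(row) for row in rows)
    level = ET.Element(
        "Level",
        {"Id": level_id, "Width": str(width), "Height": str(len(rows))},
    )
    for row in rows:
        # Each row of symbols becomes the text of its own L element.
        ET.SubElement(level, "L").text = row
    return level


element = build_level_element("Demo #1", ["#####", "#@$.#", "#####"])
xml_text = ET.tostring(element, encoding="unicode")
```

The resulting xml_text comes out on a single line; indentation is optional anyway.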

Just like the id attributes in HTML documents, the Id attributes in SLC documents should also be unique within each document.

As for Width and Height, those should be correct for the level, and not exceed the MaxWidth and MaxHeight attributes of the LevelCollection element, even if a given Sokoban implementation ignores them.
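Those expectations about Id, Width and Height are easy to check with a short script. The following Python sketch reads an SLC tree with xml.etree.ElementTree and reports duplicate Ids and wrong or oversized dimensions. The function name check_collection and the sample collection are invented for the example, and the checks reflect my reading of the format, not anything the DTD is known to enforce.

```python
import xml.etree.ElementTree as ET


def check_collection(root):
    """Return a list of problems with Id, Width and Height attributes."""
    problems = []
    seen_ids = set()
    for collection in root.iter("LevelCollection"):
        max_width = int(collection.get("MaxWidth", "0"))
        max_height = int(collection.get("MaxHeight", "0"))
        for level in collection.findall("Level"):
            level_id = level.get("Id")
            rows = [line.text or "" for line in level.findall("L")]
            width = max((len(row) for row in rows), default=0)
            height = len(rows)
            if level_id in seen_ids:
                problems.append(f"{level_id}: duplicate Id")
            seen_ids.add(level_id)
            declared = (int(level.get("Width", "0")), int(level.get("Height", "0")))
            if declared != (width, height):
                problems.append(f"{level_id}: Width/Height should be {width}/{height}")
            if width > max_width or height > max_height:
                problems.append(f"{level_id}: larger than MaxWidth/MaxHeight")
    return problems


SAMPLE = """<SokobanLevels>
<LevelCollection Copyright="Somebody" MaxWidth="5" MaxHeight="3">
<Level Id="Demo #1" Width="5" Height="3">
<L>#####</L>
<L>#@$.#</L>
<L>#####</L>
</Level>
<Level Id="Demo #1" Width="5" Height="3">
<L>######</L>
<L>#@$ .#</L>
<L>######</L>
</Level>
</LevelCollection>
</SokobanLevels>"""

problems = check_collection(ET.fromstring(SAMPLE))
```

For a real file you would call ET.parse on the file path and pass the result of getroot() to check_collection. In the sample, the second level reuses the Id of the first, declares the wrong Width, and is wider than MaxWidth, so problems has three entries.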

I don’t know if the SLC DTD specifies any elements for recording solutions.

Alonso del Arte is a composer and photographer from Detroit, Michigan. He has been working on a Java program to display certain mathematical diagrams.