Random Access Permutations
Years ago, I was building a game. It was a puzzle game on a gridded board: clear the board in as few clicks as possible. The order you clicked mattered. It’s fun and I’ll blog about it someday.
A friend of mine suggested that I randomly generate levels and build a solver to create and evaluate infinite puzzles. As an amateur programmer my initial (and only) idea was to use brute-force to solve the puzzles. This meant clicking on every piece in every possible order, i.e. permutations.
Since there were many permutations on some boards (easily 1080 or more), I wanted to be able to split up the process. Do 100,000 now, and another 200,000 later. This meant I needed to resume where I left off.
The problem is, many permutation algorithms rely on the previous permutation to generate the next one. For example, Heap’s Algorithm uses a swapping method that swaps two elements every time to create the next permutation.
A B C D <-- swap A & B to get B A C D <-- swap B & C to get C A B D <-- swap A & C to get A C B D ... etc
If I want permutation 100,000 I’d have to run through the first 99,999. I needed a way to get a permutation based on an offset (random access) instead of what came before it (sequential access). I worked on this problem on and off since then and I recently cracked it. That’s what this post is about.
How it works
If I’m being honest, this shouldn’t have taken me several years, even working on it occasionally. The solution didn’t end up being particularly complex in theory or implementation. (Although buckle up for my attempt to explain it with words.)
It works by taking advantage of the fact that one valid way to generate permutations keeps consecutive sections of each item in the first item of the permutation. The remaining items mirror the next-smallest permutation, including the consecutive item in the first slot.
For example, a 4-item permutation of
A B C D:
A B C D A B D C A C B D A C D B A D B C A D C B B A C D B A D C B C A D B C D A B D A C B D C A C A B D C A D B C B A D C B D A C D A B C D B A D A B C D A C B D B A C D B C A D C A B D C B A
The number of permutations is the number of items factorial which is
4! = 26 in the above case. You can see that for the first 6 permutations,
A is in the first slot. I’m calling that a section. Each section lasts for the total number of permutations divided by the number of items in the set. There are 4 items and 26 permutations, so there are 4 sections of 6 permutations each.
Recap: 4-item set, 24 permutations total, 24/4 = 6 permutations per section. Each section features one of each item as the first item. First
The algorithm finds out which zero-indexed section you’re in based on the permutation offset you want. The permutation at the 6th offset is
all_permutations. That section number will be the index of the item in the set that goes in the first slot of the permutation. Section 0 = index 0 =
A in this example.
Once you have the first index, reduce the number of items in the permutation by 1 (down from 4 to 3 for example) and find which section you’re in again. Do this until you have no items left.
You’ll end up with a list of indexes the same length as your set. Loop through each index and move the item at that index into a new array. At the end your new array will be your permutation.
View this using Algorithm Visualizer
Depending on what kind of learner you are, learning about an algorithm through prose may not work for you. For those that learn better by visualizing and stepping through code, I’ve put the algorithm into Algorithm Visualizer. Check it out here.
I’ve uploaded Ruby code for this here. The core algorithm is about 8-lines and the whole Ruby class is 30 including whitespace.
I did some benchmarks and in a worst-case scenario (retrieving the last permutation) my algorithm performs so much faster. Here are the results on an 8-item set (40,320 permutations):
user system total real ruby: 12.313426 0.013559 12.326985 ( 12.341237) new: 0.005471 0.000023 0.005494 ( 0.005494)
4 orders of magnitude faster! I also tested it on large sets, like a 100-item sets and it doesn’t break a sweat. In computer science terms, it’s always
O(n) instead of
n is the number of items. For that 8 item set, that’s either a worst-case time of 40,320 operations, or 8.
Suffice it to say, I’m very happy with the algorithm. I don’t know if it will actually ever help me build a game, but this makes me feel like I got the high score.
A helpful commenter on lobste.rs pointed out that this is like Lehmer codes! I knew there must be something else out there in the world that could do this, and it turns out it’s almost exactly the same as my algorithm, just with different names.