Finding Gray codes with backtracking in Python

on

Gray codes are ways of writing numbers using only 0s and 1s. They share the property that as one goes from one number to the next, only one binary digit (bit) changes. For example, if one is represented as “001”, then two might be represented as “011”; only the middle bit changes.

How are Gray codes used?

There is a family of Gray codes which is widely used called reflected binary Gray codes. These are simple to convert to and from standard binary notation. This makes them useful in embedded systems. For example, some rotary encoders use reflected binary Gray codes to indicate position around a circle.

a rotary encoder, which uses Gray codes to encode the angle of rotation. There are concentric rings, broken into wedges. The rings within the wedges are colored black to represent 0 and white to represent 1.

Just how many Gray codes are there?

There are more Gray codes than the reflected binary Gray codes. In fact, there are many more. Since a Gray code can count arbitrarily high, there are an infinite number of different codes. What is more interesting is how many different Gray codes can be created using N bits. Say we have 4 bits, which can represent 16 different numbers. How many Gray codes with 4 bits can create a cycle with 16 elements?

One way answering this question is to use backtracking. Mostly, backtracking is a a way of saying “search the whole set of possible solutions using a depth-first search,„ but if a partial solution in our search cannot possibly be correct, we do not need to continue searching solutions which include that partial solution. For example, if I was searching for solutions to the n-queens problem, and I have a partial solution with only two queens and they are attacking each other, I do not need to continue searching for solutions that contain these two queen positions.

So how do we use backtracking?

According to Wikipedia, we will need to provide 6 different methods, but for this problem we will take a little shortcut. Rather than use a first() and a next() method to generate the search tree of possible solutions, we will define an extensions() generator which will find all possible extensions, which add a single code point to a partial Gray code.

root(#bits)

Create the starting partial solution for some number of bits. To keep things easy, this will output a partial solution with the first codepoint as all zeros.

def makeroot(bits):
    return [tuple([False for i in range(bits)])]

reject(partial solution)

Returns true if the partial solution is not worth completing. For example, if more than one bit changes between consecutive code points.

def hammingdistance(a, b):
    distance = 0
    for i, j in zip(a, b):
        if i != j:
            distance += 1
    return distance
def reject(code):
    # No repeat codes
    if len(set(code)) != len(code):
        return True
    # Each consecutive code point must be one
    # bit Hamming distance apart
    for c1, c2 in consecutives(code):
        if hammingdistance(c1, c2) != 1:
            return True
    return False

accept(possibly full solution)

Accept returns true only when the given code is a full Gray code cycle.

def accept(codelen, code):
    # Are we finished?
    if len(code) != codelen:
        return False
    return hammingdistance(code[0], code[-1]) == 1

extensions(partial solution)

Returns an iterator over all possible partial Gray codes which add a code point to the provided partial code.

def extensions(code):
    for i in range(len(code[-1])):
        nextcodepoint = list(code[-1])
        nextcodepoint[i] = not nextcodepoint[i]
        yield code + [tuple(nextcodepoint)]

output(solution)

For now, output the full code. Later, we will change this function to count the codes instead.
def output(code):
    print "code:"
    for codepoint in code:
        for codeelement in codepoint:
            codechar = "1" if codeelement else "0"
            print codechar,
        print

backtrack(partial solution)

Backtrack does the actual search. See that it looks very similar to depth-first search in the way it uses recursion.
def backtrack(codelen, code):
    if reject(code):
        return
    if accept(codelen, code):
        output(code)
        return
    for extension in extensions(code):
        backtrack(codelen, extension)

Let's try backtracking out.

What are the possible Gray codes when we have 2 bits and want the full length-4 cycles?

backtrack(4, makeroot(2))
code:
0 0
1 0
1 1
0 1
code:
0 0
0 1
1 1
1 0

Making it count.

The number of Gray codes is going to blow up very quickly, so let's change the output to count instead of print all the results. Since I used the IPython notebook rather than writing a script or Python library, I was a bit sloppy and put the count in a global variable. This mean we will need to reset the count between calls.

number_of_gray_codes = 0
def output(code):
    global number_of_gray_codes
    number_of_gray_codes += 1
def resetoutput():
    global number_of_gray_codes
    print number_of_gray_codes
    number_of_gray_codes = 0

Now, we can run backtrack to see how many different Gray codes there are.

In [12]:
backtrack(2**2, makeroot(2))
resetoutput()
2

Well, we knew that there were two Gray codes for 2 bits already. What if we had a few more bits to play with, but still wanted to keep the cycle length at 4?

In [13]:
backtrack(4, makeroot(3))
resetoutput()
6

Now, what if we get the maximum lenght cycle for this number of bits?

In [14]:
backtrack(2**3, makeroot(3))
resetoutput()
12

In [15]:
backtrack(2**4, makeroot(4))
resetoutput()
2688

That grew fast! I also tried running this for 5 bits, but I “Ctrl-C”ed to cancel the computation because I got impatient waiting for an answer. It turns out, that for 5 bits the answer is 1,813,091,520 which is a lot more than 2,688. No wonder it was taking so long!

What's the point of all this?

If we could find the answer on OEIS, what was the point of calculating it using backtracking? Firstly, backtracking is a very useful technique to know for searching solution spaces. Secondly, we were able to find Gray codes for non-power-of-two length cycles. We discovered that there are 6 gray codes of length 4 using 3 bits for the code points.

Lastly, now that we have this framework, we can adjust our accept and reject functions to find other, more-specific kinds of Gray codes. For example, it would be interesting to see some Gray codes which can be flood-filled and to learn how many such codes there are.

A Gray code of length 6 using 3 bits which can be flood filled in the 0s. {000, 001, 011, 010, 110, 100}