Sunday, November 23, 2008

Networking, routing, and bit masks

Last week I got funny little assignment through a colleague of the Networking department. They’ve got a huge set of routes set in a major internet exchange machine in Amsterdam, and wanted to have a program to optimize the routes. Optimizing the routes is not only important for performance reasons, but also because of money—some connections are more expensive than others.

This assignment is interesting for a number of reasons:

I feel like I haven’t coded anything useful for ages
I will not be writing the code, but designing it together with a co-worker, who will do the actual implementation
Someone will actually be using this
This will save the boss money (we could’ve been rich …!)
The internet might benefit from this, implicitly pleasing many people (the good folks at home)
It involves some good old-fashioned bit-mucking about (and I love it!)

So, what does an internet route look like? Well, you’ve got the network address in an IPv4 decimal dotted quad notation, and the destination to route to also in IPv4 decimal dotted quad notation. So:

Network address     Destination
10.0.1.0/24         172.16.3.91
10.0.2.0/22         172.16.4.1
10.0.0.0/16         192.168.2.2

Let’s call these three routes A, B, and C. Now, let’s pretend that using route A is more expensive than using route C. Anyone can see in this (pretty bad) example, that route C is such a large network, that it encompasses both A and B—the network address ranges of A and B both fit in C. Therefore, route A and B can be eliminated and we can cheaply use route C.

To be able to do this programmatically, one must realize that IPv4 addresses are 4 bytes. The decimal dotted quad 10.0.1.0 translates to 4 bytes (hexidecimal notation): 0A 00 01 00, or the 32 bit word 0A000100.
I like writing this in hexadecimal because the network range ‘/24’ translates to a bit mask; ‘/24’ means the first 24 bits of the 32 bit word are 1s, giving a bit mask of FFFFFF00. Likewise, a /22 gives FFFFFC00 and /16 is FFFF0000.

Why is this important? Well, it shows how the internals of the internet work, and why it is so fast. For example, a source IP address of 10.0.1.18 is bitwise AND masked with its network address, and it will match route A in the routing table. The packet is consequently sent to the configured destination (in this case, 172.16.3.91).

Finding the cheapest route now only breaks down to this:

Find a route that has a smaller bitmask, but still matches the address
The route has to be cheaper in terms of cost, we don’t like spending more money

We can see that route C fits these criteria for A and B:

10.0.1.0/24 => 0A000100 AND FFFFFF00
10.0.2.0/22 => 0A000200 AND FFFFFF00
10.0.0.0/16 => 0A000000 AND FFFF0000

0A000100 AND FFFF0000 == 0A000000, so route A fits in C

If we would have a route D like 10.10.0.0/16, it would not fit in C:

10.10.0.0/16 => 0A0A0000
0A0A0000 != 0A000000, so route D does not fit in C.

As you can see, routing is just simply bitmasking and using a lookup table (the “routing table”). If there is one thing that hardware can do fast, it’s bitwise AND operations.

I came up with a funny statement that creates the bitmask from the slash-notation:

((1 SHIFTLEFT (32 - numbits)) - 1) XOR FFFFFFFF

This is overly complicated. It is possible to look at it in another way; e.g. with a /16 all the left-hand side bits are set, and all the right-hand side bits are clear. So it is possible to start with all bits set, and shift the clear bits in from right to left:

FFFFFFFF SHIFTLEFT (32 - numbits)

Be mindful that this works only for 32 bit registers, otherwise the mask will overflow. You can prevent this be explicitly selecting the least 32 bits:

(FFFFFFFF SHIFTLEFT (32 - numbits)) AND FFFFFFFF

We can do even better than this. There is a cool operation named arithmetic shift right which is ideal for the job, but it is not present in all programming languages. The arithmetic shift right copies the highest bit into the next bit, which is exactly what we want:

80000000 ARITHMETICSHIFTRIGHT (numbits - 1)

Just for the fun of it, I thought about implementing this in assembly language. There are many implementations possible, depending on what CPU you are programming. Just for the kick of it, I picked a 32 bits ARM processor. You have to be a bit clever with the bit operations, especially on this system.

@
@ first solution, using (-1 SHL (32 - numbits))
@
LDR R0, numbits
MOV R1, #32
SUB R1, R1, R0        @ R1 is (32 - numbits)
MVN R0, #0            @ R0 is FFFFFFFF
MOV R0, R0, LSL R1    @ R0 becomes the bitmask

@
@ second solution, using an arithmetic shift right
@
MOV R0, #1
MOV R0, R0, LSL #31   @ R0 has only highest bit set
LDR R1, numbits
SUB R1, R1, #1        @ R1 is (numbits - 1)
MOV R0, R0, ASR R1    @ R0 becomes the bitmask

Note that the routing theory holds true for IPv6 as well, except that for IPv6, you have to work with 64 bits addresses rather than with 32 bits addresses.