
Treebeard's Stumper Answer
8 December 2000

Obfuscated URLs

We're all used to seeing Internet addresses or URLs, but there's more than one way to find a Web site! Try these links, exactly as shown: All of these obfuscated URLs will take you to the same Dunn School Web site! Spammers and scammers use these weird addresses to hide their tracks, but why do they work? (Hint: this is a math problem!)

The above URLs all work for me using Internet Explorer 5.0 with Win98. The HTML links are coded exactly as shown. Moving the cursor over the links in IE5 gives another hint!


A Web address is really an alias for a unique number that your computer looks up in a special network database called a Domain Name Server. 207.154.84.115 = (207 x 256^3) + (154 x 256^2) + (84 x 256) + 115, or 3,482,997,875. That's the real Internet address for Dunn School, and it works! You can add any multiple of 256^4 and get the same address, since most computers just ignore larger powers. "%77" is the ASCII computer code for "w", so "%77%77%77" is just as good as "www". These obfuscated codes all end up as the same address-number for your computer!
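The arithmetic in this answer can be checked with a few lines of Python. This is only a sketch: the helper name dotted_to_int is made up for illustration, and the "ignore larger powers" step is modeled as a simple remainder after dividing by 256^4.

```python
from urllib.parse import unquote

def dotted_to_int(address: str) -> int:
    """Treat a dotted-decimal address as a four-digit number in base 256."""
    total = 0
    for part in address.split("."):
        total = total * 256 + int(part)
    return total

dunn = dotted_to_int("207.154.84.115")
print(dunn)                      # 3482997875

# Adding a multiple of 256^4 changes nothing once the larger powers are dropped.
padded = dunn + 5 * 256**4
print(padded % 256**4 == dunn)   # True

# "%77" is hex 77 = decimal 119 = ASCII "w".
print(unquote("%77%77%77"))      # www
```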

Notes:

Computers do their work with pulses of electricity that have an exact voltage that changes over brief periods of time. That's analog. We usually simplify that to a digital high or low voltage, and we don't care about the exact voltage level as long as it's above or below a certain threshold. That's already an abstraction. We can think of these pulses as strings of ones and zeros that represent numbers in binary or number base 2. That's another abstraction. Sometimes it's important to remember that computers really work with physical voltage levels that are not numbers at all, but it's usually OK to think that computers crunch numbers.

Numbers can represent many different things, like a text character, or the red/green/blue components of a colored dot on your screen, or the voltage level of a Grateful Dead sound sample at a particular microsecond-moment, or the modem sounds on your phone line, or even an Internet address. That's a higher level of abstraction.

Numbers can represent many things, and they can also be represented in many different ways. MCMLXVIII (Roman numerals), 11110110000 (base 2), 7B0 (base 16), and 0.7.176 (dotted-decimal base 256) are all perfectly good ways of writing 1968, the year that Stanley Kubrick's (now) timely movie 2001: A Space Odyssey was first released.
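A short Python sketch shows the same number coming out as different digit strings. The function name to_base is an illustrative choice, not something from the original programs, and Roman numerals are left out to keep it short.

```python
def to_base(n: int, base: int) -> list[int]:
    """Return the digits of n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]

print(to_base(1968, 2))     # [1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
print(to_base(1968, 16))    # [7, 11, 0], written 7B0 in hex
print(to_base(1968, 256))   # [7, 176], written 7.176 in dotted decimal
```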

The blue square on the top-right represents 1,968 as a particular color with 0 units of red, 7 units of green, and 176 units of blue. There are 256 levels of each primary color possible on most computer displays, so this translates to a unique number for that color: 0 x 256^2 + 7 x 256^1 + 176 x 256^0 = 1,968. We can also write 1,968 in Internet dotted-decimal fashion as "0.7.176", where it's implied that 256 is the place value. If we had better RGB color sense, we could do arithmetic with colors! Check out the addition on the right for an example. (Note that I carefully avoided any carry. What would it mean? What about multiplication and division??) We can write numbers any way we like, as long as we understand the rules for what we're doing. I admit I'm a Platonist. I think of numbers as ideal forms that are "out there" somewhere in the landscape of the mind. We don't invent math, we discover it. Don't confuse the number with how you find the number!
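Color arithmetic of this kind can be tried out in Python. In this sketch the function names and the second color (0, 1, 32) are arbitrary choices for illustration; they are not taken from the picture on the original page.

```python
def rgb_to_int(red: int, green: int, blue: int) -> int:
    """Pack three 0-255 color levels into one number, base-256 style."""
    return red * 256**2 + green * 256 + blue

def int_to_rgb(n: int) -> tuple[int, int, int]:
    """Split a number back into its three base-256 'digits'."""
    return (n // 256**2) % 256, (n // 256) % 256, n % 256

blue_square = rgb_to_int(0, 7, 176)
print(blue_square)           # 1968
print(int_to_rgb(1968))      # (0, 7, 176)

# Adding two colors channel by channel agrees with ordinary addition of
# their numbers, as long as no channel sum passes 255 (no carry).
other = (0, 1, 32)           # arbitrary illustrative color
channel_sum = tuple(a + b for a, b in zip(int_to_rgb(blue_square), other))
print(channel_sum)           # (0, 8, 208)
print(rgb_to_int(*channel_sum) == blue_square + rgb_to_int(*other))   # True
```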

Computers work in binary base 2, but long strings of zeros and ones are hard to remember. Phone numbers are hard to remember too, so we break them up with dashes into smaller chunks that have a rhythm. Number bases that are powers of two make it easy to work with binary, so number bases 8 (octal), 16 (hex), and 256 are usually used by programmers, and are represented in standard ways. In hexadecimal (base 16), digit place values represent powers of 16 ( = 2^4). With 16 digits to represent, we can use the letters A-F as extra digits: A=10, B=11, C=12, D=13, E=14, and F=15. Take any binary number and divide it right-to-left in blocks of four digits, and then translate block-by-block into hex. For example 10110101 (base 2) = 1011 0101 = B5 (base 16). This gets easy with practice. Divide the binary number into 3s for Octal (base 8), and into 8s for base 256. You can also divide a hex number into 2s to convert it to base 256. There aren't enough standard characters for base 256, so the dotted-decimal form is used instead.
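The block-of-four method can be written out as a small Python sketch. The name binary_to_hex is hypothetical; the only trick is padding with zeros on the left so the blocks line up from the right.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hex by translating blocks of four bits."""
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of four.
    width = -(-len(bits) // 4) * 4
    padded = bits.zfill(width)
    blocks = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join("0123456789ABCDEF"[int(block, 2)] for block in blocks)

print(binary_to_hex("1011 0101"))      # B5
print(binary_to_hex("11110110000"))    # 7B0
```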

The Windows calculator (in scientific mode) is a useful tool for converting between common number bases, as Graybear illustrates:

The easiest way to convert from base ten to base 256 is to first convert to base 16 (hexadecimal) on the computer's calculator, then combine the digits into pairs and convert each pair to base ten.

3482997875 (base 10) = CF9A5473 (base 16)
CF (base 16) = 207 (base 10)
9A (base 16) = 154 (base 10)
54 (base 16) = 84 (base 10)
73 (base 16) = 115 (base 10)
3482997875 (base 10) = 207.154.84.115 (base 256)

I bet you could write a short program that would make it even easier.
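One such program might look like the Python sketch below, which follows Graybear's hex-pairs method step by step. The function name int_to_dotted is made up for illustration, and the masking step assumes that anything above 32 bits is simply thrown away.

```python
def int_to_dotted(n: int) -> str:
    """Convert a number to dotted decimal by way of hex digit pairs."""
    n &= 0xFFFFFFFF                      # keep only the lowest 32 bits
    hex_digits = format(n, "08X")        # e.g. "CF9A5473"
    pairs = [hex_digits[i:i + 2] for i in range(0, 8, 2)]
    return ".".join(str(int(pair, 16)) for pair in pairs)

print(format(3482997875, "08X"))    # CF9A5473
print(int_to_dotted(3482997875))    # 207.154.84.115
```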

My BIGNUM and BNC Basic programs can do all sorts of number base conversions. They are available with source code from Treebeard's BASIC Vault. There are also online calculators.

Here are the standard ways of writing numbers in these bases. These rules can be ambiguous, so don't trust to chance!

Number Base   Binary                      Standard    The rule
10  Decimal   -                           1968        Start with a non-zero digit {1 .. 9}.
2   Binary    11110110000                 - (no way)
8   Octal     011 110 110 000             03660       Start with leading "0" before every digit {0-7}.
               3   6   6   0              03.06.060   dotted-octal?
16  Hex       0111 1011 0000              0x7B0       Start with "0x", {0-9, A-F} are digits.
               7    B    0
256 base 256  00000111 10110000           7.176       Use "." (a dot) to separate decimal numbers in the range {0 - 255}
                 7        176  (dec)      0x07B0
                 07       B0   (hex)
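These prefix rules can be sketched in Python as follows. This assumes the C-style reading of the rules (leading "0x" means hex, any other leading "0" means octal, everything else is decimal); whether a given browser applies exactly these rules is an assumption, and the function names are hypothetical.

```python
def parse_number(text: str) -> int:
    """Read one number using the prefix rules: 0x = hex, leading 0 = octal."""
    text = text.strip()
    if text.lower().startswith("0x"):
        return int(text[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)

def parse_dotted(text: str) -> int:
    """Combine dot-separated parts as digits in base 256."""
    total = 0
    for part in text.split("."):
        total = total * 256 + parse_number(part)
    return total

for written in ["1968", "03660", "0x7B0", "0x07B0"]:
    print(written, "=", parse_number(written))    # each line ends in 1968

print(parse_dotted("7.176"))                      # 1968
```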

Now we can examine each of the obfuscated URLs, and find a few more!

These are some truly obfuscated URLs! I expect things to make sense in nature, but with technology, I'm usually happy to know what works and leave it at that. A plain numeric address like http://207.154.84.115/ might just mean that the site is brand-new and hasn't yet been registered. But spammers and scammers use these techniques to hide their tracks when they send junk email. Is this also a way to get around firewalls and Net-Nanny type censorship? (Let me know!)

Here's my best shot at a truly obfuscated URL for dunnschool.com that uses all the tricks. See if you can figure out why this works!
http://Obfuscate!%64@%32%30%37%2e000%3232%2E84%2e%31%315/
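For anyone who wants to check their answer, here is a hedged Python sketch that peels the layers off this URL. It assumes three tricks are in play: everything before the "@" is a throwaway login name, "%xx" codes stand for ASCII characters, and a part with a leading zero is read as octal. The function name deobfuscate_host is made up for illustration.

```python
from urllib.parse import unquote

def deobfuscate_host(url: str) -> str:
    """Recover the plain dotted-decimal host from an obfuscated numeric URL."""
    rest = url.split("://", 1)[1]          # drop "http://"
    authority = rest.split("/", 1)[0]      # drop the path
    host = authority.rpartition("@")[2]    # drop any "login@" prefix
    host = unquote(host)                   # turn %xx codes into characters
    parts = []
    for part in host.split("."):
        if part.lower().startswith("0x"):
            value = int(part, 16)          # hex part
        elif len(part) > 1 and part.startswith("0"):
            value = int(part, 8)           # octal part
        else:
            value = int(part)              # plain decimal part
        parts.append(str(value))
    return ".".join(parts)

url = "http://Obfuscate!%64@%32%30%37%2e000%3232%2E84%2e%31%315/"
print(unquote(url.split("@")[1]))    # 207.000232.84.115/
print(deobfuscate_host(url))         # 207.154.84.115
```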

A stumper remains. I have my own Web server at treebeard.org, aka 204.48.153.235, which is hosted as a virtual server on Ray Ford's fine Santa Barbara Outdoors site. When I try the same obfuscated URLs on my address, I sometimes get my page and sometimes Ray's! Is there any rhyme or reason to this??

http://www.treebeard.org/   ->   treebeard.org
http://204.48.153.235/   ->   SB Outdoors
http://3425737195/   ->   SB Outdoors
http://%77%77%77%2E%74%72%65%65%62%65%61%72%64%2E%6F%72%67/   ->   treebeard.org

Update (4 March 2001):
cLive hoLLoway emailed this explanation:

Your server (like most) runs virtual servers, i.e. it maps more than one domain to the same IP address. Typing the IP address alone gives you the *default* server. Your "actual" URL will be http://204.48.153.235/~yourlogin When the server receives a request for a URL, it converts the URL to the correct internal mapping. If your domain were the only one on the IP address, then your experiments would work.

Thanks Clive, this makes sense. But when I try various combinations of my login name and my treebeard.org domain name, I still can't get to my page. I either get a "File not Found" or a "You don't have permission" error. This must have something to do with the server mapping just as you say.
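The idea of a default server can be illustrated with a toy Python lookup. This is only a sketch under the assumption that the host uses name-based virtual hosting, where the browser sends the host name it was given and the server picks a site from it. The table contents and the names SITES, DEFAULT_SITE, and choose_site are hypothetical, not the real server configuration.

```python
# Hypothetical mapping from requested host names to hosted sites.
SITES = {
    "www.treebeard.org": "treebeard.org",
    "treebeard.org": "treebeard.org",
}
DEFAULT_SITE = "SB Outdoors"    # served when the host name is not recognized

def choose_site(requested_host: str) -> str:
    """Pick a site from the host name the browser asked for."""
    name = requested_host.lower().split(":")[0]    # ignore any port number
    return SITES.get(name, DEFAULT_SITE)

print(choose_site("www.treebeard.org"))    # treebeard.org
print(choose_site("204.48.153.235"))       # SB Outdoors
print(choose_site("3425737195"))           # SB Outdoors
```

Under this assumption, a percent-encoded name still reaches the right page because it decodes back to www.treebeard.org before the lookup, while every purely numeric form carries no name at all and falls through to the default.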



Copyright © 2000 by Marc Kummel / mkummel@rain.org