An Exercise in De-Obfuscation

Mon, Jan 4, 2010 6-minute read

I don’t like to post line noise very often, but I did make an exception in the Holiday Wishes post a few weeks ago. I hate quoting entire blog posts, but 177 characters isn’t too bad:

'(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString($_, 2).Replace("0"," ")})-split''(.{12}|Mh)''|?{$_}'|%{iex $_;-join[char[]]$_[[char[]]"nOBB7[4oBCaenRa"]}

When it runs, you get this for output:

PS >'(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString
($_, 2).Replace("0"," ")})-split''(.{12}|Mh)''|?{$_}'|%{iex $_;-join[char[]
]$_[[char[]]"nOBB7[4oBCaenRa"]}
     1
    111
   11111
  1111111
    111
   11111
 111111111
11111111111
    111
    111
Merrv ChristMas 

While it may seem like magic, it in fact was not – just some terribly obtuse PowerShell scripting. Let’s take a look at what makes it tick. Rather than deconstruct it, we’ll build it up from scratch.

In a text editor, I drew the tree you see in the main output. In thinking about how to compress it into a shorter script, the key observation is that the tree is really just a set of ON (“1”) cells and OFF (“ ”) cells. Binary describes ON and OFF really well, so perhaps we can store the tree’s pattern of ON and OFF in a binary number?

Writing the OFF cells as zeroes gives us more conventional binary:

000001000000  
000011100000  
000111110000  
001111111000  
000011100000  
000111110000  
011111111100  
111111111110  
000011100000  
000011100000
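That conversion is mechanical enough to script. A minimal sketch, assuming the drawing is held in a hypothetical $tree array (the post does not say how the rows were actually converted): pad each row to 12 characters, then turn the spaces into zeroes.

```powershell
# The tree as drawn, one string per row (hypothetical variable name).
$tree = @(
    '     1'
    '    111'
    '   11111'
    '  1111111'
    '    111'
    '   11111'
    ' 111111111'
    '11111111111'
    '    111'
    '    111'
)

# Pad every row to the full 12-cell width, then write OFF cells as "0".
$rows = $tree | ForEach-Object { $_.PadRight(12).Replace(' ', '0') }
$rows
```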

Now, NUMBERS do a great job of encoding binary patterns. For example, “1234” is 10011010010 in binary. What if we just concatenate all those ones and zeros together to find the number it represents?

PS >$result = [Convert]::ToInt64("00000100000000001110000000011111000000111
111100000001110000000011111000001111111110011111111111000001110000000001110
0000", 2)

Exception calling "ToInt64" with "2" argument(s): "Value was either too la
rge or too small for a UInt64."
At line:1 char:29
+ $result = [Convert]::ToInt64 <<<< ("000001000000000011100000000111110000
00111111100000001110000000011111000001111111110011111111111000001110000000
0011100000", 2)
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException 

Unfortunately, there are too many bits (120) to fit in a single 64-bit number. While there are some “big number” libraries about, this should stand on its own. Instead, let’s break it up into two smaller 60-bit chunks, and convert those:

PS >[Convert]::ToInt64("000001000000000011100000000111110000001111111000000
011100000", 2)
18029799997931744
PS >[Convert]::ToInt64("000111110000011111111100111111111110000011100000000
011100000", 2)
139752119745773792 
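The same split can be scripted instead of pasting the two halves by hand. A small sketch, assuming the ten rows from above are held in a hypothetical $rows array:

```powershell
# The ten 12-bit rows of the tree, top to bottom.
$rows = '000001000000', '000011100000', '000111110000', '001111111000',
        '000011100000', '000111110000', '011111111100', '111111111110',
        '000011100000', '000011100000'

# 10 rows x 12 bits = 120 bits, too many for a single Int64.
$bits = -join $rows

# Convert each 60-bit half on its own, starting at offsets 0 and 60.
$numbers = 0, 60 | ForEach-Object { [Convert]::ToInt64($bits.Substring($_, 60), 2) }
$numbers
```

This should print the same two numbers as the manual conversion.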

Perfect. There are the magic numbers you see at the beginning of the script. We should be able to recreate our greeting by converting them back to binary, and outputting them:

PS >18029799997931744,139752119745773792 | Foreach-Object {
>>     [Convert]::ToString($_, 2)
>> }
>>
1000000000011100000000111110000001111111000000011100000
111110000011111111100111111111110000011100000000011100000 

Hmm. These don’t match up any longer. Since leading zeroes don’t change a number’s value, the .NET Framework doesn’t emit any. After all, how many would it put? We can get around this problem by right-aligning each string in a 60-character field:

PS >18029799997931744,139752119745773792 | Foreach-Object {
>>     "{0,60}" -f [Convert]::ToString($_, 2)
>> }
>>
     1000000000011100000000111110000001111111000000011100000
   111110000011111111100111111111110000011100000000011100000 
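To confirm that padding really restores the original bit pattern, the round trip can be checked directly. This sketch pads with zeroes through String.PadLeft rather than the {0,60} format string (which pads with spaces), purely so the result can be compared with the input; $original is the first 60-bit chunk from earlier.

```powershell
# The first 60-bit chunk, leading zeroes included.
$original = '000001000000000011100000000111110000001111111000000011100000'

# Number -> binary text drops the leading zeroes...
$number   = [Convert]::ToInt64($original, 2)
$shortened = [Convert]::ToString($number, 2)

# ...and left-padding to 60 characters puts them back.
$restored = $shortened.PadLeft(60, '0')
$restored -eq $original
```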

That’s got the correct width again, but those zeroes are going to get in the way. Let’s convert them back to spaces:

PS >18029799997931744,139752119745773792 | Foreach-Object {
>>     "{0,60}" -f [Convert]::ToString($_, 2).Replace("0"," ")
>> }
>>
     1          111        11111      1111111       111
   11111     111111111  11111111111     111         111 

That’s getting pretty close. In fact, many trees look exactly like this when they come from a box. However, it’s not stacked properly – our original tree rows were 12 wide, and now they are 60. Busting out a gnarly word-wrapping regular expression I blogged about a while ago, let’s apply it:

PS >(
>>     18029799997931744,139752119745773792 | Foreach-Object {
>>         "{0,60}" -f [Convert]::ToString($_, 2).Replace("0"," ")
>>     }
>> ) -split '(.{12})' | Where-Object { $_ }
>>
     1
    111
   11111
  1111111
    111
   11111
 111111111
11111111111
    111
    111 

Hey, that rings a bell!
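The Where-Object { $_ } filter matters here. Because the pattern is wrapped in a capturing group, -split hands back the matched text, but it also hands back the empty text around the matches. A short sketch on a made-up eight-character string:

```powershell
# A made-up sample; a width of 4 stands in for the real width of 12.
$parts = 'abcdefgh' -split '(.{4})'

# The raw result holds the captured chunks plus empty strings around them.
$parts.Count

# Empty strings are falsy in PowerShell, so this keeps only the chunks.
$chunks = $parts | Where-Object { $_ }
$chunks
```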

The next step is to add the “Merry Christmas” greeting at the end. Rather than write it directly or invent some cool scheme to encode the characters, what if we picked characters from the script itself, so far? If we have access to the script as a string, we can use PowerShell’s array slicing to pick whatever letters we want out. For example:

PS >"ABCDEF"[3,4,0,3,1,4,4,5]
D
E
A
D
B
E
E
F 

We don’t actually have access to the script as a string, so instead – let’s pack it into a string first, and then run it with Invoke-Expression. We’ll drop a bunch of spaces and use a bunch of aliases to compress things further:

'(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString($_, 2).Replace("0"," ")})-split''(.{12})''|?{$_}'|iex

Now, how can we invoke that string, but also have it available for the letter picking process we still want to do? We could store it in a variable (and reuse that variable), but it might be easier to use the Foreach-Object cmdlet to loop over the (single) string, and then reuse the implicit $_ variable.

PS >'"Hello"' | Foreach-Object { Invoke-Expression $_; $_[3,2,2] }
Hello
l
e
e 

PowerShell naturally emits those last objects one-by-one, so let’s join them into a string:

PS >'"Hello"' | Foreach-Object { Invoke-Expression $_; -join $_[3,2,2] }
Hello
lee 

That looks like the approach we need to pick the letters out of our original script. I generated a table and reviewed its output:

PS >$script = '(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]
::ToString($_, 2).Replace("0"," ")})-split''(.{12})''|?{$_}'
PS >$x = 0; $script.ToCharArray() | % { "$x : $_"; $x++ }
0 : (
1 : 1
2 : 8
3 : 0
4 : 2
5 : 9
6 : 7
51 : [
52 : C
53 : o
54 : n
55 : v
56 : e
57 : r
58 : t
59 : ]
116 : }
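Reading a table like this by eye works, but the search can also be sketched in code. This is not how the post does it; it assumes the greeting as it finally prints (“Merrv ChristMas”, held in a hypothetical $greeting variable) and relies on String.IndexOf returning -1 for a character that does not occur. The comparison is case-sensitive.

```powershell
$script = '(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString($_, 2).Replace("0"," ")})-split''(.{12})''|?{$_}'
$greeting = 'Merrv ChristMas'

# First position of each needed character in the script, or -1 if absent.
$greeting.ToCharArray() | ForEach-Object { "$_ : $($script.IndexOf($_))" }

# Characters the script cannot supply yet.
$missing = $greeting.ToCharArray() |
    Where-Object { $script.IndexOf($_) -lt 0 } |
    Select-Object -Unique
-join $missing
```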

Unfortunately, the script is missing a few characters: “M” and “h”. It’s also missing a “y”, but using a “v” is a common trick due to its visual similarity. Where can we pack those letters in, though?

The regular expression looks like the perfect place. It splits on sequences of 12 characters, so we could also ask it to split on the non-existent string “Mh”. It would have no impact, but would give us the characters we need. Some readers thought that the “v” instead of a “y” was a typo, so Robert (in the comments) fixed the script by adding the “y” to this throwaway piece of the regular expression.

'(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString($_, 2).Replace("0"," ")})-split''(.{12}|Mhy)''|?{$_}'

After adding the extra text to the script, I reviewed the table again and ended up with a big list of index numbers:

… -join $_[110,79,66,66,55,91,52,111,66,67,97,101,110,82,97]

That’s unbecoming of an obfuscated script. Luckily, those numbers are all in the printable ASCII range, so we can convert them to a string:

PS >[char[]] (110,79,66,66,55,91,52,111,66,67,97,101,110,82,97)
n
O
B
B
7
[
4
o
B
C
a
e
n
R
a
PS >[int[]] [char[]] "nOBB7[4oBCaenRa"
110
79
66
66
55
91
52
111
66
67
97
101
110
82
97

PowerShell’s array slicing does automatic integer casting if needed, so we are fine just converting the string to a sequence of characters:

… $_[[char[]]"nOBB7[4oBCaenRa"]}
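As a sanity check, both forms of the index list can be compared side by side. A small sketch, assuming the “Mh” version of the script string is stored in a hypothetical $script variable rather than arriving through $_:

```powershell
$script = '(18029799997931744,139752119745773792|%{"{0,60}"-f [Convert]::ToString($_, 2).Replace("0"," ")})-split''(.{12}|Mh)''|?{$_}'

# Index by number...
$byNumber = -join $script[110,79,66,66,55,91,52,111,66,67,97,101,110,82,97]

# ...and by character; each [char] is converted to its integer code.
$byChar = -join $script[[char[]]"nOBB7[4oBCaenRa"]

$byNumber
$byChar
```

Both lines should read the same as the greeting in the original output.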

And there you have it: 1200 words to describe 177 characters.