Heaven's Gate
64-bit code in 32-bit file
roy g biv / defjam
-= defjam =-
since 1992
bringing you the viruses of tomorrow
today!
Former DOS/Win16 virus writer, author of several virus families, including
Ginger (see Coderz #1 zine for terrible buggy example, contact me for better
sources ;), and Virus Bulletin 9/95 for a description of what they called
Rainbow. Co-author of world's first virus using circular partition trick
(Orsam, coded with Prototype in 1993). Designer of world's first XMS swapping
virus (John Galt, coded by RT Fishel in 1995, only 30 bytes stub, the rest is
swapped out). Author of world's first virus using Thread Local Storage for
replication (Shrug, see Virus Bulletin 6/02 for a description, but they call
it Chiton), world's first virus using Visual Basic 5/6 language extensions for
replication (OU812), world's first Native executable virus (Chthon), world's
first virus using process co-operation to prevent termination (Gemini, see
Virus Bulletin 9/02 for a description), world's first virus using polymorphic
SMTP headers (JunkMail, see Virus Bulletin 11/02 for a description), world's
first viruses that can convert any data files to infectable objects (Pretext),
world's first 32/64-bit parasitic EPO .NET virus (Croissant, see Virus
Bulletin 11/04 for a description, but they call it Impanate), world's first
virus using self-executing HTML (JunkHTMaiL, see Virus Bulletin 7/03 for a
description), world's first virus for Win64 on Intel Itanium (Shrug, see Virus
Bulletin 6/04 for a description, but they call it Rugrat), world's first virus
for Win64 on AMD AMD64 (Shrug), world's first cross-infecting virus for Intel
IA32 and AMD AMD64 (Shrug), world's first viruses that infect Office
applications and script files using the same code (Macaroni, see Virus
Bulletin 11/05 for a description, but they call it Macar), world's first
viruses that can infect both VBS and JScript using the same code (ACDC, see
Virus Bulletin 11/05 for a description, but they call it Cada), world's first
virus that can infect CHM files (Charm, see Virus Bulletin 10/06 for a
description, but they call it Chamb), world's first IDA plugin virus (Hidan,
see Virus Bulletin 3/07 for a description), world's first viruses that use the
Microsoft Script Encoder to dynamically encrypt the virus body (Screed),
world's first virus for StarOffice and OpenOffice (Starbucks), world's first
IDC virus (ID10TiC), world's first polymorphic virus for Win64 on AMD
AMD64 (Boundary, see Virus Bulletin 12/06 for a description, but they call it
Bounds), world's first virus that can infect Intel-format and PowerPC-format
Mach-O files (MachoMan, see Virus Bulletin 01/07 for a description, but
they call it Macarena), world's first virus that uses Unicode escapes to
dynamically encrypt the virus body, world's first self-executing PIF (Spiffy),
world's first self-executing LNK (WeakLNK), world's first virus that uses
virtual code (Relock), world's first virus to use FSAVE for instruction
reordering (Mimix), world's first virus for ODbgScript (Volly), world's first
Hiew plugin virus (Hiewg), world's first virus that uses fake BOMs
(Bombastic), and world's first virus that uses JScript prototypes to run
itself (Protato). Author of various retrovirus articles (eg see Vlad #7 for
the strings that make your code invisible to TBScan). This is my fifth virus
for Win64. It is the world's first virus that uses Heaven's Gate for
replication.
I found this technique in 2009, and I updated it in 2011.
What is it?
On a 64-bit platform, there is only one ntoskrnl.exe, and it is 64-bit code. It
also uses a different calling convention (registers, the so-called "fastcall")
compared to 32-bit code (stack, the so-called "stdcall", where the callee cleans
the stack as in the old "pascal" convention). So how can 32-bit code run on a
64-bit platform? There is a "thunking" layer in wow64cpu.dll, which saves the
32-bit state, converts the parameters to 64-bit form, then runs
"Wow64SystemServiceEx" in wow64.dll. But 64-bit registers are visible only in
64-bit mode, so how does wow64cpu.dll work? Here is what I call Heaven's Gate,
but first we must go back to ntdll.dll.
Thunking Layer
When an important function is called from a DLL like kernel32.dll, it calls
into the native interface in ntdll.dll. The native interface is a powerful but
mostly undocumented layer between user-mode and kernel-mode. For some detail,
see my Chthon code in 29A#6. It used to be that to call into kernel mode, the
code would do this:
mov eax, service
lea edx, dword ptr [esp + 4]
int 2eh
In Windows XP, it became possible to use sysenter instead of int 2eh, for
better performance. In 64-bit Windows, a "xor ecx, ecx" was added because of
64-bit pointer size, and the int 2eh was replaced by:
call dword ptr fs:[0c0h]
and now we are one step closer to Heaven's Gate. The field at fs:[0c0h] is
called WOW32Reserved, and holds an address in wow64cpu.dll. If we follow the
call, we reach a jump. A far jump. A special far jump. Heaven's Gate.
Heaven's Gate
The jump in wow64cpu.dll is a 64-bit gate. We can jump through it into the
world of 64-bit code: 64-bit address space, 64-bit registers, 64-bit calls.
We might think that jumping into wow64cpu.dll is useless because we cannot
control where it goes after that, but of course we can change the address
ourselves to anywhere we like. We can alter the address inside wow64cpu.dll,
we can alter the address at fs:[0c0h], or we can just call through the gate
on our own. The gate maps the entire 4GB of memory, and the selector value
is always 33h. We can switch between the modes easily, too. All we need is
the return address on the stack. We can switch modes in this long way:
call to64
;32-bit code continues here
to64:
db 0eah ;jmp 33:in64
dd offset in64
dw 33h
in64:
;64-bit code goes here
Switching back to 32-bit code can be done this way:
jmp fword ptr [offset to32 - offset fr64]
fr64:
to32:
dd offset in32
dw 23h
in32:
ret
The 0eah-style jmp is not supported in 64-bit mode, and a memory operand with
only a 32-bit displacement is not absolute there. It is rip-relative, which is
why the jmp is relative to the fr64 label.
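The two selector values used above, 33h and 23h, can be taken apart by hand. What follows is only a small sketch of the architectural selector layout (index, table indicator, requested privilege level). The function name decode_selector is made up for this illustration and is not part of any Windows API.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit x86 segment selector into its architectural fields."""
    return {
        "index": selector >> 3,                      # descriptor index, bits 3..15
        "table": "LDT" if selector & 4 else "GDT",   # table indicator, bit 2
        "rpl": selector & 3,                         # requested privilege level, bits 0..1
    }


if __name__ == "__main__":
    # 33h is the 64-bit code selector and 23h the 32-bit one, as used in the text.
    for name, value in (("64-bit code", 0x33), ("32-bit code", 0x23)):
        print(name, hex(value), decode_selector(value))
```

Both values select a user-mode (RPL 3) entry in the GDT. Only the descriptor index differs, and that descriptor is what decides whether the code runs as 32-bit or 64-bit.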
Of course there's a simpler way, which looks like this:
db 9ah ;call 33:in64
dd offset in64
dw 33h
;32-bit code continues here
in64:
;64-bit code goes here
To switch back to 32-bit code, just use a 32-bit retf. That's much easier.
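To make the db/dd/dw layout of the far call concrete, here is a small sketch that packs and unpacks the same 7 bytes. The helper names and the target address 00401000h are assumptions for the example only. The opcode 9ah, the 32-bit offset, and the 16-bit selector come from the listing above.

```python
import struct

FAR_CALL_OPCODE = 0x9A  # 'call ptr16:32' as encoded in 32-bit mode


def encode_far_call(offset: int, selector: int) -> bytes:
    """Pack opcode, 32-bit offset and 16-bit selector, all little-endian."""
    return struct.pack("<BIH", FAR_CALL_OPCODE, offset, selector)


def decode_far_call(code: bytes) -> tuple:
    """Return (offset, selector) from a 7-byte far call, or raise ValueError."""
    opcode, offset, selector = struct.unpack("<BIH", code[:7])
    if opcode != FAR_CALL_OPCODE:
        raise ValueError("not a 9ah far call")
    return offset, selector


if __name__ == "__main__":
    # 0x00401000 is a hypothetical address standing in for the in64 label.
    raw = encode_far_call(0x00401000, 0x33)
    print(raw.hex(" "))
    print(decode_far_call(raw))
```

The same layout, without the opcode byte, is the 6-byte far pointer that the fword jmp reads in the longer method.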
Finding ntdll.dll
Once in 64-bit mode, we can only use the native interface in ntdll.dll
because the kernel32.dll in our process memory is 32-bit, and won't run in
64-bit mode. We can get the base address of ntdll.dll this way:
push 60h
pop rsi
gs:lodsq ;gs not fs
mov rax, qword ptr [rax+18h]
mov rax, qword ptr [rax+30h]
mov rax, qword ptr [rax+10h]
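The pointer chain in the listing can be followed on paper. This sketch replays the same four reads against a made-up memory image (a dictionary of address to qword), so every address in it is hypothetical. The offsets 60h, 18h, 30h and 10h are the ones from the code above, and the comments give my reading of what each one reaches.

```python
def read_qword(memory: dict, address: int) -> int:
    """Read one 64-bit value from the simulated memory image."""
    return memory[address]


def first_init_order_module_base(memory: dict, teb: int) -> int:
    """Follow the same chain as the listing: TEB -> PEB -> loader data -> module."""
    peb = read_qword(memory, teb + 0x60)    # gs:[60h]: pointer to the 64-bit PEB
    ldr = read_qword(memory, peb + 0x18)    # PEB + 18h: loader data
    link = read_qword(memory, ldr + 0x30)   # first link of the initialization-order list
    return read_qword(memory, link + 0x10)  # link + 10h: base address of that module


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to make the chain visible.
    fake_memory = {
        0x7000 + 0x60: 0x8000,       # TEB -> PEB
        0x8000 + 0x18: 0x9000,       # PEB -> loader data
        0x9000 + 0x30: 0xA020,       # loader data -> first list link
        0xA020 + 0x10: 0x77000000,   # list link -> module base
    }
    print(hex(first_init_order_module_base(fake_memory, 0x7000)))
```

The last read is +10h and not +30h because the list link points into the middle of the module entry, not at its start.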
Mixing 32-bit and 64-bit
Best of all, Yasm now allows mixing 32-bit and 64-bit code in the same file.
When I was writing Shrug48 (because half-way between 32-bit and 64-bit), this
was not possible, so I had two source files that had to be built separately
and then concatenated afterwards. Now with Yasm, we can use "bits 32" before
the 32-bit code, and "bits 64" before the 64-bit code, anywhere in the file,
and we can swap between them as much as we want, like this:
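bits 32
db 9ah ;call 33:in64
dd in64 ;Yasm takes the bare label, no "offset" keyword
dw 33h
;32-bit code continues here
bits 64
in64:
push 60h
pop rsi
gs:lodsq ;gs not fs, rax = 64-bit PEB
mov rax, qword [rax+18h] ;loader data
mov rax, qword [rax+30h] ;first module in initialization order
mov rax, qword [rax+10h] ;base address of that module
retf ;return to 32-bit mode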
Another way to jump in a position-independent way is this:
push cs
call to64
;32-bit code continues here
to64:
push 0cb0033h ;combined selector 33h and retf
call to64 + 3
bits 64
;now in 64-bit mode
;64-bit code goes here
retf ;return to 32-bit mode
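The "to64 + 3" in the listing works because of how the push is encoded. As a sketch only (the helper name is invented for the example), we can pack the instruction bytes and look at what sits at offset 3:

```python
import struct

PUSH_IMM32 = 0x68  # opcode of 'push imm32'
RETF = 0xCB        # opcode of 'retf'


def encode_push_imm32(value: int) -> bytes:
    """Pack 'push imm32': one opcode byte, then the little-endian immediate."""
    return struct.pack("<BI", PUSH_IMM32, value)


code = encode_push_imm32(0x00CB0033)

# Bytes are 68 33 00 cb 00, so offset 3 inside the push is a retf opcode.
assert code[3] == RETF

# A far return loads only the low 16 bits of the popped dword into cs.
selector = 0x00CB0033 & 0xFFFF

if __name__ == "__main__":
    print(code.hex(" "), hex(selector))
```

So the second call pushes the address of the 64-bit code, lands on the hidden retf, and the retf pops that address together with the 33h selector. The "push cs" at the top leaves the 32-bit selector on the stack for the final retf.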
Current Directory
There is a separate current directory for 32-bit and 64-bit mode. Normally,
the 64-bit current directory is never used, because all 32-bit APIs that work
with the current directory do not switch to 64-bit first. We can make the
directories the same by overwriting the 64-bit pointers with the 32-bit ones.
Of course, we have to find the location for the 64-bit pointers, first. ;)
Even in 32-bit mode, there is a 64-bit Thread Information Block. It is 0x1000
after the 32-bit Thread Information Block. Inside the 64-bit TIB is a pointer
to the 64-bit RTL_USER_PROCESS_PARAMETERS. At 0x28 bytes before the structure
is the pointer to the current directory that is used by ntdll function
RtlDosPathNameToRelativeNtPathName_U. There are other pointers to the current
directory, but this is the one that we need.
Exceptions
We can use exceptions in 64-bit mode as usual, but SEH does not exist there.
We must use Vectored Exception Handlers instead. There is also a small thing
that surprised me. The 64-bit TIB has a context structure for saving the 32-bit
state during mode switching. During the switch, the esp slot is zeroed, and
restored again afterwards. This prevents recursive switching from overwriting
the context. This includes when an exception occurs. When an exception occurs,
no matter in which mode, the context is saved, and the esp slot is zeroed. The
problem is that when the exception returns, the esp slot is not restored. If an
exception occurs in 32-bit mode after that, then the application will crash. So
save the esp slot from the TIB (it is at gs:0x1480) if you will use exceptions
in 64-bit mode.
Closing
Using the gate is another way to check for 64-bit support, without using the
obvious IsWow64Process API call. Just place an SEH handler around the call, and
if an exception occurs, then you are on a 32-bit platform. You can also check if
the gs selector is not zero. This is true only on the 64-bit platform.
64-bit code in 32-bit files. The ultimate emulator killer. ;)
Greets to friendly people (A-Z):
Active - Benny - herm1t - hh86 - izee - jqwerty - Malum - Obleak - Prototype -
Ratter - Ronin - RT Fishel - sars - SPTH - The Gingerbread Man - Ultras -
uNdErX - Vallez - Vecna - Whitehead
rgb/defjam jun 2009/apr 2011
iam_rgb@hotmail.com