Blue pill/red pill: the Matrix has Windows Longhorn

The evolution of stealth technologies culminated (around mid-2006) in the emergence of an entirely new kind of rootkit that is claimed to be impossible to detect. Once a computer swallows the “blue pill”, the OS moves into a virtual world fully controlled by the rootkit. The old real world ceases to exist. To see it again, one has to swallow the “red pill”. The best hacker minds are trying to create one… without much success so far.

Foreword

Windows Vista/Longhorn had barely appeared when it was brutally hacked by the Polish pani Joanna Rutkowska. She presented a new-generation rootkit called “Blue Pill” on July 21, 2006 at the SyScan conference in Singapore, and on August 3, 2006 at Black Hat in Las Vegas. The name is an obvious reference to the film “The Matrix”. (The presentation’s transcript and a lot of other interesting material and programs can be found on Joanna’s site: www.invisiblethings.org)

After swallowing the blue pill we enter the virtual world; after swallowing the red one we see the world as it really is

Microsoft representatives reacted to this all too calmly: nah, it was just a beta version! Nobody claimed that Vista/Longhorn is impossible to hack! This is the normal process of “polishing” an OS, and the more bugs are found during beta testing, the more stable the final version will be. A new build was released two months later, but Vista RC1 shipped without a fix and is still vulnerable. Microsoft looked at the problem and stated that there was no issue, as if there were no security hole at all! (The Microsoft representative’s statement is at http://blogs.msdn.com/windowsvistasecurity/archive/2006/08/07/691441.aspx).

They say that anything is possible with administrator privileges (which “Blue Pill” needs to obtain). Well, what can a user actually do with them?! Even an administrator has no right to load an unsigned driver in order to legitimately run code in kernel mode. This causes a huge headache for both administrators and developers. If Microsoft had closed all the holes, its customers would have good reason to silently put up with it in the name of Her Majesty Security. But now it turns out we have to give up our freedom and comfort for… nothing! Is that rational? As always, it is left to Microsoft to decide, although the only thing it has really succeeded at is pushing its ever-glitching products onto the software market.

Joanna Rutkowska at Black Hat conference

<untranslatable comment about whether her name should be spelled as “Жанна” or “Джоанна” in Russian>

inside “blue pill”

“Blue Pill” is built on two basic mechanisms: bypassing the driver digital signature check (signature checking is mandatory in x86-64 editions of Windows starting with Vista Beta 2 build 5384) and installing a hypervisor. The hypervisor relies on the AMD Pacifica/Intel Vanderpool technologies, which make it possible to run operating systems inside an emulator that controls all the “interesting” system events. This resembles the virtual 8086 mode (V86) of the Intel 80386, which was used to run several MS-DOS sessions at once. Now there is a “virtual 386+” extension, and the interesting thing is that it does not let the guest OS detect whether it is running on a “real” processor, provided that the hypervisor (also called a VMM, virtual machine monitor) was built with detection evasion in mind.

The signature bypass technique matters only on 64-bit Windows systems (32-bit Windows lets you load unsigned drivers anyway). The virtualization technique is not tied to a particular OS and should work with any of them, as long as the processor supports it.

So, the “blue pill” consists of two components, of which only one is actually “blue”: it is responsible for moving the OS into the virtual world. The other component merely prepares the ground by bypassing the protection on 64-bit Windows systems and loading shellcode into kernel space. That shellcode can serve as the rootkit.

The two components of "blue pill" - one is pushed onto kernel level, the other one makes OS delve into virtual world

digital signature evasion

The signature bypass mechanism proposed by Joanna is based on modifying the swap file at sector level (let us call it the page-file attack). The attack consists of six steps:

  1. We find a rarely used driver (such as NULL.SYS) in the /WINNT/System32/Drivers directory, read its contents, and pick out a unique byte sequence (signature) by which we can recognize it later. The signature must lie in the IRP_MJ_DEVICE_CONTROL branch of the DeviceDispatcher procedure (its address can be found by disassembling the driver). It is important that the signature does not cross a page boundary, because adjacent pages are not necessarily stored next to each other in the swap file. That is, the following condition must hold: (virtual_address_of_signature % 1000h) + sizeof(signature) < 1000h;

  2. We launch a “memory-eater” program that “eats up” all the remaining memory (by calling the VirtualAlloc API function, for example). This forces the OS to swap RAM out to the hard drive, kernel components included (attention: if the DisablePagingExecutive parameter in the HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management branch is set to 1 (the default is 0), kernel components are not swapped out; changes to it take effect after a reboot);

  3. We open the “\\.\C:” device (a logical disk) or “\\.\PHYSICALDRIVE0” (a physical drive) with the CreateFile API function and read/write it at sector level with ReadFile/WriteFile. We can also use the well-documented SPTI interface, which passes SCSI commands through the DeviceIoControl API function using the IOCTL_SCSI_PASS_THROUGH_DIRECT (4D014h) IOCTL code. The operating system automatically translates these pseudo-SCSI commands into native commands for the particular storage device, such as an IDE HDD. Two undocumented IOCTL codes (IOCTL_IDE_PASS_THROUGH and SCSIOP_ATA_PASSTHROUGH) allow native ATA commands to be sent to IDE devices. This gives unlimited control over them, but reduces compatibility (what if the victim PC has SCSI?!). All of the interfaces mentioned above require administrator privileges, which we may not have at that moment. But there is also the ASPI interface developed by Adaptec, which has no such restriction!!! And although the ASPI driver is meant for ATAPI devices such as CD/DVD drives, hard drives are often supported as well. So in theory the attack can be carried out without administrator privileges at all!!! If the target machine has no ASPI driver installed, the rootkit must either install it itself (by the way, the driver is signed and distributed for free) or look for other drivers installed by low-level HDD/CD/DVD tools such as disk editors and CD/DVD burners. Many of them give access to a hard drive without administrator privileges. You can read more about this in my book “Technique of laser disks copying protection”; the draft can be found at ftp://nezumi.org.ru. Remember that the server is not always up;

  4. Once the driver has been swapped out, we scan the hard drive at sector level for the driver’s signature. Whether the driver has already been swapped out can be determined empirically: after RAM is exhausted, the system starts pushing the pages allocated with VirtualAlloc out to disk. The amount of available physical memory then rises sharply, which can be detected with the VirtualQuery API function;

  5. We locate the IRP_MJ_DEVICE_CONTROL handler and overwrite it with shellcode that disables digital signature checking, or with shellcode that loads the required code into the kernel by itself. With the first approach, we will be able to load unsigned drivers effortlessly in the future;

  6. We call the CreateFile API function, passing the name of the hacked driver (NULL.SYS in our case) as one of its parameters, and… the operating system reads the modified memory pages, calls IRP_MJ_DEVICE_CONTROL, and transfers control to the shellcode.
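The page-boundary condition from step 1 and the page-by-page search from step 4 can be illustrated with a small sketch. It works on an in-memory stand-in for a swap file, not on a real disk; the function names and the signature bytes are made up for illustration and do not come from Joanna’s prototype.

```python
import io

PAGE_SIZE = 0x1000  # 4 KiB page, the 1000h from step 1


def fits_in_one_page(virtual_address: int, signature: bytes) -> bool:
    """Step 1 condition: (address % 1000h) + sizeof(signature) < 1000h."""
    return (virtual_address % PAGE_SIZE) + len(signature) < PAGE_SIZE


def find_signature_pages(image, signature: bytes):
    """Scan a file-like object page by page; yield (page_offset, offset_in_page).

    Searching each page separately is sufficient only because the signature
    was chosen so that it never crosses a page boundary.
    """
    page_offset = 0
    while True:
        page = image.read(PAGE_SIZE)
        if not page:
            break
        pos = page.find(signature)
        if pos != -1:
            yield page_offset, pos
        page_offset += len(page)


# Hypothetical signature and a three-page fake "swap file" held in memory.
signature = bytes.fromhex("8bff558bec817d0c")
fake_swap = bytearray(3 * PAGE_SIZE)
fake_swap[PAGE_SIZE + 0x120:PAGE_SIZE + 0x120 + len(signature)] = signature

hits = list(find_signature_pages(io.BytesIO(bytes(fake_swap)), signature))
print(hits)  # [(4096, 288)]
```

A signature that straddled two pages would be missed by this scan, which is exactly why step 1 insists on the condition.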

Having prototyped the attack, Joanna, as any “white hat” hacker would, proposed several countermeasures against it. The simplest, though not the smartest, solution for Microsoft is to stop paging kernel-level components out to disk. Everything needed for that is already there: it is enough to set the DisablePagingExecutive flag to 1 in all future Windows NT versions. As a result, we lose some amount of RAM. Another way is to compute a checksum of each memory page before it is written to disk and verify it after it is read back. The checksums have to be stored somewhere (in RAM, not on the hard drive), so we gain nothing with this method and only waste CPU cycles. We could also encrypt the memory pages with a very fast cipher, but that is overkill. Besides, it is also possible to modify the kernel file itself in order to disable all the protection mechanisms. So these methods are useless!
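The checksum countermeasure can be modelled in a few lines. This is only a toy sketch: the class and its methods are hypothetical, a dictionary stands in for the swap file, and SHA-256 is picked here merely as a convenient digest (the text does not name one).

```python
import hashlib


class CheckedPageStore:
    """Toy model of a pager that verifies pages when they come back from disk."""

    def __init__(self):
        self.disk = {}     # page number -> bytes; stands in for the swap file
        self.digests = {}  # page number -> digest; kept "in RAM", never on disk

    def page_out(self, page_no: int, data: bytes) -> None:
        # Remember the digest before the page leaves memory.
        self.digests[page_no] = hashlib.sha256(data).digest()
        self.disk[page_no] = data

    def page_in(self, page_no: int) -> bytes:
        # Refuse to use a page whose contents changed while it was on disk.
        data = self.disk[page_no]
        if hashlib.sha256(data).digest() != self.digests[page_no]:
            raise ValueError(f"page {page_no} was modified while swapped out")
        return data


store = CheckedPageStore()
store.page_out(7, b"\x90" * 4096)
clean = store.page_in(7)  # unmodified page passes the check

store.disk[7] = b"\xcc" + store.disk[7][1:]  # simulate a sector-level patch
try:
    store.page_in(7)
    tampering_detected = False
except ValueError:
    tampering_detected = True
print(tampering_detected)  # True
```

With this particular digest the RAM cost is 32 bytes per 4096-byte page, in addition to the CPU time spent hashing on every page-out and page-in.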

The swap-file attack is indeed an advanced technique that Microsoft is not going to fix any time soon. At least Vista RC1 did not. It is important to note that everything described above matters only for 64-bit Windows editions, as they are the only ones that do not let administrators load unsigned drivers.

delving into virtual world

AMD’s hardware virtualization mechanism, codenamed Pacifica, is implemented in the Athlon 64/Turion 64 processor families, and work is under way to bring hardware virtualization to the Opteron family. But this applies only to a handful of x86-64 chips. AMD decided not to put virtualization into its x86 processors, claiming that the x86 architecture is not suitable for this purpose (AMD are suckers). Intel did it nevertheless.

The Pacifica virtualization technology in AMD processors. After the VMRUN instruction is executed, the processor creates a new SVM (Secure Virtual Machine) controlled by the hypervisor and the VMCB (Virtual Machine Control Block)

Intel’s technology, codenamed “Vanderpool”, branched off from “Silvervale”, its counterpart for Itanium. Vanderpool is implemented in the Pentium 4 6x2, Pentium D 9xx, Xeon 7xxx, Core Duo, and Core 2 Duo. To avoid possible confusion, Intel merged both codenames into VT-X (Virtualization Technology X).

VT-X differs substantially from Pacifica in its details, although conceptually it is the same technology with similar functionality: it allows a hypervisor to be started and the operating system to be switched into a “guest” virtual mode in which, from the OS’s own standpoint, everything looks real.

Hardware virtualization mechanism implemented by Intel in Vanderpool/Silvervale: the virtual machine monitor (VM Monitor) is started with the VMXON instruction and allows as many virtual machines to be created as one likes

The hypervisor hands control to the guest OS and takes it back whenever “interesting” events occur: hardware/software interrupts, accesses to model-specific registers, and so on.

The hypervisor cannot intercept API calls inside the guest OS directly, but it can watch accesses to I/O ports and manipulate the hardware regardless of what the OS wants to do with it. Indirect API interception is possible by setting hardware breakpoints on instructions that the guest OS will execute; the hypervisor then also has to make sure the guest sees “good” values in the DRx registers. Unfortunately, no more than four breakpoints can be set at a time.

The full Pacifica specification can be found in “AMD64 Architecture Programmer’s Manual Vol. 2: System Programming” at http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf.

Intel chose to release a separate document rather than pile everything into one big manual: ftp://download.intel.com/technology/computing/vptech/C97063-002.pdf (writing your own VMM without consulting the system programming guide is hardly possible, but hackers have already read it anyway. Those with no protected-mode experience will have a hard time implementing the “blue pill”, but… that is their problem).

Executing VMRUN on AMD x86-64 processors makes hypervisor take control over OS

Let’s look at the “blue pill” concept that Joanna developed for AMD x86-64 processors. After the “blue pill” is swallowed (CALL bluepill), the kernel executes the rootkit’s main function (PROC bluepill). It prepares the data structures the hypervisor needs, creates a virtual machine, and launches it with VMRUN (on Intel chips, VMXON followed by VMLAUNCH). This instruction in turn starts the hypervisor and puts “glasses” on the OS that filter everything the guest OS sees of the “outside world”: I/O ports, model-specific registers, physical RAM, and so on.

The hypervisor “palms off” on the guest OS only the information it wants the guest to perceive, while hiding the VMM’s own existence.

Simplified concept of the "blue pill" created by Joanna

A hypervisor/VMM is very complex to write and even harder to debug. But we still want our own “blue pill”, don’t we? Actually, Joanna herself has not finished one, as she frankly admitted in her blog post “The Blue Pill Hype”: theinvisiblethings.blogspot.com/2006/07/blue-pill-hype.html. Мыщъх quotes the following: “All the hype started from this article in eWeek by Ryan Naraine (http://www.eweek.com/article2/0,1895,1983037,00.asp). The article is mostly accurate, despite one detail - the title, which is a little misleading… It suggests that I already implemented “a prototype of Blue Pill which creates 100% undetectable malware”, which is not true. Should this be true, I would not call my implementation “a prototype”, which suggests some early stage of product… The Blue Pill prototype I currently have is not yet complete, but this is not that important, because having successfully moved the OS into a virtual machine, implementing all the other features is just a matter of following the Pacifica specification.”

Once the "blue pill" is swallowed, the operating system descends into the virtual world

To get several light-years closer to our goal without reinventing the wheel, we can “rip out” the core of an existing emulator and touch it up with a “file” for our hacker purposes. Obviously it has to be a free, open-source emulator such as XEN, which supports both architectures (Pacifica and Vanderpool) and puts an additional abstraction layer over the hardware details. This does not mean, though, that we can get away without creating two separate rootkits for the x86 and x86-64 platforms. The XEN v3 source code can be found at http://www.cl.cam.ac.uk/Research/SRG/netos/xen/downloads/xen-3.0-testing-src.tgz (the only stable version for now); the official site is http://www.cl.cam.ac.uk/Research/SRG/netos/xen. In particular, the core with x86 support is in the /xen-3.0-testing/xen/include/asm-x86/hvm/vmx/vmx.c file.

The red pill

In the film “The Matrix” you have to take a red pill to see the real world. What about an operating system? Can it somehow determine that it is running under a hypervisor? A program written for this sole purpose is usually called a “Red Pill”. Both hackers and system administrators use such programs: the former to detect VMWare and other software emulators, the latter to detect rootkits. But with hardware virtualization things get more complicated…

After taking "Red Pill", the operating system detects emulator, if it actually exists

One might think it is enough to execute VMCALL/VMXON and see whether the instruction fails: that is what should happen if the OS is virtualized, but it is also what happens when the processor does not support hardware virtualization or when it is disabled in the BIOS. Writing such a program is easy if you already have a driver skeleton, but… how do we tell “the processor has no support” from “the OS is being emulated”?! There is only one way: rip the heatsink off the chip, read its markings, and compare them with CPUID’s “testimony”. Unless the emulator is completely stupid, it will feed us false information to convince us that the processor does not support hardware virtualization. But how many users are prepared to do something that bold?! And suppose they are, will they be able to put the heatsink back where it was?
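Why CPUID’s “testimony” cannot be trusted is easy to show on register values alone. The sketch below does not execute CPUID; it only interprets an ECX value that is assumed to have been obtained elsewhere. The bit positions come from the vendor manuals (SVM is ECX bit 2 of CPUID function 8000_0001h on AMD, VMX is ECX bit 5 of CPUID leaf 1 on Intel); the helper names and the sample ECX value are hypothetical.

```python
SVM_BIT = 1 << 2  # CPUID Fn8000_0001, ECX bit 2: AMD SVM (Pacifica)
VMX_BIT = 1 << 5  # CPUID leaf 1, ECX bit 5: Intel VMX (Vanderpool / VT-X)


def cpuid_claims_virtualization(vendor: str, ecx: int) -> bool:
    """Interpret an ECX value already read from the relevant CPUID leaf."""
    if vendor == "AuthenticAMD":
        return bool(ecx & SVM_BIT)
    if vendor == "GenuineIntel":
        return bool(ecx & VMX_BIT)
    return False


def hide_virtualization(vendor: str, ecx: int) -> int:
    """What a stealthy hypervisor could report: same ECX, feature bit cleared."""
    mask = SVM_BIT if vendor == "AuthenticAMD" else VMX_BIT
    return ecx & ~mask


real_ecx = 0x1F  # made-up value with the SVM bit set
print(cpuid_claims_virtualization("AuthenticAMD", real_ecx))   # True
faked_ecx = hide_virtualization("AuthenticAMD", real_ecx)
print(cpuid_claims_virtualization("AuthenticAMD", faked_ecx))  # False
```

From inside the guest, the faked value is indistinguishable from what a processor without SVM would report, which is the whole difficulty.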

Note that a hypervisor can emulate VMCALL/VMXON to implement nested virtualization: one virtual world inside another, with no real limit on the nesting depth. Obviously, performance drops further with every “Blue Pill” swallowed. Even after the first pill the computer suffers a significant performance loss (nobody ever claimed that hardware virtualization preserves 100% efficiency).

Virtual worlds nested in each other like Russian nesting dolls

Wait, wait, wait! Performance! We can measure the execution time of instructions intercepted by the hypervisor and notice that they take longer in the virtual world (the first candidate is RDMSR EFER: bit 12 of EFER shows whether the OS is being emulated).
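Bit 12 of EFER is the SVME (Secure Virtual Machine Enable) flag. The following sketch shows the check itself and, as a toy model, how an intercepting hypervisor could hide the flag from the guest. No real MSR is read here; the EFER value and the function names are illustrative assumptions.

```python
EFER_MSR = 0xC0000080  # address of the EFER model-specific register
EFER_SVME = 1 << 12    # bit 12: Secure Virtual Machine Enable


def svme_enabled(efer: int) -> bool:
    """True if the SVME bit is set in the given EFER value."""
    return bool(efer & EFER_SVME)


def intercepted_rdmsr(msr: int, real_value: int) -> int:
    """Toy RDMSR intercept: the guest never sees SVME set in EFER."""
    if msr == EFER_MSR:
        return real_value & ~EFER_SVME
    return real_value


real_efer = 0x1D01  # illustrative: SCE, LME, LMA, NXE and SVME set
guest_view = intercepted_rdmsr(EFER_MSR, real_efer)
print(svme_enabled(real_efer), svme_enabled(guest_view))  # True False
```

Because the value itself can be faked, what remains observable is the extra time the intercept takes.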

The only problem is that we have nothing reliable to measure execution time with. The RDTSC instruction (which reads the TSC, the CPU cycle counter) drops off the candidate list because it can be controlled by the hypervisor. To make things simple, there is a special “calibration” field, VMCB.TSC_OFFSET: the processor adds this signed value to the result of RDTSC executed by the guest, so a negative offset subtracts cycles. RDTSC is thus corrected by the hardware itself, without the emulator having to do it on its own.
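The effect of VMCB.TSC_OFFSET can be simulated with plain arithmetic. In AMD’s manual the field is a signed value added to the time-stamp counter the guest reads, so hiding cycles means making it more negative. The cycle counts below are made-up assumptions chosen only to make the arithmetic visible.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def guest_rdtsc(host_tsc: int, tsc_offset: int) -> int:
    """Value the guest sees: host TSC plus the signed offset, wrapped to 64 bits."""
    return (host_tsc + tsc_offset) & MASK64


NATIVE_COST = 50      # assumed cycles for the instruction on bare metal
EXIT_OVERHEAD = 2000  # assumed extra cycles spent inside the hypervisor

host_tsc = 1_000_000
tsc_offset = 0

t0 = guest_rdtsc(host_tsc, tsc_offset)
host_tsc += NATIVE_COST + EXIT_OVERHEAD  # the intercepted instruction runs
tsc_offset -= EXIT_OVERHEAD              # the hypervisor hides its own time
t1 = guest_rdtsc(host_tsc, tsc_offset)

print(t1 - t0)  # 50
```

The guest measures 50 cycles although 2050 actually elapsed, so a timing check based on RDTSC alone sees nothing unusual.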

Instruction execution time correction

In theory, you could use an external real-time clock, such as a network atomic clock or a stopwatch. The hypervisor can tamper with the OS’s real-time clock (although it will then show the wrong time, and the user may notice) or intercept and modify network traffic that carries time information (though that must be very hard to do). But unless the user himself lives in a virtual world, the guest OS can hardly influence the real-life stopwatch he holds in his hand. ;-)

Detecting hypervisor with chronometer would fail as well

That would indeed be quite a “Red Pill”… “The computer is running a test program; please use your watch to time its execution. The test takes 60 seconds.” Who is ever going to do this?! And have we thought about what to compare the measurement with? To detect the hypervisor we would need a reference measurement from the very same computer with a thermostatically controlled CPU, because many CPUs adjust their clock rate according to temperature. And even then, the hypervisor could intercept RDMSR EFER and skip some loop iterations to bring the timing close to the authentic one.

Some crazy people suggest reading RAM via DMA and dumping it to the hard drive, and then looking for traces of the hypervisor in that heap of garbage. First, they overestimate what DMA reads can do. Second, the hypervisor can easily track such memory accesses through the I/O ports. And third, what do we do if we do not find a signature?! If the signature of the “Blue Pill” code is unknown to us, or the code itself is polymorphic, we will never recognize it!
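The third objection is easy to demonstrate. In the sketch below a fixed-signature search finds a known marker in a memory dump but misses the same bytes after a trivial XOR transformation, which stands in here for a real polymorphic encoder. The marker, the key and the dumps are all made up.

```python
def contains_signature(dump: bytes, signature: bytes) -> bool:
    """Naive scanner: look for an exact byte sequence in the dump."""
    return signature in dump


def xor_encode(code: bytes, key: int) -> bytes:
    """Stand-in for a polymorphic encoder: same payload, different bytes."""
    return bytes(b ^ key for b in code)


known = b"BLUEPILL-STUB"  # hypothetical marker of a known sample
plain_dump = b"\x00" * 64 + known + b"\x00" * 64
morphed_dump = b"\x00" * 64 + xor_encode(known, 0x5A) + b"\x00" * 64

print(contains_signature(plain_dump, known))    # True
print(contains_signature(morphed_dump, known))  # False
```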

something instead of conclusion

So it turns out there is no absolute “Red Pill” at all?! Nobody said that… but then there is no “Blue Pill” yet either. Virtualization technology does make it possible to create plenty of undetectable “Blue Pills”. I guess. In practice, everything related to this problem is very hard to implement. So let us not design spherical cows, but solve problems as they materialize.