Overview
In this blog entry I describe the work I did for the 2018 Google Summer of Code (GSoC): the techniques I used to finish the project, my contributions, and what I learned during the three months.
About Me
My name is Ulrich Fourier. I am a Computer Science student currently pursuing my master's degree at the Technical University of Munich. While writing my bachelor's thesis I used Virtual Machine Introspection (VMI) to dynamically analyse malware. Because of this experience, I was very interested in working on this project.
The GSoC Project
Dynamic malware analysis has one enemy: split-personality malware. This malware changes its behavior when it finds indications that it is being analysed. To keep the analysis system from being detected, Virtual Machine Introspection (VMI) was introduced. With VMI the analyst can look inside a running virtual machine (the guest) and analyse the guest operating system while remaining stealthy, and thus evade detection by the malware.
VMI has been around for quite a long time on x86 architectures. Due to the rising use of ARM processors, also in the server market, there is an obvious need to port this technique to ARM. This project builds upon DRAKVUF, a dynamic malware analysis tool written by one of my mentors, Tamas Lengyel, and upon the former GSoC project done by my other mentor, Sergej Proskurin. That earlier project lays, as the title says, the foundation for DRAKVUF on ARM. It implements a technique called altp2m for ARM. altp2m allows us to provide multiple memory views for the guest. To understand altp2m, we first take a look at how memory address translation for virtual machines works.
On the one hand, the operating system in a virtual machine is supposed to handle address translation autonomously; on the other hand, the hypervisor aims to keep the virtual machines' memory regions strictly separated. To solve this problem in software, shadow page tables were introduced, and Intel provided a technique integrated into the hardware called the extended page table (EPT). The hypervisor translates a guest physical address to a host physical address or, as it is called in Xen, a machine physical address (p2m). altp2m takes advantage of the fact that we can manipulate this translation and have it return different results. Which translation is used depends on the kind of memory access the guest performs, and the change happens in the background, transparent to the guest. For example, altp2m translates a guest physical address to a page with an injected instruction when the guest executes that memory, and to a clean page when the guest reads it for an integrity check. We combine multiple translations into views. For example, we can have one original view which leaves the memory unchanged, and one view which is active when the guest executes code. The abstraction of views helps us keep track of which memory, the original or the modified, the guest sees.
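The idea of views can be illustrated with a small toy model. This is only a sketch for intuition, not Xen's or DRAKVUF's actual code: the class Altp2mModel, its methods, and the frame numbers are all hypothetical, and the switch between views is triggered by hand here, whereas in a real system the hypervisor switches depending on the guest's memory access.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class Altp2mModel:
    """Toy model (hypothetical): every view maps guest frame numbers to machine frame numbers."""

    def __init__(self, base_p2m):
        # View 0 is the original, unmodified translation.
        self.views = {0: dict(base_p2m)}
        self.active = 0

    def create_view(self):
        # A new view starts as a copy of the original translation.
        view_id = max(self.views) + 1
        self.views[view_id] = dict(self.views[0])
        return view_id

    def remap(self, view_id, gfn, mfn):
        # Point one guest frame at a different machine frame, in this view only.
        self.views[view_id][gfn] = mfn

    def switch(self, view_id):
        if view_id not in self.views:
            raise KeyError(view_id)
        self.active = view_id

    def translate(self, gpa):
        # Translate a guest physical address using the currently active view.
        gfn, offset = divmod(gpa, PAGE_SIZE)
        return self.views[self.active][gfn] * PAGE_SIZE + offset


# Guest frame 0x10 is backed by machine frame 0x200 in the original view.
model = Altp2mModel({0x10: 0x200})
exec_view = model.create_view()
# In the execute view the same guest frame points at a shadow copy (frame 0x300)
# that would hold the injected instruction.
model.remap(exec_view, 0x10, 0x300)

gpa = 0x10 * PAGE_SIZE + 0x24
clean_address = model.translate(gpa)    # original view: clean page
model.switch(exec_view)
patched_address = model.translate(gpa)  # execute view: modified page
```

The guest uses the same guest physical address in both cases; only the active view decides which machine page it ends up on.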
Trapping the guest
For VMI it is necessary to be able to stop and examine the guest when it executes a certain part of the code. As described earlier, we are able to stealthily inject code into the guest; on x86, for example, DRAKVUF injects the INT3 instruction. Any time the instruction is executed the hypervisor is called, allowing the malware analyst to stop the guest and do some analysis. Because INT3 is x86 specific, we needed to find an equivalent instruction for ARM. We chose the secure monitor call (SMC) instruction. When this instruction is executed, the guest is interrupted and the hypervisor is informed; this behavior is equivalent to INT3 on x86. The only disadvantage of injecting SMC is that user space is not allowed to execute SMCs. We can therefore only trap code running in kernel space.
Single Stepping
[Figure: single stepping solved in software]
Changes I made
To implement these techniques I needed to extend the code of DRAKVUF and libvmi. The changes in detail:
libvmi: In this patch I added support for SMC events to the libvmi event handling functions. The patch also extends the existing libvmi event handling code to support ARM registers, which is necessary in order to use those registers in DRAKVUF.
DRAKVUF: This pull request is split into three commits:
- The first commit adds the term software traps as an abstract name for the interrupt traps used on x86 and the SMC traps used on ARM. This commit let me reuse some of DRAKVUF's trap injection code in the next commit.
- The second commit is the biggest of this pull request. It adds the single stepping technique described above to the trap injection function of DRAKVUF and also adds a callback for SMC traps. This callback changes the altp2m views in order to enable single stepping. We tried to reuse as much existing DRAKVUF code as possible.
- The third commit makes changes to the syscall plugin, one of several DRAKVUF plugins. This plugin monitors every syscall happening inside the virtual machine. Only a few changes were needed to use this plugin on an ARM machine.
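To give a feeling for how an SMC callback that changes altp2m views can emulate single stepping, here is a toy simulation. It is a sketch under loud assumptions, not the code from the pull request: it assumes a two-view scheme (an execute view with an SMC on the trapped instruction and a step view with an SMC on the instruction right after it), models only straight-line code without branches, and the function run and all instruction names are made up for illustration.

```python
SMC = "smc"  # stand-in for the injected secure monitor call


def run(original, trap_pc):
    """Run a toy instruction list; return (executed instructions, trap hits)."""
    # Execute view: the instruction of interest is replaced by an SMC.
    exec_view = list(original)
    exec_view[trap_pc] = SMC
    # Step view (assumption): the original instruction is intact, but the
    # instruction right after it is replaced by an SMC.
    step_view = list(original)
    if trap_pc + 1 < len(original):
        step_view[trap_pc + 1] = SMC

    views = {"exec": exec_view, "step": step_view}
    active = "exec"
    executed, hits = [], []
    pc = 0
    while pc < len(original):
        insn = views[active][pc]
        if insn == SMC:
            # The "hypervisor callback": it only switches views.
            if active == "exec":
                hits.append(pc)  # the trap the analyst is interested in
                active = "step"
            else:
                active = "exec"  # the single step is done, re-arm the trap
            continue  # the guest retries the same address in the new view
        executed.append(insn)
        pc += 1
    return executed, hits


program = ["mov", "add", "str", "ret"]
executed, hits = run(program, trap_pc=1)
```

In this model the guest still executes every original instruction exactly once, while the analysis side is notified once when the trapped address is reached.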
How to use the code
Clone the drakvuf repository and make sure that you also check out the submodules. After you have successfully installed Xen from the xen-arm submodule and libvmi from the libvmi submodule on your system, you can go on to compile DRAKVUF. Before that, make sure to configure it without any plugin but the syscall plugin, because the other plugins have not yet been ported to ARM. Then compile and run it as you would on x86:
autoreconf -vi
./configure --disable-plugin-poolmon --disable-plugin-filetracer --disable-plugin-filedelete --disable-plugin-objmon --disable-plugin-exmon --disable-plugin-ssdtmon --disable-plugin-debugmon --disable-plugin-cpuidmon --disable-plugin-socketmon --disable-plugin-regmon --disable-plugin-procmon
make -j
sudo ./src/drakvuf -r <rekall profile> -d <domid>
Future Work
The obvious next step is to get my DRAKVUF pull request accepted. For that, I need to configure the Travis automatic build testing system to cross-compile the ARM code on its x86-based systems. This is a complex task and unfortunately I have not yet managed to get it done.
What I learned
The most important takeaway for me from participating in GSoC is learning how to use git in a collaborative environment. Although I had used git before, I did not know what a powerful tool it is. The other takeaway is that people maintaining open source projects are really nice, helpful and thankful for every addition to their code.
With this in mind, I am more determined to participate in open source software development in the future. Although it was hard to motivate myself to code during the summer, I had a lot of fun developing and testing the code, which will be used by a lot of people.