"Why?", I hear you ask.
Well... There are quite a few interesting things we can do solely from the context of the TrustZone kernel. To name a few:
- We could hijack any QSEE application directly, thus exposing all of its internal secrets. For example, we could directly extract the stored real-life fingerprint or various secret encryption keys (more on this in the next blog post!).
- We could disable the hardware protections provided by the SoC's XPUs, allowing us to read and write directly to all of the DRAM. This includes the memory used by the peripherals on the board (such as the modem).
- As we've previously seen, we could blow the QFuses responsible for various device features. In certain cases, this could allow us to unlock a locked bootloader (depending on how the lock is implemented).
Qualcomm's Secure Environment Operating System (QSEOS), like most operating systems, provides services to the applications running under it by means of system-calls.
As you know, operating systems must take great care to protect themselves from malicious applications. In the case of system-calls, this means the operating system mustn't trust any information provided by an application and should always validate it. This forms a "trust-boundary" between the operating system itself and the running applications.
So... This sounds like a good place to start looking! Let's see if the TrustZone kernel does, in fact, cover all the bases.
In the "Secure World", just like the "Normal World", user-space applications can invoke system-calls by issuing the "SVC" instruction. All system-calls in QSEE are invoked via a single function, which I've dubbed "qsee_syscall":
- Stores the syscall number in R0
- Stores the arguments for the syscall in R4-R9
- Invokes the SVC instruction with the code 0x1400
- Returns the syscall result via R0
Unlike SMC instructions (used to request "Secure World" services from the "Normal World"), which use the MVBAR (Monitor Vector Base Address Register) to provide the vector's base address, SVC instructions simply use the "Secure" version of the VBAR (Vector Base Address Register).
Accessing the VBAR is done using the MRC/MCR opcodes, with the following operands:
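For reference, the ARMv7-A architecture defines the VBAR as the CP15 register c12, c0, 0 (with opc1 = 0), and the MVBAR as c12, c0, 1. The sketch below is not taken from the TrustZone kernel; it merely assembles the 32-bit ARM encodings of the matching MRC/MCR instructions, which can be handy when searching a disassembly for VBAR accesses. The helper names (encode_cp15 and friends) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Build the ARM (A32) encoding of an unconditional MRC/MCR to coprocessor 15.
 * is_read = 1 gives MRC (coprocessor -> core register), 0 gives MCR. */
static uint32_t encode_cp15(int is_read, uint32_t opc1, uint32_t rt,
                            uint32_t crn, uint32_t crm, uint32_t opc2)
{
    assert(opc1 < 8 && opc2 < 8 && rt < 16 && crn < 16 && crm < 16);
    return 0xEE000010u                    /* cond = AL, register transfer */
         | (opc1 << 21)
         | ((is_read ? 1u : 0u) << 20)    /* 1 = MRC, 0 = MCR */
         | (crn << 16)
         | (rt << 12)
         | (15u << 8)                     /* coprocessor number: p15 */
         | (opc2 << 5)
         | crm;
}

/* MRC p15, 0, r0, c12, c0, 0 : read the VBAR into r0 */
static uint32_t mrc_read_vbar_r0(void)  { return encode_cp15(1, 0, 0, 12, 0, 0); }

/* MCR p15, 0, r0, c12, c0, 0 : write r0 to the VBAR */
static uint32_t mcr_write_vbar_r0(void) { return encode_cp15(0, 0, 0, 12, 0, 0); }

/* MRC p15, 0, r0, c12, c0, 1 : read the MVBAR into r0 (Secure privileged modes only) */
static uint32_t mrc_read_mvbar_r0(void) { return encode_cp15(1, 0, 0, 12, 0, 1); }
```

With r0 as the core register, these come out as 0xEE1C0F10, 0xEE0C0F10 and 0xEE1C0F30 respectively; only the Rt field changes for other registers.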
At this point we can start tracing the execution from the SVC handler in the vector table.
The code initially does some boilerplate preparations, such as saving the passed arguments and context, and finally gets to the main entry point which is used to actually handle the requested system-call. Qualcomm have helpfully left a single logging string in this function containing its original name "app_syscall_handler", so we'll use that name as well. Let's take a look at the function's high-level graph overview:
|app_syscall_handler graph overview|
On closer inspection, the graph turns out to be very shallow: while there are a lot of different code-paths, they are all relatively simple. In fact, the function is simply a large switch-case, which uses the syscall command-code supplied by the user (in R0) in order to select which syscall should be executed.
|snippet from app_syscall_handler's switch-case|
But something's obviously missing! Where are the validations on the arguments passed in by the user? app_syscall_handler makes no such effort, so this means the validation can only possibly be in the syscalls themselves... Time to dig deeper once more!
As you can see in the screenshot above, most of the syscalls aren't directly invoked, but rather indirectly called by using a set of globally-stored pointers, each pointing to a different table of supported system-calls. I've taken to using the following (imaginative) names to describe them:
Cross-referencing these pointers reveals the locations of the actual system-call tables to which they point. The tables' structure is very simple - each entry contains a 32-bit number representing the syscall number within the table, followed by a pointer to the syscall handler function itself. Here is one such table:
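In C terms, the layout described above could be modelled roughly as follows. This is only a sketch based on the description in this post: the type names, the handler signature, the explicit entry count and the two sample handlers are all assumptions for the sake of the example, not symbols recovered from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed handler signature: six arguments, mirroring R4-R9. */
typedef uint32_t (*syscall_handler_t)(const uint32_t args[6]);

/* One table entry: a 32-bit syscall number followed by a handler pointer.
 * (8 bytes on 32-bit ARM; larger on a 64-bit host because of the pointer.) */
struct syscall_entry {
    uint32_t          number;
    syscall_handler_t handler;
};

/* Linear search for the entry with the requested syscall number. */
static syscall_handler_t find_handler(const struct syscall_entry *table,
                                      size_t count, uint32_t number)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (table[i].number == number)
            return table[i].handler;
    }
    return NULL; /* unknown syscall */
}

/* Hypothetical handlers and table, purely for demonstration. */
static uint32_t sys_add(const uint32_t args[6])  { return args[0] + args[1]; }
static uint32_t sys_zero(const uint32_t args[6]) { (void)args; return 0; }

static const struct syscall_entry example_table[] = {
    { 0x10, sys_zero },
    { 0x11, sys_add  },
};
```

A dispatcher would call find_handler with the number taken from R0 and then invoke the returned handler with the saved arguments; whether the real kernel searches linearly or indexes the table directly is not something this sketch tries to settle.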
Finally, let's take a look at a simple syscall which must perform validation in order to function correctly. A good candidate would be a syscall which receives a pointer as an argument, and subsequently writes data to that pointer. Obviously, this is incredibly dangerous, and would therefore require extra validation to make sure the pointer is strictly within the memory regions belonging to the QSEE application.
Digging through the widevine application, we find the following syscall:
- A pointer to a "cipher" object, which has previously been initialized by calling "qsee_cipher_init"
- The type of parameter which is going to be retrieved from the cipher object
- The address to which the read parameter will be written
- An unknown argument
Note that this was more than just a stroke of luck - taking a peek at the implementation of all the other syscalls reveals that the TrustZone kernel does not perform any validation on QSEE-supplied arguments (more specifically, it freely uses any given pointers), meaning that at the time all syscalls were vulnerable.
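To make the missing check concrete, here is a sketch of the kind of validation one would expect a kernel to apply to an application-supplied pointer before writing through it. It is not Qualcomm's fix; the app_region structure and the function name are hypothetical, and a real kernel would derive the region from its own bookkeeping for the calling application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory owned by one QSEE application. */
struct app_region {
    uintptr_t base;
    size_t    size;
};

/* Returns true only if [ptr, ptr + len) lies entirely inside the region.
 * Written with subtractions so that ptr + len can never wrap around. */
static bool range_in_app(const struct app_region *r, uintptr_t ptr, size_t len)
{
    assert(r != NULL);
    if (ptr < r->base)
        return false;
    uintptr_t offset = ptr - r->base;
    if (offset > r->size)
        return false;
    return len <= r->size - offset;
}
```

A syscall such as qsee_cipher_get_param would then refuse to write unless range_in_app succeeds for the output pointer and the number of bytes it is about to store.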
For the sake of our exploit, we'll stick to qsee_cipher_get_param, since we've already started reviewing it.
As always, before we start writing an exploit, let's try and improve our primitives. This is nearly always worth our while; the more time we spend on improving the primitives, the cleaner and more robust our exploit will be. We might even end up saving time in the long-run.
Right now we have an uncontrolled-write primitive - we can write some uncontrolled data from our cipher object to a controlled memory location. Of course, it would be much easier if we were able to control the written data as well.
Intuitively, since "qsee_cipher_get_param" is used to read a parameter from a cipher object, it stands to reason that there would be a matching function which is used to set the parameter. Indeed, searching for "qsee_cipher_set_param" in the widevine application confirms our suspicion:
Let's take a look at the implementation of this syscall:
It looks like we can set the parameter's value by using the same param_type value (3), and supplying a pointer to a controlled memory region within QSEE which will contain the byte we would later like to write. The TrustZone kernel will happily store the value we supplied in the cipher object, allowing us to later write that value to any address by calling qsee_cipher_get_param with our target pointer.
Putting this together, we now have a relatively clean write-what-where primitive. Here's a run-down of our new primitive:
- Initialize a cipher object using qsee_cipher_init
- Allocate a buffer in QSEE
- Write the wanted byte to our allocated QSEE buffer
- Call qsee_cipher_set_param using our QSEE-allocated buffer as the param_value argument
- Call qsee_cipher_get_param, but supply the target address as the output argument
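The steps above can be sketched as a small simulation. Everything here is a toy model: memory is a flat 64-byte array split into an "application" half and a "kernel" half, the sim_cipher_* functions are stand-ins for the real syscalls (the unknown fourth argument is omitted), and the names write_byte and write_buf are hypothetical. The point is only to show how a one-byte set/get pair turns into an arbitrary write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARAM_TYPE  3u    /* the param_type value used in the post */
#define MEM_SIZE    64u
#define KERNEL_BASE 32u   /* bytes [32, 64) play the TrustZone kernel */

static uint8_t mem[MEM_SIZE];          /* toy flat address space */

struct sim_cipher { uint8_t param; };  /* toy cipher object */

/* Stand-in for qsee_cipher_set_param: copy one byte from src into the object. */
static int sim_cipher_set_param(struct sim_cipher *c, uint32_t type, uint32_t src)
{
    assert(c != NULL);
    if (type != PARAM_TYPE || src >= MEM_SIZE)
        return -1;
    c->param = mem[src];
    return 0;
}

/* Stand-in for qsee_cipher_get_param: writes to dst without checking that
 * dst belongs to the caller, which is the bug being modelled. */
static int sim_cipher_get_param(const struct sim_cipher *c, uint32_t type, uint32_t dst)
{
    assert(c != NULL);
    if (type != PARAM_TYPE || dst >= MEM_SIZE)  /* toy array bounds only */
        return -1;
    mem[dst] = c->param;
    return 0;
}

/* Write one chosen byte to any address, using a scratch byte in "QSEE". */
static int write_byte(struct sim_cipher *c, uint32_t scratch, uint32_t target, uint8_t value)
{
    if (scratch >= KERNEL_BASE)                 /* scratch must be app memory */
        return -1;
    mem[scratch] = value;                       /* place the wanted byte */
    if (sim_cipher_set_param(c, PARAM_TYPE, scratch) != 0)
        return -1;
    return sim_cipher_get_param(c, PARAM_TYPE, target);
}

/* Repeat the single-byte write to copy a whole buffer. */
static int write_buf(struct sim_cipher *c, uint32_t scratch, uint32_t target,
                     const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (write_byte(c, scratch, target + (uint32_t)i, data[i]) != 0)
            return -1;
    }
    return 0;
}
```

Calling write_buf with a target at or above KERNEL_BASE lands the bytes in the "kernel" half of the array, even though the caller only ever touched its own scratch byte directly.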
Writing an Exploit
Using the primitives we just crafted, we finally have full read-write access to the TrustZone kernel. All that's left is to achieve code-execution within the TrustZone kernel in a controllable way.
The first obvious choice would be to write some shellcode into the TrustZone kernel's code segments and execute it. However, there's a tiny snag - the TrustZone kernel's code segments in newer devices are protected by special memory protection units (called XPUs), which prevent us from directly modifying the kernel's code (along with many different protected memory regions). We could still modify the kernel's code (more information in the next blog post!), but it would be much harder...
...However, we have already come across a piece of dynamically allocated code in the "Secure World" - the QSEE applications themselves!
So here's a plan - if we could ignore the access-protection bits on the code pages of the QSEE applications (since they are all marked as read-execute), we should be able to directly modify them from the context of the TrustZone kernel. Then, we could simply jump to our newly-created code from the context of the kernel in order to execute any piece of code we'd like.
Luckily, ignoring the access-protection bits can actually be done without modifying the translation table at all, by using a convenient feature of the ARM MMU called "domains".
In the ARM translation table, each entry has a field which lists its permissions, as well as a 4-bit field denoting the "domain" to which the translation belongs.
Within the ARM MMU, there is a register called the DACR (Domain Access Control Register). This 32-bit register has 16 pairs of bits, one pair for each domain, which select how accesses to translations of the given domain are checked: "no access" (0b00, every access faults), "client" (0b01, accesses are checked against the translation's permission bits) or "manager" (0b11, accesses are not checked against the permission bits at all).
Whenever the processor attempts to access a given memory address, the MMU looks up the domain of the translation for that address and the corresponding pair of bits in the DACR. If the domain is in "client" mode, the access is checked against the access permissions of the translation, and a fault is generated if they forbid it.
If the domain is in "manager" mode, however, the permission bits are ignored entirely - no permission fault is generated and the access is allowed.
This means that simply setting the DACR's value to 0xFFFFFFFF will cause the MMU to enable access to any mapped memory address, for both read and write access, without generating a fault (and more importantly, without having to modify the translation table).
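As a concrete illustration of the register layout: in the ARMv7-A architecture, each domain's two-bit field in the DACR encodes 0b00 for "no access", 0b01 for "client" (permission bits are checked) and 0b11 for "manager" (permission bits are not checked). The helpers below are a host-side sketch with made-up names that only manipulate the 32-bit value; on the device, the value would be written to the DACR (CP15 c3, c0, 0) by privileged code.

```c
#include <assert.h>
#include <stdint.h>

enum dacr_access {
    DACR_NO_ACCESS = 0x0,  /* every access to the domain faults */
    DACR_CLIENT    = 0x1,  /* accesses checked against permission bits */
    DACR_MANAGER   = 0x3   /* accesses not checked against permission bits */
};

/* Extract the two-bit field of one domain (0..15). */
static enum dacr_access dacr_get(uint32_t dacr, unsigned domain)
{
    assert(domain < 16);
    return (enum dacr_access)((dacr >> (2u * domain)) & 0x3u);
}

/* Return a copy of dacr with the field of one domain replaced. */
static uint32_t dacr_set(uint32_t dacr, unsigned domain, enum dacr_access access)
{
    assert(domain < 16);
    unsigned shift = 2u * domain;
    return (dacr & ~(0x3u << shift)) | ((uint32_t)access << shift);
}

/* Put all 16 domains in manager mode. */
static uint32_t dacr_all_manager(void)
{
    uint32_t dacr = 0;
    for (unsigned domain = 0; domain < 16; domain++)
        dacr = dacr_set(dacr, domain, DACR_MANAGER);
    return dacr;
}
```

dacr_all_manager() evaluates to 0xFFFFFFFF, which is exactly the value used in this exploit: every domain ends up with permission checks switched off.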
Moreover, the TrustZone kernel already has a piece of code that is used to set the value of the DACR, which we can simply call using our own value (0xFFFFFFFF) in order to fully set the DACR.
|TrustZone kernel function which sets the DACR|
All that said and done, we're still missing a key component in our exploit! All we have right now is read/write access to the TrustZone kernel; we still need a way to execute arbitrary functions within the TrustZone kernel and restore execution. This would allow us to change the DACR using the gadget above and subsequently write and execute shellcode in the "Secure World".
As we've seen, most QSEE system-calls are invoked indirectly by using a set of globally-stored pointers, each pointing to a corresponding system-call table.
While the system-call tables themselves are located in a memory region that is protected by an XPU, the pointers to these tables are not protected in any way! This is because they are only populated during runtime, and as such must reside in a modifiable memory region.
This little tidbit actually makes it much simpler for us to hijack code execution in the kernel in a controllable manner!
All we need to do is allocate our own "fake" system-call table. Our table would be identical to the real system-call table, apart from a single "poisoned" entry, which would point to a function of our choice (instead of pointing to the original syscall handler).
It should be noted that since we don't want to cause any adverse effects for other QSEE applications, it is important that we choose to modify an entry corresponding to an unused (or rarely used) system call.
Once we've crafted the "fake" syscall table, we can simply use our write primitive in order to modify the global syscall table pointer to point to our newly created "fake" table.
Then, whenever the "poisoned" system-call is invoked from QSEE, our function will be executed within the context of the TrustZone kernel! Not only that, but app_syscall_handler will also conveniently make sure the return value from our executed code will be returned to QSEE upon returning from the SVC call.
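The table swap can be modelled in a few lines of ordinary C. This is a simulation under stated assumptions, not exploit code: the table contents, the single-argument handler signature, and the names dispatch, install_fake_table and kernel_gadget are all invented for the example, and the pointer is changed with a plain assignment where the real exploit would use the write primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t (*syscall_handler_t)(uint32_t arg);

struct syscall_entry { uint32_t number; syscall_handler_t handler; };

#define TABLE_LEN 3

/* Hypothetical handlers standing in for real syscalls. */
static uint32_t sys_a(uint32_t x)      { return x + 1; }
static uint32_t sys_b(uint32_t x)      { return x * 2; }
static uint32_t sys_unused(uint32_t x) { (void)x; return 0; }

/* Stands in for the kernel function we want the dispatcher to call. */
static uint32_t kernel_gadget(uint32_t x) { return 0x1000u + x; }

/* The "protected" table is const; the pointer to it is writable. */
static const struct syscall_entry real_table[TABLE_LEN] = {
    { 0x01, sys_a }, { 0x02, sys_b }, { 0x03, sys_unused },
};
static const struct syscall_entry *g_table = real_table;

/* Simplified dispatcher: always goes through the global pointer. */
static uint32_t dispatch(uint32_t number, uint32_t arg)
{
    for (size_t i = 0; i < TABLE_LEN; i++) {
        if (g_table[i].number == number)
            return g_table[i].handler(arg);
    }
    return 0xFFFFFFFFu; /* unknown syscall */
}

static struct syscall_entry fake_table[TABLE_LEN];

/* Copy the real table, poison one entry, then repoint the global pointer. */
static int install_fake_table(uint32_t poisoned, syscall_handler_t target)
{
    assert(target != NULL);
    memcpy(fake_table, real_table, sizeof fake_table);
    for (size_t i = 0; i < TABLE_LEN; i++) {
        if (fake_table[i].number == poisoned) {
            fake_table[i].handler = target;
            g_table = fake_table;
            return 0;
        }
    }
    return -1; /* no such syscall, pointer left untouched */
}
```

After install_fake_table(0x03, kernel_gadget), calling dispatch(0x03, 5) runs kernel_gadget and hands its return value back to the caller, while syscalls 0x01 and 0x02 keep behaving exactly as before.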
Putting it all together
By now we have all the pieces we need to write a simple exploit which writes a chunk of shellcode in the "Secure World", executes that shellcode in the context of the TrustZone kernel, and restores execution.
Here's what we need to do:
- Allocate a "fake" syscall table in QSEE
- Use the write primitive to overwrite the syscall table pointer to point to our crafted "fake" syscall table
- Set the single "poison" syscall entry in the "fake" syscall table to point to the DACR-modifying function in the TrustZone kernel
- Invoke the "poison" syscall in order to call the DACR-modifying function in the TrustZone kernel - thus setting the DACR to 0xFFFFFFFF
- Use the write gadget to write our shellcode directly to a code page in QSEE belonging to our QSEE application
- Invalidate the instruction cache (to avoid conflicts with the newly written code)
- Set the single "poison" syscall entry in the "fake" syscall table to point to the written shellcode
- Invoke the "poison" syscall in order to jump to our newly-written shellcode from the context of the TrustZone kernel!
Playing With The Code
As always, the full exploit source code is available here:
The exploit builds upon the previous QSEE exploit, in order to achieve QSEE code-execution. If you'd like to play around with it, you might want to use the following two useful functions:
- tzbsp_execute_function - calls the given function with the given arguments within the context of the TrustZone kernel.
- tzbsp_load_and_exec_file - Loads the shellcode from a given file and executes it within the context of the TrustZone kernel.
I've also included a small shell script called "build_shellcode.sh", which can be used to build the shellcode supplied in the file "shellcode.S" and write it into a binary blob (which can then be loaded and executed using the function above).
- 13.10.2015 - Vulnerability disclosed and minimal PoC sent
- 15.10.2015 - Initial response from Google
- 16.10.2015 - Full exploit sent to Google
- 30.03.2016 - CVE assigned
- 02.05.2016 - Issue patched and released in the Nexus public bulletin
As there was no public research into QSEE up to that point, this issue wasn't discovered. Hopefully in the future further research into QSEE and TrustZone in general will help uncover similar issues and make the security boundary between QSEOS and QSEE stronger.