Some fun with vintage bugs and driver signing enforcement
Tablets with Windows, Secure Boot, TPM and BitLocker are all fun and games until you want to install some free utility (such as FreeOTFE). Long story short, you have to disable BitLocker, or buy a higher-than-Home edition of Windows (to add a -pw protector so you can turn off Secure Boot). Bummer.
There are of course other ways around this. Unfortunately the vbox exploit is not very suitable for my Bay Trail tablet: enforced WHQL won't even allow the driver to load. However, this being an old hat trick, googling turned up an interesting approach: simply load older, buggier Microsoft drivers (which carry a WHQL signature) on modern Windows, exploit them, and use that to bypass DSE.
We'll keep using the same bug class, but try to find a more reliable
target than mountmgr.sys, as dealing with a trashed SYSTEM thread stack and SMEP is pretty annoying.
Using RtlQueryRegistryValues bugs as a convenient copy-what-where
Maybe we can do better - don't try to hijack the instruction pointer at all, but abuse RtlQueryRegistryValues
to simply write a 0 byte into CI.DLL!ci_Options, thus removing signed driver enforcement
- all without running any shellcode.
IDA has turned up an interesting candidate, termdd.sys from Windows 7:

Bonus point for reading the registry before trying to create a device - allowing us to trigger the bug even when the driver is already present under original service name. Here is the driver code:

According to MJ0011's CVE-2010-4398, this is not very encouraging for REG_BINARY - all
EntryContext buffers are initialized to 0. However this is not the only exploitable
condition - in particular, REG_SZ entries are fine with 0, the structure of UNICODE_STRING
they expect is as follows:
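The original listing did not survive here. As a stand-in, here is a minimal user-mode sketch of the documented UNICODE_STRING layout as it appears on 32-bit x86. The type name MODEL_UNICODE_STRING32 and the helper is_empty_string32 are made up for illustration, and fixed-width integers replace USHORT and PWSTR so the layout can be inspected on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode model of UNICODE_STRING as laid out on 32-bit x86.
   Hypothetical name; the real type uses USHORT, USHORT, PWSTR.
   (On x64 the pointer is 8 bytes wide and sits at offset 8.) */
typedef struct {
    uint16_t Length;        /* bytes in use, not counting the terminator */
    uint16_t MaximumLength; /* capacity of Buffer, in bytes */
    uint32_t Buffer;        /* 32-bit pointer to the UTF-16 data */
} MODEL_UNICODE_STRING32;

/* An all-zero EntryContext reads as a valid empty string with a NULL
   Buffer, which is why zero-initialization does not stop REG_SZ. */
static int is_empty_string32(const MODEL_UNICODE_STRING32 *s)
{
    return s->Length == 0 && s->MaximumLength == 0 && s->Buffer == 0;
}
```

The whole structure is 8 bytes on x86, so two adjacent 32-bit stack slots are enough to form one.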

Now we need to study RtlQueryRegistryValues, its REG_SZ behavior in particular.
When Buffer is NULL, RtlQueryRegistryValues will simply allocate the string,
overwrite the first two fields with the string length from the registry, and store in Buffer a pointer to
the freshly allocated copy of the registry string value.
When Buffer is not NULL, the string data from the registry will be stored at the Buffer address, but ONLY
if the MaximumLength field is big enough to hold the string. Using a large registry string value you can avoid
dereferencing a junk Buffer pointer - notice t3_data being 1.
The behavior of REG_SZ where the Buffer fields are dereferenced and written to will be used
as our write-what-where vector.
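To make the two cases concrete, here is a simplified user-mode model of the REG_SZ store as described above. It is a sketch, not the kernel routine: model_ustr, model_store_reg_sz and the MODEL_* status codes are invented names, malloc stands in for the pool allocation, and only the behavior spelled out in the text is reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for UNICODE_STRING with a native pointer. */
typedef struct {
    uint16_t Length;
    uint16_t MaximumLength;
    uint16_t *Buffer;
} model_ustr;

enum { MODEL_OK = 0, MODEL_TOO_SMALL = 1, MODEL_NO_MEMORY = 2, MODEL_BAD_VALUE = 3 };

/* value/value_len: raw registry string bytes including the trailing NUL. */
static int model_store_reg_sz(model_ustr *dst, const void *value, uint16_t value_len)
{
    if (value_len < sizeof(uint16_t))
        return MODEL_BAD_VALUE;            /* model only: need at least the NUL */

    if (dst->Buffer == NULL) {
        /* Case 1: no buffer supplied, so one is allocated for the caller. */
        dst->Buffer = malloc(value_len);
        if (dst->Buffer == NULL)
            return MODEL_NO_MEMORY;
        dst->MaximumLength = value_len;
    } else if (value_len > dst->MaximumLength) {
        /* Case 2a: string does not fit, Buffer is never dereferenced. */
        return MODEL_TOO_SMALL;
    }

    /* Case 2b (and case 1): the copy. With a caller-supplied Buffer that
       the attacker controls, this is the write-what-where. */
    memcpy(dst->Buffer, value, value_len);
    dst->Length = (uint16_t)(value_len - sizeof(uint16_t));
    return MODEL_OK;
}
```

Calling it with a non-NULL Buffer and a large MaximumLength writes the registry bytes wherever Buffer points; calling it with a small MaximumLength leaves the target untouched.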
And it gets better - REG_MULTI_SZ will do the same as REG_SZ, but will keep storing the
individual sub-strings as a consecutive array of UNICODE_STRINGs - allowing us to smash the stack
as much as we like, as long as each Buffer turns out to be NULL. It will also
silently skip over fields which it can't write to - those where MaximumLength is smaller than the size of the registry string.
We can come up with the following registry payload:
1. FlowControlDisable - t[0] field - will be REG_MULTI_SZ of 2 substrings (L"x\0x\0\0" in C). This will overwrite t2_buf (as well as things before it) with some junk (we need it non-zero).
2. FlowControlDisplayBandwidth - t[1] field - will be a REG_DWORD - an address we want to overwrite in kernel - it will be stored as is into t1_buf.
3. FlowControlChannelBandwidth - t[2] - will be REG_SZ, value being simply the bytes we want to write at the address we've chosen before via DWORD.
Notice that step 1 overwrote t2_buf with non-zeros, and step 2 stored our controlled value in t1_buf.
In step 3, t2_buf (=>{Length,MaximumLength}) and t1_buf (=>Buffer) will be interpreted together as one (very long)
UNICODE_STRING - with its Buffer field fully controlled by us. The string value in the registry is then simply the payload,
copied to the address we chose via the t[1] DWORD.
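The reinterpretation in step 3 can be sketched in user-mode C. This is only a model under assumptions: the two slots are taken to be adjacent 32-bit locals in the order the text gives (t2_buf first, t1_buf right after), the junk value is made up, and the type and function names are hypothetical. A little-endian host is assumed, as on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32-bit UNICODE_STRING model (hypothetical name). */
typedef struct {
    uint16_t Length;
    uint16_t MaximumLength;
    uint32_t Buffer;
} model_ustr32;

/* Two adjacent stack slots, named after the locals in the text. */
typedef struct {
    uint32_t t2_buf;
    uint32_t t1_buf;
} model_frame32;

/* Steps 1 and 2: junk lands in t2_buf, our DWORD lands in t1_buf. */
static model_frame32 apply_payload(uint32_t junk, uint32_t target_address)
{
    model_frame32 f;
    f.t2_buf = junk;            /* step 1: side effect of the REG_MULTI_SZ */
    f.t1_buf = target_address;  /* step 2: REG_DWORD stored as is */
    return f;
}

/* Step 3: the same 8 bytes, read the way the REG_SZ handler reads them. */
static model_ustr32 view_as_string(const model_frame32 *f)
{
    model_ustr32 s;
    assert(sizeof s == sizeof *f);
    memcpy(&s, f, sizeof s);
    return s;
}
```

For example, apply_payload(0x8A3C5F10, 0x80123456) viewed as a string gives Length 0x5F10, MaximumLength 0x8A3C and Buffer 0x80123456, so any reasonably sized REG_SZ passes the MaximumLength check and is copied to the chosen address.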
On AMD64
... the situation complicates a bit. We can't just overwrite ci_Options willy-nilly - PatchGuard would
vomit a death smiley at us (Win8+). Instead, we'd want to save the original value first, then overwrite
it with 0, load our unsigned driver, and restore the saved value. Fortunately, we got very lucky
with stack layout on x64:

Having the EntryContext stack destination in front of the table is pretty convenient, as we can just
reshape the rest of the table as we like and achieve a true memcpy (controlling both the source - DefaultData - and the destination
- EntryContext).
- FlowControlDisable will be a REG_SZ entry; the important bit is to make v14 become non-0, since it will be used as a REG_BINARY buffer length field by t[1] - FlowControlDisplayBandwidth.
- FlowControlDisplayBandwidth will be REG_BINARY, and will overwrite the contents of RTL_QUERY_REGISTRY_TABLE[5] on the stack.
- The kernel routine continues with the t[2] and t[3] fields, which are fully controlled through the REG_BINARY value. One will be configured to save the original ci_Options, the second will be used to write a 0 byte into ci_Options right after we make our "backup" - both use the REG_DWORD type, and it becomes memcpy(dst=EntryContext, src=DefaultData, 1). Name is pointed to some random junk so that the default value fallback triggers; as long as it's valid kernel memory, it's fine.
There is one caveat - v14 is the low 32 bits of an ExAllocatePool result, and it can be either a positive or a negative
LONG. REG_BINARY behaves differently depending on the sign
- for a negative value, it omits the {Length,Type} preamble.
This means our table would be dumped on the stack off by 8 bytes, and at first we can't guess where the pool sits (i.e. which sign bit it delivers).
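The sign-dependent behavior can be modelled in a few lines of user-mode C. This is a sketch of the rule for non-string values larger than a ULONG, not the kernel code; model_store_reg_binary and MODEL_REG_BINARY are invented names, and a return value of 0 stands in for the "buffer too small" failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_REG_BINARY 3u   /* value of REG_BINARY */

/* ctx models the EntryContext buffer, whose first LONG holds the size.
   Returns the number of bytes written at ctx, or 0 if nothing fit. */
static size_t model_store_reg_binary(uint8_t *ctx, const void *data, uint32_t data_len)
{
    int32_t len;
    memcpy(&len, ctx, sizeof len);

    if (len < 0) {
        /* Negative: capacity is -len, and only the raw data is stored. */
        uint64_t cap = (uint64_t)(-(int64_t)len);
        if (cap < data_len)
            return 0;
        memcpy(ctx, data, data_len);
        return data_len;
    }

    /* Positive: an 8-byte {Length, Type} preamble goes in first. */
    if ((uint64_t)len < (uint64_t)data_len + 8u)
        return 0;
    uint32_t type = MODEL_REG_BINARY;
    memcpy(ctx, &data_len, 4);
    memcpy(ctx + 4, &type, 4);
    memcpy(ctx + 8, data, data_len);
    return (size_t)data_len + 8u;
}
```

With the same 12 bytes of data, a buffer starting with +32 ends up with the data at offset 8, while one starting with -32 ends up with the data at offset 0. That 8-byte difference is exactly what the crafted table has to account for.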
What we do is this - first assume the number will be positive and the 8-byte "padding" is present.
If the guess turns out to be wrong, the misaligned structure will terminate RtlQueryRegistryValues
through a documented invalid parameter combination,
namely specifying a non-null QueryRoutine together with RTL_QUERY_REGISTRY_DIRECT (=0x20) in Flags. Remember,
the table looks like this on x64:
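The original listing is missing at this point. As a reference, here is a user-mode model of the documented RTL_QUERY_REGISTRY_TABLE members with their x64 offsets. The type name model_query_table64 and the helper are hypothetical, uint64_t stands in for the pointer-sized members, and the offsets assume a host ABI that aligns uint64_t to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of RTL_QUERY_REGISTRY_TABLE on x64 (hypothetical type name). */
typedef struct {
    uint64_t QueryRoutine;   /* +0x00 callback pointer                   */
    uint32_t Flags;          /* +0x08 then 4 bytes of padding            */
    uint64_t Name;           /* +0x10 PWSTR value name                   */
    uint64_t EntryContext;   /* +0x18 destination for direct queries     */
    uint32_t DefaultType;    /* +0x20 then 4 bytes of padding            */
    uint64_t DefaultData;    /* +0x28 source used when Name is not found */
    uint32_t DefaultLength;  /* +0x30 then 4 bytes of tail padding       */
} model_query_table64;

/* Returns 1 when the host compiler produced the x64 layout shown above. */
static int model_layout_matches_x64(void)
{
    return sizeof(model_query_table64) == 56
        && offsetof(model_query_table64, Flags) == 8
        && offsetof(model_query_table64, Name) == 16
        && offsetof(model_query_table64, EntryContext) == 24
        && offsetof(model_query_table64, DefaultData) == 40;
}
```

The important detail is that QueryRoutine, Flags and Name sit at offsets 0, 8 and 16, each in its own 8-byte slot.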

When our guess is wrong and the 8-byte padding is not present, the presumed Name will be interpreted as Flags.
Remember, Name is just some random pointer in kernel range we made up. So we also make sure it has bit 0x20
set, thus becoming RTL_QUERY_REGISTRY_DIRECT as well. Moreover, the presumed Flags will
actually be interpreted as QueryRoutine - thus completing the invalid parameter condition, and the function aborts
without touching any of our invalid pointers.
We then just try again (the pool sign is very likely to remain the same), this time adding the 8-byte padding in the registry, so the structure fields fit correctly.
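The safe-failure trick can be checked with a small user-mode model that works on raw bytes. Everything here is a sketch: the function names, the MODEL_DIRECT constant name and the fake Name pointer are made up, only the first three members of an x64 table entry are modelled (at offsets 0, 8 and 16), and a little-endian host is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_DIRECT 0x20u   /* RTL_QUERY_REGISTRY_DIRECT */

enum { OFF_QUERY_ROUTINE = 0, OFF_FLAGS = 8, OFF_NAME = 16, ENTRY_SIZE = 56 };

/* Non-NULL QueryRoutine together with DIRECT is the invalid combination
   that makes the routine bail out before using any pointer. */
static int model_entry_rejected(const uint8_t *entry)
{
    uint64_t routine;
    uint32_t flags;
    memcpy(&routine, entry + OFF_QUERY_ROUTINE, sizeof routine);
    memcpy(&flags, entry + OFF_FLAGS, sizeof flags);
    return routine != 0 && (flags & MODEL_DIRECT) != 0;
}

/* Crafted entry: NULL QueryRoutine, DIRECT flag, and a made-up Name
   pointer that is forced to have bit 0x20 set. */
static void model_build_entry(uint8_t out[ENTRY_SIZE], uint64_t fake_name)
{
    uint32_t flags = MODEL_DIRECT;
    fake_name |= MODEL_DIRECT;
    memset(out, 0, ENTRY_SIZE);
    memcpy(out + OFF_FLAGS, &flags, sizeof flags);
    memcpy(out + OFF_NAME, &fake_name, sizeof fake_name);
}
```

Parsing the entry at its intended offset is accepted. Parsing it 8 bytes further in, which is what happens when the preamble is absent, reads our Flags as QueryRoutine and our Name as Flags, and the entry is rejected.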
In conclusion
"algorithmic" exploits like this are very robust and self-contained - they can bypass SMEP, CFG with
no need for ROP-magic-constants voodoo. So far I've seen this only with the RtlQueryRegistryValues
class of bugs - mostly thanks to how user-data-driven this API is. A single universal exploit was tested and works on everything from Win7 SP0 up to Win10 preview builds.