Here's a short summary of some of the interesting security things in Sunday's v4.13 release of the Linux kernel:
security documentation ReSTification
The kernel has been switching to formatting documentation with ReST, and I noticed that none of the security documentation tree had been converted yet. I took the opportunity to take a few passes at formatting the existing documentation and, at Jon Corbet's recommendation, split it up between end-user documentation (which is mainly how to use LSMs) and developer documentation (which is mainly how to use various internal APIs). A bunch of these docs need some updating, so maybe with the improved visibility, they'll get some extra attention.
CONFIG_REFCOUNT_FULL
Since Peter Zijlstra implemented the refcount_t API in v4.11, Elena Reshetova (with Hans Liljestrand and David Windsor) has been systematically replacing atomic_t reference counters with refcount_t. As of v4.13, there are now close to 125 conversions, with many more to come. However, there were concerns over the performance characteristics of the refcount_t implementation from the maintainers of the net, mm, and block subsystems. In order to assuage these concerns and help the conversion progress continue, I added an unchecked refcount_t implementation (identical to the earlier atomic_t implementation) as the default, with the fully checked implementation now available under CONFIG_REFCOUNT_FULL. The plan is that for v4.14 and beyond, the kernel can grow per-architecture implementations of refcount_t that have performance characteristics on par with atomic_t (as done in grsecurity's PAX_REFCOUNT).
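To make the difference between an unchecked and a checked counter concrete, here is a rough userspace sketch using C11 atomics. It is not the kernel's code: the `sketch_` names are hypothetical, and the only behavior it tries to show is that a checked increment saturates instead of wrapping and refuses to bring a zero count back to life.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference counter. */
struct sketch_refcount {
    atomic_uint refs;
};

/* Unchecked increment: behaves like a plain atomic counter and wraps
 * from UINT_MAX back to 0, which is what an overflow attack relies on. */
static inline void sketch_inc_unchecked(struct sketch_refcount *r)
{
    atomic_fetch_add(&r->refs, 1);
}

/* Checked increment: returns false if the count is already 0 (the object
 * may have been freed), and pins the count at UINT_MAX once saturated so
 * the object is leaked rather than freed while still in use. */
static inline bool sketch_inc_checked(struct sketch_refcount *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0)
            return false;
        if (old == UINT_MAX)
            return true; /* saturated: leave the value untouched */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

    return true;
}
```

The compare-and-exchange loop is the extra cost compared to a single atomic add, which gives a feel for why performance was a concern for hot paths.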
CONFIG_FORTIFY_SOURCE
Daniel Micay created a version of glibc's FORTIFY_SOURCE compile-time and run-time protection for finding overflows in the common string (e.g. strcpy) and memory (e.g. memcpy) functions. The idea is that since the compiler already knows the size of many of the buffer arguments used by these functions, it can build in checks for buffer overflows. When all the sizes are known at compile time, this can actually allow the compiler to fail the build instead of continuing with a proven overflow. When only some of the sizes are known (e.g. destination size is known at compile-time, but source size is only known at run-time), run-time checks are added to catch any cases where an overflow might happen. Adding this found several places where minor leaks were happening, and Daniel and I chased down fixes for them.
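The run-time half of this idea can be sketched in a few lines of userspace C. This is only an illustration under assumptions, not the kernel's implementation: `sketch_checked_copy` and `SKETCH_MEMCPY` are hypothetical names, and the sketch returns an error where a real fortified function would abort. It relies on the GCC/Clang `__builtin_object_size()` builtin, which evaluates to `(size_t)-1` when the compiler cannot determine the size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy only if the destination size, as known to the
 * compiler, can hold len bytes. (size_t)-1 means "size unknown", in which
 * case no check is possible and the copy proceeds unchecked. */
static inline int sketch_checked_copy(void *dst, size_t dst_size,
                                      const void *src, size_t len)
{
    if (dst_size != (size_t)-1 && len > dst_size)
        return -1; /* would overflow the destination */
    memcpy(dst, src, len);
    return 0;
}

/* The macro captures the compiler's knowledge of the destination size
 * at the call site, where the buffer's declaration is visible. */
#define SKETCH_MEMCPY(dst, src, len) \
    sketch_checked_copy((dst), __builtin_object_size((dst), 0), (src), (len))
```

With a destination declared as `char buf[8]`, a call such as `SKETCH_MEMCPY(buf, src, 16)` is rejected, while a 4-byte copy goes through.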
One interesting note about this protection is that it only examines the size of the whole object (via __builtin_object_size(..., 0)). If you have a string within a structure, CONFIG_FORTIFY_SOURCE as currently implemented will make sure only that you can't copy beyond the structure (but therefore, you can still overflow the string within the structure). The next step in enhancing this protection is to switch from 0 (above) to 1, which will use the closest surrounding subobject (e.g. the string). However, there are a lot of cases where the kernel intentionally copies across multiple structure fields, which means more fixes before this higher level can be enabled.
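The difference between the two modes is easy to see with a string embedded at the start of a structure. The following is a small sketch for GCC or Clang; the structure and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical structure: a string followed by another field. */
struct sketch_record {
    char name[8];
    int flags;
};

static struct sketch_record sketch_rec;

/* Mode 0: bytes from the pointer to the end of the whole object,
 * so a copy into name may still run over flags without being caught. */
static inline size_t sketch_whole_object_size(void)
{
    return __builtin_object_size(sketch_rec.name, 0);
}

/* Mode 1: bytes from the pointer to the end of the closest
 * surrounding subobject, i.e. just the name array. */
static inline size_t sketch_subobject_size(void)
{
    return __builtin_object_size(sketch_rec.name, 1);
}
```

Here mode 0 reports the size of the entire structure, while mode 1 reports only the 8 bytes of `name`, which is the stricter bound the next step would enforce.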
NULL-prefixed stack canary
Rik van Riel and Daniel Micay changed how the stack canary is defined on 64-bit systems to always make sure that the leading byte is zero. This provides a deterministic defense against overflowing string functions (e.g. strcpy), since they will either stop an overflowing read at the NULL byte, or be unable to write a NULL byte, thereby always triggering the canary check. This does reduce the entropy from 64 bits to 56 bits for overflow cases where NULL bytes can be written (e.g. memcpy), but the trade-off is worth it. (Besides, x86_64's canary was 32 bits until recently.)
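As an illustration of why the zero byte helps, here is a userspace sketch that masks a random 64-bit value and then looks at it the way a string function would. The function names are hypothetical, and the byte order is written out explicitly as little-endian (as on x86_64) so the result does not depend on the machine running the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clear the least significant byte, which is the first byte in memory
 * on a little-endian system. 56 random bits remain. */
static inline uint64_t sketch_mask_canary(uint64_t random_value)
{
    return random_value & ~(uint64_t)0xff;
}

/* Lay the canary out in little-endian byte order and report how many
 * of its bytes a C string function would process before stopping. */
static inline size_t sketch_string_visible_bytes(uint64_t canary)
{
    unsigned char bytes[9];

    for (int i = 0; i < 8; i++)
        bytes[i] = (unsigned char)(canary >> (8 * i));
    bytes[8] = 0; /* terminator so strlen() always stops */

    return strlen((const char *)bytes);
}
```

A masked canary yields 0 here: a string-based read stops before leaking any canary bytes, and a string-based write cannot reproduce the leading zero while continuing past it.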
IPC refactoring
Partially in support of allowing IPC structure layouts to be randomized by the randstruct plugin, Manfred Spraul and I reorganized the internal layout of how IPC is tracked in the kernel. The resulting allocations are smaller and much easier to deal with, even if I initially missed a few needed container_of() uses.
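For readers unfamiliar with the pattern, container_of() recovers a pointer to an enclosing structure from a pointer to one of its members, which keeps working no matter where the member ends up in the layout. The following is a simplified userspace sketch; the macro is a reduced version without the kernel's type checking, and the structure names are hypothetical rather than the real IPC structures.

```c
#include <stddef.h>

/* Simplified version of the pattern: step back from the member's
 * address by the member's offset within the containing type. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical shared header embedded in a larger object. */
struct sketch_perm {
    int id;
};

/* Hypothetical containing object; the embedded member is deliberately
 * not first, so a plain pointer cast would give the wrong address. */
struct sketch_queue {
    long messages;
    struct sketch_perm perm;
};

static inline struct sketch_queue *sketch_queue_from_perm(struct sketch_perm *p)
{
    return sketch_container_of(p, struct sketch_queue, perm);
}
```

A direct cast from the member pointer to the containing type only works when the member sits at offset zero, which is exactly the kind of assumption layout randomization breaks.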
randstruct gcc plugin
I ported grsecurity's clever randstruct gcc plugin to upstream. This plugin allows structure layouts to be randomized on a per-build basis, providing a probabilistic defense against attacks that need to know the location of sensitive structure fields in kernel memory (which is most attacks). By moving things around in this fashion, attackers need to perform much more work to determine the resulting layout before they can mount a reliable attack.
Unfortunately, due to the timing of the development cycle, only the manual mode of randstruct landed in upstream (i.e. marking structures with __randomize_layout). v4.14 will also have the automatic mode enabled, which randomizes all structures that contain only function pointers.
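One practical consequence for code that fills in such structures is that it cannot depend on declaration order. The sketch below shows the shape of a function-pointer-only structure initialized by field name. It is userspace C with hypothetical names, and the marker macro expands to nothing here because the real attribute only exists when building with the plugin.

```c
/* Stand-in for the layout-randomization marker; an assumption for this
 * sketch, it expands to nothing outside a plugin-enabled build. */
#define SKETCH_RANDOMIZE_LAYOUT

/* Hypothetical operations table containing only function pointers,
 * the kind of structure the automatic mode would pick up. */
struct sketch_ops {
    int (*open)(int);
    int (*close)(int);
} SKETCH_RANDOMIZE_LAYOUT;

static int sketch_open(int fd)
{
    return fd + 1;
}

static int sketch_close(int fd)
{
    return fd - 1;
}

/* Designated initializers bind each function to its field by name, so
 * the table stays correct even if the fields are reordered. A positional
 * initializer would silently assign functions to the wrong slots. */
static const struct sketch_ops sketch_default_ops = {
    .open = sketch_open,
    .close = sketch_close,
};
```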
A large number of fixes to support randstruct have been landing from v4.10 through v4.13, most of which were already identified and fixed by grsecurity, but many were novel, either in newly added drivers, as whitelisted cross-structure casts (like IPC noted above), or in a corner case on ARM found during upstream testing.
lower ELF_ET_DYN_BASE
One of the issues identified from the Stack Clash set of vulnerabilities was that it was possible to collide stack memory with the highest portion of a PIE program's text memory, since the default ELF_ET_DYN_BASE (the lowest possible random position of a PIE executable in memory) was already so high in the memory layout (specifically, 2/3rds of the way through the address space). Fixing this required teaching the ELF loader how to load interpreters as shared objects in the mmap region instead of as a PIE executable (to avoid potentially colliding with the binary it was loading). As a result, the PIE default could be moved down to ET_EXEC (0x400000) on 32-bit, entirely avoiding the subset of Stack Clash attacks. 64-bit could be moved to just above the 32-bit address space (0x100000000), leaving the entire 32-bit region open for VMs to do 32-bit addressing, but late in the cycle it was discovered that Address Sanitizer couldn't handle it moving. With most of the Stack Clash risk only applicable to 32-bit, fixing 64-bit has been deferred until there is a way to teach Address Sanitizer how to load itself as a shared object instead of as a PIE binary.
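The arithmetic behind the change can be worked through for a 32-bit layout. This is a back-of-the-envelope sketch: the 3 GiB task size used in the example is an assumption (a common 32-bit x86 split), the function and macro names are hypothetical, and the "two thirds" formula simply restates the description above.

```c
#include <stdint.h>

/* Old default: two thirds of the way through the address space. */
static inline uint64_t sketch_old_pie_base(uint64_t task_size)
{
    return task_size / 3 * 2;
}

/* New lowest PIE positions as described above. */
#define SKETCH_NEW_PIE_BASE_32 UINT64_C(0x400000)
#define SKETCH_NEW_PIE_BASE_64 UINT64_C(0x100000000)

/* Address space between the lowest PIE position and the top of the
 * task's address space, where the stack lives and grows down from. */
static inline uint64_t sketch_room_above_base(uint64_t task_size, uint64_t base)
{
    return task_size - base;
}
```

With an assumed task size of 0xC0000000, the old base works out to 0x80000000, leaving only 1 GiB between the lowest PIE position and the top of the address space. Moving the base to 0x400000 widens that to 0xBFC00000 bytes, nearly the full 3 GiB.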
early device randomness
I noticed that early device randomness wasn't actually getting added to the kernel entropy pools, so I fixed that to improve the effectiveness of the latent_entropy gcc plugin.
That's it for now; please let me know if I missed anything. As a side note, I was rather alarmed to discover that due to all my trivial ReSTification formatting, and tiny FORTIFY_SOURCE and randstruct fixes, I made it into the most active 4.13 developers list (by patch count) at LWN with 76 patches: a whopping 0.6% of the cycle's patches. ;)
Anyway, the v4.14 merge window is open!
© 2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.