New Linux Kernel Flaw Allows Null Pointer Exploits
Trailrunner7 writes "A new flaw in the latest release of the Linux kernel gives attackers the ability to exploit NULL pointer dereferences and bypass the protections of SELinux, AppArmor and the Linux Security Module. Brad Spengler discovered the vulnerability and found a reliable way to exploit it, giving him complete control of the remote machine. This is somewhat similar to the magic that Mark Dowd performed last year to exploit Adobe Flash. Threatpost.com reports: 'The vulnerability is in the 2.6.30 release of the Linux kernel, and in a message to the Daily Dave mailing list Spengler said that he was able to exploit the flaw, which at first glance seemed unexploitable. He said that he was able to defeat the protection against exploiting NULL pointer dereferences on systems running SELinux and those running typical Linux implementations.'"
Just don't use that version (Score:4, Insightful)
It's important to note that there is almost never any "preferred" or "special" release of Linux to use. And obviously this flaw doesn't affect people that don't use any security modules.
This is not good news, but it's important news. The kernel's not likely to have a "fixed" re-release for this version, although there probably will be patches for it as well. And when in doubt, just don't upgrade. Not very many machines can take advantage of all of the cool bleeding-edge features that come with each release, anyways. Lots of older versions get "adopted" by someone who will continue to maintain that single kernel release.
Re:Just don't use that version (Score:4, Insightful)
It's important to note that there is almost never any "preferred" or "special" release of Linux to use. (...) And when in doubt, just don't upgrade. Not very many machines can take advantage of all of the cool bleeding-edge features that come with each release, anyways. Lots of older versions get "adopted" by someone who will continue to maintain that single kernel release.
As a guess pulled out of my nethers 99% use their distro's default shipping kernel, which means there's maybe a dozen kernels in widespread use with a long tail. Unless you're living on the bleeding edge that's what you want to do, otherwise you have to keep up with and patch stuff like this yourself. I'd much rather trust that than not upgrading or picking some random kernel version and hope it's adopted by someone.
Actually, it's already been fixed (Score:5, Informative)
Re: (Score:3, Interesting)
The code by itself is not fine. The underlying bug is that the kernel is allowing memory at virtual address 0 to be valid. The compiler was designed for an environment where there can never be a valid object at 0, and has chosen the bit pattern 000...000 to be the null pointer. If you want to use C in an environment where 0 can be a valid address
Re:Actually, it's already been fixed (Score:4, Insightful)
Re: (Score:2)
Re: (Score:3, Funny)
Re:Just don't use that version (Score:5, Informative)
I'm very sorry, but you are wrong.
There is no longer an unstable/stable kernel branch difference. Essentially all new kernels are development versions. It is specifically up to the distribution vendors to pick stable kernels out of this continuous release stream.
Mart
Re: (Score:3, Informative)
Sound familiar?
Then they have to upload it to their package management servers and put the fix out there for you to use.
This might not sound like a lot of work, but who needs a new kernel when they are busy with a whole truck full of
Re: (Score:2)
I would expect most distributions to take the fix from .31 and apply it to .30. Most distributions are pretty good at watching for CVEs and other high-importance bug reports and backfitting them. For example, I would expect the fix to show up in the ebuild for Gentoo Real Soon Now.
Re: (Score:2)
I always disable those (Score:3, Interesting)
I always disable those security modules, as they always end up causing incompatibilities and other erratic behavior in software.
Exactly what do they do anyway?
Re:I always disable those (Score:5, Funny)
Re: (Score:3, Interesting)
They ruin otherwise working code that was written in slightly different environments, and for which the very arcane behavior of SELinux has not been tuned. They're also often difficult to write test suites for, especially the unpredictability of SELinux changes, since they affect behavior due to factors entirely outside the control of the particular program author: they affect behavior based on where the package installs the code and what SELinux policies are in place.
It's gotten better: Linux operating sys
Re: (Score:3, Interesting)
SELinux does help prevent tools from executing in locations or in places that are inappropriate: it helps reduce the destructive capabilities of components that are mis-installed, installed without proper permissions, or that have certain classes of errors. It also helps force you to think before doing something foolish, such as running CGI tools that are not in /var/www/cgi-bin/: it's too easy for foolish people to use .htaccess or poorly handled HTTPD include directives to include some very foolish CGI to
Wait, what? (Score:5, Interesting)
So, he's dereferencing tun, and then checking if tun was NULL? Looks like the compiler is performing an incorrect optimisation if it's removing the test, but it's still horribly bad style. This ought to be crashing at the sk = tun->sk line, because the structure is smaller than a page, and page 0 is mapped no-access (I assume Linux does this; it's been standard practice in most operating systems for a couple of decades to protect against NULL-pointer dereferencing). Technically, however, the C standard allows tun->sk to be a valid address, so removing the test is a semantically-invalid optimisation. In practice, it's safe for any structure smaller than a page, because the code should crash before reaching the test.
So, we have bad code in Linux and bad code in GCC, combining to make this a true GNU/Linux vulnerability.
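To make the "smaller than a page" argument concrete, here is a minimal sketch. The struct layout is made up for illustration (struct tun_sketch is not the kernel's real struct), and it assumes a POSIX system where sysconf(_SC_PAGESIZE) reports the page size. All it checks is that reading the sk field through a null base pointer would only touch addresses inside the first page, which is why an unmapped page 0 would normally trap the access.

```c
#include <stddef.h>
#include <unistd.h>

struct sock;                      /* opaque; only a pointer to it is stored */

/* Hypothetical layout, NOT the real kernel struct. */
struct tun_sketch {
    unsigned int flags;
    struct sock *sk;
};

/* Returns 1 if an access to ->sk through a null base pointer would fall
 * entirely inside the first page of the address space, 0 otherwise. */
int sk_access_stays_in_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t end = offsetof(struct tun_sketch, sk) + sizeof(struct sock *);

    return page > 0 && end <= (size_t)page;
}
```

If page 0 can be mapped by the attacker, that trap never fires, which is the part of the story the rest of the thread is arguing about.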
Re: (Score:2, Interesting)
The patch [kerneltrap.org].
Re:Wait, what? (Score:5, Insightful)
I think the compiler is correct. If tun is null, then tun->sk is undefined and the compiler can do whatever optimization it wants.
So when the compiler sees tun->sk it can assume that tun is not null and do the optimization, because IF tun is null, then the program has invoked undefined behavior, which the compiler doesn't have to preserve/handle. (How do you keep the semantics of an undefined program??)
Re: (Score:3, Insightful)
Arguably the compiler is wrong because it's (obviously) not actually impossible for address 0 to refer to valid memory, however against convention and best practices that may be. The very existence of this problem proves that the compiler can NOT assume that tun is not null.
Re: (Score:3, Informative)
If, and only if "tun = 0; if (!tun) ..." be optimised, since NULL is not guaranteed to be 0.
I left your comments about some C programmers out, because they just make you look like an idiot.
A "null pointer constant" is by definition either an integer constant expression with a value of zero, or such an expression cast to (void *).
NULL is guaranteed to be a macro that evaluates to a "null pointer constant", with parentheses around it if needed.
In certain contexts (when assigned to a pointer lvalue, or when compared to a pointer expression), the compiler will replace a "null pointer constant"
Re:Wait, what? (Score:5, Interesting)
No. Technically, if tun is null, dereferencing it in the expression tun->sk invokes undefined behaviour -- not implementation-defined behaviour. It is perfectly valid to remove the test, because no strictly conforming code could tell the difference -- the game is already over once you've dereferenced a null pointer. This is a kernel bug (and not even, as Brad Spengler appears to be claiming, a new class of kernel bug); it's not a GCC bug.
But as other posters have said, it would indeed be a good security feature for GCC to warn when it does this.
Peter
Re: (Score:3, Informative)
No. Technically, if tun is null, dereferencing it in the expression tun->sk invokes undefined behaviour -- not implementation-defined behaviour
I've seen a lot of people claiming that, however (as someone who hacks on a C compiler) there are a few things I take issue with in your assertion.
First, NULL is a preprocessor construct, not a language construct; by the time it gets to the compiler the preprocessor has replaced it with a magic constant[1]. The standard requires that it be defined as some value that may not be dereferenced, which is typically 0 (but doesn't have to be, and isn't on some mainframes). Dereferencing NULL is invalid, however
Re: (Score:2)
But it's (tun->sk) not &(tun->sk). That is: The code is looking at the value of the member sk in the struct pointed to by tun. Looking at this value is undefined if tun is null.
It does not take the address of tun or tun->sk.
Re: (Score:2)
Not any non-null memory address -- only one that points into an object. As there is no object whose address is the null pointer, the dereference is still undefined. And the compiler knows that.
Another way of looking at it is that tun->sk is equivalent to (*tun).sk, which is even more clearly undefined in C.
Peter
Re: (Score:2, Informative)
You are completely wrong and you should learn some C before posting crap like this.
The NULL pointer has the value 0 and no other value. Period. Internally, it can be represented by other bit-patterns than all-0. But the C standard demands that
void *x = 0;
generates the NULL pointer.
The last paragraph is also completely wrong because you fail to realize that the subtraction of two pointers gives an integer and not another pointer.
So: please, please don't post again until you've learnt the abso
Re:Wait, what? (Score:5, Informative)
First, NULL is a preprocessor construct, not a language construct; by the time it gets to the compiler the preprocessor has replaced it with a magic constant[1].
Which must be either "0" or "(void *) 0".
The standard requires that it be defined as some value that may not be dereferenced, which is typically 0 (but doesn't have to be
Not true - the standard requires NULL to be defined as one of the two values given above.
and isn't on some mainframes
There are indeed some platforms where a null pointer is not an all-bits-zero value, but this is achieved by compiler magic behind the scenes. It is still created by assigning the constant value 0 to a pointer, and can be checked for by comparing a pointer with a constant 0.
Re:Wait, what? (Score:4, Informative)
Which must be either "0" or "(void *) 0". ...
There are indeed some platforms where a null pointer is not an all-bits-zero value, but this is achieved by compiler magic behind the scenes. It is still created by assigning the constant value 0 to a pointer, and can be checked for by comparing a pointer with a constant 0.
What you've said is technically true, but doesn't contradict or clarify the post to which you replied in any way, so I'm not sure what your point is.
As you point out, a NULL pointer is a pointer which is represented by "(void *) 0" in the C language. However, where you may be confused is that "(void *) 0 != (int) 0". At least, not always. The compiler is responsible for determining if any "0" is used in a pointer context and casting it to the appropriate value, which may not be the same as numeric "0". So, while it's always possible to check for a NULL pointer by comparing a pointer to 0 in code, the machine may use a different value for NULL pointers. When you check "if(p)", the binary code that is produced will be comparing the value of "p" to the NULL address which is appropriate for the machine on which it is running.
The C FAQ [c-faq.com] has more information.
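A small sketch of the point about null pointer constants, for anyone who wants to poke at it. The function names are made up for this example. All three spellings of the test are source-level comparisons against a null pointer constant, so they should agree with each other regardless of what bit pattern the machine uses for a null pointer.

```c
#include <stddef.h>

/* Each function returns 1 for a null pointer and 0 otherwise. */
int is_null_by_truth(const void *p) { return !p; }
int is_null_by_zero(const void *p)  { return p == 0; }
int is_null_by_macro(const void *p) { return p == NULL; }
```

The literal 0 in is_null_by_zero() is in a pointer context, so the compiler treats it as a null pointer constant rather than as the integer zero.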
Re:Wait, what? (Score:4, Informative)
You're speaking with a voice of authority, which is dangerous because of how incorrect in general your post is.
Others have already pointed out that you are wrong about NULL. Here's precisely what the spec says about the argument to &:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
(((struct foo*)(void*)0)->bar) in particular is none of those things, and your expression is not legal C.
Some apparent dereferences of null pointers are allowed. For instance:
void *a = 0;
void *b = &(*a);
The above is legal not because dereferencing a null pointer is legal, but rather because of an explicit exception to the rule carved out in section 6.5.3.2 of the spec, which says that in this case, the & and * cancel, and "the result is as if both were omitted".
Your expression is neither safe nor portable. If you do need to check the offset of a field in a structure, use the standard library offsetof() macro -- that's what it's for.
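For reference, a minimal sketch of the offsetof() route. The struct here is a hypothetical stand-in for the "struct foo" in the parent post, and the helper function names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical struct standing in for "struct foo" from the thread. */
struct foo {
    char tag;
    long bar;
};

/* Field offsets obtained with the standard macro instead of a
 * hand-rolled cast of a null pointer. */
size_t offset_of_tag(void) { return offsetof(struct foo, tag); }
size_t offset_of_bar(void) { return offsetof(struct foo, bar); }
```

The first member always sits at offset 0; the offset of bar depends on the platform's padding rules, so the sketch deliberately doesn't claim a specific number for it.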
Re: (Score:3, Informative)
The value &(tun->sk) is the address of tun, plus a fixed offset. The expression &(((struct foo*)0)->bar) is valid C and will give the value of the offset of the sk field in the foo struct. A typical definition of NULL is (void*)0, and &(((struct foo*)(void*)0)->bar) will also give the value of the offset of the bar field.
Wrong. If tun is a null pointer, then the only valid operations are the following:
1. Assign tun to a pointer variable of a matching type or of type void*, which will set that variable to a null pointer.
2. Cast tun to another pointer type, which will produce a null pointer.
3. Cast tun to an integral type, which will produce the value 0 (and this is true whatever bit pattern the compiler uses for null pointers)
4. Comparing tun to a pointer of a matching type or type void* using the == or != operators.
Re:Wait, what? (Score:5, Informative)
In this case, it is tun->sk, not &(tun->sk) which is being loaded, however the pointer arithmetic which generates the address happens first. If tun is NULL then this is NULL + {the offset of sk}. While dereferencing NULL is explicitly not permitted, pointer arithmetic on NULL is permitted, and dereferencing any non-NULL memory address is permitted.
Raven, I've seen you make the same comment a few times in this story. Please stop pushing this nonsense.
The language standard calls * and -> operations "dereferencing". The way it works is that tun->sk dereferences the whole struct, then hands you the sk field from it.
When you implement this in your compiler you do an address computation first then load only the field because you don't want to load the whole struct when you don't need to, but that's an implementation detail. The compiler is required to act as if the pointer tun were being dereferenced.
It would be a major missed optimization bug if the compiler didn't eliminate the later if (!tun) operation. This is a case where the input code is simply wrong.
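A rough sketch of the "as if" point, with a made-up struct layout (struct tun_sketch is not the kernel's definition). The first two functions are the same operation spelled two ways at the language level; the third is approximately what the generated code does, and it is only meaningful when tun points at a real object.

```c
#include <stddef.h>

struct sock { int id; };

/* Hypothetical layout for illustration only. */
struct tun_sketch {
    unsigned int flags;
    struct sock *sk;
};

/* Language-level view: both of these dereference tun. */
struct sock *sk_via_arrow(struct tun_sketch *tun) { return tun->sk; }
struct sock *sk_via_star(struct tun_sketch *tun)  { return (*tun).sk; }

/* Implementation-level view: base address plus field offset, then a
 * load of just that one field. */
struct sock *sk_via_offset(struct tun_sketch *tun)
{
    char *base = (char *)tun;

    return *(struct sock **)(base + offsetof(struct tun_sketch, sk));
}
```

For a valid tun all three return the same pointer. For a null tun none of them has defined behaviour, however harmless the address arithmetic in the last one may look.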
Re: (Score:3, Interesting)
If you actually read the exploit code (see: http://grsecurity.net/~spender/cheddar_bay.tgz [grsecurity.net]) the thing that really enables this exploit is one of two ways to map page zero. One of these seems to be a flaw with SELinux (either wi
CFLAGS (Score:4, Informative)
CFLAGS+= -fno-delete-null-pointer-checks
Job done (should work with Gentoo, buggered if I know how to do this in other distros, DYOR), even with -O2/-O3. This is an optimisation/code conflict. The code itself is perfectly valid, so if your CFLAGS are -O -pipe you have nothing to worry about. GCC's info pages show what is enabled at various optimisation levels. -fdelete-null-pointer-checks is enabled at -O2. Of course, this only applies when you compile your own kernel. If vendors are supplying kernels compiled with -O2 without checking what it does to the code then it is obvious who is to blame.
Re:CFLAGS (Score:4, Informative)
No. That doesn't fix the problem. All it does is stop the broken optimisation (why the *hell* did someone at gcc think such a thing should be default anyway?)
You need an -ferror-on-bogus-null-pointer-checks parameter so that the code can be fixed.
It's an easy error to make. It's the compiler's job to warn you. In this case, not only did it fail to throw a warning, it also made the problem worse by 'optimising' it.
Re: (Score:3, Informative)
Because it makes sense on every modern platform on earth except for strange embedded ones, that's why. This kernel bug is the result of incorrect kernel code, not a GCC bug.
Interesting (Score:3, Funny)
Guys, I'm trying to decide what to post:
[ ] Downplay how serious flaw is ...or we could RFA
[ ] Compare to Windows' track record
[x] Make a meta-reference to Slashdot psychology
[ ] Post work-around that doesn't fix problem
[ ] Say that flaw is a feature
[ ] bash Windows
[ ] Claim that not all Windows software is bad
[ ] Claim that the more popular gets, Linux will be targeted more
[ ] Pretend I understand the problem
Running a static checker on the Linux kernel? (Score:3, Interesting)
Isn't someone running a static checker on the Linux kernel? There are commercial tools which will find code that can dereference NULL. However, there aren't free tools that will do this.
DRM is defective by design. (Score:5, Informative)
I think that tag is mostly reserved for DRM related news...
And I have seen news about linux DRM modules also tagged that.
Re: (Score:2, Offtopic)
...you mean the direct rendering module or proprietary modules that some evil vendor installs? I don't know of any digital restrictions management kernel modules but wouldn't be that surprised if they existed.
I remember freaking out the first time I noticed the DRM module loading when I came over to Tux's loving embrace years ago.
Re:Double standards (Score:4, Informative)
That's because with Windows, no one would be able to marvel at how un-obvious the flaw is. According to The Register, the kernel actually has guards in place against just this type of vulnerability, but the compiler optimized them out during compilation. IMHO this makes this flaw a very good case study: even with security in place, you cannot really trust the compiler. (Actually, this flaw apparently only occurs if security is in place... or if you use PulseAudio (in which case, you deserve it!).)
Comment removed (Score:4, Insightful)
Re: (Score:2)
Blind trust is not necessary for this to be an issue. NO ONE has time to write all their code in assembly, not even for the kernel. This is arguably more of an issue in the compiler than in the kernel, and if you honestly claim you can write a C compiler without learning assembly...yeah.
Better retort: Which dialect?
Re:Double standards (Score:5, Informative)
This is arguably more of an issue in the compiler than in the kernel,
Not completely... from the SANS Storm Center [sans.org], the code was as follows:
struct sock *sk = tun->sk;
if (!tun) // if tun is NULL return error
return POLLERR;
The error was that the compiler optimized away the if statement, assuming that tun had already been initialized. The check should have been placed before the sock variable referenced it. Not entirely obvious maybe, but then again, it should have been checked before the assignment.
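A minimal sketch of the reordering described above, assuming a cut-down stand-in for the real types. This is not the actual kernel patch: struct tun_sketch, poll_checked() and POLLERR_SKETCH are all invented for the example, and the return value 0 just stands in for the rest of the poll logic.

```c
#include <stddef.h>

#define POLLERR_SKETCH 0x0008     /* placeholder; the kernel uses POLLERR */

struct sock { int id; };
struct tun_sketch { struct sock *sk; };

/* Check first, dereference second. Returns POLLERR_SKETCH for a null
 * tun, otherwise 0 to stand in for "carry on with the real poll logic". */
unsigned int poll_checked(struct tun_sketch *tun)
{
    struct sock *sk;

    if (!tun)
        return POLLERR_SKETCH;

    sk = tun->sk;                 /* safe: tun is known to be non-null here */
    (void)sk;                     /* the real function would go on to use sk */
    return 0;
}
```

With the test ahead of the dereference, the compiler has nothing to infer about tun being non-null, so there is no check for it to delete.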
I really don't see how this is a compiler problem? (Score:4, Insightful)
no exact code snippet found in Linux (Score:4, Informative)
I tried to google code search for "tun->sk" and Linux doesn't contain that snippet of code. Since SANS claimed that drivers/net/tun.c is at fault, I looked at that source file and didn't find any instances where "if (!...) return ...;" is performed after NULL dereference.
I think the only fascinating bit of the story is that the SElinux extension allows you to map a page at memory address 0 (the NULL page), making NULL dereferencing valid. I also found out about that [likai.org] a while ago, but I didn't know it has anything to do with SElinux. By the way, mapping the NULL page also works on Mac OS X.
However, mapping NULL page is typically NOT exploitable. A correct program will simply reject access to NULL pointer, giving it a special semantic regardless whether the memory page itself is valid or not.
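For anyone curious whether their own box lets an ordinary process map the NULL page, here is a small probe, written as a sketch under assumptions: Linux with glibc, where MAP_ANONYMOUS and MAP_FIXED are available, and can_map_page_zero() is a name made up for this example. It only asks for the mapping and removes it again if it was granted; many systems are configured to refuse the request for unprivileged users, so a result of 0 is the likely outcome.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if this process may map a page at address 0, 0 if the
 * kernel refuses. Any mapping that succeeds is removed again. */
int can_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return 0;

    p = mmap((void *)0, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        return 0;

    munmap(p, (size_t)page);
    return 1;
}
```

Note that success is detected by comparing against MAP_FAILED, not against NULL, because a successful mapping at address 0 is itself a null pointer.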
code found in Linux 2.6.30 (Score:5, Informative)
Oh, found the code on lxr. It looks like Linux kernels up to 2.6.29.6 [linux.no] are NOT affected, and this is a vulnerability introduced in 2.6.30 [linux.no] due to a fairly significant rewrite of tun.c. Linux 2.6.30 was released on June 9, 2009, just a month ago. Funny that the tun.c rewrite was not mentioned in the set of changes for 2.6.30.
I think this example actually shows a forte of Linux as open source. New vulnerability is found very quickly after "new" code is released.
the set of changes for 2.6.30 (Score:5, Informative)
Re: (Score:3, Informative)
A bug exists with or without the optimization if the code you pasted is the actual code. tun being null makes the tun->sk reference invalid. You should end up with a panic at this point.
If the compiler optimized away the tun check without there being a previous tun check, there is also a compiler bug. The compiler shouldn't have assumed that tun was initialized just because it was read from, which is all a dereference is, a read and an add.
Re: (Score:3, Insightful)
The error was that the compiler optimized away the if statement,
Being more specific, based on reading the code in the SANS report after getting the suggestion from a user comment in the Register, the error was that the compiler was in an optimising mode which told it to optimise away such checks where the Null pointer had already been dereferenced. -O2 was active and that clearly means that -fdelete-null-pointer-checks is turned on.
Two groups are at fault here:
The optimisation was sufficiently clearly documented (it's listed in gcc under -O2 and when you look at the do
Re:Double standards (Score:5, Informative)
No. You are wrong.
The code is grabbing the value of the sk field of the tun struct, not its address. Did you misread the code, or do you not actually know C? Or are you perhaps just on the sauce?
You're claiming the code reads struct sock **sk = &tun->sk when in reality, it reads struct sock* sk = tun->sk, which is completely different.
Re: (Score:3, Informative)
The exploit maps 0x00000000 to userspace using pulseaudio, this prevents the segfault.
Re: (Score:3, Interesting)
The compiler is allowed to assume that no other code (including another thread running the same code) changes the value of a variable behind its back, as long as the variable is not volatile (volatile has its own can of worms), so the optimization is safe.
so in the code
int *data=myFunc(); // The compiler is allowed to optimize this call out.
val=*data;
printf("%d\n",val);
val=*data;
printf("%d\n",val);
And the compiler is allowed to turn this into nothing:
int *val=myFunc(); // Returns a valid pointer to an int.
*
Re: (Score:3, Interesting)
int *data=myFunc();
val=*data;
printf("%d\n",val);
val=*data;
printf("%d\n",val);
Actually, the compiler isn't allowed to optimize that second assignment to val out unless it can see the source for printf and can prove that there are not other aliases to the memory that data points to that might be changing it.
Even if you assume the default printf(), myFunc might be returning a pointer to one of the buffers used for IO.
This is one of the reasons that C99 intr
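A minimal sketch of the aliasing concern, using a function pointer as the opaque call in place of printf. All names here are invented for the example. Because the callee might modify whatever data points to, the second load cannot simply reuse the first value.

```c
static int shared_value;

/* Stands in for an external function the compiler cannot see into. */
static void bump_shared(void) { shared_value++; }

/* Reads *data before and after an opaque call and reports the change. */
int change_seen_across_call(int *data, void (*opaque)(void))
{
    int before = *data;

    opaque();                     /* may write to the object data points to */

    int after = *data;            /* has to be loaded again, not reused */
    return after - before;
}

/* Wires the two together so that data aliases the variable being bumped. */
int demo_aliasing(void)
{
    return change_seen_across_call(&shared_value, bump_shared);
}
```

If the compiler had folded the second read into the first, demo_aliasing() would report no change even though the call clearly made one.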
Re: (Score:3, Insightful)
Comment removed (Score:4, Insightful)
Re:Double standards (Score:5, Insightful)
What's that? You don't fully disassemble and analyze large binaries but only critical paths or small binaries? How unique and sought-after your services must be. I'm sure analysis of compiled kernels is the best way to tackle this bug..
Re:Double standards (Score:4, Informative)
Yes, but the rest of us have written about 1000 times more code than you because we didn't spend our time checking a ton of assembly because we presume the compiler is flawed.
There are times when this sort of checking is acceptable if not required. The kernel is a good place to do it.
You aren't going to do this for KDE or Gnome however.
Re: (Score:3, Insightful)
To be fair, the OP wasn't suggesting that programs actually be written in assembly, but rather that programmers learn assembly and know how to debug libraries. That's a sentiment I'll second: of course you don't write programs using machine code operations, but when something breaks, it's quite useful to be able to drop down to assembly in a debugger and see what's actually going on, especially when debugging optimized code.
As for c
Re:Double standards (Score:5, Funny)
Re: (Score:3, Funny)
gcc -pedantic $@
Re:Double standards (Score:4, Funny)
I compiled my kernel using that flag, and now it boots Windows instead.
Comment removed (Score:5, Interesting)
Re:Double standards (Score:4, Informative)
Assembly of any sort isn't that difficult once you get some experience with it, and with the proper macros and defines set up, it can actually be fairly quick to code in. Some chips are easier than others (the 68K was *awesome* to code for), but it just requires some attention to detail and a good understanding of how the machine works.
Re: (Score:3, Insightful)
Re: (Score:2)
In my tests, gcc only optimizes out null checks at -O3 or above, which is already known to make potentially unsafe optimizations. Maybe if -Wunreachable-code were part of -Wall it would have been easier to spot.
Re: (Score:3, Informative)
Nope. PulseAudio is NOT necessary to trigger this flaw. Read the exploit source code.
PS: I hate PulseAudio bashing.
Re: (Score:3, Insightful)
For the sake of argument, let's suppose you're right. (I think it'll be a cold day in hell when the BSDs move away from GCC.) Increasing performance demands will lead to the inclusion of more optimizations in PCC, and these optimizations will lead people like you to make the same complaints about PCC that people make about GCC today.
Really, what you're opposed to isn't GCC, but the notion of an optimizing compiler. Sorry, but history has spoken: the gain of optimization far outweighs the minor cost of forci
Re: (Score:2, Insightful)
If this had been Windows we'd find out 9 months later after the guy who discovered it informed Microsoft, they stick their fingers up their butt for 3 months. Microsoft would then spend 3 months finding that the code they used to fix the problem (which was copied from a dll they wrote in 2005) causes problems in newer versions of Excel because it uses null pointers to calculate file bloat or something. Then they threaten him with lawsuits for a few months if he releases the information. In a rush to rele
Re:Double standards (Score:4, Informative)
Oh please, it's a response to
If this had been Windows, the article would have been tagged defectivebydesign.
You're not supposed to read the article, but at least the post you're criticizing.
Re: (Score:2)
If this had been Windows, the article would have been tagged defectivebydesign.
What are you talking about? How is a Linux kernel exploit related to the architecture of DRM [defectivebydesign.org]??
Re:Double standards (Score:5, Funny)
Right... Because Microsoft are really losing sleep over the negative comments posted on slashdot, so they have assembled a crack team of slashdotters to game the moderation system in their favour.
You have to be kidding me.
Re:Double standards (Score:4, Insightful)
For such a piece of shit company, they sure do have a lot more marketshare than the computing godOS known as Linux.
Microsoft's current market share has nothing to do with quality, and everything to do with monopoly. It doesn't matter whether their product is any good or not, because not only do the vast majority of computer users not even know what Windows is, they wouldn't have the first clue what an alternative to Windows or MS Office would be like.
Time to learn about basic economic theory [wikipedia.org] I think.
Re: (Score:3, Informative)
There aren't any important services that run setuid, are there?
Oh...
Re:Serious bug in gcc? (Score:5, Informative)
gcc is definitely doing the wrong thing here.
Given the code:
a = foo->bar
if(foo) something()
gcc is doing precisely the wrong thing - optimising out the if on the theory that the app would have crashed if it was null.
What it *should* do is throw a warning (even an error, given the clear intent of the code) pointing out that the variable is dereferenced before it is tested.
This kind of error being missed by gcc is going to affect a *lot* of code - it's really not that uncommon a coding error, and is easy to do.
Re: (Score:2)
In the kernel, null pointer derefs have no place. It's not valid kernel space, and for userland access you're supposed to use special functions anyway.
Re: (Score:2, Insightful)
The point is that GCC silently optimizes it away so the programmer has no idea that it's not even running the code they put in (however incorrect that code is). It's like saying "if there is an error in my code just remove that code and keep the rest without telling me".
Re: (Score:3, Insightful)
For the most part, programmers DO WANT this kind of optimization, which is why they use an optimizing compiler. Things like dead-code elimination, constant propagation, and whole program optimizations are important to programmers.
If you don't want this stuff done, you don't reach for an optimizing compiler and then enable those optimizations. It's their purpose. If (something we know at compile time) should *always* be eliminate
Re:Serious bug in gcc? (Score:5, Insightful)
Sure it does - GCC knows at compile time that if the if() condition were true, we're already in the "undefined behavior" realm and all bets are off. So it gets rid of it. The code is broken: it's not the compiler's job to compile for the maximum defensiveness of the resulting machine code, otherwise we'd all be using bounds-checking compilers. If the compiler realizes that a certain runtime value will lead to undefined results (because the programmer chose to do so), it is free to break the execution as much as it wants in that case for code that runs afterwards. Essentially, undefined behavior is a contract signed by the programmer that says "I certify that this will never happen", which is why the compiler chose to perform this optimization.
Even though the real bug is clearly in the code, moving on to the realm of what's desirable from a compiler, I think it's clear that this behavior can make some problems worse (to the compiler, problems are binary - if there's a problem all bets are off - but not to us). This is fine in the name of optimization, but I think in this particular instance either a) kernel developers should opt to turn this optimization off, or b) (better) make GCC warn when this kind of optimization happens, because it's quite likely a bug.
In effect, the code is a form of broken defensive programming (you check after the fact whether you've screwed up). It's wrong, but we still wouldn't want the compiler to silently remove the check. So I think the ideal solution (besides fixing the code) is to add a warning to the compiler. NULL pointer dereferences are a bug in the vast majority of cases, and checking for a NULL pointer after dereferencing it (in such a way that the compiler recognizes it and is about to remove the check) is at best redundant and more likely a bug.
There's still the issue of the page 0 fuckery. If someone can make page 0 accesses not crash the kernel then that's also a bug - there are good reasons why we want NULL and near-NULL pointer accesses to always crash.
Re:Serious bug in gcc? (Score:4, Interesting)
In effect, the code is a form of broken defensive programming (you check after the fact whether you've screwed up). It's wrong, but we still wouldn't want the compiler to silently remove the check. So I think the ideal solution (besides fixing the code) is to add a warning to the compiler. NULL pointer dereferences are a bug in the vast majority of cases, and checking for a NULL pointer after dereferencing it (in such a way that the compiler recognizes it and is about to remove the check) is at best redundant and more likely a bug.
My problem with this sort of thinking is that when you throw in macros and templates and whatnot, there can end up being hundreds, thousands, even millions of "redundant" tests against NULL specified by the expanded source. Now, I suspect that simply adding this warning to GCC and then compiling some large project would generate so many such warnings that the only reasonable choice would be to then disable that warning. The warning would then have no value, and if so then that certainly doesn't address the "problem."
.. ex, the pointer was just assigned, or it's nested within another test for null.
As far as the other stuff.. my point was that the argument that the compiler should never optimize away such if() statements is flawed. I was responding to someone who did in fact make such a claim. There are certainly cases where the pointer absolutely cannot be NULL (or absolutely must be).
Re: (Score:3, Interesting)
Given the code:
a = foo->bar;
if(foo) something()
gcc is doing precisely the wrong thing - optimising out the if on the theory that the app would have crashed if it was null.
No, that's not the theory. They weren't optimizing the if out because the app would have crashed, they were optimizing the if out because if the programmer is dereferencing foo beforehand without testing, one can assume that the programmer is sure that foo is not null by that point. I agree with you that a warning should be thrown (and I'm not sure if it is or isn't), but that if really should be optimized out.
Re: (Score:3, Informative)
The description given by SANS is a bit misleading. What I believe is happening is:
Since point 2 is mostly true, the compiler is not completely wrong to assume point 3
As Spengler says, a bigger problem is that loading SELinux (or, it looks like, most other security modules) causes the NULL dereference protection to be disa
Re: (Score:2)
Re: (Score:3, Funny)
Re:Serious bug in gcc? (Score:5, Insightful)
They were writing nonsense. GCC makes use of the fact that in the C language any pointer that was dereferenced can't be NULL (this is made explicit in the standard). People use C as a high-level assembly where these assumptions don't hold. This is why code that doesn't assume this breaks. This issue came up a few months ago on the GCC lists, where an embedded developer pointed out that he regularly maps memory to the address 0x0, thereby running into issues with this assumption in the optimizers. The GCC developers introduced a command-line flag which tells the compiler to not make that assumption, therefore allowing the compiler to be used even in environments where NULL pointers can be valid.
Now, the exploit uses this feature of the compiler (or the C language, if you will) to get the kernel into an unspecified state (which is then exploited) -- the NULL pointer check will be "correctly" optimized away. But in order to do this it first has to make sure that the pointer dereference preceding the NULL pointer check doesn't trap. This needs some mucking around with SELinux, namely one has to map memory to 0x0.
This is a beautiful exploit, which nicely demonstrates how complex interplay between parts can show unforeseen consequences. Linux fixes this by using the aforementioned new compiler option to not have the NULL pointer check optimized away.
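To make the pattern concrete, here is a minimal sketch of "dereference first, check afterwards". The struct and function names are made up for illustration and are not the kernel's actual code. GCC's -fdelete-null-pointer-checks optimization (normally on at -O2) is what permits dropping the test; -fno-delete-null-pointer-checks asks GCC to keep it.

```c
#include <stddef.h>

/* Hypothetical type standing in for whatever the driver really uses. */
struct device {
    int flags;
};

/*
 * Buggy ordering: dev is dereferenced before it is tested.
 * If dev were NULL the dereference is undefined behavior, so an
 * optimizing compiler may assume dev != NULL and delete the test.
 *   gcc -O2                                  -> test may be removed
 *   gcc -O2 -fno-delete-null-pointer-checks  -> test is kept
 */
int poll_flags(const struct device *dev)
{
    int flags = dev->flags;   /* dereference happens first */

    if (!dev)                 /* too late: may be optimized away */
        return -1;

    return flags;
}
```

Only call this with a valid pointer. On a normal hosted system a NULL argument faults at the dereference either way; the flag only matters where page 0 is actually mapped.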
Re: (Score:2)
Wish I had mod points today. This is a truly clear and understandable explanation of the issues involved.
The embedded market is a very good point - and "embedded" doesn't necessarily mean what people think it means. It might, for example, mean that graphics card you just put in your computer that runs code itself.
15 years ago I was dealing with code on a graphics card and, in fact, the video buffer was mapped in at 0x0 - which may have been a questionable choice, except that you could actually SEE a write
Re:Serious bug in gcc? (Score:5, Interesting)
On most modern platforms, NULL is defined as (void*)0 and the entire bottom page of memory is mapped as no-access. On some embedded systems, however, the bottom few hundred bytes are used for I/O and you get the addresses of these by adding a value to 0. On these systems it is perfectly valid (and correct) C to define a structure which has the layout of the attached devices and then cast 0 to a pointer to this structure and use that for I/O.
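A rough sketch of that embedded idiom, assuming an invented register layout (struct dev_regs and both helpers are hypothetical, not any real device). The base address is a parameter so the same code can be pointed at ordinary memory on a desktop; on the kind of hardware described above it would be 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented register block; volatile because hardware can change it. */
struct dev_regs {
    volatile uint8_t  status;
    volatile uint8_t  control;
    volatile uint16_t data;
};

/* Turn a raw base address into a pointer to the register block.
 * On the embedded target described above, base would be 0. */
static inline struct dev_regs *dev_at(uintptr_t base)
{
    return (struct dev_regs *)base;
}

/* Write one value to the (hypothetical) data register. */
void dev_write_data(struct dev_regs *regs, uint16_t value)
{
    regs->control = 1;    /* pretend "enable" bit */
    regs->data = value;
}
```

With base 0 this is exactly the situation where a compiler that assumes "dereferenced pointers are never null" and a platform that maps address 0 disagree.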
Re:Serious bug in gcc? (Score:5, Insightful)
Of course NULL is part of the C language, you blathering idiot, and it always has been. The level of ignorance here astounds me. Don't post about things you don't understand.
Quoting from C89 [flash-gordon.me.uk]: (not C99, C89, the one that's older than dirt.)
NULL wasn't even "added" in C89: NULL appears in the oldest, cruftiest UNIX code you can imagine [tuhs.org]. (That link is the original cat command from 1979.)
Re: (Score:3, Informative)
That's not true. It's not your fault, you just assumed the GP was right, and he's not.
In your example, you thought b->two was (dereference of (0 + offset)) but actually it's ((dereference of 0) + an offset).
To get the former you need b to be an actual struct and use (*b.two). Or with it still being a pointer you could do some fancy pointer arithmetic, like *(((int *) b) + 1).
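For anyone trying to follow the offset argument, here is a small sketch. It assumes a made-up struct pair and two hypothetical helpers, and is only ever meant to be called on a real object. It shows that the address touched by b->two is the address held in b plus the member's offset.

```c
#include <stddef.h>

/* Hypothetical two-member struct for illustration. */
struct pair {
    int one;
    int two;
};

/* Address of the member as the -> operator sees it. */
const void *addr_of_two(const struct pair *b)
{
    return &b->two;
}

/* The same address computed by hand: base plus member offset. */
const void *base_plus_offset(const struct pair *b)
{
    return (const char *)b + offsetof(struct pair, two);
}
```

Both functions return the same address for any valid object. What the language says about a null b is a separate question, which is what the rest of the thread is arguing over.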
Re: (Score:3, Informative)
This sounds more like a gcc/embedded OS bug. There is no requirement in C/C++ that the null pointer is (int)0. That is: it doesn't have to be all 0 bits. It just needs to be distinct from any valid pointer, so if you run on a platform where you use memory address 0 (valid, but still weird), you need to configure gcc (if possible, I don't know if gcc supports this) to use another bit pattern as the null pointer (say 0xeffff), and then you need to configure your embedded OS to never return a memory address/buffer that contain
Re:Serious bug in gcc? (Score:4, Informative)
That is: it doesn't have to be all 0 bits. It just needs to be distinct from any valid pointer,
Correct - apart from the "just" bit.
It doesn't need to be all 0 bits.
It does need to be distinct from any valid pointer.
*and*
void *p = 0;
must generate a null pointer, and:
p == 0
must come out true if p is a null pointer. The internal implementation need not be all zeroes, but it does need to look rather like it to source code.
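A tiny sketch of those two source-level rules. The helper names are made up. The 0 in both places is a null pointer constant, which the compiler converts to whatever bit pattern the platform uses for null.

```c
#include <stdbool.h>
#include <stddef.h>

/* Initializing from the constant 0 must yield a null pointer,
 * whatever its internal representation is. */
const void *make_null(void)
{
    const void *p = 0;
    return p;
}

/* Comparing against the constant 0 must be true exactly for
 * null pointers. */
bool is_null(const void *p)
{
    return p == 0;
}
```

What this does not promise is that filling a pointer with zero bytes (memset, for example) gives a null pointer. That only holds on platforms where null happens to be all-bits-zero.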
Re: (Score:2)
More and More like windows everyday
Actually no, this will probably be fixed by later today, as opposed to having to wait for "n" intervals of "patch tuesday"...
Re:Just like Linux (Score:4, Insightful)
Unless they're going to add a proper warning for the condition to gcc 'today' it won't, really.
Sure there are enough developers to go over the kernel to make sure such errors haven't been missed elsewhere, but all it takes is one to miss it and it's still there. Then there's all the other software compiled by gcc..
I'm not entirely sure how it can lead to an exploit (short of remapping page zero, which requires root privileges so doesn't really count), but since it has, it's going to need a proper fix.
Re: (Score:2)
Re:Just like Linux (Score:4, Interesting)
Funnily enough, a few months back I made a very similar error, if not the exact same error, while coding on the bootloader for Darwin/x86. Except in my case it wasn't exactly a true error, because in the bootloader I know that a page zero dereference isn't going to fault the machine but will instead just read something out of the IVT.
So as I recall it seemed perfectly reasonable to go ahead and initialize a local variable with the contents of something in the zero page and then check for null and end the function. But GCC had other ideas. It assumed that because I had dereferenced the pointer a few lines above that the pointer must not be NULL so it just stripped my NULL check out completely. Had it warned about this like "warning: pointer is dereferenced before checking for NULL, removing NULL check" then that would have been great. But there was no warning so I wound up sitting in GDB (via VMware debug stub) stepping through the code then looking at the disassembly until I realized that.. oops.. the compiler assumed that this code would never be reached because in user-land it would have segfaulted 4 lines ago if the pointer was indeed NULL.
Obviously the fix is simple. Declare the variable but don't initialize it at that time. Do the null check and return if null. Then initialize the variable. If using C99 or C++ then you can actually defer the local variable declaration until after you've done the NULL check which IMO is preferable. It may be that the guy wrote it as C99 (where you can do this) then went oops, the compiler won't accept that in older C and simply moved the declaration and initialization statement up to the top of the function instead of splitting the declaration from the initialization. My recollection of how I managed to introduce this bug myself is shady but as I recall it was something like that.
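Roughly what the two variants of that fix look like, as a sketch with hypothetical names (struct entry, read_value_c89, read_value_c99), not the bootloader's or the kernel's real code:

```c
#include <stddef.h>

/* Hypothetical record type for illustration. */
struct entry {
    int value;
};

/* Older-C style: declare at the top, test, then initialize. */
int read_value_c89(const struct entry *e, int *out)
{
    int v;                      /* declared, not yet initialized */

    if (e == NULL || out == NULL)
        return -1;              /* bail out before any dereference */

    v = e->value;               /* safe: e was checked above */
    *out = v;
    return 0;
}

/* C99 style: defer the declaration until after the check. */
int read_value_c99(const struct entry *e, int *out)
{
    if (e == NULL || out == NULL)
        return -1;

    int v = e->value;           /* declaration and initialization together */
    *out = v;
    return 0;
}
```

In both versions the NULL test comes before the first dereference, so the compiler has no grounds to assume the pointer is non-null and remove it.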
Re: (Score:2)
Re: (Score:2)
Re:Linus, you Rookie !! (Score:4, Informative)
Ok, I know I shouldn't be feeding the troll, but read the article: the kernel source itself is perfectly fine, it is the compiler that optimizes the check away.
Re:Linus, you Rookie !! (Score:5, Informative)
Ok, I know I shouldn't be feeding the troll, but read the article: the kernel source itself is perfectly fine, it is the compiler that optimizes the check away.
Absolutely not. The code itself has a severe bug: If tun is a null pointer then it invokes undefined behaviour. Undefined behaviour means anything can happen. Anything can happen means a severe bug, especially in kernel code. The optimizing compiler just turned C source code that was buggy, but not obviously enough for the programmer, into assembler code that would have been obviously buggy to anyone. Most definitely not the fault of the compiler.
Re: (Score:3, Informative)
The code makes a potentially undefined assignment, but before doing anything significant with it, it checks for the undefined condition. It's not technically wrong but it is against best practices. Without the invalid optimization it wouldn't be a problem. In turn, the optimization is in the opposite condition. It is technically wrong, but where best practices are followed, it does no harm.
Re:Linus, you Rookie !! (Score:4, Informative)
Umm - no - the *code* does the undefined behaviour and *then* checks if the undefined behaviour could happen. But, heck, mistakes happen - it was identified and fixed. Not much of a story really.
Re: (Score:3, Interesting)
The rest of us consider it a fundamental law of the universe.
Clearly, your universe is too small! Technically, the code dereferenced NULL+offset where offset (and so NULL+offset) is non zero (which I presume you are hard wired to consider to be the NULL value).
In an environment where a segv (or equivalent) won't be triggered, the code's not wrong until it makes use of an invalid dereference. The if would have prevented it. I don't think that makes it GOOD since in most environments it will fail.
In some languages or with some C optimizers, the assignment would never b