'Kernel Memory Leaking' Intel Processor Design Flaw Forces Linux, Windows Redesign (theregister.co.uk) 416

Posted by BeauHD on Tuesday January 02, 2018 @05:40PM from the update-required dept.

According to The Register, "A fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug." From the report: Programmers are scrambling to overhaul the open-source Linux kernel's virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in this month's Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December. Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, however we're looking at a ballpark figure of five to 30 per cent slow down, depending on the task and the processor model. More recent Intel chips have features -- specifically, PCID -- to reduce the performance hit. Similar operating systems, such as Apple's 64-bit macOS, will also need to be updated -- the flaw is in the Intel x86 hardware, and it appears a microcode update can't address it. It has to be fixed in software at the OS level, or buy a new processor without the design blunder. Details of the vulnerability within Intel's silicon are under wraps: an embargo on the specifics is due to lift early this month, perhaps in time for Microsoft's Patch Tuesday next week. Indeed, patches for the Linux kernel are available for all to see but comments in the source code have been redacted to obfuscate the issue. The report goes on to share some details of the flaw that have surfaced. "It is understood the bug is present in modern Intel processors produced in the past decade," reports The Register. "It allows normal user programs -- from database applications to JavaScript in web browsers -- to discern to some extent the contents of protected kernel memory. The fix is to separate the kernel's memory completely from user processes using what's called Kernel Page Table Isolation, or KPTI."

'Kernel Memory Leaking' Intel Processor Design Flaw Forces Linux, Windows Redesign

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 416 Comments Log In/Create an Account

Comments Filter:

Don't know if serious. (Score:3, Funny)

by Anonymous Coward writes: on Tuesday January 02, 2018 @05:44PM (#55851737)

At one point, Forcefully Unmap Complete Kernel With Interrupt Trampolines, aka FUCKWIT, was mulled by the Linux kernel team, giving you an idea of how annoying this has been for the developers.

- Re:Don't know if serious. (Score:5, Funny)
  
  by Hal_Porter ( 817932 ) writes: on Tuesday January 02, 2018 @07:51PM (#55852477)
  
  https://www.mail-archive.com/l... [mail-archive.com]
  2) Namespace
  Several people including Linus requested to change the KAISER name.
  We came up with a list of technically correct acronyms:
  User Address Space Separation, prefix uass_
  Forcefully Unmap Complete Kernel With Interrupt Trampolines, prefix fuckwit_
  but we are politically correct people so we settled for
  Kernel Page Table Isolation, prefix kpti_
  Linus, your call :)
  LOL!
  
- Re: (Score:3)
  
  by Pseudonym ( 62607 ) writes:
  
  At one point, Forcefully Unmap Complete Kernel With Interrupt Trampolines, aka FUCKWIT, was mulled by the Linux kernel team, giving you an idea of how annoying this has been for the developers.
  Someone on LKML using the word "fuckwit" is what Linux developers call "a normal Wednesday".
FOOF (Score:5, Insightful)

by OverlordQ ( 264228 ) writes: on Tuesday January 02, 2018 @05:44PM (#55851739) Journal

About par for Intel's course. Make it fast at the expense of horrible bugs.

- FUCKWIT (Score:3)
  
  by goombah99 ( 560566 ) writes:
  
  not you. thats's what the Linux team wanted to call this bug.
  I read the El Reg article but I still don't understand what it is saying. At all levels. I don't understand if this means all intel processors or just the new ones. I don't understand if the 20% slowdown is for a tiny fraction of operations in the OS or if it means that things like e-mail, firefox or general python programming will be slowed down 20% overall. The latter would be a disaster. (could I ask intel to refund 20% of my computer cos
  - - Re:FUCKWIT (Score:5, Insightful)
      
      by Waffle Iron ( 339739 ) writes: on Wednesday January 03, 2018 @12:09AM (#55853583)
      
      I'm not in the habit of running random binaries downloaded from the Internet
      As TFS implies, given that Javascript required to do almost anything on the web, you are most likely downloading and running random code from the internet that could potentially exploit this bug hundreds of times every day.
      
      - HPC cluster (Score:3)
        
        by DrYak ( 748999 ) writes:
        
        I don't know, but maybe he runs an high performance computing (HPC ) cluster.
        With compute nodes segregated on a separate network that might not even have internet access,
        and certainly not running random javascripts downloaded from random websites.
        And in these context, performance matters a lot,
        while security is handled in a "perimeter" fashion.
        In those cases, it makes sense to have an option to disable the fix.
      - Re: (Score:3)
        
        by Waffle Iron ( 339739 ) writes:
        
        It still requires dereferencing kernel address space. How do you pull that off otherwise?
        With a timing attack, you don't need to dereference anything. From the Wikipedia article:
        Likewise, if an application is trusted, but its paging/caching is affected by branching logic, it may be possible for a second application to determine the values of the data compared to the branch condition by monitoring access time changes; in extreme examples, this can allow recovery of cryptographic key bits.
        
        Re: (Score:3)
        
        by Waffle Iron ( 339739 ) writes:
        
        Since large fractions of all systems run the exact same OS images, people DO know much of the system state ahead of time.
        You also don't need to know much about leaked kernel information to make use of it. In a scattershot approach, you try whatever bits you infer to decrypt data. If you're lucky, you find a match. If you attack thousands of systems, you're likely to get lucky.
        In summary, you're just way too overconfident. It only takes one really smart person to package up a hard-to-execute attack and make
- - Re:FOOF (Score:5, Informative)
    
    by sjames ( 1099 ) writes: on Tuesday January 02, 2018 @06:24PM (#55852029) Homepage Journal
    
    Actually, it's a reference to a hex value that could trigger a nasty Pentium bug.
    
  - Re:FOOF (Score:5, Informative)
    
    by Anonymous Coward writes: on Tuesday January 02, 2018 @06:33PM (#55852085)
    
    F00F [wikipedia.org]
    
- - Re:FOOF (Score:5, Interesting)
    
    by TheRaven64 ( 641858 ) writes: on Wednesday January 03, 2018 @01:04AM (#55853759) Journal
    
    Having done a fair amount of both, I would disagree. The things that make software complex are dynamic: you can dynamically create threads and you can dynamically allocate memory, which makes the number of possible interactions almost impossible to statically reason about. In a CPU, all of these things are bounded and (very) finite, to the extent that it is actually possible to apply formal methods to the design (Centaur in collaboration with UT Austin do some amazing work in this area).
    The difference is that a bug in software can usually[1] be fixed after you ship, whereas a bug in silicon usually can't (though if you've got a lot of microcode you can sometimes work around it). ARM has a nice chart of the cost of fixing a bug at each stage in development, which becomes more and more terrifying, whereas for software that cost is roughly flat, so you can get away with spending a lot less effort on correctness.
    [1] Though not always easily. A colleague of mine released his first CVE a couple of years ago for a small library he wrote a couple of decades back. It turns out that most deployments of this library are in fax machines and printers, with no software update mechanism.
    
In all fairness... (Score:3, Insightful)

by Anonymous Coward writes: on Tuesday January 02, 2018 @05:45PM (#55851749)

Intel guys are doing the bulk of the work for the linux kernel changes, and I'm sure to be fair they'll equally cripple all processors with the changes not just their own.

- Re:In all fairness... (Score:5, Informative)
  
  by scumdamn ( 82357 ) writes: on Tuesday January 02, 2018 @05:56PM (#55851847)
  
  Looks like you missed this commit [lkml.org] from Tom Lendacky at AMD.
  
  - Re:In all fairness... (Score:4, Informative)
    
    by nemequ ( 888009 ) writes: on Tuesday January 02, 2018 @06:15PM (#55851989)
    
    What are you talking about? That patch says AMD CPU's aren't vulnerable:
    AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
    The (trivial) patch essential disables the work-around on AMD CPUs. I'm not going to comment on how fair GP's criticism of Intel is in general, but that link definitely isn't evidence in Intel's favor
    
    - Re:In all fairness... (Score:5, Informative)
      
      by scumdamn ( 82357 ) writes: on Tuesday January 02, 2018 @10:16PM (#55853173)
      
      I'm saying that Intel tried to cripple all processors but the dude from AMD told him their products weren't vulnerable and didn't need to be slowed down.
      
      - Re:In all fairness... (Score:4, Interesting)
        
        by nnull ( 1148259 ) writes: on Wednesday January 03, 2018 @03:35AM (#55854127)
        
        This, I don't completely understand the reasoning for crippling all processors, including non-intel. It seems Intel is trying to use its political clout to reduce everyone's performance with a bunch of fear and scare tactics which just tops the charts for 2017 and 2018, just so they don't lose their edge. This is just an utter catastrophe for Intel and they're trying to drag the rest of us with them.
        
        
        Re: (Score:3)
        
        by zifn4b ( 1040588 ) writes:
        
        Hanlon's Razor: Never attribute to malice that which can be adequately explained by stupidity
  - Re:In all fairness... (Score:5, Insightful)
    
    by sjames ( 1099 ) writes: on Tuesday January 02, 2018 @06:33PM (#55852087) Homepage Journal
    
    That bolster's AC's point. It looks like the Intel guys were going to cripple performance for everyone until the patch from AMD removed the unnecessary crippling from AMD processors.
    
    - Re:In all fairness... (Score:4, Funny)
      
      by ColaMan ( 37550 ) writes: on Tuesday January 02, 2018 @07:13PM (#55852269) Journal
      
      Well, I'm guessing the approach was more along the lines of "an abundance of caution with the X86 ISA" as opposed to deliberate malice towards AMD.
      Whilst no doubt there's some Intel guys with a very good working knowledge of AMD CPU internals, you'd really want to get direct confirmation from the actual AMD hardware guys that their hardware is immune to this.
      
      - Re:In all fairness... (Score:5, Insightful)
        
        by sexconker ( 1179573 ) writes: on Tuesday January 02, 2018 @07:31PM (#55852377)
        
        Well, I'm guessing the approach was more along the lines of "an abundance of caution with the X86 ISA" as opposed to deliberate malice towards AMD.
        Hi. Have you met Intel?
        
    - - Re: (Score:2)
        
        by Demena ( 966987 ) writes:
        
        Thus makes apple's recent fiasco trivia in comparison. I just got a new machine. How do I get it changed for one that does not have the bug?
  - - - Re: In all fairness... (Score:4, Insightful)
        
        by viperidaenz ( 2515578 ) writes: on Tuesday January 02, 2018 @10:21PM (#55853197)
        
        They didn't explicitly create a patch for AMD CPU's. They made changes that patch ALL x86 CPU's, regardless of vendor.
        AMD submitted a vendor check to disable the patch for AMD CPU's
        
        
        Re: (Score:3)
        
        by Zocalo ( 252965 ) writes:
        
        This. Without being privy to the closed discussions about how to engineer the patch (which AMD was clearly a part of) we have no way of knowing whether this wasn't the agreed approach. It's not like AMD would really want Intel making the decision over whether or not each of their CPU SKUs were vulnerable, and I doubt Intel would want to make that call either - they've got more pressing matters than trying to exploit AMD CPUs to deal with and they'd have to assume AMD would eat their lunch in court if they
        
        Re: (Score:3, Informative)
        
        by Anonymous Coward writes:
        
        When it comes to intel C compiler, optimizations that would work on AMD chips are left out when compiling for that CPU, because "we wouldn't presume to know what would work and what wouldn't on a competitor's chip!"
        Yet in this case, they presumed to know what was best for a competitor's chip.
        Funny how the error always falls to Intel's favor. One might think it wasn't an error at all.
        
        Re: In all fairness... (Score:4, Interesting)
        
        by TheRaven64 ( 641858 ) writes: on Wednesday January 03, 2018 @01:10AM (#55853773) Journal
        
        When it comes to the C compiler, Intel puts some things behind microarchitecture tests because in the past they didn't and people complained. They enabled fast paths after doing feature checks and it turned out that a bunch of x86 chips implemented certain instructions, but in microcoded slow paths and the optimised icc version that used them ran a lot slower than the version that didn't. As a result, they only enable certain instructions on chips that they have tested and which run faster when those instructions are used. And then people complained that they weren't doing optimisations on their competitors chips...
        In this case, there is a vulnerability affecting a bunch of x86 processors, they issued a patch for x86 processors. This is exactly the right thing to do for a security issue: blanket enable and then whitelist safe CPUs.
        
How could this be abused? (Score:5, Interesting)

by jader3rd ( 2222716 ) writes: on Tuesday January 02, 2018 @05:46PM (#55851761)

Sorry for the lack of imagination, but if the user space process can only read kernel memory, and can't write to it, how could one make use of this?

- Re:How could this be abused? (Score:5, Insightful)
  
  by OverlordQ ( 264228 ) writes: on Tuesday January 02, 2018 @05:48PM (#55851777) Journal
  
  You're running in EC2 on shared hardware. Your instance can read the memory of other instances running on the same physical hardware.
  
  - Re: (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    You're running in EC2 on shared hardware. Your instance can read the memory of other instances running on the same physical hardware.
    Yep, "the cloud" bites again.
    When are you supposedly "technically sophisticated" people going to learn that security inherently means running on your own hardware?
    When you cheap out, you lose. Performance, security, integrity, reliability - everything.
    VM specifically: For most local-system applications, VM is a long in the tooth, unneeded hangover that drops performance. I hav
    - Re:How could this be abused? (Score:4, Insightful)
      
      by Mad Merlin ( 837387 ) writes: on Tuesday January 02, 2018 @08:31PM (#55852671) Homepage
      
      When you cheap out, you lose.
      You're assuming that the cloud works out cheaper. Sometimes it does, but for many cases it does not (and it's not even close).
      
    - Re:How could this be abused? (Score:4, Informative)
      
      by viperidaenz ( 2515578 ) writes: on Tuesday January 02, 2018 @10:23PM (#55853205)
      
      You've confused virtual address space for swap.
      
    - - Re:How could this be abused? (Score:4, Informative)
        
        by LordKronos ( 470910 ) writes: on Tuesday January 02, 2018 @08:30PM (#55852667)
        
        He confused me at first with the mention of VMs as, like you, I initially thought he was talking about Virtual Machines (especially given the context of cloud computing and hardware sharing). But he was actually talking about Virtual Memory. Not sure how he got from one topic to the other, but it's pretty clear:
        "Memory is crazy cheap these days compared to your time and security and energy and wallet thickness; you should have lots and lots. If you don't, then there's your error."
        
- Re:How could this be abused? (Score:5, Interesting)
  
  by pereric ( 528017 ) writes: on Tuesday January 02, 2018 @06:00PM (#55851893) Homepage
  
  Cryptographic keys, information on other processes (making other attacks feasible), perhaps random number generator seeds and status, for example ...
  And the principle in general that there could be information the process is not supposed to reach.
  
- Re:How could this be abused? (Score:5, Interesting)
  
  by EndlessNameless ( 673105 ) writes: on Tuesday January 02, 2018 @06:09PM (#55851955)
  
  Private keys for system-level crypto and user credentials are stored in kernel space. You want everyone on the system to be reading those? If you can read a private key or a Kerberos token, you can become that daemon/system/user.
  This bug essentially destroys local security and severely compromises network security, subject to any limitations on where/when data can be read.
  I'm not a microarchitecture guru who can dig through the details and figure out the limitations of potential attacks. Perhaps only a small portion of kernel memory can be exposed via this bug. I don't really know. The naive, simple scenario where all kernel memory is exposed, though---that is pretty damned bad. Infosec doomsday bad.
  
five to 30 per cent slow down (Score:3)

by rwven ( 663186 ) writes: on Tuesday January 02, 2018 @05:46PM (#55851763)

I find it hard to believe that a virtual memory change will result in a 5-30% slowdown for Intel processors. Maybe for a few extremely specific (likely edge-case) tasks, but if there was a legitimate 5-30% performance decrease, you can bet there would be a far different solution in the works that would suitably fix the problem.

- Re: five to 30 per cent slow down (Score:2)
  
  by Monster_user ( 5075027 ) writes:
  
  By "virtual memory" are we talking Page Files and swap space? Disk space as memory?
  
  So an almost unusuable computer becomes completely unusable. Unless you're on solid state, then you get the performance of a mechanical hdd.
  - Re: (Score:2)
    
    by darkain ( 749283 ) writes:
    
    Page File is only one area that can be mapped into a Virtual Address Space. System RAM is another. Often time, I/O is as well.
    https://en.wikipedia.org/wiki/... [wikipedia.org]
  - Re: (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    By "virtual memory" are we talking Page Files and swap space? Disk space as memory?
    No. On modern processor running in protected mode, all memory is virtual. Memory pages that appear contiguous could be backed up by NON-contiguous areas of physical memory, device i/o space, numa, swap, etc. This bug may affect any memory access in userspace.
    So an almost unusuable computer becomes completely unusable.
    OK about the second part. The first part is a matter of opinion ;-)
  - Re: five to 30 per cent slow down (Score:5, Interesting)
    
    by MrKaos ( 858439 ) writes: on Tuesday January 02, 2018 @11:46PM (#55853505) Journal
    
    By "virtual memory" are we talking Page Files and swap space?
    
    Almost. The difference is a minor or major page fault. From swap space to ram to CPU cache is a major page fault whereas a memory transfer is between the CPU cache (L1,L2 or L3) and system RAM is a minor page fault.
    Disk space as memory?
    
    No. Between CPU cache and system RAM. In this case the issue relates to minor page faults which occur when the CPU scheduler is switching tasks, called "context switching", between processes, threads and lwt running on the system.
    The kernel maintains a summary page of the process when it switches tasks so that it doesn't have to recreate details of the process and is more efficient when it switches tasks. It is this summary page that can be attacked and IIUC addresses can be changed to access memory the process does not have permission to access.
    This means that a process running in the userspace of Ring 3 of the CPU it can modify it's summary table to access data in Ring 0, where the kernel is. Which is obviously bad because now that process can access all the memory.
    So an almost unusuable computer becomes completely unusable. Unless you're on solid state, then you get the performance of a mechanical hdd.
    Maybe, but not how you would expect. The CPU scheduler *may* make it possible to hide some of the latency created by the now crippled context switching in mechanical disk latency that you can't do with ssd because of the way IO determines *when* the scheduler will context switch. Still early days and it depends on what techniques can be devised. It depends on where the summary tables are maintained and I would expect that to be in CPU cache, which is much faster than system RAM, which is why it is a tough bug to get around.
    With that in mind it's plain to see why the Linux kernel devs want to call the patch fuckwit_ because Intel screwed up badly on this.
    
- Re: (Score:3)
  
  by jonwil ( 467024 ) writes:
  
  If the choice is a 30% slowdown or a massive highly dangerous security flaw, the developers will pick the 30% slowdown. Especially if that flaw is a big problem for people using VMs (there are suggestions in some places that the flaw would be a HUGE problem for cloud providers like Amazon). That said, if you are running Linux and dont care about the security flaw but do care about the slowdown, you can always compile your own kernel without the relavent bits in there :)
  - Re: (Score:3)
    
    by gravewax ( 4772409 ) writes:
    
    I can see it being a problem for shared providers etc. HOWEVER, I know a lot of cases in the customer I am working with they will happily accept the risk rather than take an even 5% performance hit as they use the machines for dedicated tasks so if rogue code was somehow on their they would have already been seriously compromised. So I seriously hope regardless of OS that their is a way to disable the fix easily while staying in a supported state if the performance hits are more than just FUD.
    - Re: (Score:3)
      
      by sjames ( 1099 ) writes:
      
      This is a good point. If the machine lives in a protected environment where only approved software is used by authorized users, disabling the fix to avoid the slowdown might be the right thing.
      But I'm pretty sure the slowdown in this case isn't FUD. Otherwise we'd hear Intel loudly denying it by now.
  - Re:five to 30 per cent slow down (Score:5, Informative)
    
    by Zocalo ( 252965 ) writes: on Tuesday January 02, 2018 @06:20PM (#55852009) Homepage
    
    Linux users don't even need to compile a custom kernel (although if performance *really* matters you should probably be doing so as a matter of course) as there's a boot time option in the that can be set to disable the new Page Table Isolation mode, "nopti". Without knowing the performance hit for a specific usage case and the nature of the flaw its currently impossible to say whether using it is going to be a good idea or not, but it's nice that they at least thought to include the option. Pretty sure BSD will do the same, but feel free to place your bets on commercial operating system vendors...
    
    - - Re: (Score:2)
        
        by Zocalo ( 252965 ) writes:
        
        I think that's actually going to depend more on the OS than the vendor. The available comments, urgency, and secrecy indicates a major hardware security issue being worked around, and both Apple and Microsoft have older OSs versions that are still officially in support for security patches, so this should therefore qualify. I wouldn't put it past either of them to neglect to provide the necessary switch for those older versions in an attempt to encourage people to upgrade to their current release.
        
        One o
  - Re: (Score:2, Insightful)
    
    by LinuxIsGarbage ( 1658307 ) writes:
    
    If the choice is a 30% slowdown or a massive highly dangerous security flaw, the developers will pick the 30% slowdown.
    As it is it seems a lot of developers choose "30% slowdown" over "spend some time to write not shit code".
    "Premature optimization is the root of all evil" gets turned to "Do no optimization whatsoever, have no understanding of underlying hardware, and pick the latest trendiest framework that runs on top of 5 layers of framework to provide 6502 performance from an i7."
    - Re: (Score:3)
      
      by dryeo ( 100693 ) writes:
      
      provide 6502 performance from an i7
      Good, it's a step in the direction of improving latency. Next, replace the rest of the computer with an Apple //E and we'll have something responsive.
      Article measuring latency of various computers that finds the Apple //E to have the least latency in displaying a character, https://danluu.com/input-lag/ [danluu.com]
- Re:five to 30 per cent slow down (Score:5, Informative)
  
  by Carewolf ( 581105 ) writes: on Tuesday January 02, 2018 @05:56PM (#55851843) Homepage
  
  I find it hard to believe that a virtual memory change will result in a 5-30% slowdown for Intel processors. Maybe for a few extremely specific (likely edge-case) tasks, but if there was a legitimate 5-30% performance decrease, you can bet there would be a far different solution in the works that would suitably fix the problem.
  Virtual memory access is used in every single memory access cached or not. 5% would be lucky for trying to work around a broken system. I am guessing the flaw is probably in the TLB which is meant to accelerate these things.
  
- Re: (Score:2, Offtopic)
  
  by epine ( 68316 ) writes:
  
  I find it hard to believe that a virtual memory change will result in a 5-30% slowdown for Intel processors. Maybe for a few extremely specific (likely edge-case) tasks, but if there was a legitimate 5-30% performance decrease, you can bet there would be a far different solution in the works that would suitably fix the problem.
  Of course, if microcode update fails, there's always the hail Mary unicorn ass-pull.
  I assure you, every Intel employee is kneeling on the carpet this very instant, facing the most aus
  - - Re: AMD stock? Intel Stock? (Score:2)
      
      by Monster_user ( 5075027 ) writes:
      
      Does Intel still have shares of AMD stock?
- Either that or a lawsuit (Score:2)
  
  by rsilvergun ( 571051 ) writes:
  
  you cut 30% off the performance of my CPU expect to hear about it.
  - Re: (Score:2)
    
    by Xenx ( 2211586 ) writes:
    
    Technically speaking, they're making a change to the operating system kernel and how it operates, not reducing performance of the CPU.
    - - Re: (Score:2)
        
        by Xenx ( 2211586 ) writes:
        
        Microsoft/linux kernel devs/etc are not the ones at fault. They're working around an issue in the CPU. Using your analogy, they didn't design the car. They're just forcing you to either slow down, not drive on their roads, or buy a different car.
        
        Re: (Score:3)
        
        by Xenx ( 2211586 ) writes:
        
        If you're provided three options, and only three options, you're forced to pick one. Inaction is a choice that would force at least one of the options.
  - Re: (Score:3)
    
    by Pyramid ( 57001 ) writes:
    
    "you cut 30% off the performance of my CPU expect to hear about it.
    --
    Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-... [mozilla.org]"
    I can't be the only person who sees the irony of a person complaining about performance degradation and that they make Firefox plug-ins in the same post.
- Re: (Score:2)
  
  by eclectro ( 227083 ) writes:
  
  Mail everybody a new chip from the past decade would be the "different" solution!.
- Re: (Score:2)
  
  by paskie ( 539112 ) writes:
  
  The slowdown is on syscalls. So it depends on your workload. For example, `du -s` is reportedly slowed down really by tens of percents.
- Re:five to 30 per cent slow down (Score:4, Insightful)
  
  by sjames ( 1099 ) writes: on Tuesday January 02, 2018 @07:11PM (#55852257) Homepage Journal
  
  They don't have a choice. The cost is quite believable since the workaround involves mapping the kernel in and out of the process space for every system call. Keeping it mapped in and keeping the page tables hot in the cache helps performance a lot.
  The real fix involves new silicon.
  
  - Re: (Score:3)
    
    by TheRaven64 ( 641858 ) writes:
    
    Note that this wouldn't be the first time that Linux has done this. RedHat kernels used to support a 4GB/4GB address split on i386 PAE systems. Conventionally, on 32-bit systems, the kernel reserves the top 2GB of the address space and userspace gets the bottom 2GB. When this started to get cramped, the was an option to reserve the top 1GB for the kernel and make the bottom 3GB available for userspace. Sometimes, even that wasn't enough, so with a 4GB/4GB address split both get completely separate page
- Re:five to 30 per cent slow down (Score:5, Informative)
  
  by grimr ( 88927 ) writes: on Tuesday January 02, 2018 @07:58PM (#55852523)
  
  I don't think you understand how drastic this fix is. Every time a user mode to kernel mode transition happens and every time a hardware interrupt happens, the entire page table directory layout has to be switched. This means all the TLB caches are flushed as well and that's where the main performance hit comes from.
  So if you're doing something like crypto currency mining you're not going to see much of a hit. But if you're doing a lot of I/O (file servers, database servers, web servers, etc.) you're going to see that 25-35% performance hit.
  And that's why hardware bugs are so serious. Sometimes you get lucky and it's a microcode update with no penalty. Sometimes it's a simple fix with barely any performance penalty. But sometimes you get unlucky and the fix hurts a lot and the only way to get the performance back is to swap out the hardware.
  
- Re: (Score:3, Informative)
  
  by Anonymous Coward writes:
  
  Every process runs in virtual address space--EVERY process. Low level device drivers on X86 might need to map to physical, but nearly everything is virtual. Certainly everything in user-mode.
  This is why process 1 memory location 10000 is a different piece of physical ram from process 5 memory location 10000. Different virtual address spaces. Each such mapping between each virtual address space and physical space is stored in a "page table". (Pages might also not currently exist in physical ram and the
- - - Re: (Score:2)
      
      by ewibble ( 1655195 ) writes:
      
      Why should warranty period apply, this is a manufacturing defect, that causes your machine to be insecure or slower than advertised.
      Just like if find out your car air bags won't work, they are replaced free of charge.
- - - - Re:Caching is what makes CPUs fast (Score:5, Interesting)
        
        by TheRaven64 ( 641858 ) writes: on Wednesday January 03, 2018 @01:30AM (#55853813) Journal
        
        Leaving stale ones is often fine. FreeBSD does this intentionally in the transparent superpage promotion. When it condenses adjacent pages into a single superpage in the page tables, it doesn't invalidate the TLB entries. The exact behaviour of this varies between CPUs. When you get a TLB miss in an adjacent address range, it's filled from the superpage entry and now you have two TLB entries for the same virtual address[1]. Intel will just discard the smaller one, Centaur will note the mismatch, invalidate both, and refill from the page table, gem5 will crash (I think we've upstreamed the fix for this), and I'm not sure what AMD does.
        Your example is highly unlikely, because the kernel typically reserves the top half of the address space for itself and an attempt by userspace to map anything in this range will fail. On x86 chips, there's a huge gap in the middle of the address space (it actually makes more sense to think of virtual addresses as signed values, with userspace ones being positive, kernel ones being negative, and the size of the number somewhat less than 64 bits [microarchitecture specific]). The kernel map, by default, will include the entire userspace portion of the address space for the current process, so that copies between kernel and userspace are cheap.
        This kind of TLB invalidate is actually cheap on AMD, because they implement a tagged TLB with the cr3 value as the tag, so swapping cr3 values implicitly invalidates the TLB, but the entries become valid again when you reset the entry. The PCID feature, which is apparently the cause of this vulnerability, is largely a result of the fact that AMD patented this technique and so Intel doesn't use it.
        [1] Some old SPARC chips would literally catch fire if you did this: the TCAMs would run hot enough to burn.
        
This could be massive (Score:5, Interesting)

by Artem S. Tashkinov ( 764309 ) writes: on Tuesday January 02, 2018 @05:47PM (#55851769) Homepage

The developers behind the GRSecurity project measured up to 63% performance loss [twitter.com]. If most common tasks are equally affected, Intel is sure fucked. Home users might not need to bother, but large cloud providers might be seriously affected.
Meanwhile the Linux kernel has received the largest incremental minor patch [kernel.org] in its history (229KiB) - perhaps kernel 4.14.11 already contains all the required fixes.
I have a sneaking suspicion Intel shares will fall through the floor in the next few weeks because Intel CPUs might have suddenly become quite slower than their AMD Zen based counterparts.

- Re:This could be massive (Score:5, Informative)
  
  by Artem S. Tashkinov ( 764309 ) writes: on Tuesday January 02, 2018 @05:52PM (#55851805) Homepage
  
  Some PostgreSQL results have just been released [postgresql.org]: up to 23% performance loss. This is indeed huge.
  
- What about games? (Score:2)
  
  by rsilvergun ( 571051 ) writes:
  
  or heck if you've just got a low end laptop?
  - Re: What about games? (Score:2)
    
    by Monster_user ( 5075027 ) writes:
    
    I'm already getting serious slowdown from games optimizing for other Windows operating systems. Rebooting into Windows 7 offered an incredible leap in performance on the same rig for one game in particular, Makes the game as fast as it was when I first installed it on the old OS! That is what I get for playing an MMORPG.
  - - Re: (Score:3)
      
      by Xenx ( 2211586 ) writes:
      
      You're over generalizing. You're also assuming people are playing games that are console ports. I play a few games that are already CPU limited. There are also a few popular titles that are cpu intensive. PUBG being a big one right now. I'm not saying this fix is going to make it worse, as I don't know. But, a huge hit to CPU performance would make a huge impact on those games.
- Re:This could be massive (Score:5, Interesting)
  
  by Motherfucking Shit ( 636021 ) writes: on Tuesday January 02, 2018 @10:40PM (#55853273) Journal
  
  I have a sneaking suspicion Intel shares will fall through the floor
  Intel's CEO agrees; a couple weeks ago he sold all the Intel stock he can [fool.com]. If he'd dumped any more shares, he would have had to forfeit his job. That isn't a man who's confident about the future of his company...
  
  - Re:This could be massive (Score:4, Interesting)
    
    by Zocalo ( 252965 ) writes: on Wednesday January 03, 2018 @05:18AM (#55854355) Homepage
    
    This flaw has been known about since at least October, and probably even sooner since early code to fix it starting coming out in November, which makes it seem *highly* likely that he was aware of the it and the potential impact when he took the decision to commit to the November sale. Also, as CEO, there's no way that he can plausibly claim "I didn't know about it" like the two Equifax execs who executed a stock sale just before their breach announcement did without coming across as completely incompetent and unfit for the CEO role. This reeks of insider trading, and after the outcry over the Equifax execs I can't imagine that the SEC isn't going to want to take a good hard look at this stock sale as well.
    
    I'm sure his vested options will pay for some really good lawyers but, even so, forfeiting his job could be the least of his problems.
    
- - Re:This could be massive (Score:5, Informative)
    
    by scumdamn ( 82357 ) writes: on Tuesday January 02, 2018 @05:57PM (#55851851)
    
    Doesn't look like it's everybody. https://lkml.org/lkml/2017/12/... [lkml.org]
    
    - Re: (Score:2)
      
      by sl3xd ( 111641 ) writes:
      
      Nice catch.
      I guess that's what I get for reading articles about the issue, rather than LKML...
    - - Re:This could be massive (Score:4, Informative)
        
        by grimr ( 88927 ) writes: on Tuesday January 02, 2018 @07:42PM (#55852435)
        
        Nope. Page Table Isolation is the fix and not the fault. But isolating the userland and kernel page tables means you have to switch between them each time you go from user mode to kernel mode and back. This slows things down.
        AMD CPUs don't have the bug where user mode can read kernel pages so does not require this isolation and the performance hit caused by enabling it. From the AMD email: "The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault."
        
        
        Re:This could be massive (Score:5, Interesting)
        
        by Qzukk ( 229616 ) writes: on Tuesday January 02, 2018 @09:44PM (#55853035) Journal
        
        Based on this link from Hacker News: https://cyber.wtf/2017/07/28/n... [cyber.wtf] and the linked email/patch from AMD, it looks like what happens is that AMD checks memory permissions up front before allowing an instruction into the pipeline, while Intel made the memory permission check as a later part of the pipeline, apparently after the memory was accessed and inserted into the cache.
        
Obligitory LWN link (Also affects ARM64) (Score:5, Informative)

by sl3xd ( 111641 ) writes: on Tuesday January 02, 2018 @05:52PM (#55851801) Journal

Linux Weekly News [lwn.net] has been covering this for quite a while.
5% slowdown on average, with up to 30% for some particularly bad network operations.
ARM64 is also affected [lwn.net], so it's not just intel

- Older (non-paywalled) LWN Link (Score:2)
  
  by sl3xd ( 111641 ) writes:
  
  An older link, about the KAISER patch set [lwn.net]
And, time for AMD to shine again (Score:5, Informative)

by NuclearCat ( 899738 ) writes: on Tuesday January 02, 2018 @05:57PM (#55851859) Journal

And what is interesting, AMD is immune to that, proof: https://lkml.org/lkml/2017/12/... [lkml.org]

- Re: (Score:2)
  
  by sl3xd ( 111641 ) writes:
  
  Nice catch.
AMD is safe (Score:5, Informative)

by TeknoHog ( 164938 ) writes: on Tuesday January 02, 2018 @06:00PM (#55851899) Homepage Journal

The summary is not fully explicit: this is not a flaw in Intel x86 ISA, but specific to CPUs made by Intel. AMD processors don't have the problem, so they should not need the patch.
https://lkml.org/lkml/2017/12/... [lkml.org]
This could be a huge win for AMD, because the patch incurs a measurable slowdown. At the moment, though, the Linux fix doesn't seem to distinguish between manufacturers. I expect the distinction will appear later -- better safe than sorry.

- Re: (Score:2)
  
  by Mindragon ( 627249 ) * writes:
  
  Although AMD is safe, why do they mention a 50% slowdown for AMD processors?
  > @grsecurity measured a simple case where Linux “du -s” suffered a 50% slowdown on a recent AMD CPU.
  - Re: (Score:2)
    
    by ColaMan ( 37550 ) writes:
    
    Presumably they enabled the software workaround and ran it on a bunch of CPUs. Then they picked the most alarming slowdown they could find, regardless of whether that CPU needed the workaround or not.
    Or perhaps they were just unaware at the time that AMD CPUs are not at risk.
More info on the subject (Score:5, Informative)

by trybywrench ( 584843 ) writes: on Tuesday January 02, 2018 @06:02PM (#55851911)

some of my sys admin friends posted this on a slack channel i'm in, apparently it's a big deal

http://pythonsweetness.tumblr.com/post/169166980422/the-mysterious-case-of-the-linux-page-table [tumblr.com]

Deja Vu (Score:2)

by WhoBeDaPlaya ( 984958 ) writes:

Anyone remember the TLB bug that also resulted in huge performance penalties in the first generation Phenoms? Guess it's Intel's turn.
- Re:Deja Vu (Score:4, Informative)
  
  by Artem S. Tashkinov ( 764309 ) writes: on Tuesday January 02, 2018 @06:30PM (#55852067) Homepage
  
  Results [anandtech.com] varied [techreport.com]. Some workflows experienced a very significant performance loss however, e.g. Firefox became twice as slow, WinRAR four times as slow.
  
The future (Score:5, Interesting)

by Artem S. Tashkinov ( 764309 ) writes: on Tuesday January 02, 2018 @06:44PM (#55852127) Homepage

I'm curious how much Cannon Lake and Ice Lake CPU architectures are going to be delayed. Since Cannon Lake is basically SkyLake on a 10nm node, Intel cannot release it with such a glaring hole which causes such a significant performance loss.
I've been running a Sandy Bridge CPU for seven years now, and now I'm really looking forward to the second gen Zen CPUs. Viva, competition. I'm really glad AMD is still around.

- Re: (Score:3)
  
  by sl3xd ( 111641 ) writes:
  
  I'm curious how much Cannon Lake and Ice Lake CPU architectures are going to be delayed.
  
  I'm going to go out on a limb and say "not at all"
  CPU design pipelines are pretty long; generally requiring at least a year to go from "Tape Out [wikipedia.org]" to fabrication.
  Releasing no chip (and staying with an even slower current generation) is just not an option.
  Cannon Lake and Ice lake are still an improvement on Intel's current offering, and can still compete against AMD's offering.
  Intel managed to move on after the (even more dire) disasterous NetBurst architecture [wikipedia.org]; there's no reason to believe they won't get pas
Intel CEO Sold a lot of stock... (Score:5, Interesting)

by Nos. ( 179609 ) writes: <andrew.thekerrs@ca> on Tuesday January 02, 2018 @06:56PM (#55852181) Homepage

https://www.fool.com/investing... [fool.com]
Less than a month before we know the linux kernel was being patched for this bug.

- Re:Intel CEO Sold a lot of stock... (Score:5, Informative)
  
  by sl3xd ( 111641 ) writes: on Tuesday January 02, 2018 @07:27PM (#55852347) Journal
  
  This bug has been known and reported about since early November; the original paper was presented in July of 2017, and code has been in Github since Feburary.
  Motley Fool is just noting that the Intel CEO isn't holding any more stock than he needs to.
  And there are good reasons:
  * AMD is back from the dead.
  * Intel's GPU hasn't been that successful -- they've even teamed up with AMD to put Radeon GPU's in the same die as an Intel CPU.
  * PC sales are declining as consumers shift from Intel PC's to using ARM-powered tablets & phones instead.
  * ARM is making inroads into the "desktop and laptop computer" marketplace.
  * ARM is powering most consumer electronics as well (TV's, Blu-ray players, Smart Speakers, etc)
  * Intel is absolutely nowhere in the mobile world. Mobile has one ARM to rule them all.
  * Intel missed the boat for the current generation of XBOX and PlayStation consoles.
  Intel is looking more and more like a one trick pony, and its competitors are beginning to do that one trick better too.
  
... and this is why ... (Score:5, Interesting)

by nbvb ( 32836 ) writes: on Tuesday January 02, 2018 @07:08PM (#55852247) Journal

This is why we run our mission critical workloads on SPARC and Power along side Linux. Solaris and AIX. Diversity -- in operating system, in processor, in manufacturer - is healthy. The SPARC T8's are blazing faster, secure, and don't have this nonsense. Neither do our POWER8's. Having all your eggs in the Intel+Linux basket could be a major shitshow here... meanwhile, we'll keep chugging along.

MacOS? (Score:2)

by e432776 ( 4495975 ) writes:

Have to ask, does this also affect Intel-Macs? I infer "yes", but have not read many of the detailed articles yet...
- Re: (Score:2)
  
  by Artem S. Tashkinov ( 764309 ) writes:
  
  With 99.99% certainty unless the Mach kernel does something no other OS does in order to secure kernel memory pages.
- Re: (Score:3)
  
  by bruce_the_moose ( 621423 ) writes:
  
  Cue the lawyers to initiate class action lawsuits against Apple once they release their patches to deliberately slow down older machines in the face of a hardware limitation.
Go ARM Go (Score:2, Interesting)

by smist08 ( 1059006 ) writes:

Glad my latest computer is a Raspberry Pi. Glad to be on an ARM processor. Perhaps this will help more ARM based computers become more mainstream this year.
- Re:Go ARM Go (Score:5, Informative)
  
  by andydread ( 758754 ) writes: on Tuesday January 02, 2018 @08:36PM (#55852689)
  
  really sorry to bust your ARM bubble. https://lwn.net/Articles/74039... [lwn.net]
  
We'll just live with the slowdown, pretty much. (Score:2)

by OFnow ( 1098151 ) writes:

The notion that Intel even has the capability of producing new fixed CPUs to match other than the latest packaging/pin requirements seems fanciful. In which case we'll just have to live with any slowdown. As buying all new systems is just too expensive.
Speculative Memory References and Page Faults (Score:5, Interesting)

by bill_mcgonigle ( 4333 ) * writes: on Tuesday January 02, 2018 @08:42PM (#55852713) Homepage Journal

From the AMD commit [lkml.org]:
AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault.
this can probably be rewritten in the inverse like:
Intel processors ... allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode, [including]
when that access would result in a page fault.
So it seems like: set up a speculative memory reference to a kernel memory structure, cause a page fault, and then get a bit of kernel memory out (and back in?). That could get you root before long. Some people have been saying this can be leveraged to get a guest into its hypervisor too.

- Re: (Score:2)
  
  by sl3xd ( 111641 ) writes:
  
  If you read one of the original articles about the KAISER patch set [lwn.net]: a commenter asked about microkernels, and the reply is that since it's a hardware issue, both microkernels & monolithic kernels have to pay the same price.
  - Re:tanenbaum's revenge? (Score:4, Insightful)
    
    by mikael ( 484 ) writes: on Tuesday January 02, 2018 @06:16PM (#55851997)
    
    And this comment. Someone could feel the storm coming.
    KAISER: hiding the kernel from user space
    Posted Nov 16, 2017 7:21 UTC (Thu) by alkbyby (subscriber, #61687) [Link]
    Looks like something bad is coming. Such as mega-hole maybe in hardware that can be mitigated by hiding kernel addresses.
    Otherwise I cannot see why simply hiding kernel addresses better, suddenly becomes important enough to spend massive amount of cpu on it.
    - This isn't the first time. There was a problem a decade ago with Intel CPU's, when separate process threads could access each others data through cache memory.
    
- Re: (Score:2)
  
  by Artem S. Tashkinov ( 764309 ) writes:
  
  Have you just hit infamous bug 12309 [kernel.org]? It's ostensibly fixed however people keep reproducing it (just google for heavy I/O operations slow down my PC).
  - Re: (Score:2)
    
    by sl3xd ( 111641 ) writes:
    
    Nah, GP didn't configure the kernel's settings properly.
    There's more to running without swap than not enabling a swap file/partition. You have to configure swappiness, the oom_adj_score for what to kill when memory runs out, and so on.
- Re: (Score:3)
  
  by Chrisq ( 894406 ) writes:
  
  If I bought a car with X horsepower, and suddenly something was found wrong with it and it had to be modified to work, and was suddenly X - 10%, I'd expect compensation.
  Funnily enough people are bringing a class action lawsuit [theguardian.com] for exactly this against Volkswagen after a fix to the emissions cheat reduced the power of their cars.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Don't know if serious. (Score:3, Funny)

Re:Don't know if serious. (Score:5, Funny)

Re: (Score:3)

FOOF (Score:5, Insightful)

FUCKWIT (Score:3)

Re:FUCKWIT (Score:5, Insightful)

HPC cluster (Score:3)

Re: (Score:3)

Re: (Score:3)

Re:FOOF (Score:5, Informative)

Re:FOOF (Score:5, Informative)

Re:FOOF (Score:5, Interesting)

In all fairness... (Score:3, Insightful)

Re:In all fairness... (Score:5, Informative)

Re:In all fairness... (Score:4, Informative)

Re:In all fairness... (Score:5, Informative)

Re:In all fairness... (Score:4, Interesting)

Re: (Score:3)

Re:In all fairness... (Score:5, Insightful)

Re:In all fairness... (Score:4, Funny)

Re:In all fairness... (Score:5, Insightful)

Re: (Score:2)

Re: In all fairness... (Score:4, Insightful)

Re: (Score:3)

Re: (Score:3, Informative)

Re: In all fairness... (Score:4, Interesting)

How could this be abused? (Score:5, Interesting)

Re:How could this be abused? (Score:5, Insightful)

Re: (Score:2, Interesting)

Re:How could this be abused? (Score:4, Insightful)

Re:How could this be abused? (Score:4, Informative)

Re:How could this be abused? (Score:4, Informative)

Re:How could this be abused? (Score:5, Interesting)

Re:How could this be abused? (Score:5, Interesting)

five to 30 per cent slow down (Score:3)

Re: five to 30 per cent slow down (Score:2)

Re: (Score:2)

Re: (Score:2, Interesting)

Re: five to 30 per cent slow down (Score:5, Interesting)

Re: (Score:3)

Re: (Score:3)

Re: (Score:3)

Re:five to 30 per cent slow down (Score:5, Informative)

Re: (Score:2)

Re: (Score:2, Insightful)

Re: (Score:3)

Re:five to 30 per cent slow down (Score:5, Informative)

Re: (Score:2, Offtopic)

Re: AMD stock? Intel Stock? (Score:2)

Either that or a lawsuit (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re:five to 30 per cent slow down (Score:4, Insightful)

Re: (Score:3)

Re:five to 30 per cent slow down (Score:5, Informative)

Re: (Score:3, Informative)

Re: (Score:2)

Re:Caching is what makes CPUs fast (Score:5, Interesting)

This could be massive (Score:5, Interesting)

Re:This could be massive (Score:5, Informative)

What about games? (Score:2)

Re: What about games? (Score:2)

Re: (Score:3)

Re:This could be massive (Score:5, Interesting)

Re:This could be massive (Score:4, Interesting)

Re:This could be massive (Score:5, Informative)

Re: (Score:2)

Re:This could be massive (Score:4, Informative)

Re:This could be massive (Score:5, Interesting)

Obligitory LWN link (Also affects ARM64) (Score:5, Informative)

Older (non-paywalled) LWN Link (Score:2)

And, time for AMD to shine again (Score:5, Informative)

Re: (Score:2)

AMD is safe (Score:5, Informative)

Re: (Score:2)

Re: (Score:2)