Entropy Problems For Linux In the Cloud

Posted by kdawson
from the mehr-Zufälligkeit dept.
CalTrumpet writes "Our research group recently spoke at Black Hat USA on the topic of cloud computing security. One of the interesting outcomes of our research was the discovery that the combination of virtualization technologies and public system images results in a problem for random number generation on guest operating systems. This is especially true for Linux, since its PRNG uses only a small set of entropy-gathering events, and virtual Linux images often generate SSH host keys within seconds of their initial boot. The slides are available; the PRNG vulnerability material begins at slide 63."

  • by Anonymous Coward on Monday August 03, 2009 @08:34PM (#28935067)

    TPM chips have their bad things, but one thing they do offer is a cryptographically secure RNG. It's completely understandable not to trust it 100%, but you can use the random number stream it puts out as a good addition to the /dev/random pool.

    • by iYk6 (1425255) on Monday August 03, 2009 @08:37PM (#28935097)

      Or you could plug in a microphone.

      • Or you could plug in a microphone.

        Assign all of the virtual servers a unique 256-bit ID. XOR that with 256 bits of input from any USB device that measures the real world, and send it through a hash algorithm. USB devices are easy for virtual servers to access.

        Perhaps better, have a 256-bit seed for each server as above, but have the host server distribute 256 bits at startup time using a microphone run through a hash algorithm.
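
A minimal sketch of the XOR-then-hash idea above, not a vetted design. The function name `derive_seed` and the choice of SHA-256 are assumptions for illustration; the comment only says "a hash algorithm". The device sample would come from whatever real-world sensor the guest can read.

```python
import hashlib


def derive_seed(server_id: bytes, device_sample: bytes) -> bytes:
    """XOR a 256-bit per-server ID with 256 bits of device input, then hash.

    Hypothetical helper: SHA-256 is an assumed choice of hash.
    """
    if len(server_id) != 32 or len(device_sample) != 32:
        raise ValueError("both inputs must be exactly 32 bytes (256 bits)")
    # Byte-wise XOR of the two 256-bit values.
    mixed = bytes(a ^ b for a, b in zip(server_id, device_sample))
    # Hash the mixed value to produce the final 256-bit seed.
    return hashlib.sha256(mixed).digest()
```

Note that the result is only as unpredictable as the device sample: if every guest on a host sees the same sample, the unique ID makes the seeds differ but does not make them secret from anyone who knows the IDs.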

      • by profplump (309017)

        Unless you wanted all of the servers (and all of the VMs on each server) to have *different* entropy sources, which was the whole point of TFA. Unless you run a lot of single-device racks each in their own room you're just going to end up with an expensive way to get exactly the same "random" data on each machine.

        There's also some correlation between things like disk activity and sound output of the machine; there may be some entropy available in the ambient sound -- it may even be chaotic -- but it's certa

        • by hairyfeet (841228)
          There are plenty of free webcams that will let you look at, say, the Eiffel Tower, or even out at some guy's yard to watch his grass grow. Couldn't those free feeds be used to generate the random number?
          • by profplump (309017) <zach-slashjunk@kotlarek.com> on Tuesday August 04, 2009 @12:59AM (#28936765)

            First, real-world images are not very random just by virtue of being part of the real world; random things also need to happen. This is particularly true of mostly-static images like you'd see in 24/7 web cams -- there is not much entropy available.

            Second, most of the reason we want random data for seeding purposes is because the seed needs to be something an attacker cannot derive. The output of a truly random number generator cannot be predicted by a remote attacker, but publicly available video streams most certainly can, so any source that sends the same data to more than one person is not suitable for things like cryptography. Frankly that's the whole point of the article; if there are many VMs on the same host, or many real hosts on the same hardware and network, started at the same time, and using the same source for entropy, they will all generate the same "random" number.

            Finally, this is a well-solved problem. Many CPUs and motherboards include a hardware RNG that is perfectly sufficient both in terms of randomness and speed for typical PRNG seeding needs. VIA has had one directly in all their CPUs for a long time, Intel includes one in their firmware hubs, and I'm sure there are similar options on most other architectures. Using that on-board RNG to individually seed each VM/host would solve the problem described in the article. There's no reason to try to invent ways to get random data unless you have very specific requirements not met by the existing solutions, as you're quite likely to come up with something inferior either in design or implementation.

          • by Fred_A (10934)

            The Eiffel Tower hasn't changed that much in the past 110 years... MIT (IIRC) used to have a random data feed generated from a webcam and a lava lamp. I guess each datacenter ought to invest in a lava lamp...

            Correction: after a quick Google, it seems that it was SGI [wired.com], and that they actually patented the thing (which may or may not mean something depending on your jurisdiction). The site was (and still is apparently) lavarnd.org [lavarnd.org].

      • Background sound in a server room is likely to be a fairly regular hum. Better yet is a Geiger counter picking up background radiation.
      • by evanbd (210358) on Monday August 03, 2009 @11:56PM (#28936421)

        You don't need a mic. The resistor noise on the sound card inputs is present and of secure quantum origin, regardless of whether a microphone is plugged in. The microphone noise is louder, but it's much harder to determine how much secure entropy is present. Why trust it when you don't have to? There's plenty available for most purposes without it. The Turbid [av8n.com] program does this in an efficient and secure manner (and they have a paper discussing the details, along with the relevant proofs, for the curious).

    • TPM chips have their bad things

      They do? It's secure crypto hardware.. what's evil about that? Yes you have scary evil like Palladium but you don't have to install it if you don't want to. And if machines take control you can always disable the device from the BIOS.. (given you don't care about any data which has encryption keys stored only in the module)

    • or just write a script to check my favorite stock values... pretty random, as far as I can tell. But trending down :-(
      • All my favorite stock values are ones with a lot of zeros after it. Too bad none of my stocks are at those values.
    • by Isao (153092)
      TPM doesn't buy you anything in a VM - the virtualized environment has to trust the host that it's getting the correct certificates and data. The VM doesn't have direct access to the TPM, and the TPM won't export private keys. Also, that's only one set of keys per TPM, so multiple/portable VMs aren't realistic. I'm in favor of the tools TPM offers users (vice content producers) but I don't believe this is a good fit.
  • Getting creative (Score:3, Interesting)

    by Brian Gordon (987471) on Monday August 03, 2009 @08:36PM (#28935085)
    How about getting signed entropy from a trusted server on the network/internet? How about putting that microsecond-accurate system clock to use?
  • by BadAnalogyGuy (945258) <BadAnalogyGuy@gmail.com> on Monday August 03, 2009 @08:39PM (#28935115)

    Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

    • Re: (Score:3, Insightful)

      That's like asking "Why can't they add a DWIM opcode to the instruction set?"
    • by ShadowRangerRIT (1301549) on Monday August 03, 2009 @08:49PM (#28935181)

      First, the cost of computing truly random numbers is way too high for that, unless you are performing an iterative approach to random number generation (and then you have the problem of predictability). It could be done, but you'd be pumping a lot of hardware into computing values that would be thrown away 99.9%+ of the time.

      Secondarily, if your PRNG algorithm is broken, you're stuck replacing the hardware. At least a bad software PRNG can be replaced.

      That said, hardware PRNG is provided in many modern systems by a TPM [wikipedia.org]. It lacks the performance problems associated with your solution, since it only generates random numbers on demand. It still has the problem of a potential exploit being discovered leading to expensive hardware upgrades, but to my knowledge that has not been a problem to date.

      • Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

        First, the cost of computing truly random numbers is way too high for that

        Computers are deterministic. Truly random numbers cannot be computed, they can only be provided by special hardware (something that can measure shot noise or thermal noise, a camera pointed at a lava lamp, a movement detector in Schrodinger's cat's box).

        Secondarily, if your PRNG algorithm is broken, you're stuck replacing the hardware.

        That's why you don't do pseudo-random numbers, but real randomness from thermal noise or shot noise or some other quantum effect (cats and lava lamps don't fit on ICs).

        That said, hardware PRNG is provided in many modern systems by a TPM.

        And at some level, the randomness generator on the TPM almost certainly has an interface of "read this special register every X clock cycles" (because how else would you interface with your special hardware?).

        It lacks the performance problems associated with your solution, since it only generates random numbers on demand.

        If it's implemented in hardware (as it must be, to get true randomness), it's always running and there is no "on demand".

        It still has the problem of a potential exploit being discovered leading to expensive hardware upgrades, but to my knowledge that has not been a problem to date.

        That would be because it's a RNG instead of a PRNG.

        • by hardburn (141468)

          That's why you don't do pseudo-random numbers, but real randomness from thermal noise or shot noise or some other quantum effect (cats and lava lamps don't fit on ICs).

          A small radiation source/detector, like the ones in smoke detectors, can work just fine for this purpose. Since radiation is the result of quantum interactions, the output is truly random due to the nature of the universe.

          • by drerwk (695572)
            While the radiation is random, the measurement being so depends on a perfect detector.
            • by profplump (309017)

              Only if you demand perfect randomness (for which there is little practical use in typical computers). And even then "perfect" only means "perfectly preserving randomness" not "correctly detecting every single event/non-event". Given the relative simplicity of a radiation detector "perfect" or some very close equivalent thereto is probably not all that unrealistic anyway.

    • Why can't the CPU contain a register which holds a random number which is updated with every clock cycle?

      Some do have something like that [via.com.tw], although it's only about 800kbps instead of 4 bytes per cycle.

    • by hardburn (141468)

      There are CPUs (or more often, chipsets) that provide RNGs, along with a few other hardware implementations of crypto algorithms. Most of them are meant for smaller computers, though, like the VIA C3. I wish they were more widespread and used.

    • by profplump (309017)

      Many do. VIA has had CPU-integrated dual-oscillator hardware RNGs for a long time. Intel firmware hubs also commonly contain a hardware RNG, as do other motherboard architectures.

      They aren't very fast sources of random data -- it's actually pretty hard to get truly random data, even outside the world of desktop CPUs -- but they are fast enough to provide a relatively long seed for a PRNG within seconds of boot. Assuming you use a reasonable PRNG, providing a truly random seed is sufficient to let the PRNG g

  • Surely Not. (Score:3, Insightful)

    by lobiusmoop (305328) on Monday August 03, 2009 @08:40PM (#28935125) Homepage

    Generating SSH keys involves interaction via at least keyboard and possibly mouse at a terminal. Surely that basic premise is enough to provide enough entropy for the pseudo-random generator. Also, the date and time (as sources of randomness) can't be virtualized of course.

    • Not surely (Score:4, Interesting)

      by Kaseijin (766041) on Monday August 03, 2009 @08:51PM (#28935189)

      Generating SSH keys involves interaction via at least keyboard and possibly mouse at a terminal.

      SSH host keys are often generated automatically when the init script notices there aren't any.

    • Generating SSH keys involves interaction via at least keyboard and possibly mouse at a terminal.

      If you use PuTTY, yes. OpenSSH, at least, doesn't require anything in particular, just a sufficient amount of entropy. On a properly configured system, moving a mouse or banging randomly on the keyboard might feed entropy -- but then, so would plugging a microphone into the sound card, or any number of other things.

      And as Kaseijin mentions, this is about host keys. Especially in a virtualized environment, you can't assume any sort of human interaction when these keys are generated.

    • Last I checked the date and time are anything but random.

  • by Facegarden (967477) on Monday August 03, 2009 @08:47PM (#28935173)

    All this complaining over random numbers is silly. All you really have to do is use 5. It's just as random as any other number, and it's easy to generate a 5.
    -Taylor

    • "and it's easy to generate a 5"

      And you can generate it at any random point in time too.

      But somehow, being as random as anyone else, I prefer 42.

    • by BobisOnlyBob (1438553) on Monday August 03, 2009 @09:29PM (#28935421)

      This only proves how easy it is to generate a (5, Funny).

    • Re: (Score:3, Funny)

      by hannson (1369413)

      int getRandomNumber()
      {
          return 4; // chosen by fair dice roll.
                    // guaranteed to be random.
      }

    • by tylerni7 (944579)
      http://xkcd.com/221/ [xkcd.com]
      It's cool, it's different because you used a 5.
    • by jefu (53450)

      No, the only true random number is 17. This was asserted by several mathematicians who used several lines of reasoning (one rather like this [flickr.com]). Then you get the random sequence 17,17,17... and the random rational 0.17171717... and lots of other perfectly good random numbers. Though you probably shouldn't use them as a source for cryptographically strong random numbers.

    • Re: (Score:3, Insightful)

      by jamesh (87723)

      Interesting that both Dilbert (years ago) and xkcd (more recently) both contain a comic with a similar joke...

    • by sam0737 (648914)

      Hell! I have been using 42 for a long time. And you know that's the Answer to Life, The Universe and Everything, including but not limited to generating random numbers, I suppose.
      I bet it works better than 5.

  • by lamber45 (658956) <lamber45@msu.edu> on Monday August 03, 2009 @08:52PM (#28935201) Homepage Journal
    The nice thing about Linux is that you can develop whatever entropy-producing process you want and write its output to /dev/urandom to add more entropy to the pool. For instance, a boot script could issue an HTTP request to a website backed by a hardware random-number generator (access-control to only machines in the cloud by IP range). It is something to be worried about, though.

    Java code that does cryptography or generates UUIDs (in the hope that they will be a truly universal key for something) operates under similar problems. JavaScript is even worse; all it has is the time, perhaps the user's window-size (not very random if maximised) and mouse-movements, and the built-in random() method, which is not expected to be of cryptographic quality.
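
A minimal sketch of the "write its output to /dev/urandom" step from the first paragraph. The helper name `mix_into_pool` is hypothetical, and where the bytes come from (a hardware RNG, a boot script, and so on) is left to the caller. As far as I know, a plain write like this mixes the data into the kernel pool but does not raise the kernel's entropy estimate; crediting entropy needs a privileged ioctl, which this sketch does not attempt.

```python
def mix_into_pool(data: bytes, device: str = "/dev/urandom") -> int:
    """Write bytes to the random device so the kernel mixes them into its pool.

    Hypothetical helper. Returns the number of bytes written.
    """
    # Unbuffered binary write, so the data reaches the device immediately.
    with open(device, "wb", buffering=0) as pool:
        return pool.write(data)
```

Passing a different `device` path makes the helper easy to try against an ordinary file before pointing it at the real device.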

    • by gd2shoe (747932)

      Interesting idea, though I would recommend HTTPS (pre-shared self signed cert would be sufficient for in-house use). If predictability is the problem you're trying to avoid, you want to skirt the packet sniffers.

      By the way, why write to /dev/urandom, and not /dev/random? Doesn't /dev/urandom act as a front for /dev/random except when the entropy pool is empty (at which point it goes pseudo-random)? Just curious.

      • by CalTrumpet (98553) on Monday August 03, 2009 @10:48PM (#28935963)
        Actually, /dev/random and /dev/urandom have their own, separate secondary pools that are fed off of a main pool when entropy is "depleted" in the second level pools. This is an area of research for us as well, since Linux's entropy estimation algorithm fails in situations where the timing deltas of entropy gathering events (IRQs and disk IOs) are actually predictable, so it's possible that the second level pools are not being refreshed at appropriate times.

        If you write to /dev/urandom, it goes into the primary pool by tradition. This is what the rc scripts do on bootup with the random seed file on disk.

        BTW, it's absolutely the wrong solution to get entropy from another source on the network (for many reasons, but one is that you can't do a secure HTTPS handshake without, you guessed it, unguessable random numbers). The whole point here is that we are looking for a way for 500 Linux instances on EC2 to have different entropy pools before the kernel completes boot. The only possible solution is for the hypervisor (Xen for Amazon) to provide a simulated HW RNG that pulls entropy from a real HW RNG or from an entropy daemon in the hypervisor.

        The best way to learn about Linux RNG basics is Gutterman et al., Analysis of the Linux Random Number Generator [iacr.org]. Several of the issues they describe have been addressed, such as their PFS concerns, but their description of the entropy pools is still accurate.
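
For anyone who wants to watch the estimate discussed above, here is a small sketch that reads the kernel's own entropy counter from procfs. The function name `entropy_estimate` is made up for illustration. Keep in mind the point made in the comment: this number is the kernel's estimate, and it can be too optimistic when the timing of the gathering events is predictable.

```python
from pathlib import Path

# Standard procfs location of the kernel's entropy estimate on Linux.
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def entropy_estimate(path: str = ENTROPY_AVAIL) -> int:
    """Return the kernel's current entropy estimate, in bits.

    Hypothetical helper: it only reports what the kernel believes,
    not how much unpredictability is really in the pool.
    """
    return int(Path(path).read_text().strip())
```

Sampling this right after boot on a freshly cloned image, before and after host key generation, is one rough way to see how little the pool has gathered at that point.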
        • by gd2shoe (747932)

          Dang, you're right. Thanks for the post.

        • by julesh (229690)

          BTW, it's absolutely the wrong solution to get entropy from another source on the network (for many reasons, but one is that you can't do a secure HTTPS handshake without, you guessed it, unguessable random numbers). The whole point here is that we are looking for a way for 500 Linux instances on EC2 to have different entropy pools before the kernel completes boot.

          If we're talking about a VM, what's wrong with setting up a point-to-point link with the host machine and accessing an entropy source over that,

  • by Anthony Liguori (820979) on Monday August 03, 2009 @08:55PM (#28935227) Homepage

    CONFIG_HW_RANDOM_VIRTIO enables it. It's been there for quite a while.

    We could easily support it in KVM but I've held back on it because to really solve the problem, you would need to use /dev/random as an entropy source. I've always been a bit concerned that one VM could starve another by aggressively consuming entropy.

    lguest does support this backend device though.

    • Why not set a rate limit on entropy consumption then?
      • I guess you could set it as an option, but the threshold between a useful amount of entropy and what it would take to starve another is often overlapping, so it wouldn't be much help in any but the most controlled situations--which is exactly when you wouldn't need the option.

        • by profplump (309017)

          I don't understand how entropy consumption is fundamentally different than I/O consumption or memory consumption, or why it would need a different solution to the problem of competing demands for scarce resources.
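
The rate-limit suggestion in this subthread can be sketched as an ordinary token bucket, the same mechanism often used for I/O throttling. Everything here is an assumption for illustration: the class name `EntropyBudget`, the parameters, and the use of `os.urandom` as a stand-in for the host's entropy source. It is not how KVM or any hypervisor actually does it.

```python
import os
import time


class EntropyBudget:
    """Per-guest token bucket limiting how fast random bytes are handed out."""

    def __init__(self, bytes_per_second: float, burst: int, clock=time.monotonic):
        self.rate = bytes_per_second   # refill rate, in bytes per second
        self.burst = burst             # maximum bytes available at once
        self.clock = clock             # injectable clock, handy for testing
        self.tokens = float(burst)     # start with a full bucket
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at burst.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def take(self, n: int) -> bytes:
        """Return n random bytes if the guest has budget, else empty bytes."""
        self._refill()
        if n > self.tokens:
            return b""
        self.tokens -= n
        return os.urandom(n)       # stand-in for the host entropy source
```

A host would keep one bucket per guest, so an aggressive guest drains only its own budget. Choosing the rate and burst is the hard part raised above, and this sketch does not settle it.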

    • Wouldn't it be sufficient to use the host random pool as a seed for some sort of strong PRNG?
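
One way to picture the question above: the host draws a single seed from its own pool and expands it into a separate stream per guest. The sketch below uses HMAC-SHA256 in a simple counter mode. The function name `guest_stream`, the guest ID strings, and the construction itself are illustrative assumptions, not a reviewed DRBG and not what any hypervisor ships.

```python
import hashlib
import hmac


def guest_stream(host_seed: bytes, guest_id: str, nbytes: int) -> bytes:
    """Expand one host seed into nbytes of output specific to guest_id.

    Illustrative construction only (HMAC-SHA256 in counter mode).
    """
    # Derive a per-guest key so that guests never share a stream.
    key = hmac.new(host_seed, guest_id.encode(), hashlib.sha256).digest()
    out = b""
    counter = 0
    while len(out) < nbytes:
        # Each block is HMAC(key, counter); the counter keeps blocks distinct.
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:nbytes]
```

With this shape, the host pool is tapped once per seed rather than once per guest request, which sidesteps the starvation worry, at the cost of every guest's output depending on the secrecy of that one seed.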

    • Re: (Score:3, Funny)

      by plasmacutter (901737)

      I heard the aliens from zeta reticuli utilize paravirtual entropy drivers to get to earth.

  • I'd like some evidence that cloud computing is a fad. Tens of thousands of companies, in dozens of industries, do not list "computing hardware, availability, and capacity management" as a core competency, making them prime cloud customers.
    • Re: (Score:3, Insightful)

      by jellomizer (103300)

      It is a tool in the bucket. That's what it is. There will be a huge growth spurt, then they will realize that it won't solve everything. Then they will cut back and still use it until they find something better.

  • Seems to me this could be solved via the "Guest Additions" module that most virtualization packages recommend you install in the guest OS. Use the GA to inject some entropy from the host system into the guest system's entropy pool. The host CPU's TSC register would probably be an excellent source.
  • Eh? (Score:4, Interesting)

    by ledow (319597) on Monday August 03, 2009 @09:37PM (#28935471) Homepage

    If you "need" cloud computing, then you're bright enough to install an entropy daemon on one of the machines and maybe even slap a hardware-based RNG on it (probably worth sourcing a VIA or similar just for this purpose, to be honest). It's not hard.

    Anything else, your "randomness" really doesn't matter and the standard entropy will be just fine.

    • Bullshit. Any network connected host *needs* to be able to generate unguessable random numbers. Otherwise, that host might as well be a member of a botnet already.
      • Re: (Score:3, Insightful)

        by ledow (319597)

        A bold assertion. I assume you're thinking of TCP sequence numbers or similar. Otherwise, I call bullshit on the "ANY".

        And the entropy provided by being connected to a network in any way, shape or form is enough for that purpose.

        Even in general, unless you're generating LOTS of SSH/SSL keys on some kind of automated process schedule, you're fine, and that's the sort of task that should be pushed out to a dedicated entropy machine.

        Otherwise, every ADSL router etc. in the WORLD would be worthless - no keybo

    • by julesh (229690)

      If you "need" cloud computing, then you're bright enough to install an entropy daemon on one of the machines and maybe even slap a hardware-based RNG on it (probably worth sourcing a VIA or similar just for this purpose, to be honest). It's not hard.

      Err... yes, it is. Where does your entropy daemon get its entropy from? How do you install the hardware given that you're running in a VM hosted on somebody else's machine, located in somebody else's datacentre? This is an issue that can only be solved by the

  • by zullnero (833754) on Monday August 03, 2009 @10:05PM (#28935631) Homepage

    "The term cloud computing is useless" said Stamos. "It's way overused. It's mostly about gathering venture capital or selling your products."

    Yes. Because no one on the Internet has any use for gathering venture capital or selling products.

    It IS an overused term, but you're not testing some product or how people are using it, you're really just testing the security models of various operating systems to determine which are more ready to support those concepts that people grouped together and called "cloud computing". There were a lot of various concepts that were grouped together that comprised the "Net 2.0" concept too...and that cliche was just as derided for being overused. And yet, websites that aren't all ajaxed up or don't use css seem pretty old-fashioned these days.

    That said, the question I have is how ready for those "cloud computing" concepts is Windows, really? How much of that security model is using the proper approach to securing a transaction instead of just shutting down that path altogether?

  • This is not a "cloud" problem. This is a virtual server and image problem. Clouds have nothing to do with virtual servers. If you use a service like NewServers.com, you can get dedicated physical servers for your cloud, on-demand and at hourly prices.
    • by CalTrumpet (98553)

      While the origin of the issue is the virtualization layer, it is more specifically a cloud problem because most IaaS/VPS providers use standard images with a public random seed file, so everybody's machine initializes up to the same state (RTC + random seed + handful of interrupts).
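
A toy model of the point above, to show why identical boot inputs are a problem. It hashes the three inputs named in the comment (RTC, seed file, a handful of interrupt timings) into a "pool state". The function `initial_pool_state` and the use of SHA-256 are assumptions for illustration; the real kernel mixing function is different.

```python
import hashlib
from typing import Sequence


def initial_pool_state(rtc_seconds: int, seed_file: bytes,
                       irq_deltas: Sequence[int]) -> bytes:
    """Toy stand-in for early-boot pool state: a hash of all boot inputs.

    Not the kernel's algorithm; it only illustrates determinism.
    """
    h = hashlib.sha256()
    h.update(rtc_seconds.to_bytes(8, "big"))   # real-time clock at boot
    h.update(seed_file)                        # seed file shipped in the image
    for delta in irq_deltas:                   # timing deltas of early interrupts
        h.update(delta.to_bytes(4, "big"))
    return h.digest()
```

Two instances booted from the same public image in the same second, with the same interrupt timings, end up in the same state under this model, and anything derived from that state (such as a host key) would match too. An attacker who can guess the inputs can recompute the state the same way.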

    • Re: (Score:3, Informative)

      by julesh (229690)

      This is not a "cloud" problem. This is a virtual server and image problem. Clouds have nothing to do with virtual servers. If you use a service like NewServers.com, you can get dedicated physical servers for your cloud, on-demand and at hourly prices.

      Expanding on the other answer you've received, here's the basic problem:

      I can take a virtual server, install an image with a well-known PRNG seed in it, and use it for a little while. While it's used the PRNG is updated by entropy in an unpredictable way, resulting eve

  • I did a system wipe and rebuild (which re-installs CentOS from scratch), SSH'ed in and... got no warning. The system's SSH keys were identical to those of the previous build. Needless to say I generated a local set and uploaded them.
  • FTA... (Score:5, Funny)

    by NotBorg (829820) on Monday August 03, 2009 @10:52PM (#28935999)

    "This falls somewhere between a very big deal and irrelevant," says Wagner.

    I'm glad he cleared that up for me.

    • So, it resembles an act of Congress then.

      (Although one could argue that by simultaneously fulfilling both opposing states, Congress is more like a quantum computing machine.)

  • There's a simple random number generator based on a radioactive source [fourmilab.ch] on-line. That can be accessed through a Java app, and the hardware info needed to build one of your own is on line. There are commercial random number generators. [comscire.com] USB, even. A serious data center should have a few of these.

  • I set up and scanned a number of virtual machines for a network security class Spring of this year. I noticed the following was the typical output of nmap when scanning the virtual host (in this case the VM was Fedora Core 10 hosted on CentOS 5.3 running a 2.6.29 custom kernel):

    Network Distance: 1 hop
    TCP Sequence Prediction: Difficulty=0 (Trivial joke)

    Running nmap against the same host but the physical OS (currently FC-11) gives:

    TCP Sequence Prediction: Class=random positive increments
    Diffic

  • As has been so often said, the generation of random numbers is too important to be left to chance. :-)

  • Old hat? (Score:4, Informative)

    by GiMP (10923) on Tuesday August 04, 2009 @01:43AM (#28936955)

    Disclaimer: I work for a hosting company doing VPS/cloud hosting.

    This is pretty old-hat. First, the host-keys issue inside pre-generated images is a very obvious one, although I'm not too surprised that companies aren't considering it. RNG issues aren't quite as obvious, but they're not super-secret either, anyone with any amount of background in security has been aware of this for a while.

    In fact, questions regarding RNGs have even surfaced in the ##xen IRC channel (freenode.org) because it is a very important issue to some. In particular, those with the need for hardware RNG solutions have come seeking assistance.

    I'm certainly not minimizing the issue, just noting that it isn't really a new one at all. More than anything, the average systems administrator has been slow to realize this, and developers slower still.

  • by flok (24996) <mail@vanheusden.com> on Tuesday August 04, 2009 @02:54AM (#28937217) Homepage Journal
    This problem has been solved: use EntropyBroker [vanheusden.com]: a physical machine gathers entropy data and distributes it to the virtual machines. If I remember correctly, KVM has a special driver for feeding the VM with entropy data from the host system.
  • ...should be importing it explicitly (eg to create important crypto keys) from an external source, such as random.hd.org (mine) or random.org or whatever.

    Rgds

    Damon

  • by Lennie (16154) on Tuesday August 04, 2009 @10:24AM (#28940749) Homepage

    Those just use process-namespaces and the same kernel and you are done with it.
