Ultra-low-cost True Randomness
Cryptocrat writes "Today I blogged about a new method for secure random sequence generation that is based on physical properties of hardware, but requires only hardware found on most computer systems: from standard PCs to RFID tags." Basically he's power-cycling memory and looking at the default state of the bits, which surprisingly (to me, anyway) can both fingerprint systems and generate true random numbers. There is also a PDF paper on the subject if you're interested in the concept.
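Very roughly, the trick looks something like this toy simulation (the per-cell bias model and the SHA-256 condensing step are my assumptions for illustration, not the paper's exact construction):

    import hashlib
    import random

    random.seed(42)
    # Toy SRAM model (an assumption, not the paper's setup): process
    # variation gives each cell a fixed power-up bias toward 0 or 1.
    BIAS = [random.random() for _ in range(4096)]  # P(cell powers up as 1)

    def power_up():
        """Simulate one power cycle: sample every cell against its bias."""
        return [1 if random.random() < p else 0 for p in BIAS]

    samples = [power_up() for _ in range(10)]
    # Cells that read the same on every power-up identify the chip;
    # cells that flip are a source of physical randomness.
    stable = {i: samples[0][i] for i in range(len(BIAS))
              if all(s[i] == samples[0][i] for s in samples)}
    noisy = [i for i in range(len(BIAS)) if i not in stable]

    fingerprint = hashlib.sha256(repr(sorted(stable.items())).encode()).hexdigest()
    fresh = power_up()  # one more power cycle for fresh noise
    seed = hashlib.sha256(bytes(fresh[i] for i in noisy)).hexdigest()
    print("fingerprint:", fingerprint[:16], "random seed:", seed[:16])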
A Slightly More Expensive Method (Score:5, Interesting)
But in all seriousness, I wonder how this compares to the Mersenne Twister [wikipedia.org] (Java implementation [gmu.edu] & PDF [hiroshima-u.ac.jp]) that I use at home? I am almost sure this new proposed method is more efficient and faster; when will there be (I know, I'm lazy) a universal implementation of it?
Also, this may be a stupid question, but I wonder how one measures the 'randomness' of a generator? Is there a unit that represents randomness? It seems impossible to do by observing the output alone, so I guess all you can do is discuss how dependent it is on particular prior events, and what those events are, theoretically. Can you really say this generator is 'more random' than another just because you have to know so much more beforehand about the particular machine and its fingerprint in order to predict its output?
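For what it's worth, the usual answer is batteries of statistical tests (Diehard, NIST SP 800-22) plus entropy estimates; here is a minimal monobit frequency check in Python (my illustration, following NIST's published formula):

    import math

    def monobit_pvalue(bits):
        """NIST SP 800-22 frequency (monobit) test: checks whether the
        numbers of 0s and 1s are consistent with a fair source."""
        n = len(bits)
        s = sum(1 if b else -1 for b in bits)
        return math.erfc(abs(s) / math.sqrt(2 * n))

    # Convention: p >= 0.01 means 'consistent with random' for this property.
    print(monobit_pvalue([1, 0] * 500))  # balanced: p = 1.0, passes
    print(monobit_pvalue([1] * 1000))    # all ones: p ~ 0, fails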
P(bit) vs. fabrication variations (Score:3, Interesting)
This is a very interesting phenomenon, but a lot more data is needed to show that it provides consistent behavior.
A VERY interesting idea... (Score:5, Interesting)
The random-number generation relies on the fact that:
a: Many of the bits are only sorta random, but they are physically random. So: very biased coins, but true randomness.
b: With the right reduction function, you can turn a LOT (e.g., 512 Kb) of cruddy random data into a small amount (128b-512b) of very high quality, well-distributed randomness (a sketch of one such reduction follows this list).
And the fingerprinting relies on the fact that:
a: Many of the other bits are physically random but VERY VERY biased. So map where those are, record them, and you get a very good fingerprint. And since it is all silicon-process randomness going into it, it is pretty much a physically unclonable function.
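One classic example of such a reduction function is von Neumann debiasing (my illustration; the actual paper may well use a cryptographic hash or another extractor):

    import random

    def von_neumann(bits):
        """Debias independent coin flips: read bits in pairs, emit the
        first bit of each unequal pair ('01' -> 0, '10' -> 1), and drop
        the equal pairs. Output is unbiased if inputs are independent,
        at the cost of discarding most of the input."""
        return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]

    # Very biased coin (P(1) = 0.9), like the 'sorta random' cells above.
    biased = [1 if random.random() < 0.9 else 0 for _ in range(100_000)]
    clean = von_neumann(biased)
    print(len(clean), sum(clean) / len(clean))  # far fewer bits, mean ~ 0.5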
Kevin Fu has some SMART grad students.
Read Gleick's Chaos (Score:3, Interesting)
Also, this may be a stupid question, but I wonder how one measures the 'randomness' of a generator?
Read James Gleick's Chaos.
There is a method in that book that describes how they extracted attractors from a stream of data. Here's how it works.
A team of researchers had a bathtub with a dripping faucet. They tuned the dripping until the drips fell at random intervals. Nothing rhythmic about it. As the drop broke away from the faucet, it was setting up a vibration in the remaining water that would jiggle until the next drop fell. It was highly nonlinear.
They constructed a phase space where one axis was the time between two successive drops, and the other axis was the time between the previous pair. So on one axis you have the time between drops 1 and 2, and on the other the time between drops 2 and 3.
It turns out that an attractor emerged. The times did not scatter around the page randomly; they grouped in clusters. There was an underlying order that this method exposed.
So, to answer your question: take your stream of numbers and examine them in phase space, looking at the differences between adjacent data points. If nothing shows up in a two-dimensional plot, go to three: use n1-n2, n2-n3, and n3-n4 as your axes. Add dimensions beyond that if you need to. See what it takes to make your data cluster, if it ever does. The more complex the data, the more dimensions it takes to see the structure.
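A minimal sketch of that delay-embedding idea in Python (the synthetic drip data and the choice of dimensions are my assumptions, just to show the mechanics):

    import numpy as np

    def delay_embed(times, dim):
        """Turn a stream of event times into phase-space points built from
        consecutive inter-event differences: (d[i], d[i+1], ..., d[i+dim-1])."""
        d = np.diff(times)  # time between successive events
        return np.stack([d[i:i + dim] for i in range(len(d) - dim + 1)])

    # Synthetic 'drip times' with hidden structure plus noise (illustrative).
    rng = np.random.default_rng(0)
    intervals = 1.0 + 0.5 * np.sin(np.cumsum(rng.normal(size=1000)))
    times = np.cumsum(intervals)

    pts2 = delay_embed(times, 2)  # plot pts2[:, 0] vs pts2[:, 1]
    pts3 = delay_embed(times, 3)  # go to 3-D if the 2-D plot shows nothing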
Don't follow the hype. Does not apply to PCs. (Score:5, Interesting)
The described method is ONLY for SRAM (static RAM): no DRAM, no SDRAM. You find this on RFID chips and in a CPU's cache, not in main RAM. As there is no way to access a CPU's cache uninitialized, I can't see why this should be useful on a PC.
If you had to modify a CPU first to allow access to its uninitialized caches (think about all the unwanted implications), it would be much cheaper to just give it a thermal diode and a register to poll (as most modern CPUs already have).
After all, the described method is just another way of collecting thermal noise. And since RFIDs are custom designs most of the time, there too it would be cheaper to just use a thermal diode.
The only application for this would be if you had to develop strong crypto for legacy RFID chips.
Slashdot stories get worse by the day.
Re:A Slightly More Expensive Method (Score:3, Interesting)
Eh, well, the unit of entropy is actually "energy per temperature"*, so there are physical units associated with it. Of course, that's physical entropy, and I don't know that it's the same as "information entropy." If they're different, then I blame the folks that overload technical words.
That said, I always thought "random" simply meant "the next outcome is not predictable based on all previous measurements." Therefore the measure of "randomness" would be based on the probability that the next outcome can be predicted from the previous measurements. I'd say in this case that "completely nonrandom" would mean "the next outcome can always be predicted from previous measurements" and "completely random" would mean "zero probability of predicting the next outcome from previous measurements."
In that sense, it's probably not possible for anything to be either completely random or completely nonrandom, because there is always a finite probability of guessing correctly, and it's probably impossible to distinguish a lucky guess from a genuine prediction based on previous measurements.
*From dS = dQ/T, where S is entropy, Q is heat, and T is temperature (or, better yet, S = (Boltzmann's constant) × ln(multiplicity of the system)). I can't remember from Shannon's paper the exact method he used to compute "entropy", but I'm pretty sure it's not "change in heat per unit temperature". Come to think of it, my guess is that Shannon's entropy is essentially the log of the multiplicity with Boltzmann's constant divided out, so the units disappear (multiplicity doesn't have units). Those crazy non-physical scientists! *grin*
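For concreteness, Shannon's entropy of a source with symbol probabilities p_i is H = -sum(p_i * log2(p_i)), in bits; a minimal estimator (the byte-frequency approach is my illustration, not Shannon's exact construction):

    import math
    from collections import Counter

    def shannon_entropy_bits(data: bytes) -> float:
        """Estimate entropy in bits per byte from observed byte frequencies:
        H = -sum(p_i * log2(p_i))."""
        n = len(data)
        return 0.0 - sum((c / n) * math.log2(c / n)
                         for c in Counter(data).values())

    print(shannon_entropy_bits(bytes(range(256))))  # 8.0: every byte equally likely
    print(shannon_entropy_bits(b"aaaaaaaa"))        # 0.0: fully predictable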
Re:This is hardly random (Score:5, Interesting)
I've had this bite me, and exploited it.
It bit me when booting into Windows CE: you'd power cycle the thing, and the OS would boot with the old RAM disk you had. We got to the point where we had the bootloader wipe the kernel memory, so the data structures were all corrupted by the time the OS was trying to decide between mounting the RAM disk (object store) and starting fresh. It turns out that the longer an image sits unchanged in RAM, the more likely the cells are to be biased such that, if you cycle the power, they lean toward the values they held before power was cut.
The time I exploited it, I didn't have any way of logging. Logging to the serial port caused issues (timing-sensitive code), and with no filesystem running I couldn't log to a file, so I logged to memory. My trick was to simply log to a circular RAM buffer. When it crashed, I would just power cycle and dump the RAM buffer. Even though the data was fresh (and so less burned in), it was enough to make out what my debug message was trying to say (almost always perfectly). It was readable after a brief power cycle, and still readable after turning the power off for nearly a minute. Characters got corrupted, but since it was regular ASCII, you could still make out the words.
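A toy simulation of that effect (the uniform per-bit flip probability and the sample log line are made-up stand-ins; real SRAM decay, as the parent notes, is biased by how long data sat in the cells):

    import random

    def decay(data: bytes, flip_prob: float, seed: int = 1) -> bytes:
        """Flip each bit independently with probability flip_prob,
        a crude stand-in for RAM cells drifting after a power cycle."""
        rng = random.Random(seed)
        out = bytearray(data)
        for i in range(len(out)):
            for bit in range(8):
                if rng.random() < flip_prob:
                    out[i] ^= 1 << bit
        return bytes(out)

    log = b"ERROR: DMA ring overrun in isr_rx(), resetting controller"
    print(decay(log, 0.01).decode("ascii", errors="replace"))
    # A few mangled characters, but the words are still legible.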
Re:A Slightly More Expensive Method (Score:3, Interesting)
Another way to put it is that before the advent of quantum mechanics, every measurement of entropy was only meaningful in a relative, differential sense: S is arbitrary up to a constant. You can see that from the definition you use, which, when you integrate it, is ambiguous up to an additive constant.
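Spelled out (a standard textbook step, not something from the thread):

\[
dS = \frac{dQ}{T}
\quad\Longrightarrow\quad
S(B) - S(A) = \int_A^B \frac{dQ}{T},
\]

so classical thermodynamics only ever fixes differences in S; adding any constant to S leaves every measurable quantity unchanged. It took quantum statistics (the third law) to pin down an absolute zero point.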