Slashdot
Forensics Tool Finds Headerless Encrypted Files
Posted by
timothy
on Thu Apr 30, 2009 03:17 PM
from the sir-there's-an-anomaly-here dept.
gurps_npc writes "Forensics Innovations claims to have for sale a product that detects headerless encrypted files, such as TrueCrypt Dynamic files.
It does not decrypt the file, just tells you that it is in fact an encrypted file. It works by detecting hidden patterns that don't exist in a random file. It does not mention steganography, but if their claim is true, it seems that it should be capable of detecting stenographic information as well."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Plausible Denial? (Score:5, Funny)
I am a citizen of the United Kingdom. Amongst many odd laws we have here, there's one that basically means that you can go to jail if you refuse to hand the police your encryption keys if they ask for them. The one saviour was TrueCrypt's plausible denial. If they don't know you have encryption, they can't ask for keys!
Now they do know I have encryption... ...and I've forgotten my password.
Can someone please give me tips on how to avoid dropping soap in the shower?
Re:Plausible Denial? (Score:5, Funny)
Don't worry (Score:5, Insightful)
The company has "innovations" in its name, so their product probably won't work.
If it did work against TrueCrypt, which is a yardstick of well-implemented encryption, I'm sure they'll come up with a countermeasure by the next minor release.
Also: In before XKCD strip.
Re:Don't worry (Score:4, Informative)
What I am guessing is that they are doing Gaussian analysis. It is actually quite simple, and not too hard to implement. If a data set is truly random then the statistics will have some basic indications that it is random.
Since encryption implements a lossless conversion then the data is not random. BECAUSE random data is just that random.
Though it would not be that hard to get around this because the statistics can be fooled. Actually would not be that hard to do that. Thinking about it, rather interesting problem...
BTW I do statistical and probabilistic analysis in a hedge fund...
Re:Don't worry (Score:5, Insightful)
Since encryption implements a lossless conversion then the data is not random. BECAUSE random data is just that random.
Encryption in ECB mode leaves a very clear pattern, because identical input blocks lead to identical output blocks. Pretty much every other block chaining mode doesn't, though, because each one mixes in the preceding blocks. So I'm guessing it's an implementation flaw: the cryptographic primitives are pseudorandom and have no distinguishable non-randomness unless you know the exact key.
Re:Don't worry (Score:4, Interesting)
What I think they are doing, and what I think would indicate an encrypted drive, is distribution analysis.
If you have truly random data then there is a specific pattern. If you have deleted or unused blocks there will be a specific pattern.
But if you have an encrypted block the distribution will not be like any of the other pieces of data. This is your indicator.
Think of it as follows. You are driving on the highway and somebody on the highway drives the speed limit exactly, stays in the center lane, and does not switch lanes at all. Even though that would seem to be right, it is actually quite wrong and it would make police suspicious.
Re: (Score:3, Insightful)
You've never heard of cruise control on a 500 mile trip have you?
Re:Don't worry (Score:5, Insightful)
You realize that you aren't saying anything at all, right? Your argument is that since encrypted data is different than random data (an assumption you make without stating), encrypted data will look different than random data.
In reality, one of the standards for encryption algorithms (and block chaining methods) is that they produce a pseudorandom output. In fact, block ciphers are often called upon to operate as PRNGs when given random input data. The idea is that they will produce a significantly larger amount of pseudorandom output data than the random seed data.
BTW I do mathematical cryptanalysis at a university...
Re:Don't worry (Score:5, Insightful)
I wish I had mod-points for you.
Finally we hear from someone who knows WTF he/she is talking about.
Just to expand a bit: encryption algorithms (except for one-time-pad) don't produce truly random output. But all good, modern ones seek to produce output that's as indistinguishable as possible from truly random output, as a necessary but not sufficient component of their security. There are a variety of techniques to produce pseudorandom data based on a variety of sophisticated mathematics.
It seems like the height of hubris to claim that one software program can reliably detect all these different kinds of extremely slight deviation from perfect randomness.
A more plausible approach (as others have pointed out), is to look for files that do appear to be totally random. Such files are likely to be either (a) the output of a random number generator, or (b) encrypted. All files that have some useful content in their present form have some structure or non-randomness.
Re:Don't worry (Score:4, Informative)
Re: (Score:3, Interesting)
So basically this doesn't tell the difference between an encrypted drive and a blank drive, it tells the difference between a pure random drive and a blank drive.
That is, out of the following three possibilities:
1. Default/blank drive, possibly non-random.
2. Drive written over with pure random bytes.
3. Full disk encryption.
This tool can tell the difference between 1 and {2,3}, but it can't tell the difference between 2 and 3. That should still give you plausible deniability then, because there's the possibi
Re:Don't worry (Score:5, Insightful)
BECAUSE random data is just that random.
Any kind of analysis that answers the question of whether a piece of data is random or deterministic can't do so with certainty. You can't prove a string of a million 1's wasn't randomly generated. Every piece of random data long enough will have substrings that appear to be a pattern.
Give a voice recognition program a low enough certainty threshold and it'll pick out words from below the noise floor. But the lower you go, it'll make more and more mistakes and eventually it'll pick out words from plain white noise.
Re: (Score:3, Informative)
Re:Don't worry (Score:5, Informative)
I don't think so... It's recommended that you compress things before you encrypt them if you plan to do both (usually for network transmission). If you encrypt and then compress, your compression will not be very effective. Good encryption produces very few patterns, and patterns are what compression applications need in order to function.
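A minimal sketch of the point above, assuming that `os.urandom` output is a fair stand-in for the output of a good cipher (no real encryption is performed here). It compares how well zlib compresses patterned text and random bytes of the same length; `compression_ratio` is a hypothetical helper name.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly patterned input, standing in for ordinary plaintext.
text = b"the quick brown fox jumps over the lazy dog\n" * 2000

# Random bytes of the same length, standing in for ciphertext (an assumption).
noise = os.urandom(len(text))

text_ratio = compression_ratio(text)
noise_ratio = compression_ratio(noise)

print(f"patterned text: {text_ratio:.3f}")
print(f"random bytes:   {noise_ratio:.3f}")
```

The patterned input shrinks to a small fraction of its size, while the random input does not shrink at all, which is why compressing after encrypting gains nothing.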
Re: (Score:3, Funny)
This will probably become an arms race, in order to use vs detect subtler and subtler patterns in the bytes.
In any case, this tool will probably end up being used by law-enforcement as a polygraph, or breathalyzer: not true, not quite false either, but exciting enou
Re: (Score:3, Informative)
I seem to remember that being a scene from The Wire.
It first appeared in David Simon's Homicide: A Year on the Killing Streets. The anecdote had been passed down within the Baltimore Police homicide squad, and was presented as a true story in the book. Simon later adapted this and other events from his true crime books for use in The Wire.
Re:Don't worry (Score:4, Interesting)
The company has "innovations" in its name, so their product probably won't work.
I actually tried it with a TrueCrypt volume and a random file (/dev/urandom) and it seems to work. The TrueCrypt volume is identified as "Encrypted Data (Headerless)" and the random file is identified as "Data File (Unknown)".
Re:Don't worry (Score:4, Insightful)
"Awww, jeez... the damn thing's gotten corrupted! My boss told me to keep my sensitive company files in an encrypted zip file, and it keeps screwing up"
Just because security through obscurity isn't good as the only defense doesn't mean that it's not quite handy in addition to others.
Re:Don't worry (Score:4, Informative)
I am a computer forensic investigator, and I know what the structure of a zip file looks like internally. It's NOT a blob of random bits. Even a corrupted zip file has a well defined header, indexes, etc.
It's extremely difficult, if not impossible, to hide data from a good investigator who has the time and motivation to investigate thoroughly. If I find a large file containing only random bytes, it is NOT a normal thing and I will look into it further, especially if the file size is an even multiple of 512 bytes. If I can find traces of TrueCrypt ever having been used on that drive I will have a pretty good idea what I'm looking at. I can try to decrypt the file using every possible string found on the hard drive, including bits of memory saved to the paging file and hibernation file. If I manage to decrypt and open the file and find it is formatted with the FAT32 filesystem instead of NTFS I will be very suspicious that this was done because there is a hidden "plausibly deniable" inner volume. I will then work on cracking that open like I did the outer volume. I will also report to the authorities I am working for that there is a significant possibility of a hidden volume. They will use their social skills [xkcd.com] to get the key from the owner.
The real limitation is that cases usually DON'T give me enough time or resources to investigate that deeply, or the lawyers manage to bury the issue of an encrypted file and it doesn't get addressed. The best bet for a person with something to hide is to make it very difficult and time consuming for an investigator to get to the bad stuff, and hope his case isn't that important to warrant the time to dig deeply. In practice that means if you cheated your partner in a small business and hide it very very well I probably won't find it. If you killed someone I will find it.
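The "every possible string found on the hard drive" step can be sketched roughly as below. This is only an illustration of harvesting printable strings from raw bytes (similar in spirit to the Unix `strings` utility), not the investigator's actual tooling; `candidate_strings`, the minimum length of 6 and the sample blob are all made up for the example.

```python
import re


def candidate_strings(blob: bytes, min_len: int = 6):
    """Yield unique printable-ASCII runs of at least min_len bytes, in first-seen order."""
    seen = set()
    # Printable ASCII is 0x20 (space) through 0x7e (tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(blob):
        text = match.group().decode("ascii")
        if text not in seen:
            seen.add(text)
            yield text


# Hypothetical fragment of a paging file: binary junk with strings embedded in it.
sample = (
    b"\x00\x01hunter2-correct-horse\x00\xff\xfeabc\x00"
    b"hunter2-correct-horse\x01letmein123\x00"
)

candidates = list(candidate_strings(sample))
print(candidates)
```

Each candidate would then be tried as a passphrase. Short runs such as "abc" are skipped and duplicates are only tried once, which keeps the list manageable on a real disk image.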
Re:Plausible Denial? (Score:5, Informative)
I thought one feature of TrueCrypt was the ability to have two passwords. One password unlocks your "non-secret" data. The other password unlocks your "secret" data in a hidden volume.
http://www.truecrypt.org/docs/plausible-deniability [truecrypt.org]
The point is both sets of data are stored in one big binary blob. It'll all look like one big fat encrypted mess. In fact, if you are not careful, your non-secret data can overrun your secret data.
To get around this "randomness" problem, after creating your non-secret partition, fill the partition completely with something (copy a few public domain books over and over until the partition is full). All the "randomness" will be gone with encrypted data. Then delete everything and put back in just the smallest amount of non-secret data you need to store in order to appear legit. The "randomness" is still there, as only the FAT entries are deleted, but all the encrypted data is still filling up that whole binary blob.
Now, create your secret partition and use it. Be sure to use it just short of the non-secret data's amount (as they fill from the opposite end), otherwise your non-secret partition will be corrupted.
This link helps with the graphics:
http://www.truecrypt.org/docs/hidden-volume [truecrypt.org]
The one downside is that the non-secret side, if it fills up with too much data, will overwrite your secret side. That's why you have backups and this is just for transport anyway, right?
Re: (Score:3, Interesting)
you got it. It's called hiding in the noise.
Format your drive, now plug it in as usb and create a full size truecrypt encryption on it and fill it with junk.
now take the drive, delete that file and then use it as your new drive whatever. any encrypted files will be hidden in the noise of the background encrypted file that is in the blank area of the drive.
Re:Plausible Denial? (Score:5, Insightful)
"That's cute, sir - now give us the other password"
- "what other password?"
"for the hidden truecrypt volume"
- "what hidden truecrypt volume??"
"the one that's being referred to by half a dozen applications' most recently used files lists"
- "oh err.. that's uh.. another drive entirely"
"very well, then hand us that other drive"
- "err uhm.. my dog ate it?"
If you're really, really serious about these things, maybe you could work super-diligently to prevent leaving any clues as to that hidden volume's existence.. odds are something's going to bite you in the behind somewhere though.
Re: (Score:3, Funny)
>recently used files lists
strange, my cli apps don't seem to have that
Sure they do! :) (Score:3, Insightful)
[pb@localhost ~]$ tail ~/.bash_history
less GnosLoadPDFForms.pdf
file GnosLoadPDFForms.
mv GnosLoadPDFForms.pdf GnosLoadPDFForms.fdf
file GnosLoadPDFForms.fdf
evince GnosLoadPDFForms.fdf
less GnosLoadPDFForms.fdf
su
acroread GnosLoadPDFForms.fdf
top
Re:Sure they do! :) (Score:4, Informative)
# ignores commands preceded by a space
HISTCONTROL=ignorespace
of course then you have to remember to put a space in front of any commands you don't want recorded
Re:Plausible Denial? (Score:5, Funny)
Simple. Make your password, "what hidden truecrypt volume?"
Re:Plausible Denial? (Score:4, Informative)
If you are actually seriously using TrueCrypt so that the NSA (or law enforcement in general) won't get ya, you'd be an idiot to do so from Windows, or even your typical desktop Linux. I'd probably make a separate Linux (or BSD) install just for that, with the home directory mounted in ramfs by default. Then make an image of its clean untainted state, and then every time I need to access the encrypted drive, dd the image to a USB flash stick, boot from that, and only then mount the TrueCrypt volume and work with it. Once done, `shred` the stick.
Re: (Score:3, Interesting)
The one downside is that the non-secret side, if it fills up with too much data, will overwrite your secret side. That's why you have backups and this is just for transport anyway, right?
It has a protection option where you can enter the hidden password along with the normal password so the hidden partition will be protected, the outer container will be frozen on a write attempt to hidden data. I think it's unnatural that you must ensure that there's no data written to the end of the disk though, it leads to some peculiar disk format choices and so on. A better implementation would be more like a transparent file system layer, where the outer partition could write anywhere it wants and the
Patterns? (Score:5, Informative)
I should first say that I'm rather ignorant about encryption but I hope someone will be able to explain this. I was under the impression that any sort of good-quality encrypted data is indistinguishable from completely random data. That seems to directly contradict the ability to determine whether a volume contains encrypted data by means of locating patterns. Is this really a contradiction?
Re:Patterns? (Score:4, Insightful)
The fact that there's order in the encrypted information doesn't change the fact that an outside observer who doesn't know the original information or the key can't tell the difference between the encrypted information and true random noise. That's part of the point.
If they can tell that it's not random, that's a start on cracking the encryption and gaining the original information.
Re: (Score:3, Interesting)
No...
Encryption is supposed to indicate random noise. But encryption in a grand sense is about writing, and rewriting data.
Let's say I have data which is number 2...
My key is 4,4,4
My encryption is:
Value1 + number -> * Value2 -> - Value3
So it is (4 + 2) * 4 - 4 = 20.
I do this multiple times and I get a bunch of other numbers. Put all of these numbers together and I get what looks like gibberish (assuming the algorithm is good enough).
But here is the problem, underneath the data is a p
Re:Patterns? (Score:4, Informative)
Actually, if you use the wrong block cipher mode, it's easy to distinguish between an encrypted file and random noise. AES-256 encrypts 128 bits of data at a time (with a 256-bit key). If you use the same key and the same block of data (ECB mode), you get the same output and can determine that there's something there.
If you modify each block with some known quantity that is different from block to block, then the output becomes much less patterned. For example, Counter (CTR) mode encrypts an increasing counter and XORs the result with each block of cleartext, so that two identical blocks of cleartext produce very different output. Cipher Block Chaining (CBC) takes the encrypted output of block N and XORs it with the cleartext of block N+1 before encrypting that block.
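A rough sketch of why ECB output is patterned and chained output is not. Python's standard library has no AES, so `toy_block` below is a keyed stand-in built from HMAC-SHA256: it is deterministic like a block cipher but is NOT invertible and is NOT AES. The key, the all-zero IV and every function name are assumptions made for the illustration.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per block, matching the 128-bit AES block size


def toy_block(key: bytes, block: bytes) -> bytes:
    """Keyed stand-in for a block cipher (deterministic, not invertible, not AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_like(key: bytes, plaintext: bytes) -> list:
    """Transform every block independently, as ECB does."""
    return [toy_block(key, b) for b in split_blocks(plaintext)]


def cbc_like(key: bytes, iv: bytes, plaintext: bytes) -> list:
    """XOR each block with the previous output before transforming it, as CBC does."""
    out, prev = [], iv
    for b in split_blocks(plaintext):
        prev = toy_block(key, xor(b, prev))
        out.append(prev)
    return out


def repeated_blocks(blocks: list) -> int:
    """Count output blocks that duplicate an earlier output block."""
    return len(blocks) - len(set(blocks))


key = b"demo key (hypothetical)"
iv = bytes(BLOCK)
plaintext = b"A" * (BLOCK * 64)  # 64 identical cleartext blocks

ecb_blocks = ecb_like(key, plaintext)
cbc_blocks = cbc_like(key, iv, plaintext)

print("ECB-like repeats:", repeated_blocks(ecb_blocks))
print("CBC-like repeats:", repeated_blocks(cbc_blocks))
```

With 64 identical input blocks, the ECB-like output is 64 copies of one block, while the chained output shows no repeats, so a simple duplicate-block count is enough to spot the former.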
Re:Patterns? (Score:4, Informative)
That's called a known cleartext attack. If they already have the original file then the point of encryption is moot.
1. It's usually called a "known plaintext" attack.
2. Detecting patterns in ECB mode encrypted data is not a known plaintext attack.
3. Known plaintext attacks are most definitely not moot.
A known plaintext attack means that you can derive the key or some intermediate to decrypt other data encrypted with the same material, and is highly useful. For example, you could send someone a mail, an instant message, upload a file to a server or whatever and if stored on an encrypted disk you have a known plaintext. If that'll let you figure out the key, big uh-oh. I actually used this on some encrypted (standard password protected) zip files, which are vulnerable to a known plaintext attack. Basically I had one zip file with contents I already had, and other zip files with contents that I didn't have. But from having both plaintext and ciphertext from one file, I could decrypt all the other files too.
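The idea can be shown with a deliberately weak toy scheme: two files XORed with the same secret keystream. This is not the attack on password-protected zip files (that one is considerably more involved); it is just the simplest case where one known plaintext/ciphertext pair unlocks other data. All names and file contents below are made up.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# The secret keystream, reused for both files. Reuse is the flaw being exploited.
keystream = os.urandom(64)

# File 1: the attacker already has a copy of this plaintext.
known_plaintext = b"Quarterly report: nothing to see here.".ljust(64)
# File 2: the attacker only ever sees the ciphertext.
secret_plaintext = b"The combination is 12-34-56."

known_ciphertext = xor_bytes(known_plaintext, keystream)
secret_ciphertext = xor_bytes(secret_plaintext, keystream)

# Attacker's side: plaintext XOR ciphertext of file 1 reveals the keystream...
recovered_keystream = xor_bytes(known_plaintext, known_ciphertext)
# ...which then decrypts file 2 without ever learning a password.
recovered_secret = xor_bytes(secret_ciphertext, recovered_keystream)

print(recovered_secret.decode("ascii"))
```

The attacker never needs the password itself, only the intermediate key material, which matches the "derive the key or some intermediate" description above.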
Re: (Score:3, Insightful)
Re:Patterns? (Score:5, Insightful)
Dear mods, that's meant to be facetious. Some of you seem to be a little trigger-happy so you won't understand why I shouldn't have to explain that.
Make your joke and take the moderations like a man.
If you are going to explain that it is a joke, you might as well not bother in the first place since explaining takes away all the fun.
Re:Patterns? (Score:5, Informative)
Another thing would be Truecrypt's refusal to overwrite certain parts of that "random" data inside the not-hidden container. Gives it away that it's protecting the integrity of a hidden container.
Why do people constantly make this mistake?
TrueCrypt cannot know a hidden partition exists, *unless* you enter the inner volume password. It will cheerfully let you write right over the inner volume without so much as a by-your-leave, if you only give it the first password. It is true deniability, assuming this tool can't distinguish "encrypted blank space" and "encrypted data".
Umm... (Score:5, Informative)
Re:Umm... (Score:5, Funny)
ssshhh, the "ga" is secretly embedded through steganography
Benford's law (Score:4, Informative)
This is probably another application of Benford's law [wikipedia.org].
Who Cares? (Score:5, Informative)
Be enlightened: http://en.wikipedia.org/wiki/TrueCrypt [wikipedia.org]
Yet another scam (Score:5, Interesting)
Wow, the quality of Slashdot has really been going down lately. Now any random fraud can submit his misleading material and it gets accepted to the front page just because it sounds interesting? Is this actually a tabloid, or serious news for nerds who understand what they talk about?
In short, this is yet another lame attempt to make money by posting bogus claims about a popular product.
First, hidden volumes [truecrypt.org] are the only kind of steganography that TrueCrypt offers. Second, if you read the TrueCrypt documentation, you'll learn the following about hidden volumes vs. dynamic:
On Linux or Mac OS X, if you intend to create a hidden volume within a file-hosted TrueCrypt volume, make sure that the volume is not sparse-file-hosted (the Windows version of TrueCrypt verifies this and disallows creation of hidden volumes within sparse files).
Furthermore, when I try to create a dynamic TrueCrypt volume, TrueCrypt displays a big warning saying that dynamic volumes are insecure. That's right. Insecure.
So again, I demote this story as totally and utterly bogus, motivated by the vision of commercial gain.
Re:Yet another scam (Score:5, Interesting)
The article may in fact just be an advertisement, created for commercial gain.
But it was posted because I personally read it and was interested in it.
Is unreadable data really encrypted data? (Score:3, Interesting)
I do a lot of data acquisition for work and in grad school. I've got lots of my data on my drive. They are written in binary formats of my design. There is lots of repetition, there are no headers, who knows what my data looks like to someone who doesn't know the "decoder ring" to unpack it.
That doesn't mean that my files are kiddie pron or directions to make a dirty bomb.
Sheldon
Re: (Score:3, Insightful)
This is complete BS, and is easy to test (Score:5, Informative)
This is complete sensationalist crap. Truecrypt isn't broken, (probably) nor are any of the other programs they possibly claim to have broken.
This is easy to test for yourselves folks, I just did it in 5 minutes.
dd if=/dev/urandom of=/home/me/somefile.jpg bs=512 count=10000
Performing this command and then scanning the resulting file with "File Investigator" results in the file being detected as a headerless encrypted data file.
Whoever pointed out that they simply identify any randomly filled binary file whose size is a multiple of 512 bytes is correct.
TrueCrypt doesn't use ECB mode, hasn't for some time, etc etc etc. Stop freaking out every time someone claims to have broken it.
Re:That's STEGANOGRAPHY! (Score:4, Funny)
Re:That's STEGANOGRAPHY! (Score:4, Funny)
Our groundbreaking software can detect the presence of SHORTHAND* and allow law-enforcement decryption of this nefarious data-hiding technology!
*Currently can detect Gregg, Pitman, Teeline, and Speedwriting. Also detects the presence of steno pads and stenotype machines.
Re:That's STEGANOGRAPHY! (Score:4, Funny)
Easy, I'll just encrypt using a one-time steno pad!
Re: (Score:3, Insightful)
encrypted information (short of a one time pad, which is the only way to get true noise) has an underlying structure in the data operated on.
The digits of pi have an underlying structure. If you have a way to distinguish an arbitrary stretch of pi from truly random data, I suspect you'll win a Fields Medal.
Re: (Score:3, Insightful)
OK, I checked it out. Here's how they "do" it:
1. No File Header.
2. (File size % 512) = 0
3. Successful χ² (chi-squared) and arithmetic mean tests on certain bytes.
4. File size greater than 15 MB.
Step 3 == entropy tests.
In other words, they detect random-looking files (which implies "no header") whose size is 0 mod 512 and is greater than 15 MB.
Big fucking deal. It might be true that on your system, the only files that meet these characteristics are TrueCrypt volumes, but again it's trivial
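The checks listed in this comment can be approximated in a few lines. This is a sketch of the heuristic as described here, not the vendor's code: the thresholds `chi_limit` and `mean_tolerance`, the reading of "15 MB" as 15 MiB, and testing all bytes rather than "certain bytes" are all assumptions, and the "no header" check is left out because random-looking data implies it.

```python
import random

SECTOR = 512
MIN_SIZE = 15 * 1024 * 1024  # "15 MB", read here as 15 MiB (an assumption)


def chi_squared(data: bytes) -> float:
    """Chi-squared statistic of byte frequencies against a uniform distribution."""
    counts = [0] * 256
    for byte in data:
        counts[byte] += 1
    expected = len(data) / 256
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_random(data: bytes, chi_limit: float = 400.0,
                 mean_tolerance: float = 2.0) -> bool:
    """True if the byte histogram is near-uniform and the mean is near 127.5."""
    mean = sum(data) / len(data)
    return chi_squared(data) < chi_limit and abs(mean - 127.5) < mean_tolerance


def flagged(data: bytes, min_size: int = MIN_SIZE) -> bool:
    """Apply the size, sector-multiple and randomness checks from the list above."""
    return (len(data) > min_size
            and len(data) % SECTOR == 0
            and looks_random(data))


# Demo on small inputs, with the size threshold lowered so it runs quickly.
rng = random.Random(0)
noise = rng.getrandbits(8 * 65536).to_bytes(65536, "little")
text = (b"All work and no play makes Jack a dull boy.\n" * 2000)[:65536]

print("random bytes flagged:", flagged(noise, min_size=0))
print("plain text flagged:  ", flagged(text, min_size=0))
print("odd-sized flagged:   ", flagged(noise[:-1], min_size=0))
```

Nothing here knows anything about TrueCrypt. Any sufficiently large, sector-aligned file of random-looking bytes passes these checks, which is the objection raised in this comment.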