DARPA Delving Into the Black Art of Super Secure Software Obfuscation
coondoggie writes: Given enough computer power, desire, brains, and luck, the security of most systems can be broken. But there are cryptographic and algorithmic security techniques, ideas, and concepts out there that add a level of algorithmic mystification which could be built into programs, making them close to unbreakable. That's what the Defense Advanced Research Projects Agency (DARPA) wants from a new program called "SafeWare." From DARPA: “The goal of the SafeWare research effort is to drive fundamental advances in the theory of program obfuscation and to develop highly efficient and widely applicable program obfuscation methods with mathematically proven security properties.”
Good luck with that. (Score:4, Insightful)
The objective of "mathematically proven security properties" via program obfuscation is definitely not achievable. After all, it is a given security principle that "security through obscurity" is unsupportable. If an adversary is capable of obtaining the executable of a program, they can also reverse engineer that same executable. It may take a lot of effort, but it is always achievable.
Re: (Score:2, Insightful)
OTOH, all security is by obscurity: what is a password if not a piece of data that is obscured from most people and supposedly known only by its owner?
Re:Good luck with that. (Score:5, Informative)
Well, something that is obscure is just something that's hard to read. A password is supposed to be hidden, not seen at all. "Security through obscurity" is the idea that attackers will be able to see your algorithms, just not figure them out.
Re: (Score:1)
Fully homomorphic encryption. 'nuff said. Happens to be my current project: developing a processor that uses FHE. Unfortunately, they don't want to use specialized hardware. Does an FPGA count?
CARRIER LOST. ;)
Re: (Score:2)
Re:Good luck with that. (Score:4, Informative)
Fortunately, Merriam-Webster is not the final and complete authority on the connotations of words, nor on how they are used within specialized disciplines.
Re: (Score:1)
But you are? Nice ego you got there...
Re: (Score:2)
I always get those mixed up.
Re: (Score:2)
Re: (Score:2)
Well, I found the summary completely incomprehensible, so DARPA is apparently well on their way with this new technology to befuddle and obfuscate...
I thought that was the purview of the legislature...
Re: (Score:2)
OTOH, all security is by obscurity: what is a password if not a piece of data that is obscured from most people and supposedly known only by its owner?
Not impressed. "Security by obscurity" generally refers to restricting information about how a system works in order to make it harder for people to gain access. Passwords are not that: everybody knows that the system can be accessed with a password, and there are protocols in place for resetting or recovering passwords (where applicable), etc. The rules of the game are well publicized.
"When I use a word," (Score:2)
No, not all security is obscurity. [stackexchange.com] If your list of things that need to be kept secret includes your security implementation, and especially your algorithm, then you have flawed security. Multi-level security increases the number of things you need to have and/or know in order to compromise the system. With, e.g., ROT-13 or another shift cipher, once you know that they are using that cipher, there is no other knowledge you need in order to break it. On the other hand, if you have an arbitrary number of keys...
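To make the shift-cipher point concrete, here is a minimal sketch (C++, hypothetical message): decoding ROT-13 takes nothing but knowledge of the algorithm itself; there is no key that could be withheld from an attacker.

#include <iostream>
#include <string>

// ROT-13: the "decryption" is the algorithm; there is no key to keep secret.
std::string rot13(const std::string& s) {
    std::string out = s;
    for (char& c : out) {
        if (c >= 'a' && c <= 'z') c = 'a' + (c - 'a' + 13) % 26;
        else if (c >= 'A' && c <= 'Z') c = 'A' + (c - 'A' + 13) % 26;
    }
    return out;
}

int main() {
    std::cout << rot13("Frperg zrffntr") << "\n"; // prints "Secret message"
}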
Re: (Score:1, Insightful)
Security through obscurity as a first line of defense is perfectly fine. Now if the obscurity is the entirety of your security then you have problems.
Re: (Score:2)
Security through obscurity as a first line of defense is perfectly fine. Now if the obscurity is the entirety of your security then you have problems.
It tends to give you a false sense of being more protected than you actually are, and it gives management an incentive, because that false sense works on them as well, to budget less for real security in depth.
There's a natural human psychological barrier against getting a good lock for one's front door when one already has a lock for one's front door. Why buy another lock, when I have a perfectly good lock? It's the same mentality behind the anti-circumvention and reverse engineering laws...
Re: (Score:2)
It's one of the reasons I think ASLR is pretty much bullshit,
And yet you would be wrong. Without ASLR, return-to-libc exploits are trivial.
Without shitty libc implementations, return-to-libc exploits are NOT trivial.
Re: (Score:3)
This doesn't really have anything to do with libc, except that it is a rich source of well known addresses (without ASLR). So what in the hell are you talking about?
Re: (Score:2)
This doesn't really have anything to do with libc, except that it is a rich source of well known addresses (without ASLR). So what in the hell are you talking about?
Why is "arbitrary return to X" a problem in the first place? Is it shitty code that should be running in a sandbox and having it's jumps outward sanitized to the location the jump originated in the first place maybe the problem?
You could always run only the non-shitty code outside the sandbox.
Re: (Score:2)
You could always run only the non-shitty code outside the sandbox.
Good luck with that. Particularly with the part about figuring out which software is good enough not to require a sandbox. And that's before considering the bugs your sandbox has [xen.org].
Re: (Score:2)
Run all your libraries in ring 2 (currently unused on Intel) instead of ring 3 with all the shitty code.
Re: (Score:2)
Re: (Score:2)
Seems to me that would only make things worse when your libraries have bugs: Why rings 1 and 2 aren't used? [stackoverflow.com]
That reference does not seem to indicate it would be a bad idea, only that "the benefits are reduced".
Ring 2 on VAX hardware is where installed system images (read: like libraries) ran. Same place they run on OpenVMS on Itanium.
People who do not know history are doomed to keep pounding their heads against a brick wall until they repeat it correctly.
Re: (Score:3)
ASLR changes the security issue from "trivial, undetectable remote access or privilege escalation" to "trivial, deafeningly loud denial-of-service."
I can write shellcode that hijacks a function, spawns a thread, and creates a controlled jump back to an earlier function, simulating a successful return and allowing the program to continue. A separate thread downloads and mmap()s in a shared object, which provides all the exploit functions without even spawning a new process, going so far as to open a temporary...
Re: (Score:3)
How sure are you? If you can execute the program, that still doesn't mean you can predict exactly what it does and understand everything it could possibly do.
A simple example is checking a password: you can see that the program hashes it and checks the result against the stored hash, but that doesn't mean you know the right password to get past it.
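A minimal sketch of that check (C++; assumes OpenSSL's SHA-256 is available, and the hard-coded digest is of course hypothetical): the expected hash sits right there in the binary, but recovering the password from it still means inverting SHA-256.

#include <openssl/sha.h>
#include <cstring>
#include <string>

// Hypothetical digest of the real password, visible to any reverse engineer.
static const unsigned char expected[SHA256_DIGEST_LENGTH] = { /* ... */ };

bool check_password(const std::string& attempt) {
    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256(reinterpret_cast<const unsigned char*>(attempt.data()),
           attempt.size(), digest);
    return std::memcmp(digest, expected, SHA256_DIGEST_LENGTH) == 0;
}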
Even if you can execute the program, triggering every possible machine state to analyze it is impossible for non-trivial programs. And I'm wondering what they could...
Re: (Score:2)
Re: (Score:3)
I imagine that self-altering program code could become incredibly hard to analyze and unravel.
There are many tricks to make it hard. Some CISC instruction sets have variable-length instructions and allow instructions to start at any byte offset, so you can have the same string of bytes execute a completely different sequence of instructions depending on the offset of the entry point. This can make disassembly very challenging. I have heard from my biologist friends that DNA sometimes does the same thing, with the same DNA sequence encoding different proteins depending on the offset of the start...
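To illustrate the instruction-set version of this (hand-checked x86 encodings): the byte string B8 05 00 00 00 C3 entered at offset 0 decodes as "mov eax, 5" followed by "ret", while the same bytes entered at offset 1 decode as the single instruction "add eax, 0xC3000000".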
Re:Good luck with that. (Score:5, Informative)
That is the standard consensus view in the software industry, yes. I'm afraid to tell you though, that it's wrong.
Last year there was a mathematical breakthrough in the field of what is called "indistinguishability obfuscation" [iacr.org]. This is a mathematical approach to program obfuscation which has sound theoretical foundations. This line of work could in theory yield programs whose functioning cannot be understood no matter how skilled the reverse engineer is.
It is important to note a few important caveats here. The first is that iO (to use the cryptographers' name) is presently a theoretical technique. A new paper came out literally 5 days ago that claims to discuss an implementation of the technique [iacr.org], but I haven't read it yet. Will do so after posting this comment. Indeed, it seems nobody is quite sure how to make it work with practical performance at this time.
The second caveat is that the most well-explored version of it applies only to circuits, which can be seen as a kind of pure functional program. Actually a circuit is closer to a mathematical formula than a real program; e.g. you cannot write circuits in C or any other programming language we mortals are familiar with. Researchers are now starting to look at the question of obfuscating "RAM programs" [iacr.org], i.e. programs that look like normal imperative programs written in dialects of, say, C. But this work is still quite early.
The third caveat is that because the techniques apply to pure functions only, they cannot do input or output. This makes them somewhat less than useful for obfuscating the sort of programs that are processed with commercial obfuscators today, like video games.
Despite those caveats the technique is very exciting and promising for many reasons, none of which have to do with DRM. For example, iO could provide a unifying framework for all kinds of existing cryptographic techniques, and enable cryptographic capabilities that were hitherto only conjectured; timelock crypto, for instance, can be implemented using an iO obfuscator and Bitcoin.
Re: (Score:2)
Thanks, this is very interesting. I'd imagine that DARPA is aiming to do further research along these lines.
Re:Good luck with that. (Score:5, Informative)
OK, I read the paper.
The money quote is at the end:
Translated into programmer English, a "16 bit point function" is basically a mathematical function that yields either true or false depending on the input. It would correspond to the following C++ function prototype:
bool point_function(short input);
In other words you can hide a 16-bit "password" inside such a function and discover if you got a match or not. Obviously, obfuscating such a function is just a toy to experiment with. "SHA256(x) == y" is also a point function and one that can be implemented in any programming language with ease - short of brute forcing it, there is no way to break such an "obfuscated point function". Thus using this technique doesn't presently make a whole lot of sense. However, it's a great base to build on.
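For illustration, an unobfuscated version of such a point function might look like this (the secret constant is of course hypothetical):

bool point_function(short input) {
    const short secret = 0x2A7F; // the hidden 16-bit "password" (made up)
    return input == secret;
}

The obfuscator's goal is to publish something functionally equivalent to this without revealing the constant in any way other than by trying inputs.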
I should note that the reference to AND gates above doesn't mean that the program is an arbitrary circuit - it means that the "program" being obfuscated is in fact a boolean formula. Now, you can translate boolean circuits into boolean formulas, but often only at great cost, and regular programs can only be translated into circuits at great cost as well. So you can see how far away from practicality we are. Nonetheless, just last year the entire idea that you could do this at all seemed absurd, so to call the progress so far astonishing would be an understatement. Right now the field of iO is developing so fast that the paper's authors note that whilst they were writing it, new optimisations were researched and published, so there are plenty of improvements left open for future work.
Verilog (Score:1)
Actually a circuit is closer to a mathematical formula than a real program; e.g. you cannot write circuits in C or any other programming language we mortals are familiar with.
Kevin Horton (kevtris in #nesdev on EFnet) writes circuits in Verilog for a living.
Re: (Score:1)
Re: (Score:1)
Assume for the moment it is true.
That also means that an undetectable virus will exist as well.
In other words, programmers are pretty much right (Score:2)
Re: (Score:3)
Re: (Score:2)
The trick is that which one is actually kept is hidden from you cryptographically.
But what if you happen to be a CPU who just wants to execute a series of instructions? Is there some way to tell the CPU which opcode you want to execute without also telling a human who is merely pretending to be a CPU? If it is cryptographically hidden then how can the CPU read it?
Re: (Score:2)
Re: (Score:2)
I am skeptical about unbreakable-obfuscation reaching the likes of C or Java.
Perhaps; but perl had it down pat years ago.
Re: (Score:2)
Re: (Score:2)
If an adversary is capable of obtaining the executable of a program, they can also reverse engineer that same executable. It may take a lot of effort, but it is always achievable.
Well, you can also brute-force a 256-bit key. It may take the lifetime of the universe, squared, but it is achievable.
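Back-of-the-envelope: 2^256 is roughly 1.2 x 10^77 keys; even at 10^18 guesses per second, exhausting that space takes on the order of 10^51 years, against a universe age of about 1.4 x 10^10 years.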
The whole point of this technology is that the computer executing the code doesn't have the source or data in the clear.
There's some existing work designed for databases that works just this way. I send "the cloud" an encrypted query that causes the server to sum a column in some encrypted table and return me the encrypted result, all without the server having any keys. It's all manipulation...
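A toy sketch of the shape of that idea (C++; this is just additive masking, not real homomorphic encryption, but it shows a server summing values it cannot read):

#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

constexpr uint64_t MOD = 1ull << 32; // work modulo 2^32

int main() {
    std::vector<uint64_t> salaries = {52000, 61000, 48000}; // client's data

    // Client: mask each value with a random pad before uploading.
    std::mt19937_64 rng(42);
    std::vector<uint64_t> pads, ciphertexts;
    for (uint64_t s : salaries) {
        uint64_t r = rng() % MOD;
        pads.push_back(r);
        ciphertexts.push_back((s + r) % MOD);
    }

    // Server: sums the uploaded values without ever seeing the plaintexts.
    uint64_t encrypted_sum = 0;
    for (uint64_t c : ciphertexts) encrypted_sum = (encrypted_sum + c) % MOD;

    // Client: strips the pads to recover the true total.
    uint64_t pad_sum = 0;
    for (uint64_t r : pads) pad_sum = (pad_sum + r) % MOD;
    std::cout << "sum = " << ((encrypted_sum + MOD - pad_sum) % MOD) << "\n"; // 161000
}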
Re: (Score:2)
If only I had a mod point, I'd be modding that as "funny". That's the most completely hilariously wrong code.
security through obscurity (Score:3)
I'm amazed that someone who supposedly knows what they are doing would even suggest this.
Program obfuscation is completely the wrong approach. It is just another mechanism that relies on security through obscurity, which has been proven time and again to be a short-term solution at best.
When something is actually secure, its readability should be irrelevant.
Re: (Score:2, Informative)
Security through obscurity can work to a point. *IF* you make it hard enough.
Take, for example, Raiden II. That game has only recently (in the past month) been 'cracked', and even then only sort of. There is no encryption; it is all just bundled into a 'cop' chip.
The point of their 'security', though, was not to never be cracked, but just to make it a big enough pain in the ass that the bootleggers didn't copy the game for a long time. You could argue it took nearly 20 years to crack. Not bad for security through obscurity...
Entire copyright term (Score:2)
You could argue it took nearly 20 years to crack. Not bad for security through obscurity.
But not nearly as long as what the industry wants, which is 95 years after first publication.
Re: (Score:2)
When something is actually secure,
That's like saying "when we have world peace."
Programmers make coding mistakes. It is inevitable.
Even the best coding techniques can only reduce errors, not stop them completely.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I suspect the intention behind this is to hide backdoors that are intentionally placed. Tell the clueless public that the "TCP/IP stack is now protected from hacking by code obfuscation", when in fact a magic packet gets the NSA access to everything, and it is very hard to find that out by looking at the source.
Re: (Score:2)
wow I hadn't thought of that. If I had mod points you'd get them.
Re: (Score:2)
And I always thought that Perl had been invented just for this purpose. That language has obfuscation built in!
Re: (Score:1)
I can reverse engineer the output of a dotfuscated solution, for example, but the code is so mangled during the obfuscation process (where classes are hacked and merged semi-arbitrarily and variable names turn into "A", "x34Kj", etc) that it becomes unmanageable to make any changes to.
Being able to reliably make changes to the source code is almost the entire cost of software development.
Re: (Score:2)
>> Being able to reliably make changes to the source code is almost the entire cost of software development.
It sounds like you're suggesting that raising the $ cost of doing something until it's not commercially profitable is alone sufficient to be a good security mechanism.
Re: (Score:1)
Re: (Score:2)
by extension then, neither is it completely the right one.
waiving my consultancy fee today (Score:2, Funny)
Re: (Score:1)
First translate the algorithm into Perl, and then run it on a Perl interpreter written in BrainFuck, where the BrainFuck interpreter is itself written in APL, which runs on a machine language written for a drum-based OS from the early 1960s.
Re: (Score:2)
First translate the algorithm into Perl, and then run it on a Perl interpreter written in BrainFuck, where the BrainFuck interpreter is itself written in APL, which runs on a machine language written for a drum-based OS from the early 1960s.
You shouldn't knock Visual BASIC like that...
All aboard the bloat boat... (Score:1)
So now, on top of abstraction layers and lazy code, we can look forward to wasting cycles on advancing the cat-and-mouse game of security. I know I'm going to sound like an old codger, but my daily computing tasks have not really changed substantially since the mid 90s (emails, web browsing, shell access, word processing, etc). Streaming video and modern web technologies are awesome; it's not like there haven't been any worthy advances, but addressing excess power consumption, e-waste, and other associated...
Not Security Through Obscurity (Score:1)
This is a very different usage of "obfuscation" from what people typically use in everyday programming. It's coming out of some recent work in cryptography. See for example:
Candidate Indistinguishability Obfuscation and Functional Encryption for all circuits
http://eprint.iacr.org/2013/451.pdf
What this line of work may allow you to do is have a cloud computer run some code on some data for you, without revealing anything to the computer about either the code or the data. Without breaking the crypto, the cloud...
Re: (Score:2)
It seems to my uninformed mind like you would have to have already processed the data to create the sum total code path that the cloud computer is to run, at which point you no longer need them to run it.
Re: (Score:2)
Still, this does not make software "secure". In typical attacks you just want the software to misbehave in some way that gets you a shell. You do not need to understand what it does for that. In fact, most modern attacks involve fuzzing and then only looking at the specific things that break in order to subvert them. There is no need to understand what the code actually is supposed to be doing.
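A toy sketch of that approach (C++; parse_packet is a hypothetical stand-in for the real target): throw random inputs at the code and only look at the inputs that make it misbehave.

#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Hypothetical function under test; returns false when something goes wrong.
bool parse_packet(const std::vector<uint8_t>& data) {
    return !(data.size() > 2 && data[0] == 0xFF && data[1] == 0xFF); // stand-in bug
}

int main() {
    std::mt19937 rng(1234);
    for (int i = 0; i < 100000; ++i) {
        std::vector<uint8_t> input(rng() % 64);
        for (auto& b : input) b = static_cast<uint8_t>(rng());
        if (!parse_packet(input))
            std::cout << "input " << i << " broke the parser\n";
    }
}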
Re: (Score:3)
And before I forget: these techniques are excellent for hiding backdoors and such, and thereby make software much, much less secure. That may be the real intent. After all, you do not want some vigilante to find the secret government backdoors in everything.
Blast from the past: the Orange Book (Score:1)
This feels like a blast from the past, specifically the Trusted Computer System Evaluation Criteria (TCSEC) [wikipedia.org] aka the "Orange Book."
DoD 5200.28-STD - December 26, 1985 [nist.gov]
4.1 CLASS (A1): VERIFIED DESIGN
Systems in class (A1) are functionally equivalent to those in class (B3) in that no additional architectural features or policy requirements are added. The distinguishing feature of systems in this class is the analysis derived from formal design specification and verification techniques and the resulting high degree of assurance that the TCB is correctly implemented. This assurance is developmental in nature, starting with a formal model of the security policy and a formal top-level specification (FTLS) of the design. Independent of the particular specification language or verification system used, there are five important criteria for class (A1) design verification:
4.2 BEYOND CLASS (A1)
Most of the security enhancements envisioned for systems that will provide features and assurance in addition to that already provided by class (A1) systems are beyond current technology. The discussion below is intended to guide future work and is derived from research and development activities already underway in both the public and private sectors. As more and better analysis techniques are developed, the requirements for these systems will become more explicit. In the future, use of formal verification will be extended to the source level and covert timing channels will be more fully addressed. At this level the design environment will become important and testing will be aided by analysis of the formal top-level specification. Consideration will be given to the correctness of the tools used in TCB development (e.g., compilers, assemblers, loaders) and to the correct functioning of the hardware/firmware on which the TCB will run. Areas to be addressed by systems beyond class (A1) include:
DEF CON 20 - Tom Perrine - Creating an A1 Security Kernel in the 1980s [youtube.com]
DEF CON 20 Archive [defcon.org]
Re: (Score:2)
4.2 BEYOND CLASS (A1)
Most of the security enhancements envisioned for systems that will provide features and assurance in addition to that already provided by class (A1) systems are beyond current technology.
Ah, lovely. Government language at its most... statuesque. That's an incredibly awkward way to say, "Dude! We can't do it!"
That's good (Score:2, Interesting)
Re: (Score:2)
You don't. But we already don't scan the majority of proprietary (or even open source) code that we run on our machines, so effectively, the difference might not be that great.
You can disable its ability to communicate with the outside world, or monitor communications it does make, to warn others that the code may be malicious. But that's about it.
I'm deeply skeptical (Score:2)
Software obfuscation confronts exactly the same core problem as DRM: The goal is to both provide information, in usable form, and not provide the same information, to the same recipient, at the same time. That's impossible. So in both cases all you can do is to try to raise the bar, make it harder to extract the convenient form of the information, but "mathematically proven security properties" must be forever out of reach.
Unless maybe they define "obfuscation" differently than I do.
Re: (Score:2)
It is not the same problem. DRM has to be secure against the machine it runs on. That is impossible. Secure software has to be secure at some perimeter (network socket, IPC interface, etc.), but anything inside this perimeter is assumed to be trustworthy. Secure software _is_ possible.
Re: (Score:2)
It is not the same problem. DRM has to be secure against the machine it runs on. That is impossible. Secure software has to be secure at some perimeter (network socket, IPC interface, etc.), but anything inside this perimeter is assumed to be trustworthy. Secure software _is_ possible.
Obfuscation also has to be secure against the machine it runs on.
Re: (Score:2)
Unless maybe they define "obfuscation" differently than I do.
It wouldn't be the first time this has happened in academic circles.
A friend of mine did his PhD in Artificial Intelligence a couple decades ago, and has been working in the field since then. Some years back we were having a discussion about the Turing Test or something related, but it seemed like we were arguing past each other. Well, after some time the source of the problem came out - at least according to him, what AI people think of when they discuss whether a system "understands" a language (to pick...
Re: (Score:2)
A person (or computer) possessing a (theoretical) book containing every possible response to every conceivable question and statement in, say, Chinese, would be considered to understand Chinese,
I've always had a problem with this definition of the Chinese Room scenario. It's the following:
To be successful in a conversation, that 'book' with responses to questions has to model not just the language, but also the subject domain and the personality of the simulated Chinese speaker. That means that not only does the book have to be huge - we're talking a giant library - but it also has to have a representation of a personality inside. And the ability to store knowledge and alter that personality, depending...
virtual processor? (Score:1)
Re: (Score:2)
Re: (Score:2)
Actually, no. Security functionality at some point must interact with its environment. Homomorphic encryption does _not_ allow the software to make decisions that are hidden from the processor yet get passed to the environment, like an access decision or the like.
Re: (Score:2)
DARPA loves the 'philes (Score:1)
If a compiler can... (Score:4, Insightful)
Re: (Score:2)
As I understand it, it's the compiler output which is obfuscated. Not the source code itself. After all, the original source code must be understood by a human programmer in order to be written in the first place.
Re: (Score:2)
Another programmer who's yet to see any Perl source...
Re: (Score:2)
Even in Perl the original source is usually understood by the programmer. This means that, given enough effort, another human programmer can figure out what it does.
Re: (Score:2)
You are going to need to put it in... (Score:2)
... a different language. Take Ancient Egyptian... without a Rosetta Stone there would be no means to translate it. The whole structure of the language was inscrutable without some sort of introduction to it.
Your program and system ideally should run on custom hardware, not known computer hardware that must conform to known standards. The system will not be as fast or cheap, but it will be so different that it will be difficult to understand. And what programmers cannot understand they will not be able to...
"close to unbreakable" = "breakable" (Score:2)
This really is BS. Sure, you can obfuscate complex functions, e.g. mathematical functions, to the point that reading the code becomes pretty hard. (That includes most crypto, but note that non-standard crypto has a tendency to be insecure, and standard crypto can be recognized.) But even there, an attacker can simply try out the functionality and recognize what it does for the cases that matter. For simple functionality (and most functionality is simple, e.g. data access or access control), this does not work.
safeware (Score:1)
four words, people (Score:2)
Black Box Hacker Challenge. Or variations on the name. It's what I'm calling it.
OK, before you start with "HONEYPOT!", no it isn't, and this isn't a new idea either. It's been done. Many times. By lots of companies. Including Google - and the NSA - all to test security on various bits of software outside of lab conditions. In case you're new here, a BBHC is a standalone event or, more commonly, an integrated part of a hacker convention, where you take a black box (literally, hence the name of the game) loaded with...
What positive aspects could that have? (Score:2)
This seems like something only useful for malware. After all, the only reason you don't want people to know what code they are running is to do something they don't want you to do. And that's essentially the definition of malware.