Researchers Seek Help In Solving DuQu Mystery Language
An anonymous reader writes "DuQu, the malicious code that followed in the wake of the infamous Stuxnet code, has been analyzed nearly as much as its predecessor. But one part of the code remains a mystery, and researchers are asking programmers for help in solving it. The mystery concerns an essential component of the malware that communicates with command-and-control servers and has the ability to download additional payload modules and execute them on infected machines."
It says... (Score:5, Funny)
NSA Property, Keep Out.
Re:NSA (Score:5, Insightful)
Actually, I'll reverse the joke and gun for +1 Insightful.
Ready?
Literally why does this story even exist? This code takes out nuclear reactors and "researchers ask programmers for help"? Really?! (Does "Ask" imply they want the answer FREE?!)
So the Dept of Homeland Security is busy helping yank down file share sites and they have no time for this?
Ladies and Gentlemen and AI's, this is your answer to why we're spiralling into a mess.
Re: (Score:2, Informative)
DHS, conspiracy theories aside, is likely conducting their own investigation into DuQu, the details of which are unlikely to be shared with the general public. TFA is about Kaspersky Lab, an independently owned security firm, asking for help from the general public.
Re: (Score:2)
DHS, conspiracy theories aside, is likely conducting their own investigation into DuQu
No need for that unless they snuffed the original developer before securing the relevant docs.~
Re:NSA (Score:5, Funny)
DHS, conspiracy theories aside, is likely conducting their own investigation into DuQu
No need for that unless they snuffed the original developer before securing the relevant docs.~
Hey, everyone makes mistakes. That drone was supposed to have been loaded with tranquilizer darts, not Hellfires. Boy, there were some red faces in the office when we found out what happened.
Re:NSA (Score:5, Insightful)
Literally why does this story even exist? This code takes out nuclear reactors and "researchers ask programmers for help"? Really?! (Does "Ask" imply they want the answer FREE?!) So the Dept of Homeland Security is busy helping yank down file share sites and they have no time for this?
Why would DHS have anything to do with this? DuQu so far hasn't done anything to American interests (in fact, so far as I can tell, it has helped them). The people in TFA looking at the code are Kaspersky: a Russian anti-virus company. They don't even recognize the language the code is written in, much less how it works, and they are wondering if anyone of the billions of people on the Internet knows (specifically, if it is a specialized language used in some niche industry or something). If no one does, they can be pretty sure it was a custom created language, and proceed accordingly. They aren't asking for someone to do their work for them: they are saying "hey, this look like anything anyone knows?" DHS might be looking at it too, if they didn't create it: but the story has absolutely nothing whatsoever to do with them, in any way. Not even the same continent.
Also, I don't know where you got "takes out nuclear reactors." Stuxnet did damage to nuclear centrifuges. AFAICT all DuQu seems to be doing is stealing data (private keys, actually). Bad for people who get infected, yes. Not like it is causing nuclear meltdowns or something.
Wrong. (Score:1)
No, no DuQu does not, and has never attempted to, 'take out nuclear reactors.' That was a different piece of malware.
It would benefit us all - as well as yourself - if before you commented you educated yourself on the subject of the submitted story.
Re:It says... (Score:5, Interesting)
It looks to me to be the output from the PLC compiler. Clear, count, and compare are basic ladder logic commands.
If you figure out which PLCs the Iranians are using that'll give you the compiler; each brand has its own and you're really unlikely to see it if you haven't used it. How many people here have used DirectSoft? Have you seen Schneider's programming interface?
That would explain why the researchers haven't seen it. You rarely use PLCs outside of industry.
Looks like the SCADA variant (Score:3)
I only took a glance so don't blame me if I am wrong, but it looks like the SCADA variant
More info available at http://en.wikipedia.org/wiki/SCADA
Re: (Score:2)
Lots of libraries are event-driven: X-windows, GUI widgets, Qt, device-drivers. Even certain 2D graphics API's had callbacks like TIGA.
Back in the 1980's, ADTs (Abstract Data Types) were the big thing in C programming. They were the predecessors to object-orientated design. You started by having a typedef'ed structure. Then you had init, allocate, deallocate functions. With function-pointers (something like: int (*procfunc)(int param1, int param2)) stored within that structure, you could do all sorts of C++
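A rough sketch of the pattern described above, in Python rather than C, so the "function pointer" is just a callable stored in a field. All names here (Counter, counter_init, counter_apply) are made up for illustration and are not from the thread; procfunc echoes the declaration quoted above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Counter:
    """Stands in for the typedef'ed C struct (hypothetical example type)."""
    value: int
    # Plays the role of: int (*procfunc)(int param1, int param2)
    procfunc: Callable[[int, int], int]


def add(a: int, b: int) -> int:
    return a + b


def mul(a: int, b: int) -> int:
    return a * b


def counter_init(procfunc: Callable[[int, int], int], start: int = 0) -> Counter:
    """The 'init' function: builds the record and installs the handler."""
    return Counter(value=start, procfunc=procfunc)


def counter_apply(c: Counter, operand: int) -> int:
    """Dispatch through the stored function, as C would via c->procfunc(...)."""
    c.value = c.procfunc(c.value, operand)
    return c.value


# Same calling code, different behavior depending on what was stored.
adder = counter_init(add, start=2)
scaler = counter_init(mul, start=2)
```

The caller never names add or mul after initialization; the behavior travels with the data, which is the part that later became virtual methods.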
Output from an IEC 61131-3 dev kit (Score:2)
Re: (Score:2)
It's actually \|/
Mystery Code (Score:5, Funny)
The mystery code isn't really much of a mystery- it's just how Duqu communicates with the sith lord.
Re: (Score:2)
Besides, I thought Obi Wan killed DuQu?
Re: (Score:3)
Anakin actually killed both DuQu and Obi Wan.
Re: (Score:2)
Sounds insidious.
Re:Looks like assembly to me (Score:4, Insightful)
I kid, I kid...
Why? It's entirely possible that this snippet of code is a piece of in-line assembly. It may have started out coming from some higher level language, but been tweaked or completely rewritten in assembly, and its origin is no longer recognizable.
Re:Looks like assembly to me (Score:4, Insightful)
Or even self modifying assembly....
That would be a real pisser to figure out.
Any sucker (Score:5, Funny)
Re: (Score:2)
Well, I couldn't tell. Because I'm the suckee, not the sucker.
Seriously, that would be kind of disturbing. A virus written using a distributed memory, multi-processor model. The more systems it infects, the more powerful it gets and the larger the problems it can handle.
Re: (Score:2)
On a related note, we had an employee with a last name Lovelace. An older client, always prim and proper, left a message once to see when "Mr. Deepthroat would be stopping by to finish the job".
Got to watch out for those "prim and proper" ones. She probably exhausted Mr. Lovelace by the time he "finished the job" on each visit.
Re: (Score:2)
Actually looks like the result of a macro assembler module. The MOV instructions give it away. The only reason for doing that is to make it faster or to reduce the code size, not necessarily to obfuscate. The programmer is old school.
Re: (Score:3)
I don't understand why they are avoiding this option like the plague. C'mon... practically every compiler compiles its language into assembly and runs that through an assembler for final object code creation. (tho some will then run THAT through an optimizer etc) There's absolutely no reason for them to insist it can't be written in native assembler. I wrote many things for the 6502 that way - if you want it fast and small, that's the way to go.
And sorry, if they have to reverse it back into C++ or som
Re: (Score:3)
I think most are missing the point. They probably already know what it does (if they don't, given the effort they have expended, then they are boobs). What they want to do is find what the language was *in order to track down the authors* on the premise that it was some strange language only used in a few places and if they find it, they can narrow the range of likely candidates.
easy (Score:2)
Probably written in INTERCAL (Score:2)
Learned INTERCAL [wikipedia.org] from Guy Steele in the Comparative Languages course at CMU.
Re: (Score:3)
Who would be insane enough to write OO code in assembly?
Re: (Score:3)
My dad did. Maybe he's behind this. But he was a first generation programmer. Trying to get him to move on from assembly was a pointless endeavor.
Re: (Score:2)
No seriously. When OO became a fad he figured out how to build up macros to support an OO model.
Re: (Score:2)
Right. That limits the suspects to... um... just about anyone who took a second-year computer science course above the level of "See Spot Run".
Re: (Score:2)
Why's that so hard to believe? I've been programming almost exclusively in object-oriented languages for 15 years now, give or take. Chances are no matter what language I write in, whatever I write is going to include many object-oriented features. If I was working with a complex assembly project, a type system would be one of the first things I came up with. From there, it's not much of a stretch to imagine you'd want to associate data with instances of that type, and functions that can operate on them. Ba
Seriously! (Score:5, Insightful)
I'm sure he did write assembly. But Object Oriented assembly?
I'm incredulous that you are incredulous. I thought I saw a book about that somewhere. So I walked over to my tall stack of random language books and there it is:
Object-Oriented Assembly Language, Len Dorfman, McGraw-Hill, 1990
I hereby thwack you upside the head.
Re: (Score:2)
> from assembly was a pointerless endeavor.
ftfy.
Re: (Score:2)
I did when I was in college.
Re: (Score:2)
I used to do that for fun 15 years ago; it is not that hard. There are still some old tutorials on this floating around the net:
http://webcache.googleusercontent.com/search?q=cache:TIHCSoP4378J:yanaware.com/com4me/createcom.php-author%3DErnie%2520MURPHY%26mail%3Dernie%40surfree.com%26url%3Dhttp---here.is-ComInAsm%26idTute%3D39.htm+masm+COM+component&cd=2&hl=en&ct=clnk&gl=us&client=firefox-a [googleusercontent.com]
Re: (Score:1)
I have also. TASM (Borland's Turbo Assembler) had support for it. The assembler would manage a vtable for you among other things. I've also programmed OO in Korn shell. Why OO in assembler or ksh? Because it was the right tool for the job and OO principles can be used anywhere they make sense and help the effort. It's not as far out there as you make it seem.
Re: (Score:2)
Not the whole app. According to TFA, it was written in C++. They even know which implementation. But this particular function (subroutine, method, whatever) appears not to be written in that.
It's something different. Or someone banged out some ASM by hand. If they can figure out what language this routine was written in, they can narrow down the list of possible authors*.
* Come on now. You didn't think the NSA wasn't scraping all the developers' resumes from LinkedIn to build a skill set database to figure
Re: (Score:2)
That's beside the point. Who the fuck cares what is the imaginary high level language this stuff was written in? They are analyzing the somewhat annotated disassembly anyway. To me it looks like it may be the output from some PLC environment. Perhaps it's CoDeSys output. It doesn't matter anyway, there are no tools that will take this and restore the source. It's not like you need something uber-fancy anyway to help with what's the key here: figuring out what the code does.
Re: (Score:2)
Good point, although compared to mainstream tools like MSVC, almost everything has a "small" user base.
Re: (Score:2)
Because it annoys the PhDs, that's why they care.
Think about it. We use high-level languages because they express an idea in fewer words. If I call a TextBox control in C#, that's simpler than the equivalent in Assembly. These people, of course, are annoyed, because without knowing what the higher language was (assuming there was one used), it will take their minds years to analyze what exactly the code is doing; whereas if they knew what the higher language was, they could create a decompiler, and have some
Re: (Score:2)
Creating a "decompiler" isn't exactly trivial. The types of analyses you have to do on machine code compiled with today's optimizing compilers are fairly generic, they will give you some higher-level representation of the code no matter what was the underlying language. Those tools recognize certain patterns to provide even higher level information, but at a basic level they pretty much repeat what a compiler would do: there's data flow and control flow analysis, and a whole lot of inference based on those.
Re: (Score:2)
Two points:
1.) No one said it was trivial, but for a capable researcher who has spent a fair portion of their life dealing with decompilers (and writing a few of their own), they probably have an idea how to do it fairly quickly.
2.) While it's possible to walk through reams of Assembly code, it's painful. Extremely painful. A 100-line 'function' in Assembly code will cause most programmers to pause, and 100,000 lines of Assembly code ('functions' and all) will break even the most vigilant of programmers.
Re: (Score:2)
Just for reference, I have a PhD, I work on compilers and runtime systems. People like me program in assembly, we program in HLLs, whatever works. We will (for actual example) pore over 88 pages of assembly-language output from a compiler in order to find the register allocation bug. Other people I've worked with on compilers (some with PhDs, some not) do things like diagnosing a C optimizer bug based only on the C++ input to cfront (later run through a C compiler) and the busted output, or, after ponder
answer is simple (Score:1)
It's in ROT-13 Pig Latin.
I'll take my paycheck in gum, Trident Layers to be specific.
Uhh what? (Score:2)
...and here's me thinking that compiled code has already been reduced to machine code.
Re:Uhh what? (Score:4, Interesting)
Re: (Score:1)
"This compiler will self-destruct in 10 seconds - 'squelch...'"
Re: (Score:2)
Or just a regular code obfuscator.
Re: (Score:1)
Regular code obfuscators are pretty obvious to spot, and you can usually fingerprint which obfuscator was used, if it wasn't homemade. Whatever this code is, it's not something you see in everyday asm.
Re: (Score:1)
Assembler is a 1-to-1 correlation with machine code. Simple software can switch between the two.
As explained in the article (blasphemy, I know), high-level languages and the compilers they use tend to leave evidence in the machine code, which can be recognized by some of the real code-nutters when decompiled into assembler.
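To make the "simple software can switch between the two" point concrete, here is a toy sketch. The six opcodes are real single-byte x86 encodings, but the table and the assemble/disassemble helpers are made up for illustration. Real x86 adds operands and multi-byte encodings, and some mnemonics have more than one valid encoding, so the mapping is only roughly 1-to-1 in practice.

```python
# A handful of real single-byte, operand-free x86 instructions.
MNEMONIC_TO_BYTE = {
    "nop": 0x90,
    "ret": 0xC3,
    "int3": 0xCC,
    "hlt": 0xF4,
    "clc": 0xF8,
    "stc": 0xF9,
}
# Inverting the table is all it takes to go the other way.
BYTE_TO_MNEMONIC = {byte: name for name, byte in MNEMONIC_TO_BYTE.items()}


def assemble(lines):
    """Turn a list of mnemonics into machine code bytes (toy subset only)."""
    return bytes(MNEMONIC_TO_BYTE[line.strip().lower()] for line in lines)


def disassemble(code):
    """Turn machine code bytes back into mnemonics (toy subset only)."""
    return [BYTE_TO_MNEMONIC[byte] for byte in code]


machine_code = assemble(["nop", "clc", "ret"])
listing = disassemble(machine_code)
```

Nothing about the original source language survives in that table lookup, which is why researchers have to lean on indirect evidence such as calling conventions and recurring instruction patterns.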
Re:Uhh what? (Score:4, Insightful)
A compiler takes your high-level language instructions, and generates the many, many low-level instructions it might take to express a given high-level instruction. The thing is, much like there's many ways to write a cover letter for a resume, there's a lot of different ways to do that high->low expression, but a compiler writer is unlikely to bother with more than one way, or maybe a couple others if there's some benefit to doing so.
A person, on the other hand, will have all sorts of random variations in what they write. Oh, they'll come up with certain ruts, and have a certain style, but they won't be exactly the same every single time.
Compilers also do useless stuff. For a car analogy, it's kind of like the tow hooks under your bumper--most of the time they aren't used. A person isn't going to bother to put them there unless they're currently needed or they can envision a need for them--a compiler never forgets to put those hooks there, and sometimes puts them there even when it's redundant. Optimization gets rid of that kind of thing, but no compiler is perfect, and they're often conservative.
it was written in assembly language (Score:4, Funny)
that's just a guess
but the level these guys are working at here, something well above script kiddie and slightly below elder neckbeard, it seems entirely plausible to me
Re:it was written in assembly language (Score:5, Interesting)
Well, if it's above the advanced level of Neck-beard the Gray then it's even more advanced than something like a tiny VM that interprets encrypted bytecode and has re-allocatable variable width opcodes such that the second time you encounter an instruction it may not do the same thing. E.g.: my opcodes are arithmetic encoded and encrypted with an evolving 12-bit block cipher; additionally, each execution swaps a few "function pointers" that the op-codes invoke. The compiler for my VM makes several passes to discover the optimal compression, encryption, and initial opcode-to-action table to use. To reverse engineer such a beast requires manually stepping through machine code from the very first instruction -- that is, given a partial sample of code, no amount of visual analysis will reveal what it does. The language used to write programs for it? ASM, or a subset of C; though it could be Java, Python or any other high level language -- that's the beauty of compilers.
Not saying this is what's been done, just that I've done and seen some VERY wicked code. I once cracked DRM that was implemented in enciphered MIPS and used such an embedded VM. It looked like the input language for the generated opcode was C.
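A heavily simplified sketch of just one of the ideas above: a dispatch table that changes after every instruction, so the same opcode number means different things at different points. It leaves out the arithmetic coding and the block cipher entirely, the rotate-by-one rule is an assumption chosen for brevity, and every name (run, encode, op_push and so on) is hypothetical rather than taken from the poster's VM.

```python
def op_push(stack, operand):
    stack.append(operand)


def op_add(stack, operand):
    stack.append(stack.pop() + stack.pop())


def op_mul(stack, operand):
    stack.append(stack.pop() * stack.pop())


INITIAL_TABLE = [op_push, op_add, op_mul]


def rotate(table):
    """Assumed mutation rule: shift every handler one slot to the left."""
    return table[1:] + table[:1]


def encode(actions, table):
    """'Compiler' pass: pick the opcode that selects each handler at that step."""
    table = list(table)
    program = []
    for handler, operand in actions:
        program.append((table.index(handler), operand))
        table = rotate(table)
    return program


def run(program, table):
    """Execute (opcode, operand) pairs, mutating the table after each one."""
    table = list(table)  # private copy so the caller's table is untouched
    stack = []
    for opcode, operand in program:
        table[opcode](stack, operand)
        table = rotate(table)
    return stack


# (2 + 3) * 4, expressed as handlers, then encoded against the evolving table.
actions = [(op_push, 2), (op_push, 3), (op_add, 0), (op_push, 4), (op_mul, 0)]
program = encode(actions, INITIAL_TABLE)
result = run(program, INITIAL_TABLE)
```

In the encoded program, opcode 2 appears twice in a row and means "push" the first time and "add" the second, so a fragment lifted from the middle cannot be decoded without replaying everything before it.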
The government employees paid to come up with such a thing would be at most on-par with the masses of crypto nerds that joygasm over such things -- Who do you think they would hire? There's not some magical government-only breed of human with super hacker powers... Ergo, they must hire from the available pool of people, and since they don't hire us all, or even necessarily the absolute brightest, the highest level of hackerdom they could employ would be on-par with "the advanced neck beard" at most.
Re:it was written in assembly language (Score:5, Funny)
fine, you've made your point
but the official coder manual officially classifies neckbeards as
young neckbeard, adult neckbeard, elder neckbeard, and ancient neckbeard
with Hit Points 100, 300, 700, and 1500, respectively
the ancient variety is allowed to cast Befuddlement at will with a savings throw adjustment of -6 on your character's intelligence rating. i see you tried to cast that spell in your past post
but i have no idea what this "advanced" neckbeard is you refer to. i don't think such a neckbeard classification exists... oh shoot, did you just Befuddle me?
fine, i'll wait out the next 3 turns
*sigh*
It's either: (Score:5, Funny)
Possibly? (Score:1)
erlang (Score:5, Insightful)
My guess is that it's probably Erlang. It fits all the descriptions of how Erlang works. Erlang is used in all sorts of realtime systems, so it wouldn't be a stretch to see that it was used in a virus library. Someone in the telecom or network infrastructure industry might be familiar with Erlang, and that type of person might also know enough about networks and network vulnerabilities to architect a framework for virus distribution.
Re: (Score:2)
+1 insightful. I haven't thought of Erlang!
Re: (Score:2)
But wouldn't Erlang have separate functions for each callback? Everything else is very similar.
Another architecture this looks similar to is the X Toolkit event library...
Perl (Score:3)
That clearly looks like perl to me.
Re:Perl (Score:5, Funny)
Re: (Score:2)
IMHO compiling may actually make it more readable :p
FTFA (Score:2)
the decoded message is (Score:3, Funny)
"Be sure to drink your Ovaltine."
Re: (Score:2)
"Be sure to drink your Ovaltine."
Or you'll shoot your eye out?
Spin language? (Score:1)
Forth ? (Score:2)
One of the comments on the page already said that.
I remember I disassembled Forth a lot of years ago.
It comes in 2 flavours: interpreted and compiled.
It relies on RPN heavily.
It's a very compact language, both in source and in compiled form.
You extend the language by using "words", and it's like OOP.
It's one of the weirdest languages I ever used.
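For anyone who hasn't met RPN or Forth-style "words", a minimal sketch of the idea follows. This is a toy evaluator with a handful of built-ins, not real Forth; forth_eval and the BUILTINS table are made-up names, and things like control flow, the return stack and compilation are left out.

```python
def _sub(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a - b)


def _swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


# Built-in words operate directly on the data stack.
BUILTINS = {
    "+": lambda s: s.append(s.pop() + s.pop()),
    "*": lambda s: s.append(s.pop() * s.pop()),
    "-": _sub,
    "dup": lambda s: s.append(s[-1]),
    "swap": _swap,
}


def forth_eval(source):
    """Evaluate a whitespace-separated RPN program and return the final stack."""
    stack = []
    words = {}  # user-defined words: name -> list of tokens

    def execute(tokens):
        tokens = iter(tokens)
        for tok in tokens:
            if tok == ":":  # start of a definition: ": name body ;"
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = body
            elif tok in words:
                execute(words[tok])  # a user word expands to its body
            elif tok in BUILTINS:
                BUILTINS[tok](stack)
            else:
                stack.append(int(tok))  # anything else is a number literal

    execute(source.split())
    return stack
```

For example, forth_eval("2 3 + 4 *") leaves [20] on the stack, and forth_eval(": square dup * ; 7 square") leaves [49]; defining square is what "extending the language" looks like here.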
Sounds like.. (Score:1)
... it's Java!