English Shell Code Could Make Security Harder 291

Posted by ScuttleMonkey on Monday November 23, 2009 @09:33PM from the little-bobby-tables-takes-up-writing dept.

An anonymous reader writes to tell us that finding malicious code might have just become a little harder. Last week at the ACM Conference on Computer and Communications Security, security researchers Joshua Mason, Sam Small, Fabian Monrose, and Greg MacManus presented a method they developed to generate English shell code [PDF]. Using content from Wikipedia and other public works to train their engine, they convert arbitrary x86 shell code into sentences that read like spam, but are natively executable. "In this paper we revisit the assumption that shell code need be fundamentally different in structure than non-executable data. Specifically, we elucidate how one can use natural language generation techniques to produce shell code that is superficially similar to English prose. We argue that this new development poses significant challenges for in-line payload-based inspection (and emulation) as a defensive measure, and also highlights the need for designing more efficient techniques for preventing shell code injection attacks altogether."



This is (Score:4, Funny)

by Anrego ( 830717 ) * writes: on Monday November 23, 2009 @09:41PM (#30209212)

quite terrifying :(
If hackers convert arbitrary x86 shell code into sentences that read like spam, but are natively executable .. we're all screwed :(
We'll either need to tighten up how architectures execute instructions to make it harder to execute shell code in the first place.. or come up with sophisticated AI to help filter out the shell code. Of course, as soon as we do that, hackers will develop AIs which can write convincing (and even compelling) shell code.. and THEN what the hell do we do.
Now where I live you can get a pretty decent hair cut for $17 (they even trim up the beard). You can't get anything fancy.. but a decent, professional-ish type haircut is definitely no problem.
My employer is giving us a pretty generous Christmas vacation.. really looking forward to that!!
Also this time of year is great cause CHRISTMAS is everywhere :D

- Re:This is (Score:5, Informative)
  
  by Wovel ( 964431 ) writes: on Monday November 23, 2009 @10:00PM (#30209358) Homepage
  
  Guess you missed their "compromised" machine assumption. "...After successful exploitation of a software vulnerability, we assume that a pointer to the shellcode...". The sky is not really falling any faster today than it was yesterday.
  
  - Re:This is (Score:5, Informative)
    
    by blueg3 ( 192743 ) writes: on Monday November 23, 2009 @10:11PM (#30209426)
    
    Pinning down terminology use by security researchers is tricky.
    In this case, what they mean is that the system has a vulnerability that enables code from a remote source to be executed, and that the input from the remote source is being run through a filter that attempts to identify executable code (in order to block it) versus English text.
    On an already-secure system, this makes no difference at all. Those don't exist, much. If you were relying on a "looks like executable code" filter to protect you, this is a tip that it's not that secure. The paranoid should already assume so (based on things that already are available in Metasploit, if nothing else).
    
    - Re:This is (Score:5, Insightful)
      
      by wvmarle ( 1070040 ) writes: on Tuesday November 24, 2009 @12:20AM (#30210086)
      
      As is being argued all the time: security is about layers. Layer upon layer. One layer to prevent executable code from reaching your system in the first place by looking at the content of a message. Another layer to prevent code that does reach your system from being executed at all. Another layer to prevent untrusted code that does manage to be executed from doing any damage (sandbox, permissions). Relying on a single layer of defense is not secure, no matter what that layer is or how strong that layer is. Breach that one layer and you're in.
      This research gives at the very least a proof-of-concept on how to breach that first layer of security. And that of course is significant.
      Of course there are no 100% secure systems - but the more layers of defense, the more secure it becomes. This takes away one layer of defense, thus making a system less secure. So yes it does make a difference even on "already-secure" systems.
      
- Re: (Score:3, Insightful)
  
  by afidel ( 530433 ) writes:
  
  Isn't this what NX is supposed to stop, execution of arbitrary data as code?
  - Binaries that opt out of NX (Score:3, Informative)
    
    by tepples ( 727027 ) writes:
    
    Isn't this what NX is supposed to stop, execution of arbitrary data as code?
    Then you compromise a binary that has opted out of strict NX, such as a Java virtual machine that needs to dynamically recompile JVM bytecode to x86 machine code.
    - Re: (Score:2)
      
      by afidel ( 530433 ) writes:
      
      Yes, but that should dramatically reduce your attack surface, well except for stupid Flash Player and Acrobat, Adobe can't code their way out of a paper bag.
  - Re: (Score:2)
    
    by blueg3 ( 192743 ) writes:
    
    Yes -- in theory, memory should be W xor X: writable or executable, but never both. This is then solved neatly. However, this is often not the case. It's a little bold on Von Neumann machines, where code and data share the same memory, to hope that code and data can be cleanly separated reliably.
    The most egregious case is interpreters, where data that's passed around is turned into executable code dynamically. Less egregious but still unsafe is dynamically-generated code, which must be both writable and executabl
    - Re: (Score:2)
      
      by XDirtypunkX ( 1290358 ) writes:
      
      But it doesn't have to be both writable and executable at the same time, unless the generated code is self modifying.
    - Re: (Score:3, Interesting)
      
      by nneonneo ( 911150 ) writes:
      
      Unfortunately, this does not fully solve the problem. Say, for instance, that you've managed to get a buffer overflow on a system, and you now have control over the stack (which is marked RW, but not X). Then, you overwrite the return address of the current function to mprotect() and stick some arguments on it which change the stack protection to RX (there are good reasons for doing this in actual practice, e.g. executable compressors like UPX, or executable thunks on the stack); this type of attack is know
      - Re:This is (Score:4, Insightful)
        
        by blueg3 ( 192743 ) writes: on Monday November 23, 2009 @11:39PM (#30209898)
        
        Even better: inputs that can overwrite the stack can perform arbitrary code execution even if the stack is never executable, via "return-to-libc" programming.
        
- Re: (Score:2, Funny)
  
  by mysidia ( 191772 ) writes:
  
  I propose the x86 instruction set be altered to add an additional byte to every instruction, a NUL byte or NUL word, so every instruction will have an additional 2 to 8 bytes of overhead, at least 1 must be set to all bits 0, and the following byte must be set to all bits 1.
  Since the NUL byte cannot be expressed in a sentence and commonly causes I/O to terminate (i.e. delineates the end of the string), x86 code can then not be disguised as a sentence.
  Also, the following byte being all bits 1, assures
  - Re: (Score:3, Insightful)
    
    by x2A ( 858210 ) writes:
    
    Well then that won't be the x86 instruction set, will it?
    - Re: (Score:2, Interesting)
      
      by mysidia ( 191772 ) writes:
      
      No, it won't be the legacy x86 instruction set.
      But we can call it the "Secure x86 instruction set" or the "Enhanced x86 instruction set"
      Market it properly, and everyone will switch to it, because they think it's faster and safer.
      - Re: (Score:3, Insightful)
        
        by x2A ( 858210 ) writes:
        
        If you've got the ability to market a processor that won't run people's old software, and using it makes software slower and take up more memory (think of single-byte instructions: a single byte of padding doubles the space they take up, which in effect halves the size of your L1/L2 caches), to a level sufficient to get people to actually buy it, then you may as well not even bother with the CPU, just convince them to give you money for nothing, as obviously your marketing team are that good that
        
        Re: (Score:2)
        
        by x2A ( 858210 ) writes:
        
        No. We're talking realising that exaggeration. This CPU wouldn't even run those.
- Re: (Score:2)
  
  by Blakey Rat ( 99501 ) writes:
  
  I know, I'm going to have to stop saving and trying to execute all my incoming spam messages.
  Maybe I'll try executing my IMs...
- Re: (Score:2)
  
  by zippthorne ( 748122 ) writes:
  
  It's even worse than that. With liberal use of jumps, the hackers can edit the jumped-over text to make sentences that actually mean something, rather than simply superficially looking like English. They could, for instance, combine a fork bomb with a screed about cheap haircuts that really aren't.
  Now, if I'm reading right, on page 7 there is a diagram which seems to imply that they also have a solution to the halting problem...
- Apple already did this (Score:2)
  
  by syousef ( 465911 ) writes:
  
  If hackers convert arbitrary x86 shell code into sentences that read like spam, but are natively executable .. we're all screwed :(
  It's called Hypercard.
- - Re:This is (Score:4, Funny)
    
    by BradleyUffner ( 103496 ) writes: on Monday November 23, 2009 @09:48PM (#30209262) Homepage
    
    I believe you missed the virus he just sent you. :)
    
- - Thanks (Score:4, Informative)
    
    by turgid ( 580780 ) writes: on Tuesday November 24, 2009 @07:09AM (#30211680) Journal
    
    What is "shell code" supposed to be? Bourne shell scripts?
    Someone had to ask it!
    From the wikipedia [wikipedia.org]: In computer security, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised machine. Shellcode is commonly written in machine code, but any piece of code that performs a similar task can be called shellcode. Because the function of a payload is not limited to merely spawning a shell, some have suggested that the name shellcode is insufficient.[1] However, attempts at replacing the term have not gained wide acceptance.
    So it's a poor piece of new terminology that has stuck, unfortunately.
    
Oh great - that love letter from the IRS (Score:4, Funny)

by rcpitt ( 711863 ) writes: on Monday November 23, 2009 @09:43PM (#30209232) Homepage Journal

just formatted my hard disk and installed Windows 7 - how low can you get :(

- - Re: (Score:2)
    
    by roguetrick ( 1147853 ) writes:
    
    You think that's exceptional, after I read that my head morphed into a facsimile of Ballmer's 0:`-( ))
Confused (Score:2)

by MichaelSmith ( 789609 ) writes:

Does TFA talk about shell code or assembler code?
- Re: (Score:3, Insightful)
  
  by icebraining ( 1313345 ) writes:
  
  It's a shellcode [wikipedia.org]; it's actually written in machine code.
- Re: (Score:2)
  
  by blueg3 ( 192743 ) writes:
  
  Shellcode is machine code. That is, compiled assembler.
  It's just a logical extension of the shellcode filters that Metasploit already provides. If you hadn't thought it through, though, it's an important proof-of-concept.
- Re: (Score:2)
  
  by Blakey Rat ( 99501 ) writes:
  
  Shellembler code.
  Common mistake.
- Re:Confused (Score:5, Informative)
  
  by Ungrounded Lightning ( 62228 ) writes: on Tuesday November 24, 2009 @12:14AM (#30210070) Journal
  
  TFA uses the security community's special term "(a) shellcode", which means something other than what it sounds like to ordinary programmers.
  "A shellcode" is the infection head of an exploit - the thing you try to get to run on the target to make the rest of the exploit work. It's in the machine language of the target, not a shell language.
  It's called "a shellcode" because it typically (but not necessarily) tries to sucker the system into launching a shell to run the rest of the exploit. The rest of the exploit may be in a shell language (depending on the shell to interpret it), a machine language executable, etc. Or "the shellcode" may do something else than launch a shell.
  This is one of the latter cases. It's a chunk of self-modifying code (due to the limits of what instructions you can get out of English-looking text) that bootstraps its own internals into something that can act as an interpreter (or other executor) for the rest of the English-looking exploit code, then runs though that code and "makes it happen".
  You can think of it as a binary executable program that depends on self-modification to get away with consisting only of combinations of bytes that look enough like English to fool spam filters which are trying to recognize executable code.
  So it's a very goofy binary and there are no shells or shell languages involved. Instead (if I read this right) the researchers built a very screwy assembler that takes as input an assembler source program and produces as output some VERY screwy machine code that looks like English and ends up doing the same job in a roundabout way, rather than being the direct translation of the assembler code input.
  
- - Re:Confused (Score:4, Informative)
    
    by The MAZZTer ( 911996 ) writes: <megazzt&gmail,com> on Monday November 23, 2009 @10:04PM (#30209380) Homepage
    
    Nope, you're confusing assembly code and shell/machine code, which are two different things.
    Assembly is text-based, and is readable for people who know the language. Each operation is a keyword, and some take arguments. It's basically the lightest-weight possible programming language (although it's not really considered a programming language, it's so light weight!) A computer cannot run assembly code directly.
    Machine code is what you get if you take the assembly and run it through an assembler to produce code that the computer can understand. The computer can then execute it. It is not human readable unless you've memorized which opcodes correspond to which assembly keywords. Far easier to pipe it through a disassembler to get the assembly code back and read that.
    To answer the GP's question this sounds like they mean shell code. It wouldn't be very useful as assembly code anyway. ("To claim your free iPod, run this sentence through masm and run the resulting EXE file.") Most people don't have an assembler and the ones who do aren't usually susceptible to malware anyway.
    
    - Re: (Score:2)
      
      by MichaelSmith ( 789609 ) writes:
      
      It's a bit like people who put obscure Perl code in their sig, waiting for somebody to run it out of curiosity.
    - Re: (Score:2)
      
      by Nazlfrag ( 1035012 ) writes:
      
      This is machine code that is restricted to only those opcodes found in English phrases with tricks to get other opcodes via self modification. Quite nifty really.
This very comment (Score:5, Funny)

by ewg ( 158266 ) writes: on Monday November 23, 2009 @09:51PM (#30209288)

Why, this very comment prints a list of prime numbers less than one hundred!

- Re:This very comment (Score:5, Funny)
  
  by The MAZZTer ( 911996 ) writes: <megazzt&gmail,com> on Monday November 23, 2009 @10:06PM (#30209400) Homepage
  
  Where do the numbers print out I don't see325072$OGO^%$#G@!!)%@^)&@!^%$$36PEER TIMEOUT
  
OMG! (Score:5, Funny)

by mhajicek ( 1582795 ) writes: on Monday November 23, 2009 @09:53PM (#30209302)

Now your brain can catch a virus just by reading!!!1

- Re:OMG! (Score:5, Funny)
  
  by Nethead ( 1563 ) writes: <joe@nethead.com> on Monday November 23, 2009 @09:58PM (#30209338) Homepage Journal
  
  Leave the bible out of this!
  
  - Re:OMG! (Score:4, Interesting)
    
    by wizardforce ( 1005805 ) writes: on Monday November 23, 2009 @10:28PM (#30209526) Journal
    
    You joke but what is a meme (religions are "memes") really other than a self replicating piece of language? The *extreme* bits act in many ways like a virus does: self replication, performing specific tasks, adapting to their environment (like some of the more insidious malware) and neither viruses nor memes can replicate on their own; they need a "host."
    
    - Re:OMG! (Score:5, Funny)
      
      by Nethead ( 1563 ) writes: <joe@nethead.com> on Monday November 23, 2009 @10:40PM (#30209586) Homepage Journal
      
      So now that you've explained my joke, do you get it?
      
    - Re: (Score:2)
      
      by roguetrick ( 1147853 ) writes:
      
      Thanks for the sub-wikipedia summary level class on memes, professor. Maybe next you can present to us your grand theory on how girls don't like nice guys, or some other such bullshit.
    - Re: (Score:2)
      
      by NotQuiteReal ( 608241 ) writes:
      
      No, I Say "OMG", You Say "Ponies!".
- Re: (Score:2)
  
  by enoz ( 1181117 ) writes:
  
  The English language is infected, do not translate this message.
  - Re: (Score:2)
    
    by MBCook ( 132727 ) writes:
    
    Ah, The Funniest Joke in the World [wikipedia.org]. Oddly topical for this topic, eh?
    - Re: (Score:2)
      
      by enoz ( 1181117 ) writes:
      
      I was referencing Pontypool [imdb.com] but that Monty Python skit is also relevant.
- Re:OMG! (Score:4, Funny)
  
  by Concerned Onlooker ( 473481 ) writes: on Tuesday November 24, 2009 @12:38AM (#30210160) Homepage Journal
  
  Yes, it's a simple head code. Any English schoolboy could catch it.
  
That was rather pretty (Score:2, Interesting)

by jaymz2k4 ( 790806 ) writes:

I just have to point out how well that PDF looked from a purely graphic point of view... That is all. Interesting content to boot.
- Re: (Score:2)
  
  by Wovel ( 964431 ) writes:
  
  I actually agree it was good looking and a fairly interesting read.
- Re: (Score:2, Informative)
  
  by sten ben ( 1652107 ) writes:
  
  Looks like LaTeX [latex-project.org] with a CHI [rwth-aachen.de] template. But maybe that was what you were getting at? Pretty it is.
  - Re: (Score:2, Informative)
    
    by gzipped_tar ( 1151931 ) writes:
    
    The PDF file itself was generated using Adobe Distiller for Mac. Not sure what is used to generate the original. Since they were using Adobe, it's not likely that they were using LaTeX.
    - Re: (Score:2, Informative)
      
      by sten ben ( 1652107 ) writes:
      
      Since they were using Adobe, it's not likely that they were using LaTeX.
      Except the .dvi file extension. And: Creator: dvips(k) 5.97 Copyright 2008 Radical Eye Software
      Acrobat was probably only used to convert the ps to pdf.
      - Re:That was rather pretty (Score:4, Informative)
        
        by dubaiguy ( 1684890 ) writes: on Monday November 23, 2009 @10:49PM (#30209648)
        
        It's LaTeX with an ACM template. I'm pretty sure their workflow was LaTeX (.dvi) to dvips (.ps) to Acrobat Distiller (.pdf).
        
Antelope museum (Score:5, Funny)

by beej ( 82035 ) writes: on Monday November 23, 2009 @10:37PM (#30209580) Homepage Journal

Consume more trains, Elvis! He, and snorkels, drink elephant's sock puppet master. Steamed cabbage can reverse big piles of ducks. Additionally, cheese log cabin nightmare.
You're screwed now, x86 suckas!

- Re:Antelope museum (Score:5, Informative)
  
  by slashqwerty ( 1099091 ) writes: on Tuesday November 24, 2009 @01:12AM (#30210312)
  
  For those that are curious, here is some actual exploit code from the paper [jhu.edu]:
  There is a major center of economic activity, such as Star Trek, including The Ed Sullivan Show. The former Soviet Union. International organization participation Asian Development Bank, established in the United States Drug Enforcement Administration, and the Palestinian territories, the International Telecommunication Union, the first ma
  
  The bold characters are code. The rest have no net effect.
  
  Their strategy is to break the exploit into two pieces: a small executable decoder and the payload. As you might imagine, the decoder decodes the payload. The payload is encoded in a benign-looking format, which is simple enough. Their goal was to make the decoder also look like benign data. To achieve that, their tool takes an existing decoder and automatically converts it to English-looking prose like the paragraph above. The tool is able to convert a decoder in less than an hour on commodity hardware.
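
The decoder/payload split described above can be sketched with a toy encoding. This is only an illustration under stated assumptions: the nibble-to-letter scheme, the `ALPHABET` constant, and the `encode`/`decode` names are invented here and are not the encoding used in the paper, and the output is letters only, not English-looking prose. The bytes used are an inert placeholder, not working code.

```python
# Toy illustration only: NOT the paper's encoding.
# Each byte is split into two 4-bit nibbles, and each nibble is written
# as one of 16 lowercase letters, so the encoded payload is letters only.
ALPHABET = "abcdefghijklmnop"  # hypothetical 16-letter alphabet


def encode(payload: bytes) -> str:
    """Turn arbitrary bytes into a letters-only string."""
    out = []
    for byte in payload:
        out.append(ALPHABET[byte >> 4])    # high nibble
        out.append(ALPHABET[byte & 0x0F])  # low nibble
    return "".join(out)


def decode(text: str) -> bytes:
    """Recover the original bytes; this plays the role of the decoder."""
    nibbles = [ALPHABET.index(ch) for ch in text]
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((hi << 4) | lo for hi, lo in pairs)


placeholder = b"\x00\xff\x4a"  # inert example bytes
encoded = encode(placeholder)
restored = decode(encoded)
```

In a scheme like this the encoded payload is easy to make look harmless. The hard part, and the part the paper is reported to address, is that the decoder itself must also be made of text-like bytes.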
  
I'm screwed (Score:2)

by nedlohs ( 1335013 ) writes:

Since the first thing I do with all my emails is save the text and run it as a binary executable.
I CAN BE PLAYED ON RECORD PLAYER X (Score:3, Insightful)

by rpresser ( 610529 ) writes: <rpresser AT gmail DOT com> on Monday November 23, 2009 @10:58PM (#30209706)

Let the T-C wars continue!

We're doomed! (Score:2)

by REggert ( 823158 ) writes:

Oh noes! If only we had a way to detect and filter text that looks like spam....
So what? (Score:3, Interesting)

by Fnord666 ( 889225 ) writes: on Monday November 23, 2009 @11:17PM (#30209794) Journal

I guess I don't see the big deal in this paper. Yes, they can encode the shell code into English sentences. It's still meaningless to the recipient and should raise suspicion. It would be far easier to use simple steganographic techniques to embed the shell code into any image transmitted between two systems. The recipient would not suspect any alteration and filters would not have the original image for comparison. Just a thought. Maybe I should write a response paper.
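
For comparison, the "simple steganographic techniques" mentioned above could look roughly like the least-significant-bit sketch below. It is a minimal sketch under assumptions: the cover is treated as a flat sequence of 8-bit sample values, and the `embed` and `extract` names are made up for illustration. Unlike the English shellcode in the paper, data hidden this way is not directly executable and still needs a separate extractor on the receiving side.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Hide message bits in the lowest bit of each cover byte."""
    bits = [(byte >> shift) & 1
            for byte in message
            for shift in range(7, -1, -1)]  # most significant bit first
    if len(bits) > len(cover):
        raise ValueError("cover data too small for message")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the lowest bits of the stego data."""
    result = bytearray()
    for i in range(length):
        value = 0
        for sample in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        result.append(value)
    return bytes(result)


cover = bytes(range(256))     # stand-in for raw pixel values
stego = embed(cover, b"hi")
recovered = extract(stego, 2)
```

Each cover value changes by at most one, which is why the alteration is hard to spot without the original image.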

- Re: (Score:2)
  
  by nneonneo ( 911150 ) writes:
  
  When the recipient is a computer system and no humans are involved, this becomes far more dangerous (and besides, these messages look like educated spam rather than total gibberish, and would probably even pass a simple spam filter).
  Basically, the paper is talking about defeating signature or heuristic analysis of shellcode. Normal shellcode looks nothing like English text, whereas this code has a very similar statistical distribution to real English text, meaning that heuristics likely would not flag the c
Linux version (Score:5, Funny)

by noidentity ( 188756 ) writes: on Monday November 23, 2009 @11:17PM (#30209796)

They also came up with a Linux version, which even works on non-x86 architectures, all the while looking like plain English:
"Please type the following on your command-line:
rm -rf *
Thank you."

Excellent Presentation (Score:5, Informative)

by rochberg ( 1444791 ) writes: on Monday November 23, 2009 @11:49PM (#30209948)

This talk was probably my favorite at CCS this year. Unlike MANY researchers, the lead author of this paper was quite entertaining. Regarding the work itself, there are a few details that the current discussion has missed.
First, I would not say that they can convert arbitrary shell code to English-like prose. Rather, the only instructions that can be used are the ones that are identical to the ASCII encoding of the alphabet. For instance, the ASCII encoding of the letter "r" (0x72) is identical to the opcode for a conditional short jump (jb). Granted, the authors showed that you can do a lot with this limited set of instructions, but I still wouldn't call it arbitrary.
Second, he showed several examples of the sentences created. They make about as much sense as "Lorem ipsum dolor sit amet..." The tight constraints on the instructions that can be encoded into ASCII make crafting decent English syntax nearly impossible. Spam filters based on natural language processing could probably detect and flag them.
While disguising the binary as ASCII is cool, I don't see that it's all that different than other exploits. Once a sentence containing an exploit is detected, you'll have signatures just like any other type of virus/trojan. I highly doubt that contemporary anti-virus scanners stop working on data that looks like ASCII. Rather, they look for tell-tale signs of particular instructions that appear in particular orders, etc.
And, as many others have pointed out, this code is only harmful if it is executed in the right context (i.e., you have a vulnerability to exploit). Disguising the code as ASCII doesn't really make it different than any other type of zero-day attack.
This work was very sophisticated, and there's no way that script kiddies could build something like this. I don't know that more advanced attackers would bother, because I really don't see all that much of a payoff given the amount of work that this attack requires. It's a whole lot easier to take over a vulnerable web server and launch a XSS attack. The incentives simply do not seem to suggest that this technique will become widespread.
So, no, I don't think the sky is falling because of this attack. Having said that, though, this was a very cool piece of work.
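
To make the point about letters doubling as instructions concrete, here is a small lookup based on the standard 32-bit x86 one-byte opcode map. It is a sketch, not a disassembler: it only says what a byte means when it happens to sit at the start of an instruction, it ignores operand bytes, and the `first_byte_meaning` helper and its tables are names chosen for this example.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
JCC = ["jo", "jno", "jb", "jae", "je", "jne", "jbe", "ja",
       "js", "jns", "jp", "jnp", "jl", "jge", "jle", "jg"]

# A few other text-like bytes (32-bit mode, first byte of an instruction).
MISC = {
    0x20: "and r/m8, r8",                 # space
    0x2C: "sub al, imm8",                 # comma
    0x2E: "cs segment-override prefix",   # period
    0x61: "popa",                         # a
    0x64: "fs segment-override prefix",   # d
    0x65: "gs segment-override prefix",   # e
    0x66: "operand-size prefix",          # f
    0x67: "address-size prefix",          # g
    0x68: "push imm32",                   # h
    0x6A: "push imm8",                    # j
}


def first_byte_meaning(ch: str) -> str:
    """Describe one ASCII character read as the first byte of an instruction."""
    b = ord(ch)
    if 0x40 <= b <= 0x47:
        return f"inc {REGS[b - 0x40]}"
    if 0x48 <= b <= 0x4F:
        return f"dec {REGS[b - 0x48]}"
    if 0x50 <= b <= 0x57:
        return f"push {REGS[b - 0x50]}"
    if 0x58 <= b <= 0x5F:
        return f"pop {REGS[b - 0x58]}"
    if 0x70 <= b <= 0x7F:
        return f"{JCC[b - 0x70]} rel8"    # conditional short jumps
    return MISC.get(b, "not covered by this sketch")


table = [(ch, first_byte_meaning(ch)) for ch in "APZer"]
```

In this map the uppercase letters are inc, dec, push and pop on registers, and the lowercase letters p through z are conditional short jumps, which fits the observation that the usable instruction set is narrow.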

- Re: (Score:3, Informative)
  
  by dubaiguy ( 1684890 ) writes:
  
  First, I would not say that they can convert arbitrary shell code to English-like prose. Rather, the only instructions that can be used are the ones that are identical to the ASCII encoding of the alphabet. For instance, the ASCII encoding of the letter "r" (0x72) is identical to the opcode for a conditional short jump (jb). Granted, the authors showed that you can do a lot with this limited set of instructions, but I still wouldn't call it arbitrary.
  According to the PDF it does convert arbitrary shell code. FTA: What follows is a brief description of the method we have developed for encoding arbitrary shellcode as English text... It looks like they can encode anything once they have built an English-like decoder (judging by their language and the 3rd figure).
  The tight constraints on the instructions that can be encoded into ASCII make crafting decent English syntax nearly impossible. Spam filters based on natural language processing could probably detect and flag them.
  If they were sending SPAM... which they aren't.
You have... (Score:3, Funny)

by slimjim8094 ( 941042 ) writes: on Monday November 23, 2009 @11:49PM (#30209950)

You have
a virus
Didn't you know?
You shouldn't be
running Windows
Burma Shave

Hello, World! (Score:3, Insightful)

by nneonneo ( 911150 ) writes: <spam_hole@shaw.DEBIANca minus distro> on Monday November 23, 2009 @11:54PM (#30209980) Homepage

There is a major center of economic activity, such as Star Trek, including The Ed Sullivan Show. The former Soviet Union. International organization participation Asian Development Bank, established in the United States Drug Enforcement Administration, and the Palestinian territories, the International Telecommunication Union, the result of the collapse of large portions of the three provinces to have a syntax which can be found in the case of Canada and the UK, for the carriage of goods were no doubt first considered by the British, and the government, and the Soviet Union operated on the basis that they were the US Navys interpretation of the state to which he was subsequently influenced by the new government was established in 1951, when the new constitution approved it you King, he now had the higher than that the M.G.u, and soul shouters like Diane. There's a mama maggot including the major justifications that the test led to his own. This is usually prepared by the infection of the Sinai to the back and the Star Destroyers in the parliament, by the speed of these books and the revival of environmental problems of their new Arab states of the Arctic as a more and they possess power to the effort she was especially valuable as the Union and that would have said, as to note that the goods, which the night that if ever I rode after the word Father upon His Church to claim that the peace that had permitted him the city are as a hand of one into I thought of Mr. Crow and the Jews by the days of the C.Cs front garden which had first to St Cyriacus. All of a theology in the setting in a human heart as the tale of this day. I have it to friendship and the States that the way the English of the St Lawrence seven miles of an adjutant...
Now, would you have guessed that this is executable machine code (shellcode)? Honestly, it looks more like the garbage that spammers use to defeat statistical analysis (indeed, this is code generated with a similar goal).
(P.S. this particular sample is merely an amalgamation of the code which was reproduced in the paper; it is not complete, and will therefore not execute).
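
As a rough idea of why a sample like the one above slips past a crude "does this look like code?" check, consider the byte-class heuristic below. It is a deliberately naive sketch: the `TEXT_BYTES` set, the 0.95 threshold, and the function names are assumptions made for this example, and real filters are more elaborate.

```python
import string

# Bytes that a crude filter might accept as "ordinary prose".
TEXT_BYTES = set((string.ascii_letters + " .,'").encode("ascii"))


def text_fraction(data: bytes) -> float:
    """Fraction of bytes that are letters, spaces or basic punctuation."""
    if not data:
        return 0.0
    return sum(b in TEXT_BYTES for b in data) / len(data)


def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Naive heuristic: treat mostly-alphabetic input as harmless text."""
    return text_fraction(data) >= threshold


sample = b"There is a major center of economic activity, such as Star Trek"
high_bytes = bytes(range(0x80, 0x100))  # stand-in for typical binary data

sample_passes = looks_like_text(sample)
binary_passes = looks_like_text(high_bytes)
```

A check of this kind separates prose from ordinary binary data, but it has nothing to go on when the machine code itself is built from the same bytes as prose.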

Interesting work (Score:3, Insightful)

by Stan Vassilev ( 939229 ) writes: on Tuesday November 24, 2009 @10:08AM (#30213186)

But I'd venture a guess it's far easier to hide such code in the noise of an innocent looking image.

Too bad IBM went with the 8088 (Score:3, Interesting)

by Megane ( 129182 ) writes: on Tuesday November 24, 2009 @02:21PM (#30216566)

This sort of shellcode is probably a bit harder to write for the 68000, with its 16-bit instructions that have an "operand mode" field that spans between the two bytes. While a lot of useful instructions are in the 2xxx-7xxx range, and branches are in the 6xxx range, the instructions that do any sort of math are outside it.
It would be interesting to see what can be done with other CPUs as well. In particular, I recall that OS X PPC missed a chance to resist shellcode by ignoring two of the four bytes of the OS trap instruction, rather than forcing them to be nulls.
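
The 2xxx-7xxx range mentioned above follows directly from the ASCII table, as this short check suggests. It assumes the 68000's big-endian layout, where the first character of a pair supplies the high byte of the 16-bit opcode word. The helper names are made up for the example, and it says nothing about which instructions those words actually decode to.

```python
import string


def opcode_word(pair: str) -> int:
    """Pack two ASCII characters into a big-endian 16-bit word."""
    hi, lo = pair.encode("ascii")
    return (hi << 8) | lo


def top_nibbles(chars: str) -> list:
    """Top 4 bits of each character, i.e. the first hex digit of the word."""
    return sorted({ord(ch) >> 4 for ch in chars})


letters_only = top_nibbles(string.ascii_letters)
printable = top_nibbles(
    string.ascii_letters + string.digits + string.punctuation + " "
)
word = opcode_word("He")
```

Letters alone only reach words starting with 4 through 7. Adding digits, spaces and punctuation extends that down to 2, and nothing printable reaches 8 or above.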

- Re: (Score:3, Informative)
  
  by benjamindees ( 441808 ) writes:
  
  They don't mean shell commands. They mean code that exploits a vulnerability in order to start a shell.
  - Re: (Score:2)
    
    by Wovel ( 964431 ) writes:
    
    Thanks, I read the article after I posted. So they discovered a way to attack machines that have already been compromised. If your security is relying on an inline inspection for specific commands, you have already lost. All that reading to change three words.
    - Re: (Score:2)
      
      by istartedi ( 132515 ) writes:
      
      Yes, but if a machine on your network has "already lost", you'd probably like to know that.
      - Re:The syntax should not matter.. (Score:5, Insightful)
        
        by Wovel ( 964431 ) writes: on Monday November 23, 2009 @10:03PM (#30209374) Homepage
        
        And nothing in their article is helping with that. They assume they are exploiting a software vulnerability. If I know there is a software vulnerability, there are 1 million and 1 less complex ways for me to blow right by any inline scanner. (One stupid enough not to look and see what the actual bytes were anyway)
        
        
        Re: (Score:3, Informative)
        
        by x2A ( 858210 ) writes:
        
        It's a research paper, not an exploit, not instructions on how to make an exploit, not recommendations on how to make an exploit. God what's with you people on this site, you can't just see something for what it is, you have to see it for how it serves no purpose to you or how you can do it so much better.
        If they could exploit a machine by sending a point across, they'd get it past you lot every time, you'd never detect that huh.
        
        Re: (Score:3, Insightful)
        
        by istartedi ( 132515 ) writes:
        
        There are indeed times when I think that we built the Internet, and that it taught us only one lesson:
        I'm right and you're wrong.
        This is not quite as concise as "42". Also, a second Internet will have to be built to determine who is "I" and who is "you".
        
        Re: (Score:2)
        
        by x2A ( 858210 ) writes:
        
        Your sig: I think you mean "for all intents and purposes" *lol* intensive purposes haha "I REALLY REALLY MEAN TO DO THIS!!!!!!!!!" with eyes bulging out and raised veins... that would be a pretty intensive purpose...
        Anyway... what were you saying about the "I'm right/you're wrong" attitude of the internet? :-p
- Re:In other news... (Score:5, Informative)
  
  by blueg3 ( 192743 ) writes: on Monday November 23, 2009 @09:50PM (#30209276)
  
  Good job not reading the article.
  It's not that shellcode can be written in text and then compiled to an executable form. It's not that shellcode can be compiled to an intermediary form, translated or compiled into machine instructions by a piece of code (this is common in malware now, to pass input restrictions -- as the article says). It's that the executed machine instructions themselves -- the compiled binary data that can be run raw on an x86 processor -- looks like English text.
  
  - Re: (Score:2, Insightful)
    
    by Knightman ( 142928 ) writes:
    
    And how do you suppose they generate the text then? They have a system they train with text pulled from various sources, then they use it to generate an innocent looking text that can be executed with a predicted result, no? In other words, an assembler/compiler....
    See, I did read the pdf....
    Btw, I missed that there were 4 researchers, not 3...
    - Re: (Score:2)
      
      by blueg3 ( 192743 ) writes:
      
      No, an assembler or compiler takes as input text in a high-level language and generates executable machine code.
      This takes as input executable machine code and generates executable machine code with a very narrowly-defined statistical property. (Simpler, but important, statistical properties have been done previously -- e.g., the Metasploit filters.)
      - Re:In other news... (Score:4, Interesting)
        
        by Knightman ( 142928 ) writes: on Monday November 23, 2009 @10:10PM (#30209416)
        
        An assembler/compiler doesn't necessarily use a high-level language input.
        In this instance they (as you say) 'takes as input executable machine code and generates executable machine code with a very narrowly-defined statistical property' which tells me they have an assembler that reads executable code and assembles executable code that looks like English text, in other words an assembler.
        
        
        Re: (Score:2, Funny)
        
        by mysidia ( 191772 ) writes:
        
        FAIL. It cannot be an assembler if the input is not assembly.
        It's a translator.
        
        Re: (Score:2, Insightful)
        
        by Anonymous Coward writes:
        
        Dude, you're wrong. Let it go.
        
        Re: (Score:3, Interesting)
        
        by TheLink ( 130905 ) writes:
        
        There's a difference: an assembly language representation of a machine code program doesn't normally execute on the target machine. It has to be "assembled" to the object code before it can be executed.
        
        What this bunch has done is create a program that "massages" (which could include expanding and altering) source machine code into a new arrangement of _machine_code_ that can execute on the target as is. That new arrangement happens to resemble English text (in a computer format).
        
        It's only an assembler if
      - Re: (Score:2)
        
        by Lumpy ( 12016 ) writes:
        
        But it still does not do what they try to fearmonger...
        A page of text will run on your Computer! OMG! just scanning an infected page will infect your PC! a randomly worded email can infect your computer!!!!
        Well, only if Outlook adds a compile and execute all text in the email function, I am sure Microsoft is adding that.
    - Re: (Score:3, Interesting)
      
      by calmofthestorm ( 1344385 ) writes:
      
      No, it translates assembly to different assembly that's also English. This is actually a rather interesting piece of work. They didn't just write a program that converts assembly to English assembly, they wrote one in English assembly.
      - Re: (Score:3, Informative)
        
        by blueg3 ( 192743 ) writes:
        
        Technically, machine code -- assembly is the pseudo-English text version of machine code.
        But otherwise, yes.
      - Re: (Score:2, Interesting)
        
        by mysidia ( 191772 ) writes:
        
        It is indeed a translator.
        It doesn't translate assembler code.. it translates x86 machine code.
        (Which also implies that it cannot be an assembler, since assemblers only accept Assembly code as input)
      - Re:In other news... (Score:4, Informative)
        
        by DoctorBit ( 891714 ) writes: on Monday November 23, 2009 @10:58PM (#30209702)
        
        It's a translator that takes any arbitrary x86 machine code as input, and produces as output functionally equivalent self-modifying machine code that starts off looking like English text. The same approach also works with other non-x86 machine codes, and other languages, such as Russian, French, etc... Very interesting work. It goes to show that for an OS to allow any code to self-modify can produce results that are very difficult to predict. Self-modifying code has an almost biological nature.
        
    - Re: (Score:2)
      
      by thePowerOfGrayskull ( 905905 ) writes:
      
      And how do you suppose they generate the text then? They have a system they train with text pulled from various sources, then they use it to generate an innocent looking text that can be executed with a predicted result, no? In other words, an assembler/compiler....See, I did read the pdf....
      You really see nothing noteworthy about this? (Or are you just trying to cover up from getting called out in not reading TFA with a hasty skim and blasé attitude - I've done that myself a time or two...)
      - HP had it in 1986 (Score:3, Interesting)
        
        by Anonymous Coward writes:
        
        I think this is interesting, but hardly a breakthrough.
        In the mid 80's, we did the same thing at a field Hewlett-Packard office, although not aimed at viruses. Our target was to enable users to key in x86 code in text form. In other words, sit down at a PC, open EDLIN (the DOS equivalent of Notepad), or some simple text editor, and key in human readable words (i.e. meaningful text that humans - HP Engineers - could easily transcribe from paper or a phone call). Then save the file as a .com file (which wa
  - Re: (Score:3, Interesting)
    
    by rnturn ( 11092 ) writes:
    
    "the compiled binary data that can be run raw on an x86 processor -- looks like English text."
    I had brought something like this up during an after-work, Friday night beer session back in the late '80s when a co-worker mentioned the odd snippets of text that one would see while examining programs using the debugger. (No... we weren't talking about strings of text defined in the source code.) I wondered whether it was possible to come up with a program whose machine code formed English text that actually perf
- This is far more interesting! (Score:5, Interesting)
  
  by Terje Mathisen ( 128806 ) writes: on Tuesday November 24, 2009 @03:43AM (#30210882)
  
  I for one am very impressed by what they've done, even if it is somewhat similar to what I did nearly 15 years ago:
  At that time I wrote what's probably the "best" executable text encoder for MS-DOS. It uses the absolute minimum possible amount of self-modification (a single 2-byte Jcc opcode) while staying entirely within the MIME text character set, and survives all the most usual forms of reformatting/reflowing of the text. (Replacing CRLF with a single CR (Mac) or LF (unix) or turning each paragraph into a single line.)
  The initial bootstrap looks like this:
  ZRYPQIQDYLRQRQRRAQX,2,NPPa,R0Gc,.0Gd,PPu.F2,QX=0+r+E=0=tG0-Ju E=
  EE(-(-GNEEEEEEEEEEEEEEEF 5BBEEYQEEEE=DU.COM=======(c)TMathisen95
  (The uppercase 'E's are my NOP fillers, they execute as INC BP, a register I don't use.)
  Terje
  PS. Unlike the current guys, I wrote the code above by hand, on paper, during the evenings of a ski vacation. I had brought with me a listing of the ascii encoding of all instructions that would use MIME characters only. :-)
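
The remark that an uppercase 'E' executes as INC BP can be checked against the one-byte x86 opcode rows 0x40 to 0x5F, which in 16-bit mode encode INC, DEC, PUSH and POP on a single register. What follows is a minimal sketch, assuming 16-bit decoding as in a DOS .COM file; `decode_one_byte` is a hypothetical helper that ignores every opcode outside that range, so it says nothing about punctuation or lowercase bytes in the bootstrap.

```python
# 16-bit register order used by the x86 "opcode + register" encodings.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
# 0x40-0x47 INC, 0x48-0x4F DEC, 0x50-0x57 PUSH, 0x58-0x5F POP.
OPS = ["INC", "DEC", "PUSH", "POP"]


def decode_one_byte(ch):
    """Return the mnemonic for a character in 0x40..0x5F, else None."""
    b = ord(ch)
    if 0x40 <= b <= 0x5F:
        return f"{OPS[(b - 0x40) // 8]} {REG16[b % 8]}"
    return None  # outside the range this sketch covers


if __name__ == "__main__":
    # Leading run of uppercase letters from the bootstrap quoted above.
    prefix = "ZRYPQIQDYLRQRQRRAQX"
    for ch in prefix:
        print(f"{ch!r} 0x{ord(ch):02X}  {decode_one_byte(ch)}")
```

Every uppercase ASCII letter falls inside this range, which is part of why hand-written text-only code leans so heavily on stack and register shuffling.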
  
  - Re: (Score:3, Informative)
    
    by coinreturn ( 617535 ) writes:
    
    Yeah, but yours doesn't look like English; theirs does.
    - Re:This is far more interesting! (Score:4, Interesting)
      
      by Terje Mathisen ( 128806 ) writes: on Tuesday November 24, 2009 @10:32AM (#30213440)
      
      I know, and that's exactly what makes it so interesting:
      They have effectively defined a small subset of the entire instruction set while allowing all other instructions that don't produce a side effect which would crash their "real" code.
      Terje
      
- - Re:In other news...BAN THE PARENT (Score:5, Informative)
    
    by HEbGb ( 6544 ) writes: on Monday November 23, 2009 @10:05PM (#30209390)
    
    This is the sixth spam message this user has posted, will SLASHDOT please BAN this guy already? Come on.
    
    - Re:In other news...BAN THE PARENT (Score:5, Informative)
      
      by Tynin ( 634655 ) writes: on Monday November 23, 2009 @10:13PM (#30209438)
      
      This is the sixth spam message this user has posted, will SLASHDOT please BAN this guy already? Come on.
      He must be making new logins. I've seen him posting for a few weeks, he surely has more than 6 spams that I've seen alone. Going on that idea... let's see:
      http://slashdot.org/~coolforsale117 [slashdot.org]
      http://slashdot.org/~coolforsale116 [slashdot.org]
      http://slashdot.org/~coolforsale115 [slashdot.org]
      http://slashdot.org/~coolforsale114 [slashdot.org]
      http://slashdot.org/~coolforsale112 [slashdot.org]
      http://slashdot.org/~coolforsale110 [slashdot.org]
      
      No doubt there is a TON of them. So I'd guess they are banning him, he just keeps making new uids (and siphoning a ton of moderation points to keep him marked at troll / offtopic). I know I've used many mod points keeping this bastard down.
      
      - Re: (Score:2)
        
        by ColdWetDog ( 752185 ) writes:
        
        Maybe we should slashdot his sight. Or give him to /b/
        
        Re: (Score:2)
        
        by negRo_slim ( 636783 ) writes:
        
        in b4 not your personal army
        
        Re: (Score:2, Funny)
        
        by Ethanol-fueled ( 1125189 ) writes:
        
        At least the /b/ spammers are polite enough to do their homework and know the demographic (all /b/ spams are porn). Air Jordans and POLO hoodies for Slashdot? And handbags and UGG boots, even though there are no women on Slashdot. At least try to sell us motherboards and shit...
        
        Re: (Score:3, Funny)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
        
        Re: (Score:2)
        
        by Culture20 ( 968837 ) writes:
        
        At least try to sell us motherboards and shit...
        yeah no shit. [...]
        I concur. Just motherboards. I don't create my own motherboards.
        
        Re: (Score:2)
        
        by Falconhell ( 1289630 ) writes:
        
        Blinding him seems a little harsh!
        We could all look at his SITE simultaneously at some point though!
        I have also wasted a ton of mod points on this idiot.
        It's hard to think of a worse place for trying to spam than Slashdot eh?
        
        Re: (Score:3, Interesting)
        
        by Falconhell ( 1289630 ) writes:
        
        I hope none of you are thinking of subscribing coolforsale's email address zminring@gmail.com to a lot of spam lists.
        That would be very wrong.
        Very very wrong.
        
        Much better idea (Score:2)
        
        by istartedi ( 132515 ) writes:
        
        Profile his IP, and present what appear to be angry responses and modded-down posts when serving pages to that IP. Otherwise, just don't display his posts at all. Then again... mayyyyybe we already did that.
      - Re: (Score:3, Interesting)
        
        by Hurricane78 ( 562437 ) writes:
        
        Isn’t this why CAPTCHA was invented?
        I mean just add captchas at a place where it slows him down too much for spamming to still make sense.
        And freakin’ use reCAPTCHA, if you don’t want to get laughed at! ^^
    - Re: (Score:3, Insightful)
      
      by spud603 ( 832173 ) writes:
      
      Is it spam, or is it shellcode? things like "this treatementOur goal" look fishy to me.
