Deserialization Issues Also Affect .NET, Not Just Java (bleepingcomputer.com)
"The .NET ecosystem is affected by a similar flaw that has wreaked havoc among Java apps and developers in 2016," reports BleepingComputer. An anonymous reader writes:
The issue at hand is in how some .NET libraries deserialize JSON or XML data in a completely insecure way, and also in how developers handle deserialization when working with libraries that offer optional safeguards to prevent deserialized data from automatically accessing and running certain methods. The issue is similar to a flaw known as Mad Gadget (or Java Apocalypse) that came to light in 2015 and 2016. The flaw rocked the Java ecosystem in 2016, as it affected the Apache Commons Collections library and 70 other Java libraries, and was even used to compromise PayPal's servers.
Organizations such as Apache, Oracle, Cisco, Red Hat, Jenkins, VMware, IBM, Intel, Adobe, HP, and SolarWinds all issued security patches to fix their products. The Java deserialization flaw was so dangerous that Google engineers banded together in their free time to repair open-source Java libraries and limit the flaw's reach, patching over 2,600 projects. Now a similar issue has been discovered in .NET. This research has been presented at the Black Hat and DEF CON security conferences. On page 5 [of this PDF], researchers included reviews for all the .NET and Java apps they analyzed, pointing out which ones are safe and how developers should use them to avoid deserialization attacks when working with JSON data.
Simpler solution (Score:1, Insightful)
Just don't use JSON or XML. You can thank me later.
Re: (Score:2)
So what do you recommend instead?
Re:Simpler solution (Score:5, Interesting)
JSON or YAML are probably both fine. XML is simply wasteful and unnecessary. Personally I think we should be using something like s-expressions (lisp-like). People hate them because of the parens but every other encoding has as many negative points in different ways. The advantage is that the syntax is far simpler to understand and parse leading to safer software. Some might say that having an "executable" format is bad but I'd point to bugs like this as being proof that even "text" formats are just executables in disguise. The Lisp creed is "data is code" and I've come to agree.
Re: Simpler solution (Score:1)
ASN.1
Re: (Score:2)
(mandatory missing sarcasm tag warning)
Not that many developers would base a decision on an AC slashdot post, but...
Re: (Score:2)
That doesn't really answer the question I asked.
Re: (Score:2)
The problem is exchanging tabular and hierarchical data structures, containing arbitrary values.
So for instance the simplest of such structures is a table of id, state, name, description. The description field can of course contain arbitrary characters including quotes, commas and newlines.
Sometimes there's metadata for the table. For instance think of the results of a mysql query: You want a table of the results, but there's also a list of the datatypes of each column, plus the time it took to answer the q
Re: (Score:2)
I agree that XML is usually insanely overkill for most purposes. Still, there are worse choices than insanely overkill, such as trying to shoehorn a complicated hierarchical data structure into something like TSV, CSV or a fixed length format, as BarbaraHudson seems to be proposing in this discussion.
Re: (Score:1)
I agree that XML is usually insanely overkill for most purposes. Still, there are worse choices than insanely overkill,
CORBA comes to mind, or EDI, both of which suck hugely for different reasons. If I never have to see either one again it will be too soon.
The real point for a heterogeneous environment is that you need to look at the basic units you have in common across all players, and then design with those limitations in mind. One of the first and major stumbling blocks for most is that the data representation may vary across the components, and some may have a concept radically different than even the minimum require
Re: (Score:2)
First of all, all those meanings of those ASCII symbols are long obsolete and forgotten. Just because there's a character called "file separator" doesn't mean anybody uses it for any purpose, except perhaps some fossilized piece of software for dealing with tape drives from the 60s.
Second, such an approach would suffer from various problems. For instance you obviously can't use such characters without escape sequences, which means you can't just stick a file in between file separator characters, you've got
Re: Simpler solution (Score:4, Funny)
Re: (Score:2, Funny)
You haven't used XML until you had to decode base64 encoded xml documents stored in xml attributes of a different xml document.
Re: (Score:1)
And signed, don't forget that the inner document is signed to truly enable misery. See IBM Datapower appliance for that joy.
Re: (Score:3)
Serialization without using one of these standards is going back to the bad old days of proprietary silos. You must work for Sony.
Re: Simpler solution (Score:1)
YAML. XML without pretending to solve a bunch of problems it doesn't solve.
Re: (Score:1)
But PHB's want their shiny dancy UI/UX toys or they won't pay you.
Re: (Score:2)
So go work somewhere else. It's not like you have to work for an idiot. There are at least 50 ways [youtube.com].
Re: (Score:1)
Leaving idiots doesn't scale.
Re: (Score:2)
I will agree that XML is highly overprescribed. It is, however, useful in situations that require heterogeneous systems to exchange complex and potentially changing data structures where changes cannot be 100% coordinated.
That said, XML is often hobbled for security reasons such that applications don't actually process DTDs etc. If you are doing those things you are giving up a lot of the flexibility while keeping most of the complexity. You probably should be asking from a design perspective if perhaps XML
Re: (Score:2)
This problem was solved in the 60s. See my comments here [slashdot.org] and here [slashdot.org].
Re: (Score:1)
Sure, you can use some crufty protocol like X12 EDI, which will help you understand the benefits of XML.
Re: (Score:2)
Au contraire, I did not say "do nothing". So STFU until you learn how to read, svp.
Re: (Score:2)
I've never understood why PUT things/1/subthing/52 is somehow better than POST /api?thing=1&subthing=52. And the second one works without over complicated mod rewrite rules (though you could certainly add a very simple one to decouple your filesystem from your app).
It's because the Internet meme machine and lemming followers continually confuse exceptionally poor implementations of useful concepts for progress.
The selling point of REST (via HTTP) was simplicity + reuse. Having objectively failed to deliver on both counts vs. coherently designed HTTP APIs, REST is a nonstarter to even consider at this point. Nobody wants to deal with it.
Re: (Score:2)
I understand why you'd recommend against JSON since it was originally intended to be an expression (and some fuckwits would eval() it) rather than really intended to do quite the same thing as, say, Python's pickles. But what's the beef with XML?
Re: (Score:2)
Seriously? It's not like there weren't plenty of ways to store data that were far less verbose, more self-documenting, and took up less space and cpu both to create and search through.
Re:Simpler solution (Score:5, Informative)
The serialization format has nothing to do with the deserialization vulnerabilities.
Re: (Score:2)
Face-palm worthy because a few years ago, a lot of these bugs were found in XML Java deserializers. A lot of people said, "Don't use XML! It's insecure!" then went off to write the same frameworks, but using JSON instead. They ended up with all the same bugs.
I guess next people will rewrite them in YAML or binary.....nah, binary is scary, you never know what people could put in there!
Re: (Score:2)
Bugs in XML deserialization don't allow for arbitrary code execution.
Neither does JSON or YAML.
So, what exactly would be the attack vectors (in a VM) via text only (de)serialization?
I mean: buffer overflows, putting code on the stack or changing return addresses for JSRs obviously are impossible.
Re: (Score:2)
Buffer overflows are kind of rare these days. Because of things like ASLR, they are hard to exploit. It's mainly about logic bugs of various types.
Re: (Score:2)
And that is exactly the reason why 'standard' deserialization of objects in Java/JVM uses neither ctors nor setters.
No idea about .Net.
Re: (Score:2)
The built-in ObjectOutputStream and ObjectInputStream.
They allow serialized objects to either implement java.io.Serializable or java.io.Externalizable
https://docs.oracle.com/javase... [oracle.com]
https://docs.oracle.com/javase... [oracle.com]
( Why google finds the 7 version and not the 8 as first hits is beyond me :D )
The vulnerability comes from the option to overwrite "readObject()". Serialized data objects contain usually the classes as well. So when you read them, you also read and link the code, and hence use the supplied "rea
Re: (Score:2)
Yep. Gin up your own solution with the exact same security flaws.
I don't care how smart you are; everyone else is collectively smarter than you are. From a security standpoint you want to use popular frameworks that take security seriously and respond to the inevitable exploits promptly. Doing things in an idiosyncratic way is not protection because (a) systems can be probed using black-box methods like fuzzing and (b) chances are your way of doing it has been used thousands of times before.
Re: (Score:2)
Libraries are neither here nor there. This is 2017, not the 1970s. To build the kind of apps people want today to run on the platforms they use, you're using a framework, and it's going to be huge and complex.
Now sure, we still use libraries. And sure, if you are talking about a small, simple library that will never handle information from a source you don't trust, by all means gin up your own if that's easier for you. But if you're gluing a javascript browser app to a server back end, if you're not us
Re: (Score:2)
> I don't care how smart you are; everyone else is collectively smarter than you are.
Provably not true. If you want a high-quality library with clean interfaces, make sure it is the work of a single smart person.
As the first statement is oversimplified to make a point, maybe there's a better way to write it (given I am now part of the "everyone else who is smarter"):
How many skilled people hours have already been spent on project x which were focussed on solving the quality issues, compared to the hours I can spend on it now? And are my own skilled hours following a very similar approach to the one they used, or am I consciously or accidentally pursuing a different approach which may lead to a different, perhaps be
Re: (Score:2)
I remember life before XML or JSON. It wasn't pretty. I've reverse-engineered the .doc and .xls file formats. It was a time when everybody made up their own file formats, and there were no libraries to help you read and write those formats. No, thank you, I'll live with the potential serialization issues.
Re: (Score:2)
And there's your problem - you or your user was using a shitty format. This is a long-solved issue. Even plain text or SDF or tab-delimited or fixed field width are quick and easy to implement, and variable-field-width can also be made self-documenting with just a bit of work. All are far easier to implement than xml or json, and if it's become corrupted, you'll usually be able to see exactly where pretty quickly and recover everything else.
Re: (Score:2)
The parent is talking about .doc and .xls formats. These are absolutely not suitable for something as simple as tab or fixed field formats. They can contain arbitrary data like embedded images and videos. They have a very complex markup system. They have features like versioning, scripts, and oodles of metadata. They have to deal with arbitrary data of arbitrary length. They can attach arbitrary amounts of parameters to some piece of text. .doc and similar is one of the few cases where XML is actually not o
Re: (Score:2)
In fact, the new docx and xlsx formats are implemented in XML.
There are many data sets that don't work well as CSV. Anything, for example, that has one-to-many relationships such as customer order history with names, addresses, billing info, etc., doesn't work well as CSV. That's the whole point of XML / JSON--you can easily store and retrieve data sets that are more complex than a spreadsheet. And that is just about everything.
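To make the one-to-many point concrete, here is a minimal Python sketch. The customer and order fields are invented for illustration; nothing here comes from a real system. It round-trips a nested record through JSON, then shows what flattening the same record into CSV rows costs.

```python
import csv
import io
import json

# Hypothetical customer record with a one-to-many relationship (orders).
customer = {
    "name": "Ada Example",
    "address": "1 Main St, Springfield",
    "orders": [
        {"id": 1001, "items": ["widget", "gadget"]},
        {"id": 1002, "items": ["sprocket"]},
    ],
}

# JSON keeps the nesting intact through a round trip.
restored = json.loads(json.dumps(customer))

# Flattening to CSV forces one row per (order, item) pair and repeats
# the customer columns on every row.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "address", "order_id", "item"])
for order in customer["orders"]:
    for item in order["items"]:
        writer.writerow([customer["name"], customer["address"], order["id"], item])

csv_rows = list(csv.reader(io.StringIO(buf.getvalue())))
```

The CSV version still works, but the reader has to regroup rows by customer and order to recover the structure that the JSON version carries directly.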
Re: (Score:2)
Arbitrary data serialization was solved back before the PC was invented. See the following ASCII control codes
0x1c - FS - File separator The file separator FS is an interesting control code, as it gives us insight in the way that computer technology was organized in the sixties. We are now used to random access media like RAM and magnetic disks, but when the ASCII standard was defined, most data was serial. I am not only talking about serial communications, but also about serial storage like punch cards, p
Re: (Score:2)
There is nothing that is inherently more secure about ASCII control codes over XML or JSON. And it's inherently less human-readable. There's a reason the world has moved on past ASCII control codes!
Re: (Score:1)
You are such a stupid shit that it's amazing that you can even string together coherent phrases. Or is someone ghost-writing for you?
Re: (Score:1)
If someone ripped your brain out and placed it on your dinner plate, no one would notice. That's how completely useless and unattractive both you and your cooking is.
Re: (Score:1)
In context, we are talking about data that is executed as code on purpose, you worthless bag of shit.
Real Developers never Deserialize into objects (Score:5, Informative)
Re: (Score:2)
Parser loop.
Re: Real Developers never Deserialize into object (Score:1)
Walking a data tree is 1st year CS level work. If you're spending half your efforts on it then you're either vastly short on resources or your coders suck donkey balls.
Agree. (Score:2, Insightful)
It appears that the market is flooded with developers who can write scripts but not algorithms. They believe that something like parsing JSON is really hard and complicated, that any home-grown solution to doing that will be extremely buggy and slow, all because they themselves haven't taken the mental step-up.
Of course, this mental step-up used to be a standard part of a CS degree. College students would be writing code that does this sort of thing as homework. This has changed, and I have seen the chan
Re:Agree. (Score:5, Insightful)
I ask them questions about their courses in algorithms and what they did, and they say things like "we learned what the foundational algorithms are and how to compare their performance." Did you actually write a merge sort? "No, there's no need because every major language has that sort of thing built in."
Consider me a cultist follower of your hypothesis. 20 years in CS, the last 10 I have seen it take a sharp dive. The only explanation I have is the explosion over 15 years ago in OSS and that what you espouse is true: Everyone thinks they can develop or engineer, because the code is tied up in nice little solution blocks.
Need a sort algo? Just codeproject.com
Need some bi-directional comm between remotes? Just github.com...etc....
The number of people I have turned away in the first two days of testing, who could not even write a simple priority Q... it's more than disheartening.
These are the "developers" who are supposed to code my future? Fuck me! I'll be working till I die.
Re: (Score:2)
Consider me a cultist follower of your hypothesis. 20 years in CS, the last 10 I have seen it take a sharp dive. The only explanation I have is the explosion over 15 years ago in OSS and that what you espouse is true: Everyone thinks they can develop or engineer, because the code is tied up in nice little solution blocks.
Our education system is broken. Not many developers have Computer Science degrees because that's actually a hard degree to achieve. A lot of them have some type of Computer Science lite degree like Information Technology or something like that. I don't see it getting any better. Insisting it should is unfortunately wishful thinking at this point. Some people value the field of Computer SCIENCE. Some people are just in it for the money.
Re: (Score:2)
If you are testing on problems which were solved 20 years ago - maybe you are using the wrong tests? Would you test using punch cards given the choice?
Bro, if you think that Priority Qs are "old", you are the problem.
Stop trying to use things like "outdated" to mean "I don't know how".
Grow some balls, don't be an A.C., and don't make ignorant claims about things you obviously don't even know are in use (hint: PQs are just heaps) when they are the literal foundation of modern caches and heaps.
Thanks for making my point.
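For readers wondering what "PQs are just heaps" means in practice, here is a minimal sketch in Python of a min-priority queue stored as a binary heap in a list. It is an illustration only, not what the poster's test asked for; the class and method names are invented here.

```python
class PriorityQueue:
    """Min-priority queue stored as a binary heap in a list."""

    def __init__(self):
        self._heap = []  # entries are (priority, item)

    def __len__(self):
        return len(self._heap)

    def push(self, priority, item):
        heap = self._heap
        heap.append((priority, item))
        i = len(heap) - 1
        # Sift up: swap with the parent while the new entry is smaller.
        while i > 0:
            parent = (i - 1) // 2
            if heap[i][0] >= heap[parent][0]:
                break
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent

    def pop(self):
        heap = self._heap
        if not heap:
            raise IndexError("pop from empty priority queue")
        top = heap[0]
        last = heap.pop()
        if heap:
            heap[0] = last
            i, n = 0, len(heap)
            # Sift down: swap with the smaller child until order is restored.
            while True:
                left, right, smallest = 2 * i + 1, 2 * i + 2, i
                if left < n and heap[left][0] < heap[smallest][0]:
                    smallest = left
                if right < n and heap[right][0] < heap[smallest][0]:
                    smallest = right
                if smallest == i:
                    break
                heap[i], heap[smallest] = heap[smallest], heap[i]
                i = smallest
        return top[1]
```

Both operations touch one root-to-leaf path, so push and pop are O(log n). Entries with equal priority come out in no guaranteed order in this sketch.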
Re: (Score:2)
So you want someone to write a round-robin, lockless thread pool on a whiteboard in 30 minutes?
No, I want someone to KNOW HOW, logically, to solve THAT PROBLEM.
And I said DAYS, not hours. Reading comprehension is an obvious skill that has gone down with this "new" education paradigm.
Re: (Score:3)
I reckon most coding jobs only really involve manipulating/displaying data from databases an
Re: (Score:2)
What to call these "CS light" people - I prefer the term "code monkey".
Re: (Score:2)
Walking a data tree is 1st year CS level work. If you're spending half your efforts on it then you're either vastly short on resources or your coders suck donkey balls.
Correct, and then 2nd year CS work is red/black trees and other advanced algorithms. The whole time you're learning this you're taking increasingly higher levels of mathematics. You see, the problem I run into is that many people couldn't make it to the 2nd year in Computer Science and went into Information Technology or some other "Computer Science Lite" field of study. These are the majority of people in the field now that really lack the ability to understand the difference between different implementati
It's a trap! (Score:3, Interesting)
Completely agree. We used .net binary serialization/deserialization because it was such a quick way to get things up and running...with like two lines of code. The fact that the serialized objects were about 10x bigger than they needed to be was not a problem.
It turns out the namespaces are included in the serialized data, so the moment we did an ounce of lightweight refactoring we broke it. It took us less than a day to write our own serializer, but an extra three days of combined manpower to get a form
Re: (Score:1)
You are 100% correct.
Unfortunately, going by the number of projects affected by the bug, it seems that most programmers are not "real programmers".
Re: (Score:3)
Trust me I've built systems both ways and deserialization directly into objects is no bueno.
Yeah, running an auto-deserializer on untrusted data is basically guaranteed to be a security flaw. The NSA and FSB will pwn you at that point, along with anyone else who wants to (just ask PayPal).
Re: (Score:2)
But can't you pretty much avoid this issue by means of predefining the allowed structure(s) for the data? If the deserialized/serialized data does not match the predefined schema, it's discarded as invalid.
The deserializer reconstructs the objects by calling the constructor, or setters and getters. The setters and getters have logic bugs that allow arbitrary code execution.
I just think that the standard serialization/deserialization libraries out there have been likely created by programmers a lot smarter than me,
No, you're definitely wrong here. If you spend a few weekends, you could probably make one of these yourself. Then getting people to use it is a matter of marketing and such.
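The point in this comment about constructors and setters can be sketched in a few lines of Python. Everything here is hypothetical: the `$type` field, the class names and the `set_` naming convention are invented for illustration, and the "dangerous" setter only appends to a list. The sketch shows a generic deserializer that lets the data choose which class to build and then runs that class's setters.

```python
import json

audit_log = []  # stands in for a dangerous side effect


class Profile:
    def set_name(self, value):
        self.name = value


class TempFile:
    # A "gadget": an innocent-looking setter with a side effect.
    def set_path(self, value):
        audit_log.append(f"would delete {value}")
        self.path = value


# Stand-in for "every class loaded in the process".
LOADED_CLASSES = {"Profile": Profile, "TempFile": TempFile}


def naive_deserialize(text):
    """Builds whatever type the *data* asks for, then calls its setters."""
    doc = json.loads(text)
    obj = LOADED_CLASSES[doc["$type"]]()
    for key, value in doc["fields"].items():
        getattr(obj, f"set_{key}")(value)
    return obj


# The application expects a Profile, but the attacker picks the type.
evil = '{"$type": "TempFile", "fields": {"path": "/etc/passwd"}}'
obj = naive_deserialize(evil)
```

The JSON parser did nothing wrong here. The side effect fires because the layer above it treats a string in the payload as an instruction about which code to run.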
Re: (Score:1)
Absolutely correct. Any additional development overhead or memory use is acceptable in return for the gained compatibility, reliability and security.
Not a .NET problem (Score:1)
This is a programming problem that can happen anywhere. No language is immune. No project is automatically secure from exploits, or able to patch framework universally for all deployments.
Java and .NET will always have security issues, along with literally every other programming language. Anyone shocked, surprised, upset, or hostile to that concept is in the wrong profession.
Assume everything is compromised. Assume nothing is secure. Design around that assumption and you will survive.
Re: Not a .NET problem (Score:4, Insightful)
Re: (Score:3)
But you won't be able to compete with shortcut takers. They will look more productive than you. The penalty for shortcut taking is not just large enough, I hate to say. I'm just the messenger.
Re: (Score:1)
Correction: "just not large enough...".
Lexdysia
Re: (Score:2)
He'll be telling us Rust is webscale next
Yeah, serialization can be annoying. (Score:3)
I'm kind of surprised this hasn't already built into a more prominent issue over time.
Performance issues I can stomach - there's going to be some unavoidable parsing logic no matter how you go at translating from runtime to storage or network logic - but instead, large swaths of objects just get ignored in major libraries. When using Unity, for instance, you can't serialize dictionaries, or many other objects, in the default serializer - which is a major oversight.
Google actually has provided some rather nice tools to help with this - I tend to use their 'Protocol buffer' libraries for their rather nice serialization options. This doesn't address security on its own - nothing does completely, but designing careful locked signal processing and independent cross-checking steps can help a lot. Well-salted encryption alone won't really save you.
My pet peeve with protocol buffers is the need to give everything an index number, with no real auto-numbering for rapid design - I can see the logical need, to be able to rely on that order for processing - it's just an extra babysitting step that gets me sometimes. For what it does, it's still the best I've found to be consistent between diverse projects and still leaving room for decent security.
Ryan Fenton
Re: (Score:2)
I've looked at protocol buffers but everything I've ever read about people actually using them in production says they are a nightmare over time because they are binary. Supposedly the object versioning alleviates some of this but I think people were complaining about how to deal with mandatory fields over time. I can't remember but I suspect this plus JavaScript being in the browser is what makes JSON so prevalent. I have no idea why XML is used. I can't even think of a single advantage it has over any
Free time =/= 20% time (Score:2)
JSON does not have code-execution ability (Score:5, Insightful)
JSON only defines a bunch of basic data types. It defines no ability to run anything. These bugs are in the (de)serialization layer above it, which uses JSON as a transport and extends the meaning of the stored data to be able to deserialize higher-level objects.
JSON or XML are not the problem here. The same problem could happen if you serialized to CSV or TXT or anything else for that matter.
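A quick way to see this, using Python's standard `json` module as a stand-in for any plain JSON parser: whatever the payload claims about classes or commands, parsing yields only dictionaries, lists, strings, numbers, booleans and null. The `__class__` key below is an invented example of an attacker's hint and has no special meaning to the parser.

```python
import json

text = '{"__class__": "os.system", "args": ["echo pwned"], "n": 1, "ok": true, "x": null}'
doc = json.loads(text)


def types_in(value):
    """Collect the Python types that appear anywhere in a parsed document."""
    found = {type(value)}
    if isinstance(value, dict):
        for v in value.values():
            found |= types_in(v)
    elif isinstance(value, list):
        for v in value:
            found |= types_in(v)
    return found


seen = types_in(doc)
```

The string "os.system" stays a string. It only becomes dangerous if a layer above the parser decides to look it up and call it.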
Re: (Score:1)
It's probably a problem with "generic" reconstruction of objects based on data. If the data is used to (re) construct objects, then some objects can potentially have behavior because that's how objects are defined. If the data is "clever" enough, it may end up constructing objects you don't want.
It's probably better to parse out to low-level "scalar" values and hand-code the part that stuffs them into objects or databases rather than let a parser actually build objects or object trees itself.
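A minimal sketch of that approach in Python, assuming a made-up `Order` message with three fields (the names and limits are illustrative, not from any real API): parse to plain values first, check each one, and only then build the object by hand.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    customer: str
    quantity: int


def parse_order(text):
    """Parse JSON to plain values, validate each one, then build the object by hand."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object")
    unexpected = set(doc) - {"order_id", "customer", "quantity"}
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    order_id = doc.get("order_id")
    customer = doc.get("customer")
    quantity = doc.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(order_id, int) or isinstance(order_id, bool) or order_id <= 0:
        raise ValueError("order_id must be a positive integer")
    if not isinstance(customer, str) or not (0 < len(customer) <= 100):
        raise ValueError("customer must be a non-empty string of at most 100 chars")
    if not isinstance(quantity, int) or isinstance(quantity, bool) or not (1 <= quantity <= 1000):
        raise ValueError("quantity must be an integer between 1 and 1000")
    return Order(order_id, customer, quantity)
```

The payload never gets to name a type, and the only constructor that runs is the one the code picked.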
Re: (Score:2)
It's probably better to parse out to low-level "scalar" values and hand-code the part that stuffs them into objects or databases rather than let a parser actually build objects or object trees itself.
This is exactly right. Because the data is untrusted, you need to verify it anyway, and adding parsing code to that usually doesn't add much overhead (it can often be the same code).
In the defcon talk they made a strong case that these generic de-serialization libraries are extremely difficult if not impossible to use securely. They were just grabbing at low-hanging fruit, as soon as you've imported these libraries, you're compromised. They didn't even discuss ways that the libraries might be used incorre
Re: (Score:2)
If you're dealing with enough different datatypes then it might be a big development and maintenance saving to have a generic object builder in your deserialiser. The key is to make it so that you whitelist the datatypes it will deserialise.
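A rough Python sketch of what such a whitelist could look like. The type tags, classes and message layout are all invented for illustration. The generic builder only constructs classes listed in an explicit table, and only with the fields that table permits.

```python
import json


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class Label:
    def __init__(self, text):
        self.text = text


class Shell:  # present in the process but never meant to come off the wire
    def __init__(self, command):
        self.command = command


# Explicit allowlist: type tag -> (class, permitted constructor fields).
ALLOWED = {
    "point": (Point, {"x", "y"}),
    "label": (Label, {"text"}),
}


def deserialize(text):
    doc = json.loads(text)
    if not isinstance(doc, dict) or not isinstance(doc.get("type"), str):
        raise ValueError("missing type tag")
    tag = doc["type"]
    if tag not in ALLOWED:
        raise ValueError(f"type {tag!r} is not allowed")
    cls, fields = ALLOWED[tag]
    data = doc.get("data")
    if not isinstance(data, dict) or set(data) != fields:
        raise ValueError("unexpected fields")
    return cls(**data)
```

This keeps the generic builder but takes the choice of class away from the sender. It does not validate the field values themselves, so each allowed class still needs its own checks.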
Re: (Score:1)
I see a problem with white-listing. Objects are often part of a bigger ecosystem. You may have to white-list sub-sets of objects to do it right, making it non-trivial to guarantee you didn't leave a current or future hole.
You are right that it might be a big saving to have auto-object generation, but at a risk.
I don't get it (Score:2)
Can someone explain what the problem is here? Serialized objects are just code, and if you're running untrusted code you've got bigger problems than bugs in your serialization libraries.
Re:I don't get it (Score:4, Informative)
General rule of thumb as always... a vague security announcement is never as big a deal as its title makes it out to be.
There really isn't much of a problem. Reading TFA, a few vulnerabilities have been discovered in a couple applications and libraries. None of these were part of .NET, and no systemic issues in how people code for .NET have been found.
Never heard of those libraries (Score:2)
JSON.NET is not vulnerable by default (Score:2)
As stated in the linked document, for JSON.NET to be vulnerable, you have to explicitly set an option making it less secure.
As with encryption and security libraries, you are better off using well-established libraries like JSON.NET than rolling your own. A solo developer, or corporate team, just doesn't have the resources or time to work out all the security vulnerabilities, as can be done with a dedicated library.
DOS attacks on .NET and Java (Score:1)
Re: (Score:2)
Data by itself doesn't do much. The way to think about data is that it is being fed into a machine that is doing stuff. That means I can program the internal state of that machine using data. Normally we just call this "processing" but bugs like this illustrate that you have to be very careful with how you handle state. Even for "simple" formats that are just "text" like JSON, XML, YAML, and everything else. Image (binary) formats are also not immune as there have been browser attacks using bugs in com
Re: (Score:2)
In this case it happens when "Object Oriented" is taken too literally. People think of data as inert. People think of "Objects" as inert. So they figure translating between data and objects is just transforming one inert thing into another.
But "objects" are not inert in almost any dynamic language. They are quite active, with instantiation methods, etc., and some are quite dangerous. One has to adjust one's paradigm [wordpress.com] when learning OO programming from a procedural background.
Re: (Score:2)
There is no 'pure' data here, the purpose of these frameworks is to deserialize into objects, and objects by definition are functions combined with data.
Re: (Score:3)
Personally I think "exposing" objects is the problem. Your border should be a mailbox that exchanges messages and those messages should be inspected carefully before internal delivery. I have no idea why people want to dump a class they wrote onto a live internet service and just hope that it dumps data into the correct table somewhere. They dragged the "security api" icon onto the project space so it's secure.