Highly Invasive Backdoors Hidden in Python Obfuscation Packages, Downloaded by 2,348 Developers (arstechnica.com) 50
The senior security editor at Ars Technica writes:
Highly invasive malware targeting software developers is once again circulating in Trojanized code libraries, with the latest ones downloaded thousands of times in the last eight months, researchers said Wednesday.
Since January, eight separate developer tools have contained hidden payloads with various nefarious capabilities, security firm Checkmarx reported. The most recent one was released last month under the name "pyobfgood." Like the seven packages that preceded it, pyobfgood posed as a legitimate obfuscation tool that developers could use to deter reverse engineering and tampering with their code. Once executed, it installed a payload, giving the attacker almost complete control of the developer's machine. Capabilities include:
- Exfiltrate detailed host information
- Steal passwords from the Chrome web browser
- Set up a keylogger
- Download files from the victim's system
- Capture screenshots and record both screen and audio
- Render the computer inoperative by ramping up CPU usage, inserting a batch script in the startup directory to shut down the PC, or forcing a BSOD error with a Python script
- Encrypt files, potentially for ransom
- Deactivate Windows Defender and Task Manager
- Execute any command on the compromised host
In all, pyobfgood and the previous seven tools were installed 2,348 times. They targeted developers using the Python programming language... Downloads of the package came primarily from the US (62%), followed by China (12%) and Russia (6%)
Ars Technica concludes that "The never-ending stream of attacks should serve as a cautionary tale underscoring the importance of carefully scrutinizing a package before allowing it to run."
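For a rough idea of what "scrutinizing a package" can look like short of reading every line, here is a minimal sketch that parses Python source without executing it and flags a few constructs that droppers tend to rely on. The function name and both watch lists are illustrative assumptions, not anything taken from the Checkmarx report, and a clean result proves nothing about a deliberately obfuscated payload.

```python
import ast

# Illustrative watch lists (assumptions, not an authoritative rule set).
SUSPICIOUS_CALLS = {"exec", "eval", "compile", "__import__"}
SUSPICIOUS_MODULES = {"subprocess", "ctypes", "socket", "base64", "marshal"}


def flag_suspicious(source: str) -> list[str]:
    """Return human-readable findings for one Python source string."""
    findings = []
    tree = ast.parse(source)  # parses only; nothing in `source` is executed
    for node in ast.walk(tree):
        # Direct calls such as exec(...) or eval(...)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
        # "import subprocess" style imports
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append(f"line {node.lineno}: imports {alias.name}")
        # "from base64 import b64decode" style imports
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"line {node.lineno}: imports from {node.module}")
    return findings
```

Running this over every .py file in an unpacked source archive gives a list of places to look at by hand first. It only narrows the search; it does not replace the review.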
LOL (Score:5, Insightful)
"...carefully scrutinizing a package before allowing it to run..."
Sure, let's pretend that people do that.
Re: (Score:2)
It's such a stupid thing to suggest, it irks me every time I read it. If you push the stewardship downstream to the users of the package, you help to cement the anti-open-source sentiment many organizations have. Microsoft's security holes may be ever-present, but they own them. They don't just tell their customers, "Yeah, good luck with that. Check our work - we wash our hands." I have NEVER checked the code of a common library I have used, and there's no way I'm alone. It's just not cost effective to do s
Re:LOL (Score:5, Interesting)
It's such a stupid thing to suggest, it irks me every time I read it. If you push the stewardship downstream to the users of the package, you help to cement the anti-open-source sentiment many organizations have. Microsoft's security holes may be ever-present, but they own them. They don't just tell their customers, "Yeah, good luck with that. Check our work - we wash our hands." I have NEVER checked the code of a common library I have used, and there's no way I'm alone. It's just not cost effective to do so.
There's a reason this was found by Checkmarx. Look up SAST. These guys scan open source libs all the time, so you don't have to.
Re: (Score:2)
The blind leading the blind. Devs should not be allowed to use computers.
They should code on the whiteboard.
In fact there was a course 'program derivation' where you do exactly that...
Re: (Score:2)
For major packages that are widely used, I think this should be true. Stewardship should be in the team that is developing and maintaining the package. However, if you are going to use some random obfuscation code package that had been downloaded by fewer than 3K people in total? That's totally on *you* at that point. One of open source's sayings is something like "enough eyes make all bugs shallow." Well, 3K obviously was about the number of eyes required in this case. It's just like using some fly-by-n
Re: (Score:3)
"...carefully scrutinizing a package before allowing it to run..."
Sure, let's pretend that people do that.
Yeah, especially when the backdoor is obfuscated in an obfuscation package!
I guess you can't ask too much from people. Note that you could instead use a compiled language to build and deploy your applications: then there's a lot less need for obfuscation and, as a bonus, it runs faster with fewer resources!
Re: (Score:2)
More so, let's pretend this is even possible in less time than writing this from scratch.
For those who want to distribute obfuscated Python (Score:4, Interesting)
- Pyinstaller [pyinstaller.org]: create obfuscated one-directory or one-file distributions of your application: essentially it packages a Python interpreter, all the necessary modules and your code in one giant directory or single file. It's easy to use but the distribution directory or file it generates are huge, start up very slowly (especially one-file distributions) and your code is only lightly obfuscated.
python -m pip install pyinstaller
- Nuitka [github.com]: "real" Python compiler. It generates C code from your Python code whenever possible and compiles the C code. I've had very good luck with it, both on Linux and on Windows. A bit trickier to use than pyinstaller and it's super-slow to compile a big project, but it generates much smaller distribution executables that start faster. If your Python code is set up to avoid on-the-fly evaluation (which requires Nuitka to bundle in a Python interpreter and leave your code uncompiled) the size can be shaved off quite dramatically. On Windows, if you have MSVC installed, Nuitka can automatically use it to compile your code, and you can make an application with a splash screen (doesn't work with gcc for some reason).
python -m pip install Nuitka
My preference: Almost always Nuitka, unless I have to release so often that the compile time becomes a hindrance.
All other "obfuscation" solutions are pointless. Take it from me, I researched that particular subject for a long time.
Re: (Score:2)
I haven't tried it. But it looks like a code mangler that ultimately calls Nuitka.
Re: (Score:2)
I dunno... I hate to piss off the pythonistas (har har), but - if people don't want their code easy to reverse engineer, maybe not starting with a scripting language is a good starting point?
Regardless - aren't we all about FOSS here? Code obfuscation seems anathema to that ideal.
Re: (Score:3)
Not everything is black and white.
For example: most of my Python code is code we use internally in my company. One rather large module I maintain is used as the reference API to talk to our embedded product. This module has a public part, which is a subset of the entire module that only implements the public API calls, and a private, full-featured part which we use internally for programming the devices in production, and has "dangerous" code the customers shouldn't have access to.
We distribute the public p
Re: (Score:2)
Ain't pissing me off. I have a major beef with the entire mindset behind obfuscation and personally think it has no place in Python.
Also, it doesn't work. I don't care how well you think code is obfuscated, give me time and I *will* decode your spaghetti. I lived through goto-spaghettied BASIC and lived to tell the tale; obfuscators have no power over me.
They are about as safe as those idiotic JS doodads that disable right clicks because FOR SOME REASON a web dev thinks the HTML that he just projected into my
Re: (Score:2)
You can't throw a working Python application at Cython unmodified and get a working executable out of it. You can do that with Nuitka.
use a container (Score:4, Interesting)
Python version hell already makes using pyenv a minimum requirement, so why not go all the way and run all your Python projects from podman, docker, etc.? You'll end up with a tool that runs on more than just your own machine, and you won't have to give malware access to everything in your home directory.
Eventually we'll need all Linux apps to work like Android, where you authorize every little access right on a per-application basis, since I doubt the superior Qubes OS is going to gain traction with your average Linux developer.
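As a sketch of the container idea, the helper below only builds the argument list for a locked-down run: no network, read-only root filesystem, and just the project directory mounted read-only instead of the whole home directory. The function name, the /work mount point and the default image are assumptions for illustration; the flags themselves are ordinary podman/docker run options.

```python
from pathlib import Path


def sandboxed_command(project_dir: str, script: str,
                      image: str = "docker.io/library/python:3.12-slim",
                      runtime: str = "podman") -> list[str]:
    """Build (but do not run) a container command for one Python script."""
    project = Path(project_dir).resolve()
    return [
        runtime, "run", "--rm",
        "--network", "none",                # no outbound connections
        "--read-only",                      # container root filesystem is read-only
        "--volume", f"{project}:/work:ro",  # expose only the project, read-only
        "--workdir", "/work",
        image,
        "python", script,
    ]
```

The list can be handed to subprocess.run() as-is. With the network disabled, dependencies have to be baked into the image beforehand, which is arguably the point: the install step and the run step get different privileges.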
Re: (Score:2)
More abstraction layers!
Re: (Score:2)
More abstraction layers!
Is that what that's called now?
In my days, it was called cruft.
Who is at risk? (Score:3)
"legitimate obfuscation tool" - it sounds like this could affect not only the developer's machine, but also those of people who use the obfuscated code they produce.
Pypi.org is inherently insecure and counter-produc (Score:5, Interesting)
When is the Python community going to drop this stupid service "pypi.org" ?
An open registry of packages that anyone can upload, with any name and any code, that affect all users immediately, is a terrible model. It's what leads to these attacks in the news, typosquatting, etc. Quite simply, it should not be possible for one of these packages to be accessible to users without being accepted by a moderator. That's how Linux distributions work: you can't just add some malware to Ubuntu's package registry without someone approving that package for inclusion. There is due diligence required, precisely because Linux distros are serious about security and quality.
In addition, Pypi.org lacks the basic useful functionality that other package registries have:
- No ratings or comments. They show 'GitHub statistics' on a package page, but you can't see them in search, nor sort by them. By not having comments, there's no community feedback system.
- Lack of useful metadata. How often has this package been downloaded, or viewed? Other package repositories show this, and allow you to filter based on it. This gives you the basic popularity of the package which is a simple way to sort for the most likely package you want.
- Package search results are terrible. You can't filter down with more specific criteria, often because there's no criteria to search on.
- You can't search pypi via the command-line. This was supposedly due to 'abuse', but other services have simple solutions to this.
- No encouragement of a standardized hierarchical naming convention. Because of this, packages have random names, and often reinvent the wheel. Even when they build on other packages, there's no obvious sign of it, and no way to show a hierarchy of package dependencies.
Because of these and more deficiencies, we end up with malware being incredibly easy to spread, infecting more and more packages and institutions. If Python's community isn't completely irresponsible and toxic, they need to fix these things, or another language will, and eat their market share for lunch.
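Since the registry itself does no name vetting, a cheap client-side check for typosquatting is to compare a requested name against the packages you already trust. This is a minimal sketch using the standard library's difflib; the KNOWN_GOOD list, the function name and the 0.8 cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical allowlist; in practice this would come from your own lockfile.
KNOWN_GOOD = ["requests", "numpy", "pandas", "nuitka", "pyinstaller"]


def lookalikes(name: str, known=KNOWN_GOOD, cutoff: float = 0.8) -> list[str]:
    """Return trusted names that `name` closely resembles but does not equal."""
    candidate = name.lower().replace("_", "-")
    if candidate in known:
        return []  # exact match with a trusted name: nothing to warn about
    return difflib.get_close_matches(candidate, known, n=3, cutoff=cutoff)
```

A non-empty result means "you may have meant one of these", which is a reason to stop and look before installing, not proof of malice.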
Re: (Score:2)
Despite all that, it works very well for 99.99% of users
Re: (Score:2)
It’s already happened with JavaScript, where a dev sabotaged his own package and it broke lots of major downstream projects. Everyone was blaming him, but hey, this is “no warranty as is” open source, so the burden is on you for blindly trusting third party code.
Re: (Score:2)
If you were going to use such things in critical code, you'd create your own tree of such components and maintain it, doing individual package upgrades as necessary and desirable. But that is work.
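One small piece of maintaining your own tree is refusing any artifact whose contents differ from what you reviewed. A minimal sketch, assuming you record a SHA-256 digest for each file you have vetted (the function names are made up for this example):

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: str, pinned_hex: str) -> bool:
    """True only if the file's digest equals the digest recorded at review time."""
    return sha256_of(path) == pinned_hex.strip().lower()
```

pip has a comparable built-in mechanism in its --require-hashes mode, where every requirement must carry a pinned hash.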
Re: (Score:3)
Well, there's this article and others like it highlighting precisely that Pypi (and similar approaches) are troubled.
Sure, it *works*, but then malicious package comes in and bad things happen. This is exacerbated by the approach of having 'boring old end users' also encouraged to 'just pip install' to pull whatever version at any time. Contrast with other environments where there are similar repositories, but mainly they feed developers rather than all end users. You don't *have* to use pip to install s
Re: (Score:2)
Nope. 2300 _developers_ were affected. The number of people affected will be much higher, unless nobody used that code.
Re: (Score:2)
As usual in an open and unregulated market, reputation becomes very important and those that want to not get hit by scams need to check it carefully.
Re: (Score:2)
That is a bullshit number. If the other 0.01% of users of Pypi.org write 50% of the code that is used out there, then 50% of the code out there is backdoored.
Re: (Score:3)
- You can't search pypi via the command-line. This was supposedly due to 'abuse', but other services have simple solutions to this.
python -m pip install pip-search
Re: (Score:2)
Why single out Ubuntu, as though it were the only distro that does this? I'd be very surprised if there were a mainstream Linux distro that doesn't take this precaution, not only to keep malware out but to make sure that the package works the way it's supposed to and doesn't have any obvious showstopper bugs.
Re: (Score:2)
> - No ratings or comments, no community feedback system
Do you realize that 90% of the ratings are fake? It's both humiliating and time consuming for a developer who wants to break through to have to operate (or subcontract) review sockpuppet rings to increase his or her "ratings".
> How often has this package been downloaded, or viewed?
Why do you care about that? You're supposed to be a programmer (an "engineer" LOL). You're a professional, you should be able to judge things on their own merits, no
Great for hiding credentials... (Score:4, Interesting)
Ironically, where I have seen Python code obfuscators used is a dev using them to hide a password or other credential, for example a root password or access to a database. This was at a place where security definitely took a back seat to pretty much anything else.
There were a few ways people obfuscated credentials when checking stuff into the codebase. Some had a utility which would obfuscate Bash scripts, others would write and compile lengthy C programs which had the credentials as the program output, and some even played around with assembly in order to just run that chunk of code and get the username/password credentials. In any case, whatever it was, was just run as a binary via a system call, and the output either used directly as the username/password, or if the dev was a bit smarter, the output would be a key that unlocked the stored credentials which were stashed away in a SQLite DB.
Re: (Score:2)
Now that is exactly the _wrong_ way to do it. Passwords have no place in code. They belong in config files if you must have them. This sounds like "developers" trying to get around coding guides and code security reviews and do this crappy thing anyway. I have done security work for some companies where trying something like that would get you fired immediately, because it was explicitly forbidden.
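For comparison, keeping the credential out of the codebase takes only a few lines. This is a minimal sketch, assuming an INI-style file with a [database] section and an environment variable named APP_DB_PASSWORD; both names are made up for the example, and the config file is expected to live outside version control with restrictive permissions.

```python
import configparser
import os


def load_db_password(config_path: str, env_var: str = "APP_DB_PASSWORD") -> str:
    """Read the password from the environment, falling back to a config file."""
    value = os.environ.get(env_var)
    if value:
        return value
    # interpolation=None so a literal '%' in a password is not treated as syntax
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):  # read() returns the files it could parse
        raise FileNotFoundError(f"no readable config at {config_path}")
    return parser["database"]["password"]
```

Nothing secret is checked in, and rotating the password means editing one file or one variable rather than recompiling an obfuscated binary.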
Re: (Score:2)
You hit the nail on the head. The "developers" did not care at all, period, and they were encouraged not to care: if their code caused a major security breach, they would likely never face any consequences for things like disabling SELinux on all servers. However, if they didn't get deliverables in, the Scrum master would make everyone in the group quite aware of it in the standup meeting on a daily basis.
Thankfully, this is a place I no longer work at.
Re: (Score:2)
Thankfully, this is a place I no longer work at.
The only good decision in such a situation. We really need liability for crappy software engineering or nothing will change.
Re: (Score:2)
Even with liability, at most it winds up being a token fine. I don't see this changing until there is something so horri-bad in a war, on par with a D-Day, a Pearl Harbor, or some momentous event that could have been prevented had the company spent any effort at all on security. Then, once governments start dissolving companies or holding people criminally responsible because it actually changed the tide of a war, things may change.
Re: (Score:2)
Liability as in "you pay for the damage done to those that used that software" and some damages on top, because that should not have happened at all.
Honestly, (Score:2)
the "developers" falling for this trap lack some essential skills and therefore have no business creating, and especially distributing, software.
Re: (Score:3)
I'm afraid this is highly reminiscent of distributed software written in other interpretive languages like VB.
One of the disadvantages of ease of use is an inability to vet the code for flaws like this. Reliance on third party libraries is often the vector. The library is inscrutable to the application developer.
Re: (Score:2)
Must be nice to live in a world of unrestricted budgets and schedules.
Re: (Score:3)
Nope. It is just the world where engineering also includes risk management. Most of the software world has not yet understood that idea.
Another reason to use Perl (Score:4, Funny)
No one would have found it and, if they did, probably wouldn't understand it. :-)
(said as someone who enjoys coding in Perl)
Or better still, program in TECO [wikipedia.org]:
It has been observed that a TECO command sequence more closely resembles transmission line noise than readable text. One of the more entertaining games to play with TECO is to type your name in as a command line and try to guess what it does. Just about any possible typing error while talking with TECO will probably destroy your program, or even worse - introduce subtle and mysterious bugs in a once working subroutine.
According to Craig Finseth, author of The Craft of Text Editing, TECO has been described as a "write-only" language, implying that once a program is written in TECO, it is extremely difficult to comprehend what it did without appropriate documentation.
pyc (Score:2)
can't you just distribute the .pyc files and keep the .py files?
I know it's a small step to decompile them but maybe that's enough. Otherwise it seems to me that the code obfuscators (even the honest ones) may not be enough either.
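To show how small that step is, here is a minimal sketch that compiles a source file to a .pyc and then reads the constants straight back out with the standard library, no decompiler needed. The helper names are made up for the example, and the 16-byte header skip assumes the .pyc layout used by CPython 3.7 and later.

```python
import marshal
import py_compile


def compile_to_pyc(source_path: str, pyc_path: str) -> str:
    """Byte-compile one source file to an explicit .pyc path."""
    return py_compile.compile(source_path, cfile=pyc_path, doraise=True)


def constants_in_pyc(pyc_path: str) -> list:
    """Return the top-level constants stored in a .pyc file."""
    with open(pyc_path, "rb") as fh:
        fh.read(16)              # skip the header (assumes CPython 3.7+ layout)
        code = marshal.load(fh)  # the module's code object
    return list(code.co_consts)
```

Any string literal in the module, including a hard-coded key or password, shows up in that list verbatim, so shipping only .pyc files hides formatting and comments but not much else.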