On self-modifying executables in Rust (dend.ro)
96 points by lukastyrychtr on Jan 28, 2022 | 41 comments


> All this happens on a copy of our program. Self-modifying programs were fine under MS-DOS, but modern operating systems won't let it fly. Renaming or overwriting is fine, since the original file is still unchanged.

This isn’t strictly true: Linux, for example, provides `process_vm_writev`, which can be used to dynamically poke the executable text at runtime. The Linux kernel itself even has a decent amount of SMC, although it doesn’t use that mechanism.

The `totally_safe_transmute` party trick[1] from last year uses a similar technique. Technically it’s on data and not code, but the effect is the same.

[1]: https://blog.yossarian.net/2021/03/16/totally_safe_transmute...


I don't think process_vm_writev can write to read-only memory, which executable memory is under modern operating systems. Doing that requires ptrace, which long predates Linux. According to the Linux man page, it is "CONFORMING TO: SVr4, 4.3BSD".


The ptrace APIs also can’t write to read only memory, at least not on modern Linux running on CPUs with modern MMUs. But SMC doesn’t require that, because you can always map (or remap) executable pages as writable.

Executable code is only read-only by default, you can always change that (or just not do it, in the case of a JIT).


> The ptrace APIs also can’t write to read only memory

Sure they can.

https://stackoverflow.com/questions/49442087/how-does-ptrace...


Huh, this is news to me! I always thought that ptrace temporarily changed the page's permissions, rather than bypassing them entirely. My mistake.


It's not precisely "self modifying code" but FYI various systems are starting to use LLVM to specialize code at runtime (aka "JIT"), including Postgres, which is written in old school C.

In the case of Postgres, the code is modified to precisely work for the particular SQL expression and for the particular table schema. This magic is on by default, which means 100 million+ computers are doing this right now. 25-40% speedups on certain queries.

This concept was first inspired in the 1990s by folks like Henry Massalin, and confirmed at Berkeley and elsewhere, but wasn't considered practical until LLVM came along.

Mic drop and mind blown.

https://severalnines.com/database-blog/overview-just-time-co...

https://www.postgresql.org/docs/current/jit.html

https://www.postgresql.org/docs/current/jit-reason.html#JIT-...

https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29....

https://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/6310.html


Oracle and SQL Server have been using JIT for quite some time, no need to wait for LLVM.


> you probably shouldn't do this except as a party trick.

I seriously question why you ever want to do this in any language except as a research effort. (for all practical purposes)

If anyone has any real applications, I would like to be enlightened. Writing an interpreter, hoping to sandbox it, and having it execute code that modifies itself does not count; only modifying the original program as is.


I can give you a practical reason, a legit one even. Say you provide your clients with a demo version of your program that expires after 30 days, at which point clients either register to keep using it or are left with a demo version that no longer works.

Sure, you can employ a simple scheme where you write stuff in another part of the OS (like the registry on Windows) and then rely on the goodwill of your potential clients to register your demo. But such a scheme is easy to abuse. So how about one that's a bit harder, like a self-modifying demo that writes stuff into itself and checks itself to stop running after 30 days? Bonus points if your demo also has some database hosted inside the executable, so it has legit reasons for doing the self-modifying bit. Even with a debugger attached, you've made the eventual cracker's life 100 times harder. Sure, anything can be cracked, but it's all about your hours vs. cracker hours in the end.


That's not particularly useful. Just keep a copy of the installer of the demo program. (I always do that anyway.) Instead of having a complicated self-modifying thing, the demo version could just delete itself, or overwrite itself with a stub program.


Correct. And you will have no data that you used for those 30 days because this copy is fresh. So you will do what? Every 30 days start all over again and enter the data you need? Like let's say a veterinary office that has clients and their animals, complete with treatments and pictures? Good luck getting that back in with the new fresh installed one.


I think Cosmopolitan[1] is a beautiful example for self-modifying executables.

> Cosmopolitan Libc makes C a build-once run-anywhere language, like Java, except it doesn't need an interpreter or virtual machine. Instead, it reconfigures stock GCC and Clang to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + NetBSD + BIOS with the best possible performance and the tiniest footprint imaginable.

> Please note that your APE binary will assimilate itself as a conventional resident of your platform after the first run, so it can be fast and efficient for subsequent executions.

[1] https://github.com/jart/cosmopolitan


It's a technical hack to solve an organisation problem though. The correct solution is a standard format for fat binaries that's supported everywhere.


It's an achievable hack. Your "correct solution" is unachievable.


APE is fundamentally broken on pretty much every platform that isn’t Linux because of baked-in ABI.


As others have pointed out, there is a big difference between self-modifying in-memory and on-disk. The article seems to be talking about on-disk modification.

We have an internal CLI for our developers that auto-updates itself by replacing the binary on-disk. The auto-update bit only runs when the developer uses our CLI, and at most once every 24 hours. If an update is triggered, it prints out a message saying it was updated and asks the developer to re-run the command.

The upshot is we didn't have to write and distribute a second application to handle auto-updates as a background daemon, and we can be reasonably confident anybody using our CLI is +/- one version.

If for some reason it leaves the binary in a bad state, devs can still install over it with homebrew, or downloading from the releases page - haven't had to do that though in the 2 years we've had it.


Surprisingly, Backtrace apparently uses self-modifying code in production for feature flags:

https://engineering.backtrace.io/2021-12-19-bounded-dynamici...

This is basically to make it zero cost (as far as I remember)

Google uses a similar system of feature flags, but as far as I know there's no self-modifying code. I guess Backtrace wants to put the flags in tight inner loops.

dtrace and I believe systemtap patch the executable at runtime, for similar performance reasons. But those are kernel features and not user applications.


You can do something like this using the .init and .fini sections in a dynamically loaded shared lib.

https://maskray.me/blog/2021-11-07-init-ctors-init-array


Malware and packers do this, and there is overlap between the two. Let's say you have a 10MB executable; you can pack it to be a lot smaller. Tamper protection is another use. VMProtect is the packer I run into in that area, and it tries hard to detect whether it is inside a VM before self-modifying and unpacking.


EDIT: This is about modifying code in memory; the article is actually more about modifying the executable file on disk — oops (:

----

One example:

My understanding is that static initialization in C/C++ is sometimes done with self-modifying code.

Eg:

  ExpensiveObject &getInstance() {
      static ExpensiveObject obj;
      return obj;
  }

So, the first time through, the generated assembly would run some "if object is not initialized, run the constructor" conditional code. Then, it would rewrite itself to not do that check for subsequent calls.

Obviously, that's all in the "hidden behind the scenes" generated assembly code, and the rust compiler could pull similar tricks, but you might also want to allow client code to do such things, for extreme performance cases.


That's not really what's happening with such C++ code. `static` variables are placed in normal writeable global memory. The only difference with such a static is that the compiler generates an extra "bool" variable and checks it so that obj is initialized only on the first call (threadsafe in C++11 and up).

You can see example here: https://godbolt.org/z/7f1qbh5r5

There are no code modifications. All the compiler generates is an extra byte check (an "if" statement) and then a lock acquire/release around the constructor to be threadsafe.
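For comparison, here is a rough Rust sketch of the same guard-plus-lock pattern, written with the standard library's `OnceLock`. `ExpensiveObject`, its `value` field, and the `CONSTRUCTIONS` counter are hypothetical names added for illustration; the counter exists only to make the "constructor runs once" behavior observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts constructor runs so the "only once" behavior can be observed.
/// (Illustration only; not part of the pattern itself.)
static CONSTRUCTIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the ExpensiveObject in the C++ snippet.
pub struct ExpensiveObject {
    pub value: u64,
}

impl ExpensiveObject {
    fn new() -> Self {
        CONSTRUCTIONS.fetch_add(1, Ordering::SeqCst);
        ExpensiveObject { value: 42 }
    }
}

/// Lazily initialized singleton. The object and its "initialized" state
/// live in ordinary writable static memory; no code is patched.
pub fn get_instance() -> &'static ExpensiveObject {
    static OBJ: OnceLock<ExpensiveObject> = OnceLock::new();
    // Every call checks the state; only the first call runs the constructor.
    OBJ.get_or_init(ExpensiveObject::new)
}

/// Number of times the constructor has run so far.
pub fn construction_count() -> usize {
    CONSTRUCTIONS.load(Ordering::SeqCst)
}
```

Calling `get_instance()` repeatedly returns the same reference, and the check on each call is a plain memory read rather than rewritten code.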


Some more good discussion here: https://stackoverflow.com/questions/63568992/can-compilers-g...

I guess it's not commonplace these days, but it is still allowed, it seems. I feel like my initial run-in with the concept was on some embedded systems stuff long ago, so maybe that was a more esoteric compiler.


Function tracing in Linux works like this - only functions that you want to trace are modified at runtime to redirect execution, to keep it otherwise (almost) zero cost. They also have "static keys" to allow for seldom used features in performance critical code to be toggled on and off (through runtime code modification) that are used all over:

https://elixir.bootlin.com/linux/latest/source/include/linux...


Metamorphic malware [1]. Metamorphic code is used by computer viruses to avoid the pattern recognition of anti-virus software. [2]

> A metamorphic virus is a type of malware that is capable of changing its code and signature patterns with each iteration. [1]

[1] https://www.techtarget.com/searchsecurity/definition/Metamor...

[2] https://en.wikipedia.org/wiki/Metamorphic_code


E.g. generating performance-critical DSP code at runtime, depending on the execution environment, e.g. cache-line width. Of course you can pre-generate common variants, but code size will be exponential in the number of parameters.


Compiling code at runtime is not self-modifying. For instance, in GPGPU programming, the kernels (=code running on GPUs) are usually compiled at runtime. And of course any kind of JIT in scripting languages does the same.


What about rewriting CPU feature-detection code to remove branching?


Yeah, I think this is the answer to OP's question.


Ok, then consider a lazy-loading dynamic linker that rewrites the thunks when the library is loaded.


The linked article is not about self-modifying code, but self-modifying executable files on disk.


Not quite what you're asking, but back in the days of DOS, I did something like this with a journal batch file I wrote.

DOS (and even Windows now) runs a line from a batch file, and then re-opens the file to read the next line. That makes it super easy to edit files that are running, and it was even easier back in the days of DOS when you had edlin to help you do the editing.

My script would store state, options, journal entries, and other things I've long forgotten about in itself. Of course, it all came to a crashing halt when it managed to destroy itself.


Working around the memory/cpu tradeoff was the traditional reason for writing SMC.

For instance, you wrote the code to XOR a line to the screen. Maybe you also want an AND version and an OR version, but you don't want to pay the cost of having an "if" in the middle of the loop that checks what kind of operation you want to do. You also don't want to ship 3 nearly identical versions of the code, so you ship one version and patch it at runtime.
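To be clear, the sketch below is not self-modifying code. It shows how the same tradeoff is usually handled today in Rust: the choice of operation is made once, outside the loop, and the compiler generates the three specialized loops from a single generic helper. `BlitOp`, `apply`, and `blit_line` are hypothetical names used only for illustration.

```rust
/// Bitwise operation used to combine a source line into the destination.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BlitOp {
    Xor,
    And,
    Or,
}

/// Generic inner loop. Each closure type passed in gets its own
/// monomorphized copy, so the loop body contains no operation check.
fn apply(dst: &mut [u8], src: &[u8], f: impl Fn(u8, u8) -> u8) {
    // zip stops at the shorter of the two slices.
    for (d, s) in dst.iter_mut().zip(src) {
        *d = f(*d, *s);
    }
}

/// Combines `src` into `dst` byte by byte using `op`.
pub fn blit_line(dst: &mut [u8], src: &[u8], op: BlitOp) {
    // The branch happens once per line, not once per byte.
    match op {
        BlitOp::Xor => apply(dst, src, |d, s| d ^ s),
        BlitOp::And => apply(dst, src, |d, s| d & s),
        BlitOp::Or => apply(dst, src, |d, s| d | s),
    }
}
```

The binary still ends up containing three near-identical loops, which is exactly the memory cost the old SMC trick avoided; the difference is that the source has only one.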


I remember an old program that used to save config parameters (bitflags iirc) in the binary to avoid external configs.

Obviously this has a lot of downsides but it's quite intriguing to have an executable with an embedded config.


I believe self-modifying viruses were a thing back in the MS-DOS/Win95 era; not sure if they are still relevant these days.


How would this work, when something like Windows Security Policy (Group Policy) can prevent or allow apps based on its hash?

https://docs.microsoft.com/en-us/windows-server/identity/sof...


Quote: "Self-modifying programs were fine under MS-DOS, but modern operating systems won't let it fly"

This is wrong on 2 accounts:

1 - Even on MS-DOS you weren't able to directly modify your executable. I remember having to launch a copy, which would see from its name that it was a copy; the original would then exit and let the copy modify the original .exe. Later I simply did this using batch files.

2 - Even on modern operating systems you can actively destroy the link/handle between the operating system and the executable, allowing you to actively modify your own executable. See "Unlocker" on Windows as an example of a legit application that can do such a killing (I used that to get rid of that crap called BitDefender, which behaves like a rootkit after installation).


1. you can bypass DOS and write directly to the sectors on the disk

2. Linux won't let you open() a file that's a resident executable for write... you'll get ETXTBSY

you can write out a new file and change the directory entry to point to a new one, but then you haven't technically modified the other file
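A minimal Rust sketch of that write-then-rename approach, assuming a Unix-like system where renaming over a running executable is permitted. `replace_via_rename` and the `.new` suffix are hypothetical choices for illustration, and the sketch skips things a real updater would want, such as fsync and cleanup on failure.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes `new_contents` to a sibling temporary file, then renames it
/// over `target`. The old file's contents are never modified; the
/// directory entry is simply pointed at the new file.
pub fn replace_via_rename(target: &Path, new_contents: &[u8]) -> io::Result<()> {
    // Hypothetical naming scheme: "<target>.new" in the same directory,
    // so the rename stays on a single filesystem.
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".new");
    let tmp = PathBuf::from(tmp_name);

    fs::write(&tmp, new_contents)?;

    // Carry over the old file's permissions (e.g. the executable bit),
    // if the old file exists.
    if let Ok(meta) = fs::metadata(target) {
        fs::set_permissions(&tmp, meta.permissions())?;
    }

    // Swap the directory entry to point at the new file.
    fs::rename(&tmp, target)
}
```

A process already running the old binary keeps executing the old file; only later launches pick up the replacement.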


> you can write out a new file and change the directory entry to point to a new one, but then you haven't technically modified the other file

Yep, this. Meanwhile, on MacOS, code signatures (for signed binaries) are cached by inode. If the file contents at that inode change, the signature no longer matches, and Gatekeeper will SIGKILL the binary immediately when you try to run it. The cache is only kept in memory though, and rebooting clears it.


Cool stuff!

For entertainment, I once attached some data to an executable file by just appending the data to the executable and ending it with a uint64 holding the amount of data appended, but this is a lot simpler and more clever.
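The append-with-trailer scheme described above can be sketched in Rust roughly as follows. It works on in-memory byte buffers rather than a real executable, and the little-endian encoding and the names `append_payload` and `read_payload` are assumptions, since the comment does not specify them.

```rust
use std::convert::TryFrom;

/// Appends `payload` to `image`, followed by the payload length as a
/// little-endian u64 trailer (byte order is an assumption).
pub fn append_payload(image: &mut Vec<u8>, payload: &[u8]) {
    image.extend_from_slice(payload);
    image.extend_from_slice(&(payload.len() as u64).to_le_bytes());
}

/// Reads the payload back by looking at the last 8 bytes first.
/// Returns None if the trailer is missing or claims more data than exists.
pub fn read_payload(image: &[u8]) -> Option<&[u8]> {
    let trailer_start = image.len().checked_sub(8)?;
    let len_bytes = <[u8; 8]>::try_from(&image[trailer_start..]).ok()?;
    let len = usize::try_from(u64::from_le_bytes(len_bytes)).ok()?;
    let payload_start = trailer_start.checked_sub(len)?;
    Some(&image[payload_start..trailer_start])
}
```

Note that nothing here distinguishes a file with a payload from one without: the last 8 bytes of an unmodified executable would be read as a length, so a real version would probably also want a magic marker.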


Can't you just append the data to the executable? I don't see why it needs to be an ELF section.


Probably not allowed on iOS :(



