Reverse Engineering Applications Protected with Pyarmor

Case studies of Pyarmor obfuscation in Windows, Linux, and macOS environments

Preface

I decided to write up my approach to analyzing applications protected with Pyarmor. I have encountered many Pyarmor-protected applications on Windows, Linux, and macOS, and I analyzed each of them with a different approach.

Approaches

Stomping Python Builtin Function

  • Tested on: Linux, MacOS, Windows

Study Case: TFCCTF 2024 - McKnight

We are given a Python file that runs Pyarmor, which we can recognize from the code:

from pytransform import pyarmor_runtime
pyarmor_runtime()
__pyarmor__(__name__, __file__, <DATA>)

Other files are also provided, as shown in the image below.

When we run the program, it gives output like the one below.

One function the program is likely to use to print the usage string is "print", which is a Python builtin. If we define our own print function, it replaces the print function used by the program. Let's try this with some simple code.

We can see that we have successfully injected our Python code. What can we do now? Since we can inject Python code, we can do some information gathering. Let's start by listing the globals.

Now we have information about the functions and variables available in the protected code. Let's try calling each of them.

  • Call the calc_line function to see which arguments it expects (if any)

  • Call the hash function to see which arguments it expects (if any)
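Calling a function with no arguments is a cheap probe because the resulting TypeError names the missing parameters. A minimal sketch, where `calc_line` is a hypothetical stand-in and not the real protected function:

```python
def probe_call(func, *args):
    """Call func and return ("ok", result) or ("error", message)."""
    try:
        return ("ok", func(*args))
    except Exception as exc:  # broad on purpose: we only want the message
        return ("error", f"{type(exc).__name__}: {exc}")


# Hypothetical stand-in for a function found in the protected globals.
def calc_line(values):
    coeffs = (2, 3, 5)
    total = 0
    for index in range(len(values)):
        total += coeffs[index] * values[index]
    return total


status, detail = probe_call(calc_line)       # no arguments: expect a TypeError
print(status, detail)
print(probe_call(calc_line, [1, 1, 1]))      # retry with a guessed argument
```

The same helper can then be reused for the trial-and-error step, passing candidate arguments until the call succeeds.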

Now that we know which arguments should be passed, let's try passing some data based on the known information (trial and error is enough).

So with this approach we can call any function inside the protected code. What about getting the "plaintext" code? Let's try using dis.

It failed, but the partially disassembled code shows that the function is still protected by Pyarmor. The error message is "IndexError: tuple index out of range", so it looks like something is missing from the list of names (whether variable or function names). Let's try another way to disassemble the code: using the function's `__code__` attribute.
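Since dis may raise partway through, it helps to keep whatever was printed before the failure. The helper below is a sketch under that assumption; `sample` is a plain, unprotected stand-in, so here the disassembly succeeds.

```python
import dis
import io


def try_disassemble(obj):
    """Return (listing, error); listing keeps any partial output."""
    buffer = io.StringIO()
    error = None
    try:
        dis.dis(obj, file=buffer)
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
    return buffer.getvalue(), error


def sample(x):  # unprotected stand-in function
    return x + 1


listing, error = try_disassemble(sample)
code_listing, code_error = try_disassemble(sample.__code__)  # via __code__
print(listing)
print("error:", error)
```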

The output above still looks protected by Pyarmor. At this point we know that direct disassembly fails, so let's try another approach: enumerating the attributes available on the Python function.
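The attributes worth enumerating are mostly the `co_*` fields of the code object, which stay readable even when the bytecode does not. A small sketch, again using a hypothetical stand-in for calc_line:

```python
def summarize_code(func):
    """Collect the code-object fields that hint at what a function does."""
    code = func.__code__
    fields = ("co_name", "co_argcount", "co_varnames", "co_names", "co_consts")
    return {field: getattr(code, field) for field in fields}


# Hypothetical stand-in; the real function body is not known.
def calc_line(values):
    coeffs = (2, 3, 5)
    total = 0
    for index in range(len(values)):
        total += coeffs[index] * values[index]
    return total


summary = summarize_code(calc_line)
for field, value in summary.items():
    print(f"{field:12} {value!r}")

# All co_* attributes exposed by this interpreter version.
code_attrs = [name for name in dir(calc_line.__code__) if name.startswith("co_")]
print(code_attrs)
```

Local variable names such as coeffs show up in `co_varnames`, and referenced globals or builtins such as range show up in `co_names`.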

If the code is not too complex, we can make an educated guess about the algorithm. For example, we know that there is a loop in the code and that it uses a coeffs variable.

Let's analyze each output.

Our educated guess is that calc_line performs matrix multiplication, so at this point we have successfully reversed the calc_line function. But what about the hash function? It is harder than calc_line because it calls calc_line, and again we need to make some educated guesses.
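An educated guess can be checked by comparing it against the callable black box on random inputs. The sketch below is only an illustration: `black_box` is a hypothetical stand-in (a single row of a matrix product with made-up coefficients), not the real calc_line, and the `agrees` helper is introduced here for the example.

```python
import random


def agrees(black_box, guess, make_args, trials=200, seed=0):
    """Return True if guess matches black_box on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = make_args(rng)
        if black_box(*args) != guess(*args):
            return False
    return True


# Hypothetical stand-in for the protected function.
def black_box(values):
    coeffs = (2, 3, 5)
    total = 0
    for index in range(len(values)):
        total += coeffs[index] * values[index]
    return total


def guess_dot(values):   # our reconstruction: dot product with the row
    return sum(c * v for c, v in zip((2, 3, 5), values))


def guess_sum(values):   # a deliberately wrong guess, for contrast
    return sum(values)


make_args = lambda rng: ([rng.randint(-9, 9) for _ in range(3)],)
print(agrees(black_box, guess_dot, make_args))
print(agrees(black_box, guess_sum, make_args))
```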

Study Case: BlackHat MEA 2023 - Can you break the armor?

We are given a Python file protected with Pyarmor, with the directory layout below.

If we run run.py on its own it prints nothing, but if we run python run.py with an argument it shows output like the one below.

We can see that the print function is called here too, so let's stomp it again.

Because the program deletes itself, we need to work around that by calling exit at the first print.

It looks like there is only a main function. We know that disassembling main directly will fail because it is protected by Pyarmor, so let's gather some information from `__code__`.

It looks like there are inner functions inside main, so let's try to disassemble those inner functions.

The disassembly succeeds, so let's loop over all the code objects.

Nice. At this point we have obtained the flag by disassembling the inner functions and looking at their constants.
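Inner functions live as code objects inside the outer function's `co_consts`, so they can be found with a recursive walk. A minimal sketch; `main`, its inner functions, and the placeholder string are all hypothetical stand-ins:

```python
import dis
import types


def iter_code_objects(code):
    """Yield code and, recursively, every code object nested in co_consts."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def string_constants(code):
    """Collect all string constants from code and its nested code objects."""
    found = []
    for obj in iter_code_objects(code):
        found.extend(c for c in obj.co_consts if isinstance(c, str))
    return found


# Hypothetical stand-in for the protected main function.
def main():
    def check(value):
        return value == "FLAG{placeholder}"

    def banner():
        return "welcome"

    return check, banner


names = [obj.co_name for obj in iter_code_objects(main.__code__)]
print(names)
print(string_constants(main.__code__))

for obj in iter_code_objects(main.__code__):
    if obj is not main.__code__:      # only the inner functions
        dis.dis(obj)
```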

Further Exploration #1

Because we managed to get inside the protected code, we can use method 2 of the unpacker from this repository. Running its code directly fails, but by looking at the structure we can modify it slightly so that it works with our runner.py.

Run the code above to get dump/hasher.pyc, then use pycdc to decompile it.
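For reference, writing a code object to a .pyc file is a small operation in itself. The sketch below is a generic illustration and not the unpacker's code: it assumes Python 3.7 or later (16-byte header) and uses a trivial compiled snippet as the code object, whereas a Pyarmor-protected code object would first have to be restored.

```python
import importlib.util
import marshal
import os
import struct
import tempfile


def write_pyc(code, path):
    """Write code as a .pyc: magic number, flags, mtime, size, then marshal data."""
    header = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, 0, 0)
    with open(path, "wb") as handle:
        handle.write(header)
        handle.write(marshal.dumps(code))


# Trivial stand-in code object (assumption for this sketch).
code = compile("answer = 6 * 7", "<dumped>", "exec")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hasher.pyc")
    write_pyc(code, path)
    with open(path, "rb") as handle:
        data = handle.read()

# Round trip: skip the 16-byte header and load the code object back.
loaded = marshal.loads(data[16:])
scope = {}
exec(loaded, scope)
result = scope["answer"]
print(result)
```

A file written this way must be decompiled with a tool that supports the bytecode version of the interpreter that produced it.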

Further Exploration #2

After taking a look at method 3, it turns out that this approach does essentially the same thing. The difference is that it relies on a builtin function called by the protected code instead of hooking marshal.loads through an audit hook.
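For comparison, hooking marshal.loads with an audit hook looks roughly like the sketch below. It is an illustration rather than the repository's code, and it assumes Python 3.10 or later, where `marshal.loads` raises an audit event of the same name with the bytes as its argument. Audit hooks cannot be removed once added.

```python
import marshal
import sys

captured = []  # raw byte strings passed to marshal.loads


def audit(event, args):
    """Record the payload of every marshal.loads call."""
    if event == "marshal.loads":
        captured.append(args[0])


sys.addaudithook(audit)

# Stand-in for the runtime loading a code object from bytes.
payload = marshal.dumps(compile("x = 1", "<demo>", "exec"))
marshal.loads(payload)

print(len(captured), "payload(s) captured")
```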

Injecting Python Code during Runtime

  • Tested on: Windows

Study Case: Flare-On 9 - Challenge 11, Utilizing PyInjector (Windows)

For this approach we can use method 2 from this repository.

Study Case: XXXXX, Utilizing PyInjector (Linux)

Study Case: XXXXX, Utilizing PyInjector (MacOS x64)

Study Case: XXXXX, Utilizing PyInjector (MacOS aarch64)

---To Be Updated---

Modifying Python Executable

  • Tested on: Linux

Study Case: TFCCTF 2024 - McKnight, Dumping Object Code

Study Case: TFCCTF 2024 - McKnight, Tracing OP_CODE

---To Be Updated---

Conclusion

---To Be Updated---

References
