Final

| Challenge | Topic |
| --- | --- |
| Kepiting Cirebon | Frida, dex dump, z3 |
| qengine 🥇 | QuickJS |

Kepiting Cirebon

Description

-

Solution

Given an APK file, decompile it with jadx. The com.kepitingcirebon.shell.ProxyComponentFactory and com.kepitingcirebon.shell.ProxyApplication classes show that the app loads its real Android components dynamically.

Using generative AI to summarize the libkepiting.so functions that the shell calls through JniBridge, we obtained the following information:

  • Initialize native environment (a(), ia(), cbde())

  • Discover real app components (rcf(), rapn())

  • Create and delegate to real application (ra(), craa(), craoc())

The next step is to hook the class loader or dex loader functions, but it turns out the app detects Frida. We can bypass the detection with a universal anti-Frida bypass script:

try {
    // Resolve pthread_create in libc so that we can wrap it.
    var p_pthread_create = Module.findExportByName("libc.so", "pthread_create");
    var pthread_create = new NativeFunction(p_pthread_create, "int", ["pointer", "pointer", "pointer", "pointer"]);
    Interceptor.replace(p_pthread_create, new NativeCallback(function(ptr0, ptr1, ptr2, ptr3) {
        // Heuristic: treat threads created with NULL attributes (ptr1) and a NULL argument (ptr3) as detection threads.
        if (ptr1.isNull() && ptr3.isNull()) {
            console.log("Possible thread creation for checking. Disabling it");
            return -1; // report failure instead of starting the thread
        } else {
            return pthread_create(ptr0, ptr1, ptr2, ptr3);
        }
    }, "int", ["pointer", "pointer", "pointer", "pointer"]));
} catch (error) {
    console.log("Error", error);
}

The next idea is to trace several commonly used functions with frida-trace, because the assets contain a file whose purpose we don't know yet, namely linggisjawa.

From the frida-trace output, we know that there is a file in code_cache named i11111i111.zip. Unzipping it yields two files, classes.dex and classes2.dex, which we tried decompiling with jadx.

The decompiled output shows that the code in the dex files is still encrypted, so the next step is to dump the dex files from memory while the application is running. Here, we use frida-dexdump. Because of the anti-Frida check, we attach frida-dexdump to an already running process, assuming the check is performed only at application startup.
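To illustrate what a memory dex dump involves, here is a minimal sketch that scans a raw memory snapshot for DEX headers and reads each image's size from the header. It is not frida-dexdump's actual code: `find_dex_images` and `make_fake_dex` are hypothetical names, and the snapshot is assumed to be a plain `bytes` object. The header layout used (magic at offset 0, `file_size` at 0x20, `header_size` at 0x24, standard header size 0x70) follows the published DEX format.

```python
import struct

DEX_MAGIC_PREFIX = b"dex\n"
HEADER_SIZE = 0x70  # size of a standard DEX header


def find_dex_images(memory: bytes):
    """Return (offset, size) for every plausible DEX image in a memory snapshot."""
    hits = []
    pos = memory.find(DEX_MAGIC_PREFIX)
    while pos != -1:
        header = memory[pos:pos + HEADER_SIZE]
        # The magic is "dex\n" + three version digits + NUL, e.g. b"dex\n035\x00".
        if len(header) == HEADER_SIZE and header[4:7].isdigit() and header[7] == 0:
            file_size, header_size = struct.unpack_from("<II", header, 0x20)
            # Keep the hit only if the sizes are sane and the image fits in the snapshot.
            if header_size == HEADER_SIZE and HEADER_SIZE <= file_size <= len(memory) - pos:
                hits.append((pos, file_size))
        pos = memory.find(DEX_MAGIC_PREFIX, pos + 1)
    return hits


def make_fake_dex(size: int) -> bytes:
    """Build a dummy image with a valid-looking header (hypothetical test helper)."""
    header = bytearray(HEADER_SIZE)
    header[0:8] = b"dex\n035\x00"
    struct.pack_into("<II", header, 0x20, size, HEADER_SIZE)
    return bytes(header) + b"\xAA" * (size - HEADER_SIZE)
```

Each hit can then be sliced out as `memory[offset:offset + size]` and written to its own file for decompilation.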

Next, we grep the dump for the keyword PudingCoklatPakHambali.

After obtaining the two dex files, we tried decompiling each of them. We found that classes02.dex is the same dex file as the one in the .zip (still encrypted), and classes04.dex cannot be decompiled with jadx. The next step is to convert it to a jar and then decompile it again.

This time the decompilation succeeds, so the next step is to understand the program's code.

By hooking each function with a Frida script and reading the decompiled source, we can see that keretaArgoNgawi XORs the input from index arg1 to index arg2 and then compares the result with a static value.
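A minimal Python sketch of the kind of check described for keretaArgoNgawi. The function names here are hypothetical, and treating both bounds as inclusive is an assumption, since the decompiled code is not shown.

```python
from functools import reduce
from operator import xor


def xor_range(data: bytes, start: int, end: int) -> int:
    """XOR together data[start..end]; both bounds are assumed to be inclusive."""
    return reduce(xor, data[start:end + 1], 0)


def check_range(data: bytes, start: int, end: int, expected: int) -> bool:
    """Mimic the comparison of the XOR result against a static value."""
    return xor_range(data, start, end) == expected
```

Each call to the real function then contributes one constraint of the form "XOR of these input bytes equals this constant".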

So the next step is to create a solver using z3.
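The original solver used z3. The sketch below swaps in a different, dependency-free technique to show why such constraints are solvable: XOR-range constraints are linear, so they can be rewritten over prefix XORs (`P[end+1] ^ P[start] = value`) and propagated from `P[0] = 0`. The constraint list is generated from a made-up toy secret, not from the challenge, and the inclusive bounds are an assumption.

```python
from collections import defaultdict, deque


def solve_xor_ranges(length, constraints):
    """Recover bytes from (start, end, value) constraints, end inclusive."""
    graph = defaultdict(list)
    for start, end, value in constraints:
        # XOR of data[start..end] equals P[end + 1] ^ P[start].
        graph[start].append((end + 1, value))
        graph[end + 1].append((start, value))
    prefix = {0: 0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for other, value in graph[node]:
            candidate = prefix[node] ^ value
            if other not in prefix:
                prefix[other] = candidate
                queue.append(other)
            elif prefix[other] != candidate:
                raise ValueError("contradictory constraints")
    if len(prefix) != length + 1:
        raise ValueError("constraints do not determine every byte")
    return bytes(prefix[i] ^ prefix[i + 1] for i in range(length))


def xor_of(data, start, end):
    """XOR of data[start..end], end inclusive (helper for the toy data)."""
    acc = 0
    for byte in data[start:end + 1]:
        acc ^= byte
    return acc


# Toy data only: constraints derived from a made-up secret.
toy_secret = b"toy{xor}"
toy_constraints = [(0, 0, xor_of(toy_secret, 0, 0))] + [
    (i - 1, i, xor_of(toy_secret, i - 1, i)) for i in range(1, len(toy_secret))
]
recovered = solve_xor_ranges(len(toy_secret), toy_constraints)
```

A z3 model handles the same constraints without needing them to chain together this neatly, which is a reasonable reason to prefer it when the constraint structure is unknown.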

Flag: ITSEC{K3p1Tin9_45l1_C1r3Bon_C1k}

qengine

Description

-

Solution

We have two solutions for this challenge, each described in its own section below:

  • Understanding behavior through dynamic analysis

  • Dumping the opcode by modifying QuickJS code

Understanding behavior through dynamic analysis

We are given emu.py and ELF files.

From emu.py, we know that the program writes the flag to address 0x57b0b3, so the binary validates a flag located at a hardcoded address.

Next, decompiling the binary with IDA reveals some useful information in its strings.

Searching for quickjs reveals two repositories: quickjs-ng and Bellard's original quickjs. We first compiled quickjs-ng and noticed that the structure of its functions differed significantly from the provided binary. So we compiled from the bellard repository instead, after downloading the version matching the one listed in the strings, 2024-01-13.

Build quickjs and compile our own script into a standalone executable, the same way v9 was generated.

Next, when we decompile test_program, we see a code structure similar to the original. We therefore created a signature from the test_program executable, but when we applied it to v9, we found many mismatches.

So the next step is to manually recover function names based on the code structure and the error strings present in each function.

For example, in the image above, function names with a blue background come from the signature, and those with a black background are functions we renamed ourselves. The next step is dynamic analysis, which gave us the following information.

  • There is a conversion from decimal to binary

  • There is a conversion from binary to hex

  • There is a conversion from hex to decimal

  • There are several operations that change the value for each index

  • There is an array construction
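As a small illustration of the conversion chain observed above (decimal to binary, binary to hex, hex back to decimal), here is a Python sketch. The 8-bit width and the helper names are assumptions for illustration; the actual widths and the per-index operations in the challenge are not shown here.

```python
def to_binary(value: int, width: int = 8) -> str:
    """Decimal to a zero-padded binary string (width is an assumed parameter)."""
    return format(value, f"0{width}b")


def binary_to_hex(bits: str) -> str:
    """Binary string to a lowercase hex string."""
    return format(int(bits, 2), "x")


def hex_to_decimal(hex_digits: str) -> int:
    """Hex string back to a decimal integer."""
    return int(hex_digits, 16)


# Round trip for one input character.
bits = to_binary(ord("I"))
hex_digits = binary_to_hex(bits)
value = hex_to_decimal(hex_digits)
```

Watching for these intermediate string forms in the debugger is one way to tell which stage of the pipeline a breakpoint has landed in.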

Here are some breakpoints that I used for debugging.

With this information we can dump the values at each step by setting breakpoints on the js_string_to_bigint and js_array_push functions, which gives the following output.

While some steps may be missing from each dump, we can at least see a pattern in the known steps. In the final values, the first 6 indexes are the same, namely 113, 36, 205, 101, 14, 177, but the values at the following indexes differ. The value at the last index also differs, even though our inputs end with the same character, which is }.

From here we can see that the input bytes are not processed completely independently: there is a relationship between index i and index i+1, and in the first array index 0 carries 1 bit from the last input index. Even so, we can conclude that a per-byte brute force is feasible. The idea is to take the final value of the whole process and compare it with the correct value. The question is, where is the correct value? We checked the bytecode that we had dumped and found an interesting sequence:
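The per-byte brute force idea can be sketched as follows. This is not the original solver: `toy_transform` is a hypothetical stand-in for running the binary and reading its final array, built so that each output byte depends on its own input byte plus one bit of the previous byte (wrapping around, so index 0 depends on the last byte). The known trailing `}` is what makes index 0 solvable.

```python
import string


def toy_transform(data: bytes) -> list:
    """Hypothetical oracle: byte i is mixed with the low bit of byte i-1 (wrapping)."""
    return [data[i] ^ ((data[i - 1] & 1) << 7) for i in range(len(data))]


def brute_force(target, known_last=ord("}"), alphabet=string.printable.encode()):
    """Recover the input one byte at a time by comparing outputs index by index."""
    n = len(target)
    guess = bytearray(b"?" * n)
    guess[-1] = known_last  # the flag is known to end with "}"
    for i in range(n - 1):
        for candidate in alphabet:
            guess[i] = candidate
            # Output i only depends on bytes that are already fixed at this point.
            if toy_transform(bytes(guess))[i] == target[i]:
                break
        else:
            raise ValueError(f"no candidate matches at index {i}")
    if toy_transform(bytes(guess)) != list(target):
        raise ValueError("final check failed")
    return bytes(guess)


toy_flag = b"TOY{per_byte}"
recovered_flag = brute_force(toy_transform(toy_flag))
```

With the real binary, each call to the oracle is a full debugger run, so the cost is roughly the alphabet size times the flag length.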

From there, we realized that the correct final value was hardcoded. So, we tried dumping it and validating it with qjsc.

Create a file test.js as follows

and compile it with qjsc.

We can see that the result matches the one in bytecode.bin, the challenge bytecode. For the final step, because I forgot this was a Linux challenge, I used IDAPython to automate the solution, which of course took longer than gdb scripting would have. Here's our solver:

Flag: ITSEC{th!s_1s_4n_0pt1m!z33d_v8r5i0n}

Dumping the opcode by modifying QuickJS code

Open quickjs.c and we can see that there is a function named js_dump_function_bytecode, as follows:

If we look for references to that function, we find the following code in js_create_function:

From js_dump_function_bytecode and its reference, we know that this function dumps the opcodes of quickjs. It is called by default when we run a JavaScript file through qjs, provided the DUMP_BYTECODE option was enabled during compilation. So let's modify the Makefile to create a qjs-debug binary.

Then run the command below:

After the compilation succeeds, create a JavaScript file such as coba.js with the following contents:

When we run that JavaScript file with qjs-debug, we see the following output:

Now we have confirmed that we are able to dump quickjs opcodes. The question is how to dump the opcodes of the given challenge. During the competition we also tried creating our own quickjs wrapper, compiling the code, and running the executable, but it failed with a segmentation fault.

Looking at the crash, we can see the following error:

We can see that the error is in free_bytecode_atoms, so it is something about atoms!

If we look closer at js_dump_function_bytecode, we see that it prints more than just bytecode: there are also atoms, vardefs, closure, cpool, and so on. We can simplify this terminology as follows:

  • atom

    • something like a dictionary of interned strings, used by quickjs as an optimization technique

  • vardefs

    • variable declarations of the current function/scope

  • closure

    • variables captured from outside the function

  • cpool

    • constant pool, stores the values used in the function

  • bytecode

    • bytes that represent the logic of the code (opcodes)

So the previous issue was caused by atoms, most likely an atom mismatch: for example, we want to free the atom at index 0x1337 but there is no key 0x1337 in the table, so what should be freed?
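To make the atom mismatch concrete, here is a toy model of an atom table in Python. It is a deliberately simplified sketch, not QuickJS's implementation (which uses C arrays and hash tables); the class and method names are hypothetical. In the C code a bad index leads to invalid memory access rather than a clean exception, which is consistent with the segmentation fault seen above.

```python
class AtomTable:
    """Toy interned-string table: each distinct name gets one integer index."""

    def __init__(self):
        self._by_index = {}
        self._by_name = {}
        self._refcount = {}
        self._next_index = 1

    def intern(self, name: str) -> int:
        """Return the index for name, creating it on first use."""
        if name in self._by_name:
            index = self._by_name[name]
            self._refcount[index] += 1
            return index
        index = self._next_index
        self._next_index += 1
        self._by_index[index] = name
        self._by_name[name] = index
        self._refcount[index] = 1
        return index

    def name_of(self, index: int) -> str:
        return self._by_index[index]

    def free(self, index: int) -> None:
        """Drop one reference; fail loudly if the index was never interned."""
        if index not in self._by_index:
            raise KeyError(f"atom {index:#x} is not in the table")
        self._refcount[index] -= 1
        if self._refcount[index] == 0:
            del self._by_name[self._by_index.pop(index)]
            del self._refcount[index]
```

Bytecode compiled against one table refers to atoms by index, so running it against a different table makes those indexes point at the wrong entries or at nothing at all.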

At this point we know that the atoms and bytecode are essential, but vardefs, closure, and cpool can be faked. How? We know that js_dump_function_bytecode is called when we run the qjs binary, and that the dumped opcodes are derived from the given JavaScript file, which in the previous step was coba.js. So if we provide a valid coba.js with fake vardefs, closure, and cpool, it will still produce valid opcodes, just with "generic" information about variables.

Now we need to dump the bytecode and atoms first, so let's go back to the original binary. By diffing the binary against our compiled quickjs, we find the right address for dumping the bytecode:

  • v27 is a pointer to bytecode

Looking at the JSFunctionBytecode struct in our compiled binary, we can see that the bytecode pointer is at offset 0x20 and its length is at offset 0x28.

By setting a breakpoint at 0x40E75C, we can read two values directly: *(rcx+0x20) for the bytecode buffer and *(rcx+0x28) for the bytecode length. Let's create a gdb script to dump them automatically:
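The read itself can be sketched in plain Python against a memory snapshot. This is not the gdb script from the writeup: `MemorySnapshot` and `dump_bytecode` are hypothetical names, the snapshot is assumed to be one flat region, and the length is assumed to be a 32-bit little-endian integer. Only the two offsets, 0x20 and 0x28, come from the text above.

```python
import struct

BYTECODE_PTR_OFFSET = 0x20  # pointer to the bytecode buffer
BYTECODE_LEN_OFFSET = 0x28  # bytecode length (assumed 32-bit)


class MemorySnapshot:
    """Hypothetical flat snapshot: data holds the bytes mapped at base."""

    def __init__(self, base: int, data: bytes):
        self.base = base
        self.data = bytes(data)

    def read(self, address: int, size: int) -> bytes:
        start = address - self.base
        if start < 0 or start + size > len(self.data):
            raise ValueError(f"read outside snapshot: {address:#x}")
        return self.data[start:start + size]

    def read_u64(self, address: int) -> int:
        return struct.unpack("<Q", self.read(address, 8))[0]

    def read_u32(self, address: int) -> int:
        return struct.unpack("<I", self.read(address, 4))[0]


def dump_bytecode(mem: MemorySnapshot, function_addr: int) -> bytes:
    """Follow the pointer at +0x20 and read as many bytes as +0x28 says."""
    buffer_ptr = mem.read_u64(function_addr + BYTECODE_PTR_OFFSET)
    length = mem.read_u32(function_addr + BYTECODE_LEN_OFFSET)
    return mem.read(buffer_ptr, length)
```

In a debugger, `function_addr` would be the value of rcx at the breakpoint, and the two reads map directly onto memory-read commands.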

After getting the list of bytecodes, we also need to dump the atoms. Looking at quickjs.c, we have the following function:

JS_AtomToCString can be used to print an atom's value given its index and a JSContext, so we need to find where JS_AtomToCString is in the target binary and where we can get a valid JSContext pointer.

Previously we recovered the function name for JS_AtomToValue from the debug/error string "__JS_AtomToValue". Looking at the XREFs to JS_AtomToValue, we can easily identify which one is the JS_AtomToCString function, and we found that 0x433660 is the right one.

Back to JS_CallInternal: we know that its first argument is a valid JSContext pointer, but which register holds it? Looking at the function definition, we can see that arg1 is in RDI. Set a breakpoint at the first call to JS_CallInternal and gather all the data needed to print the atoms.

Now that we have all the atoms, let's go back to the source code, because we are going to patch it.

The following explains each modification in quickjs.c:

  • Hook print_atom with our own function, which uses the values from dump_atom.py

  • Delete some code so that it only prints our target opcodes

  • Replace the original bytecode buffer with the target buffer

    • This code is called multiple times (our fake code, coba.js, has multiple functions). Because we faked the variables and constants in the main function, we dump the code only when the main function is being processed

Continue to qjs.c

  • Add option to read bytecode from a file

Now, let's create the fake JavaScript code. The following is my example:

Previously we obtained the list of bytecodes; let's write them to the dumps directory.
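A minimal sketch of this step in Python, assuming the dumped bytecodes are already available as a list of byte strings. The helper name is hypothetical, and the `function_<n>.bin` naming with numbering from 1 is an assumption modeled on the file names used with the patched qjs.

```python
from pathlib import Path


def write_dumps(bytecodes, directory="dumps"):
    """Write each bytecode blob to <directory>/function_<n>.bin, numbering from 1."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, blob in enumerate(bytecodes, start=1):
        path = out_dir / f"function_{index}.bin"
        path.write_bytes(bytes(blob))
        paths.append(path)
    return paths
```

Keeping one file per function makes it easy to point the patched binary at a single function at a time.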

Build the executable using make qjs-debug and run it with the following command:

Just change the target bytecode to disassemble another function, such as dumps/function_2.bin and so on. After dumping all the bytecode, we get the following opcodes:

Next, reconstruct the JavaScript code from the quickjs opcodes.

Last, we just need to reverse the process. One round of the loop can be written as the following equation:

$$y_{bits} = x_{bits} \oplus \mathrm{ROTR1}(x_{bits}) \oplus k_{bits}$$

Here x is our input to each round, y is the output of each round, and k is the key. Our target is to recover x, so let's eliminate k first:

$$y_{bits} \oplus k_{bits} = x_{bits} \oplus \mathrm{ROTR1}(x_{bits})$$

Now only the operation on x is left. Because the operation XORs x with itself rotated right by 1 bit, choosing the first bit determines all the remaining bits, so there are only two candidates, one for each choice of the first bit (0 or 1). Example:
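The bit-by-bit recovery can be sketched as below. This is an illustrative sketch, not the original script: the function names are hypothetical, bits are held as Python lists with index 0 as the first bit, and the 8-bit example value is made up. With z = x XOR ROTR1(x), each bit satisfies z[i] = x[i] XOR x[i-1], so a guess for x[0] fixes everything else.

```python
def rotr1(bits):
    """Rotate a bit list right by one position (last bit wraps to the front)."""
    return bits[-1:] + bits[:-1]


def xor_bits(a, b):
    return [p ^ q for p, q in zip(a, b)]


def invert_xor_rotr1(z, first_bit):
    """Recover x from z = x XOR ROTR1(x) given a guess for x[0]; None if inconsistent."""
    x = [first_bit]
    for i in range(1, len(z)):
        # z[i] = x[i] ^ x[i-1], so x[i] follows from the previous bit.
        x.append(z[i] ^ x[i - 1])
    # Bit 0 of ROTR1(x) is the last bit of x, so z[0] must equal x[0] ^ x[-1].
    if z[0] != x[0] ^ x[-1]:
        return None
    return x


# Made-up 8-bit example.
x_example = [1, 0, 1, 1, 0, 0, 1, 0]
z_example = xor_bits(x_example, rotr1(x_example))
candidates = [invert_xor_rotr1(z_example, bit) for bit in (0, 1)]
```

The two candidates are bitwise complements of each other, because complementing x leaves x XOR ROTR1(x) unchanged.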

So with 6 rounds there are 2^6 = 64 possibilities in total, and we can easily detect the right one. The following is my script to solve the challenge:
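To illustrate the 64-candidate enumeration, here is a toy sketch that is separate from the original solver. Everything concrete in it is an assumption made for the demonstration: the 64-bit block, the made-up plaintext, the made-up round keys, and the known-prefix filter used to pick the right candidate.

```python
def rotr1(bits):
    """Rotate a bit list right by one position."""
    return bits[-1:] + bits[:-1]


def xor_bits(a, b):
    return [p ^ q for p, q in zip(a, b)]


def forward_round(x, k):
    """One round: y = x XOR ROTR1(x) XOR k."""
    return xor_bits(xor_bits(x, rotr1(x)), k)


def invert_round(y, k):
    """Return every x with forward_round(x, k) == y (at most two)."""
    z = xor_bits(y, k)
    results = []
    for first_bit in (0, 1):
        x = [first_bit]
        for i in range(1, len(z)):
            x.append(z[i] ^ x[i - 1])
        if z[0] == x[0] ^ x[-1]:  # wrap-around consistency check
            results.append(x)
    return results


def invert_all(y, keys):
    """Undo the rounds from last to first, keeping every surviving branch."""
    candidates = [y]
    for k in reversed(keys):
        candidates = [x for c in candidates for x in invert_round(c, k)]
    return candidates


def to_bits(data):
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def to_bytes(bits):
    return bytes(int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8))


# Toy data: made-up plaintext and keys, 6 rounds as in the text.
toy_plain = b"TOY{ok}!"
toy_keys = [to_bits(bytes((17 * r + 3 * i) & 0xFF for i in range(8))) for r in range(6)]
state = to_bits(toy_plain)
for key in toy_keys:
    state = forward_round(state, key)

all_candidates = [to_bytes(c) for c in invert_all(state, toy_keys)]
matches = [c for c in all_candidates if c.startswith(b"TOY{")]
```

Filtering on a known prefix works here because the flag format is fixed; any other recognizable property of the plaintext, such as being printable, would serve the same purpose.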

Flag: ITSEC{th!s_1s_4n_0pt1m!z33d_v8r5i0n}
