Hijacking the VM’s JIT

Ever wonder what happens after HotSpot compiles your hot methods to native code? Today I’ll be exploring a silly little trust assumption in the JVM: once code is JIT compiled, the VM doesn’t verify that it actually matches the bytecode. We’ll abuse this to make a Java method do whatever we want :3

The Trust Model

When you write Java code, it goes through several stages before actually executing. Your .java files compile to bytecode, which the JVM initially interprets. Once a method gets “hot” (called frequently enough), HotSpot’s JIT compiler kicks in and generates native machine code for your platform

Here’s the interesting part: after compilation, the JVM stores a pointer from the method’s internal structure to the compiled code and just… trusts it. Nothing ever verifies that the native code still corresponds to the original bytecode

Setting Up Our Target

First, we need a method that’ll definitely get JIT compiled. Something expensive enough to trigger compilation but simple enough to understand:

package ivy.april;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        System.out.println("Starting JIT hijack demo...");
        System.out.println("Warming up JIT...");
        
        for (int i = 0; i < 20000; i++) {
            expensiveMethod(42);
        }
        
        System.out.println("JIT compiled. Running...\n");
        
        int callCount = 0;
        int lastResult = 0;
        
        while (true) {
            int result = expensiveMethod(42);
            callCount++;
            
            if (result != lastResult) {
                System.out.println("[!] Result changed: " + lastResult + " -> " + result);
                lastResult = result;
            }
            
            if (callCount % 500 == 0) {
                System.out.println("[" + callCount + " calls] expensiveMethod(42) = " + result);
            }
            
            Thread.sleep(10);
        }
    }

    public static int expensiveMethod(int x) {
        int acc = x;
        for (int i = 0; i < 1000; i++) {
            acc ^= (acc << 3) + i;
            acc = Integer.rotateLeft(acc, 7);
        }
        return acc;
    }
}

The loop runs 1000 iterations with bit manipulation, enough to be “expensive” and trigger JIT compilation quickly. We’re also printing the result periodically so we can see when our hijack works !!

Run it with the following flags:

java -XX:+PrintCompilation -XX:CompileCommand=dontinline,ivy/april/Main.expensiveMethod -cp build/classes/java/main ivy.april.Main

The -XX:+PrintCompilation flag shows us when methods get compiled, and dontinline prevents HotSpot from inlining our target method into main(), which would make it more annoying to find

You’ll see output like this as HotSpot compiles our method:

  1000  117 %     3       ivy.april.Main::expensiveMethod @ 4 (34 bytes)
  1030  118       3       ivy.april.Main::expensiveMethod (34 bytes)
  1646  119 %     4       ivy.april.Main::expensiveMethod @ 4 (34 bytes)
  1788  120       4       ivy.april.Main::expensiveMethod (34 bytes)

The % means it’s an OSR (on-stack replacement) compilation, and the numbers 3 and 4 are compilation tiers. Tier 3 is C1 with full profiling, and tier 4 is C2, the optimising compiler
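If you'd rather not eyeball those columns, here's a minimal sketch of a parser for lines of this shape. CompilationLine is a hypothetical helper (not a JDK API), and the column layout in the regex is an assumption based on the four lines above, so other flag or tier combinations may need tweaks:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any JDK API: pulls the fields out of one
// -XX:+PrintCompilation line of the shape shown above.
final class CompilationLine {
    // Assumed layout: timestamp, compile id, flag characters, tier, method,
    // optional "@ bci", then "(N bytes)".
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+)\\s+(\\d+)\\s([%s!bn ]*)(\\d)\\s+(\\S+)(?:\\s+@\\s+\\d+)?\\s+\\(\\d+ bytes\\).*$");

    final long timestamp;   // first column
    final int compileId;    // second column
    final boolean osr;      // true when the '%' flag is present
    final int tier;         // 3 or 4 in the output above
    final String method;    // e.g. ivy.april.Main::expensiveMethod

    private CompilationLine(Matcher m) {
        timestamp = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        osr = m.group(3).contains("%");
        tier = Integer.parseInt(m.group(4));
        method = m.group(5);
    }

    static CompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognised line: " + line);
        }
        return new CompilationLine(m);
    }

    public static void main(String[] args) {
        CompilationLine c = parse(
            "  1646  119 %     4       ivy.april.Main::expensiveMethod @ 4 (34 bytes)");
        System.out.println(c.method + " tier=" + c.tier + " osr=" + c.osr);
    }
}
```

The line we actually care about is the non-OSR tier 4 one, since that's the version regular calls to expensiveMethod end up in.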

Understanding HotSpot’s Internal Structures

Before we can hijack anything, we need to understand how HotSpot organises compiled code. There are three key structures!

Method*

This is HotSpot’s internal representation of a Java method. It lives in Metaspace and contains everything the VM needs to know about a method: its bytecode, access flags, constant pool references, and most importantly, a _code field that points to the compiled native code

Looking at src/hotspot/share/oops/method.hpp, we can see the following fields:

// Entry point for calling from compiled code, to compiled code if it exists
// or else the interpreter.
volatile address _from_compiled_entry;     // Cache of: _code ? _code->entry_point() : _adapter->c2i_entry()
// The entry point for calling both from and to compiled code is
// "_code->entry_point()".  Because of tiered compilation and de-opt, this
// field can come and go.  It can transition from null to not-null at any
// time (whenever a compile completes).  It can transition from not-null to
// null only at safepoints (because of a de-opt).
nmethod* volatile _code;                   // Points to the corresponding piece of native code

When a method is only being interpreted, _code is null. Once JIT compilation happens, this pointer gets set to the newly created nmethod. Notice the comment: the VM knows this field “can come and go”, but there’s no integrity checking!
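To make that concrete, here's a toy model in plain Java. It is only an analogy: ToyMethod and its fields are made up to mirror the comments in method.hpp, and real HotSpot deals in raw addresses rather than lambdas. The point is just that whatever sits behind the pointer gets called, no questions asked:

```java
import java.util.function.IntUnaryOperator;

// Toy model only: the names mirror the HotSpot comments above, but none of
// this is real VM code.
final class ToyMethod {
    private final IntUnaryOperator interpreterEntry; // stands in for the interpreter path
    private volatile IntUnaryOperator code;          // stands in for _code, null until "compiled"

    ToyMethod(IntUnaryOperator interpreterEntry) {
        this.interpreterEntry = interpreterEntry;
    }

    // Mirrors the cached expression: _code ? _code->entry_point() : _adapter->c2i_entry()
    IntUnaryOperator fromCompiledEntry() {
        IntUnaryOperator c = code;
        return c != null ? c : interpreterEntry;
    }

    // Nothing here compares the new code against the original behaviour.
    void installCode(IntUnaryOperator compiled) {
        code = compiled;
    }

    int invoke(int x) {
        return fromCompiledEntry().applyAsInt(x);
    }

    public static void main(String[] args) {
        ToyMethod m = new ToyMethod(x -> x * 2);
        System.out.println(m.invoke(21)); // interpreted path: 42
        m.installCode(x -> x * 2);        // an honest "JIT" result
        System.out.println(m.invoke(21)); // still 42
        m.installCode(x -> 1337);         // swapped-in code, accepted without a check
        System.out.println(m.invoke(21)); // 1337
    }
}
```

The real hijack later in this post doesn't swap the pointer; it overwrites the bytes the pointer leads to, but the missing check is the same.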

nmethod

The nmethod structure represents a compiled method in the CodeCache. It’s defined in src/hotspot/share/code/nmethod.hpp and contains the actual machine code, metadata for garbage collection, deoptimisation info, exception handling tables, and references back to the Method*

The structure has entry points that the VM jumps to when calling the method:

// offsets for entry points
address  _osr_entry_point;       // entry point for on stack replacement
uint16_t _entry_offset;          // entry point with class check
uint16_t _verified_entry_offset; // entry point without class check

And there’s helper methods to get the actual addresses:

address entry_point() const          { return code_begin() + _entry_offset;          } // normal entry point
address verified_entry_point() const { return code_begin() + _verified_entry_offset; }   // if klass is correct

The verified_entry_point is what gets called when the VM already knows the receiver type is correct. Static methods have no receiver to check, so this is the one we care about here

CodeCache

All compiled code lives in a special region of memory called the CodeCache. It’s executable memory managed by the JVM, separate from the Java heap. You can see its bounds with -XX:+PrintCodeCache
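You can also peek at the CodeCache from inside Java through the standard java.lang.management API. A small sketch: it reports sizes rather than addresses, and the "Code" name filter is an assumption about how HotSpot names these pools (something like CodeCache or CodeHeap '...'), so other VMs may call them something else:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class CodeCachePools {
    // Assumption: code cache pools are non-heap pools with "Code" in their name.
    static List<MemoryPoolMXBean> find() {
        List<MemoryPoolMXBean> pools = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP && pool.getName().contains("Code")) {
                pools.add(pool);
            }
        }
        return pools;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : find()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-40s used=%dK committed=%dK%n",
                pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024);
        }
    }
}
```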

Finding Our Target

I used HSDB (HotSpot Debugger) to locate our compiled method. After attaching to the running JVM process with jhsdb hsdb, I navigated to the Class Browser and found ivy.april.Main:

public class ivy.april.Main @0x000001e663000800
  Methods:
  * public static int expensiveMethod(int) @0x000001e6a3400458

That @0x000001e6a3400458 is our Method* pointer. Clicking on it reveals more details, including the compiled code:

Compiled Code:
[Entry Point] [Verified Entry Point] 0x000001e64d1e39a0: sub $0x18,%rsp

So we have:

Method* at 0x000001e6a3400458
nmethod at 0x000001e64d1e3810
Entry point at 0x000001e64d1e39a0
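A quick sanity check on those numbers: the entry point should sit a little way past the start of the nmethod, since the instructions come after the nmethod's own header data. A tiny sketch using the addresses from this particular run (EntryOffset is just a throwaway name, and your addresses will differ):

```java
final class EntryOffset {
    // Distance in bytes from a base address to an address inside it.
    static long offsetFrom(long base, long address) {
        return address - base;
    }

    public static void main(String[] args) {
        long nmethod = 0x000001e64d1e3810L;    // nmethod address from HSDB
        long entryPoint = 0x000001e64d1e39a0L; // verified entry point from HSDB
        long delta = offsetFrom(nmethod, entryPoint);
        System.out.println("entry point is 0x" + Long.toHexString(delta)
            + " (" + delta + ") bytes past the nmethod start");
    }
}
```

Here that comes out to 0x190, or 400 bytes.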

Analysing the Compiled Code

Let’s look at what HotSpot actually generated. Here’s the beginning of the compiled expensiveMethod:

0x000001e64d1e39a0:    sub    rsp, 0x18
0x000001e64d1e39a7:    mov    qword ptr [rsp+0x10], rbp
0x000001e64d1e39ac:    cmp    dword ptr [r15+0x20], 1
0x000001e64d1e39b4:    jne    0x000001e64d1e3b25
0x000001e64d1e39ba:    lea    r10d, [rdx*8]
0x000001e64d1e39c2:    xor    r10d, edx
0x000001e64d1e39c5:    rorx   eax, r10d, 0x19

Let’s break this down!

Stack frame setup:

sub    rsp, 0x18                  ; Allocate 24 bytes of stack space
mov    qword ptr [rsp+0x10], rbp  ; Save frame pointer

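nmethod entry barrier:

cmp    dword ptr [r15+0x20], 1    ; Compare a thread-local value against the guard baked into the instruction
jne    0x000001e64d1e3b25         ; Jump to slow path if they differ

In HotSpot on x64, r15 is reserved as the thread pointer. This looks a lot like a safepoint poll, but a compare against an immediate right at method entry is the nmethod entry barrier. If the thread-local value at offset 0x20 doesn’t match the guard value, the method counts as “armed” and takes the slow path, which lets the VM (typically the garbage collector) process the nmethod before it runs again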

The actual computation:

lea    r10d, [rdx*8]          ; r10 = x * 8 (the << 3)
xor    r10d, edx              ; r10 ^= x
rorx   eax, r10d, 0x19        ; rotate right by 25 (same as left by 7)

The JIT has unrolled our loop :3 Instead of 1000 iterations with a branch each, it does 16 iterations per trip to cut branch overhead. All those rorx instructions are our Integer.rotateLeft() calls, and the lea/xor pairs are the acc ^= (acc << 3) + i operations
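If the rotate-right-by-25 trick feels suspicious, it's easy to check in Java. A small sketch (StepCheck is a made-up name) that writes one loop iteration both ways, once as the source does and once the way the lea/xor/rorx sequence does:

```java
final class StepCheck {
    // One loop iteration exactly as written in expensiveMethod.
    static int stepJava(int acc, int i) {
        acc ^= (acc << 3) + i;
        return Integer.rotateLeft(acc, 7);
    }

    // The same step spelled the way the compiled code does it.
    static int stepAsm(int acc, int i) {
        int t = i + acc * 8;               // lea: acc * 8 + i in one instruction
        t ^= acc;                          // xor with the old acc
        return Integer.rotateRight(t, 25); // rorx by 0x19; 32 - 25 = 7
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 42, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        boolean same = true;
        for (int acc : samples) {
            for (int i = 0; i < 1000; i++) {
                same &= stepJava(acc, i) == stepAsm(acc, i);
            }
        }
        System.out.println("both spellings agree: " + same);
    }
}
```

A 32-bit rotate right by 25 lands every bit in the same place as a rotate left by 7, and multiplying by 8 wraps the same way as shifting left by 3, so the two versions always agree.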

Here’s the loop structure:

0x000001e64d1e39e0:    mov    r8d, r10d
0x000001e64d1e39e3:    lea    r10d, [r8+rax*8]
0x000001e64d1e39e7:    xor    r10d, eax
                                                  ; ... 14 more unrolled iterations ...
0x000001e64d1e3ad2:    cmp    r10d, 0x3e1         ; Compare to 993
0x000001e64d1e3ad9:    jl     0x000001e64d1e39e0  ; Loop back if < 993

The loop processes 16 iterations at a time, then checks if we’ve hit 993 (0x3E1). A cleanup loop handles the remaining iterations
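Here's roughly what that shape looks like written back out as Java. This is my reading of the disassembly rather than anything HotSpot prints: I'm assuming the first iteration (i = 0) is peeled off before the main loop, and the 16 steps are written as a small inner loop for brevity where the JIT emits them inline:

```java
final class UnrollSketch {
    static int step(int acc, int i) {
        acc ^= (acc << 3) + i;
        return Integer.rotateLeft(acc, 7);
    }

    // The original loop, for comparison.
    static int reference(int x) {
        int acc = x;
        for (int i = 0; i < 1000; i++) {
            acc = step(acc, i);
        }
        return acc;
    }

    // Assumed structure: peeled first iteration, 16-wide main loop, cleanup loop.
    static int unrolled(int x) {
        int acc = step(x, 0);              // i = 0, done before the loop
        int i = 1;
        do {
            for (int k = 0; k < 16; k++) { // 16 steps per trip
                acc = step(acc, i + k);
            }
            i += 16;
        } while (i < 993);                 // cmp 0x3e1 / jl
        for (; i < 1000; i++) {            // leftovers: i = 993..999
            acc = step(acc, i);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(reference(42) == unrolled(42));
    }
}
```

The counts add up: 1 peeled iteration, 62 trips of 16 (i = 1 to 992), and 7 cleanup iterations make 1000.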

Method epilogue:

0x000001e64d1e3afc:    add    rsp, 0x10                   ; Clean up stack
0x000001e64d1e3b00:    pop    rbp                         ; Restore frame pointer
0x000001e64d1e3b01:    cmp    rsp, qword ptr [r15+0x450]  ; Safepoint poll (thread-local polling word)
0x000001e64d1e3b08:    ja     0x000001e64d1e3b0f          ; Jump to the handler if the poll is armed
0x000001e64d1e3b0e:    ret                                ; Return to caller

The cmp rsp, [r15+0x450] looks like a stack overflow check, but it’s the safepoint poll that HotSpot places at method return. The thread structure holds a polling word, and when the VM wants this thread to stop (for garbage collection, say) it changes that word so the comparison trips and the method jumps to a handler instead of returning.

The Hijack

Now for the cool part! All we need to do is overwrite the first few bytes at the entry point. The JVM will jump to this address expecting compiled code, and execute whatever we put there.

Let’s replace the method with something simpler:

mov eax, 1337    ; B8 39 05 00 00
ret              ; C3

That’s all it takes. The method now ignores its input, skips all the computation, and returns 1337 :3
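Those six bytes are simple enough to build by hand, but here's a small sketch that generates them for any return value. Payload is a hypothetical helper; the encoding itself is just the B8 opcode, a little-endian 32-bit immediate, and C3:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Payload {
    // Encodes "mov eax, imm32; ret" for x86-64.
    static byte[] returnConstant(int value) {
        return ByteBuffer.allocate(6)
            .order(ByteOrder.LITTLE_ENDIAN)
            .put((byte) 0xB8) // opcode: mov eax, imm32
            .putInt(value)    // the immediate, least significant byte first
            .put((byte) 0xC3) // ret
            .array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(returnConstant(1337))); // B8 39 05 00 00 C3
    }
}
```

1337 is 0x539, which is why the immediate shows up as 39 05 00 00.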

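Using a DLL injected into the VM process, we can do:

#include <windows.h>
#include <cstdint>
#include <cstring>

// entryAddr is the verified entry point we found with HSDB
uint8_t thingy[] = {0xB8, 0x39, 0x05, 0x00, 0x00, 0xC3}; // mov eax, 1337; ret

DWORD old;
VirtualProtect((void*)entryAddr, sizeof(thingy), PAGE_EXECUTE_READWRITE, &old);
memcpy((void*)entryAddr, thingy, sizeof(thingy));
VirtualProtect((void*)entryAddr, sizeof(thingy), old, &old);
FlushInstructionCache(GetCurrentProcess(), (void*)entryAddr, sizeof(thingy));

The VirtualProtect calls are a safety net: we make sure the page is writable, patch it, then restore the original protection. HotSpot on x64 usually keeps the CodeCache readable, writable and executable anyway, since it patches its own code all the time, so they may well be no-ops. FlushInstructionCache tells Windows that code in memory has changed. On Linux you’d use mprotect instead.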

The Result

After patching, the console shows:

[4500 calls] expensiveMethod(42) = -990666027
[5000 calls] expensiveMethod(42) = -990666027
[!] Result changed: -990666027 -> 1337
[5500 calls] expensiveMethod(42) = 1337
[6000 calls] expensiveMethod(42) = 1337

The [!] Result changed line appears the instant our patch hits. The VM is now executing our payload, completely unaware that the “compiled” method no longer matches its bytecode

Conclusion

We’ve seen how HotSpot’s trust model works: bytecode is verified, but compiled code is trusted. By locating the Method* structure, following the _code pointer to the nmethod, and patching the entry point, we can make any JIT-compiled Java method do whatever we want

If you’re asking why: I don’t think this is particularly useful for much, but I do think it’s pretty interesting :O

Thanks for reading! :3
