DO I FEEL LUCKY? Linux/Slotmachine

~ qkumba 2025

Funny story, I agreed to analyse this sample before I saw it. I expected x86
code, and that I would be done in a day, as has been the case previously,
because analysing x86 code is really easy for me. Instead, I got AARCH64. I am
familiar with 16- and 32-bit ARM, but 64-bit is sufficiently different that it
might as well be considered a new thing. Then some other setbacks IRL, and it
is now many weeks later that I'm finally done... but I enjoyed the process
thoroughly, and I learned lots of things along the way!

TAKING THE HIGH(BIT) ROAD

The first thing that the virus does after saving registers is to query two
architectural system registers. Specifically, the virus reads CTR_EL0 and
DCZID_EL0, and we are already on our way to the land of obfuscation. These two
registers have a shared property that is useful to the virus: bit 31 is defined
(as 1 in CTR_EL0, and 0 in DCZID_EL0). The virus adds the two register values
together and then isolates bit 31 to produce the constant 0x80000000. This
value is used heavily throughout the virus as the basis for forming many other
constants - shift it right a few times, add or subtract, shift it left, add
some more, and out comes a new value whose meaning is not immediately obvious.

For example, the initial stack setup looks like this:

; x0=80000000
ORR X0, XZR, X0,LSR#22 ; x0=0x200
ADD X1, X0, #0x20 ; x1=0x220
SUB SP, SP, X1 ; sp=sp-0x220
ORR X0, XZR, X0,LSL#22 ; x0=0x80000000 once again

Fortunately, it's merely an inconvenience solved by a calculator.
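
The calculator work can also be scripted. What follows is a minimal Python
sketch, not the virus code: the two register readings are made-up sample values
(only bit 31 matters), and the AND mask is an assumption about how the bit is
isolated, since that instruction is not shown here.

```python
MASK64 = (1 << 64) - 1

def lsr(value, shift):
    """Logical shift right on a 64-bit register value."""
    return (value & MASK64) >> shift

def lsl(value, shift):
    """Logical shift left, truncated to 64 bits."""
    return (value << shift) & MASK64

# Hypothetical sample readings; real values differ per CPU.
ctr_el0 = 0x84448004     # bit 31 set
dczid_el0 = 0x00000004   # bit 31 clear

base = (ctr_el0 + dczid_el0) & 0x80000000   # assumed isolation step

x0 = lsr(base, 22)       # ORR X0, XZR, X0,LSR#22
x1 = x0 + 0x20           # ADD X1, X0, #0x20
restored = lsl(x0, 22)   # ORR X0, XZR, X0,LSL#22

print(hex(base), hex(x0), hex(x1), hex(restored))
```

The printed values match the comments in the listing above: 0x200, 0x220, and
0x80000000 again after the final shift.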

The next thing that the virus does is issue a syscall. The virus uses the value
obfuscation for all of the syscall indices and their parameters, too:

MOV X8, X0 ; x8=0x80000000
ORR X8, XZR, X8,LSR#24 ; x8=0x80
SUB X8, X8, #0xB ; x8=0x75
ORR X2, XZR, X8,LSR#7 ; x2=0
ORR X1, XZR, X8,LSR#7 ; x1=0
ORR X3, XZR, X8,LSR#7 ; x3=0
ORR X0, XZR, X8,LSR#7 ; x0=0

Again, solved by a calculator, but it really slows down the analysis. This call
is "ptrace(PTRACE_TRACEME)", as an anti-debugging technique. The virus exits if
ptrace is active already, but an unintended side-effect is that it will exit if
ptrace is disallowed by removing "CAP_SYS_PTRACE" from the system capabilities.
Of course, disabling ptrace would break a collection of features so an
environment without it probably has other measures that would be hostile to a
virus in any case.
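
Replaying the listing in Python confirms the syscall index and arguments. This
is only a sketch: the function name is hypothetical, and the two named
constants (117 for ptrace in the AArch64 Linux syscall table, 0 for
PTRACE_TRACEME) come from the kernel headers, not from the sample.

```python
def ptrace_setup(x0=0x80000000):
    """Replay the listing above; returns (x8, x0, x1, x2, x3)."""
    x8 = x0            # MOV X8, X0
    x8 = x8 >> 24      # ORR X8, XZR, X8,LSR#24
    x8 = x8 - 0xB      # SUB X8, X8, #0xB
    x2 = x8 >> 7       # ORR X2, XZR, X8,LSR#7
    x1 = x8 >> 7       # ORR X1, XZR, X8,LSR#7
    x3 = x8 >> 7       # ORR X3, XZR, X8,LSR#7
    x0 = x8 >> 7       # ORR X0, XZR, X8,LSR#7
    return x8, x0, x1, x2, x3

SYS_PTRACE_AARCH64 = 117   # __NR_ptrace on AArch64 Linux
PTRACE_TRACEME = 0

regs = ptrace_setup()
print(regs)
```

Any value below 0x80 shifted right by seven bits is zero, which is how a single
register yields all four zero arguments.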

GARBAGE IN, GARBAGE OUT

The next curiosity is this instruction:

UBFM X28, X7, #0x3F, #0x13

It serves no purpose to the code, so why is it there? It turns out that it's
simply one of several garbage instructions throughout the code, intended to
slow the analysis further... or is it?

The virus continues by opening the current directory, but the construction is
also obfuscated:

ADR X1, 0x2C0
ADD X1, X1, #1

What's obfuscated about that? It's because the instruction at address 0x2C0 in
the virus code looks like this:

CMP W20, #0xB

It's not until you look at the opcode itself, and consider the "ADD #1"
instruction, that it becomes clear:

9F 2E 00 71

2E 00, there's our "." string, the parameter for the current directory.
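
A short Python sketch makes the trick reproducible. It encodes the "CMP"
instruction from its fields (the helper names are mine, not the virus
author's), lays the word out in little-endian order as it sits in memory, and
reads a C string starting one byte in.

```python
import struct

def encode_cmp_w_imm(rn, imm12):
    """Encode CMP Wn, #imm12 (an alias of SUBS WZR, Wn, #imm12)."""
    return 0x71000000 | (imm12 << 10) | (rn << 5) | 31

def c_string(buf, offset):
    """Read a NUL-terminated string starting at offset."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end]

word = encode_cmp_w_imm(20, 0xB)    # CMP W20, #0xB
raw = struct.pack("<I", word)       # little-endian, as stored in memory

print(raw.hex(), c_string(raw, 1))
```

The immediate and the register number were presumably chosen so that the
middle two bytes come out as 0x2E 0x00.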

If the virus can open the directory, then it reads some entries in a loop,
looking for regular files that it can open for read and write. When a file is
found, the virus reads the first five bytes. It checks for the "0x7F ELF"
signature, in an obfuscated way:

ADR X1, 0x608
ADD X1, X1, #2

where the instruction at address 0x608 in the virus code looks like this:

UBFM X28, X15, #0x3F, #0x11

but is encoded this way:

FC 45 7F D3

and then we can see the 0x7F and the 'E'. Similarly, the 'L' is hidden in
another instruction:

FC 4C 7F D3 UBFM X28, X7, #0x3F, #0x13

Ha! There's our "garbage" instruction from earlier! The 0x7F also appears here
but we know already that it is not the one that is used. The 'F' comes from
this instruction:

4C FC 46 D3 UBFM X12, X2, #6, #0x3F
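
To double-check the three encodings, here is a small Python sketch that decodes
the 64-bit "UBFM" fields and then picks the signature bytes out of the raw
opcodes. The decoder is a simplified one written for this purpose, and the byte
positions used for 'L' and 'F' are my reading of the opcodes, not something
confirmed from the virus code.

```python
import struct

def decode_ubfm64(raw):
    """Decode a 64-bit UBFM; returns (rd, rn, immr, imms)."""
    (word,) = struct.unpack("<I", raw)
    if (word & 0xFFC00000) != 0xD3400000:
        raise ValueError("not a 64-bit UBFM")
    return (word & 31, (word >> 5) & 31, (word >> 16) & 63, (word >> 10) & 63)

insn_a = bytes.fromhex("FC457FD3")   # UBFM X28, X15, #0x3F, #0x11
insn_b = bytes.fromhex("FC4C7FD3")   # UBFM X28, X7, #0x3F, #0x13
insn_c = bytes.fromhex("4CFC46D3")   # UBFM X12, X2, #6, #0x3F

# 0x7F and 'E' from the first opcode, 'L' from the second, 'F' from the third.
signature = bytes([insn_a[2], insn_a[1], insn_b[1], insn_c[2]])

print(decode_ubfm64(insn_a), decode_ubfm64(insn_b), decode_ubfm64(insn_c))
print(signature)
```

The 'E' and 'L' fall out of the imms and Rn fields, the 0x7F out of immr, and
the 'F' out of immr with the N bit above it.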

The virus checks that the found file is 64-bit but does not check the CPU,
perhaps assuming that any file on the system where the virus is running will
also be for that system. The endianness of the file is not checked, likely for
the same reason. They seem like reasonable assumptions. The virus also requires
that the file is more than 64 bytes long. When a file is found that matches the
criteria, the virus creates a memory-mapped version that is extended to include
the virus code rounded up to 128kb in size. It looks like the intention is to
ensure that the last segment in the file is extended to 64kb and then leave
room for the 64kb large virus segment. There is a corner case when the
alignment of the last segment in the file is not compatible with this
extension, but it's not interesting to consider. Let us focus on what the virus
does, not what it doesn't.
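
For reference, the header checks described above amount to something like the
following Python sketch. The function name is hypothetical, and the sketch
covers only the checks named in the text (signature, 64-bit class, minimum
size); like the virus, it ignores the machine type and the endianness.

```python
ELFCLASS64 = 2   # value of the EI_CLASS byte for 64-bit files

def passes_header_checks(data):
    """Mirror the described checks: signature, 64-bit class, size > 64."""
    return (len(data) > 64
            and data[:4] == b"\x7fELF"
            and data[4] == ELFCLASS64)

sample = b"\x7fELF" + bytes([ELFCLASS64]) + bytes(60)
print(passes_header_checks(sample))
```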

The virus checks that the program header table is contained entirely within the
file, and then iterates through the table, examining each program header until
the first PT_NOTE, if any, is seen.

LOCK AND PT_LOAD

For each program header, the virus calculates the virtual end of the segment,
rounded up to a multiple of 64kb. If this value is larger than the largest
value seen so far, then the virus replaces the old value with the new value. It
will be used during the infection stage, but it is vulnerable to another
corner-case. If the virus has not found the entrypoint segment yet, and if the
current program header is PT_LOAD, then the virus saves the virtual address of
the segment if it is the first time that PT_LOAD has been seen. The first
PT_LOAD is assumed to be the file header, from which the virus can use the load
address in subsequent calculations. If the segment is not executable then the
virus continues the search for another PT_LOAD. This situation will occur if
the code segment is separated from the header segment as a security precaution.
The virus requires that the entrypoint is inside the first executable segment,
likely for simplicity.
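
The running maximum is easy to model. In this Python sketch the "virtual end"
is taken to be p_vaddr + p_memsz, which is an assumption on my part, and the
function names are hypothetical.

```python
ALIGN_64K = 0x10000

def round_up(value, align=ALIGN_64K):
    """Round value up to the next multiple of align (a power of two)."""
    return (value + align - 1) & ~(align - 1)

def highest_segment_end(phdrs):
    """phdrs: iterable of (p_vaddr, p_memsz) pairs, in table order."""
    highest = 0
    for vaddr, memsz in phdrs:
        highest = max(highest, round_up(vaddr + memsz))
    return highest

print(hex(highest_segment_end([(0x400000, 0x1234), (0x410000, 0x10001)])))
```

Only headers seen before the scan stops contribute to the maximum, which is
where the corner case comes from: a later segment that loads higher is never
counted.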

If the entrypoint segment is found, then the virus examines the entrypoint
code, instruction by instruction, watching for an "ADRP" instruction occurring
before a "BL" instruction. If an "ADRP" instruction is found, then the virus
checks for a "NOP" instruction immediately following. The reason for this check
is described below. If the instruction is not a "NOP", then the virus switches
the program header search to look for a PT_NOTE. Otherwise, the file is not a
candidate for infection, all changes are discarded, and the file size is
restored.
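
The same scan is handy as an infection check. The Python sketch below uses the
standard bit patterns for "ADRP", "BL", and "NOP"; the return labels are mine,
and treating "BL before any ADRP" as a plain miss is an assumption, since the
text does not say what the virus does in that case.

```python
NOP = 0xD503201F

def is_adrp(word):
    return (word & 0x9F000000) == 0x90000000

def is_bl(word):
    return (word & 0xFC000000) == 0x94000000

def entry_scan(words):
    """Return 'marked', 'candidate', or 'no-adrp' for entrypoint code."""
    for i, word in enumerate(words):
        if is_bl(word):
            return "no-adrp"
        if is_adrp(word):
            following = words[i + 1] if i + 1 < len(words) else None
            return "marked" if following == NOP else "candidate"
    return "no-adrp"

clean = [0xD280001D, 0x90000000, 0x91000000, 0x94000000]
print(entry_scan(clean))
```

A result of "marked" only means that an "ADRP / NOP" pair exists before the
first "BL"; it is a hint, not proof of infection.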

If a PT_NOTE is seen, then the virus checks if a PT_LOAD was seen already. This
is always true since the code path that branches to the PT_NOTE block appears
right after the check for a PT_LOAD having been seen already. Still, there's no
harm in "just in case".

The virus converts the PT_NOTE to a PT_LOAD, and marks the segment executable.
Then the virus rounds up the file size to the next multiple of 64kb, and sets
the segment file offset to the new file size. The virus sets the segment
alignment, physical size, and virtual size, to 64kb, and then sets the virtual
offset to the largest address that was seen earlier. This assumes that no
subsequent segments load to addresses after the one that was calculated, which
is the corner-case mentioned earlier.
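
Those fixed values make the converted segment easy to spot afterwards. The
following Python sketch is a detection heuristic built only from the
description above; the function name is hypothetical, and a legitimate file
could in principle match, so treat a hit as a reason to look closer.

```python
PT_LOAD = 1
PF_X = 1
SEG_64K = 0x10000

def looks_like_converted_note(p_type, p_flags, p_offset,
                              p_filesz, p_memsz, p_align):
    """Flag a segment matching the layout described above (heuristic only)."""
    return (p_type == PT_LOAD
            and bool(p_flags & PF_X)
            and p_offset % SEG_64K == 0
            and p_filesz == SEG_64K
            and p_memsz == SEG_64K
            and p_align == SEG_64K)

print(looks_like_converted_note(PT_LOAD, 5, 0x20000,
                                0x10000, 0x10000, 0x10000))
```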

After adjusting the file structure, the virus applies a metamorphosis before
writing itself to the file. This is the most interesting part of the code, and
deserves a dedicated section of analysis (see below). After the code is written
to the file, the virus examines the entrypoint code again, instruction
by instruction, watching for the "ADRP" that was seen earlier. The same
check for the "BL" instruction appears here, though it should never be
needed. Once the "ADRP" instruction is found, the virus checks if the following
instruction is either "ADD reg,LSL#12" or "LDR". The virus compares with only a
single encoding for each of those instructions, likely for simplicity. Though
there are alternative encodings possible due to undefined bits, a compiler is
unlikely to generate them.

ENTER HERE

If the "ADD" instruction was seen, then the virus constructs a "B" instruction
that points directly to the original entrypoint, and stores that as the code to
jump to the original entrypoint. Otherwise, the virus constructs a
"ADRP / ADD / LDR / BR" sequence, and stores that instead. At this point, the
virus checks that its own code is reachable from the entrypoint by an "ADRP"
instruction. If it is (that is, within 4GB(!) of the entrypoint), then the virus
replaces the
source of the "ADRP" instruction at the entrypoint with the address of the
virus code. This is a simple but effective entrypoint-obscuring technique,
which will be hidden further by any disassembler that is able to find the main
function heuristically. Since the virus is aligned to a multiple of 64kb, no
"ADD" instruction is needed to construct the full address. As a result, the
virus replaces the following instruction with a "NOP", serving both as a true
do-nothing instruction, and as the infection marker. However, if the virus is
(somehow) placed too far from the entrypoint, then the virus simply alters the
entrypoint value directly. Fortunately for the virus writer, this change does
not introduce the possibility of a reinfection, despite the lack of the code-
based infection marker. It works because of the implicit infection marker: the
entrypoint is not inside the first executable PT_LOAD segment! Another piece of
luck on the part of the virus writer: if the PT_NOTE segment were allowed to
appear before the entrypoint PT_LOAD segment, and if the entrypoint were
changed to the PT_NOTE segment that is converted to an executable PT_LOAD
segment, then not only would the file be a candidate for reinfection (if
another PT_NOTE segment exists, because the entrypoint now points inside the
newly-first PT_LOAD segment), but the "ADRP / ADD" sequence might be found in
the virus code that jumps to the original entrypoint. The "ADD" instruction in
the virus code would then be replaced with the "NOP" instruction, preventing a
third infection, and then on each execution, the virus would run twice instead
of once, before finally running the original host code. What a mess.
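
When examining a suspect entrypoint, it helps to resolve where its "ADRP"
really points. This Python sketch decodes the page address from the
instruction word and models the 4GB reach; the function names are mine, and the
range test is the architectural limit of the instruction, not necessarily the
exact comparison that the virus performs.

```python
def adrp_target(word, pc):
    """Return the page address computed by an ADRP located at pc."""
    immlo = (word >> 29) & 0x3
    immhi = (word >> 5) & 0x7FFFF
    imm = (immhi << 2) | immlo
    if imm & (1 << 20):          # sign-extend the 21-bit page delta
        imm -= 1 << 21
    return (pc & ~0xFFF) + (imm << 12)

def adrp_reachable(pc, target):
    """True if target's page is within the signed 21-bit page range of pc."""
    delta = (target >> 12) - (pc >> 12)
    return -(1 << 20) <= delta < (1 << 20)

print(hex(adrp_target(0xB0000000, 0x400123)))
```

If the resolved page lies in a 64kb-aligned executable segment at the end of
the file, and the next instruction is a "NOP", that matches the pattern
described above.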

In any case, now the file is infected completely, and the virus continues the
search for new files to infect.

JACKPOT!

Once all files in the current directory have been examined, the virus prints
four bytes of text that vary by generation. Infecting and executing a file over
five generations is needed to show the entire message (the message is "SLOT
MACH INE! JACK POT!"). Then the virus constructs parameters for the "exit"
syscall, but never calls it. This might be left-over debugging code. Finally,
the virus transfers control to the original entrypoint to let the host run as
before.

The code is not optimised for size at all, but this can also serve as an anti-
analysis technique. There are duplicated blocks where a subroutine would
typically be used instead, and loops are of the "cmp/conditional
branch/unconditional branch" style, such as:

SUB W11, W11, #1
0x548
LDR W12, [X2],#4
STR W12, [X1],#4
CMP W11, WZR
B.EQ 0x560
SUB W11, W11, #1
B 0x548

which could be written instead as:

0x548
LDR W12, [X2],#4
STR W12, [X1],#4
SUB W11, W11, #1
CBNZ W11, 0x548
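
To convince myself that the two forms copy the same number of words, here is a
Python model of both. It is a sketch that assumes a count of at least one and
ignores 32-bit wraparound; the function names are hypothetical.

```python
def original_loop(count):
    """Model the virus loop; returns the number of words copied."""
    w11 = count - 1          # SUB W11, W11, #1 before the loop
    copied = 0
    while True:
        copied += 1          # LDR / STR pair
        if w11 == 0:         # CMP W11, WZR / B.EQ 0x560
            return copied
        w11 -= 1             # SUB W11, W11, #1 / B 0x548

def compact_loop(count):
    """Model the shorter form; returns the number of words copied."""
    w11 = count
    copied = 0
    while True:
        copied += 1          # LDR / STR pair
        w11 -= 1             # SUB W11, W11, #1
        if w11 == 0:         # CBNZ falls through at zero
            return copied

print([(original_loop(n), compact_loop(n)) for n in range(1, 5)])
```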

BIG MESS O' WIRES

Further to the point about the subroutines, there are actually no subroutines
defined at all in the virus code, only conditional and unconditional branches.
Instead, the virus code is laid out in this way:

open dir
file iterator <--------------------------------
<-- eventually branches down to close dir |
| { |
| open file |
| map file |
| <-- branches to phent iterator |
| | close file <---------------------------- |
| | | |
| | branches up to file iterator ---------/|\->
| --> phent iterator |
| <-- eventually branches down to unmap file |
| | { |
| | pt_load check |
| | pt_note check |
| | <---- branches down to morpher |
| | | } |
| | --> morpher |
| --> unmap file |
| branches up to close file ------------->
| }
--> close dir
jump oep

The overlapping nature of the loops serves as an effective anti-analysis
technique.

POWER UNLEASHED

The metamorphism in the virus is implemented via a table of "line numbers" (all
instructions on AArch64 are fixed length, four bytes long) and transformation
possibilities. The transformation possibilities are structures that hold the
index within a symbol group, the total number of groups, the number of
instructions in a group, and the groups themselves. The symbols are values used
in an exclusive OR operation that are applied to the instruction(s) beginning
at the specified line number. It is a simple engine but it is capable of
exchanging the line order of some instructions, and changing the source
register order of many instructions. For example, the first entry in the table:

DCW 3 ; line number (*1)
DCB 01 ; symbol offset (high four bits)
; and count-1 (low four bits)
DCB 2 ; instruction count (*2)
DCD 0xC1 ; symbol (*3)
DCD 0xC1
DCD 0 ; next symbol
DCD 0

applies the symbol 0xC1 (*3) to the two (*2) instructions at lines 3 (*1) and
4. Lines 3 and 4 are these instructions:

21 00 3B D5 MRS X1, #3, c0, c0, #1
E0 00 3B D5 MRS X0, #3, c0, c0, #7

XORing the first byte of each with 0xC1 yields this change:

E0 00 3B D5 MRS X0, #3, c0, c0, #7
21 00 3B D5 MRS X1, #3, c0, c0, #1

and the lines are reversed. Another example:

DCW 178 ; line number (*1)
DCB 00 ; symbol offset (high four bits)
; and count-1 (low four bits)
DCB 1 ; instruction count (*2)
DCD 0x20040 ; symbol (*3)

This time the symbol 0x20040 (*3) is applied to the one (*2) instruction at
line 178 (*1). Line 178 is this instruction:

63 00 01 8B ADD X3, X3, X1

XORing the low three bytes with 0x20040 yields this change:

23 00 03 8B ADD X3, X1, X3

and the register order is altered. After the transformation is applied, the
virus increments the indices of each group. If the index reaches the total then
the virus zeroes the index. Thus the transformations are applied in a cyclic
manner. Since the number of groups varies, the cycling happens faster or slower
for some sets. However, at any point in time, if the groups are examined, the
appearance of the next generation can be determined. Alternatively, the number
of the current generation can be determined, and it is possible to go back in
time to determine the appearance of the first build.
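
Both examples can be reproduced with a few lines of Python. This sketch only
models the XOR step on a single little-endian instruction word; the table
parsing and the index cycling are left out, and the function name is mine.

```python
import struct

def apply_symbol(raw, symbol):
    """XOR a 32-bit symbol into one little-endian instruction."""
    (word,) = struct.unpack("<I", raw)
    return struct.pack("<I", word ^ symbol)

mrs_pair = [bytes.fromhex("21003BD5"), bytes.fromhex("E0003BD5")]
swapped = [apply_symbol(insn, 0xC1) for insn in mrs_pair]

add_insn = bytes.fromhex("6300018B")        # ADD X3, X3, X1
morphed = apply_symbol(add_insn, 0x20040)   # ADD X3, X1, X3

print([insn.hex() for insn in swapped], morphed.hex())
```

Since XOR is its own inverse, applying the same symbol a second time restores
the previous form, which is what makes stepping backwards through the
generations possible.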

Interestingly, morphing the "CMP" instruction is not supported. In particular,
the "CMP r0, r1 / B.NE" sequence seems like an obvious candidate for register
reordering as "CMP r1, r0 / B.NE", and "CMP r0, r1 / B.LE" is trivially
reversible as "CMP r1, r0 / B.GE". There are other missed morphing
opportunities, such as exchanging registers in the "ADD" instruction on line
190, or reordering sequential "ORR/MOV/LDRH/STR" instructions (for example,
lines 172-174). The jump to the original entrypoint is also hard-coded to use
the X8 register. This is likely just the place where the virus author decided
to stop. One could spend an eternity tinkering with the infinite possibilities,
and then this paper would not exist.

CONCLUSION

A metamorphic, entrypoint-obscuring virus is a wild start to 64-bit ARM
analysis. I could not have asked for better.
