There is an IRC bot that executes bash commands. Years ago I set out to find the smallest command that would make it execute binary code. The smallest way is to inject into an existing application rather than write out teensy ELF headers, so this project creates the smallest loadable bash builtin module. The example code is called s, for su, but it isn't the typical su. It just sheds root permissions by calling setuid/setgid to nobody.
I did this just to learn ELF sections and how linkers work. Everyone has a hobby, eh.
A simple Hello World program in C is 14472 bytes in ELF-64. The same in ASM without libc is 4360 bytes. The latter has a lot fewer sections but could have even fewer.
You can get pretty lean output from standard tools, but there is a large section called .shstrtab that none of them will remove. It's typically last, so finding that string in the binary and removing everything after it seems fine. This saved 207 bytes for the simple Hello World asm program.
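The manual trim described above can be sketched in a few lines of Python. This is only an illustration, not the tool used for the numbers in this README. The function name trim_after_shstrtab is made up here, and cutting right after the name string (rather than right before it) is an assumption. It relies on the kernel and the dynamic loader working from program headers, so the section header table that follows the string is not needed at load time.

```python
MARKER = b".shstrtab\x00"  # the section-name string, including its NUL terminator


def trim_after_shstrtab(data: bytes) -> bytes:
    """Return data cut off right after the last '.shstrtab' name string.

    Assumption: only section names and the section header table follow
    this string, and nothing the loader needs lives there.
    """
    pos = data.rfind(MARKER)
    if pos < 0:
        return data  # marker not found: leave the bytes untouched
    return data[: pos + len(MARKER)]
```

To use it, read the stripped binary as bytes, pass it through the function, and write the result back out. Tools that read section headers, such as readelf -S, will complain about the result, which is expected.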
When you're making a shared library rather than an executable, you have more sections. The included su.asm, after being linked and stripped, is 13096 bytes. Here are the program headers:
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000240 0x0000000000000240  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000010 0x0000000000000010  R E    0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x0000000000000000 0x0000000000000000  R      0x1000
  LOAD           0x0000000000002f10 0x0000000000002f10 0x0000000000002f10
                 0x0000000000000103 0x0000000000000103  RW     0x1000
  DYNAMIC        0x0000000000002f10 0x0000000000002f10 0x0000000000002f10
                 0x00000000000000f0 0x00000000000000f0  RW     0x8
  GNU_RELRO      0x0000000000002f10 0x0000000000002f10 0x0000000000002f10
                 0x00000000000000f0 0x00000000000000f0  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00     .hash .gnu.hash .dynsym .dynstr .rela.dyn
   01     .text
   02     .eh_frame
   03     .dynamic .data
   04     .dynamic
   05     .dynamic
That third LOAD segment exists only for .eh_frame. I don't know much about the details, but it costs a page-aligned segment of its own and is unnecessary here. It can be dropped at link time with a flag. After adding that option, i.e. ld -shared --no-ld-generated-unwind-info -o su su.o, the result after strip is 8920 bytes. That's 32% shrinkage!
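To see where the bytes go in the first layout, one can compare what the LOAD segments actually contain with the file span they occupy. The sketch below only does arithmetic on the offsets and sizes from the first program header listing. The helper names payload_bytes and file_span are made up for this example.

```python
# (segment, file offset, file size) copied from the four LOAD entries
# in the first program header listing
LOADS = [
    ("00 .hash .gnu.hash .dynsym .dynstr .rela.dyn", 0x0000, 0x240),
    ("01 .text", 0x1000, 0x010),
    ("02 .eh_frame", 0x2000, 0x000),
    ("03 .dynamic .data", 0x2F10, 0x103),
]


def payload_bytes(loads):
    """Total bytes the LOAD segments actually contain."""
    return sum(size for _, _, size in loads)


def file_span(loads):
    """Smallest file size that can hold every segment at its listed offset."""
    return max(offset + size for _, offset, size in loads)


print(payload_bytes(LOADS))  # 851
print(file_span(LOADS))      # 12307
```

So 851 bytes of content are spread over more than 12K of file, and the .eh_frame segment contributes no bytes at all while still sitting at its own 0x1000 boundary. That is consistent with the roughly 4K drop (13096 to 8920) once it is gone.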
Symbols are found through a hash table. The original SysV hash function is old and inefficient, I guess, so GNU made a better one, and the linker emitted both .hash and .gnu.hash here. Our program has only one symbol that needs to be located, so we can go back to the old SysV style alone with another flag. After linking with --hash-style=sysv, the stripped output is 8856 bytes. 64 bytes isn't much, but it's gone now.
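For reference, here are the two hash functions behind .hash and .gnu.hash, written out in Python. This is a sketch of the well-known algorithms, not code from this project. The names sysv_hash and gnu_hash are chosen for this example, and the inputs in the comments are arbitrary symbol names rather than the symbol this module exports.

```python
def sysv_hash(name: bytes) -> int:
    """Classic System V ELF hash, the one the .hash section is built on."""
    h = 0
    for c in name:
        h = ((h << 4) + c) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24          # fold the top nibble back into the low bits
        h &= ~g & 0xFFFFFFFF      # then clear the top nibble
    return h


def gnu_hash(name: bytes) -> int:
    """DJB-style hash (h = h * 33 + c, seeded with 5381) used by .gnu.hash."""
    h = 5381
    for c in name:
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


print(hex(sysv_hash(b"exit")))   # 0x6cf04
print(hex(gnu_hash(b"printf")))  # 0x156b2bb8
```

With a single exported symbol, the quality of the hash makes no practical difference, which is why carrying only the SysV table is enough here.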
We still have 5 segments, looking like:
  Segment Sections...
   00     .hash .dynsym .dynstr .rela.dyn
   01     .text
   02     .dynamic .data
   03     .dynamic
   04     .dynamic
We don't care about security, so we can load everything in one segment that's RWX and have just two program headers: one LOAD and one DYNAMIC. Now we're at 5144 bytes, a savings of 3712 bytes! Once we have our own ld script that discards lots of sections, we don't need the ld flag that removes .eh_frame anymore. It's gone.
The current program headers:
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000000000 0x0000000000000000
                 0x0000000000000193 0x0000000000000193  RWE    0x1000
  DYNAMIC        0x00000000000010a0 0x00000000000000a0 0x00000000000000a0
                 0x00000000000000e0 0x00000000000000e0  RW     0x8
That first LOAD offset is 0x1000. That's 4K! Do we need that many zeros? No. Using . = SIZEOF_HEADERS; in the ld script takes all that out. Then we're at 581 bytes after strip and manually removing .shstrtab. We're now at about 4.4% of our original 13096 bytes. So much fat trimmed.
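The amount of padding involved can be worked out from the ELF64 layout: a 64-byte file header followed by 56 bytes per program header. The Python sketch below computes that and also reads the gap straight out of a file image. The function names sizeof_headers and first_load_padding are made up for this example, and it assumes a little-endian ELF64 file.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, 56 bytes
PT_LOAD = 1


def sizeof_headers(phnum: int) -> int:
    """Size of the ELF64 header plus phnum program headers."""
    return EHDR.size + phnum * PHDR.size


def first_load_padding(image: bytes) -> int:
    """Bytes between the end of the program headers and the first LOAD's file offset."""
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    headers_end = e_phoff + e_phnum * e_phentsize
    for i in range(e_phnum):
        p_type, _flags, p_offset = PHDR.unpack_from(image, e_phoff + i * e_phentsize)[:3]
        if p_type == PT_LOAD:
            return max(0, p_offset - headers_end)
    return 0  # no LOAD segment found


print(sizeof_headers(2))           # 176
print(0x1000 - sizeof_headers(2))  # 3920
```

With two program headers the headers end at byte 176, so a first LOAD at file offset 0x1000 leaves 3920 bytes of nothing in between.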
To go further, I threw together elf2nasm.c to get a clear view of the structure. It's not polished by any means. It was just enough to get this one-off program into a form I could start hacking away at. I found things still ran without RELACOUNT, STRSZ, and SYMENT in the .dynamic section. The rest of the possible savings come from overlapping structures, reusing bytes on top of fields that don't matter so much. Right now that's 413 bytes, about 29% less than where we were before the deep dive.
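As an illustration of the dynamic-tag removal, here is a Python sketch that rebuilds a .dynamic blob without those three tags. It is not how the project does it (that happens by editing the generated NASM source), and the function name drop_dynamic_tags is made up. The tag numbers are the standard ELF values, and each Elf64_Dyn entry is 16 bytes.

```python
import struct

DYN = struct.Struct("<qQ")  # Elf64_Dyn: signed d_tag, then d_val / d_ptr

DT_STRSZ = 10
DT_SYMENT = 11
DT_RELACOUNT = 0x6FFFFFF9

DROPPED = {DT_STRSZ, DT_SYMENT, DT_RELACOUNT}


def drop_dynamic_tags(dynamic: bytes, dropped=DROPPED) -> bytes:
    """Return the .dynamic entries with the given tags filtered out.

    The blob length must be a multiple of 16 bytes. Everything else,
    including the DT_NULL terminator, is kept in its original order.
    """
    out = bytearray()
    for tag, val in DYN.iter_unpack(dynamic):
        if tag not in dropped:
            out += DYN.pack(tag, val)
    return bytes(out)
```

Three entries at 16 bytes each would account for 48 of the 168 bytes between 581 and 413. The rest would have to come from the overlapping tricks.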