Building glibc to WASM
I'd recommend reading this doc in its entirety before trying to compile.
Prerequisites
We need a WASM-compatible clang and ar, which can be built locally from wasi-sdk:
https://github.com/WebAssembly/wasi-sdk
We also strongly recommend installing wasm-objdump from the wabt toolkit:
https://github.com/WebAssembly/wabt
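Before configuring, it can help to confirm that the tools are reachable. The following is a minimal sketch, assuming the tools are on PATH (the config script in this doc calls clang by its full path instead) and that the binaries are named clang, llvm-ar and wasm-objdump. The missing_tools helper is hypothetical and not part of the repo.

```shell
#!/bin/bash
# Print the name of each given tool that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example usage (tool names are assumptions; adjust to your install):
#   missing_tools clang llvm-ar wasm-objdump
```

An empty output means every listed tool was found.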
Configure
First, write and run a config script like this in the glibc root directory:
#!/bin/bash
set -e
BUILDDIR=build
mkdir -p $BUILDDIR
cd $BUILDDIR
../configure --disable-werror --disable-hidden-plt --with-headers=/usr/i686-linux-gnu/include --prefix=/sysroot-coulson --host=i686-linux-gnu --build=i686-linux-gnu \
  CFLAGS=" -O2 -g" \
  CC="/wasi-sdk/build/wasi-sdk-22.0/bin/clang-18 --target=wasm32-unknown-wasi -v -Wno-int-conversion"
You must replace CC with the path to your clang. If you define BUILDDIR=build, then the compiled WASM object files will appear under glibc/build.
Make sure the build directory is empty before running the config script: run rm -rf build before reconfiguring and recompiling.
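The empty-directory requirement can be checked before configuring. This is a minimal sketch; the is_clean_build_dir helper is hypothetical and only illustrates the check.

```shell
#!/bin/bash
# Succeed if the path does not exist yet or is a directory with no entries.
is_clean_build_dir() {
    [ ! -e "$1" ] || [ -z "$(ls -A "$1")" ]
}

# Example usage before running the config script:
#   is_clean_build_dir build || rm -rf build
```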
A crucial job of the configure script is deciding which sysdeps directories to use, based on the host and build strings.
We have already changed the configure script in the glibc root directory so that the lind add-on directories are always included.
The configure flags we need:
- disable-werror: we have countless warnings, so we ignore them for now
- disable-hidden-plt: the PLT-bypassing optimization causes ~50k errors, so we simply disable it for now
- with-headers: glibc requires Linux kernel headers to be installed before configuring and compiling, so we point this flag at an existing 32-bit sysroot (/usr/i686-linux-gnu/include in the script above); this does not seem to cause problems for our WASM build
- prefix=: the path of the sysroot generated by make install. Note that glibc's make install will NOT work as-is for WASM, because a WASM sysroot follows a different structure convention and also requires an llvm-ar archive; see my script gen_sysroot.sh for details. However, we can still use make install just to generate the .h files of the sysroot
- host & build: we start from the i686 sysdeps directories, so these options are fixed to i686-linux-gnu
The compiler flags we need:
- -O2 -g: glibc will not let you compile with O0, so we live with the O2 optimization while debugging. Sometimes you can switch to O1
- -Wno-int-conversion: we disable int-conversion warnings, because all 32-bit types passed as WASM function arguments end up as i32 anyway
- --target=wasm32-unknown-wasi: this tells the compiler we want to compile to WASM
After the configure step succeeds, you will see these in the build directory:
Makefile bits config.h config.log config.make config.status
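A quick way to confirm the configure step finished is to check for the entries listed above. This is a minimal sketch; the missing_config_outputs helper is hypothetical, and it only tests that each name exists, not that its contents are valid.

```shell
#!/bin/bash
# Print each expected configure output that is missing from a build directory.
missing_config_outputs() {
    local dir="$1" name
    for name in Makefile bits config.h config.log config.make config.status; do
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# Example usage from the glibc root directory:
#   missing_config_outputs build
```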
Compiling to object files
In the build directory, we usually run make --keep-going -j$(nproc). The first flag continues compiling after errors; we need it because there are currently too many errors (mainly from threading-related assembly). The -j flag is important for speed, but it also interleaves the compilation log. The compilation log is VERY IMPORTANT: it tells you why a given C file failed to compile. So sometimes we leave out -j. Also, we can copy the actual compiler command from the compile log. To compile a single C file with such a command, you only need to additionally specify the source file path. We can use this to test compiling a specific file.
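Since the log is the main debugging aid, it is worth saving it to a file and summarizing which sources failed. The sketch below assumes the diagnostics use clang's usual file:line:col: error: message layout; the failed_sources helper and the build.log file name are hypothetical.

```shell
#!/bin/bash
# Read a compile log on stdin and print the distinct source files that
# produced "error:" or "fatal error:" diagnostics, one per line.
failed_sources() {
    grep -E '^[^ :]+:[0-9]+:[0-9]+: (fatal )?error:' | cut -d: -f1 | sort -u
}

# Example usage in the build directory (not run here):
#   make --keep-going 2>&1 | tee build.log
#   failed_sources < build.log
```

Warnings and "In file included from" lines are ignored, so the output is only a starting list of files to recompile one at a time.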
Generating WASM sysroot
This procedure is specified in the gen_sysroot.sh script in our glibc repo. Its main job is to generate a WASM sysroot structure like
sysroot/
- include/
  - wasm32-wasi/
    - stdio.h
    - ...other headers
- lib/
  - wasm32-wasi/
    - crt1.o
    - libc.a
Note that the header files should be pre-generated using make install. The crt1.o should be pre-compiled from the simple C file shown below (see the WASM compile doc as well). The main job of this script is to find every valid WASM .o file in the build directory and group everything into libc.a, an llvm-ar archive.
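/* Forward declaration; assumes the program defines main with no arguments. */
int main(void);
/* Minimal entry point: just call main. */
void _start() {
    main();
}
/* Empty stubs; they do nothing here. */
void __wasm_call_dtors() {}
void __wasi_proc_exit(unsigned int exit_code) {}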
Here are some macros we need to tweak:
- src_dir: the glibc build directory that contains all the WASM object files
- include_source_dir: the path to your pre-built headers
- crt1_source_path: the path to your pre-built crt1.o
- lind_syscall_path: you also need to pre-compile lind_syscall.o, just like crt1.o; the source file is under glibc/lind_syscall
- sysroot_dir: the path where the sysroot is generated
- output_archive: the path where libc.a is generated; it should be consistent with sysroot_dir
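The "find every valid WASM .o file" step can be sketched as follows. This is not the actual gen_sysroot.sh; it is a minimal illustration that assumes "valid" means the file starts with the WebAssembly magic bytes (\0asm). The helper names is_wasm_object and list_wasm_objects are hypothetical, and the archive command in the usage comment assumes llvm-ar is on PATH.

```shell
#!/bin/bash
# Succeed if the file starts with the WebAssembly magic bytes "\0asm".
is_wasm_object() {
    [ "$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')" = "0061736d" ]
}

# Print every .o file under a directory that looks like a WASM object.
list_wasm_objects() {
    find "$1" -type f -name '*.o' | while IFS= read -r f; do
        if is_wasm_object "$f"; then printf '%s\n' "$f"; fi
    done
}

# Example usage with the variables described above (not run here):
#   llvm-ar rcs "$output_archive" $(list_wasm_objects "$src_dir")
```

Filtering on the magic bytes skips any objects in the build directory that were not produced by the WASM compiler, which is one way to keep them out of libc.a.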
Running only the pre-processor
The pre-processing stage of the compiler expands all #include directives and all macros. Because recursive macros are so prevalent in glibc, you sometimes want to see the actual source file after expansion; for that, use the -E option of clang.
The easiest way is to copy the compiler command from the compile log and add -E. If you want to run the pre-processor on ALL files, run the configure script again and, before compiling, add -E to the CC line in config.make, right after the compiler path:
# Build tools.
CC = /wasi-sdk/build/wasi-sdk-22.0/bin/clang-18 [ADD HERE!]
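The manual edit of config.make can also be scripted. This is a minimal sketch, assuming the line has the CC = <compiler path> ... form shown above; the add_preprocess_flag helper is hypothetical. It goes through a temporary file instead of sed -i so it does not depend on one sed variant.

```shell
#!/bin/bash
# Insert -E right after the compiler path on the "CC = ..." line of a
# config.make file, leaving every other line untouched.
add_preprocess_flag() {
    local file="$1" tmp
    tmp="$(mktemp)" || return 1
    sed -E 's|^(CC = [^ ]+)|\1 -E|' "$file" > "$tmp" \
        && cat "$tmp" > "$file" \
        && rm -f "$tmp"
}

# Example usage in the build directory, after configure and before make:
#   add_preprocess_flag config.make
```

Running it twice adds the flag twice, so re-run the configure script first if you want a fresh config.make.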