GHC's Cross Compilation Pipeline
Today is going to be a bit more technical, and of less direct practical utility. We will look at the steps that GHC takes to cross compile code via its LLVM backend.
In GHC’s Compiler Commentary we can see how the front end takes a Haskell file and, after Parsing, Renaming, and Desugaring, turns it into GHC’s Core language. The Core is then processed repeatedly by the Simplification pass before being translated into STG and finally Cmm. Cmm is the language from which the three code generation backends in GHC take off.
The Cross Compilation Backend
The LLVM code generator takes in Cmm and turns it into LLVM intermediate representation. The LLVM IR is then passed through the LLVM optimizer, the LLVM static compiler, and GHC’s LLVM Mangler, before it is finally passed off to the assembler and ends up as object code.
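To keep the order of the stages straight, here is a minimal sketch that models the chain described above as plain data. The file suffixes, the `Stage` type and the function names are made up for illustration and are not the names GHC uses internally; the only thing taken from the text is the order of the stages and the tools `opt`, `llc` and (on macOS) `clang`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    name: str             # stage name as used in the text
    tool: Optional[str]   # external program, or None if the step lives inside GHC
    consumes: str         # hypothetical input file name
    produces: str         # hypothetical output file name


def llvm_backend_stages(stem: str, assembler: str = "clang") -> List[Stage]:
    """Return the stages in the order the text lists them.

    All file suffixes are illustrative assumptions, not GHC's real ones.
    """
    return [
        Stage("LLVM code generator", None, f"{stem}.cmm", f"{stem}.ll"),
        Stage("LLVM optimizer", "opt", f"{stem}.ll", f"{stem}.opt.ll"),
        Stage("LLVM static compiler", "llc", f"{stem}.opt.ll", f"{stem}.raw.s"),
        Stage("LLVM Mangler", None, f"{stem}.raw.s", f"{stem}.s"),
        Stage("assembler", assembler, f"{stem}.s", f"{stem}.o"),
    ]


def check_chain(stages: List[Stage]) -> bool:
    """Each stage must consume exactly what the previous stage produced."""
    return all(a.produces == b.consumes for a, b in zip(stages, stages[1:]))


if __name__ == "__main__":
    for stage in llvm_backend_stages("Main"):
        print(f"{stage.name}: {stage.consumes} -> {stage.produces}")
```

Two of the five stages have no external tool: the code generator and the mangler are part of GHC itself, while the rest are delegated to the LLVM toolchain and the assembler.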
LLVM IR
The LLVM intermediate representation can be written either in a textual, human-readable form or as LLVM Bitcode. LLVM Bitcode is a binary format that is represented as a stream of bits. Values in the Bitcode format do not necessarily need to align with byte boundaries.
GHC’s LLVM code generator currently produces textual IR. The textual IR is not guaranteed to be stable across LLVM releases, which is one of the reasons that GHC is usually tied to a specific LLVM release.
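To make "a stream of bits" a little more concrete, here is a toy bit writer. It is only a sketch of the idea that fields need not start on byte boundaries: it appends fixed-width fields least significant bit first and knows nothing about the real Bitcode container (no blocks, abbreviations or variable-width integers). The `BitWriter` class is hypothetical and not part of LLVM.

```python
class BitWriter:
    """Toy bit-level writer: fields are appended least significant bit first."""

    def __init__(self) -> None:
        self._acc = 0     # all bits written so far, packed into one integer
        self._nbits = 0   # number of valid bits in _acc

    def emit(self, value: int, width: int) -> None:
        """Append `value` as an unsigned field that is exactly `width` bits wide."""
        if width <= 0 or not 0 <= value < (1 << width):
            raise ValueError("value does not fit in the requested width")
        self._acc |= value << self._nbits
        self._nbits += width

    def to_bytes(self) -> bytes:
        """Pad with zero bits up to the next byte boundary and return the bytes."""
        nbytes = (self._nbits + 7) // 8
        return self._acc.to_bytes(nbytes, "little")


if __name__ == "__main__":
    w = BitWriter()
    w.emit(0b101, 3)  # a 3-bit field occupies bits 0..2
    w.emit(0xAB, 8)   # an 8-bit field starts at bit 3 and straddles two bytes
    print(w.to_bytes().hex())  # prints 5d05
```

The second field is a full byte wide, yet it ends up spread over both output bytes because the 3-bit field before it shifted everything off the byte grid.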
LLVM optimizer
The LLVM optimizer opt reads in LLVM IR and writes LLVM IR after
performing a set of optimizations. The LLVM IR that GHC produces uses
GHC’s custom calling convention ghccc, which requires the -mem2reg pass
to be run by the optimizer. The backend therefore always passes
-mem2reg explicitly, unless the -O<n> flag that is passed from GHC to
the optimizer is greater than 0, in which case the optimizer runs
-mem2reg anyway.
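The flag selection described above boils down to a two-way choice. The following is a minimal sketch of that rule only; `opt_flags` is a hypothetical helper, and the real backend may well pass further flags that are not mentioned here.

```python
from typing import List


def opt_flags(opt_level: int) -> List[str]:
    """Pick the optimizer flags for a given -O<n> level, as described in the text.

    At level 0 nothing would run mem2reg for us, so it is requested explicitly.
    At higher levels the -O<n> pipeline already includes it.
    """
    if opt_level < 0:
        raise ValueError("optimization level must not be negative")
    if opt_level == 0:
        return ["-mem2reg"]
    return [f"-O{opt_level}"]


if __name__ == "__main__":
    for level in (0, 1, 2):
        print(level, opt_flags(level))
```

Either way the pass ends up being run; the only question is whether GHC has to ask for it by name.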
LLVM static compiler
The LLVM static compiler llc
turns the LLVM IR produced by the LLVM
optimizer into assembly for the given target.
GHC’s LLVM Mangler
After the LLVM IR GHC produces is fed through LLVM’s optimizer and
static compiler, the resulting assembly might need some special
attention. Therefore GHC passes the generated assembly through the
LLVM Mangler. The mangler currently ensures that -dead_strip has no
effect on Mach-O platforms (macOS, iOS, …). Dead stripping on Mach-O
platforms breaks GHC’s Tables Next To Code optimization, which requires
functions to carry prefix data. LLVM unconditionally
inserts .subsections_via_symbols into the assembly. This leads the
linker to believe that only code after live function symbols needs to
be retained, and it then strips away the prefix data if the previous
symbol is considered dead. This should not be needed with LLVM 5
anymore! (LLVM: D30770)
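One plausible way to neutralize dead stripping is to drop the directive from the assembly before it reaches the assembler. The sketch below assumes exactly that; I am not claiming this is literally what GHC's mangler does, and `drop_subsections_via_symbols` is a hypothetical name.

```python
def drop_subsections_via_symbols(assembly: str) -> str:
    """Remove every line that consists only of the .subsections_via_symbols directive.

    Assumption: without the directive the linker treats each section as one
    unit and therefore cannot strip the prefix data in front of a live symbol.
    """
    kept = [
        line
        for line in assembly.splitlines(keepends=True)
        if line.strip() != ".subsections_via_symbols"
    ]
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        "\t.section\t__TEXT,__text\n"
        "_foo_info:\n"
        "\tretq\n"
        ".subsections_via_symbols\n"
    )
    print(drop_subsections_via_symbols(sample), end="")
```

All other lines pass through untouched, so the filter is safe to run on assembly for targets where the directive never appears.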
The mangler currently mangles two additional items: function to object mangling for ELF, and AVX instruction rewrites to fix AVX stack spills. For AVX, GHC essentially lies to LLVM about the stack being 32-byte aligned, but then needs to rewrite the aligned AVX instructions to their unaligned counterparts.
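Both rewrites are simple line-by-line text substitutions. Here is a rough sketch of what they could look like, assuming that the symbol type lives in `.type` directives and that the aligned moves in question are `vmovaps`, `vmovapd` and `vmovdqa`. The function names are hypothetical, and the real mangler is written in Haskell, not Python.

```python
# Aligned AVX move mnemonic prefixes and their unaligned counterparts.
# "vmovap" covers both vmovaps and vmovapd.
_AVX_REWRITES = (("vmovdqa", "vmovdqu"), ("vmovap", "vmovup"))


def rewrite_avx(line: str) -> str:
    """Turn an aligned AVX move at the start of a line into its unaligned form."""
    stripped = line.lstrip()
    for aligned, unaligned in _AVX_REWRITES:
        if stripped.startswith(aligned):
            return line.replace(aligned, unaligned, 1)
    return line


def rewrite_symbol_type(line: str) -> str:
    """Mark function symbols as objects in ELF .type directives."""
    if line.lstrip().startswith(".type"):
        for sigil in ("@", "%"):  # the sigil differs between targets
            line = line.replace(sigil + "function", sigil + "object", 1)
    return line


def mangle(assembly: str) -> str:
    """Apply both rewrites to every line of the assembly."""
    lines = assembly.splitlines(keepends=True)
    return "".join(rewrite_symbol_type(rewrite_avx(line)) for line in lines)


if __name__ == "__main__":
    sample = "\t.type\tfoo_info,@function\n\tvmovaps\t%ymm0, 32(%rsp)\n"
    print(mangle(sample), end="")
```

The unaligned instructions also work on aligned addresses, which is why rewriting every aligned move is safe even when the stack slot happens to be 32-byte aligned.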
The Assembler
Finally the mangled assembly is turned into .o object code, which is
then handed off to the linker. On macOS clang is currently used as the
assembler instead of the system assembler.
That concludes our mid-level tour through GHC’s LLVM
backend. Please note that I did not discuss the optional Splitter and
optional MergeForeign phases.