Compiling Process in C — Overview
Introduction
The compiling process in C is the sequence of steps that transforms human-readable C source code into a runnable program.
Although developers often use a simple command like:
gcc main.c -o app
this command actually triggers multiple stages behind the scenes. Each stage has a specific responsibility and contributes to turning source code into machine-executable instructions.
Understanding this process helps developers:
- debug compilation and linking errors,
- understand how libraries are used,
- inspect binaries effectively,
- work with toolchains and embedded systems,
- and build a strong foundation in systems programming.
High-Level Compilation Pipeline
At a high level, the compiling process consists of the following stages:
- Preprocessing
- Compilation
- Assembly
- Linking
- Loading and Execution
A simplified view of the pipeline:
C Source Code (.c, .h)
↓
Preprocessing
↓
Compilation
↓
Assembly
↓
Object Files (.o)
↓
Linking
↓
Executable
↓
Loading
↓
Running Program
Each stage transforms the program into a more complete and executable form.
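To watch the stages one at a time, the compiler driver can be told to stop early. The sketch below is a small, hypothetical file (stages.c, with an invented function scale_value) whose header comment lists the GCC driver flags for each stop; Clang accepts the same flags. The file names and the main.o used in the final link step are assumptions for illustration only.

```c
/* stages.c: a hypothetical file used to inspect each stage's output.
 *
 * Driver flags that stop the pipeline early:
 *   gcc -E stages.c -o stages.i    preprocess only
 *   gcc -S stages.i -o stages.s    compile to assembly
 *   gcc -c stages.s -o stages.o    assemble into an object file
 *   gcc stages.o main.o -o app     link; main.o is assumed to define main()
 */

#define SCALE 3   /* replaced by the preprocessor; no longer present in stages.i */

/* Appears as a label in stages.s and as a global symbol in stages.o. */
int scale_value(int x) {
    return x * SCALE;
}
```

Opening stages.i, stages.s and stages.o in turn (the last one with a tool such as objdump) shows the same function in three progressively lower-level forms.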
Stage 1: Preprocessing
The preprocessing stage handles all directives that begin with #.
Responsibilities
- Expands #include directives (inserts the contents of header files)
- Expands macros defined with #define
- Processes conditional compilation (#if, #ifdef, etc.)
- Removes comments
Example
#define VALUE 10
int x = VALUE;
After preprocessing:
int x = 10;
Key Idea
The preprocessor performs text substitution, not actual compilation.
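Because substitution happens on the text of the program, macro arguments are pasted in without regard to operator precedence. The following sketch illustrates this, along with a conditional-compilation branch; all macro and function names (SQUARE_NAIVE, SQUARE_SAFE, ENABLE_TRACE, and so on) are invented for this example.

```c
/* Unparenthesized macro: the argument text is pasted in verbatim. */
#define SQUARE_NAIVE(x) x * x

/* Parenthesized version keeps the argument grouped after expansion. */
#define SQUARE_SAFE(x) ((x) * (x))

/* Conditional compilation: only one of these definitions survives
 * preprocessing, depending on whether ENABLE_TRACE is defined
 * (for example with -DENABLE_TRACE on the command line). */
#ifdef ENABLE_TRACE
#define TRACE_LEVEL 1
#else
#define TRACE_LEVEL 0
#endif

int naive_result(void) {
    /* Expands to 1 + 2 * 1 + 2, which evaluates to 5. */
    return SQUARE_NAIVE(1 + 2);
}

int safe_result(void) {
    /* Expands to ((1 + 2) * (1 + 2)), which evaluates to 9. */
    return SQUARE_SAFE(1 + 2);
}

int trace_level(void) {
    return TRACE_LEVEL;
}
```

Running the file through gcc -E shows the expanded expressions directly, which is often the quickest way to diagnose a misbehaving macro.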
Stage 2: Compilation
In this stage, the compiler translates preprocessed C code into assembly code.
Responsibilities
- Syntax analysis (checking valid C structure)
- Semantic analysis (type checking, variable usage)
- Optimization
- Code generation
Output
- Assembly code (.s)
Example
int add(int a, int b) {
    return a + b;
}
Becomes, on x86-64 (simplified, AT&T syntax):
add:
    movl %edi, %eax
    addl %esi, %eax
    ret
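The analysis and optimization steps can be observed with a slightly larger sketch. The function seconds_per_day is invented for illustration; the exact assembly a compiler emits depends on the compiler, target and flags, so the comments describe typical behavior rather than guaranteed output.

```c
/* The prototype lets the compiler type-check every call: passing a
 * pointer where an int is expected would be diagnosed during
 * semantic analysis, before any code is generated. */
int add(int a, int b) {
    return a + b;
}

/* A constant expression. Compilers typically fold it at compile time,
 * so the generated assembly loads the literal 86400 instead of
 * performing two multiplications at run time. */
int seconds_per_day(void) {
    return 60 * 60 * 24;
}
```

Compiling with gcc -S, with and without -O2, and comparing the resulting .s files is one way to see what the optimizer changed.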
Stage 3: Assembly
The assembler converts assembly code into machine code and produces an object file.
Output
- Object file (.o)
What is inside an object file?
- Machine instructions
- Sections (.text for code, .data for initialized data, .bss for zero-initialized data)
- Symbol table
- Relocation information
Important Note
Object files are not complete programs.
They may still contain unresolved references to functions or variables.
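As a rough guide to how C definitions map onto these parts, consider the sketch below. The variable and function names are hypothetical, and the section placements in the comments describe what GCC and Clang typically do on ELF targets; other toolchains may differ.

```c
int initialized_counter = 42;   /* typically placed in .data */
int zeroed_buffer[256];         /* typically placed in .bss (takes no space in the file) */
static int call_count;          /* also .bss, but a local symbol: invisible to other object files */
const char greeting[] = "hi";   /* typically placed in a read-only data section */

/* The instructions for bump go in .text, and "bump" is recorded in the
 * symbol table as a global symbol that other object files may reference. */
int bump(void) {
    call_count++;
    return initialized_counter + call_count;
}
```

After compiling with gcc -c, the binutils tools nm and objdump -h can list the symbols and sections of the resulting object file.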
Stage 4: Linking
The linker combines multiple object files and libraries into a final executable.
Responsibilities
1. Symbol Resolution
The linker matches:
- undefined symbols → their definitions
Example:
// main.c
extern int foo(void);  /* declared here, defined in another file */
The linker finds foo in another object file or library.
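A minimal sketch of both sides of this relationship follows. In a real project the two halves would live in separate files (the util.c name and the call_foo helper are assumptions for illustration); they are shown in one block here so the example compiles as a single unit.

```c
/* --- what main.c would contain --- */

/* Declaration only. Compiled on its own, main.o would list "foo"
 * as an undefined symbol for the linker to resolve. */
int foo(void);

int call_foo(void) {
    return foo() + 1;   /* the call site that refers to foo */
}

/* --- what util.c would contain (hypothetical second file) --- */

/* The definition. util.o would export "foo" as a defined global symbol. */
int foo(void) {
    return 41;
}
```

If the definition were missing from every object file and library on the link line, the link step would fail with an undefined-reference error even though each file compiled cleanly.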
2. Relocation
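The linker patches the machine code with final addresses: each relocation entry in an object file marks a spot that refers to a symbol, and the linker fills it in once it knows where that symbol will live.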
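The need for relocation shows up whenever code refers to something whose address is not yet known. In the sketch below (names invented for illustration), the compiler cannot know the final address of shared_counter, so the object file is expected to carry a relocation entry for the reference inside counter_address.

```c
int shared_counter = 7;

/* Taking the address of a global: the final address is unknown at
 * compile time, so the assembler leaves a placeholder and records a
 * relocation entry that is patched when addresses are assigned. */
int *counter_address(void) {
    return &shared_counter;
}

int read_counter(void) {
    return *counter_address();
}
```

On a GNU toolchain, objdump -dr applied to the object file shows the disassembly with relocation entries interleaved, which makes the placeholders visible.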
Example Linking Flow
gcc main.o util.o -o app
The linker:
- combines main.o and util.o,
- resolves function calls between them,
- generates the final executable.
Stage 5: Loading and Execution
After linking, the program becomes an executable file.
When you run:
./app
the operating system:
- Loads the program into memory
- Loads required shared libraries (if any)
- Sets up the runtime environment
- Starts execution
Important Detail
The program does not start directly at main().
Instead, execution begins at a low-level entry point (usually _start), which eventually calls main().
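One way to observe that code runs before main() is a constructor function. This is a GCC and Clang extension, not standard C, and the names early_init and startup_ran_first are invented for this sketch.

```c
static int ran_before_main = 0;

/* GCC/Clang extension: functions marked with the constructor attribute
 * are called by the startup code before main() begins. */
__attribute__((constructor))
static void early_init(void) {
    ran_before_main = 1;
}

/* Returns 1 if early_init has already run. */
int startup_ran_first(void) {
    return ran_before_main;
}
```

Calling startup_ran_first() as the first statement of main() is expected to return 1 with these compilers, which shows that the startup code did work before main() was reached.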
Putting It All Together
The compiling process is not a single step but a pipeline of transformations:
- The preprocessor prepares the source code
- The compiler analyzes and translates it
- The assembler generates machine code
- The linker connects all parts into a complete program
- The loader prepares it for execution
Each stage reduces uncertainty:
- missing code is included,
- symbols are resolved,
- addresses are fixed,
- and the program becomes executable.
Conclusion
The compiling process in C is a structured pipeline that gradually transforms source code into a running program.
Even though tools like gcc hide this complexity, understanding each stage provides important insights into:
- how programs are built,
- why certain errors occur,
- how libraries are linked,
- and how binaries work internally.
This overview serves as a foundation for deeper topics such as:
- object file structure (ELF),
- symbol resolution and relocation,
- static vs dynamic linking,
- and runtime behavior.