Common Programming Language
A minimalist, dependency-free compiler for a C-like language that targets x86-32 assembly.
Overview
Common is a statically-typed systems programming language with:
- No runtime dependencies - compiles to standalone executables
- Direct C interoperability - call and be called by C code
- Predictable codegen - straightforward mapping to assembly
- Complete type system - 8 integer types, pointers, arrays
- Full control flow - if/else, loops, switch, functions
- Zero external dependencies - just libc, gcc, and nasm
Quick Start
Build the Compiler
gcc -o common common.c
Hello World
Create hello.cm:
void puts(uint8 *s);
int32 main(void) {
puts("Hello, World!");
return 0;
}
Compile and run:
./common hello.cm hello.asm
nasm -f elf32 hello.asm -o hello.o
gcc -m32 hello.o -o hello
./hello
Or use the Makefile:
make # Build compiler and test suite
make test # Run all tests
make hello # Build hello example
make examples # Build all examples
make run-examples # Build and run all examples
Documentation
For Users
- Quick Reference - One-page cheat sheet for syntax and operators
- Reference Manual - Complete language specification (80+ pages)
- Troubleshooting Guide - Solutions to common problems
For Developers
- Test Suite README - How to run and write tests
- Source Code - Well-commented compiler implementation
Language Features
Types
// Integers
int8 int16 int32 int64 // Signed
uint8 uint16 uint32 uint64 // Unsigned
// Pointers and arrays
int32 *ptr; // Pointer
int32 arr[10]; // Array
uint8 *str = "text"; // String
Control Flow
if (x > 0) { ... }
while (x < 100) { ... }
for (int32 i = 0; i < n; i++) { ... }
switch (x) { case 1: ... break; }
Operators
// Arithmetic: + - * / %
// Comparison: == != < <= > >=
// Logical: && || !
// Bitwise: & | ^ ~ << >>
// Pointers: & *
// Increment: ++ --
Functions
int32 add(int32 a, int32 b) {
return a + b;
}
int32 factorial(int32 n) {
if (n <= 1) return 1;
return n * factorial(n - 1);
}
Example Programs
All examples are in the examples/ directory:
| Program | Description |
|---|---|
| hello.cm | Hello World |
| fibonacci.cm | Recursive Fibonacci |
| arrays.cm | Array operations |
| pointers.cm | Pointer manipulation |
| bubblesort.cm | Bubble sort algorithm |
| bitwise.cm | Bitwise operations |
| types.cm | Type casting examples |
| switch.cm | Switch statements |
| primes.cm | Prime number calculator |
| strings.cm | String functions |
| calculator.cm | Expression evaluator |
| linkedlist.cm | Linked list (simulated) |
Build any example:
make fibonacci && ./fibonacci
make bubblesort && ./bubblesort
Test Suite
The test suite includes 60+ automated tests covering:
- Arithmetic and operators
- Variables and arrays
- Control flow
- Functions and recursion
- Pointers and type casting
- All integer types
Run tests:
make test
# or
./run_tests.sh
Compilation Pipeline
source.cm → [common compiler] → output.asm → [nasm] → output.o → [gcc] → executable
- Common compiler: Parses source, generates NASM assembly
- NASM: Assembles to ELF32 object file
- GCC: Links with C runtime library
Requirements
- GCC with 32-bit support (gcc-multilib)
- NASM assembler
- Linux or compatible environment (WSL works)
Installation:
# Ubuntu/Debian
sudo apt-get install build-essential gcc-multilib nasm
# Fedora/RHEL
sudo dnf install gcc glibc-devel.i686 nasm
# Arch
sudo pacman -S gcc lib32-gcc-libs nasm
Language Limitations
- Single file compilation - no modules or includes
- No structs/unions - use arrays for structured data
- No floating point - integers only
- No preprocessor - no #define, #include
- 1D arrays only - simulate 2D with manual indexing
- Partial 64-bit support - types exist but ops truncate to 32-bit
See MANUAL.md for complete details and workarounds.
Implementation Details
Target: x86-32 (IA-32) ELF
Calling convention: cdecl
Stack alignment: 16-byte (System V ABI)
Registers:
eax: return values, expressionsecx: left operandedx: scratchebp: frame pointeresp: stack pointer
Code sections:
.text: executable code.data: initialized globals, strings.bss: zero-initialized globals
Architecture
The compiler is a single-pass implementation in C99:
┌─────────────┐
│ Lexer │ Tokenize source
├─────────────┤
│ Parser │ Build AST
├─────────────┤
│ Type Check │ Infer expression types
├─────────────┤
│ Code Gen │ Emit NASM assembly
└─────────────┘
Key components:
- Lexer (150 LOC): Tokenization with lookahead
- Parser (400 LOC): Recursive descent parser
- Type System (200 LOC): Type inference for pointer arithmetic
- Code Generator (800 LOC): Assembly emission
Total: ~2000 lines of C99
C Interoperability
Common can call C functions:
// Declare C functions
void printf(uint8 *fmt, ...);
void *malloc(uint32 size);
void free(void *ptr);
int32 main(void) {
printf("Allocated %d bytes\n", 100);
void *mem = malloc(100);
free(mem);
return 0;
}
C can call Common functions:
// common.cm
int32 compute(int32 x) {
return x * x;
}
// main.c
extern int compute(int);
int main() {
printf("%d\n", compute(10));
}
Compile:
./common common.cm common.asm
nasm -f elf32 common.asm -o common.o
gcc -m32 main.c common.o -o program
Comparison to C
Similar to C
- Syntax and semantics
- Type system (with fewer types)
- Pointer arithmetic
- Control flow
- Function calls (cdecl)
Different from C
- No preprocessor
- No structs/unions
- No enums
- No static/extern keywords
- No goto
- Single file only
- Simpler type system
Simpler than C
- No type qualifiers (const, volatile)
- No storage classes (auto, register)
- No function pointers (can cast to void*)
- No variadic function definitions
- No bitfields
- No flexible array members
Project Structure
.
├── common.c # Compiler source (2000 LOC)
├── commonl # Linker
├── Makefile # Build automation
├── run_tests.sh # Quick test script
│
├── MANUAL.md # Complete language reference
├── QUICKREF.md # One-page cheat sheet
├── TROUBLESHOOTING.md # Problem solutions
├── README_TESTS.md # Test suite documentation
│
├── test_runner.c # Automated test harness
├── test_suite.cm # Test suite
│
└── examples/ # Example programs
├── hello.cm
├── fibonacci.cm
├── arrays.cm
├── pointers.cm
├── bubblesort.cm
├── bitwise.cm
├── types.cm
├── switch.cm
├── primes.cm
├── strings.cm
├── calculator.cm
└── linkedlist.cm
License & Educational Purpose
Public Domain / CC0
This project is dedicated to the public domain under the CC0 1.0 Universal license. This means you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission.
Why Public Domain?
- Ease of Use: - By removing all licensing restrictions, students can freely integrate code snippets from common.c into their own projects without legal overhead or attribution requirements.
- Educational Accessibility: - The compiler is designed to be a "pure" learning resource. Its single-file implementation is intended to be read and modified as if it were a textbook example. No Barriers: Just as the language requires "zero external dependencies," its legal status requires no compliance tracking, making it ideal for classroom settings and open-source forks.
- Simplicity: - A complex license would contradict the project's philosophy of "simplicity over features"
Credits
Inspired by:
- C - Dennis Ritchie and Brian Kernighan
- chibicc - Rui Ueyama's educational C compiler
- 8cc - Rui Ueyama's C compiler
- tcc - Fabrice Bellard's Tiny C Compiler
Built for programmers who value:
- Simplicity over features
- Control over convenience
- Learning over abstraction
Start with the Quick Reference or dive into the Manual.