2026-03-14 14:14:37 -04:00
# Common Programming Language
A minimalist, dependency-free compiler for a C-like language that targets x86-32 assembly.
## Overview
Common is a statically-typed systems programming language with:
- **No runtime library of its own** - executables link only against the C runtime (libc)
- **Direct C interoperability** - call and be called by C code
- **Predictable codegen** - straightforward mapping to assembly
- **Compact type system** - 8 integer types, pointers, arrays
- **Structured control flow** - if/else, loops, switch, functions
- **Few build dependencies** - just libc, gcc, and nasm
## Quick Start
### Build the Compiler
```bash
gcc -o common common.c
```
### Hello World
Create `hello.cm`:
```c
void puts(uint8 *s);
int32 main(void) {
    puts("Hello, World!");
    return 0;
}
```
Compile and run:
```bash
./common hello.cm hello.asm
nasm -f elf32 hello.asm -o hello.o
gcc -m32 hello.o -o hello
./hello
```
Or use the Makefile:
```bash
make # Build compiler and test suite
make test # Run all tests
make hello # Build hello example
make examples # Build all examples
make run-examples # Build and run all examples
```
## Documentation
### For Users
- **[Quick Reference](QUICKREF.md)** - One-page cheat sheet for syntax and operators
- **[Reference Manual](MANUAL.md)** - Complete language specification (80+ pages)
- **[Troubleshooting Guide](TROUBLESHOOTING.md)** - Solutions to common problems
### For Developers
- **[Test Suite README](README_TESTS.md)** - How to run and write tests
- **[Source Code](common.c)** - Well-commented compiler implementation
## Language Features
### Types
```c
// Integers
int8 int16 int32 int64 // Signed
uint8 uint16 uint32 uint64 // Unsigned
// Pointers and arrays
int32 *ptr; // Pointer
int32 arr[10]; // Array
uint8 *str = "text"; // String
```
### Control Flow
```c
if (x > 0) { ... }
while (x < 100) { ... }
for (int32 i = 0; i < n; i++) { ... }
switch (x) { case 1: ... break; }
```
### Operators
```c
// Arithmetic: + - * / %
// Comparison: == != < <= > >=
// Logical: && || !
// Bitwise: & | ^ ~ << >>
// Pointers: & *
// Increment: ++ --
```
### Functions
```c
int32 add(int32 a, int32 b) {
    return a + b;
}
int32 factorial(int32 n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}
```
## Example Programs
All examples are in the `examples/` directory:
| Program | Description |
|---------|-------------|
| **hello.cm** | Hello World |
| **fibonacci.cm** | Recursive Fibonacci |
| **arrays.cm** | Array operations |
| **pointers.cm** | Pointer manipulation |
| **bubblesort.cm** | Bubble sort algorithm |
| **bitwise.cm** | Bitwise operations |
| **types.cm** | Type casting examples |
| **switch.cm** | Switch statements |
| **primes.cm** | Prime number calculator |
| **strings.cm** | String functions |
| **calculator.cm** | Expression evaluator |
| **linkedlist.cm** | Linked list (simulated) |
Build any example:
```bash
make fibonacci && ./fibonacci
make bubblesort && ./bubblesort
```
## Test Suite
The test suite includes 60+ automated tests covering:
- Arithmetic and operators
- Variables and arrays
- Control flow
- Functions and recursion
- Pointers and type casting
- All integer types
Run tests:
```bash
make test
# or
./run_tests.sh
```
## Compilation Pipeline
```
source.cm → [common compiler] → output.asm → [nasm] → output.o → [gcc] → executable
```
1. **Common compiler**: Parses source, generates NASM assembly
2. **NASM**: Assembles to ELF32 object file
3. **GCC**: Links with C runtime library
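The three stages above can be driven from a small C helper. The sketch below only formats the same commands shown in Quick Start; the function name `pipeline_commands` and the 256-byte buffers are illustrative assumptions, not part of the project.

```c
#include <stdio.h>

/* Hypothetical helper: fills cmds[0..2] with the three pipeline commands
 * for a program called `base` ("hello" -> hello.cm, hello.asm, hello.o).
 * Returns 0 on success, -1 if any command would not fit in its buffer. */
int pipeline_commands(const char *base, char cmds[3][256]) {
    /* Stage 1: Common compiler, source -> NASM assembly */
    int n0 = snprintf(cmds[0], 256, "./common %s.cm %s.asm", base, base);
    /* Stage 2: NASM, assembly -> ELF32 object file */
    int n1 = snprintf(cmds[1], 256, "nasm -f elf32 %s.asm -o %s.o", base, base);
    /* Stage 3: GCC, link the object file with the C runtime */
    int n2 = snprintf(cmds[2], 256, "gcc -m32 %s.o -o %s", base, base);

    if (n0 < 0 || n0 >= 256 || n1 < 0 || n1 >= 256 || n2 < 0 || n2 >= 256)
        return -1;
    return 0;
}
```

Each string can then be passed to `system()` in order, stopping at the first non-zero exit status.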
## Requirements
- **GCC** with 32-bit support (gcc-multilib)
- **NASM** assembler
- **Linux** or compatible environment (WSL 2 works; WSL 1 cannot run 32-bit binaries)
Installation:
```bash
# Ubuntu/Debian
sudo apt-get install build-essential gcc-multilib nasm
# Fedora/RHEL
sudo dnf install gcc glibc-devel.i686 nasm
# Arch
sudo pacman -S gcc lib32-gcc-libs nasm
```
## Language Limitations
- **Single file compilation** - no modules or includes
- **No structs/unions** - use arrays for structured data
- **No floating point** - integers only
- **No preprocessor** - no #define, #include
- **1D arrays only** - simulate 2D with manual indexing
- **Partial 64-bit support** - types exist but ops truncate to 32-bit
See [MANUAL.md](MANUAL.md) for complete details and workarounds.
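Two of the workarounds listed above, manual 2D indexing and arrays standing in for structs, might look like the following. This is a sketch in plain C so it can be compiled and checked on its own. A Common version would presumably use `int32` instead of `int32_t` and, having no preprocessor, write the `#define` constants as literals. All names here are hypothetical.

```c
#include <stdint.h>

#define ROWS 3
#define COLS 4

/* A 3x4 "2D" grid stored in one flat array, row-major:
 * element (r, c) lives at index r * COLS + c. */
int32_t grid[ROWS * COLS];

void grid_set(int32_t r, int32_t c, int32_t value) {
    grid[r * COLS + c] = value;
}

int32_t grid_get(int32_t r, int32_t c) {
    return grid[r * COLS + c];
}

/* A "struct" with fields x and y, stored as fixed-size records
 * packed back to back in a flat array. */
#define POINT_FIELDS 2
#define POINT_X 0
#define POINT_Y 1
#define MAX_POINTS 8

int32_t points[MAX_POINTS * POINT_FIELDS];

void point_set(int32_t i, int32_t x, int32_t y) {
    points[i * POINT_FIELDS + POINT_X] = x;
    points[i * POINT_FIELDS + POINT_Y] = y;
}

int32_t point_x(int32_t i) { return points[i * POINT_FIELDS + POINT_X]; }
int32_t point_y(int32_t i) { return points[i * POINT_FIELDS + POINT_Y]; }
```

Both helpers rely on the same idea: a fixed stride (`COLS` or `POINT_FIELDS`) turns a logical position into a single 1D index.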
## Implementation Details
**Target**: x86-32 (IA-32) ELF
**Calling convention**: cdecl
**Stack alignment**: 16-byte (System V ABI)
**Registers**:
- `eax`: return values, expressions
- `ecx`: left operand
- `edx`: scratch
- `ebp`: frame pointer
- `esp`: stack pointer
**Code sections**:
- `.text`: executable code
- `.data`: initialized globals, strings
- `.bss`: zero-initialized globals
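To make the cdecl and register conventions above concrete, here is a minimal C sketch that writes NASM text for a two-argument `add` function. It is an illustration under stated assumptions, not the output of `common.c`: the helper names `arg_offset` and `emit_add` are hypothetical, and the real compiler may order or schedule instructions differently.

```c
#include <stdio.h>

/* Byte offset from ebp of the n-th (0-based) 32-bit cdecl argument.
 * After "push ebp / mov ebp, esp", [ebp+0] holds the saved ebp and
 * [ebp+4] the return address, so the first argument is at [ebp+8]. */
unsigned arg_offset(unsigned n) {
    return 8u + 4u * n;
}

/* Write NASM text for a function `name` that returns arg0 + arg1 in eax.
 * Returns the number of characters written (as snprintf does). */
int emit_add(char *out, size_t cap, const char *name) {
    return snprintf(out, cap,
        "global %s\n"
        "%s:\n"
        "    push ebp\n"            /* save the caller's frame pointer */
        "    mov ebp, esp\n"        /* set up this function's frame */
        "    mov ecx, [ebp+%u]\n"   /* left operand in ecx */
        "    mov eax, [ebp+%u]\n"   /* right operand in eax */
        "    add eax, ecx\n"        /* result is returned in eax */
        "    pop ebp\n"
        "    ret\n",                /* cdecl: the caller pops the arguments */
        name, name, arg_offset(0), arg_offset(1));
}
```

Printing the buffer produced by `emit_add(buf, sizeof buf, "add")` gives a listing that can be compared with the `.asm` file the compiler generates for the `add` function in the Functions section.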
## Architecture
The compiler is a single C99 source file that runs four stages in sequence:
```
┌─────────────┐
│ Lexer │ Tokenize source
├─────────────┤
│ Parser │ Build AST
├─────────────┤
│ Type Check │ Infer expression types
├─────────────┤
│ Code Gen │ Emit NASM assembly
└─────────────┘
```
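As an illustration of the first stage, the sketch below shows how one character of lookahead can separate `<`, `<=`, and `<<`. It is not taken from `common.c`; the token names and the `next_token` signature are assumptions made for the example, and every other character is treated as a single-character token.

```c
#include <stddef.h>

/* Hypothetical token kinds for operators that share a first character. */
typedef enum { TOK_LT, TOK_LE, TOK_SHL, TOK_OTHER, TOK_EOF } TokKind;

/* Scan one token starting at src[*pos], skipping whitespace first,
 * and advance *pos past the token. */
TokKind next_token(const char *src, size_t *pos) {
    while (src[*pos] == ' ' || src[*pos] == '\t' || src[*pos] == '\n')
        (*pos)++;

    char c = src[*pos];
    if (c == '\0')
        return TOK_EOF;
    (*pos)++;                      /* consume the first character */

    if (c == '<') {
        char la = src[*pos];       /* lookahead: peek without consuming */
        if (la == '=') { (*pos)++; return TOK_LE; }
        if (la == '<') { (*pos)++; return TOK_SHL; }
        return TOK_LT;
    }
    return TOK_OTHER;              /* anything else: one-character token */
}
```

Calling `next_token` repeatedly on `"a <= b << c < d"` yields `TOK_OTHER`, `TOK_LE`, `TOK_OTHER`, `TOK_SHL`, `TOK_OTHER`, `TOK_LT`, `TOK_OTHER`, and then `TOK_EOF`.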
Key components:
- **Lexer** (150 LOC): Tokenization with lookahead
- **Parser** (400 LOC): Recursive descent parser
- **Type System** (200 LOC): Type inference for pointer arithmetic
- **Code Generator** (800 LOC): Assembly emission
Total: ~2000 lines of C99
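The type system entry mentions type inference for pointer arithmetic. Assuming Common follows C's rule that `p + n` advances by `n` elements rather than `n` bytes, the code generator needs the pointee size of each pointer expression. The sketch below shows that calculation. The `Type` descriptor and function names are hypothetical and are not the data structures used in `common.c`.

```c
#include <stdint.h>

/* Hypothetical type descriptor: an integer width plus a pointer depth. */
typedef struct {
    uint32_t base_size;   /* size in bytes of the underlying integer type */
    uint32_t ptr_depth;   /* 0 = plain integer, 1 = T*, 2 = T**, ... */
} Type;

#define POINTER_SIZE 4u   /* pointers are 4 bytes on x86-32 */

/* Size of the object a pointer of type t points at; 0 if t is not a pointer. */
uint32_t pointee_size(Type t) {
    if (t.ptr_depth == 0) return 0;
    if (t.ptr_depth == 1) return t.base_size;
    return POINTER_SIZE;  /* T** points at a T*, itself a 4-byte pointer */
}

/* Byte offset added to the address when evaluating `p + n`, C-style. */
uint32_t ptr_add_offset(Type p, uint32_t n) {
    return n * pointee_size(p);
}
```

For an `int32 *` (base size 4, depth 1), `ptr_add_offset(t, 3)` is 12 bytes; for a `uint8 **` (depth 2), adding 2 moves 8 bytes because each element is a pointer.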
## C Interoperability
Common can call C functions:
```c
// Declare C functions
void printf(uint8 *fmt, ...);
void *malloc(uint32 size);
void free(void *ptr);
int32 main(void) {
    void *mem = malloc(100);
    printf("Allocated %d bytes\n", 100);
    free(mem);
    return 0;
}
```
C can call Common functions:
```c
// common.cm
int32 compute(int32 x) {
    return x * x;
}
// main.c
#include <stdio.h>
extern int compute(int);
int main(void) {
    printf("%d\n", compute(10));
    return 0;
}
```
Compile:
```bash
./common common.cm common.asm
nasm -f elf32 common.asm -o common.o
gcc -m32 main.c common.o -o program
```
## Comparison to C
### Similar to C
- Syntax and semantics
- Type system (with fewer types)
- Pointer arithmetic
- Control flow
- Function calls (cdecl)
### Different from C
- No preprocessor
- No structs/unions
- No enums
- No static/extern keywords
- No goto
- Single file only
- Simpler type system
### Simpler than C
- No type qualifiers (const, volatile)
- No storage classes (auto, register)
- No function pointers (can cast to void*)
- No variadic function definitions
- No bitfields
- No flexible array members
## Project Structure
```
.
├── common.c # Compiler source (2000 LOC)
├── commonl # Linker
├── Makefile # Build automation
├── run_tests.sh # Quick test script
├── MANUAL.md # Complete language reference
├── QUICKREF.md # One-page cheat sheet
├── TROUBLESHOOTING.md # Problem solutions
├── README_TESTS.md # Test suite documentation
├── test_runner.c # Automated test harness
├── test_suite.cm # Test suite
└── examples/ # Example programs
    ├── hello.cm
    ├── fibonacci.cm
    ├── arrays.cm
    ├── pointers.cm
    ├── bubblesort.cm
    ├── bitwise.cm
    ├── types.cm
    ├── switch.cm
    ├── primes.cm
    ├── strings.cm
    ├── calculator.cm
    └── linkedlist.cm
```
## License & Educational Purpose
### Public Domain / CC0
This project is dedicated to the public domain under the CC0 1.0 Universal license. This means you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission.
### Why Public Domain?
- **Ease of use:** With no licensing restrictions, students can freely integrate code snippets from `common.c` into their own projects without legal overhead or attribution requirements.
- **Educational accessibility:** The compiler is designed as a learning resource. Its single-file implementation is meant to be read and modified like a textbook example.
- **No barriers:** Just as the language keeps its dependencies to a minimum, its legal status requires no compliance tracking, which suits classroom settings and open-source forks.
- **Simplicity:** A complex license would contradict the project's philosophy of "simplicity over features".
## Credits
Inspired by:
- **C** - Dennis Ritchie and Brian Kernighan
- **chibicc** - Rui Ueyama's educational C compiler
- **8cc** - Rui Ueyama's C compiler
- **tcc** - Fabrice Bellard's Tiny C Compiler
Built for programmers who value:
- Simplicity over features
- Control over convenience
- Learning over abstraction
---
Start with the [Quick Reference](QUICKREF.md) or dive into the [Manual](MANUAL.md).