mirror of
https://github.com/jart/cosmopolitan.git
synced 2025-08-01 23:40:28 +00:00
Add chibicc
This program popped up on Hacker News recently. It's the only modern compiler I've ever seen that doesn't have dependencies and is easily modified. So I added all of the missing GNU extensions I like to use, which means it might soon be possible to build on non-Linux systems and to have third_party not vendor gcc binaries.
parent
e44a0cf6f8
commit
8da931a7f6
298 changed files with 19493 additions and 11950 deletions
134
third_party/chibicc/README.md
vendored
Normal file
@@ -0,0 +1,134 @@
# chibicc: A Small C Compiler

(The old master has moved to the
[historical/old](https://github.com/rui314/chibicc/tree/historical/old)
branch. This is a new one uploaded in September 2020.)

chibicc is yet another small C compiler that implements most C11
features. Even though it still probably falls into the "toy compilers"
category just like other small compilers do, chibicc can compile several
real-world programs, including [Git](https://git-scm.com/),
[SQLite](https://sqlite.org) and
[libpng](http://www.libpng.org/pub/png/libpng.html), without making
modifications to the compiled programs. Generated executables of these
programs pass their corresponding test suites. So, chibicc actually
supports a wide variety of C11 features and is able to compile hundreds of
thousands of lines of real-world C code correctly.

chibicc is developed as the reference implementation for a book I'm
currently writing about C compilers and low-level programming.
The book covers this vast topic with an incremental approach; in the first
chapter, readers will implement a "compiler" that accepts just a single
number as a "language", which will then gain one feature at a time in each
section of the book until the language that the compiler accepts matches
what the C11 spec specifies. I took this incremental approach from [the
paper](http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf) by Abdulaziz
Ghuloum.
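
To make the starting point of that incremental approach concrete, here is a
minimal sketch of a "compiler" whose whole language is a single decimal
number. It is an illustration only, not the code from the book or from
chibicc's first commit: the function name `compile_number`, its buffer-based
interface, and the choice of x86-64 AT&T assembly as output are all
assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper, not taken from chibicc. "Compiles" a source string
// that must consist of decimal digits only into x86-64 assembly (AT&T
// syntax) for a main() that returns that number.
// Returns 0 on success, -1 if the input is not a plain number or the
// output buffer is too small.
int compile_number(const char *src, char *out, size_t cap) {
  assert(src != NULL && out != NULL);

  // The whole "language": one or more digits and nothing else.
  size_t len = strlen(src);
  if (len == 0 || strspn(src, "0123456789") != len)
    return -1;

  long val = strtol(src, NULL, 10);
  int n = snprintf(out, cap,
                   "  .globl main\n"
                   "main:\n"
                   "  mov $%ld, %%rax\n"
                   "  ret\n",
                   val);
  return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

On an x86-64 Linux system, assembling and linking the emitted text with an
existing toolchain should yield a program whose exit status is the input
number (truncated to 8 bits by the operating system). Every later feature
would then be a small extension of this skeleton.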

Each commit of this project corresponds to a section of the book. For this
purpose, not only the final state of the project but each commit was
carefully written with readability in mind. Readers should be able to learn
how a C language feature can be implemented just by reading one or a few
commits of this project. For example, this is how
[while](https://github.com/rui314/chibicc/commit/773115ab2a9c4b96f804311b95b20e9771f0190a),
[[]](https://github.com/rui314/chibicc/commit/75fbd3dd6efde12eac8225d8b5723093836170a5),
[?:](https://github.com/rui314/chibicc/commit/1d0e942fd567a35d296d0f10b7693e98b3dd037c),
and [thread-local
variable](https://github.com/rui314/chibicc/commit/79644e54cc1805e54428cde68b20d6d493b76d34)
are implemented. If you have plenty of spare time, it might be fun to read
it from the [first
commit](https://github.com/rui314/chibicc/commit/0522e2d77e3ab82d3b80a5be8dbbdc8d4180561c).

If you like this project, please consider purchasing a copy of the book
when it becomes available! 😀 I publish the source code here to give people
early access to it, because I was planning to do that anyway with a
permissive open-source license after publishing the book. If I don't charge
for the source code, it doesn't make much sense to me to keep it private. I
hope to publish the book in 2021.

I pronounce chibicc as _chee bee cee cee_. "chibi" means "mini" or
"small" in Japanese. "cc" stands for C compiler.

## Status

Features that are often missing in a small compiler but supported by
chibicc include (but are not limited to):

- Preprocessor
- long double (x87 80-bit floating point numbers)
- Bit-field
- alloca()
- Variable-length array
- Thread-local variable
- Atomic variable
- Common symbol
- Designated initializer
- L, u, U and u8 string literals
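
For readers less familiar with some of the features listed above, the
following short sketch shows what a few of them look like in standard C11
source. It is only an illustration: all identifiers are made up for this
example, and it has not been checked against any particular chibicc build.

```c
#include <assert.h>
#include <stddef.h>  // size_t, wchar_t

// Bit-field: two members packed into a handful of bits.
struct flags {
  unsigned ready : 1;
  unsigned mode : 3;
};

// Thread-local variable: every thread gets its own counter.
static _Thread_local int tls_counter;

// L and u8 string literals.
static const wchar_t wide_name[] = L"cc";
static const char utf8_name[] = u8"chibi";

// Designated initializer: members are named, so order does not matter.
struct flags make_flags(void) {
  struct flags f = {.mode = 5, .ready = 1};
  return f;
}

// Variable-length array: the array size is only known at run time.
int sum_squares(int n) {
  assert(n > 0);  // a VLA needs a positive length
  int sq[n];
  int sum = 0;
  for (int i = 0; i < n; i++) {
    sq[i] = i * i;
    sum += sq[i];
  }
  return sum;
}

// Increments and returns the calling thread's counter.
int bump(void) { return ++tls_counter; }

// Lengths in characters, excluding the terminating NUL.
size_t wide_len(void) { return sizeof wide_name / sizeof wide_name[0] - 1; }
size_t utf8_len(void) { return sizeof utf8_name - 1; }
```

Each of these constructs needs dedicated support in the parser or the code
generator, which is why small compilers tend to leave them out.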

chibicc does not support digraphs, trigraphs, complex numbers, K&R-style
function prototypes, or inline assembly.

chibicc outputs a simple but nice error message when it finds an error in
source code.

There's no optimization pass. chibicc emits terrible code which is probably
at least twice as slow as GCC's output. I have a plan to add an
optimization pass once the frontend is done.

## Internals

chibicc consists of the following stages:

- Tokenize: A tokenizer takes a string as an input, breaks it into a list
  of tokens and returns them.

- Preprocess: A preprocessor takes as an input a list of tokens and outputs
  a new list of macro-expanded tokens. It interprets preprocessor
  directives while expanding macros.

- Parse: A recursive descent parser constructs abstract syntax trees
  from the output of the preprocessor. It also adds a type to each AST
  node.

- Codegen: A code generator emits assembly text for given AST nodes.
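
The stages above can be sketched end to end for a toy language of integers,
`+`, `-` and parentheses. This is a simplified illustration under several
assumptions and not chibicc's actual source: every name is hypothetical, the
preprocessing stage is omitted, no types are attached to AST nodes (every
value is treated as a 64-bit integer), and the output is x86-64 assembly in
AT&T syntax.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical names throughout; this is not chibicc's actual source.
typedef enum { TK_NUM, TK_PUNCT, TK_EOF } TokenKind;
typedef struct { TokenKind kind; long val; char op; } Token;
typedef enum { ND_NUM, ND_ADD, ND_SUB } NodeKind;
typedef struct Node { NodeKind kind; long val; struct Node *lhs, *rhs; } Node;

// Tokenize: string -> token array. Returns the token count or -1 on error.
int tokenize(const char *p, Token *toks, int cap) {
  int n = 0;
  if (cap < 1) return -1;
  while (*p) {
    if (isspace((unsigned char)*p)) { p++; continue; }
    if (n >= cap - 1) return -1;  // keep room for TK_EOF
    if (isdigit((unsigned char)*p)) {
      char *end;
      long val = strtol(p, &end, 10);
      toks[n++] = (Token){TK_NUM, val, 0};
      p = end;
    } else if (strchr("+-()", *p)) {
      toks[n++] = (Token){TK_PUNCT, 0, *p++};
    } else {
      return -1;  // character outside the toy language
    }
  }
  toks[n++] = (Token){TK_EOF, 0, 0};
  return n;
}

static int is_op(Token *t, char c) { return t->kind == TK_PUNCT && t->op == c; }
static Node *new_node(NodeKind kind, long val, Node *lhs, Node *rhs) {
  Node *nd = calloc(1, sizeof(Node));
  if (!nd) abort();
  *nd = (Node){kind, val, lhs, rhs};
  return nd;
}
static Node *expr(Token **tok);

// primary = "(" expr ")" | num
// Consumes one token up front; callers never look at *tok after a failure.
static Node *primary(Token **tok) {
  Token *t = (*tok)++;
  if (t->kind == TK_NUM) return new_node(ND_NUM, t->val, NULL, NULL);
  if (!is_op(t, '(')) return NULL;
  Node *nd = expr(tok);
  if (!nd || !is_op(*tok, ')')) return NULL;
  (*tok)++;
  return nd;
}

// expr = primary (("+" | "-") primary)*
static Node *expr(Token **tok) {
  Node *nd = primary(tok);
  while (nd && (is_op(*tok, '+') || is_op(*tok, '-'))) {
    NodeKind kind = is_op(*tok, '+') ? ND_ADD : ND_SUB;
    (*tok)++;
    nd = new_node(kind, 0, nd, primary(tok));
    if (!nd->rhs) return NULL;
  }
  return nd;
}

// Parse: token array -> AST, or NULL on a syntax error (nodes are leaked).
Node *parse(Token *tok) {
  Node *nd = expr(&tok);
  return (nd && tok->kind == TK_EOF) ? nd : NULL;
}

// Codegen: AST -> x86-64 assembly (AT&T syntax); the result ends up in %rax.
void gen(Node *nd, FILE *out) {
  if (nd->kind == ND_NUM) {
    fprintf(out, "  mov $%ld, %%rax\n", nd->val);
    return;
  }
  assert(nd->kind == ND_ADD || nd->kind == ND_SUB);
  gen(nd->rhs, out);
  fprintf(out, "  push %%rax\n");
  gen(nd->lhs, out);
  fprintf(out, "  pop %%rdi\n");
  fprintf(out, "  %s %%rdi, %%rax\n", nd->kind == ND_ADD ? "add" : "sub");
}

// Runs all stages on src and writes a complete main() to out.
int compile(const char *src, FILE *out) {
  Token toks[64];
  if (tokenize(src, toks, 64) < 0) return -1;
  Node *ast = parse(toks);
  if (!ast) return -1;
  fprintf(out, "  .globl main\nmain:\n");
  gen(ast, out);
  fprintf(out, "  ret\n");
  return 0;
}
```

Calling `compile("(5-3)+10", stdout)` prints assembly for a `main` that
evaluates the expression on the machine stack. Keeping the stages separate
like this means each one only has to understand the data structure produced
by the stage before it.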

## Contributing

When I find a bug in this compiler, I go back to the original commit that
introduced the bug and rewrite the commit history as if there were no such
bug from the beginning. This is an unusual way of fixing bugs, but as a
part of a book, it is important to keep every commit bug-free.

Thus, I do not take pull requests in this repo. You can send me a pull
request if you find a bug, but it is very likely that I will read your
patch and then apply that to my previous commits by rewriting history. I'll
credit your name somewhere, but your changes will be rewritten by me before
being submitted to this repository.

Also, please assume that I will occasionally force-push my local repository
to this public one to rewrite history. If you clone this project and make
local commits on top of it, your changes will have to be rebased by hand
when I force-push new commits.

## About the Author

I'm Rui Ueyama. I'm the creator of [8cc](https://github.com/rui314/8cc),
which is a hobby C compiler, and also the original creator of the current
version of [LLVM lld](https://lld.llvm.org) linker, which is a
production-quality linker used by various operating systems and large-scale
build systems.

## References

- [tcc](https://bellard.org/tcc/): A small C compiler written by Fabrice
  Bellard. I learned a lot from this compiler, but the designs of tcc and
  chibicc are different. In particular, tcc is a one-pass compiler, while
  chibicc is a multi-pass one.

- [lcc](https://github.com/drh/lcc): Another small C compiler. The creators
  wrote a [book](https://sites.google.com/site/lccretargetablecompiler/)
  about the internals of lcc, which I found a good resource to see how a
  compiler is implemented.

- [An Incremental Approach to Compiler
  Construction](http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf)