README

Radiance is a self-hosted compiler: the compiler is written in Radiance
and compiles itself. This creates a bootstrapping problem: you need a
working compiler to build the compiler. The solution is the "seed" -- a
trusted, checked-in binary that can compile the current source.

This document describes the workflows for developing the compiler,
updating the seed, and maintaining reproducibility.


CONCEPTS

  Seed            A known-good compiler binary checked into the repository
                  (`seed/radiance.rv64`). It can compile the current source
                  code into a working compiler.

  Stage           One round of self-compilation. Stage N uses the stage N-1
                  binary to compile the source. Stage 0 is the seed itself.

  Fixed point     Reached when two consecutive stages produce bit-for-bit
                  identical binaries. This proves the compiler faithfully
                  reproduces itself.

  "Dev" binary    `bin/radiance.rv64.dev` -- built by `make` from the seed.
                  This is the working compiler used during development.
                  It is not checked in.

  Breaking        A source change that the current seed cannot compile.
  change          E.g. new syntax, changed calling conventions, or removed
                  features the compiler uses during self-compilation. This
                  requires generating a new seed.

  Compatible      A source change that the current seed can still compile.
  change          E.g. bug fixes, new optimizations, new library code that
                  the compiler itself doesn't use, or isn't meaningfully
                  affected by.


FILES

  seed/radiance.rv64           Seed binary (RISC-V machine code).
  seed/radiance.rv64.ro.data   Read-only data section.
  seed/radiance.rv64.rw.data   Read-write data section.
  seed/radiance.rv64.git       Hash of the git commit whose *source*
                               was compiled to produce this seed.
  seed/update                  Tool that finds the fixed point and
                               updates the seed.


EVERYDAY DEVELOPMENT

Most compiler work -- bug fixes, optimizations, new standard library
features, new backends -- does not require a seed update. The workflow is
simply:

  1. Edit source code
  2. Build the dev binary (produces a new `bin/radiance.rv64.dev`)

      make

  3. Run tests

      make test

  4. Commit source changes only. The seed is untouched.

The `dev` binary is ephemeral and rebuilt from the seed on every `make`. As
long as the seed can compile the current source, no seed update is needed.


WHEN TO UPDATE THE SEED

Update the seed in any of the following cases:

  * A breaking change is being introduced, i.e. a change that breaks the
    seed's ability to compile the source. You must update the seed *before*
    committing source that depends on the change. See "BREAKING CHANGES"
    below.

  * You want the benefits of compiler improvements (better code generation,
    faster compilation) to apply to the build itself. This is optional
    but often a good idea.

  * The fixed-point property needs re-verification after significant
    changes. Even compatible changes can alter the output binary, and
    reaching a fixed point confirms the compiler is self-consistent and
    deterministic.

Do *not* update the seed casually. Each seed update adds a large binary
diff to the repository.


BREAKING CHANGES

A breaking change is one where the new source cannot be compiled by the
old seed. Examples: new syntax the compiler uses on itself, changed
data structures in the AST, removed intrinsics.

The fundamental constraint is:

    The checked-in seed must always be able to compile the
    checked-in source.

This means you cannot simply commit a breaking change and update the
seed afterward: there would be a commit where the seed cannot build
the source. Instead:

    1. Add support for the new feature to the source, but don't use it
       in the compiler's own source yet. Ensure old syntax/behavior
       still works.

    2. Run `seed/update` to produce a new seed that understands the
       new feature.

    3. Commit source + updated seed together.

The seed now understands the new feature. From here, switching the
compiler's own source to use it is just a compatible change: the
seed can already compile it. No further seed update is required.


HOW UPDATING THE SEED WORKS

  seed/update [--from-s0] [--seed <path>]

  1. Stage 1: Runs the seed to compile the current source.
     Outputs `seed/radiance.rv64.s1`.
  2. Compares the seed with S1. If identical, done (fixed point reached).
  3. Stage 2: Runs S1 to compile the source. Outputs S2.
  4. Compares S1 and S2. If identical, done.
  5. Continues for a bounded number of stages. Fails if no fixed point
     is reached.

When a fixed point is found, it copies the converged binary to
`seed/radiance.rv64` and writes the current HEAD commit hash to
`seed/radiance.rv64.git`.
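
The staged comparison described above can be sketched in POSIX shell as
follows. This is a minimal illustration, not the actual `seed/update`
script. `compile_with` is a hypothetical placeholder for however a given
compiler binary is run on the current source, and the stage limit is a
parameter because this document does not state the real limit.

```shell
#!/bin/sh
# Sketch only: names and interfaces here are assumptions, not seed/update.

# compile_with <compiler-binary> <output-path>
# Hypothetical hook: replace with the real compiler invocation.
compile_with() {
    echo "compile_with: not implemented" >&2
    return 1
}

# find_fixed_point <seed> <workdir> <max-stages>
# Prints the path of the converged binary, or fails if none is reached.
find_fixed_point() {
    prev=$1
    workdir=$2
    max=$3
    stage=1
    while [ "$stage" -le "$max" ]; do
        next="$workdir/radiance.rv64.s$stage"
        compile_with "$prev" "$next" || return 1
        # cmp -s prints nothing and exits 0 only for byte-identical files.
        if cmp -s "$prev" "$next"; then
            echo "$next"
            return 0
        fi
        prev=$next
        stage=$((stage + 1))
    done
    echo "no fixed point reached after $max stages" >&2
    return 1
}
```

The first iteration compares the seed with S1, so an unchanged compiler
converges at stage 1, matching step 2 above.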

Why might it take multiple stages?

  * Stage 1 differs from seed: The source changed, so the compiler
    binary changed. Normal.

  * Stage 2 differs from Stage 1: The source changes affected how the
    compiler generates code for itself. The S1 compiler (built by the
    old seed) generates slightly different code than the S2 compiler
    (built by S1, which incorporates the changes). Usually converges
    at Stage 2 or 3.

  * No convergence after 3+ stages: Something is non-deterministic in
    code generation (memory addresses leaking into output, hash map
    iteration order, etc.). This is a bug that must be fixed.


VERIFYING THE SEED

The seed is an opaque binary checked into the repository. Since binaries
can't be reviewed like source code, trust relies on reproducibility: anyone
can rebuild the seed from source and verify it matches.

  Verify the fixed-point property

    Run `seed/update`. If the seed is already at a fixed point, Stage 1
    will report IDENTICAL immediately. This confirms that compiling the
    current source with the seed produces the seed itself -- the compiler
    faithfully reproduces its own binary.

  Verify from an independent build

    If you have a separately-obtained Radiance compiler (e.g. built from
    a different trusted seed, or received from another party), use it as
    the starting point:

      seed/update --seed /path/to/trusted/radiance.rv64

    If this converges to the same fixed point as the checked-in seed,
    you have strong evidence that the seed is a faithful product of the
    source code and not a tampered binary. A backdoor that exists only in
    the seed binary, and not in the source, does not survive compilation
    by an independent, untampered compiler.

    The bootstrapping compiler can serve as this independent second compiler.
    Its source can be audited, and any C99 compiler can be used to compile it.
    To use it as the seed, pass `--from-s0` like so:

      seed/update --from-s0 --seed ./radiance.s0
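
One way to check that a run "converges to the same fixed point as the
checked-in seed" is to snapshot the seed, run the update, and compare
bytes. The helper below is a sketch: `unchanged_after` is a hypothetical
name, and it relies on the behavior described earlier, that `seed/update`
overwrites `seed/radiance.rv64` when it converges.

```shell
#!/bin/sh
# unchanged_after <file> <command> [args...]
# Runs the command and succeeds only if the command itself succeeded and
# <file> holds the same bytes afterward as it did before.
unchanged_after() {
    file=$1
    shift
    backup=$(mktemp) || return 1
    cp "$file" "$backup" || { rm -f "$backup"; return 1; }
    if "$@" && cmp -s "$backup" "$file"; then
        result=0
    else
        result=1
    fi
    rm -f "$backup"
    return "$result"
}

# Example, from the repository root (path to the trusted compiler is yours):
#   unchanged_after seed/radiance.rv64 \
#       seed/update --seed /path/to/trusted/radiance.rv64
```

The same check could be repeated for the `.ro.data` and `.rw.data` files.
Because `seed/update` writes into `seed/`, you may want to restore that
directory afterward (for example with `git checkout -- seed/`) if the run
was only meant as a verification.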

  Verify the source commit

    The file `seed/radiance.rv64.git` records which commit's source was
    compiled to produce the seed.
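
As an illustration, the recorded commit can be checked against the local
repository with standard git commands. The sketch below assumes that
`seed/radiance.rv64.git` contains a single full-length hexadecimal commit
hash and nothing else; the function names are hypothetical.

```shell
#!/bin/sh
# read_seed_commit <hash-file>
# Prints the hash if the file holds one full-length hex commit hash.
# (Assumed file format: a bare hash, optionally followed by a newline.)
read_seed_commit() {
    commit=$(tr -d '[:space:]' < "$1")
    case $commit in
        ''|*[!0-9a-f]*)
            echo "not a hex commit hash: $1" >&2
            return 1
            ;;
    esac
    # 40 hex digits is a SHA-1 object name, 64 is SHA-256.
    case ${#commit} in
        40|64) echo "$commit" ;;
        *)
            echo "unexpected hash length ${#commit}: $1" >&2
            return 1
            ;;
    esac
}

# changes_since_seed <hash-file>
# Summarizes source changes between the seed's commit and HEAD.
changes_since_seed() {
    commit=$(read_seed_commit "$1") || return 1
    # Fails if the hash does not name a commit in this repository.
    git cat-file -e "$commit^{commit}" || return 1
    git diff --stat "$commit" HEAD
}

# Example, from the repository root:
#   changes_since_seed seed/radiance.rv64.git
```

An empty diff means the seed was built from exactly the source at HEAD.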


TROUBLESHOOTING

  "No fixed point reached after N stages"

    The compiler output is non-deterministic. Diff the binaries
    to find what's changing. Common causes:

    * Pointer values or addresses leaking into generated code
    * Hash table iteration order affecting output
    * Uninitialized memory read during compilation
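
To see where two stage binaries diverge, the POSIX `cmp` utility can list
the differing bytes. The helpers below are a small sketch with
hypothetical names. Which stage files are left on disk after a failed run
is an assumption; only `seed/radiance.rv64.s1` is named in this document.

```shell
#!/bin/sh
# first_diffs <binary-a> <binary-b> [count]
# Lists the first differing bytes, one per line:
#   offset (decimal, 1-based), byte in A (octal), byte in B (octal)
first_diffs() {
    cmp -l "$1" "$2" | head -n "${3:-20}"
}

# count_diffs <binary-a> <binary-b>
# Prints how many byte positions differ (0 for identical files).
count_diffs() {
    echo $(( $(cmp -l "$1" "$2" | wc -l) ))
}

# Example:
#   first_diffs seed/radiance.rv64 seed/radiance.rv64.s1
```

A few differing bytes that move between runs often point at leaked
addresses or uninitialized memory. Large reordered regions are more
typical of iteration-order problems.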