I'm often skeptical of the desire to create a lot of passes. In the early Vale compiler, and in the Mojo compiler, we were paying a lot of interest on tech debt because features were put in the wrong pass. We often incurred more complexity trying to make a concept work across passes than we would have had in fewer, larger passes. I imagine this also has analogies to microservices in some way. Maybe other compiler people can weigh in here on the correct number/kind of passes.
Yes, and a similar question is the organization of the thing being acted on by the passes. If I understand correctly, this is in Scheme and the things being acted on are trees with pointers. A performance-optimized compiler, on the other hand, will probably use some sort of array-based implementation of trees.
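To make the array-based idea concrete, here is a minimal sketch in Python. The node kinds, the `FlatTree` class, and the `evaluate` function are all made up for illustration and not taken from any particular compiler. Nodes live in parallel arrays and refer to each other by integer index instead of by pointer.

```python
from dataclasses import dataclass, field

# Hypothetical node kinds for a tiny expression language.
NUM, ADD, MUL = 0, 1, 2

@dataclass
class FlatTree:
    """Tree stored as parallel arrays; nodes are referred to by integer index."""
    kind: list = field(default_factory=list)
    lhs: list = field(default_factory=list)  # child index, or the literal value for NUM
    rhs: list = field(default_factory=list)  # child index, or -1 when unused

    def add(self, kind, lhs, rhs=-1):
        self.kind.append(kind)
        self.lhs.append(lhs)
        self.rhs.append(rhs)
        return len(self.kind) - 1

def evaluate(tree):
    # Children are always appended before their parent here, so one
    # left-to-right sweep over the arrays sees operands before operators.
    values = [0] * len(tree.kind)
    for i, k in enumerate(tree.kind):
        if k == NUM:
            values[i] = tree.lhs[i]
        elif k == ADD:
            values[i] = values[tree.lhs[i]] + values[tree.rhs[i]]
        else:  # MUL
            values[i] = values[tree.lhs[i]] * values[tree.rhs[i]]
    return values[-1]

t = FlatTree()
two = t.add(NUM, 2)
three = t.add(NUM, 3)
four = t.add(NUM, 4)
product = t.add(MUL, three, four)
root = t.add(ADD, two, product)
result = evaluate(t)  # evaluates 2 + 3 * 4
```

The point of the layout is that a pass becomes a linear scan over contiguous arrays rather than a pointer chase, which is where the speed would come from in a language with real arrays.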
There's also a question of data about the trees (like, a flow graph) being recomputed for each nanopass. Also expensive.
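One common way around the recomputation cost is to cache analysis results and throw them away only when a pass reports that it changed the IR. A minimal sketch follows, assuming a toy IR of `(dest, args)` tuples; `AnalysisCache`, `run_passes`, and the passes themselves are hypothetical names, not part of the Nanopass framework.

```python
class AnalysisCache:
    """Memoizes analyses until a pass reports that it changed the IR."""
    def __init__(self):
        self._results = {}
        self.computations = 0  # counts how often an analysis really ran

    def get(self, name, compute, ir):
        if name not in self._results:
            self._results[name] = compute(ir)
            self.computations += 1
        return self._results[name]

    def invalidate(self):
        self._results.clear()

def run_passes(ir, passes, cache):
    for p in passes:
        ir, changed = p(ir, cache)
        if changed:
            cache.invalidate()  # conservative: drop every cached analysis
    return ir

def used_names(ir):
    # Hypothetical analysis: every name read by some instruction.
    return {arg for _, args in ir for arg in args}

def lint(ir, cache):
    cache.get("used", used_names, ir)  # read-only pass
    return ir, False

def drop_dead(ir, cache):
    used = cache.get("used", used_names, ir)
    kept = [(dest, args) for dest, args in ir if dest in used or dest == "ret"]
    return kept, len(kept) != len(ir)

ir = [("a", []), ("b", []), ("ret", ["a"])]
cache = AnalysisCache()
out = run_passes(ir, [lint, drop_dead, lint], cache)
```

Three passes consult the analysis here, but it is computed only twice: once up front and once after `drop_dead` changes the IR. Finer-grained invalidation (per analysis rather than all at once) would save more, at the cost of each pass declaring what it preserves.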
I'm creating a language/compiler now, and I'm quite certain that I did not have enough passes initially. I hope I'm at a good spot now, but time will tell.
I agree with the notion that having multiple passes makes compilers easier to understand and maintain, but finding the right number of passes is the real challenge here.
Wouldn't this kind of architecture yield a slower compiler, regardless of output quality? Conceptually, implementing the fewest passes possible, with each doing as much work as it can, would make more sense to me.
The Nanopass DSL just gives the user a nicer syntax to specify the transformations.
Bottlenecks are changing and it's pretty interesting.
The optimal number of passes/IRs depends heavily on what language is being compiled. Some languages naturally warrant an architecture with a lot of passes.
Compiling Scheme, for instance, would naturally entail several passes. It could look something like the following:
Lexer -> Parser -> Macro Expander -> Alpha Renaming -> Core AST (Lowering) -> CPS Transform -> Beta / Eta Reduction -> Closure Conversion -> Codegen
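As a rough illustration of how a chain like this composes, here is a sketch in Python of only the first few stages. Everything in it is an assumption made for the example: nested Python lists stand in for the AST, and a hard-coded `let` desugaring stands in for a real macro expander. The later stages (alpha renaming, CPS, closure conversion, codegen) are not implemented and would simply be appended to `PASSES`.

```python
from functools import reduce

def tokenize(src):
    # Lexer: pad parentheses with spaces, then split on whitespace.
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    # Parser: recursive-descent reader producing nested lists.
    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1
    tree, _ = read(0)
    return tree

def desugar_let(expr):
    # Stand-in for macro expansion:
    # (let ((x v)) body)  =>  ((lambda (x) body) v)
    if not isinstance(expr, list):
        return expr
    expr = [desugar_let(e) for e in expr]
    if expr and expr[0] == "let":
        bindings, body = expr[1], expr[2]
        names = [b[0] for b in bindings]
        values = [b[1] for b in bindings]
        return [["lambda", names, body]] + values
    return expr

# Each pass takes the previous pass's output; later stages would be appended.
PASSES = [tokenize, parse, desugar_let]

def compile_source(src, passes=PASSES):
    return reduce(lambda ir, p: p(ir), passes, src)

lowered = compile_source("(let ((x 1)) (+ x 2))")
```

Each pass is an ordinary function from one representation to the next, so adding, removing, or reordering a pass is a one-line change to the list. That is the maintainability argument in miniature; whether it survives contact with passes that need to share state is the question raised upthread.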