-
Notifications
You must be signed in to change notification settings - Fork 165
Name resolution
Consider the following simple rust program:
fn test(a :i32) -> i32 {
a + 1
}
fn main() {
let a;
a = 1;
let b;
b = a;
let c = test(b);
let a;
a = 1f32;
}
When this gets compiled there are 2 passes within the resolved code, top level and late. Toplevel is for scanning all of the top-level items into the relevant ROOT ribs. This allows for the lookup of functions declared after their use in this case if the test was declared after main. Late is all about delving into structs and functions and resolving all the types and names used in the blocks recursively.
In rust we use NodeIds thought the AST allows for side table lookups for every node within the IR. This is important as it keeps the AST immutable and forces API's to be designed to make this simpler. Usually in compilers name resolution is done via a stack of contexts with maps of strings. This is not suitable for the type of system that rust requires.
When we assign ids and have lookup tables everything becomes reusable at all stages of the compiler and moves towards a more query based compilation system (GCCRS is not fully there yet).
If you follow the above piece of code it is lexically very simple to follow so lets break this down into what the compiler see's this as:
<name-scope-root id=ROOT_RIB_ID>
<test id=1>
<test-parameters-rib id=2>
<a id=3>
<test-block-rib id=4>
<a ref=3 id=5>
<main id=6>
<main-parameters-rib id=7>
<main-block-rib id=8>
<a id=9>
<a ref=9 id=10>
<b id=11>
<b ref=11 id=12>
<a ref=9 id=13>
<c id=14>
<test ref=1 id=15>
<a id=16> // shadowing just becomes a new id
<a ref=16 id=17>
Here you can see each block is broken down into an appropriate level of nesting for each block, each declaration be it a function, parameter or var decl has an ID in the real code everything has an ID but ignore that for now. Note the name resolution dump is not implemented yet and is an open GSOC project.
When these mappings are built up we end up with a bunch of side table maps and lookups such that each rib can be looked up at any time.
root-rib id=ROOT_RIB_ID {
decls: [<fn test id=1>, <fn main id=6>]
references: [
<id=1>: [id=15],
<id=6>: []
]
test-rib id=1 {
test-params-rib {
decls: [<param a id=3>],
references: [
<id=3>: [id=5]
]
test-block-rib {
decls: [],
references: []
}
}
}
main-rib id=6 {
main-param-rib {
decls: [],
references: []
main-block-rib {
decls: [<var a id=9>, <var b id=11>, <var c id=14>, <var a id=16>]
references:[
<id=9>: [id=10, id=13],
<id=11>: [id=12],
<id=14>: [],
<id=16>: [id=17]
]
}
}
}
}
The important piece here is that for every BlockExpr on a function for example compilers need to know all of the locals associated with this block in order for it to be able to build the correct stack frame information. https://en.wikipedia.org/wiki/Call_stack
This new name resolution can be activated using the flag -frust-name-resolution-2.0
.
This new name resolution runs in multiple steps. All those steps fill a trie-like structure.
Collect all definitions in a crate.
This step runs multiple time during the macro expansion phase. It resolves Path, modules and macros. This allows macros to expand into new macro definitions that are subsequently resolved until a fixed point is reached. Name resolution errors are only emitted once that fixed point is reached.
Resolve types and identifiers.
Once all those steps done, the structure shall not be modified anymore and the name resolution result can be retrieved using an ImmutableNameResolutionContext
.