-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Move profile consistency checks to after morph #111253
JIT: Move profile consistency checks to after morph #111253
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
/azp run runtime-coreclr outerloop, Fuzzlyn, Antigen |
Azure Pipelines successfully started running 3 pipeline(s). |
cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Large size improvements. I suspect much of this is from throw blocks being kept in-line by block layout, now that they have non-zero flow into them (this is also lighting up other opts, like more CSEs). Some of these layout changes are clear regressions, if we take the block weights at face value. For example: ; V00 this [V00,T00] ( 3, 3 ) ref -> ecx this class-hnd single-def <System.Collections.Generic.Dictionary`2[int,System.Reflection.Emit.OpCode]>
; V01 RetBuf [V01,T01] ( 3, 3 ) byref -> esi single-def
-; V02 arg1 [V02,T03] ( 2, 1 ) int -> [ebp+0x08] single-def
+; V02 arg1 [V02,T03] ( 2, 1.50) int -> edi single-def
; V03 loc0 [V03,T02] ( 3, 3 ) byref -> eax single-def
;* V04 loc1 [V04 ] ( 0, 0 ) struct ( 8) zero-ref ld-addr-op <System.Reflection.Emit.OpCode>
;* V05 [V05 ] ( 0, 0 ) int -> zero-ref "field V04.m_value (fldOffset=0x0)" P-INDEP
@@ -20,35 +20,41 @@
G_M21284_IG01: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
push ebp
mov ebp, esp
+ push edi
push esi
mov esi, edx
; byrRegs +[esi]
- ;; size=6 bbWeight=1 PerfScore 2.50
+ mov edi, dword ptr [ebp+0x08]
+ ;; size=10 bbWeight=1 PerfScore 4.50
G_M21284_IG02: ; bbWeight=1, gcrefRegs=00000002 {ecx}, byrefRegs=00000040 {esi}, byref, isz
; gcrRegs +[ecx]
- mov edx, dword ptr [ebp+0x08]
+ mov edx, edi
call [<unknown method>]
; gcrRegs -[ecx]
; byrRegs +[eax]
test eax, eax
- je SHORT G_M21284_IG04
+ jne SHORT G_M21284_IG04
+ ;; size=12 bbWeight=1 PerfScore 4.50
+G_M21284_IG03: ; bbWeight=0.50, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
+ ; byrRegs -[eax esi]
+ mov ecx, edi
+ call [System.ThrowHelper:ThrowKeyNotFoundException[int](int)]
+ int3
+ ;; size=9 bbWeight=0.50 PerfScore 1.75
+G_M21284_IG04: ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000041 {eax esi}, byref
+ ; byrRegs +[eax esi]
mov edx, dword ptr [eax]
mov dword ptr [esi], edx
mov edx, dword ptr [eax+0x04]
mov dword ptr [esi+0x04], edx
- ;; size=23 bbWeight=1 PerfScore 11.25
-G_M21284_IG03: ; bbWeight=1, epilog, nogc, extend
+ ;; size=10 bbWeight=1 PerfScore 6.00
+G_M21284_IG05: ; bbWeight=1, epilog, nogc, extend
pop esi
+ pop edi
pop ebp
ret 4
- ;; size=5 bbWeight=1 PerfScore 3.00
-G_M21284_IG04: ; bbWeight=0, gcVars=00000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
- ; byrRegs -[eax esi]
- mov ecx, dword ptr [ebp+0x08]
- call [System.ThrowHelper:ThrowKeyNotFoundException[int](int)]
- int3
- ;; size=10 bbWeight=0 PerfScore 0.00
+ ;; size=6 bbWeight=1 PerfScore 3.50
-; Total bytes of code 44, prolog size 4, PerfScore 16.75, instruction count 18, allocated bytes for code 44 (MethodHash=b6a6acdb) for method System.Collections.Generic.Dictionary`2[int,System.Reflection.Emit.OpCode]:get_Item(int):System.Reflection.Emit.OpCode:this (Tier1)
+; Total bytes of code 47, prolog size 5, PerfScore 20.25, instruction count 21, allocated bytes for code 47 (MethodHash=b6a6acdb) for method System.Collections.Generic.Dictionary`2[int,System.Reflection.Emit.OpCode]:get_Item(int):System.Reflection.Emit.OpCode:this (Tier1) The block weights suggest the throw path is the colder of the two, and yet we now fall into it. I suspect more weight is supposed to be propagated into the throw path at some later point, but we lack profile maintenance there, so this is overall a PerfScore regression. We're currently missing 64-bit diffs due to an infra failure; I'm rerunning that leg with the hope it resolves... Outerloop tests and fuzzers didn't find any profile-related failures. Thanks! |
/azp run runtime-coreclr libraries-pgo, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
JitStress and libraries-pgo pipelines are hitting profile consistency asserts for methods with OSR; I suspect #111362 will fix this, so I'll wait for that PR to go in first. |
/azp run runtime-coreclr libraries-pgo, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
/azp run runtime-coreclr libraries-pgo |
Azure Pipelines successfully started running 1 pipeline(s). |
During morph, we will "un-protect" the original entry block for OSR methods such that (among other invariant changes) it will now be checked for profile consistency. Thus, we need to check its incoming weight for consistency and flag the profile as inconsistent, if necessary, to avoid asserts. With this fix in, the remaining @AndyAyersMS are you ok with me merging this as-is? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
As I was mentioning the other day at some point we should revisit all these OSR special cases and see if there's a better way to handle them (not sure that there is).
Part of #107749. Enables profile checks for morph and post-morph phases.
For
benchmarks.run_pgo
, 45383 methods are consistent before inlining; after, we're down to 37215, or 82%. By the time we make it to morph, 33461 methods (~74% of the original) are consistent; after morph, we're down to 29402 (~65%). The decline isn't too dramatic for this collection, though I imagine we fare worse elsewhere.