Initial benchmarks for intersection types + a bit of speedup #11924

JaroslavTulach · 2024-12-19T16:07:40Z

Pull Request Description

Addressing #11846 by writing a set of benchmarks comparing the same algorithm with different kinds of intersection type values. The fix for #11846 is then:

- using Truffle Node types for performing operations
  - e.g. new nodes like EnsoMultiValue.NewNode, FindIndexNode and AllTypesWith are introduced
  - all these nodes speculate on the intersection type being the same as before, which provides some speedup
- as such we need a fast way to check that two intersection types are the same
  - hence the introduction of an internal EnsoMultiType replacing Type[]
  - it has a factory method and a cache yielding the same EnsoMultiType instance for Type[] with equal elements
  - guaranteeing quick comparison of two EnsoMultiType instances via ==
- thus all the nodes can build various inline caches and speed things up
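
The factory-plus-cache idea described above can be illustrated with a minimal plain-Java sketch. It is not the actual EnsoMultiType code: the class name InternedTypes, the use of String in place of Type, and the assumption that element order matters are all hypothetical choices made for illustration.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for EnsoMultiType: one canonical instance per sequence of type names. */
final class InternedTypes {
  // Cache from an element-wise-equal key to the single canonical instance.
  private static final ConcurrentHashMap<List<String>, InternedTypes> ALL =
      new ConcurrentHashMap<>();

  private final List<String> types;

  private InternedTypes(List<String> types) {
    this.types = types;
  }

  /** Factory: arrays with equal elements in the same order yield the same instance. */
  static InternedTypes findOrCreate(String... types) {
    // List.of copies the array, so later changes to the argument cannot corrupt the key.
    List<String> key = List.of(types);
    return ALL.computeIfAbsent(key, InternedTypes::new);
  }

  List<String> types() {
    return types;
  }
}

class InternedTypesDemo {
  public static void main(String[] args) {
    var a = InternedTypes.findOrCreate("Float", "Complex");
    var b = InternedTypes.findOrCreate("Float", "Complex");
    var c = InternedTypes.findOrCreate("Complex", "Float");
    System.out.println(a == b); // prints true: same elements, same instance
    System.out.println(a == c); // prints false: different order, different instance
  }
}
```

Because every lookup goes through the factory, element-wise comparison happens only inside the map; callers can rely on == afterwards.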

Originally the intersection types benchmarks were 60 times slower than the base benchmark. Now they are just 10-15 times slower. Given the current memory overhead of EnsoMultiValue and the nature of the benchmark (millions of + operation dispatches), that's a reasonable result. Moreover, additional speedup is underway in #11975 and will be visible once these benchmarks get executed daily.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

All code follows the
Scala,
Java,
Unit tests continue to work
Benchmark runs get faster (another run) until they are really fast (latest bench run, and the last run)
- Originally the benchmarks were 60 times slower
- Now they are 10-15 times slower than the base benchmark
- there is some improvement
- additional improvement separated into EnsoMultiValue.firstDispatch to speed benchmarks up #11975

JaroslavTulach · 2024-12-20T06:12:48Z

...marks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/MultiValueBenchmarks.java

+            go 0 0
+
+
+        make_vector type n =


A dedicated benchmark for EnsoMultiValue instances based on the idea of ArrayProxy one. Currently the more complicated benchmarks refuse to compile and the compiler bails out. The initial results are:

sbt:enso> runtime-benchmarks/benchOnly MultiValueBenchmarks
Benchmark                                                 Mode  Cnt    Score    Error  Units
MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5  214.330 ±  4.179  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5  219.803 ± 11.872  ms/op
MultiValueBenchmarks.sumOverFloat1                        avgt    5    0.079 ±  0.006  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5  219.525 ±  6.393  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  203.788 ±  9.843  ms/op
MultiValueBenchmarks.sumOverInteger0                      avgt    5    0.074 ±  0.001  ms/op

After 630ec62 the results are better:

MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5   30.109 ±  0.661  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5   26.988 ±  0.446  ms/op
MultiValueBenchmarks.sumOverFloat1                        avgt    5    0.078 ±  0.003  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5   27.821 ±  0.856  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5   27.961 ±  0.263  ms/op
MultiValueBenchmarks.sumOverInteger0                      avgt    5    0.078 ±  0.002  ms/op

and there are no bailouts. Time to really speed things up.

Some speedup achieved...

JaroslavTulach · 2024-12-21T06:56:05Z

...marks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/MultiValueBenchmarks.java

+   * The <b>base benchmark</b> for this suite. Measures how much it takes to access an Atom in a
+   * Vector, read {@code re:Float} field out of it and sum all of them together.
+   */
+  @Benchmark


This is now the base benchmark. The results after 4dacf53 are:

# the base one
MultiValueBenchmarks.sumOverComplexBaseBenchmark0         avgt    5    0.139 ±  0.013  ms/op
# these two are supposed to be faster
MultiValueBenchmarks.sumOverInteger1                      avgt    5    0.065 ±  0.003  ms/op
MultiValueBenchmarks.sumOverFloat2                        avgt    5    0.073 ±  0.002  ms/op
# these should catch up with sumOverComplexBaseBenchmark0 one day
MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5    8.580 ±  0.326  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5    9.118 ±  0.483  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5    8.110 ±  0.160  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5    9.393 ±  0.648  ms/op

still 60 times slower than it should be.
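
For reference, the "60 times" figure can be checked by dividing each intersection type score by the base score. The small sketch below does only that arithmetic on the numbers quoted above; the class names are hypothetical and it is unrelated to the JMH harness. The resulting ratios fall between roughly 58 and 68.

```java
/** Hypothetical helper: how many times slower a score is than the base score (both in ms/op). */
final class SlowdownRatio {
  static double ratio(double scoreMs, double baseMs) {
    return scoreMs / baseMs;
  }
}

class SlowdownRatioDemo {
  public static void main(String[] args) {
    double base = 0.139; // sumOverComplexBaseBenchmark0
    // The four intersection type benchmarks, in the order listed above.
    double[] scores = {8.580, 9.118, 8.110, 9.393};
    for (double score : scores) {
      System.out.printf("%.1fx%n", SlowdownRatio.ratio(score, base));
    }
  }
}
```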

The biggest slowdown is currently in reorderOnly branch:

It seems to always allocate new arrays. Rather, it should treat EnsoMultiType as a compilation constant and have everything ready from the previous run.
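
The idea of having everything ready from the previous run can be sketched as a one-entry cache keyed by identity. This is a plain-Java analogy only, not Truffle code: LastSeenCache and its members are hypothetical names, and a real node would express the same speculation through Truffle's specialization and caching machinery.

```java
import java.util.function.Function;

/** Hypothetical one-entry cache: reuses the previous result while the key stays identical. */
final class LastSeenCache<K, V> {
  private final Function<K, V> slowPath;
  private K cachedKey;
  private V cachedValue;

  LastSeenCache(Function<K, V> slowPath) {
    this.slowPath = slowPath;
  }

  V get(K key) {
    // Fast path: a single identity check, which is only valid because keys are canonical.
    if (key != null && key == cachedKey) {
      return cachedValue;
    }
    // Slow path: compute the result and remember it for the next call.
    V value = slowPath.apply(key);
    cachedKey = key;
    cachedValue = value;
    return value;
  }
}

class LastSeenCacheDemo {
  public static void main(String[] args) {
    int[] slowCalls = {0};
    LastSeenCache<Object, Integer> cache = new LastSeenCache<>(k -> ++slowCalls[0]);
    Object floatAndComplex = new Object(); // stands for one canonical type descriptor
    cache.get(floatAndComplex);
    cache.get(floatAndComplex);
    System.out.println(slowCalls[0]); // prints 1: the second call hit the cache
  }
}
```

When the same type descriptor arrives again, no new arrays need to be allocated; the work happens only when the key changes.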

Attempting to make things better in a follow up PR:

EnsoMultiValue.firstDispatch to speed benchmarks up #11975

JaroslavTulach · 2024-12-30T05:09:19Z

engine/runtime/src/main/java/org/enso/interpreter/runtime/data/EnsoMultiValue.java

+        if (reorderOnly) {
+          var copyTypes = allTypesWith.executeAllTypes(dispatch, mv.extra);
+          if (i == 0 && dispatch.typesLength() == 1) {
+            return newNode.newValue(copyTypes, 1, mv.values);


This if statement seems to make sumOverFloatComplexRecastedToFloat4 about twice as fast:

[info] MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  1.071 ± 0.074  ms/op
[info] MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  2.322 ± 0.086  ms/op

JaroslavTulach · 2025-01-04T06:40:25Z

Benchmarks are written. They have been sped up just like:

There are possible ways to speed things up even more, but let's leave them for subsequent PRs like:

EnsoMultiValue.firstDispatch to speed benchmarks up #11975

JaroslavTulach · 2025-01-04T08:21:39Z

engine/runtime/src/main/java/org/enso/interpreter/node/typecheck/AllOfTypesCheckNode.java

+    if (at != checks.length) {
+      // request for Number & Integer may yield two integers collision
+      // request for Number & Float may yield two floats collision
+      // request for Number & Integer & Float must yield one collision


There are tests in Conversion_Spec that combine these types. Thus such a combination must be supported (and it clashes with assert checking each type is in multi value only once) - but it doesn't have to be fast. Thus transferToInterpreter().
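
To make the collision concrete: if both Number and Float resolve to the same underlying type, a naive collection of results contains that type twice. The sketch below shows one simple way of dropping repeats while preserving order. It is an illustration under assumptions, not the logic of AllOfTypesCheckNode: UniqueTypes is a hypothetical name and String stands in for Type.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical helper for removing repeated types from a list of check results. */
final class UniqueTypes {
  /** Keeps the first occurrence of each type name and drops later repeats, preserving order. */
  static List<String> withoutDuplicates(List<String> types) {
    return new ArrayList<>(new LinkedHashSet<>(types));
  }
}

class UniqueTypesDemo {
  public static void main(String[] args) {
    // A request such as Number & Integer & Float may produce a repeated entry.
    List<String> found = List.of("Integer", "Integer", "Float");
    System.out.println(UniqueTypes.withoutDuplicates(found)); // prints [Integer, Float]
  }
}
```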

JaroslavTulach · 2025-01-07T05:49:58Z

...marks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/MultiValueBenchmarks.java

+
+/**
+ * These benchmarks compare performance of {@link EnsoMultiValue}. They create a vector in a certain
+ * configuration representing numbers and then they perform {@code sum} operation on it.


@Akirathan, @hubertp, @4e6 (and also @radeusgd):

these are the benchmarks for intersection types

a Vector is created by make_vector typ method

various types include Integer, Float, Complex as non-intersection type benchmarks to compare to

then there are various types mixing Float and Complex together into different intersection types

Ideally the intersection types benchmark results should be close to base benchmark sumOverComplexBaseBenchmark0 with some "reasonable overhead".

Are you OK with these benchmarks?

hubertp · 2025-01-07T11:14:42Z

engine/runtime/src/main/java/org/enso/interpreter/runtime/data/EnsoMultiType.java

+      return false;
+    }
+    final EnsoMultiType other = (EnsoMultiType) obj;
+    return Arrays.deepEquals(this.types, other.types);


Shouldn't only deepEquals be under TruffleBoundary?

Yes/No.

technically the previous statements are eligible for PE

however I don't want PE to call into equals and hashCode at all

only callbacks from ALL_TYPES.computeIfAbsent should access these two methods

only findOrCreateSlow (itself behind a boundary) should be calling ALL_TYPES.computeIfAbsent

so maybe we don't need @TruffleBoundary here at all

However excluding code from PE multiple times hurts nothing...

JaroslavTulach added 2 commits December 19, 2024 08:39

Ensure isAllTypes is compilation constant

086d024

Using internal MultiType to represent Type[] but with guaranteed ==

a33965d

JaroslavTulach added the CI: No changelog needed Do not require a changelog entry for this PR. label Dec 19, 2024

JaroslavTulach self-assigned this Dec 19, 2024

JaroslavTulach requested review from 4e6, hubertp and Akirathan as code owners December 19, 2024 16:07

JaroslavTulach marked this pull request as draft December 19, 2024 16:07

JaroslavTulach changed the title ~~Optimize building can casting of EnsoMultiValue~~ Optimize building and casting of EnsoMultiValue Dec 19, 2024

JaroslavTulach added 2 commits December 20, 2024 00:00

Merge remote-tracking branch 'origin/develop' into wip/jtulach/MultiT…

171c799

…ype11846

Prefer partialEvaluationConstant assert

775595c

enso-bot bot mentioned this pull request Dec 19, 2024

300% benchmarks regression between Dec 13 and 14 #11901

Closed

2 tasks

Benchmarks for intersection types

e2760ff

JaroslavTulach linked an issue Dec 20, 2024 that may be closed by this pull request

Benchmark EnsoMultiValue and speed it up #11846

Closed

JaroslavTulach commented Dec 20, 2024

JaroslavTulach added 2 commits December 20, 2024 07:36

Usage of Map & co. must be behind @TruffleBoundary

630ec62

Unify findInteropTypeValue

8defcfe

JaroslavTulach mentioned this pull request Dec 20, 2024

Benchmark EnsoMultiValue and speed it up #11846

Closed

JaroslavTulach added 8 commits December 20, 2024 08:16

Inline cache to find index of a type

d44d30e

Use EnsoMultiValue.NewNode to allocate new instances of EnsoMultiValue

7b6d364

Basic specializations for NewNode

bda398a

Splitting the FindIndexNode and caching requests for newNode

56b24aa

Just ask only for types the value 'has been cast to'

2dcc2c4

Provide cachedTypes as the first argument to activate the caches

ee080a3

AllOfTypesCheckNode needs cached EnsoMultiValue.NewNode to allocate E…

53c2222

…nsoMultiValue instance

Moving EnsoMultiType into outer scope

9567257

enso-bot bot mentioned this pull request Dec 21, 2024

EnsoMultiValue == isn't transitive #11845

Closed

Sum re field of a Complex object in a Vector is the base benchmark

6bfdbf9

JaroslavTulach commented Dec 21, 2024

Cache dispatch on EnsoMultiValue.getDispatchId

4dacf53

Turing allTypesWith method into EnsoMultiType.AllTypesWith node

ebe0553

unfurl-links bot mentioned this pull request Dec 29, 2024

Text.find slows down with the number of invocations #11859

Closed

JaroslavTulach added 2 commits December 30, 2024 05:19

Only assert valid payload

2301f9b

Speeding up non-reordering reorderOnly case twice

8d5452c

JaroslavTulach commented Dec 30, 2024

JaroslavTulach added 4 commits December 30, 2024 11:46

Merging with develop and resolving conflicts

ed8799c

Don't use keyword as variable name

615b600

Assert there is no intersection between dispatch and extra types

a68db22

Merge remote-tracking branch 'origin/develop' into wip/jtulach/MultiT…

98ebc39

…ype11846

JaroslavTulach added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Jan 4, 2025

JaroslavTulach marked this pull request as ready for review January 4, 2025 06:38

Avoiding duplications when Number & Integer & Float and co.

a9f57f0

JaroslavTulach commented Jan 4, 2025

JaroslavTulach added the CI: Keep up to date Automatically update this PR to the latest develop. label Jan 4, 2025

Merge branch 'develop' into wip/jtulach/MultiType11846

035b2be

JaroslavTulach mentioned this pull request Jan 5, 2025

EnsoMultiValue.firstDispatch to speed benchmarks up #11975

Merged

3 tasks

JaroslavTulach changed the title ~~Optimize building and casting of EnsoMultiValue~~ Initial benchmarks for intersection types + a bit of speedup Jan 6, 2025

Merge branch 'develop' into wip/jtulach/MultiType11846

3f0a3e3

JaroslavTulach commented Jan 7, 2025

JaroslavTulach requested a review from radeusgd January 7, 2025 06:08

mergify bot added 2 commits January 7, 2025 09:07

Merge branch 'develop' into wip/jtulach/MultiType11846

9503f79

Merge branch 'develop' into wip/jtulach/MultiType11846

f515d8a

JaroslavTulach removed the CI: Keep up to date Automatically update this PR to the latest develop. label Jan 7, 2025

hubertp approved these changes Jan 7, 2025

JaroslavTulach merged commit 2ead3f5 into develop Jan 7, 2025
47 of 49 checks passed

JaroslavTulach deleted the wip/jtulach/MultiType11846 branch January 7, 2025 13:52

JaroslavTulach added a commit that referenced this pull request Jan 7, 2025

Merging with latest develop that already contains #11924

1ffc0aa
