Avoid tracking small allocations by setting memThreshold
In order to avoid running out of memory and crashing the server, Arkouda
uses `memTrack` to track memory allocations and raise a client error
when a large allocation would likely exceed the available memory.

Recently, we discovered some performance overheads from this tracking
while doing highly concurrent short-lived allocations. The `memTrack`
flag is implemented with a global lock to protect access to a hash
table, which is serializing and really hurting the performance of
concurrent allocs. To limit the impact of this, use `memThreshold=1M` to
only track larger allocations and skip tracking smaller ones.

Using `memThreshold` did not have quite the performance boost I was
expecting at first with Chapel 1.25 because we didn't have a way to do
threshold checks on free. That was added in chapel-lang/chapel#18465.
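
The behaviour described above can be modelled with a small sketch. This is a toy Python model, not the Chapel runtime's code: `ThresholdTracker`, its methods, and the dict-based table are hypothetical names chosen for illustration. It also assumes that allocations at or above the threshold are tracked, and it passes the size to `on_free` for simplicity, which a real allocator may have to look up. The point it illustrates is why the threshold has to be checked on free as well as on alloc: without that check, every free of a small allocation still takes the global lock.

```python
import threading


class ThresholdTracker:
    """Toy model of lock-protected allocation tracking with a size threshold.

    Hypothetical illustration only; not the Chapel runtime implementation.
    """

    def __init__(self, threshold=1024 * 1024):
        self.threshold = threshold
        self._lock = threading.Lock()  # single global lock guarding the table
        self._table = {}               # address -> tracked size in bytes
        self.lock_acquisitions = 0     # how many times the lock was taken

    def on_alloc(self, addr, size):
        if size < self.threshold:
            return  # small allocation: skip tracking, never touch the lock
        with self._lock:
            self.lock_acquisitions += 1
            self._table[addr] = size

    def on_free(self, addr, size, check_threshold=True):
        # With check_threshold=False, every free takes the lock even though
        # small allocations were never entered into the table.
        if check_threshold and size < self.threshold:
            return
        with self._lock:
            self.lock_acquisitions += 1
            self._table.pop(addr, None)

    def mem_used(self):
        with self._lock:
            return sum(self._table.values())


# Threshold checked on alloc and free: small allocations never take the lock.
new_style = ThresholdTracker()
for addr in range(1000):
    new_style.on_alloc(addr, 64)
    new_style.on_free(addr, 64)

# Threshold checked on alloc only: each free still takes the lock.
old_style = ThresholdTracker()
for addr in range(1000):
    old_style.on_alloc(addr, 64)
    old_style.on_free(addr, 64, check_threshold=False)

# A large allocation is tracked in both cases.
new_style.on_alloc(5000, 8 * 1024 * 1024)

print(new_style.lock_acquisitions, old_style.lock_acquisitions)
print(new_style.mem_used())
```

In this model the alloc-only variant still acquires the lock once per small free, which is consistent with the partial improvement seen with Chapel 1.25.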

Here's an allocation micro-benchmark that shows the performance on a
single node with 128-Rome CPUs:

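```chpl
use Time;
config const trials = 1_000_000;

// Time many concurrent, short-lived string allocations.
var t: Timer; t.start();
// Run here.maxTaskPar tasks, each doing `trials` int-to-string casts.
coforall 1..here.maxTaskPar do
  for i in 1..trials do
    var s = i:string;  // small allocation, freed at the end of each iteration
writeln(t.elapsed());
```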

| config              | Time    |
| ------------------- | ------: |
| w/o memTrack        |   0.19s |
| w/ memTrack         | 144.50s |
| w/ threshold 1.25.0 |  33.06s |
| w/ threshold main   |   0.22s |
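
To put the table in relative terms, the short Python snippet below divides each timing by the untracked baseline. The numbers are copied from the table; the variable names are only for illustration. Full tracking comes out at roughly 760x the baseline, the threshold on 1.25.0 at about 174x, and the threshold on main at about 1.16x.

```python
# Timings in seconds, copied from the table above.
timings = {
    "w/o memTrack": 0.19,
    "w/ memTrack": 144.50,
    "w/ threshold 1.25.0": 33.06,
    "w/ threshold main": 0.22,
}

baseline = timings["w/o memTrack"]

# Slowdown of each configuration relative to running without memTrack.
slowdowns = {name: seconds / baseline for name, seconds in timings.items()}

for name, factor in slowdowns.items():
    print(f"{name:<20} {factor:8.2f}x")
```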

So this should benefit code that has highly concurrent allocations, such
as casting to strings and, currently, regex operations (though we're
working on optimizing the regex allocations out).

Part of #675
Part of #929
ronawho committed Sep 29, 2021
1 parent a18b753 commit 88192c7
Showing 2 changed files with 2 additions and 1 deletion.
2 changes: 1 addition & 1 deletion Makefile
@@ -21,7 +21,7 @@ CHPL_FLAGS += --no-checks --no-loop-invariant-code-motion --no-fast-followers --
else
CHPL_FLAGS += --fast
endif
-CHPL_FLAGS += -smemTrack=true
+CHPL_FLAGS += -smemTrack=true -smemThreshold=1048576
CHPL_FLAGS += -lhdf5 -lhdf5_hl -lzmq

# We have seen segfaults with cache remote at some node counts
1 change: 1 addition & 0 deletions tests/client_test.py
@@ -89,6 +89,7 @@ def test_get_mem_used(self):
         expected or the call to ak.client.get_mem_used() fails
         '''
         try:
+            a = ak.ones(1024*1024)
             mem_used = ak.client.get_mem_used()
         except Exception as e:
             raise AssertionError(e)
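
One plausible reading of the added test line, not stated in the commit itself, is that the test now needs an allocation large enough to be tracked. The arithmetic below is a sketch under two assumptions: that `ak.ones` produces 64-bit (8-byte) elements by default, and that the array lands in a single allocation on one locale. Under those assumptions the array is 8 MiB, well above the 1 MiB threshold set in the Makefile.

```python
MEM_THRESHOLD = 1048576   # value passed via -smemThreshold in the Makefile
ELEMENTS = 1024 * 1024    # size passed to ak.ones in the test
BYTES_PER_ELEMENT = 8     # assumption: 64-bit elements

array_bytes = ELEMENTS * BYTES_PER_ELEMENT

# The threshold is exactly 1 MiB, matching `memThreshold=1M` in the message.
print(MEM_THRESHOLD == 1024 * 1024)

# The test array is several times larger than the threshold.
print(array_bytes, array_bytes // MEM_THRESHOLD)
```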