Storage data exception #16531
-
Bug report criteria
What happened?

What did you expect to happen?

How can we reproduce it (as minimally and precisely as possible)?
The problem still exists.

Anything else we need to know?
No response

Etcd version (please run commands below)
$ etcd --version
etcd Version: 3.5.4
Git SHA: 08407ff76
Go Version: go1.16.15
Go OS/Arch: linux/amd64

$ etcdctl version
# paste output here

Etcd configuration (command line flags or environment variables)
paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output
No response
Replies: 12 comments 22 replies
-
It's standard for the DB file to be memory-mapped, and resident memory can be larger than the db data size because of copy-on-write. You could try adjusting the compaction parameters to keep fewer key revisions and increase the snapshot frequency, and also check for large keys/values and optimize them.
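For illustration, a minimal sketch of the kind of tuning mentioned above, using etcd's standard flags; the retention, snapshot, and quota values below are placeholders, not recommendations, so adjust them to your workload:

# Hedged example: enable periodic auto-compaction, snapshot more often,
# and set an explicit backend quota. Values are illustrative only.
$ etcd --auto-compaction-mode=periodic \
       --auto-compaction-retention=1h \
       --snapshot-count=10000 \
       --quota-backend-bytes=2147483648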
-
Hey @liangpeihuahua - thanks for your question. Have you run a defragment for etcd recently? I'm wondering if this could help.
-
$ etcdctl endpoint status -w table
$ etcdctl defrag --cluster
$ etcdctl endpoint status -w table
$ etcdctl endpoint status -w table
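As a complement to the table output above, on recent versions (3.4+) the status JSON also exposes dbSize and dbSizeInUse; a large gap between the two is space that a defragment should reclaim. A hedged one-liner to compare them:

# Sketch only: extract total vs. in-use backend size from the status JSON.
$ etcdctl --endpoints=<member list> endpoint status --write-out=json | egrep -o '"dbSize(InUse)?":[0-9]*'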
-
This seems to be a big problem.
-
{"level":"info","ts":"2023-09-04T06:20:02.486Z","caller":"v3rpc/maintenance.go:125","msg":"sending database snapshot to client","total-bytes":919449600,"size":"919 MB"} |
-
In fact, the biggest problem now is that all the keys holding our business data in this etcd cluster have been deleted, i.e. our business data is gone. But I don't understand why the db size is still 919 MB and memory usage still reaches 2 GB. The problem persists even after restarting the etcd cluster.
-
The cluster is actually running normally. Compaction was not run before the defragmentation, but etcd should be compacted every 5 minutes by default.
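One hedged way to double-check whether compaction is actually happening (assuming the Kubernetes/bitnami setup mentioned elsewhere in this thread; pod names and exact log wording may differ) is to grep the etcd logs for the scheduled-compaction messages, and to try reading at an old revision, which should fail once that revision has been compacted:

# Sketch only: confirm compaction activity in the logs.
$ kubectl logs <etcd-pod> | grep -i "compaction"
# Reading at an old revision (revision 1 is just an example) should return
# "mvcc: required revision has been compacted" if compaction has run past it.
$ etcdctl get --rev=1 <some-key>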
-
Because I can't even query the health key.
-
I checked the db data in the /bitnami/etcd/data/member/snap directory. It contains our historical data. However, this data can no longer be found through etcdctl get {key name}, even though it still exists in the db.
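For what it's worth, a hedged way to inspect that bolt file directly is the bbolt CLI (go.etcd.io/bbolt/cmd/bbolt); work on a copy rather than the live file. Keys in the "key" bucket are binary MVCC revision records, so deleted but not-yet-compacted data can still appear there:

# Sketch only: copy the db file first, never inspect the live file in place.
$ cp /bitnami/etcd/data/member/snap/db /tmp/db-copy
$ bbolt buckets /tmp/db-copy          # list buckets, e.g. "key" and "meta"
$ bbolt stats /tmp/db-copy            # page and free-space statistics
$ bbolt keys /tmp/db-copy key | head  # raw MVCC revision keys in the "key" bucket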
-
Please follow the steps below:

1. etcdctl --endpoints=:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*'
2. etcdctl compact ${the_revision_got_at_step_1}
3. etcdctl defrag
-
thanks

From: Fu ***@***.***
Date: Tue, Sep 5, 2023, 18:12
Subject: Re: [etcd-io/etcd] Storage data exception (Discussion #16531)

It's just about kernel memory management. I think you can check the cgroup metrics from the container.
Just remember that the short answer is yes (the bigger the db, the more memory will be used). :) Hope it can help.
• https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt -> 5.2 stat file
• https://docs.kernel.org/admin-guide/cgroup-v2.html -> memory.stat
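To make the cgroup suggestion above concrete, a hedged sketch; the paths assume cgroup v2 mounted at /sys/fs/cgroup inside the container (under cgroup v1 the file is /sys/fs/cgroup/memory/memory.stat), and the point is to see how much of the reported memory is page cache from the mmapped db versus anonymous memory:

# Sketch only: total charged memory, then a breakdown of anon vs. file-backed pages.
$ cat /sys/fs/cgroup/memory.current
$ egrep '^(anon|file|active_file|inactive_file) ' /sys/fs/cgroup/memory.stat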