Exponential growth of watchers when ZK client disconnects/reconnects #10
Comments
As @jafingerhut pointed out, there are two apparently incorrect uses of swap!, one in atoms.clj and one in refs.clj. In both cases, the error is in the removeWatch implementation. This resolves liebke#11, and possibly issues liebke#3 and liebke#10.
As @jafingerhut pointed out, there are two apparently incorrect uses of swap!, one in atoms.clj and one in refs.clj. In both cases, the error is in the removeWatch implementation (shown to be broken by the new tests in my previous commit). This commit resolves liebke#11 and resolves liebke#3. I can't tell for sure yet, but I suspect that it might help with liebke#10.
Hi @jonpither, I know it's been a while and I'm not sure if this is something you're still running into, but I've got some time and I'd like to see if I can fix this issue. I think I've got the cause nailed down, but I could use some help reproducing the issue so I can be sure. Can you point me in the right direction?
Hi @holguinj, it's a tough one to reproduce. You want to cause a session disconnect but NOT a session timeout. I can't recall exactly how I nailed it down programmatically, but you can set a long session timeout and force some disconnects. I've since moved projects, so I no longer have that sort of test infrastructure code available. Once you've got the disconnects happening, you can query the number of watchers registered before and after. What should happen is that the watcher fires on a disconnect (and removes itself as usual), then re-registers. What currently happens is that the watcher fires, doesn't remove itself, and re-registers itself. Hence the exponential growth. Good luck.
Hi,
If, for whatever reason, the ZK client drops a connection and reconnects before SESSION_EXPIRY is raised, the number of watchers managed by the client doubles each time.
During some network issues we had a few processes grind to a halt because of this.
This is because Avout registers a watcher to manage the internal state of atoms and refs; see, for example, invalidateCache in avout.refs.
What should happen here is that each time the watcher is triggered, it should filter out events like {:event-type :None, :keeper-state :Disconnected}, perhaps by only ever being interested in event-type NodeDataChanged.
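To make the suggested filter concrete, here is a minimal sketch in Python rather than Clojure. It is only an illustration: the event is modelled as a plain dict mirroring the map above, and `is_data_change` and `make_filtered_watcher` are hypothetical names, not part of Avout or the ZooKeeper client.

```python
# Hypothetical sketch: events are plain dicts shaped like the map in this issue,
# e.g. {"event-type": "None", "keeper-state": "Disconnected"}.

def is_data_change(event):
    """Return True only for events that signal a change to the node's data."""
    return event.get("event-type") == "NodeDataChanged"


def make_filtered_watcher(on_change):
    """Wrap a callback so that connection-state events are ignored.

    The returned watcher reports whether it handled the event, so the caller
    can decide whether a new watcher needs to be registered.
    """
    def watcher(event):
        if not is_data_change(event):
            # Disconnect/reconnect notifications: do nothing, register nothing.
            return False
        on_change(event)
        return True

    return watcher
```

With this shape, a disconnect notification returns `False` and triggers no re-registration, while a `NodeDataChanged` event is passed on to the callback.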
When these events get raised, Avout currently registers brand new watchers even though the original watcher will still carry on firing. In ZK, watchers are "fire-once", but not in the case of some events such as disconnect/connect.
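The growth can be reproduced with a toy model in Python. This is a sketch under stated assumptions, not the ZooKeeper client or Avout's code: `ToyClient`, `naive_watcher` and `watcher_counts` are made-up names, and the model assumes that node events consume the registered watchers while connection-state events (event type `None`) are delivered to every watcher without removing it.

```python
# Toy model only: not the ZooKeeper client API and not Avout's implementation.
NODE_DATA_CHANGED = "NodeDataChanged"
NONE = "None"  # event type used for connection-state changes


class ToyClient:
    """Keeps a list of registered watchers for a single node."""

    def __init__(self):
        self.watchers = []

    def register(self, watcher):
        self.watchers.append(watcher)

    def deliver(self, event_type):
        current = list(self.watchers)
        if event_type != NONE:
            # Assumed fire-once behaviour: node events consume the watchers.
            self.watchers = []
        # Connection-state events leave the existing watchers registered.
        for watcher in current:
            watcher(self, event_type)


def naive_watcher(client, event_type):
    """Re-registers on every event, including connection-state events."""
    client.register(naive_watcher)


def watcher_counts(events):
    """Return the number of registered watchers before and after each event."""
    client = ToyClient()
    client.register(naive_watcher)
    counts = [len(client.watchers)]
    for event_type in events:
        client.deliver(event_type)
        counts.append(len(client.watchers))
    return counts
```

In this model, `watcher_counts([NONE, NONE, NONE])` returns `[1, 2, 4, 8]`, whereas a run of `NODE_DATA_CHANGED` events keeps the count at 1, because each firing consumes the registration it replaces.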
This is the same for addWatch in refs and atoms.