

Delete entire site data on data corruption/loss? #431

Open
abhishek-shanthkumar opened this issue Oct 15, 2024 · 5 comments

Comments

@abhishek-shanthkumar
Contributor

abhishek-shanthkumar commented Oct 15, 2024

IndexedDB data stored on disk may get corrupted or partially lost due to several reasons including user action. Attempts by the site to read such data fail persistently, and these failures are surfaced differently by each browser.
Issue #423 specified the DOMException type for one such scenario, but this has an impact only if developers update their sites to handle this specific error appropriately.
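As a hedged sketch of what such site-side handling could look like (the specific DOMException name is an assumption for illustration here, not necessarily what #423 settled on):

```javascript
// Sketch only: classify a failed read as likely corruption. The DOMException
// name used for corruption ("UnknownError" here) is an assumption.
function isLikelyCorruptionError(err) {
  return err instanceof DOMException && err.name === "UnknownError";
}

// Example wiring inside a read path (browser-only, illustrative):
// request.onerror = (event) => {
//   if (isLikelyCorruptionError(event.target.error)) {
//     // Site-specific recovery: rebuild caches, re-sync from server, etc.
//   }
// };
```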

Can we do more to ensure that the "right thing" happens in these scenarios?

We should note that:

  • Sites may store relational data across object stores and/or databases.
  • Sites may even store metadata about IndexedDB data in other storage such as LocalStorage.
  • The impact of and recoverability potential from partial data corruption/loss varies across sites based on the nature of data stored.

Considering the above, it is best left to the individual sites to handle these scenarios as suits their specific usage patterns. Discussions and efforts have been ongoing to surface sufficiently detailed errors to the sites so that they are equipped to handle them appropriately (#423, whatwg #75).

However, a majority of sites may not handle these errors at all, in which case reads of the affected data will persistently fail. Should we attempt to mitigate these errors by deleting the entire site data (perhaps limited to the containing storage bucket) if we get a strong signal that the site does not handle these errors?

cc @asutherland, @evanstade, @inexorabletash

@abhishek-shanthkumar
Contributor Author

One wrinkle I see in adopting this approach is the challenge of differentiating between sites that currently don't handle these errors and sites that don't intend to handle these errors. I presume that we don't want to tell sites to "handle these errors by so-and-so milestone after which we'll wipe all your data if this error occurs".

@asutherland
Collaborator

Also related is the management section of the storage spec which says:

> Whenever a storage bucket is cleared by the user agent, it must be cleared in its entirety. User agents should avoid clearing storage buckets while script that is able to access them is running, unless instructed otherwise by the user.

Quoting #431 (comment)

> One wrinkle I see in adopting this approach is the challenge of differentiating between sites that currently don't handle these errors and sites that don't intend to handle these errors. I presume that we don't want to tell sites to "handle these errors by so-and-so milestone after which we'll wipe all your data if this error occurs".

From my perspective, the dominant concern is site breakage. It's hard for the browser to tell whether a site is broken, so if we experience corruption and the site has not actively opted in to handling it itself, it's safest to assume the site is broken. The only way to provide some kind of "deal with this in the future if you want" would be to back up the origin into a magic bucket, but this creates new privacy complications.

@asutherland
Collaborator

Quoting #431 (comment)

> Considering the above, it is best left to the individual sites to handle these scenarios as suits their specific usage patterns. Discussions and efforts have been ongoing to surface sufficiently detailed errors to the sites so that they are equipped to handle them appropriately (#423, whatwg #75).

I don't really expect that sites can meaningfully do much more than could be done automatically if the site was using multiple storage buckets. But I do believe there are sites that will want to handle this and it seems reasonable to provide an opt-in affordance.

One thing I should note is that in #423 (comment) I proposed that calling preventDefault() on an error reporting corruption could prevent the default behavior of wiping the containing storage bucket, since this is a straightforward use of the IDB event model. This is, of course, potentially at odds with promise ergonomics: it's easy to go async in a way that depends on a new task being scheduled rather than everything resolving in the microtask checkpoint for the event dispatch. I believe there is some related discussion in the proposal of observables in WICG/observable#170 (in particular involving preventDefault()).

If we wanted to ensure promises could go fully async, we'd want something along the lines of ServiceWorker's FunctionalEvent.waitUntil, so that it remains valid to call preventDefault() (stalling the IDB transaction) until the passed-in promises have all resolved or one has rejected. Semantically, a rejection there is potentially confusing if one thinks about it too much, but from a spec perspective it would map equivalently to an exception being thrown by the event handler.
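A rough sketch of what that opt-in could look like from the page's side; note that a waitUntil() on IDB error events is entirely hypothetical here, nothing in this snippet is a shipped API:

```javascript
// Hypothetical opt-in handler: preventDefault() suppresses the proposed
// default of wiping the containing storage bucket, and a hypothetical
// waitUntil() (modeled on ServiceWorker's ExtendableEvent) keeps the
// transaction stalled until the site's async recovery settles.
function handleCorruption(event, recover) {
  event.preventDefault();     // opt out of the proposed default bucket wipe
  event.waitUntil(recover()); // keep the transaction alive while recovering
}
```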

@TonnyWildeman

I seem to be missing the simple option of a single call that completely nukes the current IndexedDB.
Not iterating over indexedDB.databases(), but simply a call to e.g. indexedDB.destroyDatabases().
This call would completely destroy all corrupted IndexedDB files on the physical filesystem.

In my case, I actually needed to manually destroy IndexedDB in DevTools. And others needed to clear the cache on their mobile device. I couldn't even iterate using indexedDB.databases().
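For reference, the userland equivalent with existing APIs looks roughly like the sketch below. It only helps when enumeration itself still works, which was exactly the failure mode described above, and indexedDB.databases() support varies by browser; the function name is made up here.

```javascript
// Sketch: delete every database the origin can enumerate. The injectable
// `idb` parameter is only for illustration/testing; in a page you would
// call deleteAllDatabases() and let it default to the global indexedDB.
async function deleteAllDatabases(idb = indexedDB) {
  const dbs = await idb.databases();
  await Promise.all(
    dbs.map(
      ({ name }) =>
        new Promise((resolve, reject) => {
          const req = idb.deleteDatabase(name);
          req.onsuccess = () => resolve(name);
          req.onerror = () => reject(req.error);
          // "blocked" fires while other tabs hold open connections; deletion
          // proceeds once they close, so it is not treated as failure here.
          req.onblocked = () => {};
        })
    )
  );
  return dbs.map((d) => d.name);
}
```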

@inexorabletash
Member

Using the Clear-Site-Data header is an option - https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Clear-Site-Data
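For reference, the server-side opt-in is a single response header; the `"storage"` directive clears DOM-accessible storage for the origin, including IndexedDB:

```http
Clear-Site-Data: "storage"
```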

Note that adding a new API (like a way to destroy all databases) to work around browser implementation bugs (like iteration or deletion failing) is not a pattern we like to follow.
