Something that could help with testing browsertrix-crawler is a way to list pages in the WACZ, maybe something like:
> wacz extract pages --url https://example.com/
Not sure if it's generally useful outside of testing, but it may be. Could allow listing just the URL, or all the other fields.
Of course, it's easy enough to just unzip pages/pages.jsonl, so not sure if we want this.
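For reference, a minimal sketch of the "just unzip pages/pages.jsonl" approach in Python, assuming the usual pages/pages.jsonl location inside the WACZ (the `list_pages` name and `url_only` flag here are placeholders for illustration, not an existing wacz subcommand):

```python
import json
import sys
import zipfile

def list_pages(wacz_path, url_only=True):
    """Print the pages recorded in a WACZ by reading pages/pages.jsonl."""
    with zipfile.ZipFile(wacz_path) as zf:
        with zf.open("pages/pages.jsonl") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                record = json.loads(line)
                # Skip the header record ({"format": "json-pages-1.0", ...}), which has no "url" key.
                if "url" not in record:
                    continue
                print(record["url"] if url_only else json.dumps(record))

if __name__ == "__main__":
    list_pages(sys.argv[1])
```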
Just opening this to keep track.
(Extracting URLs from index could be another feature later)
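And a rough sketch of what extracting URLs from the index might look like, assuming a plain or gzip-compressed CDXJ index at indexes/index.cdx(.gz); larger WACZ files may use a compressed/ZipNum layout that this does not handle:

```python
import gzip
import io
import json
import zipfile

def urls_from_index(wacz_path, index_name="indexes/index.cdx"):
    """Yield URLs from a CDXJ index inside a WACZ (index path/layout may vary)."""
    with zipfile.ZipFile(wacz_path) as zf:
        with zf.open(index_name) as fh:
            stream = gzip.open(fh, "rt") if index_name.endswith(".gz") else io.TextIOWrapper(fh)
            for line in stream:
                # CDXJ lines look like: "<surt key> <timestamp> {json fields}"
                _, sep, json_part = line.strip().partition(" {")
                if not sep:
                    continue
                fields = json.loads("{" + json_part)
                if "url" in fields:
                    yield fields["url"]
```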