Replies: 2 comments 3 replies
-
Hi!
-
The issue is probably the number of URLs in the index rather than the size or number of WARC files. I wonder if @ikreymer has done benchmarking like this before?
-
Hey there! I noticed in this reply you (Mikkel) mention that you're crawling to WARC files. Are you keeping these around after indexing? I imagine the storage requirements for that would get quite large... but there are some cool web archiving possibilities / serving cached pages if you are!