[Proposal] Add a globally-applicable table for rate limiting rules on URLs #2065

Open
metaprime opened this issue Jan 6, 2025 · 1 comment


metaprime commented Jan 6, 2025

For link aggregators like Reddit, the aggregator's ripper shouldn't have to know about every URL that could show up. However, RipMe as a whole broadly does know.

It might be a good idea to have a global table of rate limiting rules for URLs that applies to all rippers: a new URL for a given domain is not ripped until that domain's rate limit time has expired. That way, rate limiting isn't piecemeal in each ripper's logic.
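To make the idea concrete, here is a minimal sketch of such a table. It is written in Python for brevity rather than RipMe's Java, and everything in it is an assumption for illustration: the `RATE_LIMIT_RULES` table, the `DomainRateLimiter` class, its method names, and the placeholder domains and intervals are all hypothetical, not existing RipMe code.

```python
from urllib.parse import urlparse

# Hypothetical rule table: domain -> minimum seconds between rips.
# The domains and values are placeholders, not real limits.
RATE_LIMIT_RULES = {
    "example.com": 2.0,
    "images.example.org": 0.5,
}
DEFAULT_INTERVAL = 0.0  # assumption: domains without a rule are not limited


class DomainRateLimiter:
    """Tracks, per domain, the earliest time the next rip may start."""

    def __init__(self, rules, default_interval=DEFAULT_INTERVAL):
        self.rules = dict(rules)
        self.default_interval = default_interval
        self.next_allowed = {}  # domain -> timestamp in seconds

    @staticmethod
    def domain_of(url):
        # Lower-cased host name; empty string if the URL has none.
        return (urlparse(url).hostname or "").lower()

    def seconds_until_allowed(self, url, now):
        """How long a ripper should still wait before ripping this URL."""
        allowed_at = self.next_allowed.get(self.domain_of(url), 0.0)
        return max(0.0, allowed_at - now)

    def record_rip(self, url, now):
        """Start the domain's rate limit window at time `now`."""
        domain = self.domain_of(url)
        interval = self.rules.get(domain, self.default_interval)
        self.next_allowed[domain] = now + interval
```

Any ripper, including one for an aggregator, could consult a single shared instance before ripping a URL, so the per-domain rules live in one place. Passing `now` in explicitly is just a choice that keeps the sketch easy to test.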

Broadly speaking, we don't have any real rate limiting system in place. What rate limiting exists is ad hoc: it simply delays when a link gets added to the queue, which isn't the same as delaying after a download completes. Consider a "rate limit" delay of 3 seconds before each link is added to the download queue when downloads take 10 seconds: links are queued faster than they are downloaded, so pretty immediately the next download starts as soon as the previous one finishes, and we have no real rate limiting in place anymore.
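The 3-second / 10-second example can be checked with a small simulation. This is a sketch under a simplifying assumption that is not stated in the issue: a single worker downloads links one at a time. The function names are hypothetical. Each function returns the idle gap, in seconds, between the end of one download and the start of the next.

```python
def gaps_with_enqueue_delay(n_links, enqueue_delay, download_time):
    """Delay applied when adding links to the queue (the current ad hoc approach)."""
    gaps = []
    prev_end = None
    for i in range(n_links):
        enqueued_at = i * enqueue_delay
        # A download starts once the link is queued and the worker is free.
        start = enqueued_at if prev_end is None else max(enqueued_at, prev_end)
        if prev_end is not None:
            gaps.append(start - prev_end)
        prev_end = start + download_time
    return gaps


def gaps_with_post_download_delay(n_links, delay, download_time):
    """Delay applied after each download completes."""
    gaps = []
    prev_end = None
    for _ in range(n_links):
        start = 0 if prev_end is None else prev_end + delay
        if prev_end is not None:
            gaps.append(start - prev_end)
        prev_end = start + download_time
    return gaps


print(gaps_with_enqueue_delay(4, 3, 10))        # [0, 0, 0]
print(gaps_with_post_download_delay(4, 3, 10))  # [3, 3, 3]
```

With the delay on the enqueue side, every gap is zero: the server sees back-to-back requests. With the delay after completion, the intended 3 seconds of idle time is preserved.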

metaprime changed the title from "Add a globally-applicable table for rate limiting rules on URLs" to "[Proposal] Add a globally-applicable table for rate limiting rules on URLs" on Jan 6, 2025
metaprime (Contributor, Author) commented

Also, consider the update scenario. Self-rate-limiting shouldn't be necessary if we check a URL, discover that we already have it, and don't actually rip it. So, we should have a mechanism in place to apply the rate limiting rules only after we successfully rip something.
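One possible shape for that mechanism, as a rough sketch in Python rather than RipMe's Java: the rate limit window for a domain is started only when a download actually succeeds, and URLs we already have are skipped before any waiting happens. The `rip_all` function, its parameters, and the `intervals` table are hypothetical names introduced for illustration.

```python
import time
from urllib.parse import urlparse


def rip_all(urls, already_have, download, intervals,
            clock=time.monotonic, sleep=time.sleep):
    """Rip URLs in order, rate limiting per domain only after successful rips.

    already_have: set of URLs that were ripped on an earlier run.
    download:     callable(url) -> True if the rip succeeded.
    intervals:    hypothetical table, domain -> seconds to wait after a rip.
    """
    next_allowed = {}  # domain -> earliest clock() value for the next rip
    ripped = []
    for url in urls:
        if url in already_have:
            # Update scenario: nothing is fetched, so no wait and no new window.
            continue
        domain = (urlparse(url).hostname or "").lower()
        wait = next_allowed.get(domain, 0.0) - clock()
        if wait > 0:
            sleep(wait)
        if download(url):
            ripped.append(url)
            # The window starts when the download completes, not when queued.
            next_allowed[domain] = clock() + intervals.get(domain, 0.0)
    return ripped
```

Note that a failed download does not start a new window here, which follows the "only after we successfully rip something" wording above; whether failures should also count is a separate design question. The `clock` and `sleep` parameters exist only so the behavior can be exercised without real waiting.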
