this post was submitted on 13 Dec 2023
483 points (100.0% liked)
196
16822 readers
158 users here now
Be sure to follow the rule before you head out.
Rule: You must post before you leave.
Other rules
Behavior rules:
- No bigotry (transphobia, racism, etc…)
- No genocide denial
- No support for authoritarian behaviour (incl. Tankies)
- No namecalling
- Accounts from lemmygrad.ml, threads.net, or hexbear.net are held to higher standards
- Other things seen as cleary bad
Posting rules:
- No AI generated content (DALL-E etc…)
- No advertisements
- No gore / violence
- Mutual aid posts are not allowed
NSFW: NSFW content is permitted but it must be tagged and have content warnings. Anything that doesn't adhere to this will be removed. Content warnings should be added like: [penis], [explicit description of sex]. Non-sexualized breasts of any gender are not considered inappropriate and therefore do not need to be blurred/tagged.
If you have any questions, feel free to contact us on our matrix channel or email.
Other 196's:
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
It won't save everything, but if a script follows every link recursively, most content should be reached that way. That's kind of what Google does but for one site instead of the internet.
If there is a search function try very simple queries.
The alternative of brute forcing links would be unfeasible, even if you are not rate limited by the site, due to the exponential complexity.
If you want to do something please look into api/scraping etikette like exponential back off.