this post was submitted on 04 Aug 2025
55 points (100.0% liked)
Technology
353 readers
171 users here now
Share interesting Technology news and links.
Rules:
- No paywalled sites at all.
- News articles has to be recent, not older than 2 weeks (14 days).
- No videos.
- Post only direct links.
To encourage more original sources and keep this space commercial free as much as I could, the following websites are Blacklisted:
- Al Jazeera;
- NBC;
- CNBC;
- Substack;
- Tom's Hardware;
- ZDNet;
- TechSpot;
- Ars Technica;
- Vox Media outlets, with exception for Axios;
- Engadget;
- TechCrunch;
- Gizmodo;
- Futurism;
- PCWorld;
- ComputerWorld;
- Mashable;
- Hackaday;
- WCCFTECH;
- Neowin.
More sites will be added to the blacklist as needed.
Encouraged:
- Archive links in the body of the post.
- Linking to the direct source, instead of linking to an article talking about the source.
founded 3 months ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
The only surprising thing to me from this article is that OpenAI actually follows the rules for bot crawlers.
Or they haven't been caught yet.
The article explains PerplexityBot respects robots.txt, but then sends a different request with a different IP and different user-agent. They could very well be using a different method to walk around it.
The article explains how they tested for that, and as far as they could tell OpenAI is respecting the rules.