
Re: Error 500


On 2/6/26 3:49 AM, Fabien Bodard wrote:
> robots.txt is ignored by AI crawlers; they do not respect that
> convention, so we need to find other, indirect ways to block them.

I doubt there would be any significant monetary incentive for AI crawlers to slurp up the Gambas Wiki, so it is probably not much of a target. However, if it is an issue, there are ways to detect and block the less sophisticated bots:

1. User agent filtering. But user agents are easily and commonly spoofed.
2. IP address filtering. Same caveat as above, although cloud providers' address ranges have no legitimate reason to be visiting GambasWiki and could be blocked.
3. Rate limiting and throttling. I assume the server running GambasWiki already has that in place.
4. A honeypot: a link advertised only in robots.txt or hidden markup, so anything that follows it is a bot by definition (see the sketch after this list).
5. Fingerprinting. This is more effectively done at the server itself, but JavaScript could also be employed.
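For illustration only, here is a rough sketch of how 1, 2 and 4 could be combined in front of a web application. It is plain Python standard-library WSGI, not the actual GambasWiki setup, and the user-agent strings and the /trap path are made-up examples:

    # Minimal sketch: WSGI middleware combining user-agent filtering (1),
    # an IP ban list (2) and a honeypot path (4). All names are hypothetical.
    from wsgiref.simple_server import make_server

    BLOCKED_AGENTS = ("GPTBot", "CCBot", "Bytespider")   # example deny-list
    HONEYPOT_PATH = "/trap"                              # advertised only in robots.txt
    banned_ips = set()

    def app(environ, start_response):
        # Stand-in for the real wiki application.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"wiki page\n"]

    def bot_filter(inner):
        def middleware(environ, start_response):
            ip = environ.get("REMOTE_ADDR", "")
            agent = environ.get("HTTP_USER_AGENT", "")
            path = environ.get("PATH_INFO", "")
            # 4. Anything that requests the honeypot URL gets its address banned.
            if path == HONEYPOT_PATH:
                banned_ips.add(ip)
            # 1./2. Known crawler agents and trapped addresses get a 403.
            if ip in banned_ips or any(b in agent for b in BLOCKED_AGENTS):
                start_response("403 Forbidden", [("Content-Type", "text/plain")])
                return [b"Forbidden\n"]
            return inner(environ, start_response)
        return middleware

    if __name__ == "__main__":
        make_server("", 8000, bot_filter(app)).serve_forever()

The honeypot only works because the trap URL is never linked from normal pages, so only a crawler that ignores robots.txt will ever request it; well-behaved visitors never trip it.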


--
Lee

--- Gambas User List Netiquette [https://gambaswiki.org/wiki/doc/netiquette] ----
--- Gambas User List Archive [https://lists.gambas-basic.org/archive/user] ----

