How to block AI bots with fail2ban (Apache)
This setup bans AI bots with fail2ban, based on the user agent they send. The list of bots comes from https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/refs/heads/main/robots.txt
First, log client IP and user agent into a separate log file:
<VirtualHost <ipv4> [ipv6]:443>
[…]
ErrorLog ${APACHE_LOG_DIR}/<vhost>-error.log
CustomLog ${APACHE_LOG_DIR}/<vhost>-access.log combined
CustomLog ${APACHE_LOG_DIR}/useragent.log "%h %{User-agent}i"
[…]
</VirtualHost>
Now, create a jail in ‘/etc/fail2ban/jail.local’:
[apache-ai-crawler]
enabled = true
port = http,https
logpath = /var/log/apache2/useragent.log
bantime = 86400
findtime = 10
maxretry = 1
filter = apache-ai-crawler
And, of course, the filter itself (‘/etc/fail2ban/filter.d/apache-ai-crawler.conf’):
[Definition]
failregex = ^<HOST> .*AI1Bot.*$
            ^<HOST> .*Ai2Bot-Dolma.*$
            ^<HOST> .*Amazonbot.*$
            […]
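Writing one failregex line per bot by hand gets tedious, so the filter can also be generated from the robots.txt linked above. The following is only a sketch: it assumes the file lists each bot on its own `User-agent:` line, and the function names and the inline sample are made up for illustration.

```python
import re

# Made-up excerpt in the shape of a robots.txt (not the real list).
SAMPLE_ROBOTS_TXT = """\
User-agent: AI2Bot
User-agent: Ai2Bot-Dolma
User-agent: *
Disallow: /
"""


def user_agents(robots_txt: str) -> list[str]:
    """Return the names from 'User-agent:' lines, skipping the '*' wildcard."""
    names = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        name = value.strip()
        if sep and key.strip().lower() == "user-agent" and name and name != "*":
            names.append(name)
    return names


def build_filter(names: list[str]) -> str:
    """Render a fail2ban filter with one failregex line per bot name."""
    # re.escape keeps characters such as '.' or '-' in a name literal.
    patterns = [f"^<HOST> .*{re.escape(name)}.*$" for name in names]
    # Continuation lines of a multi-line value must start with whitespace.
    indent = " " * len("failregex = ")
    return "[Definition]\nfailregex = " + ("\n" + indent).join(patterns) + "\n"


print(build_filter(user_agents(SAMPLE_ROBOTS_TXT)), end="")
```

To use it for real, replace the sample with the downloaded robots.txt and write the result to the filter file named above.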
Reload both apache2 and fail2ban, and you are set.
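Since maxretry is 1, a single matching line is enough for a ban, so it is worth checking the patterns against a few log lines first. The sketch below only approximates what fail2ban does: it swaps `<HOST>` for a simple "first token" group, whereas fail2ban's own expansion is much stricter. The log lines are invented examples in the "%h %{User-agent}i" format, and the helper names are hypothetical.

```python
import re

# Approximation only: fail2ban expands <HOST> to a far more thorough pattern.
HOST_GROUP = r"(?P<host>\S+)"

FAILREGEX = [
    "^<HOST> .*Ai2Bot-Dolma.*$",
    "^<HOST> .*Amazonbot.*$",
]


def banned_host(line: str, patterns: list[str] = FAILREGEX):
    """Return the client address if any pattern matches the line, else None."""
    for pattern in patterns:
        match = re.search(pattern.replace("<HOST>", HOST_GROUP), line)
        if match:
            return match.group("host")
    return None


# Invented log lines: client address, a space, then the user agent.
bot_line = "203.0.113.7 Mozilla/5.0 (compatible; Amazonbot/0.1)"
browser_line = "198.51.100.4 Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"

print(banned_host(bot_line))      # address of the matching client
print(banned_host(browser_line))  # None, no pattern matches
```

For the authoritative answer, fail2ban ships the fail2ban-regex tool, which takes a log file and a filter file and reports which lines match.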
#AI #fail2ban