fail2ban hangs after making a change

Refer to KB http://kb.sp.parallels.com/en/122407

Symptoms

After a jail is enabled, the fail2ban service can be neither restarted nor stopped. The service status output is incomplete: it reports the process as running but omits the list of jails:

# service fail2ban status
fail2ban-server (pid  3291) is running

/var/log/fail2ban.log shows that fail2ban stopped while adding log files to its monitoring pool:

2014-07-27 21:09:25,487 fail2ban.filter [25047]: INFO    Added logfile = /var/www/vhosts/system/dom1.com/logs/proxy_access_log
2014-07-27 21:09:25,985 fail2ban.filter [25047]: INFO    Added logfile = /var/www/vhosts/system/domain.com/logs/proxy_access_ssl_log
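
To see which file fail2ban was adding when it stopped, a small sketch like the following can pull the path out of the last "Added logfile" entry. The function name last_added_logfile is hypothetical (it is not part of fail2ban or Plesk), and it assumes the log line format shown above.

```shell
# last_added_logfile is a hypothetical helper, not a fail2ban command.
# It prints the path from the last "Added logfile = ..." line of the given log.
last_added_logfile() {
    # sed -n prints only the lines where the substitution matched,
    # leaving just the path; tail keeps the final one.
    sed -n 's/.*Added logfile = //p' "$1" | tail -n 1
}
```

For example, last_added_logfile /var/log/fail2ban.log prints the most recently added log path, or nothing if the log contains no such entries.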

Cause

Fail2ban has the plesk-apache-badbot and plesk-apache jails (or other large jails) enabled. These jails force fail2ban to parse the access and error logs of every virtual host as well as Apache’s own logs. If there are many virtual host logs, the service hangs from resource overuse while trying to parse all of them.

NOTE: When you enable this jail in the panel, you might see the following warning:

Warning: Fail2Ban might not work well if there are many domains and Fail2Ban has to monitor too many log files.
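
To get a rough idea of how many files a jail would have to monitor, the matches for a log path mask can be counted. The sketch below is only an illustration: count_logs is a hypothetical helper name, not part of fail2ban or Plesk.

```shell
# count_logs is a hypothetical helper, not a fail2ban or Plesk command.
# It prints how many existing regular files are among its arguments.
count_logs() {
    local n=0 f
    # Pass the mask unquoted so the shell expands it before the call.
    # A mask with no match is passed through literally and fails the -f test.
    for f in "$@"; do
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

For example, count_logs /var/www/vhosts/system/*/logs/*access*log prints the number of per-domain access logs that the mask matches on the server.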

Resolution

Apply the following instructions if there are fewer than 300 domains and the number of log files in the jail needs to be reduced:

  1. Kill the stuck process(es) by PID. Exercise caution: first check which PIDs will be killed by running the command without the final "| xargs kill -9" part. The pattern '[f]ail2ban' keeps grep from matching its own process:
    # ps aux | grep '[f]ail2ban' | awk '{print $2}' | xargs kill -9
    
  2. Remove the .pid file:
    # rm -f /var/run/fail2ban/fail2ban.pid
    
  3. Reduce the number of logs that the plesk-apache-badbot jail has to parse (or disable the jail altogether). Open the file /etc/fail2ban/jail.d/plesk.conf and change the log path mask from '*access*log' to '*access_log':
    [plesk-apache-badbot]
    
    enabled  = true
    filter   = apache-badbots
    action   = iptables-multiport[name=BadBots, port="http,https,7080,7081"]
    logpath  = /var/www/vhosts/system/*/logs/*access_log
               /var/log/httpd/*access_log
    
  4. If the fail2ban service is running, execute fail2ban-client reload. Otherwise, start the service.
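
Before killing anything in step 1, it can help to preview exactly which PIDs would be affected. The following is a minimal sketch, assuming the standard "ps aux" column layout with the PID in the second field; fail2ban_pids is a hypothetical helper name.

```shell
# fail2ban_pids is a hypothetical helper, not a fail2ban command.
# It reads `ps aux` output on stdin and prints the PID (second column)
# of every line that mentions fail2ban.
fail2ban_pids() {
    # The [f] bracket keeps awk from matching its own command line,
    # which contains the text "[f]ail2ban" rather than "fail2ban".
    awk '/[f]ail2ban/ { print $2 }'
}
```

Run ps aux | fail2ban_pids and review the list first; only when it contains nothing unexpected, pass it on with ps aux | fail2ban_pids | xargs kill -9.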

The instructions below are for a large number of domains (more than 300):

Fail2Ban can use a lot of RAM on the server if it monitors many jails with many log files! Make sure that the server will not run out of memory before applying this solution! If it does, disable some jails.

If you have a very large number of domains on your Plesk server and the above workaround doesn’t help, you can split the logs across several jails, so that they are loaded one by one and each jail contains fewer logs. This should help because the issue is caused by a single big jail with many logs; it does not occur when there are many jails that each include a small number of logs.
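
To preview how such a split would distribute the domains, the directories under the virtual hosts root can be grouped by the first character of their name. This is only a sketch: bucket_counts is a hypothetical helper, and it assumes one directory per domain under the given root, as in /var/www/vhosts/system.

```shell
# bucket_counts is a hypothetical helper, not a Plesk or fail2ban command.
# For each first character (a-z, then 1-9 and 0) it counts the directories
# under the given root whose name starts with that character, and prints
# the non-empty buckets as "<char> <count>".
bucket_counts() {
    local root="$1" c d n
    for c in a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0; do
        n=0
        # The trailing slash restricts the glob to directories; an
        # unmatched glob stays literal and fails the -d test.
        for d in "$root"/"$c"*/; do
            [ -d "$d" ] && n=$((n + 1))
        done
        [ "$n" -gt 0 ] && echo "$c $n"
    done
    return 0
}
```

For example, bucket_counts /var/www/vhosts/system lists one line per letter or digit that would get its own jail, together with the number of domains in it. A bucket that is still very large suggests that the split may not be fine-grained enough for that server.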

Use the following commands to create separate jails for domains according to the first letter or digit of the domain name:

  1. Get admin email:
    admin_email=`mysql -Ns -uadmin -p\`cat /etc/psa/.psa.shadow\` psa -Ne"select email from clients where login='admin'"`
    
  2. Set plesk-apache jails:
     for i in a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0;do find /var/www/vhosts/system/$i*/logs/error_log 2>/dev/null 1>/dev/null; found=`echo $?`;if [ $found == "0" ];then echo "[[\"usedns\",\"no\"],[\"logpath\",\"\\/var\\/www\\/vhosts\\/system\\/$i*\\/logs\\/error_log\"],[\"enabled\",\"true\"],[\"filter\",\"apache-auth\"],[\"maxretry\",\"6\"],[\"__source__\",\"jail.d\\/plesk.conf\"],[\"action\",\"iptables-multiport[name=apache, port=\\\"http,https,7080,7081\\\"]\"],[\"ignoreip\",\"127.0.0.1\/8\"],[\"bantime\",\"600\"],[\"destemail\",\"$admin_email\"],[\"findtime\",\"600\"],[\"backend\",\"auto\"]]"|/usr/local/psa/admin/bin/f2bmng --set-jail plesk-apache-$i ;fi;done
    
  3. Set plesk-apache-badbot jails:
     for i in a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0;do find /var/www/vhosts/system/$i*/logs/error_log 2>/dev/null 1>/dev/null; found=`echo $?`;if [ $found == "0" ];then echo "[[\"usedns\",\"no\"],[\"logpath\",\"\\/var\\/www\\/vhosts\\/system\\/$i*\\/logs\\/*access_log\"],[\"enabled\",\"true\"],[\"filter\",\"apache-badbots\"],[\"maxretry\",\"100\"],[\"__source__\",\"jail.d\\/plesk.conf\"],[\"action\",\"iptables-multiport[name=BadBots, port=\\\"http,https,7080,7081\\\"]\"],[\"ignoreip\",\"127.0.0.1\/8\"],[\"bantime\",\"172800\"],[\"destemail\",\"$admin_email\"],[\"findtime\",\"600\"],[\"backend\",\"auto\"]]" |/usr/local/psa/admin/bin/f2bmng --set-jail plesk-apache-badbot-$i;fi;done
    
  4. In the regular plesk-apache-badbot and plesk-apache jails, leave only the general access and error log file paths:

    plesk-apache-badbot:

    /var/log/httpd/*access_log

    plesk-apache:

    /var/log/httpd/*error_log
    
  5. In steps 2 and 3, jails were created only for the first letters or digits that match at least one existing domain name, because otherwise Fail2ban will not start due to configuration errors. Now set up a script that adds a jail when a new domain is created:
    • Download the attached script, put it on your server (the event handlers below expect it in /root), and grant it executable permissions:
      wget http://kb.sp.parallels.com/Attachments/kcs-32570/add_jails.sh
      
      chmod +x add_jails.sh
      
    • Create event handlers in the Plesk Event Manager with the following parameters:

      Event: Domain created
      Priority: lowest (0)
      User: root
      Command: /root/add_jails.sh <new_domain_name>

      Event: Default domain (the first domain added to a subscription or webspace) created
      Priority: lowest (0)
      User: root
      Command: /root/add_jails.sh <new_domain_name>

  6. If Fail2ban hung, restart it as described in the first set of instructions above.
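
The one-liners in steps 2 and 3 are dense, so the sketch below shows the payload from step 2 in a more readable form. It only prints the JSON for one prefix and does not call f2bmng. The function name apache_jail_json is hypothetical; the field values and escaping are copied from step 2.

```shell
# apache_jail_json is a hypothetical helper, not a Plesk command.
# It prints the JSON that step 2 pipes into f2bmng for one prefix.
#   $1 = first letter or digit of the domain names
#   $2 = administrator email (the admin_email value from step 1)
apache_jail_json() {
    local prefix="$1" email="$2"
    # In an unquoted here-document only $, ` and \ need escaping, so the
    # \/ and \" sequences below are printed exactly as written.
    cat <<EOF
[["usedns","no"],["logpath","\/var\/www\/vhosts\/system\/${prefix}*\/logs\/error_log"],["enabled","true"],["filter","apache-auth"],["maxretry","6"],["__source__","jail.d\/plesk.conf"],["action","iptables-multiport[name=apache, port=\"http,https,7080,7081\"]"],["ignoreip","127.0.0.1\/8"],["bantime","600"],["destemail","${email}"],["findtime","600"],["backend","auto"]]
EOF
}
```

Running apache_jail_json a "$admin_email" prints the payload for domains starting with "a", which can be inspected before it is piped to /usr/local/psa/admin/bin/f2bmng --set-jail plesk-apache-a as in step 2.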