Some days ago we had trouble on one of our QFXes where the jdhcpd daemon would consume 100% CPU and “crash”, resulting in users no longer getting IP addresses.
While TAC is still investigating, I made a quick workaround for this – the DHCP-Sheriff 😉
#!/bin/bash
# DHCP-Sheriff: restart dhcp-service when the CPU load of jdhcpd exceeds the threshold.
log=/var/log/dhcp-sheriff.log
# CPU column of the jdhcpd process as printed by top, e.g. "0.00%"
current=$(top | grep jdhcpd | awk '{ print $10 }')
desired="1.00%"

# Above the threshold if the integer part is larger, or if the integer parts
# are equal and the fractional part is larger (string comparison).
if [ "${current%.*}" -eq "${desired%.*}" ] && [ "${current#*.}" \> "${desired#*.}" ] || [ "${current%.*}" -gt "${desired%.*}" ]; then
    date >> "$log"
    echo "The current load of dhcp-service is above desired value - restarting the service" >> "$log"
    echo "${current} >= ${desired}" >> "$log"
    cli restart dhcp-service
    echo "" >> "$log"
else
    date >> "$log"
    echo "The current load of dhcp-service is below desired value - no action needed" >> "$log"
    echo "${current} <= ${desired}" >> "$log"
    echo "" >> "$log"
fi
This script restarts the service if its CPU load is above 1% (adjustable); it can easily be adapted to other services and thresholds.
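If you want to reuse the check for other services or thresholds, the comparison could be pulled out into a small helper. The following is only a sketch: the function name cpu_above is my own invention and not part of the script above, and it assumes awk is available in the shell (the script above already relies on it). It lets awk do the floating-point comparison instead of splitting the value at the dot.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): succeeds (exit 0)
# when the first percentage is strictly greater than the second.
cpu_above() {
    current="${1%\%}"   # strip a trailing "%", e.g. "12.50%" -> "12.50"
    desired="${2%\%}"
    # awk compares the two values numerically; an empty value counts as 0
    awk -v c="$current" -v d="$desired" 'BEGIN { exit !(c + 0 > d + 0) }'
}

# Possible use inside a script like the one above (assumption, untested on Junos):
#   if cpu_above "$(top | grep jdhcpd | awk '{ print $10 }')" "1.00%"; then ...
```

With this, changing the threshold or the process name means changing one argument each, and an empty reading (process not found) simply counts as "not above".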
1.) Log in as root and, in the shell, type: vi /var/tmp/dhcp-sheriff.sh
2.) Press “i” and paste the above lines, then press [Esc]. Save and quit with :wq
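3.) Make the script executable:
chmod +x /var/tmp/dhcp-sheriff.sh
4.) Open the crontab and add the following entry, which executes the script every 8h:
crontab -e
0 */8 * * * sh /var/tmp/dhcp-sheriff.sh
5.) Verify that the entry was saved:
crontab -l
0 */8 * * * sh /var/tmp/dhcp-sheriff.sh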
6.) After the job has run, check the result in the CLI with: show log dhcp-sheriff.log
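To see when the job from step 4 actually fires, here is a small sketch that lists the hours matched by a step value in the cron hour field. The function name cron_hours is made up for illustration; it only mimics how a step like */8 is expanded over the hours 0 to 23.

```shell
#!/bin/sh
# Hypothetical helper: print the times of day matched by "*/STEP" in the
# hour field of a crontab entry whose minute field is 0.
cron_hours() {
    step="$1"
    h=0
    while [ "$h" -le 23 ]; do
        # a step value matches every hour that is a multiple of the step
        [ $((h % step)) -eq 0 ] && printf '%02d:00\n' "$h"
        h=$((h + 1))
    done
    return 0
}

cron_hours 8
```

For a step of 8 this gives 00:00, 08:00 and 16:00. If the daemon tends to run away faster than that, a smaller step in the crontab entry shortens the time until the next check.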
Feel free to use this to your advantage – hopefully it will serve as a workaround for you in urgent times until a fix is released.
This is only a workaround – do not rely on it in production for long, and use it at your own risk.