If, during a heartbeat, the host receives an ERROR response from sending the last_modified time of its configuration, the host is expected to reconfigure itself. If the configuration changes many times in a short period of time, many heartbeats may be lost if they bail out as soon as the ERROR response is received. This has now been changed so that a heartbeat is completed in its entirety before the host attempts to reconfigure with the filter manager.
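The change described above can be sketched roughly as follows. This is an illustrative Python sketch, not the ihost's own Perl code: the function name `run_heartbeat`, the `send` callback, and the step names used in the usage note are all assumptions made for the example. The only behaviour taken from the log entry is that an ERROR response is remembered and acted on after the heartbeat has finished, rather than aborting it.

```python
def run_heartbeat(send, steps):
    """Send every heartbeat step, then report whether a reconfigure is needed.

    `send` is a hypothetical callable that transmits one step to the server
    and returns the server's response as a string.
    """
    needs_reconfigure = False
    for step in steps:
        if send(step) == "ERROR":
            # Remember the error, but do not bail out: the remaining steps
            # are still sent so that this heartbeat is not lost.
            needs_reconfigure = True
    return needs_reconfigure
```

A caller would run the whole heartbeat first and only then start reconfiguration if the return value is true, for example `if run_heartbeat(send, steps): reconfigure()`, where `reconfigure` is again a hypothetical name.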
Increased the TCP config wait time between retries to one minute. Fixed sporadic errors with obtaining the IP address that caused a host to go all wibbly after a few days.
Lots of changes in the TCP configuration and TCP heartbeats. The aim of this is to make the ihost much more resilient when dealing with broken servers.
Just changed the expression within the connection while loop.
If we cannot connect to the filter manager, then we shall retry without closing the socket, as it won't be open to close ;)
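A minimal Python sketch of this retry behaviour, under stated assumptions: the function name `connect_with_retry`, its parameters, and the 60-second default wait are hypothetical, and the real ihost is a Perl script whose code is not shown in this log. The point illustrated is that a failed connection attempt leaves no open socket behind, so the retry path has nothing to close.

```python
import socket
import time


def connect_with_retry(host, port, retry_wait=60.0, max_attempts=None,
                       sleep=time.sleep, connect=socket.create_connection):
    """Keep trying to connect; return the socket, or None if attempts run out.

    `sleep` and `connect` are injectable so the loop can be exercised
    without a network; by default they are the standard library functions.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return connect((host, port), timeout=10)
        except OSError:
            # The connection was never established, so there is no socket
            # to close here. Wait before the next attempt, if there is one.
            if max_attempts is None or attempt < max_attempts:
                sleep(retry_wait)
    return None
```

With `max_attempts=None` the loop retries forever, which matches a daemon that should outlive a broken server; a finite value is mainly useful for testing.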
Retry time is now ten minutes (wave of arm)
Added retry waits to the configuration process.
Added support for reconfiguring upon any type of heartbeat error.
Changed the path of the PID file... it didn't work so well when the ihost directory is on a shared file store :)
ihost now writes its PID to a file, ihost.pid. This is so the ihostchk.sh script can verify if it's still running.
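The PID-file idea can be sketched as below. This is an illustrative Python version for POSIX systems, not the ihost's Perl code or the contents of ihostchk.sh, neither of which appears in this log; the function names `write_pid_file` and `is_running` are assumptions. Sending signal 0 with `os.kill` delivers nothing and only checks whether the process exists.

```python
import os


def write_pid_file(path):
    """Record the current process ID in `path`, one number on one line."""
    with open(path, "w") as fh:
        fh.write(f"{os.getpid()}\n")


def is_running(path):
    """Return True if the PID stored in `path` belongs to a live process."""
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False          # missing, unreadable, or malformed PID file
    if pid <= 0:
        return False          # 0 and negatives address process groups
    try:
        os.kill(pid, 0)       # signal 0: existence check only (POSIX)
    except ProcessLookupError:
        return False          # no such process
    except PermissionError:
        return True           # process exists but belongs to another user
    return True
```

A check of this kind can be fooled if the PID has been reused by an unrelated process, which is a known limitation of plain PID files.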
The host now acts sensibly when the server either does not return anything upon sending FILTER or if it returns a filter address that cannot be correctly split into a hostname and pair of ports.
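As a rough illustration of the validation described above, the following Python sketch rejects an empty reply or one that does not split into a hostname and two ports. The `;` separator, the function name `parse_filter_address`, and the port range check are assumptions for the example; the log does not state the actual wire format, and the real ihost is written in Perl.

```python
def parse_filter_address(reply, sep=";"):
    """Split a FILTER reply into (hostname, port, port), or return None.

    The separator is an assumed value; pass a different `sep` if the
    server uses another delimiter.
    """
    if not reply:
        return None                       # server returned nothing
    parts = reply.strip().split(sep)
    if len(parts) != 3:
        return None                       # not hostname plus two ports
    host, first, second = parts
    if not host:
        return None
    ports = []
    for text in (first, second):
        if not (text.isascii() and text.isdigit()):
            return None                   # port is not a plain number
        port = int(text)
        if not 0 < port < 65536:
            return None                   # outside the valid port range
        ports.append(port)
    return host, ports[0], ports[1]
```

Returning `None` instead of raising lets the caller treat every malformed reply the same way, for instance by waiting and asking the server again.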
Changed the way in which the host behaves when the server goes down mid-heartbeat. It shouldn't complain; it will just silently withdraw.
Fixed the heartbeat resilience (is that spelt right? ;-) Hosts should no longer die when (or rather, if) the server disappears. Heartbeats will just silently fail instead.
Changes made to ensure that the ihost keeps running if it is unable to open a socket to the filter for performing a heartbeat.
Eeeks. Left some debug info in, how very unprofessional of me :)
Although we were getting the FQDN fine, we weren't actually using it :)
Removed the now unnecessary code to work out the domain name, etc. The FQDN is provided by the server.
Updated to comply with the new FilterManager configuration protocol. Namely, the ihost now also sends "FQDN" to the server to request its fully qualified domain name. This simplifies the task of writing hosts, as a host no longer has to work out its own machine name (which is sometimes a non-trivial task on certain platforms).
It appears /etc/resolv.conf is different on some hosts, so we have to do some careful checks.
The building of disk information has been reworked to provide its information as attribute values, as these only occur at most once, thus reducing overall packet size.
Removed the printing of the entire $disk_info string.
Now using the correct names for packet values when obtaining them from the packet hash.
Tidied up the code a little, added explicit returns in the subroutines, a couple more comments where it was deemed necessary.
If the server 'disappears', then the ihost will carry on running until it is once again able to deliver TCP heartbeats.
Added sending of the uptime value.
Improved the main loop.
Fixed problem of domain name not being specified on the first line of the cat'ed file.
Fixed a few minor print statements. Time is now expressed in milliseconds. The regexp for returning the machine name has been altered.
Changed the path of perl.
Machine hostname is no longer printed out.
And again...
Corrected the regexp for hostname again.
Modified the regular expression to return the hostname.