at 2007-05-24
in Editorial
by kiesel
In the enterprise world, not everything is a website: running an enterprise website or application usually means deploying several other kinds of applications alongside it.
Among these "hidden" applications are often cron jobs, which take care of cleaning out uncompleted order jobs, running lengthy cache pre-calculations, performing data mining and many other things.
Cron jobs, or backend applications in general, are therefore an important part of the whole application. Running regularly, they do their duty.
Of course, such a critical task should be monitored. Read on to see how you can set up effective but simple surveillance with the XP framework's tools in combination with the open source monitoring system Nagios:
Usually, cron systems provide out-of-the-box error reporting through the mail system. In certain circumstances, however, such as a general database outage, these mails become a burden for developers: they receive a new mail every minute, each carrying the same error message, "Database not reachable".
Or, from the opposite perspective: if the cron job fails silently (because error output is redirected with 2>/dev/null), developers will not notice the failure at all.
This is where the XP framework's Heartbeat class comes into play. The name is borrowed from the human body: a doctor can listen to a patient's heartbeat to tell that the patient is alive, and the same goes for cron jobs. Through the Heartbeat class, a cron job can send a small TCP/IP status packet to the surveillance system, Nagios, which can then show the cron job's status on a status web page. Nagios itself is then configured to raise an alarm after a certain period without heartbeats from the application.
You can do this with org.nagios.nsca.NscaClient, which is also used internally by org.nagios.nsca.Heartbeat; the latter is just an easier API to use. This is what a normal cron job would look like:
Heartbeat::getInstance()->setup('nagios://surveillance.xp-framework.net/application');
...
Heartbeat::getInstance()->emit(NSCA_OK, "Application ok.");
The setup() call configures the NSCA client with the server information (server name, protocol version, service name, ...), while emit() actually sends out the status information at the end of the run. You could also use send(), which, however, throws an exception if the surveillance system itself is not reachable. The following NSCA states are available:
- NSCA_OK
- NSCA_WARN
- NSCA_ERROR
- NSCA_UNKNOWN
Heartbeats in the middle of a run are possible as well, for example for a sub-service. However, Nagios will usually only show the latest state of a service in its frontend, so the state should not be updated too often.
With a setup like this, developers will only receive mail if the application has not run for a configured period of time. If a database outage occurs and is resolved before the timeout expires, developers will not be bothered by redundant emails; the database administrators will take care of it.
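On the Nagios side, the "configured period of time" is typically expressed through freshness checking on a passive service. The fragment below is a minimal sketch, not taken from the XP framework's own setup: the host name cron01, the two-hour threshold, the command name no-heartbeat, and the use of the generic-service template and the check_dummy plugin from the stock Nagios samples are all assumptions to adapt to your installation. The service_description is assumed to match the service name given in the setup() URL.

```
# Passive service that only receives results via NSCA heartbeats.
define service {
    use                     generic-service   # assumed stock template
    host_name               cron01            # hypothetical host
    service_description     application       # assumed to match the setup() URL
    active_checks_enabled   0                 # Nagios never polls this service
    passive_checks_enabled  1                 # results arrive from the cron job
    check_freshness         1                 # watch the age of the last result
    freshness_threshold     7200              # seconds without a heartbeat
    check_command           no-heartbeat      # run only when the result is stale
}

# Command executed when the last heartbeat is older than the threshold.
# check_dummy simply returns the given state (2 = CRITICAL) and message.
define command {
    command_name    no-heartbeat
    command_line    $USER1$/check_dummy 2 "CRITICAL: no heartbeat received"
}
```

With check_freshness enabled, Nagios runs the check_command only when no passive result has arrived within freshness_threshold seconds, so a missing heartbeat turns the service critical and triggers the usual notifications.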