Sat, 23 Aug 2008
Nagios Service and Hosts stats - Graphed in Munin
We've been hitting some load issues on one of our monitoring
machines recently and while it looks like the munin graph generation
is the culprit we also decided to keep an eye on how many services and
hosts Nagios was checking.
One of the downsides of having a very automated server deployment system is how easy it is to suddenly find yourself with an extra dozen hosts you no longer really need. While each check is quite small and quick, add up the frequent runs and multiply it by a reasonable number of servers and you can soon hit problems.
So as a first step towards keeping an eye on those numbers we now have a munin Nagios hosts plugin and a munin Nagios services plugin that show the total number of hosts and services monitored and the states those resources are in.
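For anyone who hasn't written a munin plugin before, here is a rough sketch of the general shape of one. It is not the plugin mentioned above. It assumes Nagios writes its usual status.dat file with "hoststatus { ... }" blocks, and the STATUS_FILE path, the function names and the field names are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical munin plugin sketch: counts Nagios hosts by state.
# STATUS_FILE is an assumed location; point it at your own status.dat.
STATUS_FILE="${STATUS_FILE:-/var/cache/nagios3/status.dat}"

count_host_states() {
    # $1 = path to status.dat. Inside each "hoststatus {" block,
    # current_state is 0 (up), 1 (down) or 2 (unreachable).
    awk '
        /^[ \t]*hoststatus[ \t]*[{]/ { in_host = 1; next }
        /^[ \t]*[}]/                 { in_host = 0; next }
        in_host && /^[ \t]*current_state=/ {
            split($0, kv, "=")
            state[kv[2] + 0]++
        }
        END {
            printf "up.value %d\n", state[0]
            printf "down.value %d\n", state[1]
            printf "unreachable.value %d\n", state[2]
        }
    ' "$1"
}

nagios_hosts_plugin() {
    # munin calls a plugin with "config" to learn about the graph
    # and with no argument to fetch the current values.
    if [ "$1" = "config" ]; then
        echo "graph_title Nagios host states"
        echo "graph_vlabel hosts"
        echo "graph_category nagios"
        echo "up.label up"
        echo "down.label down"
        echo "unreachable.label unreachable"
    else
        count_host_states "$STATUS_FILE"
    fi
}
```

To use it as a real plugin you would end the file with nagios_hosts_plugin "$@" and symlink it into munin's plugin directory. A services version would look the same but match "servicestatus" blocks, where state 1 is warning, 2 is critical and 3 is unknown.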
Posted: 2008/08/23 15:20 | /tools/commandline | Permanent link to this entry | This entry + same date
Nagios Checks - Validate HTML and Validate Feed
As part of my ongoing attempt to stop myself from silently making
mistakes (I don't so much mind the ones I notice) I've added another
couple of Nagios plugins. This time validate_feed and validate_html.
As both of these checks call out to an external, third-party resource, if you use them be sure to turn your Nagios polling frequency down to a respectful level.
Posted: 2008/08/23 15:11 | /tools/commandline | Permanent link to this entry | This entry + same date
Fri, 15 Aug 2008
UnixDaemon.net gitweb - because everyone else has one!
I'm not exactly a demanding user of version control systems so I've not
been heavily motivated to ditch my personal SVN repo (which I don't use as
much as I should) and plunge in to the shiny new distributed ones.
However (and this is my excuse) I've recently wanted to put a handful of
my own Nagios
plugins under a public VCS. While we use a number of the checks
at work I don't necessarily want the local changes to be made immediately
public so I thought I'd take this as an opportunity to have a fiddle
with git.
I've now got my own gitweb instance
(because I'm a tech sheep) and while it's pretty easy to install and set up
it was oddly difficult to track down how to modify anything beyond the
basics (the answer? Hack the feature hash from the config file - eg
$feature{'snapshot'}{'default'} = [];).
First impressions of git? I've got a lot to learn. The basic
commands were easy enough to pick up but I am very conscious of how little
of its power I'm using.
Posted: 2008/08/15 17:09 | /unixdaemon | Permanent link to this entry | This entry + same date
The Mummy: Tomb of the Dragon Emperor - Short Review
It's been the summer of returns, from the very enjoyable Dark Knight to the should
have been left buried 'plot' of Indy and the Crystal Skull.
Unfortunately most of them have been rubbish - and The Mummy: Tomb of the Dragon
Emperor doesn't do anything to address this.
The special effects are good but nothing groundbreaking, and the new Evelyn O'Connell is a masterpiece of terrible - how such an uninteresting character can steal and kill so many scenes baffles me. The story is predictable, boring and lacks the hammy goodness of the first two. At least they didn't cast Shia LaBeouf. 3/10.
Now roll on Hellboy 2!
Posted: 2008/08/15 16:54 | /movies | Permanent link to this entry | This entry + same date
Thu, 14 Aug 2008
Filter syslog logs with syslogslicer
While digging through a pile of syslog log files recently I needed
something a little more data format aware than pure grep. So I present the
first version of syslogslicer
- a simple Perl script that knows a little bit about the syslog log file
format.
# some example command lines
syslogslicer -p cron -f program,message /var/log/syslog
# print the program and message for all lines with program 'cron'
syslogslicer -p cron -m hourly /var/log/syslog
# all fields for all lines with program 'cron' and message 'hourly'
syslogslicer -p cron -m hourly -s 20080810100000 -e 20080810123000 /var/log/syslog
# all fields for all lines with program 'cron' and message 'hourly'
# between 20080810100000 and 20080810123000
syslogslicer allows you to filter the output by matching text in the program or log message, only print certain output fields and do basic time based filtering. If you've ever wanted to see all the logs raised by postfix with the word 'database' in them between 10 and 11 am then this might be the tool for you.
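To show why a format-aware filter beats plain grep, here is a small awk approximation of the program and message matching. It is not syslogslicer itself (that's Perl). The function name slice_syslog is invented, it assumes the traditional "Mon DD HH:MM:SS host program[pid]: message" layout, and it leaves out the time based filtering entirely.

```shell
slice_syslog() {
    # $1 = program name, $2 = text to find in the message, $3 = log file
    # Assumes lines like: Aug 10 10:15:01 host cron[1234]: message text
    awk -v prog="$1" -v pat="$2" '
        {
            tag = $5
            sub(/(\[[0-9]+\])?:$/, "", tag)    # strip "[pid]:" or ":"
            if (tag != prog) next

            # Everything after the fifth field is the message.
            msg = $6
            for (i = 7; i <= NF; i++) msg = msg " " $i

            # An empty pattern matches every message.
            if (pat != "" && index(msg, pat) == 0) next
            print tag ": " msg
        }
    ' "$3"
}
```

Running slice_syslog cron hourly /var/log/syslog prints the program and message of each matching line. Because the match is made against the program field rather than the whole line, a message that merely mentions cron won't sneak through the way it would with grep.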
Posted: 2008/08/14 13:28 | /tools/commandline | Permanent link to this entry | This entry + same date
Nagios - Check Proxy Check
"This script retrieves a URL via a specified proxy server and
alerts (using the standard Nagios conventions) if the request
fails."
We're running a couple of services through a proxy server for a number of good, and to be honest a couple of not so good but mandated, reasons. The Check Proxy Check Nagios Plugin ensures that if the proxy goes down in a way that stops us pulling pages through it we know.
Posted: 2008/08/14 10:30 | /tools/commandline | Permanent link to this entry | This entry + same date
Apache JMeter - Short Review
A short review for a short book.
Apache
JMeter (Packt Publishing) is a good book if you're new to both IT and
testing and want your hand securely held. It introduces you to the basic
ideas behind automated testing, takes you step by step through some
simple GUI test cases and then doesn't go any further.
It's a short book and maintains its beginner's focus well but it has a very short lifespan (luckily it's also available as a cheap PDF) and if you're comfortable with GUIs and basic testing, or willing to click around for a while, I'd recommend you dive straight in to the JMeter GUI rather than investing half a day to read this book.
On the downside it didn't cover any of the aspects of JMeter I found interesting and wanted to learn about - the access log sampler and distributed load testing spring to mind - which in a beginner's book is fine enough but does make it completely the wrong book for me.
Posted: 2008/08/14 08:23 | /books | Permanent link to this entry | This entry + same date
Wed, 13 Aug 2008
Nagios Disk Check - Mountpoint or Filesystem?
If you mount filesystems under a specific mount point, and monitor
them with Nagios, then be sure
you understand what happens if the underlying file system goes away.
With:
/usr/lib/nagios/plugins/check_disk -w 15% -c 10% -p /a_mount_point
you'll get the value from the containing file system. In this case
/. If you'd rather know that your chosen mount point has
actually gone away, and that you're no longer checking what you thought
you were, then add the -E option to the command. This will
turn on exact path matching and catch that kind of error.
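If you want to see the fall-through for yourself without involving Nagios, something like the following sketch does the same comparison by hand. The function name is_exact_mount is made up, and it assumes a POSIX df whose sixth column is the mount point (so mount points containing spaces will confuse it).

```shell
is_exact_mount() {
    # $1 = path we expect to be a mount point.
    # df -P reports the filesystem that holds the path; if the mount
    # has gone away that will be the parent filesystem, not the path.
    mounted_on=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $6 }')
    [ "$mounted_on" = "$1" ]
}
```

For example, is_exact_mount /a_mount_point || echo "not mounted" prints the warning when the directory is just a directory on the containing file system, which is the case -E is there to catch.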
Posted: 2008/08/13 22:54 | /tools/commandline | Permanent link to this entry | This entry + same date
Testing the 'Net isn't there with Nagios
We've had to deliberately disable outbound access on some machines this week to
ensure they can't connect out to the internet - we're building testing
versions of some of our more restricted secure environments and this is
one of the steps.
It was actually easier to do with IPTables than I thought (mostly because I didn't have to do it - my co-worker did) but once the work was done we needed to ensure it didn't accidentally get broken so that networking was functional again. And yes, that's an odd thing to type. So naturally we turned to Nagios and so, for my own memory as much as anything else, here is the check we're using:
# put this in the machine's nrpe config file.
/usr/lib/nagios/plugins/negate -t 30 "/usr/lib/nagios/plugins/check_http -w 5 -c 10 -H www.google.com -u /"
In the Nagios 'Status Information' field you'll get a message that
looks like this - CRITICAL - Socket timeout after 10
seconds - but the check returns the correct error code so it's
all green.
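In case the exit code swap isn't obvious, this is roughly what negate does by default, written as a shell function. It's only an illustration: negate_sketch is an invented name and it ignores the real plugin's timeout handling and its options for remapping other states.

```shell
negate_sketch() {
    # Run the wrapped check, then swap OK and CRITICAL.
    "$@"
    status=$?
    case "$status" in
        0) return 2 ;;            # OK becomes CRITICAL
        2) return 0 ;;            # CRITICAL becomes OK
        *) return "$status" ;;    # WARNING and UNKNOWN pass through
    esac
}
```

So when check_http times out and exits 2, the wrapper exits 0 and Nagios shows the service as OK, even though the status text still says CRITICAL.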
Posted: 2008/08/13 22:50 | /tools/commandline | Permanent link to this entry | This entry + same date
Tue, 12 Aug 2008
You've gathered the requirements, written the code, debugged it, received the new requirements, rewritten the code, got more change requests, reached a 'compromise' with QA (and hidden the bodies) and now you want to have the sysadmins do the release.
Don't be like everyone else - when it comes to releases too many people fail at the last mile and make obvious mistakes. In an attempt to save myself some pain (and have something to point co-workers at) here are some of the software release principles that I hold dear.
Out of hours releases will have adequate support
Or as I like to think of it - "out of hours releases will hurt you as much as they will me. And a little bit more." If the release is important enough to require me in the office late at night or over the weekend then it's important enough to have development support and a manager present, Just In Case. It'll also force people to be a little more considerate of my time and availability.
No live release will happen after 4pm (at the latest)
There's nothing quite as frustrating as getting two thirds of the way through a live release, hitting a problem or needing clarification of something that the staging environment didn't pick up (yes, I know it should have. Let's fix it for next time) and discovering it's 6pm and everyone else is already on the tube or in the pub.
You then have the pleasure of either backing the release out (if you actually can) and explaining why you killed the scheduled release or hanging around with half an upgraded system waiting for someone to get your voice mails and call you back. Which is even less likely to happen if you ignore...
No release will happen on the day before a non-work day.
Or the day the lead developer goes on holiday.
"We're nearly done. Can you get $dev to have a look at this line in the application error log please?" "Actually he's in Peru for the next three weeks. I'll get someone else who's never seen the system before to confirm that everything's fine." Apart from the obvious sign this is a made up conversation (application error logs that contain information - HAH!) I've been bitten by this a number of times. It always seems to end with some other poor developer with a postage stamp of handover notes looking sheepishly at me while explaining that the log line could never happen.
You'll provide me with a list of what's changed
When you're developing you should maintain decent change logging above and beyond 'commit -m'. I'd like the world to agree that commit messages are for developers and release notes are for sysadmins; let's pretend I'm not paranoid enough to read the commits list anyway.
If you're using one of the agile methodologies that use stories for everything then feel free to put the story number in the notes to provide some background. However I still expect a one sentence summary of each change.
If you don't have a decent, and comprehensive, list of changes expect me to get... inquisitive about undocumented changes. I will diff the packages (better version of this coming soon) or source code (and if I ever get the time to look at SQL::Translator I'll be aggressively double checking your schemas). If you don't mention it I can't prepare (and add monitoring) for it, QA can't test all the new paths and I'll make a point of it in the release retrospective meeting.
It should be possible to stagger releases across machines
I'm a fan of the one, few, many approach to software releases. I want the ability to roll out the new system in chunks. I should be able to break off a couple of web servers so I have a warm standby just in case something goes wrong. I know this gets difficult to do once you involve databases but it's still a goal that should be considered - especially with read only copies of the data, replication slaves and data snapshots in the tool belt.
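As a toy illustration of one, few, many, the sketch below labels a list of hosts with the wave they'd be released in. The function name release_waves, the host names and the size of the "few" wave are all assumptions for the example. How big each wave should be depends entirely on your own setup.

```shell
release_waves() {
    # $1 = size of the "few" wave; remaining arguments are host names.
    few=$1
    shift
    i=0
    for host in "$@"; do
        i=$((i + 1))
        if [ "$i" -eq 1 ]; then
            echo "one $host"      # first, a single canary host
        elif [ "$i" -le $((few + 1)) ]; then
            echo "few $host"      # then a small batch
        else
            echo "many $host"     # then everything that's left
        fi
    done
}
```

Calling release_waves 2 web1 web2 web3 web4 web5 puts web1 in the first wave, web2 and web3 in the second and the rest in the last, so there's a natural pause to check the logs after each step.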
So in closing, I'm a demanding bugger. However, just like my Cron Commandments post, it's nice to have this list somewhere online to point people at.
Posted: 2008/08/12 23:12 | /sysadmin | Permanent link to this entry | This entry + same date
YAPC::EU 2008 - Not for me
Since I've been asked whereabouts at the conference I am I should probably
mention that I'm not attending YAPC::EU this year.
Despite the excellent job the organisers did last year at the Nordic Perl
Workshop a combination of factors stopped me going back to Copenhagen.
The first one (and it's shallow but true) is that I've been there now. I like conferences in places I've never been before. If I'm going to spend a chunk of my own cash on travel I want to grab an extra day or two and have a wander around. While Copenhagen was nice I did most of the city (and the mermaid, the river boat and got very sunburnt) last time I was there.
The second reason is there just ain't many interesting talks. While there are a handful I'm eagerly awaiting the slides from, they are spread out over the entire conference. There are a number I've seen, a bunch I've no interest in (some in topics I already have a grounding in, some by people I can't watch for an hour) and only a few that I'd get out of bed early for. And we're not talking before ten am even for those. I don't think it's a Perl-wide problem - YAPC::Asia had a very interesting line-up this year and I'm sorry I missed it.
Add those two together and I can't really justify the time or money. So I've saved this year's YAPC money and spent it on PyCon UK 2008 instead. It doesn't require me to suffer through an airport, I'm pretty sure I'll know almost nothing about any of the sessions beyond what I've seen on reddit and similar sites and, considering that work is all Python on new projects, it can't hurt for me to pick up some of the same technologies that our developers use.
Posted: 2008/08/12 16:35 | /events | Permanent link to this entry | This entry + same date
Yumdpkg-provides
I've never really felt as proficient with apt and dpkg as I did with RPM.
There always seems to be another option I've never seen before.
Luckily there are also big holes in my knowledge of yum to make me feel well
rounded.
After reading yum options you may not know exist and spending a while puzzling out how to get the same results in Debian (apt-file seems to be the closest fit but I never got the invocation right) I decided to write dpkg-provides.
It's not packaged, doesn't have a manpage, requires the network and isn't integrated with the existing tools. At least I know how I'd get the information now - from the web. Who'd have thought it?
Note: it's actually quite simple to work out which package provides a file
that you've got installed locally (dpkg -S '*/df') - it's more
of a pain to probe packages you don't have installed.
Posted: 2008/08/12 16:13 | /tools/commandline | Permanent link to this entry | This entry + same date

