Sat, 31 Mar 2007
When Sysadmins Ruled the World - Like that'll Happen!
There is something immensely isolating about working alone in a very
secure, huge data centre, at 4am on a Sunday morning in an isolated
"business park" in rural Scotland that only a few people will ever
understand.
The mind wanders, your ears strain to hear things over the quite loud air conditioning, and just five minutes in daylight with a can of Diet Coke and someone to talk to would make the last 48 hours seem tolerable. It's hard to describe and even harder to capture, but When Sysadmins Ruled the Earth makes a decent go of it.
Here is to everyone who has played "hunt the vending machine" while swapping hard drives.
Posted: 2007/03/31 23:57 | /books
Release the Kittens - Chris Blizzard
Yes, I'm completely behind with this one but it's Linux geek funny (the
images are CC licensed by Chris
Blizzard). It is also a sneaky test of the planet.gllug.org image handling.
And then to the Debian and Ubuntu versions.
Posted: 2007/03/31 16:34 | /geekstuff
Thu, 29 Mar 2007
Simulating Typing in Perl - Take Two
In my Simulating Typing in Perl post
I included a small chunk of perl for varying the typing speed of a fake
user. While it worked it did have some oddities that were noticeable to a
sharp-eyed viewer.
Thanks to a pointer from Mark Fowler I've now revised the script slightly and included String::KeyboardDistance. This nifty module knows how far away keys on the keyboard are from each other and so helps to smooth the delays out a little; for example the string 'aaaaa' is now typed much faster than before (because there is no travel involved) whereas 'qpqpqpq' will be slower due to the finger movement - although I'm not bothered enough to make repeated sequences faster.
I've also uploaded the revised automatic typing script to UnixDaemon.net.
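To give a feel for the distance idea, here is a rough ruby sketch. It is not String::KeyboardDistance and not the uploaded script; it assumes a simplified QWERTY grid that ignores row stagger, and BASE_US and PER_KEY_US are made-up delay values chosen only for illustration.

```ruby
# Simplified QWERTY grid: row index and column index stand in for key
# positions. Row stagger is ignored, which is an assumption for brevity.
QWERTY_ROWS = %w[1234567890 qwertyuiop asdfghjkl zxcvbnm].freeze

# Map each key to its [row, column] position on the grid.
KEY_POS = QWERTY_ROWS.each_with_index.flat_map { |row, r|
  row.chars.each_with_index.map { |ch, c| [ch, [r, c]] }
}.to_h.freeze

BASE_US    = 30_000 # hypothetical minimum delay between key presses
PER_KEY_US = 15_000 # hypothetical extra delay per key of travel

# Straight-line distance between two keys; unknown keys count as 1.0.
def key_distance(a, b)
  pa = KEY_POS[a.downcase]
  pb = KEY_POS[b.downcase]
  return 1.0 if pa.nil? || pb.nil?
  Math.hypot(pa[0] - pb[0], pa[1] - pb[1])
end

# Microseconds to wait before typing `current` after `previous`.
def typing_delay_us(previous, current)
  (BASE_US + key_distance(previous, current) * PER_KEY_US).round
end

# Total delay for a string, summing the delay between each adjacent pair.
def total_delay_us(text)
  text.chars.each_cons(2).sum { |a, b| typing_delay_us(a, b) }
end
```

With this model 'aaaaa' only ever pays the base delay, while 'qpqpq' pays for nine keys of travel on every press.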
Posted: 2007/03/29 00:02 | /perl
Wed, 28 Mar 2007
Marooned In Realtime - Short Review
Marooned In
Realtime was the first Vinge book I read and it has prompted me to
start looking for all his others.
A small number of time travellers (that can only go forward) awaken to find out humanity is gone. Amid a plan to gather all the other travellers together and kick-start the human race one of the more powerful techs dies in odd circumstances, a 9000 year old traveller returns, aliens might be waiting to finish us off and an ex-detective is ordered to lead a manhunt to find out just what happened to the project's architect and biggest supporter (who may have been murdered by old age). Oh and people of different backgrounds don't get on. So some of it is familiar :)
It's also worth noting that this is actually a sequel to The Peace War (which I've yet to read) but it stands alone as a riveting read. The combination of sci-fi and detective story is a favourite of mine and this one is a top notch example of how to do it right.
Summary: humanity is almost finished, a few of the survivors have all too powerful technology, a possible murder might have been committed and one of the lo-techs is roped in to investigate it. If I had a checklist this book would tick most of the boxes. 7/10.
Posted: 2007/03/28 23:32 | /books
True Names - Short Review
This is more like it, True
Names by Vernor Vinge is a great mix of sci-fi and fantasy.
Technical wizards join forces in cyberspace to oppose the "Great Adversary". When one of them is compromised and turned in the real world a hunt for the most dangerous of the online personas is launched, leading to a great chase and some nicely described online battles. I'm not doing it justice, just click the above link dammit.
Summary: an enjoyable, expertly paced story that was one of the first to introduce some of the most common themes in modern sci-fi. It has aged surprisingly well and is more than worth a read. 7/10.
Posted: 2007/03/28 23:08 | /books
Blood Music - Short Review
I've been on a sci-fi novel kick again recently and despite its short page
count Blood Music by
Greg Bear was the one I found slowest to finish from my first batch.
A rogue biotechnologist starts his own experiments into biological computers based on his own lymphocytes while on the company clock. He gets caught, ignores all precautions and injects himself with them. They then become intelligent and start spreading. If you're interested in the genre it's nothing you haven't seen before. Just (probably) slower moving and with less interesting characters. Blood Music just never grabbed me.
Summary: an OK story of an Earth-changing grey goo incident. Not very exciting, dull characters and the pacing felt very slow. 3/10.
Posted: 2007/03/28 22:53 | /books
Bonded | Teamed Network Interface Challenge
Here is another one for the sysadmins in the audience:
How ...
... many of your servers have multiple network ports in the back?
... many of them have bonding (teaming for the Windows people) enabled?
... do you know when one interface goes down if the machine stays connected?
... long does it take for you to be notified?
... do you know if they start flapping?
... many have their bonded interfaces plugged in to different switches?
... do you know if someone mistakenly plugs both in to one switch?
I've got a fun week ahead.
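A couple of those questions can at least be started on from the machine itself. The sketch below is only an illustration in ruby: it assumes the Linux bonding driver's status file (usually /proc/net/bonding/bond0) with its "Slave Interface", "MII Status" and "Link Failure Count" lines, and the function names are hypothetical.

```ruby
# Parse the text of a Linux bonding status file into one hash per slave.
# Bond-level lines that appear before the first "Slave Interface" are ignored.
def parse_bond_status(text)
  slaves = []
  current = nil
  text.each_line do |line|
    case line.strip
    when /\ASlave Interface:\s*(\S+)/
      current = { name: $1, mii_status: nil, link_failures: 0 }
      slaves << current
    when /\AMII Status:\s*(\S+)/
      current[:mii_status] = $1 if current
    when /\ALink Failure Count:\s*(\d+)/
      current[:link_failures] = $1.to_i if current
    end
  end
  slaves
end

# Slaves whose link is not reported as "up".
def degraded_slaves(slaves)
  slaves.reject { |s| s[:mii_status] == "up" }
end
```

Feeding it File.read("/proc/net/bonding/bond0") from a periodic check would flag a down interface, and a link failure count that keeps climbing between runs is a reasonable hint that an interface is flapping. It can't tell you which switch each cable goes to.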
Posted: 2007/03/28 00:06 | /sysadmin
Tue, 27 Mar 2007
VMWare Free Converter - First Thoughts
While we're a Xen shop I've always been a VMWare fan and I
had the chance to take a look at the free (as in beer) VMWare Converter
Starter today. We've got a couple of old Windows machines with no
installation documents or run books so when working towards making them
reproducible grabbing a whole system image is a great first step.
The first machine I tried it on has a very unhappy hard drive (yes, it's my work laptop) and the converter refused to play past 5% of the disk; methinks it's time to verify my backups. The second machine was a Windows 2000 server (amusingly running VMWare server). The converter required a reboot (which it didn't on the laptop running Windows XP) after installation but made an image afterwards without any complaints and with the machine up and running.
I've not had the time to fully dig in to how well this'll work on the more awkward machines (boxes with more than 2 CPUs, apps that expect hardware access, VMWare tools not installed etc.) but the image of my trial machine (which was written out to a UNC path) came up quite quickly and all the settings I checked were correct.
I like the tool; it provides a nice revertible image for me to dissect so I can work out what's on the machines without being a resource drain on the live servers. It's simple to use, has a nice GUI, a great price tag and will make a painful task a lot simpler. In a worst case scenario the images can also be pushed in to service as a stop gap in order to reduce the MTTR of the original servers. Oh, you can also use it to help bootstrap server consolidation, but that'll never take off... ;)
Posted: 2007/03/27 23:48 | /tools/gui
Simulating Typing in Perl
You'd think it would be easy - have a program type a previously written
program at a human speed (minus the typos). Vim has record and replay
functionality but it's done with typical vim efficiency: yes, instantly.
At EuroOSCON a couple of years ago Damian Conway handed out a
presentation tidbit: he uses the hand_print function from IO::Prompt to make
himself look like a master typist. Well, he could just have been saying
that to make us feel better, maybe he can type that fast... Anyway, I tried
a simple example using the module:
#!/usr/bin/perl
use strict;
use warnings;
use IO::Prompt qw/hand_print/;
hand_print("I am not really typing this...");
It works but the typing speed is so uniform that it becomes obvious after a handful of lines. So I wrote my own that adds a little randomness to the typing speed. It's not pretty, but it does what I want and its output is "Out on the big bad web."
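#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(usleep);
$|++;    # unbuffer STDOUT so each character appears as it is "typed"
my $input;
{
    local $/ = undef;    # slurp mode: read the input in one go
    $input = <ARGV>;
}
# call sleep_and_show for every character (/s lets . match newlines too)
$input =~ s/(.)/sleep_and_show($1)/esg;
sub sleep_and_show {
    print $_[0];
    usleep int rand(200_000);    # pause for a random 0 to 0.2 seconds
}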
It's a little more jittery, which is more like my typing, and has the
nice side effect of a pretty looking invocation - ./seditor
file_to_type - which could be a valid command.
Posted: 2007/03/27 20:11 | /perl
Mon, 26 Mar 2007
Flu, The Puppet Muppets and NPW 2007
I've been in bed for most of the last week and a half (apart from two very
short staffed days in the office) with the cold / flu bug that seems to
stalk through our office on permanent rotation. Apart from the general
feeling ill and lots of sleeping I missed a GLLUG and the first London
Puppet Muppets meeting. But I did decide to go to the 2007 Nordic Perl
Workshop, an event I've managed to miss for the last three years.
I've never been to any of the Scandaweigan countries so I'm looking forward to both having a look around and the conference itself. I expect to see many trees and much snow. The perl in financial institutes talks and Flexible Business Rules with Brick (brian d foy) look very interesting. And I get to meet up with The Hukins as well (who needs both an easy to link to web page and a blog dammit), just not in any vegetarian restaurants.
Now I just need to do the actual bookings and not get distracted before the event starts...
Posted: 2007/03/26 22:23 | /events
Frank Miller's 300
I've never read the comic, I didn't recognise any of the cast and quite
enjoyed 300 as a not
very challenging film. Lots of very cool fight scenes, an acceptable amount of
plot and a great 'arrows blotting out the sun' scene. Oh, and a war
rhino.
What else is there to say? The fight scenes are bloody but not especially gory, the Spartans are portrayed with the right amount of bad-ass nature and it had a number of Sin City-esque deformed villains in it. 7/10. Don't think I'll re-watch more than a fight scene or two though.
And it had a cut down (and Venom free) version of the Spider-man 3 trailer before it came on. OH YEAH! Come on Spidey!
Posted: 2007/03/26 22:13 | /movies
Top $FOO Of All Time Lists
Digg People: Please note that "Top $FOO of all time lists" should not be
composed entirely of $FOOs from the last two years. You
should also dock points for all uppercase words, txtsp3k, leet speak and
every use of 'AMAZING!!111' and its ilk.
Posted: 2007/03/26 22:04 | /geekstuff
Monolithic Config Files Considered Harmful^WAwkward to Manage
This came up in conversation with a developer at the Google OpenSource
Jam so I thought I'd mention it while it is fresh in my mind (update: at
which point I forgot to move it to the published directory. Doh).
Breaking up config files isn't done just to annoy people, it's done to
make automated and mass management easier.
A solid practical example is the Debian Apache configs. Historically most distros (and too many current ones) used a single config file for Apache. While this made interactive editing easier by presenting all the options in a single place (and in a sensible order) it made it very hard for the package management software (or automation aficionados) to add a module or virtual host without some hairy scripting. Removing settings when a package is removed or updating a small chunk of the config in an upgrade is even more painful; as for preserving local changes - haha.
Breaking the config out in to a number of files / directories and combining them at run time makes the addition of a new vhost or module config just a file drop and possibly a symlink (often used if the configs are order dependent). This is easier for third party packages to perform and makes provisioning of additional apps a lot simpler.
So what's the main downside? Debugging. An "Error on line 50" is harder to track when line 50 could be in one of twelve files. But with a little forethought debug messages can be updated to show all the useful information. So next time you're writing an app of many parts please spare a thought for the sysadmins and make it easily manageable.
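As a small ruby sketch of that forethought: the loader below reads fragments from a directory in sorted order and, when it hits a bad line, reports the fragment's path and line number rather than a bare "line 50". The "key = value" format, the *.conf suffix and the names load_fragments and ConfigError are all assumptions made for this example, not any real application's config handling.

```ruby
# Raised with a "path:line: message" string so the error points at a fragment.
ConfigError = Class.new(StandardError)

# Load every *.conf fragment in dir, in sorted filename order, into one hash.
# Later fragments override earlier ones. Blank lines and "#" comments are skipped.
def load_fragments(dir)
  settings = {}
  Dir.glob(File.join(dir, "*.conf")).sort.each do |path|
    File.foreach(path).with_index(1) do |line, lineno|
      line = line.strip
      next if line.empty? || line.start_with?("#")
      key, value = line.split("=", 2).map(&:strip)
      if value.nil? || key.empty?
        raise ConfigError,
              "#{path}:#{lineno}: expected 'key = value', got #{line.inspect}"
      end
      settings[key] = value
    end
  end
  settings
end
```

Numeric prefixes on the fragment names (10-base.conf, 20-site.conf) give the same ordering control that the symlink trick does.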
Posted: 2007/03/26 22:02 | /sysadmin
Google - Second London OpenSource Jam
I recently went to the second Google
London OpenSource Jam over at Belgrave House. I've been aware of some
of the London Google evenings but I've never made the effort to go;
however there were a couple of people I've not seen for ages on the attendee
list for this one so I decided to sign up.
I don't know exactly what I was expecting but what I got was more than a little weird, part pre-2000 dotcom and part group hug; it wasn't really my kind of event. The whole venue seemed to be baiting the bubble busters. A couple of people gave lightning talks about topics close to their hearts, free beer and pizza were made (copiously) available (my kool-aid detector was overwhelmed by the whole place so I stayed away from both) and lots of nattering in small groups followed.
I had a good time but I mostly spent the evening catching up with people I've not seen in a while; I don't speak at these kinds of events - people still complain about the first time. I think I'd have been in and out damn quickly if I hadn't known any of the other attendees; it did feel like quite an established group (even though it was the first time for most of the people present, a lot of us cross paths at other tech nights). I met a couple of people I know from mailing lists but have never met in person, scared a Thoughtworker (I used to spend more time than I should researching potential speakers) and then left a bundle of Lonix and GLLUG people in the pub near closing time. A mostly fun night.
On a tangent, the fact that there were still five or so places available on the evening itself surprised me. I've always assumed a Google sponsored evening would fill out within minutes of being announced. Having seen Google booths at a number of conferences over the last couple of years they don't really seem to get the whole wider community thing; I'm not saying they don't do anything for us but they always seem like your great uncle trying to be cool with the kids. Giving out glowing badges and asking people if they want to enlist isn't playing well with others.
Posted: 2007/03/26 22:00 | /events
Wed, 21 Mar 2007
A bigger boy made me do it - Log::Dispatch::Twitter
For reasons that are too dull to post about (yes, even on THIS blog!) I
spent some time today looking at Log::Dispatch. Bob (the aforementioned bigger
boy) then made^Wsuggested I integrate it with the shining example of wasted
time that is Twitter. So I (not very)
proudly present: Log::Dispatch::Twitter!
Now, where's the build system source code...
Posted: 2007/03/21 17:08 | /perl
The Wonderful World of Kernel Module Removing
All I wanted to do was stop the IPv6 kernel module from starting on boot.
It shouldn't be hard, it shouldn't be difficult and despite the early hour
of the day, it shouldn't require me to google.
But it seems that it does. As a starting point the Planete Beranger Disable IPv6 post shows the many different ways to solve the problem. Unfortunately it seems that the Debian Etch install I'm testing on doesn't like:
# /etc/modprobe.d/00local
alias net-pf-10 off
alias ipv6 off
But it has no problems with a blacklist ipv6 - apart from a
number of cases where that might not work and you'll then have to rely
on an install ipv6 /bin/true. GAH! This hasn't only bitten
me, Planete Beranger rants in more detail in the Messy modprobe.conf
post.
Amusingly I discovered this while writing a small check to show which of our servers have IPv6 enabled (we don't use it) but rather than a one-off run it'll now have to be a periodic check. It's going to be one of THOSE days.
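A check along those lines could look something like this ruby sketch. It is not the check I actually run, the function name is hypothetical, and it assumes the usual /proc/modules layout where the module name is the first field on each line.

```ruby
# True if a module called `name` is listed in the given /proc/modules text.
# Only the first whitespace-separated field of each line is compared, so
# "ipv6" does not match a module that merely starts with or depends on it.
def module_loaded?(name, proc_modules_text)
  proc_modules_text.each_line.any? { |line| line.split.first == name }
end
```

Called as module_loaded?("ipv6", File.read("/proc/modules")) it suits a periodic run. One caveat: a kernel with IPv6 compiled in rather than built as a module won't show up here, so checking whether /proc/net/if_inet6 exists is probably a sensible second test.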
Posted: 2007/03/21 10:09 | /operatingsystems/linux
Wed, 14 Mar 2007
Playing with Facter
I'm on-call tonight so I invested some time in facter, "A
cross-platform Ruby library for retrieving facts from operating
systems." While facter is an interesting command line program
(its extension mechanism is quite nice) its main claim to fame is that
it's used by puppet (which I'm slowly evaluating as a CFEngine
replacement) to determine facts about a machine.
While the docs are a little thin on the ground the tgz contains a couple
of examples and after some playing around I think I've got a basic Linux
Bonding fact ready. For your viewing pleasure, the
Facter Linux Network
Bonding custom fact. It's not amazingly powerful or complex but it does
seem to do what I want and it gave me a reason to look around the Ruby
Dir class so it's not all bad. I've mostly put it up to show
how easy it is for someone with very little ruby knowledge to extend
facter.
Note: I also discovered that you can't do a confine :bonding =>
:true; facter works on literal string values, not on true or false.
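For anyone who wants the general shape without following the link, here is a minimal sketch of a custom fact. It is not the linked bonding fact; the fact name bonded_interfaces and the helper are made up for this example, and it assumes the bonding driver exposes one file per bond under /proc/net/bonding. Facter.add, confine and setcode are the standard facter calls.

```ruby
# Names of the bonded interfaces found under bonding_dir, sorted.
# Returns an empty array when the directory does not exist.
def bonded_interfaces(bonding_dir = "/proc/net/bonding")
  return [] unless File.directory?(bonding_dir)
  Dir.children(bonding_dir).sort
end

# Only register the fact when facter has been loaded, so the helper above
# can also be run and tested on its own.
if defined?(Facter)
  Facter.add(:bonded_interfaces) do
    confine :kernel => "Linux"
    setcode { bonded_interfaces.join(",") }
  end
end
```

The value is joined in to a single comma separated string, which fits the note above about facter working on literal string values.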
Posted: 2007/03/14 23:20 | /ruby
Linux Laptop Mode and /proc block_dump
Over at the
top-like command for disk io thread on GLLUG Kostas Georgiou mentioned
a Linux /proc file entry I'd never heard of before, and after
some digging it looks like it could be useful when debugging certain IO
problems, assuming you have 2.6.6 or above - or a vendor patched kernel.
When you activate the option with an echo 1 >
/proc/sys/vm/block_dump as root (read the article and consider
turning syslog off first) the kernel starts to log which processes are
accessing which disk blocks and inodes. Below is a small chunk of its
output on my test VMWare system:
Mar 14 19:16:44 localhost kernel: sshd(2659): dirtied inode 388836 (sshd) on sda1
Mar 14 19:16:44 localhost kernel: sshd(2659): dirtied inode 533395 (libwrap.so.0) on sda1
Mar 14 19:17:23 localhost kernel: cat(2672): dirtied inode 888805 (.bash_history) on sda1
Mar 14 19:17:46 localhost kernel: kjournald(913): WRITE block 14016 on sda1
Mar 14 19:17:48 localhost kernel: pdflush(104): WRITE block 12487672 on sda1
The short version is, 'dirtied' means changed but not yet written to disk, pdflush
will write the data out later, and READs are what you'd expect. A brute
force way to trace an inode to a file path is with find: find / -inum
num. The longer explanation can be found in LJ's Extending Battery
Life with Laptop Mode under the "Spinup Debugging" heading.
It's no DTrace (and no, SystemTap isn't as good as DTrace) but it is neat and a decent addition to the debugging toolbox.
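The raw log gets noisy quickly, so a tally per process is a handy first pass. The ruby below is a sketch that assumes lines shaped like the sample output above (and that READ lines follow the same pattern as WRITE ones); the regex and the function name are mine, not part of any tool.

```ruby
# Matches block_dump lines such as:
#   "... kernel: sshd(2659): dirtied inode 388836 (sshd) on sda1"
#   "... kernel: pdflush(104): WRITE block 12487672 on sda1"
BLOCK_DUMP_RE = /kernel: (?<proc>[^\s(]+)\((?<pid>\d+)\): (?<action>dirtied inode|READ block|WRITE block) (?<num>\d+)(?: \((?<file>[^)]*)\))? on (?<dev>\S+)/

# Count events per [process name, action], where action is "dirtied",
# "READ" or "WRITE". Lines that do not match are skipped.
def tally_block_dump(lines)
  counts = Hash.new(0)
  lines.each do |line|
    m = BLOCK_DUMP_RE.match(line)
    next unless m
    counts[[m[:proc], m[:action].split.first]] += 1
  end
  counts
end
```

Sorting the result by count shows which processes are keeping the disk busy.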
Posted: 2007/03/14 18:18 | /operatingsystems/linux
Mon, 12 Mar 2007
daemon_percentages.rb and Ruby Autovivification
Both Jim Weirich and
Ben Summers were kind enough to
email me about my Daemon
Logging Percentages and Playing with Ruby Idioms post. They sent
me an explanation on how to do the hash assignment in a way I find much
nicer, so with no more delays I present - Option 4:
tally = Hash.new(0)
tally[daemon] += 1
It really is that simple - and I still missed it by a mile. I've updated the script to use this and I wanted to say thank you for the pointer, so thank you both.
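For completeness, the same trick stretches to nested hashes, which is closer to what perl's autovivification gives you. This is a small sketch with hypothetical function names: Hash.new(0) covers a flat tally, and the block form of Hash.new builds a fresh inner hash for every new outer key.

```ruby
# Flat tally: a missing key reads as 0, so += 1 just works.
def count_items(items)
  tally = Hash.new(0)
  items.each { |item| tally[item] += 1 }
  tally
end

# Nested tally: the block runs for each missing outer key and stores a new
# inner Hash.new(0), so nested[host][daemon] += 1 needs no setup.
def count_by_host(pairs)
  nested = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  pairs.each { |host, daemon| nested[host][daemon] += 1 }
  nested
end
```

The block form matters for mutable defaults: Hash.new([]) would hand every missing key the same array object, which is rarely what you want.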
Posted: 2007/03/12 21:06 | /ruby
Outlaw - Short Review
Tonight I saw Outlaw. It starts out
showing how people feel let down and abandoned by the law and the fact it
seems to treat criminals better than the victims. It's a great idea, the
topic is perfectly timed and it's only spoiled by a shoddy execution (no
pun intended).
It soon turns in to a badly plotted gang film that is amazingly one sided - the outlaws are never really developed so their point is lost, it is full of cliches and pretty dull. 3/10 (and those points are for the concept).
Posted: 2007/03/12 20:59 | /movies
Sun, 11 Mar 2007
Twitter + Bash == Bad Idea
I'm not sure about the basic idea behind Twitter but after signing up, having a
little look and noticing the Net::Twitter CPAN
module I decided to implement a really bad idea...
#!/usr/bin/perl -w
use strict;
use warnings;
use Net::Twitter;
my $bot = Net::Twitter->new(
    username => "username",
    password => "password"
);
chomp(my $doing = <>);
$doing =~ s/^\s+\d+\s+//;
$bot->update($doing);
To make it 'useful' you'll need to run the following in your bash shell:
PROMPT_COMMAND='history | tail -n 1 |
/path/to/twitter_post.pl' and tada! People really will know what
you're doing RIGHT NOW. Ahem, sorry.
Posted: 2007/03/11 13:55 | /geekstuff
Ghost Rider - Short Review
Ghost Rider took a
lot longer to reach our screens than it should have, and considering the
amount of re-work involved it isn't that good. Very middle of the road
(haha), only see it if you're a comic book geek, bored or want another
chance to see Nicolas Cage not have expressions. 4/10.
Posted: 2007/03/11 12:38 | /movies
Busy February
I was more than a little slack in my online activities in February.
Between getting back from LCA and preparing for FOSDEM (tip: sleep a lot
before you go) I also managed to have curry with both David Cantrells, see Luke Kanies present Puppet
at GLLUG, attend a London PM Heretics, a Lonix and two other meetings
that don't have real names yet. And reach another birthday.
I'm not going to UKUUG in Manchester (I need some time at home) but I've been prodded in to potentially organising another GLLUG evening and a London PM tech meet; Brummie.pm are willing to come down and speak so it's a perfect time to put one together. All I need is to get enough hours in front of my email to plan it. I'm also attending the second Google OpenSource Jam in London so come and say "Hi" if you're there.
Posted: 2007/03/11 12:34 | /geekstuff
Sat, 10 Mar 2007
Daemon Logging Percentages and Playing with Ruby Idioms
While digging in to some large log files recently I needed to work out
which daemons were causing the most noise, so I wrote a little perl
script called
daemon_percentages.pl. It was short, ran quickly and did what I wanted.
And then my lunch plans were cancelled due to rain.
With nothing but boredom, a newly compiled version of ruby and the google homepage at my side I decided to write a version in ruby. And then I realised how long it's been since I last looked at ruby. After slightly longer than the perl version took, and with a couple of false starts, I ended up with daemon_percentages.rb.
I had forgotten how much I disliked ri. It feels slower
than perldoc and I find it awkward to use. Then I hit the
lack of a post-increment operator; while I understand the reasons for its
omission I've got used to having it, so that took a couple of minutes to
debug. And then the biggie for me, a lack of hash key autovivification.
I'd forgotten how much of a perlism it is and so I spent a little while looking at different ways to do it (and got some good pointers from Will Jessop). In the end I tried the following:
# option 1
if tally.has_key? daemon
  tally[daemon] += 1
else
  tally[daemon] = 1
end
# option 2
tally[daemon] = 0 unless tally.has_key? daemon
tally[daemon] += 1
# option 3
tally[daemon] = (tally.has_key? daemon) ? tally[daemon] + 1 : 1
Option 1 felt too long. I didn't like option 2 when I reread it, as the code seemed to imply I'd decided something and then immediately changed my mind, so I settled on option 3. Although it's a little more complex (and denser) it's such a common thing for me to use I'd rather have it on a single line and gloss over the syntax as it becomes more familiar.
While I had some small teething problems I do like the look of the ruby code and apart from the missing perlisms it felt quite natural to write. I'm not willing to jump ship just yet (CPAN is still too useful) but I think I'll be writing more of my personal tools in ruby.
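For a flavour of what such a script does, here is a minimal sketch. It is not daemon_percentages.rb itself; it assumes traditional syslog lines of the form "Mon DD HH:MM:SS host daemon[pid]: message", and the regex and function name are invented for this example.

```ruby
# Captures the daemon name that follows the timestamp and hostname,
# stopping at the "[pid]" or ":" that ends the tag.
SYSLOG_TAG_RE = /\A\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+([^\s:\[]+)/

# Returns [daemon, percentage] pairs, noisiest first. Percentages are of
# the lines that matched, rounded to one decimal place.
def daemon_percentages(lines)
  tally = Hash.new(0)
  lines.each do |line|
    tally[$1] += 1 if line =~ SYSLOG_TAG_RE
  end
  total = tally.values.sum
  tally.sort_by { |name, count| [-count, name] }
       .map { |name, count| [name, (100.0 * count / total).round(1)] }
end
```

Something like daemon_percentages(File.foreach("/var/log/syslog")) would run it against a live log, where the path is just an example.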
Posted: 2007/03/10 23:19 | /perl
FRDNS Revisions - now with added ping checks!
I originally wrote frdns to
find and warn about inconsistencies in forward and reverse DNS records.
At the time I was also using a tool called hawk to show both IPs that
didn't have a reverse record and reverse records that didn't have a
responding IP address associated with them (we had a lot of orphaned
records).
While hawk did the job it required a MySQL instance, a daemon process
and an apache server to function - which was a PITA when it had to be moved
to another server. So I improvised. The first step was adding a
-p option to frdns that makes the program ping each IP
specified and flag the address if it doesn't have a reverse record. This
points out IPs that don't have DNS records. As for the no longer needed
records I've got a different tool for that - but that's for another blog
post.
I've also made frdns log both run time and how many issues it flags to syslog. The ping check can take a while so I added this to help me keep an eye on its performance. I did think about using one of the asynchronous DNS libraries to improve performance but we're only running it once a day to pick up mistakes so a long runtime isn't a huge issue.
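The core forward/reverse comparison is simple enough to sketch in ruby. This is not how frdns is written; it is an illustration using ruby's standard Resolv library, with a hypothetical function name, and it leaves out the ping and syslog parts. The lookups are passed in as parameters so they can be swapped out when testing.

```ruby
require "resolv"

# Returns a list of problem descriptions for one IP address (empty if fine).
# `reverse` maps an IP to a name and raises Resolv::ResolvError when there
# is no PTR record; `forward` maps a name to an array of addresses.
def dns_issues(ip, reverse: Resolv.method(:getname),
               forward: Resolv.method(:getaddresses))
  begin
    name = reverse.call(ip)
  rescue Resolv::ResolvError
    return ["#{ip}: no reverse record"]
  end
  addresses = forward.call(name)
  return [] if addresses.include?(ip)
  ["#{ip}: reverse record #{name} resolves forward to #{addresses.inspect}"]
end
```

Run over a list of addresses once a day, the combined output is the kind of report that picks up mistakes without needing a database or a daemon.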
Posted: 2007/03/10 22:46 | /tools/commandline
Importance Levels - A Simple Example
When you're first introduced to an environment you'll have the ever
fun task of working out which machines should get the most time; and
that order seldom matches which machines actually need the most
attention. To help me prioritise I've worked out a simple
importance rating system to show where I spend my time.
Below is a simplified version. I use it to assign a single importance number to each machine, and then I allocate a certain amount of time each day to work on the issues, requests and improvements I've got in my todo list for that level. When I've run out of time I move down a number and start working on anything related to machines rated at that importance. The amount of time I put aside for each level decreases as I work towards one.
5: Customer facing systems that generate revenue.
This is my no brainer. Pretty much everything is secondary to keeping the money coming in.
Examples: customer database, webservers and databases related to customer payments.
4: Internal Money Makers and Customer Visible Systems.
I normally put customer facing systems that don't make money in this bracket. An online presence and reputation for availability have been important to most of the companies I've worked at. It sounds horrible but it's a lot easier to save face and beg forgiveness for a five day internal outage than a one day external one; well, sorta. If you're a blogger-watched company then this is even more important.
I also put internal money makers at importance 4. "Cash is king" should be true in all departments, including those where Sysadmins dwell. I've only ever had simultaneous problems with both types of importance four systems a couple of times. Each time had circumstances that made the priorities clear.
Examples: corporate website, company blog, invoicing systems, time-sheets at month end.
3: Systems that stop a number of staff working.
I typically put machines that don't directly contribute to the bottom line but are required for the company to continue in this bracket. A short outage of any machines at this level can be survived for a little while but it'll slow a lot of staff down, cause frustration and (after a while) cause major damage.
Examples: internal request tracking/ticketing, bug tracking, build machines, version control
2: Systems that hinder small numbers of staff.
This is another level that I use to cover two types of machine. The first type slows or hinders a number of staff but can be lived without. You can think of these as convenience or favour systems that make tasks easier or more pleasant. You'll often get a disproportionate amount of queries when one of these goes down. This is a good sign and shows you understand what your users care about.
When I'm asked to help with desktop support I lump single user problems here. Although it's frustrating to have a single person unable to work it's often not as bad as any of the higher levels. I put a lot of special cases and caveats here (sales people on presentation days, QA engineers before a release) but the most sensible workaround is to separate desktop and sysadmin roles. You can typically hire desktop support staff cheaper than a sysadmin and give them the opportunity to train with the sysadmins when things are quiet.
Examples: Web front-end to a version control server, centralised log shares for debugging, departmental wikis, individual laptops or desktops.
1: Play / scratch machines that no one really cares about.
Not much lives in this level, if no one cares if it's up or not then you should seriously consider turning it off. The smaller and simpler your environment the easier it'll be to manage.
Examples: sysadmin "play" lab environment, company jukebox
And now some warnings - these categories are (obviously) not perfect. The ratings are host-centric (but can be pulled up a level and applied to clusters or groups of identical machines) and they don't factor in office politics (some systems are loved by certain members of management and should be treated like one of their loved ones).
It's also worth noting that some systems rise in importance at certain times; examples are month end batch reports, time-sheet systems when invoices are due etc. It shouldn't be too hard to work out most of these (typically cyclical) requirements after speaking to the other staff. Asking about their requirements is always a good way to help build bridges and show you do understand that the systems are there for a reason; to be used.
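If you want to see the "work a level until its time runs out, then move down" routine spelled out, here is a toy ruby sketch. The task hashes, the minute budgets and the function name are all invented for the example; the real thing lives in my head and a todo list.

```ruby
# tasks:   array of hashes like { name: "...", importance: 5, minutes: 30 }
# budgets: hash of importance level => minutes set aside for it today
# Works from the highest level down, taking tasks in order until the next
# one no longer fits in that level's remaining time.
def plan_day(tasks, budgets)
  plan = []
  budgets.keys.sort.reverse.each do |level|
    remaining = budgets[level]
    tasks.select { |t| t[:importance] == level }.each do |task|
      break if task[:minutes] > remaining
      plan << task[:name]
      remaining -= task[:minutes]
    end
  end
  plan
end
```

A system that jumps in importance at month end can be handled by bumping its importance number before planning that day.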
Posted: 2007/03/10 16:05 | /sysadmin | Permanent link to this entry | This entry + same date
The Cron Commandments - part 1
Although it's a rare Unix machine that doesn't run at least a couple of
custom cronjobs it's an even more special snowflake that does them
properly. Below are some of the more common problems I've seen and my
thoughts on them.
Always use a script, never a bare command line.
A parenthesis-wrapped command line in a crontab sends shivers down my spine. Nothing says "I didn't really think this through" and "I've done the bare minimum to make it work" in quite the same way.
Don't shout about success
A cronjob that completes successfully shouldn't post anything to
stdout or stderr. Most developers have no idea
how annoying it is to get a single line email every minute proclaiming
all's well. It also trains people to delete messages with certain subject
lines without reading them, which'll catch you out when a real
problem occurs.
Caveat 1: Logging that the script finished, and adding some timing information, can often be useful. It's good to have an audit trail of what actually ran and how long it took. By logging to syslog you gain the benefits of centralised logs (you are centralising your log files, right?) and, because it's passive, the sysadmin doesn't get notified about expected completions unless she looks for them.
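As a rough sketch of the quiet-on-success, log-to-syslog idea (not a definitive implementation): the wrapper below runs a command, captures its output, and records the exit status and elapsed time through Python's standard `syslog` module. The function name `run_and_log` and the `ident` label are made up for illustration, and the `log` parameter exists only so the behaviour can be checked without a syslog daemon.

```python
import subprocess
import sys
import syslog
import time


def run_and_log(command, ident="nightly-report", log=syslog.syslog):
    """Run command quietly; record exit status and duration via log().

    `ident` is a hypothetical job label used to tag the log line.
    """
    started = time.monotonic()
    # Capture output so a successful run prints nothing (and cron sends no mail).
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.monotonic() - started

    level = syslog.LOG_INFO if result.returncode == 0 else syslog.LOG_ERR
    log(level, f"{ident}: exit={result.returncode} elapsed={elapsed:.1f}s")

    if result.returncode != 0:
        # Only failures reach stderr, so cron only mails when something broke.
        sys.stderr.write(result.stderr)
    return result.returncode
```

The audit trail ends up in syslog, where it can be centralised and searched, while the inbox only hears about failures.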
Debug information should be an option
A script invoked via cron has a different environment from one run from the command line, so it'll work (and break) in different ways - which you'll want to see. It should be possible to enable additional debug output without making any changes to the script itself; a command-line flag or environment variable should be enough to trigger it. Often all you'll get is an email with the error and the debug information, so ensure you can diagnose the problem from your own output.
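One way this could look, as a minimal sketch: debug output is switched on by either a `--debug` flag or an environment variable, so nothing in the script needs editing. The variable name `CRON_DEBUG`, the logger name `cronjob` and both function names are assumptions chosen for the example, not an established convention.

```python
import argparse
import logging


def debug_requested(argv, environ):
    """True if --debug was passed or CRON_DEBUG (hypothetical name) is set."""
    parser = argparse.ArgumentParser(description="example cron job")
    parser.add_argument("--debug", action="store_true", help="emit debug output")
    # parse_known_args leaves the job's other arguments alone.
    args, _unknown = parser.parse_known_args(argv)
    env_flag = environ.get("CRON_DEBUG", "").lower() in ("1", "true", "yes")
    return args.debug or env_flag


def build_logger(argv, environ):
    """Log warnings and above by default; everything when debug is requested."""
    logger = logging.getLogger("cronjob")
    wanted = logging.DEBUG if debug_requested(argv, environ) else logging.WARNING
    logger.setLevel(wanted)
    if not logger.handlers:
        # StreamHandler writes to stderr, which cron includes in its email.
        logger.addHandler(logging.StreamHandler())
    return logger
```

A real job would call `build_logger(sys.argv[1:], os.environ)` once at start-up; in the crontab the same effect comes from setting `CRON_DEBUG=1` for the entry.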
Beware overrunning jobs
Almost all your cronjobs should check to ensure that another instance isn't already running and exit if it is - after logging the issue. I've lost track of the number of hard-to-track bugs caused by a cronjob starting, taking longer to finish than the interval between runs, and then having a second instance start on top of it. This often causes deadlocks, resource conflicts, maxed out database connections and corrupted data. Some very simple cronjobs don't need this but when in doubt put it in. And log the fact; this can help pick up growth trends ("it took 2 minutes until we added the extra users").
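A minimal sketch of the "check, log and exit" pattern, assuming a Unix host and using an advisory lock from Python's standard `fcntl` module. The lock path and the names `acquire_lock` and `run_exclusive` are hypothetical; other approaches (PID files, the `flock` command) would serve equally well.

```python
import fcntl
import syslog


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if already held."""
    handle = open(path, "a+")
    try:
        # LOCK_NB makes this fail immediately instead of queueing behind
        # the instance that is still running.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep open for the life of the job; closing releases it


def run_exclusive(lock_path, job, log=syslog.syslog):
    """Run job() unless a previous run still holds the lock; log the skip."""
    lock = acquire_lock(lock_path)
    if lock is None:
        log(syslog.LOG_WARNING, f"previous run still holds {lock_path}; skipping")
        return False
    try:
        job()
        return True
    finally:
        lock.close()
```

Because the kernel drops the lock when the process exits, a crashed job cannot leave a stale lock behind, which is the usual weakness of hand-rolled PID files.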
Beware /dev/null redirects in crontabs
Any cronjob that redirects stdout, stderr or
(worse) both to /dev/null is going to cause you headaches
and will need some attention. People typically add these when something
is wrong and they lack either the skill or the time to fix it. The
presence of these redirects shows a lack of confidence in the script and
should be treated as a red flag. On the plus side they point you at
potential trouble.
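Since the redirects double as a pointer to trouble, finding them can be automated. The sketch below scans crontab text for active entries that send output to /dev/null; the regular expression is a simple approximation that covers the common forms (`>`, `>>`, `2>`, `&>`) and is not a full shell parser. The function name and the sample paths in the comments are illustrative only.

```python
import re

# Matches >, >>, 2>, &> and similar redirects whose target is /dev/null.
DEVNULL_REDIRECT = re.compile(r"(?:\d*|&)>>?\s*/dev/null\b")


def find_devnull_jobs(crontab_text):
    """Return (line_number, line) pairs for active entries that discard output."""
    hits = []
    for number, line in enumerate(crontab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if DEVNULL_REDIRECT.search(stripped):
            hits.append((number, stripped))
    return hits


# Example input (hypothetical jobs):
#   */5 * * * * /usr/local/bin/sync-feeds > /dev/null 2>&1
#   0 2 * * * /usr/local/bin/backup
# Only the first entry would be reported.
```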
Avoid running as root
As with most things, running as root is bad. Try writing your cronjobs so they
can run under a non-privileged user, with a little sudo mixed
in if you need it. It'll save you a lot of hassle when something goes wrong
and the script tries to eat your file system.
Closing Comments
And to close, a couple of quick points: test your cronjobs from cron,
not just interactively. /etc/ is often backed up,
/var/spool/cron/crontabs/ is often missed so think about your
deployment locations. Make sure your admins know about any cronjobs your
packages add. And finally, if you generate your crontabs always add a
newline at the end.
If you at least know why you're breaking some of these rules (and they better be good reasons) then you'll be a good few steps above most developers I've worked with. And we'll get on a lot better.
Posted: 2007/03/10 15:59 | /sysadmin | Permanent link to this entry | This entry + same date
First Puppet London Users Meet - Thursday, March 22
I'm a lurker on the
Puppet mailing
list and after some discussions John Arundel has stepped up and done the
organising for the first
Puppet London Users Meet - Thursday, March 22. I'm not using Puppet
yet but I'm thinking of heading along to hear people's adoption stories.
I've also been thinking about the lack of a sysadmin community in London since GLLUG became a lot more newbie friendly and SAGE-WISE faded out. If you're a sysadmin in London interested in meeting some of your peers come along and say "Hi"; this might be the start of a beautiful friendship^Wuser group.
Posted: 2007/03/10 00:30 | /events | Permanent link to this entry | This entry + same date
Fri, 09 Mar 2007
Disk Delving - 2 Good Papers and a Blog
"The Google team found that 36% of the failed drives did not exhibit a single
SMART-monitored failure. They concluded that SMART data is almost useless
for predicting the failure of a single drive."
-- StorageMojo - Google's Disk Failure
Experience
There have been two excellent papers on disk drive failures released recently: the Dugg and Dotted Google paper - Failure trends in a large disk drive population (warning: PDF) and the also excellent but less hyped Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?.
Both papers make very interesting reading (the comparisons of SCSI to SATA disks alone should turn some heads) but they are a little dry, so once you've worked your way through them it's worth looking at the summarised highlights over at StorageMojo, a top-notch blog that was recommended to me by Kim Hawtin. StorageMojo covered both papers and I've linked to them in the quotes above and below.
"Further, these results validate the Google File System's central redundancy
concept: forget RAID, just replicate the data three times. If I'm an IT
architect, the idea that I can spend less money and get higher reliability
from simple cluster storage file replication should be very
attractive."
-- StorageMojo - Everything You Know
About Disks Is Wrong
Posted: 2007/03/09 09:21 | /sysadmin | Permanent link to this entry | This entry + same date
Sysadmin Challenge - Disk Usage
Here's one for the sysadmins in the crowd; if you were asked to show the
following how long would it take you to gather the information?
- Which of your file systems have the fastest growth rate?
- Which are the most under-utilised?
- Which haven't changed by more than 5% over the last month?
If you use Nagios you can cheat and work out the full drive size from the free space and percentage used reported by the disk checks, but that's... icky. You get bonus points for having prediction built in to your usage graphs.
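The arithmetic behind that cheat, and behind the three questions, is short enough to sketch. If a check gives free space F and percentage used p, the total is F * 100 / (100 - p). The helpers below assume you already have (day, used space) samples for each file system; the function names, the sample layout and the reading of "5%" as relative to the first sample are all assumptions made for the example.

```python
def total_size(free, used_percent):
    """Infer total capacity from free space and percentage used."""
    if not 0 <= used_percent < 100:
        raise ValueError("used_percent must be in [0, 100)")
    return free * 100 / (100 - used_percent)


def growth_per_day(samples):
    """Average change in used space per day between first and last sample.

    samples is a list of (day_number, used_space) tuples, oldest first.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    return (last_used - first_used) / (last_day - first_day)


def is_static(samples, threshold_percent=5):
    """True if usage moved by no more than threshold_percent of the first sample."""
    first_used, last_used = samples[0][1], samples[-1][1]
    return abs(last_used - first_used) * 100 <= threshold_percent * first_used
```

For instance, 20 GB free at 80% used implies a 100 GB file system, and a file system that went from 50 GB to 65 GB used over 30 days is growing at 0.5 GB a day. Sorting file systems by `growth_per_day` answers the first question, sorting by used space over `total_size` the second, and filtering with `is_static` the third.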
Posted: 2007/03/09 08:48 | /sysadmin | Permanent link to this entry | This entry + same date
ls and the Missing Argument
When it comes to command line options GNU ls already uses
most of the alphabet, so for my own sanity can someone implement a
-j that doesn't change the behaviour much from a ls
-alh? It's my most common typo and I'm willing to offer beer to
remove the problem.
I could learn to type better but this is easier ;)
Posted: 2007/03/09 08:34 | /tools/commandline | Permanent link to this entry | This entry + same date

