Small Mosaic


Sat, 10 Mar 2007

Daemon Logging Percentages and Playing with Ruby Idioms
While digging into some large log files recently I needed to work out which daemons were causing the most noise, so I wrote a little perl script called daemon_percentages.pl. It was short, ran quickly and did what I wanted. And then my lunch plans were cancelled due to rain.

With nothing but boredom, a newly compiled version of ruby and the google homepage at my side I decided to write a version in ruby. And then I realised how long it's been since I last looked at ruby. After slightly longer than the perl version took, and with a couple of false starts, I ended up with daemon_percentages.rb.

I had forgotten how much I disliked ri. It feels slower than perldoc and I find it awkward to use. Then I hit the lack of a post-increment operator; while I understand the reasons for its omission I've got used to having it, so that took a couple of minutes to debug. And then the biggie for me, a lack of hash key autovivification.

I'd forgotten how much of a perlism it is and so I spent a little while looking at different ways to do it (and got some good pointers from Will Jessop). In the end I tried the following:

  
  # option 1
  if tally.has_key? daemon
    tally[daemon] += 1
  else
    tally[daemon] = 1
  end

  # option 2
  tally[daemon] = 0 unless tally.has_key? daemon
  tally[daemon] += 1

  # option 3
  tally[daemon] = (tally.has_key? daemon) ? tally[daemon] + 1 : 1
  

Option 1 felt too long. I didn't like option 2 when I reread it, as the code seemed to imply I'd decided something and then immediately changed my mind, so I settled on option 3. Although it's a little more complex (and denser), it's such a common thing for me to use that I'd rather have it on a single line and gloss over the syntax as it becomes more familiar.
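
For comparison, Ruby's Hash.new accepts a default value, which gets close to the perl behaviour for counters without any has_key? test. This is only a sketch of that idiom and not what daemon_percentages.rb does; the sample daemon names and the tally_daemons and percentages helpers are made up for illustration:

```ruby
# Hash.new(0) returns 0 for any missing key, so += 1 works the first time
# a daemon is seen, with no explicit has_key? check.
def tally_daemons(daemons)
  tally = Hash.new(0)
  daemons.each { |daemon| tally[daemon] += 1 }
  tally
end

# Convert raw counts into percentages of the total (hypothetical helper).
def percentages(tally)
  total = tally.values.sum
  tally.transform_values { |count| (count * 100.0 / total).round(1) }
end

tally = tally_daemons(%w[sshd cron sshd postfix sshd])
percentages(tally).sort_by { |_, pct| -pct }.each do |daemon, pct|
  puts format('%-10s %5.1f%%', daemon, pct)
end
```

Note that the default only applies to lookups: reading a missing key returns 0 but does not store the key, so the hash only ever contains daemons that were actually counted.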

While I had some small teething problems I do like the look of the ruby code and apart from the missing perlisms it felt quite natural to write. I'm not willing to jump ship just yet (CPAN is still too useful) but I think I'll be writing more of my personal tools in ruby.

Posted: 2007/03/10 23:19 | /perl


FRDNS Revisions - now with added ping checks!
I originally wrote frdns to find and warn about inconsistencies in forward and reverse DNS records. At the time I was also using a tool called hawk to show both IPs that didn't have a reverse record and reverse records that didn't have a responding IP address associated with them (we had a lot of orphaned records).

While hawk did the job it required a MySQL instance, a daemon process and an Apache server to function, which was a PITA when it had to be moved to another server. So I improvised. The first step was adding a -p option to frdns that makes the program ping each IP specified and flag any address that responds but doesn't have a reverse record. This points out live IPs that don't have DNS records. As for the no longer needed records, I've got a different tool for that, but that's for another blog post.
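
The idea behind the ping check can be sketched in a few lines. This is not frdns's code, just an illustration of the approach under some assumptions: it shells out to the system ping binary with -c 1, uses Ruby's standard resolv library for the reverse lookup, and the names PING, REVERSE and live_without_reverse are invented here. Both checks are injectable so the logic can be tried without touching the network:

```ruby
require 'resolv'

# Default liveness check: one ICMP echo via the system ping binary.
PING = ->(ip) { system('ping', '-c', '1', ip, out: File::NULL, err: File::NULL) }

# Default reverse lookup: returns the PTR name, or nil when none exists.
REVERSE = lambda do |ip|
  begin
    Resolv.getname(ip)
  rescue Resolv::ResolvError
    nil
  end
end

# Returns the IPs that answer a ping but have no reverse record.
def live_without_reverse(ips, ping: PING, reverse: REVERSE)
  ips.select { |ip| ping.call(ip) && reverse.call(ip).nil? }
end

# Stubbed checks (documentation addresses) so the example runs offline.
up    = { '192.0.2.10' => true, '192.0.2.11' => true, '192.0.2.12' => false }
names = { '192.0.2.10' => 'web1.example.com' }
flagged = live_without_reverse(up.keys,
                               ping: ->(ip) { up[ip] },
                               reverse: ->(ip) { names[ip] })
puts flagged.inspect
```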

I've also made frdns log both run time and how many issues it flags to syslog. The ping check can take a while so I added this to help me keep an eye on its performance. I did think about using one of the asynchronous DNS libraries to improve performance but we're only running it once a day to pick up mistakes so a long runtime isn't a huge issue.

Posted: 2007/03/10 22:46 | /tools/commandline


Importance Levels - A Simple Example
When you're first introduced to an environment you'll have the ever fun task of working out which machines should get the most time, and that order seldom matches which machines actually need the most attention. To help me prioritise I've worked out a simple importance rating system to show where I spend my time.

Below is a simplified version. I use it to assign a single importance number to each machine, and then I allocate a certain amount of time each day to work on the issues, requests and improvements I've got in my todo list for that level. When I've run out of time I move down a number and start working on anything related to machines rated at that importance. The amount of time I put aside for each level decreases as I work towards one.

5: Customer facing systems that generate revenue.

This is my no brainer. Pretty much everything is secondary to keeping the money coming in.

Examples: customer database, webservers and databases related to customer payments.

4: Internal Money Makers and Customer Visible Systems.

I normally put customer facing systems that don't make money in this bracket. An online presence and reputation for availability have been important to most of the companies I've worked at. It sounds horrible but it's a lot easier to save face and beg forgiveness for a five day internal outage than a one day external one; well, sorta. If you're a blogger-watched company then this is even more important.

I also put internal money makers at importance 4. "Cash is king" should be true in all departments, including those where Sysadmins dwell. I've only ever had simultaneous problems with both types of importance four systems a couple of times. Each time had circumstances that made the priorities clear.

Examples: corporate website, company blog, invoicing systems, time-sheets at month end.

3: Systems that stop a number of staff working.

I typically put machines that don't directly contribute to the bottom line but are required for the company to continue in this bracket. A short outage of any machines at this level can be survived for a little while but it'll slow a lot of staff down, cause frustration and (after a while) cause major damage.

Examples: internal request tracking/ticketing, bug tracking, build machines, version control

2: Systems that hinder small numbers of staff.

This is another level that I use to cover two types of machine. The first type slows or hinders a number of staff but can be lived without. You can think of these as convenience or favour systems that make tasks easier or more pleasant. You'll often get a disproportionate number of queries when one of these goes down. This is a good sign and shows you understand what your users care about.

When I'm asked to help with desktop support I lump single user problems here. Although it's frustrating to have a single person unable to work it's often not as bad as any of the higher levels. I put a lot of special cases and caveats here (sales people on presentation days, QA engineers before a release) but the most sensible workaround is to separate desktop and sysadmin roles. You can typically hire desktop support staff cheaper than a sysadmin and give them the opportunity to train with the sysadmins when things are quiet.

Examples: Web front-end to a version control server, centralised log shares for debugging, departmental wikis, individual laptops or desktops.

1: Play / scratch machines that no one really cares about.

Not much lives at this level; if no one cares whether it's up or not then you should seriously consider turning it off. The smaller and simpler your environment the easier it'll be to manage.

Examples: sysadmin "play" lab environment, company jukebox

And now some warnings: these categories are (obviously) not perfect. The ratings are host-centric (but can be pulled up a level and applied to clusters or groups of identical machines) and they don't factor in office politics (some systems are loved by certain members of management and should be treated like one of their loved ones).

It's also worth noting that some systems rise in importance at certain times; examples are month end batch reports, time-sheet systems when invoices are due etc. It shouldn't be too hard to work out most of these (typically cyclical) requirements after speaking to the other staff. Asking about their requirements is always a good way to help build bridges and show you do understand that the systems are there for a reason: to be used.

Posted: 2007/03/10 16:05 | /sysadmin


The Cron Commandments - part 1
Although it's a rare Unix machine that doesn't run at least a couple of custom cronjobs it's an even more special snowflake that does them properly. Below are some of the more common problems I've seen and my thoughts on them.

Always use a script, never a bare command line.

A parenthesis-wrapped command line in a crontab sends shivers down my spine. Nothing says "I didn't really think this through" and "I've done the bare minimum to make it work" in quite the same way.

Don't shout about success

A cronjob that completes successfully shouldn't post anything to stdout or stderr. Most developers have no idea how annoying it is to get a single line email every minute proclaiming all's well. It also trains people to delete messages with certain subject lines without reading them, which'll catch you out when a real problem occurs.

Caveat: Logging that the script finished, and adding some timing information, can often be useful. It's good to have an audit trail of what actually ran and how long it took. By logging to syslog you gain the benefits of centralised logs (you are centralising your log files, right?) and, because it's passive, the sysadmin doesn't get notified about expected completions unless she looks for them.
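
One way to get that audit trail is a small wrapper that times the job and hands a one-line summary to syslog. The sketch below is an assumption-laden illustration: run_and_log and SYSLOG_SINK are invented names, the default sink shells out to the standard logger(1) command with its -t tag option, and generate_report in the usage comment is a placeholder for the real work:

```ruby
# Default sink: hand the message to syslog via the logger(1) command.
SYSLOG_SINK = ->(tag, message) { system('logger', '-t', tag, message) }

# Runs the block, then passes a one-line timing summary to the sink.
# Nothing is written to stdout or stderr, so cron sends no email.
def run_and_log(tag, sink: SYSLOG_SINK)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  sink.call(tag, format('finished in %.2fs', elapsed))
  result
end

# Usage from a cron script (generate_report is hypothetical):
#   run_and_log('nightly-report') { generate_report }
```

The monotonic clock is used so the measured duration isn't thrown off if the system clock is adjusted while the job runs.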

Debug information should be an option

A script invoked via cron has a different environment from one run from the command line, so it'll work (and break) in different ways, which you'll want to see. It should be possible to enable additional debug without making any changes to the script itself. A command-line flag or environment variable should be enough to trigger additional debug information. Often all you'll get is an email with the error and the debug information so ensure you can diagnose from your own output.
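
A minimal sketch of such a switch follows. The --debug flag and the CRON_DEBUG variable are placeholder names chosen for this example, not a convention from any particular tool:

```ruby
# Debug is on when --debug is passed or CRON_DEBUG=1 is set.
# Both names are placeholders for this sketch.
def debug_enabled?(argv = ARGV, env = ENV)
  argv.include?('--debug') || env['CRON_DEBUG'] == '1'
end

# Writes to stderr only when debugging was requested, so a normal
# run stays silent and cron has nothing to mail.
def debug(message, enabled: debug_enabled?)
  warn "DEBUG: #{message}" if enabled
end

debug("PATH=#{ENV['PATH']}")
debug("running as uid #{Process.uid}")
```

Dumping the PATH and the effective user first is a cheap way to spot the usual cron-versus-shell differences.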

Beware overrunning jobs

Almost all your cronjobs should check that another instance isn't already running and, if one is, log the issue and exit. I've lost track of the number of difficult-to-track bugs caused by a cronjob starting, taking longer to finish than the interval between runs, and then having another job follow it. This often causes deadlocks, resource conflicts, maxed out database connections and corrupted data. Some very simple cronjobs don't need this but when in doubt put it in. And log the fact; this can help pick up growth trends ("it took 2 minutes until we added the extra users").
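
A common way to implement the check is an exclusive, non-blocking lock on a lock file. The sketch below uses Ruby's File#flock; the with_single_instance name, the on_busy callback and the lock path in the usage comment are all invented for illustration:

```ruby
# Yields only if an exclusive, non-blocking lock on lock_path can be taken.
# Returns true when the block ran, false when another instance held the lock.
def with_single_instance(lock_path,
                         on_busy: ->(path) { warn "already running, lock held: #{path}" })
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    unless lock.flock(File::LOCK_EX | File::LOCK_NB)
      on_busy.call(lock_path) # log the overrun before bailing out
      return false
    end
    yield
    true
  end
end

# Usage (path and job body are placeholders):
#   with_single_instance('/var/lock/sync-feeds.lock') { sync_feeds }
```

Because the lock belongs to the open file rather than to the file's existence, it is released when the process exits or crashes, which avoids the stale lock file problem of the "create a file, delete it at the end" approach.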

Beware /dev/null redirects in crontabs

Any cronjob that redirects stdout, stderr or (worse) both to /dev/null is going to cause you headaches and will need some attention. People typically add these when something is wrong and they lack either the skill or the time to fix it. The presence of these redirects shows a lack of confidence in the script and should be treated as a red flag. On the plus side they point you at potential trouble.
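
Since the redirects double as signposts, it can be worth listing them. The following is a rough sketch, assuming the crontab is available as text; suspicious_cron_lines, the regular expression and the sample job paths are all made up for this example, and the pattern is deliberately simple rather than a full shell parser:

```ruby
# Matches >, >>, 2> and &> style redirects whose target is /dev/null.
DEV_NULL_REDIRECT = %r{(?:\d*|&)>>?\s*/dev/null}

# Returns [line_number, line] pairs for non-comment lines that send
# output to /dev/null.
def suspicious_cron_lines(crontab_text)
  hits = crontab_text.each_line.with_index(1).select do |line, _number|
    !line.lstrip.start_with?('#') && line.match?(DEV_NULL_REDIRECT)
  end
  hits.map { |line, number| [number, line.chomp] }
end

sample = <<~CRON
  # m h dom mon dow command
  */5 * * * * /usr/local/bin/sync-feeds > /dev/null 2>&1
  0 2 * * * /usr/local/bin/nightly-backup
  # 0 3 * * * /old/job >/dev/null
  15 4 * * * /usr/local/bin/rotate 2>/dev/null
CRON

suspicious_cron_lines(sample).each do |number, line|
  puts "line #{number}: #{line}"
end
```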

Avoid running as root

As with most things, running as root is bad. Try writing your cronjobs so they can run under a non-privileged user, with a little sudo mixed in if you need it. It'll save you a lot of hassle when something goes wrong and the script tries to eat your file system.

Closing Comments

And to close, a couple of quick points: test your cronjobs from cron, not just interactively. /etc/ is often backed up but /var/spool/cron/crontabs/ is often missed, so think about your deployment locations. Make sure your admins know about any cronjobs your packages add. And finally, if you generate your crontabs always add a newline at the end.

If you at least know why you're breaking some of these rules (and they'd better be good reasons) then you'll be a good few steps above most developers I've worked with. And we'll get on a lot better.

Posted: 2007/03/10 15:59 | /sysadmin


First Puppet London Users Meet - Thursday, March 22
I'm a lurker on the Puppet mailing list and after some discussions John Arundel has stepped up and done the organising for the first Puppet London Users Meet - Thursday, March 22. I'm not using Puppet yet but I'm thinking of heading along to hear people's adoption stories.

I've also been thinking about the lack of a sysadmin community in London since GLLUG became a lot more newbie friendly and SAGE-WISE faded out. If you're a sysadmin in London interested in meeting some of your peers come along and say "Hi", this might be the start of a beautiful friendship^Wuser group.

Posted: 2007/03/10 00:30 | /events


Copyright © 2000-2005 Dean Wilson