Small Mosaic


Mon, 05 Jan 2009

GUI config apps and a thousand cuts
Today has been one of those death-by-a-thousand-cuts days. We did a migration first thing in the morning (I'm not supposed to be awake at 6am unless it's from a really late night) and while all the big bits were planned and moved successfully, the work list was missing enough little pieces to make the rest of the day very annoying.

What made the work a lot harder was that the changes had to be made through a web front end that turned about 20 seconds of vim into four minutes of clicking buttons that were never in the same place twice. It's been a while since I've had to make bulk production changes using this kind of interface so I was freshly amazed at how awful it was.

First of all was the time it took. The average change was about 8 mouse clicks, most of them on different pages, across a slow application that was working with a very large (for it) dataset. Second was the lack of a safety net. I had to do full copy and pastes to somewhere safe for each thing I wanted to change before changing it. It may not sound like much but if you come from the land of version control and diffing changes then it just feels so risky. And if you don't then I suggest you start learning a version control system. Instead I had to rely on some hastily written post-check scripts that confirmed the changes were correct when publicly viewed. We'd normally write these as a double check but without version control they become the single safeguard, and one that is only effective after the change has been made. Better than nothing I suppose...
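
As a rough illustration of the copy-and-paste safety net, here's a minimal Python sketch that diffs a saved 'before' copy of a setting against the 'after' text. The function name and labels are invented for this example and it is not the script used on the day.

```python
import difflib


def describe_change(before, after, label="setting"):
    """Return a unified diff between two pasted copies of a setting."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=label + " (before)",
        tofile=label + " (after)",
        lineterm="",
    )
    # An empty string means the two copies are identical.
    return "\n".join(diff)


# Hypothetical example values, standing in for text copied out of the web UI.
old_text = "ServerName www.example.org\nTimeout 30\n"
new_text = "ServerName www.example.org\nTimeout 60\n"
print(describe_change(old_text, new_text, label="vhost"))
```

It's no substitute for real version control but it does at least show exactly what changed.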

Posted: 2009/01/05 20:17 | /geekstuff


Sun, 04 Jan 2009

Simple Stemming with Perl
Stemming is the process for reducing inflected (or sometimes derived) words to their stem, base or root form.
-- Wikipedia article on Stemming

Ever used a website that allowed you to tag content? Ever ended up accidentally using slightly different tags? Something like graphs and graphing or blog and blogs? (I hope so, otherwise it's just me...) To spot some of the more obvious overlaps you can stem each of the words and look for a common base. Where one's found there is the possibility of mistaken duplication. For example if you passed hunts, hunted and hunting through a stemmer each would return 'hunt'. If you want to try for yourself there are online stemmers available.

As a more concrete example let's look at the wonderful service del.icio.us. You upload your own bookmarks, tag them with a number of keywords and can then group, sort and search them by your own defined terms. Except I have a habit of tagging articles about similar topics with nearly, but not quite the same tag.

The perl code below shows how easy it is (using Lingua::Stem from CPAN) to run your own data through a stemmer and look for overlaps. There are implementations in most languages (PyStemmer is also very nice) and the Wikipedia article is actually a very easy to follow introduction.


#!/usr/bin/perl -w
use strict;
use warnings;
use Lingua::Stem;
use Net::Delicious;

# log in to del.icio.us - replace with your own credentials
my $del = Net::Delicious->new(
                               {
                                 user => "username",
                                 pswd => "password"
                               }
                             );

my $stemmer = Lingua::Stem->new( -locale => 'EN-UK' );

# map each stem to the list of tags that reduce to it
my %stems;
for my $tag ( $del->tags() ) {
  # stem() returns an array ref, one stemmed word per word passed in
  my $stemmed = $stemmer->stem( $tag->tag );

  push( @{ $stems{$stemmed->[0]} },  $tag->tag );
}

for my $stemmed (sort keys %stems ) {
  # we only care about base words with more than one tag associated
  next unless ( scalar @{ $stems{$stemmed} } > 1);

  print "Possible duplicates -\n";
  print "  --  ";
  print join(" : ", @{ $stems{$stemmed} }), "\n";
}
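
The same grouping idea can be sketched in Python without any third-party modules. Note the big assumption here: naive_stem is a toy suffix stripper written for this example, not a real stemming algorithm, so in practice you'd swap in a proper stemmer such as PyStemmer.

```python
from collections import defaultdict

# Toy suffix list, checked in order. A real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def possible_duplicates(tags):
    """Group tags by stem and keep only stems shared by more than one tag."""
    groups = defaultdict(list)
    for tag in tags:
        groups[naive_stem(tag)].append(tag)
    return {stem: group for stem, group in sorted(groups.items()) if len(group) > 1}


for stem, group in possible_duplicates(["graphs", "graphing", "blog", "blogs", "perl"]).items():
    print("Possible duplicates -")
    print("  --  " + " : ".join(group))
```

The shape is the same as the perl version: a hash of lists keyed on the stem, then a filter for any stem with more than one tag attached.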


Posted: 2009/01/04 19:32 | /perl


Sat, 03 Jan 2009

The Art of Capacity Planning - Short Review
The only books on capacity planning I've ever skimmed my way through have been dense, dull tomes of long mathematical formulas, advice that's hard to use in any practical way and page counts in the treble digits. Thankfully John Allspaw has bucked this trend with The Art of Capacity Planning and instead written a slender, thought-provoking book.

The main focus of the book is that measurement is good, blind guessing is bad and that capacity planning, like security, is an ongoing process. While a lot of the material is common sense - which is never that common in IT - it's a perfect introduction to capacity planning (and the principles of data collection and graphing) for novice to intermediate system administrators and a handy refresher for the experts in the crowd. I found it oddly reassuring that someone else has a lot of the same thoughts as I do when it comes to these topics.

The Art of Capacity Planning is an easy, engaging read that gets you thinking along the right lines without becoming dull or long-winded. Well worth the couple of hours it'll take to read - 8/10

Posted: 2009/01/03 17:47 | /books


Fri, 02 Jan 2009

New year, new laptop - Samsung NC10
Near the middle of December I lost a very dear, and constant, companion - my Sony Vaio 'some model number or other'. After nearly five years the laptop stopped charging and it wasn't worth paying for the repairs. I put off getting a replacement for as long as I could but while I had the work laptop as a standby I needed a machine I could treat as my own. Something outside the company security policy. Something I could install lots of applications and languages I'll only ever look at once on. So I bit the bullet and bought myself a Samsung NC10.

It's not exactly been a long time since I bought it so I've hardly stressed the machine yet, but first impressions are very favourable. Battery life on wireless is a good four to six hours (depending on what else I'm doing). The keyboard is much nicer than the Asus Eee PC I used for about ten minutes before cramping my hands up and the screen's actually very usable. It'll never replace a dual monitor setup but it's fine for writing little scripts, web browsing and reading my email.

I've got a 1GB memory upgrade on order (it can only take 2GB) and then I'll see if I can make VMware play nice without killing the battery.

Posted: 2009/01/02 21:22 | /geekstuff


Thu, 01 Jan 2009

Erlang in Practice - PragProg Screencasts

I recently watched the first in the series of the Pragmatic Programmers Erlang in Practice Screencasts (by Kevin Smith - no, not that Kevin Smith). As I've not seen them discussed that much elsewhere I thought I'd jot down my thoughts.

First up a disclaimer/warning - I'm not an Erlang person and despite the title of 'Episode 1' this series of screencasts is not aimed at people with no experience in the language. If you want to learn Erlang then I'd suggest you read Programming Erlang instead. Once you've been through the book then you should consider coming back to this series.

Now, to look at the screencasts from a different angle - production quality and value for money. Despite not knowing enough Erlang to understand all the code presented, I found the quality of the screencast to be perfect for watching on a laptop. The video was clear, the presenter's voice didn't make me want to kill him (although this is a highly personal thing) and at five dollars the price was right for an hour's worth of content.

So would I buy another one? Yes, but not this series. Until I get a chance to work my way through the Erlang book this series is off limits to me. The Ruby Object Model and Metaprogramming on the other hand is mighty tempting for under five UK pounds...

Posted: 2009/01/01 20:05 | /programming


Mon, 29 Dec 2008

End of 2008 Very short Book reviews -
Behind every good manager lurk dozens of bad ones. While Behind Closed Doors is full of mostly common sense tips it's uncommon to deal with management that actually applies more than a couple of them. It's an easy, quick read and an ideal gift for that special manager in your life that you really wish wasn't. 7/10

The Python Phrasebook is a concise, well written set of examples. Each 'phrase' is a short task with some sample code that shows one of the possible solutions. Think of it as an O'Reilly cookbook, but not from O'Reilly. This is a good book but it needs a second edition to cover all the changes to the language over the last couple of years. It could also do with a chapter on unit testing. 4/10 (because of age) but looking forward to the second edition.

I also read the Ruby Phrasebook but I'm not giving this book a score until I've worked my way through the Ruby Cookbook. Lastly was Practical Ruby for System Administration. I didn't like this book but I've not worked out the exact reasons why so I'll have to wait to post a full review.

Posted: 2008/12/29 14:02 | /books


Sat, 08 Nov 2008

Disturbing Diffs - Unsafe open?


-  file_move_safe(move_from_path, move_to_path)
+  move_file(move_from_path, move_to_path)

Is move_file not as safe as file_move_safe? Is it safer? Dare I read the other diffs to find out? Am I better off not knowing?

Posted: 2008/11/08 13:11 | /programming


Events - November 2008
It's actually a good month for dynamic language fans in London as we've got both the London Perl Workshop and the inaugural Ruby Manor - both of which I'll be attending.

Although, as a sysadmin, I feel a little bad about not making it to the Linux 2008 event (organised by the UKUUG) I couldn't really justify the time and cost this year. The talks were a decent selection but not enough to get me up to Manchester on my own budget for a weekend. I'll have to keep an eye out for next year's LISA event in London to make up for it.

Last but not least - the FOSDEM 2009 dates have been announced (for the second time). Assuming they don't change during the week I'll be booking those before Xmas. Roll on February!

Posted: 2008/11/08 13:06 | /events


Dynamic Languages and joining arrays
I've been spending a fair amount of time recently trying to choose my Language of the year for 2009. I've always been a dynamic language fan (yes, I know this means I should be looking further afield for the next one) and I was surprised at how different even such a common task as joining all the elements of an array together, using a given separator, looks between them.

First let's look at the big three, including perl, my current favourite.

  
# perl
$ perl -d -e 1;
DB<1> @names = qw( A B C);
DB<2> print join(" : ", @names), "\n";
A : B : C

# python
>>> names = ['A', 'B', 'C' ]
>>> " : ".join(names)
'A : B : C'

# ruby
irb(main):009:0> names = [ 'A', 'B', 'C' ]
=> ["A", "B", "C"]
irb(main):010:0> names.join(' : ')
=> "A : B : C"
  

The perl approach is very procedural (ignore the use of the debugger as perl doesn't come with an excellent REPL in the core like the other two) and is the one I'm most familiar with so it's hard for me to be too critical about it. If you like OO then it's not for you.

Next we have Python, which is really growing on me as a language - apart from in this case. Putting the separator first and passing the list in as a parameter just feels very wrong and is the exact opposite of the ruby version, which I much prefer. To me the ruby approach of operating on the array is the most natural version and sits well in my head. As a small 'bonus' I also looked at the PHP equivalent -

  
# PHP
$array = array('A', 'B', 'C');
echo implode(" : ", $array);
  

This is close enough to the perl version that I can't really object to it, other than to (rhetorically) ask why the hell it's called 'implode'?
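
One point in Python's favour: because join lives on the separator string it will accept any iterable, not just a list. Here's a small sketch of that; join_with is a made-up helper name for this example and the str() conversion is my addition, since a plain join only accepts strings.

```python
def join_with(separator, items):
    """Join any iterable of values, converting each one to a string first."""
    return separator.join(str(item) for item in items)


names = ["A", "B", "C"]

print(join_with(" : ", names))                       # a list
print(join_with(" : ", (n.lower() for n in names)))  # a generator
print(join_with(" : ", range(1, 4)))                 # numbers need str() first
```

It still reads back to front to me, but it does mean one method covers lists, tuples and generators alike.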

I guess all I can say in summary is round one to ruby.

Posted: 2008/11/08 12:41 | /programming


Rebooting Via Proc and the magic SysRq key
You know what the best way to start the day is? I'm pretty sure that it doesn't include a production web server putting its file systems into read-only mode. When this happens most local commands don't work - init, shutdown, telinit and reboot all stop being useful and you have to resort to desperate measures... and here's the desperate measure of the day.

First, check that your system supports the magic SysRq key -


$ cat /proc/sys/kernel/sysrq
1  # nonzero is good

Now you know you have the power to destroy your system through a single incorrect character, have a look at the Red Hat SysRq command reference (you want the 'sysrq' section). We tried to make it sync the disks and reboot - your requirements may vary.


root@web02:~# echo s > /proc/sysrq-trigger
root@web02:~# echo b > /proc/sysrq-trigger

# machine reboots

As techniques go this one's a little obscure but it's very useful in the right circumstances.
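
The 'nonzero is good' check hides a little detail: on newer kernels a value above 1 is a bitmask of the allowed commands. The sketch below decodes it in Python. The bit values (16 for sync, 128 for reboot) are taken from the kernel's sysrq documentation as I understand it, and the function names are invented for this example, so check them against your own kernel before relying on it.

```python
# Bit values as I understand them from the kernel sysrq documentation.
SYSRQ_SYNC = 16      # the 's' (sync) command
SYSRQ_REBOOT = 128   # the 'b' (reboot) command


def sysrq_allows(value, bit):
    """Return True if the sysrq setting permits the command for this bit."""
    if value == 0:
        return False      # sysrq disabled entirely
    if value == 1:
        return True       # every command enabled
    return bool(value & bit)


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting. Only call this on a Linux host."""
    with open(path) as handle:
        return int(handle.read().split()[0])
```

On a Linux box sysrq_allows(read_sysrq(), SYSRQ_REBOOT) tells you whether the reboot command is switched on in that setting.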

Posted: 2008/11/08 12:25 | /tools/commandline


Wed, 08 Oct 2008

October London Python UG
I made it along to my first ever London Python User Group tonight, and from what the regulars said about the turnout so did a lot of other people. Over 50 people in attendance is very respectable.

The first talk was a bit of a let down; it felt really long and quite slow moving, and would have been much better as a lightning talk. Shame it took up over an hour. Luckily the lightning talks themselves were good, even though I'd seen a couple of them before at PyCon UK. PySmell, which is actually an IDE intellisense / auto-completion helper rather than anything to do with refactoring (you can read the slides online), and Metaclasses in Five Minutes (which took seven minutes) were both highlights of the evening.

ThoughtWorks have very nice offices in London (with a great view) and I'm looking forward to the next one. Kudos to Simon Brunning for organising it and let's hope Leon has the same turnout for tomorrow's London.pm tech meet.

Posted: 2008/10/08 22:48 | /events


Tue, 07 Oct 2008

The answer might be 'it depends'
You're in charge of a server that provides two types of assets. The first type is public and its visibility is important to your company. The second should be restricted access only and shouldn't be public.

Now suppose there is a mistake made and the private material is exposed publicly - what's more important, that the public data is available or that the private data isn't? Who'd make that decision where you work? How long would it take to get an answer from them?

Posted: 2008/10/07 07:23 | /misctech


Sun, 05 Oct 2008

PyCon UK - 2008
At $DAYJOB I'm working with a strong team of Python (and Django) developers so over the last couple of months my interest in the language has grown. Thanks to YAPC::EU not being very exciting this year I had a spare slot in my "conference schedule" and went to the highly recommended (by me and previous attendees I'd spoken to) PyCon UK. I'm glad I did.

I was more than a little out of my depth in most of the talks but a lot of the speakers were excellent, especially Raymond Hettinger - who I ended up stalking (by accident) and seeing all of his talks. The technical level required of the audience was quite varied but I ended up going to a lot of the more technically in-depth sessions as they just seemed more interesting. The downside is that I lacked the ability to filter module based talks in the same way I can at Perl Conferences and that I learned (the hard way) that Python has many test frameworks, modules and harnesses.

The venue itself was fine, large, easy to get around and had restaurants and pubs near enough that you could make a dash outside for lunch. The keynotes were both very interesting - Mark Shuttleworth and Ted Leung both gave their view (in different ways) on where Python is, was and should be. As a (mostly) Perl guy I was a little surprised by how little it even got mentioned - twice by my count and each time it was as an afterthought. In a way this is reassuring, it fits in with my own views and encourages me to learn a new dynamic language (Python and Ruby are both interesting in different ways).

I should probably note that there won't be a PyCon UK in 2009 - instead the organisers are doing PyCon Europe 2009. And based on how good a time I had this year I'll be there.

Posted: 2008/10/05 23:31 | /events


Google Dev Day - London 2008
I recently went to the London 2008 Google Dev Day (the title of my post doesn't lie!) and while it was lovely to be near that hallowed grass (only half of which was actually down) the talks themselves left a lot to be desired - actual technical content.

I'm not sure if I'm the wrong audience in that I've already looked at the front pages and the code samples but I hoped, given the word developer in the event's title, that it'd be a bit more tech heavy.

The actual talks were mostly well presented but they lacked any real depth on the subjects; most of them contained very similar material to the actual API introductions. It was nice catching up with some ex-coworkers though and, if nothing else, I've been inspired to look at the Google Visualisation APIs a lot more. When I get some spare time. Still, here's something to tick off on my PiP.

You can also view the Google Dev Day Videos on YouTube.

Posted: 2008/10/05 22:56 | /events


Spooks Code 9 - Making Torchwood look Good
When it comes to spinoffs the BBC isn't doing too well. After two very, very bad series of Torchwood we're now 'blessed' with Spooks: Code 9. It's got nothing to do with the main Spooks series (a series I do like), has very... inexperienced acting and dull plots.

What's good about it? A lot of the cast are very pretty and it's only 6 episodes long. Luckily it's been panned by nearly everyone who's posted a review of it (I like to be on the bandwagon every now and again) and with a little luck it'll be canned after just one season. Consistently bad. Let's hope this one stays dead.

Posted: 2008/10/05 22:47 | /nottech


Sat, 06 Sep 2008

My First Day with Python - Initial Thoughts
While I've always been a bit of a perl guy I don't want this post to be "perl has x and python doesn't" in tone. Which is lucky really as Python has exceptions and threading as first class features whereas perl has... ahem.

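So after spending a chunk of today reading a python book and spending some time writing code here's my initial short list of gripes - the way except reads, print adding newlines, whitespace-delimited blocks and the lack of ++ and --.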

Considering how picky I can be that's a very short list so Python must sit well with me so far. Now, in order, I can't help but read except IOError as 'catch everything apart from IOError'. This one bugs me more than it should but considering how happy native exceptions in the language made me this just felt mean.
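
For anyone else whose brain reads it backwards, here's a minimal sketch of what except IOError actually does: it catches only IOError and lets everything else through. The function names and the path are made up for this example, and in Python 3 IOError is simply another name for OSError.

```python
def read_first_line(path):
    """Return the first line of a file, or None if it can't be read."""
    try:
        with open(path) as handle:
            return handle.readline().rstrip("\n")
    except IOError:
        # Reached only when an IOError is raised, such as a missing file.
        return None


def parse_number(text):
    """Show that other exception types are NOT caught by except IOError."""
    try:
        return int(text)
    except IOError:
        # Never reached for bad input: int() raises ValueError instead.
        return None


print(read_first_line("/no/such/directory/file.txt"))
```

Calling parse_number("abc") blows up with a ValueError, which is exactly the opposite of the 'everything apart from IOError' reading.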

Secondly, print adding newlines. While this might seem trivial every other language I use on a daily basis has a print function that doesn't print a newline so this feels weird. At least it's not called say ;)
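
The newline can at least be switched off. The sketch below uses the Python 3 print function, where end controls what gets appended; in the Python 2 of the time the rough equivalent was a trailing comma on the print statement. captured_print is a helper invented for this example so the output can be inspected.

```python
import io


def captured_print(*args, **kwargs):
    """Return exactly what print() would write, so it can be inspected."""
    buffer = io.StringIO()
    print(*args, file=buffer, **kwargs)
    return buffer.getvalue()


print(repr(captured_print("hello")))          # newline added by default
print(repr(captured_print("hello", end="")))  # newline suppressed
```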

Now to the one that I'll get no sympathy on - whitespace in blocks. First up let me say I don't mind about the enforced indentation. I indent anyway so it's not a big deal. I guess I'll hit the odd case when it annoys me (probably involving heredocs) but I've got nothing against it. What does irk me is the lack of block delimiters - whitespace just doesn't cut it for me.

I like my { and } delimited blocks. A nasty voice in my head is telling me to add them but just comment them out ( if x == y: # { ) but that seems very wrong. I've always looked at those examples in C programming books that say...


/* incorrect */
if ( something )
  printf("All's well");
  wellness++;

/* this is wrong because wellness++ is a separate statement
   and not part of the if */

... and thought - "just add the damn braces, you'll be back to add more code later anyway." Now I'm learning a language that seems to want me to slip up like this. I'll either get used to this or move to ruby.

Lastly we have the lack of ++ and --. I know the arguments, I've read them before. I disagree. I've never done anything insane with ++ and where I have used it it's saved me typing. Can we have ++ and remove nested ternary ( ? : ) instead please?
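
One trap worth knowing about if you miss ++ as much as I do: the postfix form is a syntax error, but the prefix form is accepted and quietly does nothing, because it parses as two unary pluses. A small sketch, with names made up for this example:

```python
def count_matches(items, wanted):
    """Count how many items equal wanted, using += in place of ++."""
    total = 0
    for item in items:
        if item == wanted:
            total += 1  # Python's spelling of total++
    return total


x = 5
y = ++x  # parses as +(+x): two unary pluses, no increment happens
print(x, y)
print(count_matches([1, 2, 1], 1))
```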

I like Python and I think I'll be investing more time into learning it.

Posted: 2008/09/06 23:44 | /python


Thu, 04 Sep 2008

Pragmatic Investment Plan - End of 2008

In the past I've written up a small list of general goals to help measure my technical progress. Over the last few years I've become a lot busier and this habit fell by the wayside. But no more! I've got a quarter left and I'm going to try and complete...

Considering this is one of the busiest times of the year I have no idea how far I'll get but I do think it's worth at least an attempt.

Posted: 2008/09/04 21:34 | /career


Ubiquity - More Than Just Shiny Chrome

While Google Chrome has been getting all the press coverage recently, Ubiquity, from Mozilla Labs, is where all the interesting action seems to be happening.

Ubiquity ticks all the boxes for me: it's a simple, easy to use idea that'll save me time. It's easily extensible and already has a huge community of people working on, enhancing and just trying new things with it. All the things I've come to expect from Firefox and the Mozilla-using community.

I personally think this is an important distinction to make - while Google Chrome is a new browser with some great ideas (and a quickly revised EULA), Firefox is a proven, Free platform that encourages extension and has a track record of doing the right thing.

Posted: 2008/09/04 20:58 | /tools/firefox


Wed, 03 Sep 2008

Google Chrome - Initial Thoughts

Like most of the techy part of the Internet I dutifully downloaded Google Chrome today and had a little play around. And just like all those other people I'm going to write about it. The difference is I'm very ambivalent about the whole thing.

Chrome seems nice enough. It's quick, works with all the websites I've tried so far and does have a killer feature - the task manager. Finally breaking tabs out into their own sandbox is an idea whose time should have come years ago. Being able to see which sites are doing hugely evil things with my memory is a wonderful thing. I'm also inappropriately happy with the in-page search showing how many matches it found.

Unfortunately that's about it. While the minimal design and streamlined core functionality are lovely, these days I'm used to my extensions - the web developer toolbar, YSlow and the workflow-changing Ubiquity are just too useful for me to give up.

It's not just the fact that these extensions are missing that puts me off, it's the lack of any way to write custom extensions, searches etc. that feels wrong. Firefox is a platform as much as a web browser. Using Chrome what is the command line for pulling out the memory usage for the currently opened tabs? Do I need to screen scrape a running about:memory? I can't help but think they'd have three Firefox versions ready for download by now.

So will I be moving over to the new and shiny? Not yet. As useful as the broken out tabs are I need more functionality than Chrome can give me, so while I might use it for some day to day surfing it's nowhere near ready for me as a developer. Although I'm guessing they never intended for it to be.

Posted: 2008/09/03 00:08 | /geekstuff


Sat, 23 Aug 2008

Nagios Service and Hosts stats - Graphed in Munin
We've been hitting some load issues on one of our monitoring machines recently and while it looks like the munin graph generation is the culprit we also decided to keep an eye on how many services and hosts Nagios was checking.

One of the downsides of having a very automated server deployment system is how easy it is to suddenly find yourself with an extra dozen hosts you no longer really need. While each check is quite small and quick, add up the frequent runs and multiply it by a reasonable number of servers and you can soon hit problems.

So as a first step towards keeping an eye on those numbers we now have a munin Nagios hosts plugin and a munin Nagios services plugin that show the total number of hosts and services monitored and the states those resources are in.
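
To give a flavour of what a plugin like this involves, here's a minimal Python sketch of the hosts side. It is not the plugin mentioned above: it assumes Nagios writes a status.dat file containing hoststatus { ... } blocks with a current_state= line (0 up, 1 down, 2 unreachable), and the function names, field names and graph category are all choices made for this example.

```python
import re

# Assumed mapping of Nagios host state numbers to munin field names.
HOST_STATES = {0: "up", 1: "down", 2: "unreachable"}


def count_host_states(status_text):
    """Count hosts per state from the text of a Nagios status file."""
    counts = {name: 0 for name in HOST_STATES.values()}
    # Naive block match: assumes no '}' appears inside a hoststatus block.
    for block in re.findall(r"hoststatus\s*\{(.*?)\}", status_text, re.S):
        match = re.search(r"^\s*current_state=(\d+)", block, re.M)
        if match:
            name = HOST_STATES.get(int(match.group(1)))
            if name:
                counts[name] += 1
    return counts


def munin_config():
    """Text to print when munin runs the plugin with the 'config' argument."""
    lines = ["graph_title Nagios host states",
             "graph_vlabel hosts",
             "graph_category nagios"]
    lines += ["%s.label %s" % (name, name) for name in HOST_STATES.values()]
    return "\n".join(lines)


def munin_values(counts):
    """Text to print on a normal run: one 'field.value N' line per state."""
    return "\n".join("%s.value %d" % (name, counts[name])
                     for name in HOST_STATES.values())
```

A real plugin would read the status file from wherever your Nagios install keeps it, print munin_config() when called with config as its first argument and print munin_values() otherwise.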

Posted: 2008/08/23 15:20 | /tools/commandline


Copyright © 2000-2005 Dean Wilson