Sat, 21 Apr 2007
Deferring Defects - Autonomics
Autonomics refer to the ability of computer systems to be
self-managing. -- autonomics.ca
Here's one that has been bothering me. Suppose you have a recurring problem that your "autonomic solution" can handle every time it occurs without anyone knowing. At what point does the fact that there is a treatable issue propagate up to a real person?
While an automatic "fix and tell me later" approach helps change your work from firefighting to planned tasks, what classifies a temporary problem as important enough to warrant your investigating it? It's hard enough to justify preventive maintenance with the current systems; if a problem fixes itself then you may never be given the time to investigate further.
If a problem fixes itself before anyone notices, or before a sysadmin can look at it, is it a problem?
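One possible answer is a simple threshold: let the system fix itself quietly, but count the fixes and tell a human once the same issue has been auto-fixed more than a set number of times within a given window. A minimal Python sketch of that idea follows. `RemediationTracker`, `max_fixes` and `window_seconds` are hypothetical names, and the limits are arbitrary assumptions rather than recommendations.

```python
from collections import deque
import time


class RemediationTracker:
    """Counts automatic fixes per issue and flags when a human should look."""

    def __init__(self, max_fixes=3, window_seconds=86400):
        self.max_fixes = max_fixes            # fixes tolerated inside the window
        self.window_seconds = window_seconds  # length of the sliding window
        self._history = {}                    # issue name -> deque of timestamps

    def record_fix(self, issue, now=None):
        """Record one automatic fix; return True if it should be escalated."""
        now = time.time() if now is None else now
        stamps = self._history.setdefault(issue, deque())
        stamps.append(now)
        # Forget fixes that have fallen out of the sliding window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        return len(stamps) > self.max_fixes
```

The auto-fix code would call `record_fix("apache-restart")` after each repair and open a ticket whenever it returns True. The fix still happens every time; the only change is that a problem which keeps coming back eventually reaches a real person.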
Posted: 2007/04/21 17:41 | /sysadmin
Handling Requests: Three Simple Rules
I'm a sysadmin; half my working life seems to be spent handling other
people's requests (which is why I'm trying to move over to
infrastructure work - where I can hopefully concentrate on something
for three whole minutes). While chatting with a junior admin at a tech talk
during the week, the following three tips came up:
Use a ticketing system. This one comes up a lot but it's true: never dropping someone's request is well worth the time spent setting it up.
Customers sending requests to individuals is a BAD THING. People go on holiday, they get dragged into meetings. They work on projects. Which of those do you think someone who's been waiting for a request will accept as an excuse? None of them. And telling them that it's their own fault is a great way of annoying them even more - even if it is true. Training your users to reply to all (so follow-ups also get tagged by the ticketing system) and not to send a "Just a quick question" mail to their favourite sysadmin helps you keep an eye on the workload while ensuring that things can't drop between the cracks. Even if it's an often repeated uphill struggle.
There is a caveat to this one. If you've got the resources it's often helpful to assign a sysadmin to a new employee for their first couple of days. Asking those awkward new-starter questions is a lot easier face to face than on a mailing list of who knows how many. Any requests can then be added to the ticketing system while the sysadmin is present, showing the starter how to use it, and that the admins actually pay attention to and process tickets. Nothing beats a good first impression.
Lastly, people have an expectation of how long something should take. If you break this unwritten rule, even for a good reason, then they'll notice and it'll be used against you at some future point. While it's not ideal for concentration, quickly completing short tasks like password changes can make a huge difference in how your team is perceived.
Posted: 2007/04/21 17:28 | /sysadmin
Automated Database Provisioning Papers
It's been a week of databases, replication, provisioning and planning for
automation. While winding down (it's an on-call weekend) I found some links
I'd marked for future reading. If you're interested in database
provisioning (especially read-only replicated slaves), practical
autonomics and how they could potentially be useful in a real environment
then these papers make for an interesting ten minutes.
It doesn't take a massive leap of imagination to see how a similar approach could be used to horizontally scale web servers in conjunction with an intelligent monitoring system or load balancer. Mix in some thin provisioning and centralised logging and it's something I should really schedule some time to play with. Now, to the papers!
Database Replication Policies for Dynamic Content Applications (pdf)
Autonomic Provisioning of Backend Databases in Dynamic Content Web Servers (pdf)
Posted: 2007/04/21 16:57 | /geekstuff
Tue, 17 Apr 2007
The Great IPv6 Experiment
I don't normally write short posts with a single link but The Great IPv6 Experiment amused
me. In an attempt to crack the chicken-and-egg adoption problem they have put up
an IPv6-only website full of porn.
They say porn pushes technical innovation. We'll see. Although probably not until the videos are over.
Posted: 2007/04/17 22:11 | /geekstuff
Sunshine - Short Movie Review
I wanted to like Sunshine, I really did. A
new sci-fi film by the writer and director of 28 Days Later (a very
entertaining film) should have been enough to keep me going until
Spider-Man 3 is released. Instead it was a seriously dull and
predictable two hours.
The sun is going out (hip hip hip hurray?) so a small group of scientists are sent to detonate a bomb that'll kick-start it. They don't make it. So we send a second batch; "Are you the saviour of the human race?" "No, I'm the second choice". The film starts off mid-trip and quickly goes downhill. The psych officer is obviously a fruit loop, the crew are sloppy as hell, the twists are predictable, the on-board security system is nearly as comprehensive as an unpatched Windows 3.1 machine and Rose Byrne has An Accent; we're just not sure where it was supposed to be from.
The film starts out with some interesting psychological ideas and quickly becomes a very dull and done-before space slasher. This film left me cold (haha). 2/10 - visually impressive but boring.
Posted: 2007/04/17 21:44 | /movies
No one likes a whinger - The systems fight back
After my little
whine I logged in to do my last checks for the evening to discover that
one of our webservers had died due to a hard drive going bang, our
production environment Nagios box had lost one of its network
connections and a chunk of our SAN kit was complaining about power
issues. Turns out that most of these were due to a power surge that
killed a network switch and three of the racks' power strips. On the very
plus side no one outside of the systems team noticed. Resilience is a
wonderful thing when you get it right.
Woke up this morning, checked the Nagioses (or Nagii) and found
out that one of our other product's database servers had gone boom (my
fellow sysadmins were fixing that one) and the failover had mostly worked.
No interesting logs, no hardware problems and a three-hour gap in syslog
(and only syslog) to help explain the outage.
What have I learned? That the production servers read my blog. And they hate me.
Posted: 2007/04/17 21:32 | /sysadmin
Mon, 16 Apr 2007
Fractally Crap
A Fractal is "a rough or fragmented geometric shape that can be
subdivided in parts, each of which is (at least approximately) a
reduced-size copy of the whole" -- Wikipedia - Fractal
Fractally Crap - a system where any piece, when looked at individually, is every bit as broken, badly planned and undocumented as the rest.
And yes, I know that if you pile rubbish on rubbish then you get...
(strangely enough) rubbish, but you can normally find the occasional little gem or ray
of sunshine. Not this month. An often-seen symptom is that every RT ticket you close requires three more to be
opened for new issues, problems, or rather "challenges" that arose
while fixing the first. And no, this rabbit hole doesn't have a
bottom.
Not my best fortnight ever. Roll on the Nordic Perl Workshop!
Posted: 2007/04/16 22:47 | /misctech
Fri, 06 Apr 2007
Bad Virus, Bad!
I've had a cold / flu for the last month or so that I've not been able to
fully shake off. I took a week's holiday from work to relax and kill the
damn thing only to be told by the doctor it's not a cold or the flu, it's a
viral infection. Which will shift on its own, just not for a while. I was
told the usual (give it time and lots of rest) and so I'm posting this as a
quick and dirty way of explaining why I've not responded to your $FOO.
Back to bed for me.
Posted: 2007/04/06 13:48 | /nottech

