Fri, 09 Jul 2004
Sed Sickness -- Whitespace Reduction
Leafing through live source code should be a pleasant, calming
experience. Instead it often becomes a game of cringe and seek. While
digging through some custom bandwidth monitoring scripts I came across
this gem.
cat /proc/net/dev | grep eth0 | sed -e 's/:/ /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g; s/  / /g;'
Working left to right, we start with the useless use of cat. grep can take a file as an argument; it doesn't need to read from standard input. That removes one command and one | (pipe). We then move on to the bastard stepchild that is this abuse of sed. The person who wrote this is no longer available to beat^H^H^H^H^H ask for clarification, but after some head scratching it seems the author had never heard of quantifiers such as + and *.
Instead, the script takes every instance of two spaces and makes it one space, globally. It then does it again and again until each run is reduced to a single space. This is a great example for a number of reasons: wasteful repetition of code, long ugly lines, and a lack of knowledge of the tool. Compare the above with the following rewritten version.
grep eth0 /proc/net/dev | sed 's/:/ /g; s/ \+/ /g'
We've killed the cat and shrunk the sed. The + is a quantifier: it changes how many times the preceding item may match. Here it turns 'match one space' into 'match one or more consecutive spaces' (GNU sed spells it \+ in its default regex syntax). Each matched run is then replaced with a single space. The code is shorter, faster and easier to maintain. And it doesn't make me lose another few (precious) hairs.
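To make the comparison concrete, here is a rough sketch that runs both styles over a made-up line. The sample text and the function names are invented for illustration and are not from the original scripts. The chained version is cut down to three passes to keep it short, and the quantifier version uses the portable 'space followed by zero or more spaces' spelling, which should behave the same as \+ in GNU sed.

```shell
# Hypothetical sample line, shaped roughly like an eth0 row in /proc/net/dev.
sample='  eth0:1234567     8910    0    0'

# Chained style (shortened to three passes): every pass turns each pair of
# spaces into one space, so a run of spaces is roughly halved per pass.
squeeze_chained() {
    sed -e 's/:/ /g; s/  / /g; s/  / /g; s/  / /g'
}

# Quantifier style: one substitution replaces any run of one or more spaces
# with a single space, however long the run is.
squeeze_quantifier() {
    sed -e 's/:/ /g; s/  */ /g'
}

printf '%s\n' "$sample" | squeeze_chained
printf '%s\n' "$sample" | squeeze_quantifier
```

Both commands should print the same squeezed line for this sample. The difference shows up on longer runs: three halving passes still leave two spaces in a run of nine, while the quantifier version does not care about the length.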
Posted: 2004/07/09 20:20 | /codinghorrors
Breaking Grep
While rummaging around the grep man page I stumbled on something I'd never
noticed before: GREP_OPTIONS. This environment variable does pretty much
what you'd expect: once set, it passes the options you specified to each and
every invocation of grep that runs with the variable still in scope.
While I'm not aware of any really positive uses for this, something slightly less wholesome crossed my mind. If you set 'GREP_OPTIONS=-v' then every run would return the lines NOT matching your criteria. -v is an absolute switch rather than a toggle, so it's not possible to reverse it with another -v. This would be a seriously annoying problem to track down; you'd have to get lucky and notice it in the output of 'set' or a similar command.
Not that I'd ever think of doing that of course ;)
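For anyone who wants to see the effect without sabotaging a shell, here is a small sketch. It passes -v on the command line instead of through GREP_OPTIONS, partly because recent GNU grep releases are understood to have dropped support for the variable, so treat that as an assumption about your grep version. The interfaces and grep_options_status helpers are hypothetical names made up for this example.

```shell
# Hypothetical input: two interface names, one per line.
interfaces() {
    printf 'eth0\nlo\n'
}

# Normal match: prints the line containing "eth0".
interfaces | grep eth0

# Inverted match: prints every line that does NOT contain "eth0".
interfaces | grep -v eth0

# A second -v does not toggle the inversion back off; this still prints "lo".
interfaces | grep -v -v eth0

# Report whether GREP_OPTIONS is set in the current shell, which is the
# thing you would otherwise have to spot in the output of 'set'.
grep_options_status() {
    if [ -n "${GREP_OPTIONS+set}" ]; then
        printf 'GREP_OPTIONS is set to: %s\n' "$GREP_OPTIONS"
    else
        printf 'GREP_OPTIONS is not set\n'
    fi
}

grep_options_status
```

Checking the variable directly like this is quicker than scanning the whole of 'set' when grep starts returning everything except what you asked for.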
Posted: 2004/07/09 00:17 | /misctech

