Friday, August 20, 2010

Editing text files with sed -- adding a line in particular places

On Unix and Unix-like systems there is a stream editor called "sed" (which obviously stands for stream editor) which is very useful for editing large text files. It's a stream editor because it takes a text file one line at a time, edits it according to your wishes, and outputs that edited line -- sorta' like a stream. I frequently use sed to make wholesale changes to text files. One could use an editor like emacs or Kedit or others to -- maybe -- make the same changes. However, the powerful and beautiful thing about sed is that it accepts regular expressions.

Suppose you have a file that contains dates like 1-Jan-2000, 15-Jul-1998, and so on, and you want to replace the years with their two-digit equivalents (2000 with just 00, 1998 with 98, ...). That would be a somewhat tedious task with a typical search and replace operation available in editors or word-processing programs. However, with regular expressions one can craft a symbolic expression that means "find me a number with one or two digits followed by a dash followed by three letters, the first of which should be upper case, then a dash and finally four digits". A stream editor can then be used to replace the four-digit year with just its last two digits.
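Here is a rough sketch of how that substitution might look in sed. The function name shorten_years and the sample dates fed to it are my own illustration, and the pattern assumes the dates look exactly like the ones described above (day, dash, capitalized three-letter month, dash, four-digit year).

```shell
# Hypothetical helper: turn dates like 15-Jul-1998 into 15-Jul-98.
# \(...\) captures the "day-Mon-" prefix as \1, the next two digits
# (the century) are dropped, and the last two digits are kept as \2.
shorten_years() {
  sed 's/\([0-9]\{1,2\}-[A-Z][a-z]\{2\}-\)[0-9]\{2\}\([0-9]\{2\}\)/\1\2/g' "$@"
}

# Sample input made up for this sketch
printf '1-Jan-2000\n15-Jul-1998\n' | shorten_years
```

With no arguments the function reads from standard input; given a file name it reads that file instead, just like plain sed.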

Although I've been using sed for a while and consider myself quite proficient with it, I ran into an interesting problem: if a line begins with a date, I want to insert a blank line before it. While my previous experience with sed was confined to acting on each line individually, this was an attempt to insert a line into the stream. Here's the command that did the trick:

sed '/^[0-9]\{1,2\}-/i\
\
' inputfile.txt

The regular expression sits between the first and second forward slashes and means "find me all lines that begin with one or two digits followed by a dash" -- in the file that I was editing this was sufficient to find all the relevant lines. The key to inserting a line is everything after the second forward slash, up to the closing '. The i is sed's insert command, and here it essentially says "insert a blank line preceding whatever line matched the regular expression." Note: One has to literally type the backslash at the end of the line, then hit the [Enter] key on the keyboard, then another backslash and [Enter], then the closing ' and the name of the input file.
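For comparison, here is a different way to get the same effect that fits on one line. This is not the insert command used above but a common hold-space idiom: x swaps the current line with sed's (empty) hold space, p prints the now-empty line, and the second x swaps the original line back so it prints as usual. The function name blank_before_dates and the sample text are made up for this sketch, and it assumes nothing else in the script has put anything into the hold space.

```shell
# Hypothetical helper: print a blank line before every line that
# starts with one or two digits followed by a dash.
blank_before_dates() {
  sed '/^[0-9]\{1,2\}-/{x;p;x;}' "$@"
}

# Sample input made up for this sketch
printf 'notes\n1-Jan-2000 first\nmore\n15-Jul-1998 second\n' | blank_before_dates
```

Lines that do not start with a date pass through untouched, so only the two dated lines in the sample get a blank line in front of them.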

If you regularly edit large text files making wholesale changes to them, I highly recommend sed -- it's very powerful and fast. There is a steep learning curve but once you start using it, you'll be doing all kinds of powerful edits that previously took you hours in literally seconds.

Review of "The Girl with the Dragon Tattoo" (Stieg Larsson)

Somehow ended up watching the movie on Netflix's instant streaming service before reading the book. Thoroughly enjoyed the movie. Didn't like the book. The movie doesn't leave you enough time to think about whether something makes sense or not. The book does. This is the problem. I find most mysteries to have little credibility because when you think about the progression of events they don't quite make sense. In fact, I'm going to stick out my neck and say non-fiction mysteries, i.e., real-world murder cases or scientific mysteries, are way more interesting and credible than the average Agatha Christie type mystery.

Skip the book and watch the movie. And, oh, there are some really sick people in the world.

Friday, July 30, 2010

Music for insomnia


For unknown reasons, I'm unable to fall off to sleep these days. I'm reminded of Al Pacino's movie Insomnia (excellent movie...if you like whodunits, watch it) and I'm getting a feeling for being an insomniac. Last night, I decided that instead of tossing and turning in bed, I would use it to my advantage. When I was a kid growing up I would spend hours and hours just listening to music but as a grown-up I hardly do that due to lack of time. This was a great opportunity to go back to being a kid.

I listened to my favorite electronica album (these days): 76:14 by Global Communication (image of album cover above and left). If someone wants an easy introduction to contemporary electronica, this is the album I would recommend. Easy to listen to and evocative of some of the scenes in the Bourne trilogy....many fans of this album might wonder what the connection is, but my mind somehow always goes to some of the train scenes in the movies while listening to some parts of the album. [If there's anyone out there who feels the same way, please drop me an email....I hope I'm not the only one with this mental connection].

The album runs 76 minutes and 14 seconds...hence the title. Each track's title is also the length of the track....a novel idea. My current favorite track: 9:39.

Thursday, June 24, 2010

Music review :: Zero 7's Simple Things


My rating: *****
Simple Things
Group: Zero 7
Genre/s: Acid jazz, lounge, blues, electronica.
Label: Palm Pictures (Audio)
Released: 2001

One of the few good things about aging is you begin to receive recommendations for music and movies from your nephews, nieces, and other young people. Typically, this turns out to be material that I would have never come across by myself just because my contemporaries are plugged into very different things. Zero 7's debut album Simple Things is a great example of what I mean. It was recommended to me by my very cool niece who lives and works in NYC. None of my friends have heard this album or even heard of this group. The first time I listened to this album I thought of it as interesting rock music. The second time I listened to this album I enjoyed it too but I remember it as a blues album. The next time I listened to it we were driving somewhere long distance and I had the late-night shift. I put it on at a low volume in the car. It just blew me away. The lyrics are superb, the vocals (Mozez, Sia Furler, Sophie Barker) are gorgeous. Mozez's voice reminds me of Roland Gift's (Fine Young Cannibals), Sia Furler's voice is like Chrissie Hynde's (The Pretenders) but with just a little hint of Janis Joplin in it, while Sophie Barker's is more conventional but beautiful.

The best feature, though, of Zero 7's music is the electronica. The multi-layered melodies are lush, fully developed and well thought out. I love electronic music and these guys blend their electronic keyboards with the percussion and the vocals beautifully. Of course there are a number of tracks that are purely instrumental and these are the tracks that blew me away on my long distance drive (listen to Polaris in the dark to experience what I mean). Maybe the most unusual aspect of the album is the number of genres that it spans. I saw somewhere that Zero 7's music is classified as acid jazz. The term "acid jazz" evokes images of dark, smoky jazz clubs with people on acid listening to jazz (well, at least to me). But if this is mainstream acid jazz I'm blown away by it. I don't think this is mainstream acid jazz...surely mainstream acid jazz doesn't have this stylistic quality to it with the blues-like singing, the electronica, and the laid-backness of a couple of the numbers. Anyway, what does it matter? Enjoy the music.

Friday, June 18, 2010

Book review: Snow Crash by Neal Stephenson


My rating: *****
Book title: Snow Crash
Author: Neal Stephenson.
Genre/s: Science-fiction, cyberpunk, drama.
Publisher: Bantam-DoubleDay
Published: 1992

Neal Stephenson can be credited with writing one of the defining cyberpunk novels with Snow Crash. After reading this novel, you'll wonder if Stephenson had a time-travel machine because he foresaw the coming of the world-wide-web in such a big and accurate way -- after all, the novel was published in 1992, which means that it was written prior to 1992. The net wasn't a big force prior to 1995 and it wasn't a force at all in 1992 or even 1993.

Snow Crash is actually a virus but of a different sort: people get infected by it if they view it on their computer screens. It looks like "snow" (hey you RGB people -- you know what I mean!) but it has a horrible consequence -- death! The hero of this novel loses an old friend to the virus. Thus begins his quest for the search of the origins of the virus and its perpetrators. This is an amazing book because it's not really your traditional science-fiction with nebulae and space travel and so on but it's stuff that computer geeks (such as myself) can really relate to. One of the virtual reality simulations that is described here runs on a UNIX box -- I got a real kick out of reading that. The book is destined to be (some say it already is) a classic because it spawned a whole new type of literature. Get it and enjoy it (and once you are done, get Stephenson's other books).

Monday, June 14, 2010

Rebel code

I'm about to finish Glyn Moody's book "Rebel Code" that documents in fine detail the origins and the development of the free software or the open source movement (free software and open source are not necessarily the same thing as Richard Stallman has pointed out). This book is very well written as it gives a human dimension to the motivations, frustrations, challenges, triumphs and successes of the people behind open source. Programmers are usually not glamorous, but this book brings alive the world of software in a way that makes it general-interest reading.

I will write more about some of the specifics in the book and what I liked and didn't like about it. Right now let me just state that this book is quite a comprehensive account of the origins of the movement, starting with Richard Stallman's work at MIT and why he felt that it was important for software to be free and the source code to be open. The book documents the creation of the major pieces of open source software like emacs, gcc, GNU/Linux, Apache, Perl, TeX, and GNOME and the role of the internet in the collaboration between people from across the world.

If you're interested in the software industry you can't miss this book. I highly recommend it to others too just to get a feel for the world of software programmers. Maybe you'll get the bug for open source software like I'm getting and switch to open source.

Friday, June 11, 2010

Seeing and hearing data

We as a society are generating and saving more data. But we are also inventing new ways of seeing and hearing what the data has to tell us. Researchers at the University of California, Santa Barbara have built what they call the "Allosphere" that allows them to visualize and to hear what the data can reveal that traditional methods of analysis may not. The Allosphere is a two-storey tall anechoic (echo-free) and opaque chamber that contains projectors and speakers that are fed audio and video streams produced by the many software tools that they are using. The principal faculty member behind the project, Professor JoAnn Kuchera-Morin, presented this 6-minute video at the TED talks that gives several examples of how researchers are using the Allosphere to gain new insights.