sed porting (or the little r that could)

December 21, 2007 by
Filed under: linux 

My company mainly builds mobile consumer applications, sold on almost every mobile platform available. To support them, we run servers that provide various content, perform bookkeeping, etc.

About a year ago, we moved one of our servers from FreeBSD to Red Hat Linux. During this move, I learned that not all of UNIX’s small utilities were born equal. One of the tasks of this server was to receive content updates and transform them into the format read by our applications. All the feeds are in pipe-delimited formats, since mobile clients couldn’t handle any fancy format (do I hear XML?) very well.

The problem was that somewhere along the line, our content provider changed some of the codes it uses. Since we needed to support old applications, and didn’t want to update all the applications’ code, we decided to transform the codes back to the original version while creating the new feeds. The feeds at that time were plain text files that the clients would just download. To do the transformation, we added a sed command. This command basically ran a lot of s/|1234/|5678/g substitutions and worked well, was quick, and didn’t ask for food.

Until we moved it to Red Hat. When this command ran on Linux, the script froze for several minutes, and then an alarming message appeared: sed: Couldn't re-allocate memory. Tracking the script output while it ran, we discovered that it tried to replace every character in the original 38,000-line feed with the “|5678” string, and tried to do it about 100 times. Of course, the second s/../../g command had about 5 times more characters to replace than the first, and the output kept growing exponentially.
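To get a feel for how fast this blows up, here is a rough back-of-the-envelope sketch in shell arithmetic. It assumes, based only on the symptom described above, that each broken substitution inserts the 5-character string at every position of a line, including both ends, so a line of n characters becomes 6n + 5 characters. The function name grown_length is made up for illustration.

```shell
# Rough model, not a measurement: assumes every pass inserts a 5-character
# string at each of the n+1 positions in a line of n characters.
# grown_length is a hypothetical helper name.
grown_length() {
    len=$1      # starting line length in characters
    passes=$2   # number of broken s/../../g commands applied
    i=0
    while [ "$i" -lt "$passes" ]; do
        len=$(( len * 6 + 5 ))
        i=$(( i + 1 ))
    done
    echo "$len"
}

grown_length 10 1   # prints 65
grown_length 10 3   # prints 2375
```

Under that model, a 10-character line passes two thousand characters after only three commands, so a hundred of them never had a chance of fitting in memory.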

The fix was rather simple: we escaped the pipes in the regular expressions, so now they look like this: s/\|1234/\|5678/g. More importantly, we discovered that if you want to use sed on Linux, you’d better add the -r parameter, which tells sed to use extended regular expressions in the script. In that dialect a bare | means “or”, while an escaped \| is a literal pipe character.
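As a minimal sketch of the fixed substitution, here it is with a single code pair. The codes 1234 and 5678 are the placeholders used above, the sample feed line is invented, and the function names are hypothetical. The first variant uses -E, which current GNU sed treats the same as -r and which BSD sed also accepts. The second variant is not what we deployed: it puts the pipe in a bracket expression, which is a literal in both basic and extended regular expressions, so it should not depend on the dialect at all.

```shell
# Hypothetical helper: extended regular expressions, pipe escaped so it is
# matched literally instead of acting as the "or" operator.
restore_codes() {
    sed -E 's/\|1234/|5678/g'
}

# Alternative sketch: [|] matches a literal pipe in both regex dialects,
# so no -E or -r flag is needed.
restore_codes_portable() {
    sed 's/[|]1234/|5678/g'
}

printf 'news|1234|headline\n' | restore_codes            # prints news|5678|headline
printf 'news|1234|headline\n' | restore_codes_portable   # prints news|5678|headline
```

Lines that contain no matching code pass through unchanged, which makes it easy to try the command on a small sample of the feed before pointing it at the real thing.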

Needless to say, all is working well now.

Comments

2 Comments on sed porting (or the little r that could)

  1. Mike on Sat, 22nd Dec 2007 12:44 pm
    First off, welcome to your new site.
    Looks a little familiar. :)

    I remember that sed tidbit – is this something new, or something that you only now wrote about?

  2. David on Wed, 26th Dec 2007 1:19 pm
    Thanks :-)

    It is the incident you remember; I finally had some time to publish the drafts.
