Tuesday, March 29, 2011

First success and failure with Ruby program (ruby pt.3)

So if you haven't been keeping up, I'm trying to write my first Ruby program to teach myself the Ruby programming language. I coded in BASIC in high school and did a little C coding in college. Since then I've put that part of my brain to sleep with a diet of Froot Loops and Keebler elf cookies. The program I'm writing takes a text file and reverses the order of the characters (in Hebrew) on each line, but not the order of the lines.

So, status update: I got my program running this morning. Yesterday as I was going to bed (after the smoke detector woke me up -- long story, tell you later) I picked up my Ruby book and had an idea.

This morning I put it together, and now the program works (sort of, but more on that later). It searches the directory for files that need to be reversed, reverses each line in order, and outputs a file! Yay! But alas, like many programming victories, it was Pyrrhic.

And here is the failure (which I was expecting): it mangles Hebrew. Here are the input and output files:

This sucks quite a bit, since my research shows that the reverse! method I'm using is the culprit. Ruby assumes you're using ANSI text. Why? Are we in the 386 era again? Wasn't this written by a Japanese guy? It seems to be a memory issue. An ANSI character is 8 bits (1 byte) while a Unicode character is 16 bits (2 bytes), so ANSI text occupies half the memory that Unicode does. However, this is Beautiful Sunshine (or B.S. for short) because Java uses Unicode. OK, so now to find out how to handle Unicode in Ruby. [ed. ANSI is 1 byte (8 bits) per character. Unicode is not always 2 bytes: UTF-16 uses 2 or 4 bytes per character and UTF-8 uses 1 to 4, with each Hebrew letter taking 2. A reverse that works byte by byte splits those pairs, which would explain the mangling.]
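To make the byte-versus-character point concrete, here is a small sketch, assuming a reasonably recent Ruby (2.0 or later) and UTF-8 text. The variable names are made up for illustration, and the byte-level reversal only imitates what a byte-oriented reverse would do; it is not taken from my actual program.

```ruby
# The Hebrew word "shalom", written with escapes so the source stays plain ASCII.
# Each of the 4 letters takes 2 bytes in UTF-8.
word = "\u05E9\u05DC\u05D5\u05DD"

# Imitate a byte-oriented reverse: flip the raw bytes, then relabel them as UTF-8.
# The two bytes inside every letter end up swapped, so the result is not valid UTF-8.
byte_reversed = word.bytes.reverse.pack("C*").force_encoding("UTF-8")

# Reverse character by character instead, keeping each letter's bytes together.
char_reversed = word.chars.reverse.join

puts word.bytesize                  # 8
puts byte_reversed.valid_encoding?  # false
puts char_reversed.valid_encoding?  # true
```

One caveat: text with vowel points (niqqud) stores them as separate combining characters, so a plain character reverse would detach them from their letters. That case needs extra care and is not handled here.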

But first the overview of how the program works. The program has three parts now.

  1. It fetches the files it needs to reverse (in order!) and puts the whole file contents in the variable text.
  2. It goes through each line of the text, reverses it, and appends each line (one at a time) to the output variable (which is outside the loop so it remains after this is done).
  3. It writes the output variable to a file.

All good except step two, which I'll have to figure out how to do differently.
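For anyone following along, the three steps could be sketched roughly like this. It is not my actual code: the helper names reverse_lines and reverse_files, the *.txt pattern, and the .reversed.txt output name are all assumptions made for the example, and it assumes a recent Ruby whose reverse works on characters for UTF-8 strings.

```ruby
# Hypothetical helper for step 2: reverse the characters of each line,
# keeping the lines themselves in their original order.
def reverse_lines(text)
  text.each_line.map { |line| line.chomp.reverse + "\n" }.join
end

# Hypothetical helper for steps 1 and 3: read every .txt file in dir
# (sorted, so they are handled in order) and write a reversed copy next to it.
def reverse_files(dir)
  Dir.glob(File.join(dir, "*.txt")).sort.each do |path|
    # Skip files this sketch produced on an earlier run.
    next if path.end_with?(".reversed.txt")

    # Step 1: put the whole file contents in one variable, read as UTF-8.
    text = File.read(path, encoding: "UTF-8")

    # Step 2: build the output from the reversed lines.
    output = reverse_lines(text)

    # Step 3: write the output variable to a file.
    File.write(path.sub(/\.txt\z/, ".reversed.txt"), output)
  end
end
```

Calling reverse_files(".") would process the current directory. Writing one output file per input file is a choice made for this sketch; collecting everything into a single output file would work just as well.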

