ruby - Read file ending early?

June 15, 2010

I ran into a weird thing where a line in a file is causing my Ruby script to finish reading the file early.

My code is:

    File.readlines($file).each do |line|
      puts "line is: " + line.to_s
      # Print only tab, LF, CR, and printable ASCII bytes
      line.each_byte do |c|
        if c == 9 || c == 10 || c == 13 || (c > 31 && c < 127)
          print c.chr
        end
      end
    end

The file I'm using has a single character that, in Notepad++ and Sublime Text 2, shows up as "SUB".

In the following lines, it appears between the "cr" and the "me" towards the end of the first line:

    "producttoken","estee-lauder-re-nutriv-replenishing-comfort-eye-crme-15ml"
    "producttoken","estee-lauder-youth-dew-body-satinee-150ml"

I have the same lines in Dropbox.

When I execute the sample script above, it hits that character and finishes. I suspect that the File method is treating the character as the end of the file.

The problem is that I've got absolutely no idea how to sort this out. I can find and replace it in Sublime, or presumably using sed or something, but I'd prefer not to have to do that each time.

I'm using Ruby 1.9.3 on Windows.

Can I use a file encoding or something? I have no idea what the file encoding is, let alone how to handle this.

Also, the original readlines code was meant to take the contents of a CSV file, parse it, and stick it into a hash. The original file is approximately 28 MB, with over 350k unique lines from the database, but when I checked the size of the hash I found it was only 2100 long, which led me to start looking into this.

As requested, I ran it through od -c on a Mac and got the following:

    0000000    "   p   r   o   d   u   c   t   t   o   k   e   n   "   ,   "
    0000020    e   s   t   e   e   -   l   a   u   d   e   r   -   r   e   -
    0000040    n   u   t   r   i   v   -   r   e   p   l   e   n   i   s   h
    0000060    i   n   g   -   c   o   m   f   o   r   t   -   e   y   e   -
    0000100    c   r 032   m   e   -   1   5   m   l   "  \n   "   p   r   o
    0000120    d   u   c   t   t   o   k   e   n   "   ,   "   e   s   t   e
    0000140    e   -   l   a   u   d   e   r   -   y   o   u   t   h   -   d
    0000160    e   w   -   b   o   d   y   -   s   a   t   i   n   e   e   -
    0000200    1   5   0   m   l   "  \n
    0000207

http://blob.perl.org/books/beginning-perl/3145_appf.pdf

According to this, octal 032 is the SUB character. In that case, if it is a valid ASCII character, why does Ruby think it is the end of the file?
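For reference, octal 032 is decimal 26, hex 0x1A, i.e. Ctrl+Z, which text-mode reads on Windows have traditionally treated as an end-of-file marker. The sketch below only confirms the byte value and shows one way to locate such bytes in a string; `sub_offsets` is a hypothetical helper name, and the sample string is a shortened stand-in for the real line.

```ruby
# Hypothetical helper: return the byte offsets of every SUB (0x1A) in a string.
def sub_offsets(data)
  offsets = []
  data.each_byte.with_index { |byte, i| offsets << i if byte == 0x1A }
  offsets
end

# Octal 032, decimal 26 and hex 0x1A are the same value.
puts 0o32 == 0x1A

# Shortened stand-in for the affected line, with SUB between "cr" and "me".
sample = "eye-cr\x1Ame-15ml"
p sub_offsets(sample)
```

Running the same helper over data read in binary mode would show where each SUB sits in the real file.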

You can bypass the issue by using File#read, which lets you specify the exact number of bytes to read.

    File.open($file) do |f|
      # Passing an explicit length makes read return that many bytes
      f.read(f.size).each_line do |line|
        # ...
      end
    end
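A related option, not part of the answer above, is to open the file in binary mode ("rb") so that no text-mode end-of-file handling applies, and to drop the SUB bytes while iterating. This is a minimal sketch under that assumption; `each_clean_line` is a hypothetical helper name and the temporary file stands in for the real CSV.

```ruby
require 'tempfile'

# Hypothetical helper: read a file in binary mode and yield each line
# with any SUB (0x1A) bytes removed.
def each_clean_line(path)
  return enum_for(:each_clean_line, path) unless block_given?
  File.open(path, "rb") do |f|
    f.each_line { |line| yield line.delete("\x1A") }
  end
end

# Demo file containing a SUB byte inside the first line.
tmp = Tempfile.new("sub_demo")
tmp.binmode
tmp.write("\"a\",\"cr\x1Ame\"\n\"b\",\"next\"\n")
tmp.close

lines = each_clean_line(tmp.path).to_a
puts lines.length
tmp.unlink
```

Whether to strip the SUB byte or keep it depends on what the downstream code expects; removing it here is only for illustration.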

Or better, since your data is CSV, you can use the CSV library, which reads past Ctrl+Z:

    require 'csv'
    rows = CSV.read($file)
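For the parse-into-a-hash step mentioned in the question, something along these lines might work once the rows are available. The hash layout (first column as key, second column values collected in an array) is an assumption, since the question does not show it, and `rows_to_hash` is a hypothetical helper name.

```ruby
require 'csv'

# Hypothetical helper: strip SUB bytes, parse the CSV text, and group
# second-column values under their first-column key.
def rows_to_hash(csv_text)
  cleaned = csv_text.delete("\x1A")
  CSV.parse(cleaned).each_with_object({}) do |(key, value), hash|
    (hash[key] ||= []) << value
  end
end

# Shortened stand-in for the two sample lines, SUB included.
sample = "\"producttoken\",\"eye-cr\x1Ame-15ml\"\n" \
         "\"producttoken\",\"body-satinee-150ml\"\n"

tokens = rows_to_hash(sample)
puts tokens["producttoken"].length
```

Comparing the resulting count against the expected number of lines is a quick way to check that the whole file was read.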
