ruby - Read file ending early? -


I ran into a weird thing: a line in a file is causing my Ruby script to finish reading the file early.

My code is:

File.readlines($file).each do |line|
  puts "line is: " + line.to_s
  # Print only tab, LF, CR, and printable ASCII bytes.
  line.each_byte do |c|
    if c == 9 || c == 10 || c == 13 || (c > 31 && c < 127)
      print c.chr
    end
  end
end

The file I'm using has a single character that, in Notepad++ and Sublime Text 2, shows up as "SUB".

In the following lines, it appears between "cr" and "me" towards the end of the first line:

"producttoken","estee-lauder-re-nutriv-replenishing-comfort-eye-crme-15ml"
"producttoken","estee-lauder-youth-dew-body-satinee-150ml"

I have the same lines in Dropbox.

When I execute the sample script above, it hits that character and finishes. I suspect the File method is treating the character as end of file.

The problem is that I have absolutely no idea how to sort this out. I can find and replace it in Sublime, or presumably with sed or something, but I'd prefer not to have to do that each time.

I'm using Ruby 1.9.3 on Windows.

Can I use a file encoding or something? I have no idea what the file encoding is, let alone how to handle this.

Also, the original readlines call was meant to take the contents of a CSV file, parse it, and stick it into a hash. The original file is approximately 28 MB, with 350k unique lines from the database, so when I checked the size of the hash and found it was only 2100 long, that led me to start looking into this.


As requested, I ran it through od -c on a Mac and got the following:

0000000    "   p   r   o   d   u   c   t   t   o   k   e   n   "   ,   "
0000020    e   s   t   e   e   -   l   a   u   d   e   r   -   r   e   -
0000040    n   u   t   r   i   v   -   r   e   p   l   e   n   i   s   h
0000060    i   n   g   -   c   o   m   f   o   r   t   -   e   y   e   -
0000100    c   r 032   m   e   -   1   5   m   l   "  \n   "   p   r   o
0000120    d   u   c   t   t   o   k   e   n   "   ,   "   e   s   t   e
0000140    e   -   l   a   u   d   e   r   -   y   o   u   t   h   -   d
0000160    e   w   -   b   o   d   y   -   s   a   t   i   n   e   e   -
0000200    1   5   0   m   l   "  \n
0000207

http://blob.perl.org/books/beginning-perl/3145_appf.pdf

According to this, octal 032 is the SUB character. In that case, if it is a valid ASCII character, why does Ruby think it is end-of-file?
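A quick check in Ruby agrees with that reading of the table: octal 032 is decimal 26, hex 1A, which is the control character typed as Ctrl+Z. This is only an illustrative sketch, and the variable names are made up for the example.

```ruby
# Octal 032 from the od output, written as a Ruby string escape.
sub = "\032"

sub_code = sub.ord            # numeric code of the character
sub_hex  = sub_code.to_s(16)  # the same value in hexadecimal

# Ctrl+Z is conventionally the code of 'Z' minus 64.
sub_is_ctrl_z = (sub == "\x1A") && (sub_code == "Z".ord - 64)

puts "decimal: #{sub_code}, hex: #{sub_hex}, Ctrl+Z: #{sub_is_ctrl_z}"
```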

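The SUB character (0x1A) is Ctrl+Z, which Windows text-mode reads treat as an end-of-file marker. You can bypass the issue by using File#read, which lets you specify the exact number of bytes to read.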

File.open($file) do |f|
  # Ask for the whole file by size so the read does not stop at Ctrl+Z.
  f.read(f.size).each_line do |line|
    # ...
  end
end
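A related approach, which is my assumption rather than something stated in the answer, is to open the file in binary mode with 'rb' so that Ruby hands back the raw bytes and a 0x1A byte is just data. The sketch below uses made-up sample data and a hypothetical helper named `printable_lines` that applies the same byte filter as the question's script.

```ruby
require 'tempfile'

# Hypothetical helper: read every line in binary mode and keep only tab, LF,
# CR, and printable ASCII bytes, mirroring the filter from the question.
def printable_lines(path)
  File.open(path, 'rb') do |f|
    f.each_line.map do |line|
      kept = line.each_byte.select do |c|
        c == 9 || c == 10 || c == 13 || (c > 31 && c < 127)
      end
      kept.map(&:chr).join
    end
  end
end

# Made-up sample: two records, the first containing a SUB (0x1A) byte.
sample = "\"producttoken\",\"eye-cr\x1Ame-15ml\"\n" \
         "\"producttoken\",\"body-satinee-150ml\"\n"

tmp = Tempfile.new('sub_demo')
tmp.binmode
tmp.write(sample)
tmp.close

cleaned = printable_lines(tmp.path)
cleaned.each { |line| puts "line is: #{line}" }
tmp.unlink
```

With the real data, `printable_lines($file)` would take the place of the temporary file. I have not verified this against Ruby 1.9.3 on Windows specifically, so treat it as a starting point.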

Or better, since your data is CSV, you can use the CSV library, which reads past the Ctrl+Z:

require 'csv'
rows = CSV.read($file)
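Since the question mentions sticking the parsed rows into a hash, here is a minimal sketch of that step. It parses a made-up string rather than a file, so it only shows that a SUB byte inside a quoted field is ordinary data to the parser; it does not reproduce the Windows file-reading behavior. Keying the hash by the second column (`by_token`) is purely an assumption for illustration.

```ruby
require 'csv'

# Made-up sample with the same two-column shape as the lines in the question,
# including a SUB (0x1A) byte inside the first value.
data = "\"producttoken\",\"eye-cr\x1Ame-15ml\"\n" \
       "\"producttoken\",\"body-satinee-150ml\"\n"

# Parse from a string so the example does not depend on a file on disk.
rows = CSV.parse(data)

# Hypothetical index: key each row by its second column.
by_token = {}
rows.each { |key, value| by_token[value] = key }

puts "rows parsed: #{rows.size}, hash size: #{by_token.size}"
```

Comparing `rows.size` with the expected line count is a cheap way to notice a truncated read like the 2100-entry hash described in the question.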
