Python Pandas Reading CSV file with Specific Line Terminators


I am trying to create a DataFrame from the sample CSV below, which I've been given, but I am getting the error "Error tokenizing data. C error: EOF inside string starting at line 0". I haven't had much practice treating bad lines, so I would like to learn the best way to handle this. I have attempted many different options in read_csv, such as error_bad_lines=False, but that has not worked either.

CParserError: Error tokenizing data. C error: EOF inside string starting at line 0

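I am guessing that the line terminators of ," are causing the issue, and that the best way is to loop through each line and process it. I came up with the generator below, which takes a different approach, hoping it is close. I would also like to learn how to use a generator and yield.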

Sample data:

"usnc3255","27","us","nc","lands end","72305006","knjm","knca","knkt","t72305006","","","ncc031","ncz095","","545","28594","america/new_york","34.65266","-77.07661","7","rdu","893727","
"usnc3256","27","us","nc","landsdown","72314058","keho","kakh","kipj","t72314058","","","ncc045","ncz068","sc007","517","28150","america/new_york","35.29374","-81.46537","797","clt","317845","

I have crafted the code below, which removes the last 2 characters of each line, but I am not sure how to produce a DataFrame from the lines:

import pandas as pd

def big_table_generator(filename):
    with open(filename, 'rt') as f:
        for line in f:
            yield line[:-3]

gen = big_table_generator('../data/test_sun_file.csv')
pd.DataFrame(gen)
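As a rough sketch of how a generator like this could yield rows instead of raw strings: the version below strips the dangling ," from each line and lets the standard csv module split the remaining quoted fields. The name clean_rows is hypothetical, and the sketch assumes every record sits on its own line, as in the sample data above.

```python
import csv


def clean_rows(lines):
    """Yield each record as a list of fields (hypothetical helper).

    Assumes one record per line, each ending in a stray ',"'.
    """
    for line in lines:
        line = line.strip()          # drop the newline and any padding
        if line.endswith(',"'):
            line = line[:-2]         # remove the dangling ',"' terminator
        if line:
            # The repaired line is ordinary quoted CSV, so csv.reader can split it
            yield next(csv.reader([line]))


# The two sample records from the question, as they would be read from the file
sample = [
    '"usnc3255","27","us","nc","lands end","72305006","knjm","knca","knkt",'
    '"t72305006","","","ncc031","ncz095","","545","28594","america/new_york",'
    '"34.65266","-77.07661","7","rdu","893727","\n',
    '"usnc3256","27","us","nc","landsdown","72314058","keho","kakh","kipj",'
    '"t72314058","","","ncc045","ncz068","sc007","517","28150","america/new_york",'
    '"35.29374","-81.46537","797","clt","317845","\n',
]

rows = list(clean_rows(sample))
```

Because clean_rows accepts any iterable of lines, an open file object can be passed in directly, and the resulting list of rows can then be handed to pd.DataFrame(rows) to build the frame.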

I had a similar error. I fixed it using the option quoting=csv.QUOTE_NONE in read_csv.

For example:

import csv
df = pd.read_csv(csvfile, header=None, delimiter="\t", quoting=csv.QUOTE_NONE, encoding='utf-8')

Some info on why is in the second comment here: https://github.com/pydata/pandas/issues/5500
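Applied to the sample data from the question, this could look roughly like the sketch below. It assumes a comma delimiter rather than the tab used in my own file, and it reads the two sample rows from an in-memory string instead of a real file. With quoting switched off, every value keeps its surrounding quotes and the trailing ," shows up as an extra last column, so both are cleaned up afterwards. Note that this only works when no field contains a comma, which holds for the sample.

```python
import csv
import io

import pandas as pd

# The two sample rows, each ending in the stray ',"' before the newline
raw = (
    '"usnc3255","27","us","nc","lands end","72305006","knjm","knca","knkt",'
    '"t72305006","","","ncc031","ncz095","","545","28594","america/new_york",'
    '"34.65266","-77.07661","7","rdu","893727","\n'
    '"usnc3256","27","us","nc","landsdown","72314058","keho","kakh","kipj",'
    '"t72314058","","","ncc045","ncz068","sc007","517","28150","america/new_york",'
    '"35.29374","-81.46537","797","clt","317845","\n'
)

# QUOTE_NONE makes the parser treat '"' as an ordinary character,
# so the unterminated quote at the end of each line no longer raises an error
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    sep=",",
    quoting=csv.QUOTE_NONE,
    dtype=str,
)

# Strip the literal quotes that were kept in every value
df = df.apply(lambda col: col.str.strip('"'))

# Drop the last column, which held only the lone trailing quote
df = df.iloc[:, :-1]
```

All columns come back as strings here; numeric columns can be converted afterwards, for example with pd.to_numeric.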

