python - How to support multiple file formats and field delimiters?


A file's contents in table form can be exported in at least 3 formats (UTF-8, UTF-16LE, ASCII), have columns that are tab-separated, pilcrow-separated, or something else, and have quotes/thorns/etc. around each item. The following function reads in a table that is UTF-8, separated by pilcrows, with each item surrounded by thorns.

import codecs
import re


def read_app_dat(app_export):
    """Reads and parses a DAT file exported from the app.

    Assumes the delimiters are the Concordance ones.

    Args:
        app_export: str, file path to the DAT file exported from the app
    Returns:
        dictionary with each id mapped to its uri
    """
    app_dict = {}
    f = codecs.open(app_export, encoding='utf-8')
    for line in f:
        # Drop the thorn quotes, then split the row on the pilcrow separator.
        each_row = re.sub(r'\xfe', "", line).split("\x14")
        if "id" in each_row[0] or "uri" in each_row[1]:
            pass  # header row
        else:
            app_dict[each_row[0]] = each_row[1]
    return app_dict

As it's written, I need to define each_row differently for each scenario.

each_row = re.sub(r'\xfe', "", line).split("\x14") 

That's not a very Pythonic thing to do. How can I better deal with the separators, in this case pilcrows and thorns, and pass them in as parameters? The codecs module has been helpful so far.
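To make the goal concrete, here is a rough sketch of the kind of signature I have in mind, assuming Python 3 (where the built-in open takes an encoding argument). The name read_table and its parameters encoding, delimiter, quote, and skip_header are placeholders of mine, not anything from the app or from a library. It also differs from the function above in two ways: it strips the quote character only from the ends of each field, and it skips the first row instead of searching for "id"/"uri".

```python
def read_table(path, encoding="utf-8", delimiter="\x14", quote="\xfe",
               skip_header=True):
    """Read a two-column delimited text file into a dict.

    Hypothetical helper: the defaults mirror the pilcrow/thorn case
    from the question; pass other values for other exports.
    """
    table = {}
    with open(path, encoding=encoding) as f:
        for line_number, line in enumerate(f):
            if skip_header and line_number == 0:
                continue  # assumption: the first row holds column names
            # Remove the line ending, then split on the field delimiter.
            fields = line.rstrip("\r\n").split(delimiter)
            if quote:
                # Strip the quote character from both ends of each field.
                fields = [field.strip(quote) for field in fields]
            if len(fields) < 2:
                continue  # skip blank or malformed rows
            table[fields[0]] = fields[1]
    return table
```

With something like this, a tab-separated ASCII export with double quotes would be read as read_table(path, encoding="ascii", delimiter="\t", quote='"'), and a UTF-16LE pilcrow/thorn export as read_table(path, encoding="utf-16-le"). I don't know whether this is the right approach, or whether the csv module's delimiter and quotechar options would be a better fit.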

Thank you for your time.

