C# - Single regex to decode CSV with embedded double quotes and commas


I have lots of CSV data that I am trying to decode using a regex. I am trying to build on an existing code base that other people and projects hit, and I don't want to risk breaking their data flows by refactoring the class too much. So I was wondering whether it is possible to decode this text with a single regex (which is how the class works currently):

f1,f2,f3,f4,f5,f6,f7
,"clean text","with,embedded,commas.","with""embedded""double""quotes",,"6.1",

The first row is a header. If you save this as xxx.csv and open it in Excel, it decodes as follows (note that the spaces between fields are cell breaks):

f1  f2  f3  f4  f5  f6  f7
clean text  with,embedded,commas.   with"embedded"double"quotes     6.1

But when I try this in .NET, I get stuck on the regex. I have this:

string regexp = "(((?<x>(?=[,\\r\\n]+))|\"(?<x>([^\"]|\"\")+)\"|(?<x>[^,\\r\\n]+)),?)"; 

You can see it in action here:

http://ideone.com/hrq8xe

Which results in this:

<start>  clean text with,embedded,commas. with""embedded""double""quotes  6.1 <end> 

This is close, but it does not replace the escaped double-double quotes with a single double quote the way Excel does. I could not come up with a regex that worked better. Can it be done?

Maybe you can somehow manage to match the string using regular-expression conditionals, with the following constructs:

  • if-then-else: (?(?=regex)then|else)
  • if-then-else with several alternatives per branch: (?(?=condition)(then1|then2|then3)|(else1|else2|else3))
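As a rough illustration of the if-then-else construct in .NET, the sketch below looks at the first character of a field: if it is a double quote, the "then" branch matches a quoted field, otherwise the "else" branch matches everything up to the next comma. The class name `ConditionalFieldDemo`, the method `FirstField`, and the pattern itself are made up for this example; it only reads the first field of a line and is not a complete CSV pattern.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the code in the question.
public static class ConditionalFieldDemo
{
    // (?(?=")  condition: is the next character a double quote?
    //   then:  "(?<x>(?:[^"]|"")*)"   quoted field, "" allowed inside
    //   else:  (?<x>[^,\r\n]*)        unquoted field, up to the next comma
    private const string FieldPattern =
        "^(?(?=\")\"(?<x>(?:[^\"]|\"\")*)\"|(?<x>[^,\\r\\n]*))";

    // Returns the raw text of the first field, or null if nothing matched.
    public static string FirstField(string text)
    {
        Match m = Regex.Match(text, FieldPattern);
        return m.Success ? m.Groups["x"].Value : null;
    }
}
```

For example, `ConditionalFieldDemo.FirstField("\"a,b\",c")` yields `a,b`, while `ConditionalFieldDemo.FirstField("abc,def")` yields `abc`.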

I came up with the following pattern to match the body of the text: ([^\,]+(?(?=[^\,])([^\"]+")|([^\,]+,))). However, you will need to put in some more effort to create an expression that completely matches your text, or else end up using a file parser. If so, you can take a look at FileHelpers, a pretty neat library for parsing text files.
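A regex match can only capture text, it cannot rewrite it, so the "" to " step probably has to happen after the match. If the class must stay regex-based, one low-risk option is to keep the pattern from the question unchanged and unescape the captured value of quoted fields afterwards. The sketch below assumes one CSV record per call with no line breaks; `CsvLineDecoder` and `Decode` are hypothetical names, not part of the existing class.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the name and shape are assumptions for illustration.
public static class CsvLineDecoder
{
    // The pattern from the question, unchanged.
    private const string Pattern =
        "(((?<x>(?=[,\\r\\n]+))|\"(?<x>([^\"]|\"\")+)\"|(?<x>[^,\\r\\n]+)),?)";

    // Decodes a single record (one line, no trailing newline).
    public static List<string> Decode(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Regex.Matches(line, Pattern))
        {
            string value = m.Groups["x"].Value;

            // Only quoted fields use "" as an escape, so unescape only
            // when the whole match began with a double quote.
            if (m.Value.StartsWith("\"", StringComparison.Ordinal))
            {
                value = value.Replace("\"\"", "\"");
            }
            fields.Add(value);
        }
        return fields;
    }
}
```

Called on the data row from the question, the fourth field comes back as with"embedded"double"quotes. Note that this pattern does not appear to produce a match for an empty field at the very end of a line, so the trailing f7 value would still need separate handling.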
