Ruby on Rails - Nokogiri XML to hash using attribute names
I'm new to Rails, and I'm looking to parse XML from PubMed's eutils API into a hash with the attributes I want. Here is what I have so far:
def pubmed_search
  new
  if params[:search_terms].present?
    require 'nokogiri'
    require 'open-uri'
    @search_terms = params[:search_terms].split.join("+")
    uid_url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=" + @search_terms
    uid_doc = Nokogiri::HTML(open(uid_url))
    @uid = uid_doc.xpath("//id").map { |uid| uid.text }.join(",")
    detail_url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=" + @uid
    detail_doc = Nokogiri::HTML(open(detail_url))
    @details = detail_doc.xpath("//item[@name='title']|//item[@name='fulljournalname']|//item[@name='author']").map { |article| article.text }
    render :new
  else
    render :new
  end
end
This gives me the values I want (authors, title, journal name), but it all comes out as one giant array without the attribute names, like so:
["keshmiri-neghab h", "goliaei b", "nikoofar a", "gossypol enhances radiation induced autophagy in glioblastoma multiforme.", "general physiology , biophysics", "alzahrani eo", "asiri a", "el-dessoky mm", "kuang y", "quiescence explanation of gompertzian tumor growth revisited.", "mathematical biosciences", "neofytou m", "tanos v", "constantinou i", "kyriacou e", "pattichis m", "pattichis c", "computer aided diagnosis in hysteroscopic imaging.", "ieee journal of biomedical , health informatics", "lou q", "ji l", "zhong w", "li s", "yu s", "li z", "meng x", "synthesis , cytotoxicity evaluation of naphthalimide derived n-mustards.", "molecules (basel, switzerland)", "sesang w", "punyanitya s", "pitchuanchom s", "udomputtimekakul p", "nuntasaen n", "banjerdpongchai r", "wudtiwai b", "pompimon w", "cytotoxic aporphine alkaloids leaves , twigs of pseuduvaria trimera (craib).", "molecules (basel, switzerland)", "yang xk", "xu my", "xu gs", "zhang yl", "xu zx", "in vitro , in vivo antitumor activity of scutebarbatine on human lung carcinoma a549 cell lines.", "molecules (basel, switzerland)", "yang cy", "lu rh", "lin ch", "jen ch", "tung cy", "yang sh", "lin jk", "jiang jk", "lin ch", "single nucleotide polymorphisms associated colorectal cancer susceptibility , loss of heterozygosity in taiwanese population.", "plos one", "zhang h", "gu l", "liu t", "chiang ky", "zhou m", "inhibition of mdm2 nilotinib contributes cytotoxicity in both philadelphia-positive , negative acute lymphoblastic leukemia.", "plos one", "oliveira a", "pinho d", "albino-teixeira a", "medeiros r", "dinis-oliveira rj", "carvalho f", "morphine glucuronidation increases analgesic effect in guinea-pigs.", "life sciences", "kabbout m", "dakhlallah d", "sharma s", "bronisz a", "srinivasan r", "piper m", "marsh cb", "ostrowski mc", "microrna 17-92 cluster mediates ets1 , ets2-dependent ras-oncogenic transformation.", "plos one", "kannen h", "hazama h", "kaneda y", "fujino t", "awazu k", "development of 
laser ionization techniques evaluation of effect of cancer drugs using imaging mass spectrometry.", "international journal of molecular sciences", "liang j", "tong p", "zhao w", "li y", "zhang l", "xia y", "yu y", "the rest gene signature predicts drug sensitivity in neuroblastoma cell lines , associated neuroblastoma tumor stage.", "international journal of molecular sciences", "mathur a", "ware c", "davis l", "gazdar a", "pan bs", "lutterbach b", "fgfr2 amplified in nci-h716 colorectal cancer cell line , required growth , survival.", "plos one", "van jw", "van den berg h", "van dalen ec", "different infusion durations preventing platinum-induced hearing loss in children cancer.", "the cochrane database of systematic reviews", "lynam-lennon n", "maher sg", "maguire a", "phelan j", "muldoon c", "reynolds jv", "o'sullivan j", "altered mitochondrial function , energy metabolism associated radioresistant phenotype in oesophageal adenocarcinoma.", "plos one", "meriggi f", "andreis f", "premi v", "liborio n", "codignola c", "mazzocchi m", "rizzi a", "prochilo t", "rota l", "di biasi b", "bertocchi p", "abeni c", "ogliosi c", "aroldi f", "zaniboni a", "assessing cancer caregivers' needs targeted psychosocial support project: experience of oncology department of poliambulanza foundation.", "palliative & supportive care", "gwede ck", "davis sn", "wilson s", "patel m", "vadaparampil st", "meade cd", "rivers bm", "yu d", "torres-roca j", "heysek r", "spiess pe", "pow-sang j", "jacobsen p", "perceptions of prostate cancer screening controversy , informed decision making: implications development of targeted decision aid unaffected male first-degree relatives.", "american journal of health promotion : ajhp", "simerska p", "suksamran t", "ziora zm", "rivera fd", "engwerda c", "toth i", "ovalbumin lipid core peptide vaccines , cd4<sup>+</sup> , cd8<sup>+</sup> t cell responses.", "vaccine", "ogembo jg", "manga s", "nulah k", "foglabenchi lh", "perlman s", "wamai rg", "welty t", 
"welty e", "tih p", "achieving high uptake of human papillomavirus vaccine in cameroon: lessons learned in overcoming challenges.", "vaccine", "chung cy", "alden sl", "funderburg nt", "fu p", "levine ad", "progressive proximal-to-distal reduction in expression of tight junction complex in colonic epithelium of virally-suppressed hiv+ individuals.", "plos pathogens"]
What I'm looking for instead would be:
@details = [{:title => "title1", :authors => ["author1", "author2", "author3"], :journal => "journal1"}, {:title => "title2", :authors => ["author4", "author5", "author6"], :journal => "journal2"}]
I've tried the .to_hash methods described in other answers, but they don't create a hash that deals with XML attributes well, since the names I want are in the @name attribute of each "item". Here is a sample of the XML from PubMed:
<esummaryresult><docsum><id>11850928</id><item name="pubdate" type="date">1965 aug</item><item name="epubdate" type="date"/><item name="source" type="string">arch dermatol</item><item name="authorlist" type="list"><item name="author" type="string">lopresti pj</item><item name="author" type="string">hambrick gw jr</item></item><item name="lastauthor" type="string">hambrick gw jr</item><item name="title" type="string">zirconium granuloma following treatment of rhus dermatitis.</item><item name="volume" type="string">92</item><item name="issue" type="string">2</item><item name="pages" type="string">188-91</item><item name="langlist" type="list"><item name="lang" type="string">english</item></item><item name="nlmuniqueid" type="string">0372433</item><item name="issn" type="string">0003-987x</item><item name="essn" type="string">1538-3652</item><item name="pubtypelist" type="list"><item name="pubtype" type="string">journal article</item></item><item name="recordstatus" type="string">pubmed - indexed medline</item><item name="pubstatus" type="string">ppublish</item><item name="articleids" type="list"><item name="pubmed" type="string">11850928</item><item name="eid" type="string">11850928</item><item name="rid" type="string">11850928</item></item><item name="history" type="list"><item name="pubmed" type="date">1965/08/01 00:00</item><item name="medline" type="date">2002/03/09 10:01</item><item name="entrez" type="date">1965/08/01 00:00</item></item><item name="references" type="list"/><item name="hasabstract" type="integer">1</item><item name="pmcrefcount" type="integer">0</item><item name="fulljournalname" type="string">archives of dermatology</item><item name="elocationid" type="string"/><item name="so" type="string">1965 aug;92(2):188-91</item></docsum><docsum><id>11482001</id><item name="pubdate" type="date">2001 jun</item><item name="epubdate" type="date"/><item name="source" type="string">adverse drug react toxicol rev</item><item name="authorlist" type="list"><item 
name="author" type="string">mantle d</item><item name="author" type="string">gok ma</item><item name="author" type="string">lennard tw</item></item><item name="lastauthor" type="string">lennard tw</item><item name="title" type="string">adverse , beneficial effects of plant extracts on skin , skin disorders.</item><item name="volume" type="string">20</item><item name="issue" type="string">2</item><item name="pages" type="string">89-103</item><item name="langlist" type="list"><item name="lang" type="string">english</item></item><item name="nlmuniqueid" type="string">9109474</item><item name="issn" type="string">0964-198x</item><item name="essn" type="string"/><item name="pubtypelist" type="list"><item name="pubtype" type="string">journal article</item><item name="pubtype" type="string">review</item></item><item name="recordstatus" type="string">pubmed - indexed medline</item><item name="pubstatus" type="string">ppublish</item><item name="articleids" type="list"><item name="pubmed" type="string">11482001</item><item name="eid" type="string">11482001</item><item name="rid" type="string">11482001</item></item><item name="history" type="list"><item name="pubmed" type="date">2001/08/03 10:00</item><item name="medline" type="date">2002/01/23 10:01</item><item name="entrez" type="date">2001/08/03 10:00</item></item><item name="references" type="list"/><item name="hasabstract" type="integer">1</item><item name="pmcrefcount" type="integer">3</item><item name="fulljournalname" type="string">adverse drug reactions , toxicological reviews</item><item name="elocationid" type="string"/><item name="so" type="string">2001 jun;20(2):89-103</item></docsum></esummaryresult>
Thanks for the help, I've been dying trying to find an answer.
There is no automatic way to do this, because the structure of the XML does not match the structure of the required hash. You must pick the desired nodes out of the XML manually and construct the hash from their values. Using XPath is probably the easiest way; the code might look like this:
@details = []
detail_doc.xpath("/esummaryresult/docsum").each do |node|
  detail = {}
  detail[:title] = node.xpath("item[@name='title']").text
  detail[:journal] = node.xpath("item[@name='fulljournalname']").text
  detail[:authors] = node.xpath("item[@name='authorlist']/item[@name='author']").map { |n| n.text }
  @details.push(detail)
end