python - Efficient way to loop over Tags with Beautiful Soup

- June 15, 2015

I want to extract information from multiple XML tags that are structured alike. I loop over every child and append the values to a dictionary. Is there a way to avoid a separate for-loop for each tag (like sn and count in my MWE)?

    from bs4 import BeautifulSoup as bs
    import pandas as pd

    xml = """
        <info>
        <tag>
             <sn>9-542</sn>
             <count>14</count>
        </tag>
        <tag>
             <sn>3-425</sn>
             <count>16</count>
        </tag>
        </info>
        """

    bs_obj = bs(xml, "lxml")
    info = bs_obj.find_all('tag')

    d = {}

    # I want to avoid these multiple for-loops
    d['sn'] = [i.sn.text for i in info]
    d['count'] = [i.count.text for i in info]

    pd.DataFrame(d)

Consider the following approach.
There are 2 for-loops for the sake of the solution being dynamic (the only thing to change if you want another tag is the needed_tags list):

    from collections import defaultdict

    # Each key maps to a list that is created on first access
    d = defaultdict(list)

    needed_tags = ['sn', 'count']
    for i in info:
        for tag in needed_tags:
            # getattr(i, 'sn') is equivalent to i.sn
            d[tag].append(getattr(i, tag).text)

    print(d)
    # >> defaultdict(<class 'list'>, {'count': ['14', '16'], 'sn': ['9-542', '3-425']})

For your exact example, this can be simplified to:

    from collections import defaultdict

    d = defaultdict(list)

    # One pass over the <tag> elements fills both lists
    for i in info:
        d['sn'].append(i.sn.text)
        d['count'].append(i.count.text)

    print(d)
    # >> defaultdict(<class 'list'>, {'count': ['14', '16'], 'sn': ['9-542', '3-425']})
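
As a side note, the same single-pass idea can be sketched without listing the tag names at all, by reading each child's own tag name. The sketch below is not the answer's method: it swaps Beautiful Soup for the standard library's xml.etree.ElementTree so it runs without extra installs, and collect_children and its parent_name parameter are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

xml = """
<info>
    <tag>
        <sn>9-542</sn>
        <count>14</count>
    </tag>
    <tag>
        <sn>3-425</sn>
        <count>16</count>
    </tag>
</info>
"""


def collect_children(xml_text, parent_name="tag"):
    """Group the text of each direct child of every <parent_name> element
    by the child's tag name (hypothetical helper, for illustration only)."""
    root = ET.fromstring(xml_text.strip())
    columns = defaultdict(list)
    for parent in root.iter(parent_name):
        # Iterating an Element yields its direct children only
        for child in parent:
            columns[child.tag].append(child.text)
    return dict(columns)


d = collect_children(xml)
print(d)
```

The resulting dictionary of lists has the same shape as d in the question, so it could be passed to pd.DataFrame(d) in the same way. This assumes every tag element has the same children; if one is missing a child, the lists end up with different lengths.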
