python - Efficient way to loop over Tags with Beautiful Soup
I want to extract information from multiple XML tags that are structured alike. I loop over all the children and append to a dictionary. Is there a way to avoid a separate loop for each tag (like sn and count in my MWE)?
    from bs4 import BeautifulSoup as bs
    import pandas as pd

    xml = """
    <info>
    <tag>
    <sn>9-542</sn>
    <count>14</count>
    </tag>
    <tag>
    <sn>3-425</sn>
    <count>16</count>
    </tag>
    </info>
    """

    bs_obj = bs(xml, "lxml")
    info = bs_obj.find_all('tag')

    d = {}
    # I want to avoid these multiple for-loops
    d['sn'] = [i.sn.text for i in info]
    d['count'] = [i.count.text for i in info]

    pd.DataFrame(d)
Consider the following approach. There are 2 loops for the sake of the solution being dynamic (the only thing you need to change if you want another tag is the needed_tags list):
    from collections import defaultdict

    d = defaultdict(list)
    needed_tags = ['sn', 'count']

    # Outer loop: each <tag> element; inner loop: each child tag we want
    for i in info:
        for tag in needed_tags:
            d[tag].append(getattr(i, tag).text)

    print(d)
    # >> defaultdict(<class 'list'>, {'count': ['14', '16'], 'sn': ['9-542', '3-425']})
For this exact example, it can be simplified to:
    from collections import defaultdict

    d = defaultdict(list)

    # A single pass over the <tag> elements fills both lists
    for i in info:
        d['sn'].append(i.sn.text)
        d['count'].append(i.count.text)

    print(d)
    # >> defaultdict(<class 'list'>, {'count': ['14', '16'], 'sn': ['9-542', '3-425']})
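As a possible variation on the dynamic approach, the two loops can be wrapped in a small helper. This is only a sketch under a few assumptions: the name collect_columns is hypothetical, it looks children up with find() instead of attribute access (so a child whose name matches an existing Tag attribute, such as name or string, is still found), it stores None when a child is missing, and it uses the built-in "html.parser" rather than "lxml" so that it runs without the extra dependency.

```python
from collections import defaultdict

from bs4 import BeautifulSoup

xml = """
<info>
  <tag><sn>9-542</sn><count>14</count></tag>
  <tag><sn>3-425</sn><count>16</count></tag>
</info>
"""


def collect_columns(records, needed_tags):
    """Hypothetical helper: one pass over records, one list per tag name."""
    columns = defaultdict(list)
    for record in records:
        for name in needed_tags:
            # find() returns None when the child is absent
            child = record.find(name)
            columns[name].append(child.get_text() if child is not None else None)
    return dict(columns)


# "html.parser" is an assumption made here to avoid requiring lxml
soup = BeautifulSoup(xml, "html.parser")
d = collect_columns(soup.find_all("tag"), ["sn", "count"])
print(d)
# {'sn': ['9-542', '3-425'], 'count': ['14', '16']}
```

The resulting plain dict can be passed to pd.DataFrame(d) exactly as in the question; adding another column only requires extending the list of tag names.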