Python – BeautifulSoup – Find All with Lambda Function for Attributes

Today, I had to figure out how to parse an HTML string in Python and collect the values of every attribute whose name starts with a specific string. Since we already have BeautifulSoup installed, I started researching how to use a lambda function with the attrs argument of BeautifulSoup#findAll(). Unfortunately, I didn’t find a way to use a callable with the attrs argument, but I did find one with the name argument:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('
Click
Jump
')
elems = soup.findAll(lambda tag: [a for a in tag.attrs if a[0].startswith('custom-')])

After running the above code to find all elements with attributes starting with custom-, I ended up with a list of the following two elements:

[
Click
, Jump]
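
For anyone on BeautifulSoup 4 and Python 3, the same search could be sketched roughly like this. It assumes the bs4 package is installed, where findAll() is spelled find_all() and tag.attrs is a dict instead of a list of pairs. The markup and the attribute names custom-click and custom-jump are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the attribute names custom-click and custom-jump
# are assumptions chosen only to illustrate the "custom-" prefix.
html = (
    '<div custom-click="Clicker">Click</div>'
    '<a href="#" custom-jump="Jumper">Jump</a>'
    '<p class="plain">Ignored</p>'
)
soup = BeautifulSoup(html, "html.parser")


def has_custom_attr(tag):
    # In bs4, tag.attrs is a dict, so iterating over it yields attribute names.
    return any(name.startswith("custom-") for name in tag.attrs)


# A callable passed as the first argument is called once per tag;
# tags for which it returns a truthy value are kept.
elems = soup.find_all(has_custom_attr)
texts = [el.get_text() for el in elems]
print(texts)
```

A named function is used here instead of a lambda only for readability; a lambda with the same body should behave the same way.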

To get a list of all of the attribute values without traversing the attributes of each returned element a second time, I decided to add a little bit to the lambda function so that it appends each matching value to a list as it goes:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('
Click
Jump
')
custom_values = []
soup.findAll(lambda tag: [custom_values.append(a[1]) for a in tag.attrs if a[0].startswith('custom-')])
print custom_values

That resulted in this:

[u'Clicker', u'Jumper']
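
If appending to a list from inside the filter feels too clever, one alternative is to walk every tag and collect the values with a plain list comprehension. This is only a sketch, assuming BeautifulSoup 4 (the bs4 package) on Python 3; the markup and the attribute names custom-click and custom-jump are invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the attribute names are assumptions for illustration.
html = (
    '<div custom-click="Clicker">Click</div>'
    '<a href="#" custom-jump="Jumper">Jump</a>'
    '<p class="plain">Ignored</p>'
)
soup = BeautifulSoup(html, "html.parser")

# find_all(True) matches every tag in document order; tag.attrs is a dict
# in bs4, so .items() yields (name, value) pairs.
custom_values = [
    value
    for tag in soup.find_all(True)
    for name, value in tag.attrs.items()
    if name.startswith("custom-")
]
print(custom_values)  # ['Clicker', 'Jumper']
```

The filter no longer has a side effect, at the cost of visiting tags that have no matching attribute at all.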

Pretty simple, right?
