Category Archives: Python

Python Quirks – List Concatenation or Mutation?

If you execute the following code, what do you think will be the result?

list1 = [1]
list2 = list1

list2 += [2, 3]
assert list1 == list2, '{} != {}'.format(list1, list2)

list1 = list1 + [4, 5]
assert list1 == list2, '{} != {}'.format(list1, list2)

Will an assert error occur? The answer is yes! The reason why is because list1 = list1 + list2 is actually different from list1 += list2 in Python. Let’s see what happened when I ran the above code in the command prompt:

>>> list1 = [1]
>>> list2 = list1
>>> 
>>> list2 += [2, 3]
>>> assert list1 == list2, '{} != {}'.format(list1, list2)
>>> 
>>> list1 = list1 + [4, 5]
>>> assert list1 == list2, '{} != {}'.format(list1, list2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: [1, 2, 3, 4, 5] != [1, 2, 3]

As is shown above, using `+=` is not assigning a concatenation of the two lists, but actually mutating the left-hand side list (list1). In other words, list1 += list2 is the same as list1.extend(list2). Interesting, right? :cool:

Python Snippet – Get List Item or Get Default

One of the many cool things about Python is that you can often use builtin functions to either get a value from a dictionary (or an object with getattr) or default to a specified value. Unfortunately lists do not provide such a function. The simplest use case would be to get the first value out of any list if it exists and if not simply return None. After fiddling around with a few different implementations, I landed on this one:

def get_at(list, index, default=None):
  return list[index] if max(~index, index) < len(list) else default

The above function can be used in order to either return the value at the specified index or a default value if the index doesn't exist:

lists = [[], ['Hello world!!!'], range(10)[3:]]

for list in lists:
  print 'list = {}'.format(list)
  print '- first value: {}'.format(get_at(list, 0))
  print '- second value: {}'.format(get_at(list, 1))
  print '- last value: {}'.format(get_at(list, -1))
  print '- 5th value (defaults to "missing"): {}'.format(get_at(list, 4, 'missing'))
  print ''

The above code outputs the following:

list = []
- first value: None
- second value: None
- last value: None
- 5th value (defaults to "missing"): missing

list = ['Hello world!!!']
- first value: Hello world!!!
- second value: None
- last value: Hello world!!!
- 5th value (defaults to "missing"): missing

list = [3, 4, 5, 6, 7, 8, 9]
- first value: 3
- second value: 4
- last value: 9
- 5th value (defaults to "missing"): 7

You may be thinking that there must be another way to do this. There are in fact various other ways to do this. One of them is to use `next(iter(list[index:]), default)` as the return value. The main reason I landed on the solution that uses max() and a ternary statement is because I wanted to minimize the amount of objects created. Using next() along with iter() requires a generator to be created for the sliced list. On the other hand, Python handles all of this pretty efficiently so its possible that the difference between the two solutions is minimal. Either way, now I have a succinct helper function that either gets me a list item or the default value I specify. :cool:

Python – BeautifulSoup – Find All with Lambda Function for Attributes

Today, I had to figure out a way to parse an HTML string in Python in order to find all of the attribute values of attributes starting with a specific string. Since we already have BeautifulSoup installed, I started researching how to use a lambda function in conjunction with the attrs argument of BeautifulSoup#findAll(). Unfortunately, I didn’t figure out a way to use a callable with the attrs argument, but I did with the name:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('
Click
Jump
') elems = soup.findAll(lambda tag:[a for a in tag.attrs if a[0].startswith('custom-')])

After running the above code to find all elements with attributes starting with custom-, I ended up with a list of the following two elements:

[
Click
, Jump]

In order to get a list of all of the attribute values, instead of traversing the attributes of each element returned, I decided to just add a little bit to the lambda function:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('
Click
Jump
') custom_values = [] soup.findAll(lambda tag:[custom_values.append(a[1]) for a in tag.attrs if a[0].startswith('custom-')]) print custom_values

That resulted in this:

[u'Clicker', u'Jumper']

Pretty simple, right! :cool: