I l@ve RuBoard Previous Section Next Section

1.18 Creating Lists of Lists Without Sharing References

Credit: David Ascher

1.18.1 Problem

You want to create a multidimensional list, but the apparently simplest solution is fraught with surprises.

1.18.2 Solution

Use list comprehensions (also known as list displays) to avoid implicit reference sharing:

multilist = [[0 for col in range(5)] for row in range(10)]

1.18.3 Discussion

When a newcomer to Python is shown the power of the multiplication operation on lists, he often gets quite excited about it, since it is such an elegant notation. For example:

>>> [0] * 5
[0, 0, 0, 0, 0]

The problem is that one-dimensional problems often grow a second dimension, so there is a natural progression to:

>>> multi = [[0] * 5] * 3
>>> print multi
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

This appears to have worked, but the same newcomer is then often puzzled by bugs, which typically can be boiled down to the following test:

>>> multi[0][0] = 'Changed!'
>>> print multi
[['Changed!', 0, 0, 0, 0], ['Changed!', 0, 0, 0, 0], ['Changed!', 0, 0, 0, 0]]

This problem definitely confuses most programmers at least once, if not a few times (see the FAQ entry at http://www.python.org/doc/FAQ.html#4.50). To understand it, it helps to decompose the creation of the multidimensional list into two steps:

>>> row = [0] * 5          # a list with five references to 0
>>> multi = [row] * 3      # a list with three references to the row object

The problem still exists in this version (Python is not that magical). The comments are key to understanding the source of the confusion. The process of multiplying a sequence by a number creates a new sequence with the specified number of new references to the original contents. In the case of the creation of row, it doesn't matter whether references are being duplicated or not, since the referent (the object being referred to) is immutable. In other words, there is no difference between an object and a reference to an object if that object is immutable. In the second line, however, what is created is a new list containing three references to the contents of the [row] list, which is a single reference to a list. Thus, multi contains three references to a single object. So when the first element of the first element of multi is changed, you are actually modifying the first element of the shared list. Hence the surprise.

List comprehensions, added in Python 2.2, provide a nice syntax that avoids the problem, as illustrated in the solution. With list comprehensions, there is no sharing of references�it's a truly nested computation. Note that the performance characteristics of the solution are O(M x N), meaning that it will scale with each dimension. The list-multiplication idiom, however, is an O(M) computation, as it doesn't really do duplications.

1.18.4 See Also

Documentation for the range built-in function in the Library Reference.

    I l@ve RuBoard Previous Section Next Section