Best and/or fastest way to create lists in python

In python, as far as I know, there are at least three or four ways to create and initialize lists of a given size:

Simple loop append:

my_list = []
for i in range(50):
    my_list.append(0)

Simple loop +=:

my_list = []
for i in range(50):
    my_list += [0]

List understanding:

my_list = [0 for i in range(50)]

List and integer multiplication:

my_list = [0] * 50

In these examples, I think there is any performance difference between only 50 elements in a list, but what if I need a list of 1 million elements? Will there be any improvement in using xrange? Which is the preferred / fastest way to create and initialize lists in python?

 

Solution

Let's run some time tests.* timeit.timeit:

>>> from timeit import timeit
>>>
>>> # Test 1
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list.append(0)
... """
>>> timeit(test)
22.384258893239178
>>>
>>> # Test 2
>>> test = """
... my_list = []
... for i in xrange(50):
...     my_list += [0]
... """
>>> timeit(test)
34.494779364416445
>>>
>>> # Test 3
>>> test = "my_list = [0 for i in xrange(50)]"
>>> timeit(test)
9.490926919482774
>>>
>>> # Test 4
>>> test = "my_list = [0] * 50"
>>> timeit(test)
1.5340533503559755
>>>

As you can see, the last method is by far the fastest.

However, it should use only invariable items, such as integers. This is because it will create a list containing references to the same item.

The following is a demonstration:

>>> lst = [[]] * 3
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are the same
>>> id(lst[0])
28734408
>>> id(lst[1])
28734408
>>> id(lst[2])
28734408
>>>

This behavior is usually undesirable and can lead to errors in the code.

If you have variables (such as lists), you should use lists that are still very fast to understand:

>>> lst = [[] for _ in xrange(3)]
>>> lst
[[], [], []]
>>> # The ids of the items in `lst` are different
>>> id(lst[0])
28796688
>>> id(lst[1])
28796648
>>> id(lst[2])
28736168
>>>

* Note: In all tests, I replaced range with xrange. Since the latter returns to the iterator, it should always be faster than the former.

 

This article starts from Python Black Hole Network Synchronized updates of blog Parks

Keywords: Python network

Added by rawisjp on Sat, 05 Oct 2019 04:31:14 +0300