Select Page

blog # Pool.map is Handy.

Processes and queues are the way to go to scale software performance. But I also recently found out about pool.map, which allows you to write less code (you don’t have to do anything with Queue) and get similar effects.

The basic usage is `pool.map(some_func, some_iterable)` and it works like `map`, returning a list of the results. So,`pool.map(is_prime, [2, 3, 4, 5])` would return `[True, True, False, True]`. There’s also an `itertools.imap` equivalent which returns an iterable and lets you work on any size iterable you want.

Given a `primes_for_list` function that returns a list containing the primes in the supplied list. And a `chunks` function which splits a list into sublists of size n:

### Calculating Primes With pool.map

``````def use_pool_for_primes(l, chunk_size=1000):
Finds primes in the provided list and returns a list of
those primes in the order provided.
Uses pool.map
pool = multiprocessing.Pool()
parts = pool.map(primes_for_list, chunks(l, chunk_size))
return list(itertools.chain(*parts))``````
``````\$ python use_of_pool.py multi
78498 primes &< 1000000. Calculated in 2.426864s
148933 primes &< 2000000. Calculated in 6.764274s
283146 primes &< 4000000. Calculated in 19.203458s
412849 primes &< 6000000. Calculated in 38.047622s
539777 primes &< 8000000. Calculated in 60.967163s
664579 primes &< 10000000. Calculated in 87.283632s``````

### Using Unordered Map

`imap_unordered` (and then sorting them), seems to squeeze a little more performance out:

``````def use_pool_unordered_for_primes(l, chunk_size=1000):
"""
Finds primes in the provided list and returns an ordered list
of those primes.
Uses pool.imap_unordered
"""
pool = multiprocessing.Pool()
parts = pool.imap_unordered(primes_for_list,
chunks(l, chunk_size))
return sorted(list(itertools.chain(*parts)))``````
``````\$ python use_of_pool.py unordered
78498 primes &< 1000000. Calculated in 2.222123s
148933 primes &< 2000000. Calculated in 6.13618s
283146 primes &< 4000000. Calculated in 17.588184s
412849 primes &< 6000000. Calculated in 35.159868s
539777 primes &< 8000000. Calculated in 56.629323s
664579 primes &< 10000000. Calculated in 75.582824s``````

### How Fast Would It Be In A Single Process?

``````def primes_for_list(l):
return [i for i in l if is_prime(i)]``````
``````78498 primes &< 1000000. Calculated in 9.630087s
148933 primes &< 2000000. Calculated in 26.764138s
283146 primes &< 4000000. Calculated in 76.268092s
412849 primes &< 6000000. Calculated in 139.234455s
539777 primes &< 8000000. Calculated in 217.067968s
664579 primes &< 10000000. Calculated in 302.099861s``````

### The Full Code

``````from datetime import datetime
import itertools
import multiprocessing
import sys
def is_prime(n):
if n < 2:
return False
if n < 4:
return True
check_factors_max = int(n ** 0.5) + 1
for i in range(2, check_factors_max):
is_factor = n % i == 0
if is_factor:
return False
return True
def chunks(l, chunk_size=1000):
i = 0
keep_going = True
while keep_going:
start = i
end = i + chunk_size
if end > len(l):
end = len(l)
keep_going = False
yield l[start:end]
i += chunk_size
def primes_for_list(l):
return [i for i in l if is_prime(i)]
def use_pool_for_primes(l, chunk_size=1000):
"""
Finds primes in the provided list and returns a list of those primes in the
order provided.
Uses pool.map
"""
pool = multiprocessing.Pool()
parts = pool.map(primes_for_list, chunks(l, chunk_size))
return list(itertools.chain(*parts))
def use_pool_unordered_for_primes(l, chunk_size=1000):
"""
Finds primes in the provided list and returns an ordered list of those
primes.
Uses pool.imap_unordered
"""
pool = multiprocessing.Pool()
parts = pool.imap_unordered(primes_for_list, chunks(l, chunk_size))
return sorted(list(itertools.chain(*parts)))
if __name__ == '__main__':
method = sys.argv
method_dict = {
'one': primes_for_list,
'multi': use_pool_for_primes,
'unordered': use_pool_unordered_for_primes,
}
for n in range(1000000, 10000001, 1000000):
before = datetime.now()
primes = method_dict[method](range(n))
seconds_elapsed = (datetime.now() - before).total_seconds()
print '%s primes < %s. Calculated in %ss' % (len(primes), n,
seconds_elapsed)``````

and I tested it with these tests:

``````class TestPrimes(object):
primes_lt_1000 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113,
127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191,
193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263,
269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347,
349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421,
431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499,
503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593,
599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661,
673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757,
761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853,
857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941,
947, 953, 967, 971, 977, 983, 991, 997,]
def test_primes_is_accurate(self):
primes = primes_for_list(range(1000))
assert(primes == TestPrimes.primes_lt_1000)
def test_use_pool_for_primes(self):
primes = use_pool_for_primes(range(1000), 100)
assert(primes == TestPrimes.primes_lt_1000)
def test_use_pool_unordered_for_primes(self):
primes = use_pool_unordered_for_primes(range(1000), 100)
assert(primes == TestPrimes.primes_lt_1000)
def test_chunks(self):
l = [1, 2, 3, 4, 5, 6, 7]
assert([[1, 2], [3, 4], [5, 6], ] == [c for c in chunks(l, 2)])
assert([[1, 2, 3, 4, 5, 6, 7]] == [c for c in chunks(l, 8)])``````

So there you have it `pool.map` is a simple way to use all those cores. This almost devolved into a how big a game of life can we compute, but maybe I’ll do that next time.

+ more

## The 5 W’s of Quality Assurance

Often, QA is thought of as just testing. In fact, it’s that and much more; testing is just one part of the entire process. The QA process covers all of the steps that are necessary to ensure that the product meets the stakeholder expectations and provides a positive...