Python多线程中锁与阻塞的使用误区
[Python]
关于阻塞主线程
join的错误用法
Thread.join()
作用为阻塞主线程,即在子线程未返回的时候,主线程等待其返回然后再继续执行. join不能与start在循环里连用.
以下为错误代码,代码创建了5个线程,然后用一个循环激活线程,激活之后令其阻塞主线程.
threads = [Thread() for i in range(5)]
for thread in threads:
thread.start()
thread.join()
执行过程:
1. 第一次循环中,主线程通过start函数激活线程1,线程1进行计算.
2. 由于start函数不阻塞主线程,在线程1进行运算的同时,主线程向下执行join函数.
3. 执行join之后,主线程被线程1阻塞,在线程1返回结果之前,主线程无法执行下一轮循环.
4. 线程1计算完成之后,解除对主线程的阻塞.
5. 主线程进入下一轮循环,激活线程2并被其阻塞…
如此往复,可以看出,本来应该并发的五个线程,在这里变成了顺序队列,效率和单线程无异.
join的正确用法
使用两个循环分别处理start
和join
函数.即可实现并发.
threads = [Thread() for i in range(5)]
for thread in threads:
thread.start()
for thread in threads:
thread.join()
time.sleep代替join进行调试
之前在一些项目里看到过这样的代码,使用time.sleep代替join手动阻塞主线程.
在所有子线程返回之前,主线程陷入无线循环而不能退出.
for thread in threads:
thread.start()
while 1:
if thread_num == 0:
break
time.sleep(0.01)
关于线程锁(threading.Lock)
单核CPU+PIL是否还需要锁?
非原子操作 count = count + 1
理论上是线程不安全的. 使用3个线程同时执行上述操作改变全局变量count的值,并查看程序执行结果. 如果结果正确,则表示未出现线程冲突.
使用以下代码测试
# -*- coding: utf-8 -*-
import threading
import time
count = 0
class Counter(threading.Thread):
def __init__(self, name):
self.thread_name = name
super(Counter, self).__init__(name=name)
def run(self):
global count
for i in xrange(100000):
count = count + 1
counters = [Counter('thread:%s' % i) for i in range(5)]
for counter in counters:
counter.start()
time.sleep(5)
print 'count=%s' % count
运行结果:count=275552
事实上每次运行结果都不相同且不正确,这证明单核CPU+PIL仍无法保证线程安全,需要加锁.
加锁后的正确代码:
# -*- coding: utf-8 -*-
import threading
import time
count = 0
lock = threading.Lock()
class Counter(threading.Thread):
def __init__(self, name):
self.thread_name = name
self.lock = threading.Lock()
super(Counter, self).__init__(name=name)
def run(self):
global count
global lock
for i in xrange(100000):
lock.acquire()
count = count + 1
lock.release()
counters = [Counter('thread:%s' % i) for i in range(5)]
for counter in counters:
counter.start()
time.sleep(5)
print 'count=%s' % count
结果: count=500000
注意锁的全局性
这是一个简单的Python语法问题,但在逻辑复杂时有可能被忽略. 要保证锁对于多个子线程来说是共用的,即不要在Thread的子类内部创建锁.
以下为错误代码
# -*- coding: utf-8 -*-
import threading
import time
count = 0
# lock = threading.Lock() # 正确的声明位置
class Counter(threading.Thread):
def __init__(self, name):
self.thread_name = name
self.lock = threading.Lock() # 错误的声明位置
super(Counter, self).__init__(name=name)
def run(self):
global count
for i in xrange(100000):
self.lock.acquire()
count = count + 1
self.lock.release()
counters = [Counter('thread:%s' % i) for i in range(5)]
for counter in counters:
print counter.thread_name
counter.start()
time.sleep(5)
print 'count=%s' % count