About processes:
An executing instance of a program is called a process.
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
Every program running on your computer is a separate process; for example, QQ, Word, and Firefox each run as their own process.
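To see a few of the per-process attributes listed above from inside Python, here is a minimal sketch using only the standard library (the attributes printed are just a sample of those mentioned, not an exhaustive list):

```python
import os
import threading

# Every running Python interpreter is itself a process with these attributes.
print("process id:", os.getpid())              # unique process identifier
print("parent process id:", os.getppid())      # the process that started us
print("PATH env var:", os.environ.get("PATH", "")[:40], "...")  # environment variables
print("threads in this process:", threading.active_count())    # at least one: the main thread
```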
About threads:
A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.
Suppose you're reading a book, and you want to take a break right now, but you want to be able to come back and resume reading from the exact point where you stopped. One way to achieve that is by jotting down the page number, line number, and word number. So your execution context for reading a book is these 3 numbers.
If you have a roommate, and she's using the same technique, she can take the book while you're not using it, and resume reading from where she stopped. Then you can take it back, and resume it from where you were.
Threads work in the same way. A CPU is giving you the illusion that it's doing multiple computations at the same time. It does that by spending a bit of time on each computation. It can do that because it has an execution context for each computation. Just like you can share a book with your friend, many tasks can share a CPU.
On a more technical level, an execution context (therefore a thread) consists of the values of the CPU's registers.
Last: threads are different from processes. A thread is a context of execution, while a process is a bunch of resources associated with a computation. A process can have one or many threads.
Clarification: the resources associated with a process include memory pages (all the threads in a process have the same view of the memory), file descriptors (e.g., open sockets), and security credentials (e.g., the ID of the user who started the process).
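To make the book analogy concrete, here is a small sketch (the book, bookmark, and reader names are invented for illustration): two threads share one "book" (a list held in shared process memory), but each keeps its own "bookmark" (its own execution context) and resumes from where it left off.

```python
import threading
import time

book = ["page %d" % i for i in range(1, 7)]   # shared resource: both threads see the same book

def reader(name):
    bookmark = 0                    # private to this thread: its own "execution context"
    while bookmark < len(book):
        print(name, "reads", book[bookmark])
        bookmark += 1               # remember where we stopped
        time.sleep(0.1)             # give the other reader a turn

t1 = threading.Thread(target=reader, args=("you",))
t2 = threading.Thread(target=reader, args=("roommate",))
t1.start(); t2.start()
t1.join(); t2.join()
```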
Differences between processes and threads:
- Threads share the address space of the process that created them; processes have their own address space (a short sketch illustrating this follows the list).
- Threads have direct access to the data segment of their process; processes have their own copy of the data segment of the parent process.
- Threads can directly communicate with other threads of their process; processes must use interprocess communication to communicate with sibling processes.
- New threads are easily created; new processes require duplication of the parent process.
- Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
- Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
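The first two points are easy to demonstrate with the standard library. Below is a minimal sketch (the counter variable and the bump helper are made up for illustration): a thread started with threading sees a change made to a module-level variable, while a child process started with multiprocessing modifies only its own copy.

```python
import threading
import multiprocessing

counter = 0

def bump():
    global counter
    counter += 1

if __name__ == '__main__':
    t = threading.Thread(target=bump)
    t.start(); t.join()
    print("after thread:", counter)    # 1 -- the thread shares our address space

    p = multiprocessing.Process(target=bump)
    p.start(); p.join()
    print("after process:", counter)   # still 1 -- the child modified its own copy
```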
The Python threading module:
There are two ways to create and start threads, as shown below:
Direct invocation
```python
import threading
import time

def run(n):  # the function each thread will run
    print("running on number:%s" % n)
    time.sleep(1)

t1 = threading.Thread(target=run, args=(1,))  # create a thread instance
t2 = threading.Thread(target=run, args=(2,))  # create another thread instance
t1.start()  # start the first thread
t2.start()  # start the second thread
print(t1.getName())  # get the thread name
print(t2.getName())
```
Invocation by subclassing
```python
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num

    def run(self):  # the function each thread will run
        print("running on number:%s" % self.num)
        time.sleep(3)

if __name__ == '__main__':
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()
```
Join & Daemon
Some threads do background tasks, like sending keepalive packets, or performing periodic garbage collection, or whatever. These are only useful when the main program is running, and it's okay to kill them off once the other, non-daemon, threads have exited.
Without daemon threads, you'd have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.
The following example uses join(): the main thread waits for t1 and t2 to finish before printing 'ending'.

```python
import threading
import time

def hi(num):
    print("hello %d" % num)
    time.sleep(3)

t1 = threading.Thread(target=hi, args=(10,))
t2 = threading.Thread(target=hi, args=(9,))
t1.start()
t2.start()
t1.join()  # the main thread waits here for t1 to finish
t2.join()  # and then for t2
print('ending')
```
The next example marks both threads as daemons, so the main thread does not wait for them: as soon as it finishes, the program exits and the daemon threads are killed.

```python
import threading
import time

def music():
    print('begin to listen %s' % time.ctime())
    time.sleep(3)
    print('end to listen %s' % time.ctime())

def game():
    print('begin to game %s' % time.ctime())
    time.sleep(10)
    print('end to game %s' % time.ctime())

t1 = threading.Thread(target=music)
t2 = threading.Thread(target=game)
t1.setDaemon(True)  # mark as a daemon thread: it is killed when the main thread exits
t2.setDaemon(True)
t1.start()
t2.start()
print(threading.active_count())
print('----end----')
```
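As a side note, on Python 3 the same effect can be achieved by passing the daemon flag to the Thread constructor instead of calling setDaemon() (which newer Python versions flag as deprecated); a minimal sketch:

```python
import threading, time

def music():
    time.sleep(3)

t1 = threading.Thread(target=music, daemon=True)  # daemon flag set at construction time
t1.start()
print('----end----')  # the program can exit here without waiting for t1
```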
Python GIL (global interpreter lock)
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)
Even with the GIL, a multithreaded computation can still produce a wrong result. The problem is that thread1 may be switched out before it finishes writing its first result; by the time thread1 runs again, the value of count has already been changed by another thread, and because thread1 keeps computing with the value it read earlier instead of re-reading it, that update is lost and the result is wrong. The thread lock exists to solve exactly this problem.
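The figure from the original post is not reproduced here, so a minimal sketch of the same race condition may help (the counter name and iteration counts are arbitrary; whether the final value actually comes out wrong depends on the interpreter version and the thread switch interval):

```python
import threading
import sys

sys.setswitchinterval(0.0001)  # switch threads more often to make the race easier to observe

count = 0

def worker():
    global count
    for _ in range(100000):
        tmp = count     # read
        tmp += 1        # compute with a possibly stale value
        count = tmp     # write back, possibly overwriting another thread's update

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # often less than 400000: updates were lost despite the GIL
```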
Python thread lock (threading.Lock)
```python
import threading, time

lock = threading.Lock()  # create a lock instance

def run(n):
    lock.acquire()   # acquire the lock
    global num
    num += 1
    time.sleep(2)    # note: sleeping while holding the lock forces the 1000 threads to run one at a time
    lock.release()   # release the lock
    # print("this is %s" % n)

num = 0
obj = []
for i in range(1000):
    t = threading.Thread(target=run, args=(i,))
    t.start()
    obj.append(t)

for t in obj:
    t.join()

print("num:%s" % num)
```
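A follow-up note: threading.Lock also works as a context manager, so the acquire/release pair above is usually written as a with block, which releases the lock even if the critical section raises an exception. A self-contained sketch of the same counter:

```python
import threading

lock = threading.Lock()
num = 0

def run(n):
    global num
    with lock:        # acquired on entry, released on exit, even on exceptions
        num += 1

threads = [threading.Thread(target=run, args=(i,)) for i in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("num:%s" % num)  # always 1000
```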
Recursive lock (threading.RLock)
```python
# recursive lock
import threading, time

def run1():
    lock.acquire()
    print('run1 acquired the lock')
    global num
    num += 1
    lock.release()
    print('run1 released the lock')
    return num

def run2():
    lock.acquire()
    print('run2 acquired the lock')
    global num2
    num2 += 1
    lock.release()
    print('run2 released the lock')
    return num2

def run3():
    lock.acquire()   # acquire the outer lock in run3
    print('run3 acquired the lock')
    res = run1()     # acquire and release the same lock again inside run1
    print('--------between run1 and run2-----')
    res2 = run2()    # acquire and release the same lock again inside run2
    lock.release()   # release the outer lock held by run3
    print('run3 released the lock')

num, num2 = 0, 0
lock = threading.RLock()  # the lock type is RLock (recursive lock)

for i in range(1):
    t = threading.Thread(target=run3)  # start run3
    t.start()

while threading.active_count() != 1:
    print("current thread count", threading.active_count())
else:
    print('----all threads done---')
    print(num, num2)
```
With lock = threading.RLock(), the output looks like this:
```
run3 acquired the lock
run1 acquired the lock
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
run1 released the lock
--------between run1 and run2-----
run2 acquired the lock
run2 released the lock
run3 released the lock
----all threads done---
1 1
```
With lock = threading.Lock(), run1 tries to acquire a lock that run3 (the same thread) is still holding; a plain Lock is not re-entrant, so the thread deadlocks and the monitoring loop prints forever:
```
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
current thread count 2
....
```
Semaphore
A mutex allows only one thread at a time to modify shared data, while a Semaphore allows a fixed number of threads to do so at the same time. For example, a MySQL server only accepts a limited number of concurrent connections, and a socket server only handles so many concurrent clients; a semaphore can be used to cap the number of threads in such cases.
```python
import threading, time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s " % n)
    semaphore.release()

if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(5)  # allow at most 5 threads to run at the same time
    for i in range(20):
        t = threading.Thread(target=run, args=(i,))
        t.start()

while threading.active_count() != 1:
    pass  # print(threading.active_count())
else:
    print('----all threads done---')
```
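A semaphore also supports the with statement. As a hedged sketch of the connection-limiting idea mentioned above (the max_connections name and the simulated work are made up for illustration):

```python
import threading, time

max_connections = threading.BoundedSemaphore(3)  # pretend the server allows only 3 concurrent connections

def handle_client(client_id):
    with max_connections:            # blocks until one of the 3 "slots" is free
        print("client %d connected" % client_id)
        time.sleep(0.5)              # simulate some work on the connection
        print("client %d done" % client_id)

for i in range(10):
    threading.Thread(target=handle_client, args=(i,)).start()
```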
Event
An event is a simple synchronization object; the event represents an internal flag, and threads can wait for the flag to be set, or set or clear the flag themselves.

```python
event = threading.Event()

# a client thread can wait for the flag to be set
event.wait()

# a server thread can set or reset it
event.set()
event.clear()
```

If the flag is set, the wait method doesn't do anything. If the flag is cleared, wait will block until it becomes set again. Any number of threads may wait for the same event.
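To make the wait/set fragment above runnable, here is a minimal self-contained sketch (the waiter and setter names are invented): one thread blocks in wait() until another thread sets the flag.

```python
import threading
import time

event = threading.Event()

def waiter():
    print("waiter: waiting for the flag")
    event.wait()                      # blocks until the flag is set
    print("waiter: flag is set, continuing")

def setter():
    time.sleep(1)                     # do something first
    print("setter: setting the flag")
    event.set()                       # wakes up every thread blocked in wait()

t1 = threading.Thread(target=waiter)
t2 = threading.Thread(target=setter)
t1.start(); t2.start()
t1.join(); t2.join()
```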
An Event can be used to coordinate two or more threads. Below is a traffic-light example: one thread acts as the traffic light and several other threads act as cars; the cars stop on red and drive on green.
```python
import threading, time
import random

def light():
    if not event.isSet():
        event.set()  # when the flag is set, wait() does not block: green-light state
    count = 0
    while True:
        if count < 10:
            print('
```
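The snippet above is cut off in the source, so, as a hedged sketch only, here is one way the described traffic light could be completed (the function names light and car, the timings, and the number of cars are all assumptions, not the original author's code):

```python
import threading
import time

event = threading.Event()

def light():
    count = 0
    event.set()                      # start with a green light
    while True:
        if count < 10:
            print("--green light on--")
        elif count < 13:
            print("--yellow light on--")
        elif count < 20:
            event.clear()            # red light: cars will block in wait()
            print("--red light on--")
        else:
            count = 0
            event.set()              # back to green
        time.sleep(1)
        count += 1

def car(name):
    while True:
        if event.is_set():           # green: drive
            print("[%s] running..." % name)
            time.sleep(1)
        else:                        # red: stop and wait for green
            print("[%s] sees the red light, waiting..." % name)
            event.wait()
            print("[%s] green light, go!" % name)

if __name__ == '__main__':
    threading.Thread(target=light, daemon=True).start()
    for i in range(3):
        threading.Thread(target=car, args=("car %d" % i,), daemon=True).start()
    time.sleep(35)                   # let the demo run for a while, then exit (daemon threads are killed)
```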