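1. Generators
A generator is an object that returns one value each time next() is called on it, until it raises a StopIteration exception.
Generator objects generally come in two forms: objects that are themselves generators, and functions that contain a yield statement, which can simply be thought of as generators (generator functions). The yield statement does two things: like return, it hands a value back to the caller, and it also saves the interpreter's reference to the stack, so that the next call resumes from the state where the last yield left off.
Here is a simple generator function: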
    def mygenerator():
        yield 1
        yield 2
        yield 3
        yield 4

    print(mygenerator())  # prints the generator object, not the values

    g = mygenerator()
    print(next(g))
    print(next(g))
    print(next(g))
    print(next(g))
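To make the pause-and-resume behaviour of yield visible, here is a minimal sketch written for Python 3. The names `traced_generator` and `log` are made up for this illustration; the list simply records when each part of the function body actually runs.

```python
log = []  # hypothetical helper: records the order in which the body runs


def traced_generator():
    log.append("started")          # runs only on the first next()
    yield 1
    log.append("resumed after 1")  # runs on the second next()
    yield 2
    log.append("finished")         # runs on the third next(), then StopIteration


g = traced_generator()
log_before = list(log)             # calling the function runs none of its body

first = next(g)                    # runs up to the first yield
second = next(g)                   # resumes right after "yield 1"

try:
    next(g)                        # the body ends, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted, log)
```

Nothing in the body executes until the first next() call, and each later call continues from the statement after the previous yield.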
To check whether a function is a generator function, use the helpers in the inspect module:
    import inspect

    inspect.isgeneratorfunction(mygenerator)  # True
    inspect.isgenerator(mygenerator())        # True
The source of inspect.isgeneratorfunction (as found in Python 2) is:
    def isgeneratorfunction(object):
        """Return true if the object is a user-defined generator function.

        Generator function objects provides same attributes as functions.

        See help(isfunction) for attributes listing."""
        return bool((isfunction(object) or ismethod(object)) and
                    object.func_code.co_flags & CO_GENERATOR)
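The same flag test can be tried by hand. The sketch below assumes Python 3, where the code object is reached through `__code__` rather than `func_code`. The functions `countdown`, `plain` and `has_generator_flag` are hypothetical names used only for this illustration.

```python
import inspect


def countdown(n):
    # hypothetical generator function used only for this illustration
    while n > 0:
        yield n
        n -= 1


def plain(n):
    # ordinary function, for comparison
    return n


def has_generator_flag(func):
    """Check the CO_GENERATOR bit on the function's code object (Python 3 spelling)."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


flag_results = (has_generator_flag(countdown), has_generator_flag(plain))
inspect_results = (inspect.isgeneratorfunction(countdown),
                   inspect.isgeneratorfunction(plain))
print(flag_results, inspect_results)
```

For these two plain functions the manual bit test and inspect.isgeneratorfunction agree.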
In Python 3, inspect.getgeneratorstate returns the current execution state of a generator. The possible states include GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED and GEN_CLOSED.
    >>> import inspect
    >>> def mygenerator():
    ...     yield 1
    ...
    >>> get = mygenerator()
    >>> get
    <generator object mygenerator at 0x7ff49fdfe4c0>
    >>> inspect.getgeneratorstate(get)
    'GEN_CREATED'
    >>> next(get)
    1
    >>> inspect.getgeneratorstate(get)
    'GEN_SUSPENDED'
    >>> next(get)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>>
    >>> inspect.getgeneratorstate(get)
    'GEN_CLOSED'
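The session above never shows GEN_RUNNING, because that state is only visible while the generator body is executing. As a rough sketch (Python 3; `self_inspecting` and `seen` are hypothetical names), a generator can query its own state from inside its body:

```python
import inspect

seen = []  # hypothetical list collecting the states we observe


def self_inspecting():
    # 'gen' is looked up when the body runs, so it refers to the object created below
    seen.append(inspect.getgeneratorstate(gen))  # observed while the body is executing
    yield 1


gen = self_inspecting()
seen.append(inspect.getgeneratorstate(gen))  # before the first next()
next(gen)
seen.append(inspect.getgeneratorstate(gen))  # paused at the yield
gen.close()
seen.append(inspect.getgeneratorstate(gen))  # after close()
print(seen)
```

Calling close() is another way to reach GEN_CLOSED, besides running the generator to exhaustion.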
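Generators are an efficient way to handle large, memory-hungry data that is produced on the fly. Handled the ordinary way, all of the data is loaded into memory at once, which is very expensive; with a generator, each item is created in memory only when the loop actually reaches it.
Here the memory available to Python is limited to 128 MB: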
    [root@linux-node1 ~]# python
    Python 2.7.5 (default, Nov  6 2016, 00:28:07)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
    >>> a = list(range(10000000))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError
    # out of memory

    # iterate lazily with xrange, which does not build the whole list
    >>> for value in xrange(10000000):
    ...     if value == 50000:
    ...         print ("found it")
    ...         break
    ...
    found it
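The same early-exit search can be written with an actual generator function. This is only a sketch for Python 3; `count_up_to` and `find` are hypothetical helpers, not part of any library.

```python
import sys


def count_up_to(limit):
    """Hypothetical generator: yields 0, 1, ..., limit - 1 one value at a time."""
    n = 0
    while n < limit:
        yield n
        n += 1


def find(values, target):
    """Return True as soon as target is seen; stops consuming values early."""
    for value in values:
        if value == target:
            return True
    return False


lazy = count_up_to(10000000)
found = find(lazy, 50000)   # values past 50000 are never created
next_value = next(lazy)     # the generator is still paused right after 50000

# A generator object stays small no matter how many values it can produce
small_list = list(range(1000))
generator_is_smaller = sys.getsizeof(count_up_to(1000)) < sys.getsizeof(small_list)
print(found, next_value, generator_is_smaller)
```

Because the search returns as soon as it finds the target, the remaining values are never produced at all.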
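A generator object has a send() method, which passes a value into the generator function; the value becomes the result of the paused yield expression. The example below uses this to get a concurrency-like effect within a single thread: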
    #!/usr/bin/env python
    # _*_ coding:utf-8 _*_
    __author__ = 'Charles Chang'

    # A producer makes baozi (steamed buns) and two consumers eat them
    import time


    def consumer(name):
        print("\033[32m;32%s ready to eat baozi\033[0m" % name)
        while True:
            baozi = yield  # receives the value passed to send()
            print("\033[31m baozi [%s] is coming,eaten by [%s]!\033[0m" % (baozi, name))


    li = []


    def producer(name):
        c = consumer('A')   # c and c2 are both generators
        c2 = consumer('B')
        next(c)             # advance each generator to its first yield
        next(c2)
        print("\033[31m begin to eat baozi\033[0m")
        while True:
            time.sleep(1)
            print("two baozi have been done")
            c.send("delicious")   # "delicious" is passed into consumer and assigned to baozi
            c2.send("delicious")


    producer("haha")

Output:

    ;32A ready to eat baozi
    ;32B ready to eat baozi
     begin to eat baozi
    two baozi have been done
     baozi [delicious] is coming,eaten by [A]!
     baozi [delicious] is coming,eaten by [B]!
    two baozi have been done
     baozi [delicious] is coming,eaten by [A]!
     baozi [delicious] is coming,eaten by [B]!
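The producer loop above never terminates, so a smaller, finite sketch may help isolate what send() does. It assumes Python 3, and `running_average` is a hypothetical example rather than anything from the script above: each value sent in becomes the result of the yield expression, and the next yield hands a result back to the caller.

```python
def running_average():
    """Hypothetical coroutine-style generator: receives numbers, yields their mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # send(x) makes this expression evaluate to x
        total += value
        count += 1
        average = total / count


avg = running_average()
primed = next(avg)              # run to the first yield; nothing has been sent yet
results = [avg.send(10), avg.send(20), avg.send(60)]
avg.close()                     # raises GeneratorExit inside the generator to stop it
print(primed, results)
```

The initial next() call is required: a value can only be sent once the generator is paused at a yield.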
Generator expressions
    (x.upper() for x in ['hello', 'world'])  # generator
    [x.upper() for x in ['hello', 'world']]  # list
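The difference between the two forms shows up when they are consumed. A short sketch, assuming Python 3 (the variable names are illustrative only):

```python
import inspect

words = ['hello', 'world']

gen = (w.upper() for w in words)    # nothing is computed yet
lst = [w.upper() for w in words]    # computed immediately

is_gen = inspect.isgenerator(gen)
first_pass = list(gen)              # consuming the generator runs upper() now
second_pass = list(gen)             # a generator can only be consumed once
print(is_gen, lst, first_pass, second_pass)
```

The list can be iterated any number of times, while the generator expression is exhausted after one pass.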
2. List comprehensions
Several for and if clauses can be combined in one comprehension to filter the results:
    x = [word.capitalize()
         for line in ("hello world?", "world!", "or not")
         for word in line.split()
         if not word.startswith("or")]
    print(x)

Output:

    ['Hello', 'World?', 'World!', 'Not']
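The clauses of such a comprehension are evaluated left to right, like nested statements. The sketch below spells out the equivalent loops next to the comprehension (the variable names are illustrative only):

```python
lines = ("hello world?", "world!", "or not")

result = []
for line in lines:                        # first for clause
    for word in line.split():             # second for clause
        if not word.startswith("or"):     # trailing if clause
            result.append(word.capitalize())

comprehension = [word.capitalize()
                 for line in lines
                 for word in line.split()
                 if not word.startswith("or")]
print(result == comprehension, result)
```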
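3. map and filter
In Python 2 these functions return lists; in Python 3 they return lazy iterable objects.
To get an iterable object instead of a list in Python 2, use the itertools equivalents, itertools.ifilter and itertools.imap.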
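A brief sketch of the Python 3 behaviour, where map and filter are already lazy and the itertools variants are not needed (the variable names are illustrative only):

```python
numbers = [1, 2, 3, 4, 5, 6]

squares = map(lambda n: n * n, numbers)          # lazy iterator in Python 3
evens = filter(lambda n: n % 2 == 0, numbers)    # lazy iterator in Python 3

squares_is_list = isinstance(squares, list)
first_square = next(squares)                     # computed on demand
remaining_squares = list(squares)                # the rest, consumed once
even_list = list(evens)
print(squares_is_list, first_square, remaining_squares, even_list)
```

Wrapping the result in list() gives back the eager, Python 2 style list when one is needed.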