• Boosting Python performance with Cython


    Boosting performance with Cython

     
     
    Even with my old PC (AMD Athlon II, 3 GB RAM), I seldom run into performance issues when running vectorized code. Unfortunately, there are plenty of cases that cannot be easily vectorized, for example the drawdown function. My implementation of it was extremely slow, so I decided to use it as a test case for speeding things up. I'll be using the SPY time series with ~5k samples as test data. Here is the original version of my drawdown function (as it is now implemented in the TradingWithPython library):
    import pandas as pd

    def drawdown(pnl):
        """
        calculate max drawdown and duration

        Returns:
            drawdown : vector of drawdown values
            duration : vector of drawdown duration
        """
        cumret = pnl

        highwatermark = [0]

        idx = pnl.index
        # float dtype so the empty Series can hold numeric results
        drawdown = pd.Series(index=idx, dtype=float)
        drawdowndur = pd.Series(index=idx, dtype=float)

        # .iloc makes the positional access explicit (t is a position, not a label)
        for t in range(1, len(idx)):
            highwatermark.append(max(highwatermark[t-1], cumret.iloc[t]))
            drawdown.iloc[t] = highwatermark[t] - cumret.iloc[t]
            drawdowndur.iloc[t] = 0 if drawdown.iloc[t] == 0 else drawdowndur.iloc[t-1] + 1

        return drawdown, drawdowndur
     
    %timeit drawdown(spy)
    1 loops, best of 3: 1.21 s per loop
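To make the recurrence inside the loop concrete, here is a tiny worked example on plain Python lists. It is only an illustrative sketch: the name `drawdown_lists` and the input values are made up for this example (they are not SPY data), and the arrays are initialised to zero, which is an assumption of the sketch rather than a property of the pandas version above.

```python
def drawdown_lists(values):
    """Plain-list version of the drawdown recurrence (illustration only)."""
    n = len(values)
    highwatermark = [0.0] * n
    drawdown = [0.0] * n
    duration = [0] * n
    # Like the loop above, start at t = 1; index 0 keeps its initial value.
    for t in range(1, n):
        # Highest value seen so far (never below the starting level of 0).
        highwatermark[t] = max(highwatermark[t - 1], values[t])
        # Distance below the running high.
        drawdown[t] = highwatermark[t] - values[t]
        # Number of consecutive steps spent below the running high.
        duration[t] = 0 if drawdown[t] == 0 else duration[t - 1] + 1
    return drawdown, duration


# Made-up cumulative values, chosen so the arithmetic is easy to follow.
dd_values, dd_duration = drawdown_lists([1.0, 2.0, 1.5, 1.0, 3.0])
print(dd_values)    # [0.0, 0.0, 0.5, 1.0, 0.0]
print(dd_duration)  # [0, 0, 1, 2, 0]
```

Each step depends on the result of the previous one (the running high and the previous duration), which is why the loop is awkward to express as a single vectorized operation.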
    Hmm, 1.2 seconds is not very speedy for such a simple function. Some things here could be a great drag on performance, such as the list *highwatermark* that is appended to on each loop iteration. Accessing a Series element by element should also involve some processing that is not strictly necessary. Let's take a look at what happens when this function is rewritten to work with numpy data:
    import numpy as np

    def dd(s):
        ''' simple drawdown function '''

        # preallocate the result arrays instead of appending to a list
        highwatermark = np.zeros(len(s))
        drawdown = np.zeros(len(s))
        drawdowndur = np.zeros(len(s))

        for t in range(1, len(s)):
            highwatermark[t] = max(highwatermark[t-1], s[t])
            drawdown[t] = highwatermark[t] - s[t]
            drawdowndur[t] = 0 if drawdown[t] == 0 else drawdowndur[t-1] + 1

        return drawdown, drawdowndur
     
    %timeit dd(spy.values)
    10 loops, best of 3: 27.9 ms per loop
    Well, this is much faster than the original function, approximately a 40x speed increase (1.21 s down to 27.9 ms). Still, there is much room for improvement by moving to compiled code with Cython. Now I rewrite the dd function from above, using optimisation tips that I've found in the Cython tutorial.
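The `%timeit` lines in this post are IPython magics. To reproduce comparable measurements from a plain script, a rough sketch using the standard-library `timeit` module could look like the following. The helper `best_time` and the toy `running_max` workload are hypothetical names introduced here for illustration; they are not part of the original post or of the TradingWithPython library, and the data is synthetic.

```python
import timeit


def best_time(func, *args, number=10, repeat=3):
    """Return the best per-call time in seconds, similar in spirit to %timeit."""
    timer = timeit.Timer(lambda: func(*args))
    # Each entry is the total time for `number` calls; keep the fastest run.
    totals = timer.repeat(repeat=repeat, number=number)
    return min(totals) / number


def running_max(values):
    """Toy workload: running maximum of a list (stand-in for a drawdown function)."""
    out = []
    current = float("-inf")
    for v in values:
        current = max(current, v)
        out.append(current)
    return out


# Synthetic input of roughly the same length as the ~5k-sample series.
data = [float(i % 100) for i in range(5000)]
seconds = best_time(running_max, data)
print(f"{seconds * 1e3:.3f} ms per call")
```

Passing each candidate implementation through the same helper with the same input keeps the comparison fair; absolute numbers will of course differ from machine to machine.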
    duanqs
  • Original article: https://www.cnblogs.com/duan-qs/p/5746333.html