• Using np.random.choice


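    While reading the RL source code from 莫烦python (Morvan Python), I found that the Memory replay buffer in his DDPG implementation is written like this: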

    class Memory(object):
        def __init__(self, capacity, dims):
            self.capacity = capacity
            self.data = np.zeros((capacity, dims))
            self.pointer = 0
    
        def store_transition(self, s, a, r, s_):
            transition = np.hstack((s, a, [r], s_))
            index = self.pointer % self.capacity  # replace the old memory with new memory
            self.data[index, :] = transition
            self.pointer += 1
    
        def sample(self, n):
            assert self.pointer >= self.capacity, 'Memory has not been fulfilled'
            indices = np.random.choice(self.capacity, size=n)
            return self.data[indices, :]

    The sample method asserts pointer >= capacity, which means the Memory must be full before learning can start.

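    I am designing a scheme that initially stores only the better transitions (those with higher reward) in the memory. Waiting for the memory to fill before learning seems wasteful: once it is full, the good transitions are soon overwritten by worse ones, and the good transitions may not even be enough to fill the Memory, so the agent cannot learn effectively from the good experience.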

    This is where np.random.choice needs a closer look. Its docstring reads:

    def choice(a, size=None, replace=True, p=None): # real signature unknown; restored from __doc__
        """
        choice(a, size=None, replace=True, p=None)
        
                Generates a random sample from a given 1-D array
        
                        .. versionadded:: 1.7.0
        
                Parameters
                -----------
                a : 1-D array-like or int
                    If an ndarray, a random sample is generated from its elements.
                    If an int, the random sample is generated as if a were np.arange(a)
                size : int or tuple of ints, optional
                    Output shape.  If the given shape is, e.g., ``(m, n, k)``, then
                    ``m * n * k`` samples are drawn.  Default is None, in which case a
                    single value is returned.
                replace : boolean, optional
                    Whether the sample is with or without replacement
                p : 1-D array-like, optional
                    The probabilities associated with each entry in a.
                    If not given the sample assumes a uniform distribution over all
                    entries in a.
        
                Returns
                --------
                samples : single item or ndarray
                    The generated random samples

    The first parameter is normally an ndarray; if an int is given, the sample is generated as if a were np.arange(a).
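
    To make the int shortcut concrete, here is a small check. It is only a sketch: the seed value 0 and the variable names are my own choices, not part of the original code. With the same seed, passing an int and passing the explicit np.arange array should produce the same draws.

```python
import numpy as np

# Same seed, two calls: an int population vs. the explicit np.arange array.
np.random.seed(0)
from_int = np.random.choice(5, size=3)

np.random.seed(0)
from_array = np.random.choice(np.arange(5), size=3)

print(from_int, from_array)

# True when both calls produced identical samples.
same_draws = np.array_equal(from_int, from_array)
print(same_draws)
```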

    The main question here: when a (we pass an int) is smaller than size, how does np draw the samples?

    A quick test in code:

    import numpy as np
    
    samples = np.random.choice(3, 5)
    print(samples)

    Output:

    [2 1 2 1 1]

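    So it draws repeatedly from np.arange(a). This means np.random.choice samples with replacement, which matches the default replace=True in the signature above (I have not read the implementation itself; judging from the repeats, this holds at least when a < size).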
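
    The replace parameter from the docstring can be checked directly. A minimal sketch, with variable names chosen for illustration: the default allows size > a, while replace=False refuses to draw more samples than there are candidates.

```python
import numpy as np

# Default (replace=True): drawing 5 values from only 3 candidates works,
# so at least one value must repeat.
with_replacement = np.random.choice(3, size=5)
print(with_replacement)

# replace=False: each candidate may be used at most once, so asking for
# more samples than candidates raises ValueError.
try:
    np.random.choice(3, size=5, replace=False)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# replace=False works when size <= a, and the result has no repeats.
without_replacement = np.random.choice(10, size=5, replace=False)
print(without_replacement)
```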

    I then tested np.random.choice(5, 5), np.random.choice(10, 5), and so on. After a few runs, the samples do contain repeats:

    import numpy as np
    
    samples = np.random.choice(10, 5)
    print(samples)
    
    Output:

    [3 4 3 4 5]
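
    Coming back to the memory buffer: since choice samples with replacement by default, one possible way to learn before the buffer is full is to sample only from the rows that have been written so far. The sketch below is my own variation on the class above, not the original author's code; the filled variable, the changed assert, and the toy dimensions in the usage part are all assumptions made for illustration.

```python
import numpy as np


class Memory(object):
    def __init__(self, capacity, dims):
        self.capacity = capacity
        self.data = np.zeros((capacity, dims))
        self.pointer = 0

    def store_transition(self, s, a, r, s_):
        transition = np.hstack((s, a, [r], s_))
        index = self.pointer % self.capacity  # overwrite the oldest slot
        self.data[index, :] = transition
        self.pointer += 1

    def sample(self, n):
        # Assumption (not in the original code): allow sampling as soon as
        # at least one transition is stored, instead of waiting for a full buffer.
        filled = min(self.pointer, self.capacity)
        assert filled > 0, 'Memory is empty'
        # Only rows [0, filled) hold real data; replace=True (the default)
        # allows n > filled, at the cost of repeated rows.
        indices = np.random.choice(filled, size=n)
        return self.data[indices, :]


# Toy usage: 2 state dims + 1 action dim + 1 reward + 2 next-state dims = 6.
memory = Memory(capacity=100, dims=6)
for i in range(3):
    memory.store_transition([i, i], [0.5], 1.0, [i + 1, i + 1])

batch = memory.sample(8)
print(batch.shape)
```

    With only 3 stored transitions, sample(8) still returns 8 rows because rows are repeated, and none of the untouched all-zero rows can be drawn. Whether heavy repetition in early batches is acceptable depends on the training setup.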
  • Original article: https://www.cnblogs.com/peanutk/p/10786839.html