Learning pandas
To find a specific topic, press Ctrl+F and type a keyword.
Series
import numpy as np
import pandas as pd
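# A Series can be thought of as a one-dimensional labeled array
# pandas.Series(data, index, dtype, copy)
# data   the data, in various forms such as an ndarray, list, or constant
# index  index labels must be hashable and the same length as the data (they need not be unique); defaults to 0..n-1, like np.arange(n), if no index is passed
# dtype  the data type; if omitted, it is inferred from the data
# copy   whether to copy the input data; defaults to False
# The data source can be an array, a dict, or a scalar constant
# Create an empty Series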
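# Pass dtype explicitly: recent pandas versions no longer default an empty Series to float64
s = pd.Series(dtype="float64")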
print(s)
Series([], dtype: float64)
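The parameter notes above can be illustrated with a minimal sketch that builds a Series from each kind of data source. The variable names (`s_arr`, `s_dict`, and so on) and the sample values are made up for illustration and are not part of the original notes.

```python
import numpy as np
import pandas as pd

# From an ndarray with an explicit index (one label per element)
s_arr = pd.Series(np.array([10, 20, 30]), index=["a", "b", "c"])

# From a dict: the keys become the index labels
s_dict = pd.Series({"x": 1.5, "y": 2.5})

# From a scalar: the value is repeated for every index label
s_scalar = pd.Series(7, index=[0, 1, 2])

# Without an index, pandas assigns the labels 0..n-1
s_default = pd.Series([4, 5, 6])

# Index labels need not be unique; a repeated label selects several rows
s_dup = pd.Series([1, 2, 3], index=["k", "k", "m"])

print(s_arr["b"])
print(list(s_dict.index))
print(s_scalar.tolist())
print(list(s_default.index))
print(len(s_dup["k"]))
```

Selecting a repeated label such as `s_dup["k"]` returns a Series rather than a single value, which is worth keeping in mind when an index may contain duplicates.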
# Show the help text for ndarray: calling help() on an array instance documents its class, not the np.array function
help(np.array([]))
Help on ndarray object:
class ndarray(builtins.object)
| ndarray(shape, dtype=float, buffer=None, offset=0,
| strides=None, order=None)
|
| An array object represents a multidimensional, homogeneous array
| of fixed-size items. An associated data-type object describes the
| format of each element in the array (its byte-order, how many bytes it
| occupies in memory, whether it is an integer, a floating point number,
| or something else, etc.)
|
| Arrays should be constructed using `array`, `zeros` or `empty` (refer
| to the See Also section below). The parameters given here refer to
| a low-level method (`ndarray(...)`) for instantiating an array.
|
| For more information, refer to the `numpy` module and examine the
| methods and attributes of an array.
|
| Parameters
| ----------
| (for the __new__ method; see Notes below)
|
| shape : tuple of ints
| Shape of created array.
| dtype : data-type, optional
| Any object that can be interpreted as a numpy data type.
| buffer : object exposing buffer interface, optional
| Used to fill the array with data.
| offset : int, optional
| Offset of array data in buffer.
| strides : tuple of ints, optional
| Strides of data in memory.
| order : {'C', 'F'}, optional
| Row-major (C-style) or column-major (Fortran-style) order.
|
| Attributes
| ----------
| T : ndarray
| Transpose of the array.
| data : buffer
| The array's elements, in memory.
| dtype : dtype object
| Describes the format of the elements in the array.
| flags : dict
| Dictionary containing information related to memory use, e.g.,
| 'C_CONTIGUOUS', 'OWNDATA', 'WRITEABLE', etc.
| flat : numpy.flatiter object
| Flattened version of the array as an iterator. The iterator
| allows assignments, e.g., ``x.flat = 3`` (See `ndarray.flat` for
| assignment examples; TODO).
| imag : ndarray
| Imaginary part of the array.
| real : ndarray
| Real part of the array.
| size : int
| Number of elements in the array.
| itemsize : int
| The memory use of each array element in bytes.
| nbytes : int
| The total number of bytes required to store the array data,
| i.e., ``itemsize * size``.
| ndim : int
| The array's number of dimensions.
| shape : tuple of ints
| Shape of the array.
| strides : tuple of ints
| The step-size required to move from one element to the next in
| memory. For example, a contiguous ``(3, 4)`` array of type
| ``int16`` in C-order has strides ``(8, 2)``. This implies that
| to move from element to element in memory requires jumps of 2 bytes.
| To move from row-to-row, one needs to jump 8 bytes at a time
| (``2 * 4``).
| ctypes : ctypes object
| Class containing properties of the array needed for interaction
| with ctypes.
| base : ndarray
| If the array is a view into another array, that array is its `base`
| (unless that array is also a view). The `base` array is where the
| array data is actually stored.
|
| See Also
| --------
| array : Construct an array.
| zeros : Create an array, each element of which is zero.
| empty : Create an array, but leave its allocated memory unchanged (i.e.,
| it contains "garbage").
| dtype : Create a data-type.
|
| Notes
| -----
| There are two modes of creating an array using ``__new__``:
|
| 1. If `buffer` is None, then only `shape`, `dtype`, and `order`
| are used.
| 2. If `buffer` is an object exposing the buffer interface, then
| all keywords are interpreted.
|
| No ``__init__`` method is needed because the array is fully initialized
| after the ``__new__`` method.
|
| Examples
| --------
| These examples illustrate the low-level `ndarray` constructor. Refer
| to the `See Also` section above for easier ways of constructing an
| ndarray.
|
| First mode, `buffer` is None:
|
| >>> np.ndarray(shape=(2,2), dtype=float, order='F')
| array([[0.0e+000, 0.0e+000], # random
| [ nan, 2.5e-323]])
|
| Second mode:
|
| >>> np.ndarray((2,), buffer=np.array([1,2,3]),
| ... offset=np.int_().itemsize,
| ... dtype=int) # offset = 1*itemsize, i.e. skip first element
| array([2, 3])
|
| Methods defined here:
|
| __abs__(self, /)
| abs(self)
|
| __add__(self, value, /)
| Return self+value.
|
| __and__(self, value, /)
| Return self&value.
|
| __array__(...)
| a.__array__(|dtype) -> reference if type unchanged, copy otherwise.
|
| Returns either a new reference to self if dtype is not given or a new array
| of provided data type if dtype is different from the current dtype of the
| array.
|
| __array_function__(...)
|
| __array_prepare__(...)
| a.__array_prepare__(obj) -> Object of same type as ndarray object obj.
|
| __array_ufunc__(...)
|
| __array_wrap__(...)
| a.__array_wrap__(obj) -> Object of same type as ndarray object a.
|
| __bool__(self, /)
| self != 0
|
| __complex__(...)
|
| __contains__(self, key, /)
| Return key in self.
|
| __copy__(...)
| a.__copy__()
|
| Used if :func:`copy.copy` is called on an array. Returns a copy of the array.
|
| Equivalent to ``a.copy(order='K')``.
|
| __deepcopy__(...)
| a.__deepcopy__(memo, /) -> Deep copy of array.
|
| Used if :func:`copy.deepcopy` is called on an array.
|
| __delitem__(self, key, /)
| Delete self[key].
|
| __divmod__(self, value, /)
| Return divmod(self, value).
|
| __eq__(self, value, /)
| Return self==value.
|
| __float__(self, /)
| float(self)
|
| __floordiv__(self, value, /)
| Return self//value.
|
| __format__(...)
| Default object formatter.
|
| __ge__(self, value, /)
| Return self>=value.
|
| __getitem__(self, key, /)
| Return self[key].
|
| __gt__(self, value, /)
| Return self>value.
|
| __iadd__(self, value, /)
| Return self+=value.
|
| __iand__(self, value, /)
| Return self&=value.
|
| __ifloordiv__(self, value, /)
| Return self//=value.
|
| __ilshift__(self, value, /)
| Return self<<=value.
|
| __imatmul__(self, value, /)
| Return self@=value.
|
| __imod__(self, value, /)
| Return self%=value.
|
| __imul__(self, value, /)
| Return self*=value.
|
| __index__(self, /)
| Return self converted to an integer, if self is suitable for use as an index into a list.
|
| __int__(self, /)
| int(self)
|
| __invert__(self, /)
| ~self
|
| __ior__(self, value, /)
| Return self|=value.
|
| __ipow__(self, value, /)
| Return self**=value.
|
| __irshift__(self, value, /)
| Return self>>=value.
|
| __isub__(self, value, /)
| Return self-=value.
|
| __iter__(self, /)
| Implement iter(self).
|
| __itruediv__(self, value, /)
| Return self/=value.
|
| __ixor__(self, value, /)
| Return self^=value.
|
| __le__(self, value, /)
| Return self<=value.
|
| __len__(self, /)
| Return len(self).
|
| __lshift__(self, value, /)
| Return self<<value.
|
| __lt__(self, value, /)
| Return self<value.
|
| __matmul__(self, value, /)
| Return self@value.
|
| __mod__(self, value, /)
| Return self%value.
|
| __mul__(self, value, /)
| Return self*value.
|
| __ne__(self, value, /)
| Return self!=value.
|
| __neg__(self, /)
| -self
|
| __or__(self, value, /)
| Return self|value.
|
| __pos__(self, /)
| +self
|
| __pow__(self, value, mod=None, /)
| Return pow(self, value, mod).
|
| __radd__(self, value, /)
| Return value+self.
|
| __rand__(self, value, /)
| Return value&self.
|
| __rdivmod__(self, value, /)
| Return divmod(value, self).
|
| __reduce__(...)
| a.__reduce__()
|
| For pickling.
|
| __reduce_ex__(...)
| Helper for pickle.
|
| __repr__(self, /)
| Return repr(self).
|
| __rfloordiv__(self, value, /)
| Return value//self.
|
| __rlshift__(self, value, /)
| Return value<<self.
|
| __rmatmul__(self, value, /)
| Return value@self.
|
| __rmod__(self, value, /)
| Return value%self.
|
| __rmul__(self, value, /)
| Return value*self.
|
| __ror__(self, value, /)
| Return value|self.
|
| __rpow__(self, value, mod=None, /)
| Return pow(value, self, mod).
|
| __rrshift__(self, value, /)
| Return value>>self.
|
| __rshift__(self, value, /)
| Return self>>value.
|
| __rsub__(self, value, /)
| Return value-self.
|
| __rtruediv__(self, value, /)
| Return value/self.
|
| __rxor__(self, value, /)
| Return value^self.
|
| __setitem__(self, key, value, /)
| Set self[key] to value.
|
| __setstate__(...)
| a.__setstate__(state, /)
|
| For unpickling.
|
| The `state` argument must be a sequence that contains the following
| elements:
|
| Parameters
| ----------
| version : int
| optional pickle version. If omitted defaults to 0.
| shape : tuple
| dtype : data-type
| isFortran : bool
| rawdata : string or list
| a binary string with the data (or a list if 'a' is an object array)
|
| __sizeof__(...)
| Size of object in memory, in bytes.
|
| __str__(self, /)
| Return str(self).
|
| __sub__(self, value, /)
| Return self-value.
|
| __truediv__(self, value, /)
| Return self/value.
|
| __xor__(self, value, /)
| Return self^value.
|
| all(...)
| a.all(axis=None, out=None, keepdims=False)
|
| Returns True if all elements evaluate to True.
|
| Refer to `numpy.all` for full documentation.
|
| See Also
| --------
| numpy.all : equivalent function
|
| any(...)
| a.any(axis=None, out=None, keepdims=False)
|
| Returns True if any of the elements of `a` evaluate to True.
|
| Refer to `numpy.any` for full documentation.
|
| See Also
| --------
| numpy.any : equivalent function
|
| argmax(...)
| a.argmax(axis=None, out=None)
|
| Return indices of the maximum values along the given axis.
|
| Refer to `numpy.argmax` for full documentation.
|
| See Also
| --------
| numpy.argmax : equivalent function
|
| argmin(...)
| a.argmin(axis=None, out=None)
|
| Return indices of the minimum values along the given axis of `a`.
|
| Refer to `numpy.argmin` for detailed documentation.
|
| See Also
| --------
| numpy.argmin : equivalent function
|
| argpartition(...)
| a.argpartition(kth, axis=-1, kind='introselect', order=None)
|
| Returns the indices that would partition this array.
|
| Refer to `numpy.argpartition` for full documentation.
|
| .. versionadded:: 1.8.0
|
| See Also
| --------
| numpy.argpartition : equivalent function
|
| argsort(...)
| a.argsort(axis=-1, kind=None, order=None)
|
| Returns the indices that would sort this array.
|
| Refer to `numpy.argsort` for full documentation.
|
| See Also
| --------
| numpy.argsort : equivalent function
|
| astype(...)
| a.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)
|
| Copy of the array, cast to a specified type.
|
| Parameters
| ----------
| dtype : str or dtype
| Typecode or data-type to which the array is cast.
| order : {'C', 'F', 'A', 'K'}, optional
| Controls the memory layout order of the result.
| 'C' means C order, 'F' means Fortran order, 'A'
| means 'F' order if all the arrays are Fortran contiguous,
| 'C' order otherwise, and 'K' means as close to the
| order the array elements appear in memory as possible.
| Default is 'K'.
| casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
| Controls what kind of data casting may occur. Defaults to 'unsafe'
| for backwards compatibility.
|
| * 'no' means the data types should not be cast at all.
| * 'equiv' means only byte-order changes are allowed.
| * 'safe' means only casts which can preserve values are allowed.
| * 'same_kind' means only safe casts or casts within a kind,
| like float64 to float32, are allowed.
| * 'unsafe' means any data conversions may be done.
| subok : bool, optional
| If True, then sub-classes will be passed-through (default), otherwise
| the returned array will be forced to be a base-class array.
| copy : bool, optional
| By default, astype always returns a newly allocated array. If this
| is set to false, and the `dtype`, `order`, and `subok`
| requirements are satisfied, the input array is returned instead
| of a copy.
|
| Returns
| -------
| arr_t : ndarray
| Unless `copy` is False and the other conditions for returning the input
| array are satisfied (see description for `copy` input parameter), `arr_t`
| is a new array of the same shape as the input array, with dtype, order
| given by `dtype`, `order`.
|
| Notes
| -----
| .. versionchanged:: 1.17.0
| Casting between a simple data type and a structured one is possible only
| for "unsafe" casting. Casting to multiple fields is allowed, but
| casting from multiple fields is not.
|
| .. versionchanged:: 1.9.0
| Casting from numeric to string types in 'safe' casting mode requires
| that the string dtype length is long enough to store the max
| integer/float value converted.
|
| Raises
| ------
| ComplexWarning
| When casting from complex to float or int. To avoid this,
| one should use ``a.real.astype(t)``.
|
| Examples
| --------
| >>> x = np.array([1, 2, 2.5])
| >>> x
| array([1. , 2. , 2.5])
|
| >>> x.astype(int)
| array([1, 2, 2])
|
| byteswap(...)
| a.byteswap(inplace=False)
|
| Swap the bytes of the array elements
|
| Toggle between low-endian and big-endian data representation by
| returning a byteswapped array, optionally swapped in-place.
|
| Parameters
| ----------
| inplace : bool, optional
| If ``True``, swap bytes in-place, default is ``False``.
|
| Returns
| -------
| out : ndarray
| The byteswapped array. If `inplace` is ``True``, this is
| a view to self.
|
| Examples
| --------
| >>> A = np.array([1, 256, 8755], dtype=np.int16)
| >>> list(map(hex, A))
| ['0x1', '0x100', '0x2233']
| >>> A.byteswap(inplace=True)
| array([ 256, 1, 13090], dtype=int16)
| >>> list(map(hex, A))
| ['0x100', '0x1', '0x3322']
|
| Arrays of strings are not swapped
|
| >>> A = np.array(['ceg', 'fac'])
| >>> A.byteswap()
| Traceback (most recent call last):
| ...
| UnicodeDecodeError: ...
|
| choose(...)
| a.choose(choices, out=None, mode='raise')
|
| Use an index array to construct a new array from a set of choices.
|
| Refer to `numpy.choose` for full documentation.
|
| See Also
| --------
| numpy.choose : equivalent function
|
| clip(...)
| a.clip(min=None, max=None, out=None, **kwargs)
|
| Return an array whose values are limited to ``[min, max]``.
| One of max or min must be given.
|
| Refer to `numpy.clip` for full documentation.
|
| See Also
| --------
| numpy.clip : equivalent function
|
| compress(...)
| a.compress(condition, axis=None, out=None)
|
| Return selected slices of this array along given axis.
|
| Refer to `numpy.compress` for full documentation.
|
| See Also
| --------
| numpy.compress : equivalent function
|
| conj(...)
| a.conj()
|
| Complex-conjugate all elements.
|
| Refer to `numpy.conjugate` for full documentation.
|
| See Also
| --------
| numpy.conjugate : equivalent function
|
| conjugate(...)
| a.conjugate()
|
| Return the complex conjugate, element-wise.
|
| Refer to `numpy.conjugate` for full documentation.
|
| See Also
| --------
| numpy.conjugate : equivalent function
|
| copy(...)
| a.copy(order='C')
|
| Return a copy of the array.
|
| Parameters
| ----------
| order : {'C', 'F', 'A', 'K'}, optional
| Controls the memory layout of the copy. 'C' means C-order,
| 'F' means F-order, 'A' means 'F' if `a` is Fortran contiguous,
| 'C' otherwise. 'K' means match the layout of `a` as closely
| as possible. (Note that this function and :func:`numpy.copy` are very
| similar, but have different default values for their order=
| arguments.)
|
| See also
| --------
| numpy.copy
| numpy.copyto
|
| Examples
| --------
| >>> x = np.array([[1,2,3],[4,5,6]], order='F')
|
| >>> y = x.copy()
|
| >>> x.fill(0)
|
| >>> x
| array([[0, 0, 0],
| [0, 0, 0]])
|
| >>> y
| array([[1, 2, 3],
| [4, 5, 6]])
|
| >>> y.flags['C_CONTIGUOUS']
| True
|
| cumprod(...)
| a.cumprod(axis=None, dtype=None, out=None)
|
| Return the cumulative product of the elements along the given axis.
|
| Refer to `numpy.cumprod` for full documentation.
|
| See Also
| --------
| numpy.cumprod : equivalent function
|
| cumsum(...)
| a.cumsum(axis=None, dtype=None, out=None)
|
| Return the cumulative sum of the elements along the given axis.
|
| Refer to `numpy.cumsum` for full documentation.
|
| See Also
| --------
| numpy.cumsum : equivalent function
|
| diagonal(...)
| a.diagonal(offset=0, axis1=0, axis2=1)
|
| Return specified diagonals. In NumPy 1.9 the returned array is a
| read-only view instead of a copy as in previous NumPy versions. In
| a future version the read-only restriction will be removed.
|
| Refer to :func:`numpy.diagonal` for full documentation.
|
| See Also
| --------
| numpy.diagonal : equivalent function
|
| dot(...)
| a.dot(b, out=None)
|
| Dot product of two arrays.
|
| Refer to `numpy.dot` for full documentation.
|
| See Also
| --------
| numpy.dot : equivalent function
|
| Examples
| --------
| >>> a = np.eye(2)
| >>> b = np.ones((2, 2)) * 2
| >>> a.dot(b)
| array([[2., 2.],
| [2., 2.]])
|
| This array method can be conveniently chained:
|
| >>> a.dot(b).dot(b)
| array([[8., 8.],
| [8., 8.]])
|
| dump(...)
| a.dump(file)
|
| Dump a pickle of the array to the specified file.
| The array can be read back with pickle.load or numpy.load.
|
| Parameters
| ----------
| file : str or Path
| A string naming the dump file.
|
| .. versionchanged:: 1.17.0
| `pathlib.Path` objects are now accepted.
|
| dumps(...)
| a.dumps()
|
| Returns the pickle of the array as a string.
| pickle.loads or numpy.loads will convert the string back to an array.
|
| Parameters
| ----------
| None
|
| fill(...)
| a.fill(value)
|
| Fill the array with a scalar value.
|
| Parameters
| ----------
| value : scalar
| All elements of `a` will be assigned this value.
|
| Examples
| --------
| >>> a = np.array([1, 2])
| >>> a.fill(0)
| >>> a
| array([0, 0])
| >>> a = np.empty(2)
| >>> a.fill(1)
| >>> a
| array([1., 1.])
|
| flatten(...)
| a.flatten(order='C')
|
| Return a copy of the array collapsed into one dimension.
|
| Parameters
| ----------
| order : {'C', 'F', 'A', 'K'}, optional
| 'C' means to flatten in row-major (C-style) order.
| 'F' means to flatten in column-major (Fortran-
| style) order. 'A' means to flatten in column-major
| order if `a` is Fortran *contiguous* in memory,
| row-major order otherwise. 'K' means to flatten
| `a` in the order the elements occur in memory.
| The default is 'C'.
|
| Returns
| -------
| y : ndarray
| A copy of the input array, flattened to one dimension.
|
| See Also
| --------
| ravel : Return a flattened array.
| flat : A 1-D flat iterator over the array.
|
| Examples
| --------
| >>> a = np.array([[1,2], [3,4]])
| >>> a.flatten()
| array([1, 2, 3, 4])
| >>> a.flatten('F')
| array([1, 3, 2, 4])
|
| getfield(...)
| a.getfield(dtype, offset=0)
|
| Returns a field of the given array as a certain type.
|
| A field is a view of the array data with a given data-type. The values in
| the view are determined by the given type and the offset into the current
| array in bytes. The offset needs to be such that the view dtype fits in the
| array dtype; for example an array of dtype complex128 has 16-byte elements.
| If taking a view with a 32-bit integer (4 bytes), the offset needs to be
| between 0 and 12 bytes.
|
| Parameters
| ----------
| dtype : str or dtype
| The data type of the view. The dtype size of the view can not be larger
| than that of the array itself.
| offset : int
| Number of bytes to skip before beginning the element view.
|
| Examples
| --------
| >>> x = np.diag([1.+1.j]*2)
| >>> x[1, 1] = 2 + 4.j
| >>> x
| array([[1.+1.j, 0.+0.j],
| [0.+0.j, 2.+4.j]])
| >>> x.getfield(np.float64)
| array([[1., 0.],
| [0., 2.]])
|
| By choosing an offset of 8 bytes we can select the complex part of the
| array for our view:
|
| >>> x.getfield(np.float64, offset=8)
| array([[1., 0.],
| [0., 4.]])
|
| item(...)
| a.item(*args)
|
| Copy an element of an array to a standard Python scalar and return it.
|
| Parameters
| ----------
| *args : Arguments (variable number and type)
|
| * none: in this case, the method only works for arrays
| with one element (`a.size == 1`), which element is
| copied into a standard Python scalar object and returned.
|
| * int_type: this argument is interpreted as a flat index into
| the array, specifying which element to copy and return.
|
| * tuple of int_types: functions as does a single int_type argument,
| except that the argument is interpreted as an nd-index into the
| array.
|
| Returns
| -------
| z : Standard Python scalar object
| A copy of the specified element of the array as a suitable
| Python scalar
|
| Notes
| -----
| When the data type of `a` is longdouble or clongdouble, item() returns
| a scalar array object because there is no available Python scalar that
| would not lose information. Void arrays return a buffer object for item(),
| unless fields are defined, in which case a tuple is returned.
|
| `item` is very similar to a[args], except, instead of an array scalar,
| a standard Python scalar is returned. This can be useful for speeding up
| access to elements of the array and doing arithmetic on elements of the
| array using Python's optimized math.
|
| Examples
| --------
| >>> np.random.seed(123)
| >>> x = np.random.randint(9, size=(3, 3))
| >>> x
| array([[2, 2, 6],
| [1, 3, 6],
| [1, 0, 1]])
| >>> x.item(3)
| 1
| >>> x.item(7)
| 0
| >>> x.item((0, 1))
| 2
| >>> x.item((2, 2))
| 1
|
| itemset(...)
| a.itemset(*args)
|
| Insert scalar into an array (scalar is cast to array's dtype, if possible)
|
| There must be at least 1 argument, and define the last argument
| as *item*. Then, ``a.itemset(*args)`` is equivalent to but faster
| than ``a[args] = item``. The item should be a scalar value and `args`
| must select a single item in the array `a`.
|
| Parameters
| ----------
| *args : Arguments
| If one argument: a scalar, only used in case `a` is of size 1.
| If two arguments: the last argument is the value to be set
| and must be a scalar, the first argument specifies a single array
| element location. It is either an int or a tuple.
|
| Notes
| -----
| Compared to indexing syntax, `itemset` provides some speed increase
| for placing a scalar into a particular location in an `ndarray`,
| if you must do this. However, generally this is discouraged:
| among other problems, it complicates the appearance of the code.
| Also, when using `itemset` (and `item`) inside a loop, be sure
| to assign the methods to a local variable to avoid the attribute
| look-up at each loop iteration.
|
| Examples
| --------
| >>> np.random.seed(123)
| >>> x = np.random.randint(9, size=(3, 3))
| >>> x
| array([[2, 2, 6],
| [1, 3, 6],
| [1, 0, 1]])
| >>> x.itemset(4, 0)
| >>> x.itemset((2, 2), 9)
| >>> x
| array([[2, 2, 6],
| [1, 0, 6],
| [1, 0, 9]])
|
| max(...)
| a.max(axis=None, out=None, keepdims=False, initial=<no value>, where=True)
|
| Return the maximum along a given axis.
|
| Refer to `numpy.amax` for full documentation.
|
| See Also
| --------
| numpy.amax : equivalent function
|
| mean(...)
| a.mean(axis=None, dtype=None, out=None, keepdims=False)
|
| Returns the average of the array elements along given axis.
|
| Refer to `numpy.mean` for full documentation.
|
| See Also
| --------
| numpy.mean : equivalent function
|
| min(...)
| a.min(axis=None, out=None, keepdims=False, initial=<no value>, where=True)
|
| Return the minimum along a given axis.
|
| Refer to `numpy.amin` for full documentation.
|
| See Also
| --------
| numpy.amin : equivalent function
|
| newbyteorder(...)
| arr.newbyteorder(new_order='S')
|
| Return the array with the same data viewed with a different byte order.
|
| Equivalent to::
|
| arr.view(arr.dtype.newbytorder(new_order))
|
| Changes are also made in all fields and sub-arrays of the array data
| type.
|
|
|
| Parameters
| ----------
| new_order : string, optional
| Byte order to force; a value from the byte order specifications
| below. `new_order` codes can be any of:
|
| * 'S' - swap dtype from current to opposite endian
| * {'<', 'L'} - little endian
| * {'>', 'B'} - big endian
| * {'=', 'N'} - native order
| * {'|', 'I'} - ignore (no change to byte order)
|
| The default value ('S') results in swapping the current
| byte order. The code does a case-insensitive check on the first
| letter of `new_order` for the alternatives above. For example,
| any of 'B' or 'b' or 'biggish' are valid to specify big-endian.
|
|
| Returns
| -------
| new_arr : array
| New array object with the dtype reflecting given change to the
| byte order.
|
| nonzero(...)
| a.nonzero()
|
| Return the indices of the elements that are non-zero.
|
| Refer to `numpy.nonzero` for full documentation.
|
| See Also
| --------
| numpy.nonzero : equivalent function
|
| partition(...)
| a.partition(kth, axis=-1, kind='introselect', order=None)
|
| Rearranges the elements in the array in such a way that the value of the
| element in kth position is in the position it would be in a sorted array.
| All elements smaller than the kth element are moved before this element and
| all equal or greater are moved behind it. The ordering of the elements in
| the two partitions is undefined.
|
| .. versionadded:: 1.8.0
|
| Parameters
| ----------
| kth : int or sequence of ints
| Element index to partition by. The kth element value will be in its
| final sorted position and all smaller elements will be moved before it
| and all equal or greater elements behind it.
| The order of all elements in the partitions is undefined.
| If provided with a sequence of kth it will partition all elements
| indexed by kth of them into their sorted position at once.
| axis : int, optional
| Axis along which to sort. Default is -1, which means sort along the
| last axis.
| kind : {'introselect'}, optional
| Selection algorithm. Default is 'introselect'.
| order : str or list of str, optional
| When `a` is an array with fields defined, this argument specifies
| which fields to compare first, second, etc. A single field can
| be specified as a string, and not all fields need to be specified,
| but unspecified fields will still be used, in the order in which
| they come up in the dtype, to break ties.
|
| See Also
| --------
| numpy.partition : Return a parititioned copy of an array.
| argpartition : Indirect partition.
| sort : Full sort.
|
| Notes
| -----
| See ``np.partition`` for notes on the different algorithms.
|
| Examples
| --------
| >>> a = np.array([3, 4, 2, 1])
| >>> a.partition(3)
| >>> a
| array([2, 1, 3, 4])
|
| >>> a.partition((1, 3))
| >>> a
| array([1, 2, 3, 4])
|
| prod(...)
| a.prod(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)
|
| Return the product of the array elements over the given axis
|
| Refer to `numpy.prod` for full documentation.
|
| See Also
| --------
| numpy.prod : equivalent function
|
| ptp(...)
| a.ptp(axis=None, out=None, keepdims=False)
|
| Peak to peak (maximum - minimum) value along a given axis.
|
| Refer to `numpy.ptp` for full documentation.
|
| See Also
| --------
| numpy.ptp : equivalent function
|
| put(...)
| a.put(indices, values, mode='raise')
|
| Set ``a.flat[n] = values[n]`` for all `n` in indices.
|
| Refer to `numpy.put` for full documentation.
|
| See Also
| --------
| numpy.put : equivalent function
|
| ravel(...)
| a.ravel([order])
|
| Return a flattened array.
|
| Refer to `numpy.ravel` for full documentation.
|
| See Also
| --------
| numpy.ravel : equivalent function
|
| ndarray.flat : a flat iterator on the array.
|
| repeat(...)
| a.repeat(repeats, axis=None)
|
| Repeat elements of an array.
|
| Refer to `numpy.repeat` for full documentation.
|
| See Also
| --------
| numpy.repeat : equivalent function
|
| reshape(...)
| a.reshape(shape, order='C')
|
| Returns an array containing the same data with a new shape.
|
| Refer to `numpy.reshape` for full documentation.
|
| See Also
| --------
| numpy.reshape : equivalent function
|
| Notes
| -----
| Unlike the free function `numpy.reshape`, this method on `ndarray` allows
| the elements of the shape parameter to be passed in as separate arguments.
| For example, ``a.reshape(10, 11)`` is equivalent to
| ``a.reshape((10, 11))``.
|
| resize(...)
| a.resize(new_shape, refcheck=True)
|
| Change shape and size of array in-place.
|
| Parameters
| ----------
| new_shape : tuple of ints, or `n` ints
| Shape of resized array.
| refcheck : bool, optional
| If False, reference count will not be checked. Default is True.
|
| Returns
| -------
| None
|
| Raises
| ------
| ValueError
| If `a` does not own its own data or references or views to it exist,
| and the data memory must be changed.
| PyPy only: will always raise if the data memory must be changed, since
| there is no reliable way to determine if references or views to it
| exist.
|
| SystemError
| If the `order` keyword argument is specified. This behaviour is a
| bug in NumPy.
|
| See Also
| --------
| resize : Return a new array with the specified shape.
|
| Notes
| -----
| This reallocates space for the data area if necessary.
|
| Only contiguous arrays (data elements consecutive in memory) can be
| resized.
|
| The purpose of the reference count check is to make sure you
| do not use this array as a buffer for another Python object and then
| reallocate the memory. However, reference counts can increase in
| other ways so if you are sure that you have not shared the memory
| for this array with another Python object, then you may safely set
| `refcheck` to False.
|
| Examples
| --------
| Shrinking an array: array is flattened (in the order that the data are
| stored in memory), resized, and reshaped:
|
| >>> a = np.array([[0, 1], [2, 3]], order='C')
| >>> a.resize((2, 1))
| >>> a
| array([[0],
| [1]])
|
| >>> a = np.array([[0, 1], [2, 3]], order='F')
| >>> a.resize((2, 1))
| >>> a
| array([[0],
| [2]])
|
| Enlarging an array: as above, but missing entries are filled with zeros:
|
| >>> b = np.array([[0, 1], [2, 3]])
| >>> b.resize(2, 3) # new_shape parameter doesn't have to be a tuple
| >>> b
| array([[0, 1, 2],
| [3, 0, 0]])
|
| Referencing an array prevents resizing...
|
| >>> c = a
| >>> a.resize((1, 1))
| Traceback (most recent call last):
| ...
| ValueError: cannot resize an array that references or is referenced ...
|
| Unless `refcheck` is False:
|
| >>> a.resize((1, 1), refcheck=False)
| >>> a
| array([[0]])
| >>> c
| array([[0]])
|
| round(...)
| a.round(decimals=0, out=None)
|
| Return `a` with each element rounded to the given number of decimals.
|
| Refer to `numpy.around` for full documentation.
|
| See Also
| --------
| numpy.around : equivalent function
|
| searchsorted(...)
| a.searchsorted(v, side='left', sorter=None)
|
| Find indices where elements of v should be inserted in a to maintain order.
|
| For full documentation, see `numpy.searchsorted`
|
| See Also
| --------
| numpy.searchsorted : equivalent function
|
| setfield(...)
| a.setfield(val, dtype, offset=0)
|
| Put a value into a specified place in a field defined by a data-type.
|
| Place `val` into `a`'s field defined by `dtype` and beginning `offset`
| bytes into the field.
|
| Parameters
| ----------
| val : object
| Value to be placed in field.
| dtype : dtype object
| Data-type of the field in which to place `val`.
| offset : int, optional
| The number of bytes into the field at which to place `val`.
|
| Returns
| -------
| None
|
| See Also
| --------
| getfield
|
| Examples
| --------
| >>> x = np.eye(3)
| >>> x.getfield(np.float64)
| array([[1., 0., 0.],
| [0., 1., 0.],
| [0., 0., 1.]])
| >>> x.setfield(3, np.int32)
| >>> x.getfield(np.int32)
| array([[3, 3, 3],
| [3, 3, 3],
| [3, 3, 3]], dtype=int32)
| >>> x
| array([[1.0e+000, 1.5e-323, 1.5e-323],
| [1.5e-323, 1.0e+000, 1.5e-323],
| [1.5e-323, 1.5e-323, 1.0e+000]])
| >>> x.setfield(np.eye(3), np.int32)
| >>> x
| array([[1., 0., 0.],
| [0., 1., 0.],
| [0., 0., 1.]])
|
| setflags(...)
| a.setflags(write=None, align=None, uic=None)
|
| Set array flags WRITEABLE, ALIGNED, (WRITEBACKIFCOPY and UPDATEIFCOPY),
| respectively.
|
| These Boolean-valued flags affect how numpy interprets the memory
| area used by `a` (see Notes below). The ALIGNED flag can only
| be set to True if the data is actually aligned according to the type.
| The WRITEBACKIFCOPY and (deprecated) UPDATEIFCOPY flags can never be set
| to True. The flag WRITEABLE can only be set to True if the array owns its
| own memory, or the ultimate owner of the memory exposes a writeable buffer
| interface, or is a string. (The exception for string is made so that
| unpickling can be done without copying memory.)
|
| Parameters
| ----------
| write : bool, optional
| Describes whether or not `a` can be written to.
| align : bool, optional
| Describes whether or not `a` is aligned properly for its type.
| uic : bool, optional
| Describes whether or not `a` is a copy of another "base" array.
|
| Notes
| -----
| Array flags provide information about how the memory area used
| for the array is to be interpreted. There are 7 Boolean flags
| in use, only four of which can be changed by the user:
| WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED.
|
| WRITEABLE (W) the data area can be written to;
|
| ALIGNED (A) the data and strides are aligned appropriately for the hardware
| (as determined by the compiler);
|
| UPDATEIFCOPY (U) (deprecated), replaced by WRITEBACKIFCOPY;
|
| WRITEBACKIFCOPY (X) this array is a copy of some other array (referenced
| by .base). When the C-API function PyArray_ResolveWritebackIfCopy is
| called, the base array will be updated with the contents of this array.
|
| All flags can be accessed using the single (upper case) letter as well
| as the full name.
|
| Examples
| --------
| >>> y = np.array([[3, 1, 7],
| ... [2, 0, 0],
| ... [8, 5, 9]])
| >>> y
| array([[3, 1, 7],
| [2, 0, 0],
| [8, 5, 9]])
| >>> y.flags
| C_CONTIGUOUS : True
| F_CONTIGUOUS : False
| OWNDATA : True
| WRITEABLE : True
| ALIGNED : True
| WRITEBACKIFCOPY : False
| UPDATEIFCOPY : False
| >>> y.setflags(write=0, align=0)
| >>> y.flags
| C_CONTIGUOUS : True
| F_CONTIGUOUS : False
| OWNDATA : True
| WRITEABLE : False
| ALIGNED : False
| WRITEBACKIFCOPY : False
| UPDATEIFCOPY : False
| >>> y.setflags(uic=1)
| Traceback (most recent call last):
| File "<stdin>", line 1, in <module>
| ValueError: cannot set WRITEBACKIFCOPY flag to True
|
| sort(...)
| a.sort(axis=-1, kind=None, order=None)
|
| Sort an array in-place. Refer to `numpy.sort` for full documentation.
|
| Parameters
| ----------
| axis : int, optional
| Axis along which to sort. Default is -1, which means sort along the
| last axis.
| kind : {'quicksort', 'mergesort', 'heapsort', 'stable'}, optional
| Sorting algorithm. The default is 'quicksort'. Note that both 'stable'
| and 'mergesort' use timsort under the covers and, in general, the
| actual implementation will vary with datatype. The 'mergesort' option
| is retained for backwards compatibility.
|
| .. versionchanged:: 1.15.0.
| The 'stable' option was added.
|
| order : str or list of str, optional
| When `a` is an array with fields defined, this argument specifies
| which fields to compare first, second, etc. A single field can
| be specified as a string, and not all fields need be specified,
| but unspecified fields will still be used, in the order in which
| they come up in the dtype, to break ties.
|
| See Also
| --------
| numpy.sort : Return a sorted copy of an array.
| argsort : Indirect sort.
| lexsort : Indirect stable sort on multiple keys.
| searchsorted : Find elements in sorted array.
| partition: Partial sort.
|
| Notes
| -----
| See `numpy.sort` for notes on the different sorting algorithms.
|
| Examples
| --------
| >>> a = np.array([[1,4], [3,1]])
| >>> a.sort(axis=1)
| >>> a
| array([[1, 4],
| [1, 3]])
| >>> a.sort(axis=0)
| >>> a
| array([[1, 3],
| [1, 4]])
|
| Use the `order` keyword to specify a field to use when sorting a
| structured array:
|
| >>> a = np.array([('a', 2), ('c', 1)], dtype=[('x', 'S1'), ('y', int)])
| >>> a.sort(order='y')
| >>> a
| array([(b'c', 1), (b'a', 2)],
| dtype=[('x', 'S1'), ('y', '<i8')])
|
| squeeze(...)
| a.squeeze(axis=None)
|
| Remove single-dimensional entries from the shape of `a`.
|
| Refer to `numpy.squeeze` for full documentation.
|
| See Also
| --------
| numpy.squeeze : equivalent function
|
| std(...)
| a.std(axis=None, dtype=None, out=None, ddof=0, keepdims=False)
|
| Returns the standard deviation of the array elements along given axis.
|
| Refer to `numpy.std` for full documentation.
|
| See Also
| --------
| numpy.std : equivalent function
|
| sum(...)
| a.sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)
|
| Return the sum of the array elements over the given axis.
|
| Refer to `numpy.sum` for full documentation.
|
| See Also
| --------
| numpy.sum : equivalent function
|
| swapaxes(...)
| a.swapaxes(axis1, axis2)
|
| Return a view of the array with `axis1` and `axis2` interchanged.
|
| Refer to `numpy.swapaxes` for full documentation.
|
| See Also
| --------
| numpy.swapaxes : equivalent function
|
| take(...)
| a.take(indices, axis=None, out=None, mode='raise')
|
| Return an array formed from the elements of `a` at the given indices.
|
| Refer to `numpy.take` for full documentation.
|
| See Also
| --------
| numpy.take : equivalent function
|
| tobytes(...)
| a.tobytes(order='C')
|
| Construct Python bytes containing the raw data bytes in the array.
|
| Constructs Python bytes showing a copy of the raw contents of
| data memory. The bytes object can be produced in either 'C' or 'Fortran',
| or 'Any' order (the default is 'C'-order). 'Any' order means C-order
| unless the F_CONTIGUOUS flag in the array is set, in which case it
| means 'Fortran' order.
|
| .. versionadded:: 1.9.0
|
| Parameters
| ----------
| order : {'C', 'F', None}, optional
| Order of the data for multidimensional arrays:
| C, Fortran, or the same as for the original array.
|
| Returns
| -------
| s : bytes
| Python bytes exhibiting a copy of `a`'s raw data.
|
| Examples
| --------
| >>> x = np.array([[0, 1], [2, 3]], dtype='<u2')
| >>> x.tobytes()
| b'\x00\x00\x01\x00\x02\x00\x03\x00'
| >>> x.tobytes('C') == x.tobytes()
| True
| >>> x.tobytes('F')
| b'\x00\x00\x02\x00\x01\x00\x03\x00'
|
| tofile(...)
| a.tofile(fid, sep="", format="%s")
|
| Write array to a file as text or binary (default).
|
| Data is always written in 'C' order, independent of the order of `a`.
| The data produced by this method can be recovered using the function
| fromfile().
|
| Parameters
| ----------
| fid : file or str or Path
| An open file object, or a string containing a filename.
|
| .. versionchanged:: 1.17.0
| `pathlib.Path` objects are now accepted.
|
| sep : str
| Separator between array items for text output.
| If "" (empty), a binary file is written, equivalent to
| ``file.write(a.tobytes())``.
| format : str
| Format string for text file output.
| Each entry in the array is formatted to text by first converting
| it to the closest Python type, and then using "format" % item.
|
| Notes
| -----
| This is a convenience function for quick storage of array data.
| Information on endianness and precision is lost, so this method is not a
| good choice for files intended to archive data or transport data between
| machines with different endianness. Some of these problems can be overcome
| by outputting the data as text files, at the expense of speed and file
| size.
|
| When fid is a file object, array contents are directly written to the
| file, bypassing the file object's ``write`` method. As a result, tofile
| cannot be used with files objects supporting compression (e.g., GzipFile)
| or file-like objects that do not support ``fileno()`` (e.g., BytesIO).
|
| tolist(...)
| a.tolist()
|
| Return the array as an ``a.ndim``-levels deep nested list of Python scalars.
|
| Return a copy of the array data as a (nested) Python list.
| Data items are converted to the nearest compatible builtin Python type, via
| the `~numpy.ndarray.item` function.
|
| If ``a.ndim`` is 0, then since the depth of the nested list is 0, it will
| not be a list at all, but a simple Python scalar.
|
| Parameters
| ----------
| none
|
| Returns
| -------
| y : object, or list of object, or list of list of object, or ...
| The possibly nested list of array elements.
|
| Notes
| -----
| The array may be recreated via ``a = np.array(a.tolist())``, although this
| may sometimes lose precision.
|
| Examples
| --------
| For a 1D array, ``a.tolist()`` is almost the same as ``list(a)``:
|
| >>> a = np.array([1, 2])
| >>> list(a)
| [1, 2]
| >>> a.tolist()
| [1, 2]
|
| However, for a 2D array, ``tolist`` applies recursively:
|
| >>> a = np.array([[1, 2], [3, 4]])
| >>> list(a)
| [array([1, 2]), array([3, 4])]
| >>> a.tolist()
| [[1, 2], [3, 4]]
|
| The base case for this recursion is a 0D array:
|
| >>> a = np.array(1)
| >>> list(a)
| Traceback (most recent call last):
| ...
| TypeError: iteration over a 0-d array
| >>> a.tolist()
| 1
|
| tostring(...)
| a.tostring(order='C')
|
| Construct Python bytes containing the raw data bytes in the array.
|
| Constructs Python bytes showing a copy of the raw contents of
| data memory. The bytes object can be produced in either 'C' or 'Fortran',
| or 'Any' order (the default is 'C'-order). 'Any' order means C-order
| unless the F_CONTIGUOUS flag in the array is set, in which case it
| means 'Fortran' order.
|
| This function is a compatibility alias for tobytes. Despite its name it returns bytes not strings.
|
| Parameters
| ----------
| order : {'C', 'F', None}, optional
| Order of the data for multidimensional arrays:
| C, Fortran, or the same as for the original array.
|
| Returns
| -------
| s : bytes
| Python bytes exhibiting a copy of `a`'s raw data.
|
| Examples
| --------
| >>> x = np.array([[0, 1], [2, 3]], dtype='<u2')
| >>> x.tobytes()
| b'\x00\x00\x01\x00\x02\x00\x03\x00'
| >>> x.tobytes('C') == x.tobytes()
| True
| >>> x.tobytes('F')
| b'\x00\x00\x02\x00\x01\x00\x03\x00'
|
| trace(...)
| a.trace(offset=0, axis1=0, axis2=1, dtype=None, out=None)
|
| Return the sum along diagonals of the array.
|
| Refer to `numpy.trace` for full documentation.
|
| See Also
| --------
| numpy.trace : equivalent function
|
| transpose(...)
| a.transpose(*axes)
|
| Returns a view of the array with axes transposed.
|
| For a 1-D array this has no effect, as a transposed vector is simply the
| same vector. To convert a 1-D array into a 2D column vector, an additional
| dimension must be added. `np.atleast_2d(a).T` achieves this, as does
| `a[:, np.newaxis]`.
| For a 2-D array, this is a standard matrix transpose.
| For an n-D array, if axes are given, their order indicates how the
| axes are permuted (see Examples). If axes are not provided and
| ``a.shape = (i[0], i[1], ... i[n-2], i[n-1])``, then
| ``a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0])``.
|
| Parameters
| ----------
| axes : None, tuple of ints, or `n` ints
|
| * None or no argument: reverses the order of the axes.
|
| * tuple of ints: `i` in the `j`-th place in the tuple means `a`'s
| `i`-th axis becomes `a.transpose()`'s `j`-th axis.
|
| * `n` ints: same as an n-tuple of the same ints (this form is
| intended simply as a "convenience" alternative to the tuple form)
|
| Returns
| -------
| out : ndarray
| View of `a`, with axes suitably permuted.
|
| See Also
| --------
| ndarray.T : Array property returning the array transposed.
| ndarray.reshape : Give a new shape to an array without changing its data.
|
| Examples
| --------
| >>> a = np.array([[1, 2], [3, 4]])
| >>> a
| array([[1, 2],
| [3, 4]])
| >>> a.transpose()
| array([[1, 3],
| [2, 4]])
| >>> a.transpose((1, 0))
| array([[1, 3],
| [2, 4]])
| >>> a.transpose(1, 0)
| array([[1, 3],
| [2, 4]])
|
| var(...)
| a.var(axis=None, dtype=None, out=None, ddof=0, keepdims=False)
|
| Returns the variance of the array elements, along given axis.
|
| Refer to `numpy.var` for full documentation.
|
| See Also
| --------
| numpy.var : equivalent function
|
| view(...)
| a.view(dtype=None, type=None)
|
| New view of array with the same data.
|
| Parameters
| ----------
| dtype : data-type or ndarray sub-class, optional
| Data-type descriptor of the returned view, e.g., float32 or int16. The
| default, None, results in the view having the same data-type as `a`.
| This argument can also be specified as an ndarray sub-class, which
| then specifies the type of the returned object (this is equivalent to
| setting the ``type`` parameter).
| type : Python type, optional
| Type of the returned view, e.g., ndarray or matrix. Again, the
| default None results in type preservation.
|
| Notes
| -----
| ``a.view()`` is used two different ways:
|
| ``a.view(some_dtype)`` or ``a.view(dtype=some_dtype)`` constructs a view
| of the array's memory with a different data-type. This can cause a
| reinterpretation of the bytes of memory.
|
| ``a.view(ndarray_subclass)`` or ``a.view(type=ndarray_subclass)`` just
| returns an instance of `ndarray_subclass` that looks at the same array
| (same shape, dtype, etc.) This does not cause a reinterpretation of the
| memory.
|
| For ``a.view(some_dtype)``, if ``some_dtype`` has a different number of
| bytes per entry than the previous dtype (for example, converting a
| regular array to a structured array), then the behavior of the view
| cannot be predicted just from the superficial appearance of ``a`` (shown
| by ``print(a)``). It also depends on exactly how ``a`` is stored in
| memory. Therefore if ``a`` is C-ordered versus fortran-ordered, versus
| defined as a slice or transpose, etc., the view may give different
| results.
|
|
| Examples
| --------
| >>> x = np.array([(1, 2)], dtype=[('a', np.int8), ('b', np.int8)])
|
| Viewing array data using a different type and dtype:
|
| >>> y = x.view(dtype=np.int16, type=np.matrix)
| >>> y
| matrix([[513]], dtype=int16)
| >>> print(type(y))
| <class 'numpy.matrix'>
|
| Creating a view on a structured array so it can be used in calculations
|
| >>> x = np.array([(1, 2),(3,4)], dtype=[('a', np.int8), ('b', np.int8)])
| >>> xv = x.view(dtype=np.int8).reshape(-1,2)
| >>> xv
| array([[1, 2],
| [3, 4]], dtype=int8)
| >>> xv.mean(0)
| array([2., 3.])
|
| Making changes to the view changes the underlying array
|
| >>> xv[0,1] = 20
| >>> x
| array([(1, 20), (3, 4)], dtype=[('a', 'i1'), ('b', 'i1')])
|
| Using a view to convert an array to a recarray:
|
| >>> z = x.view(np.recarray)
| >>> z.a
| array([1, 3], dtype=int8)
|
| Views share data:
|
| >>> x[0] = (9, 10)
| >>> z[0]
| (9, 10)
|
| Views that change the dtype size (bytes per entry) should normally be
| avoided on arrays defined by slices, transposes, fortran-ordering, etc.:
|
| >>> x = np.array([[1,2,3],[4,5,6]], dtype=np.int16)
| >>> y = x[:, 0:2]
| >>> y
| array([[1, 2],
| [4, 5]], dtype=int16)
| >>> y.view(dtype=[('width', np.int16), ('length', np.int16)])
| Traceback (most recent call last):
| ...
| ValueError: To change to a dtype of a different size, the array must be C-contiguous
| >>> z = y.copy()
| >>> z.view(dtype=[('width', np.int16), ('length', np.int16)])
| array([[(1, 2)],
| [(4, 5)]], dtype=[('width', '<i2'), ('length', '<i2')])
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| T
| The transposed array.
|
| Same as ``self.transpose()``.
|
| Examples
| --------
| >>> x = np.array([[1.,2.],[3.,4.]])
| >>> x
| array([[ 1., 2.],
| [ 3., 4.]])
| >>> x.T
| array([[ 1., 3.],
| [ 2., 4.]])
| >>> x = np.array([1.,2.,3.,4.])
| >>> x
| array([ 1., 2., 3., 4.])
| >>> x.T
| array([ 1., 2., 3., 4.])
|
| See Also
| --------
| transpose
|
| __array_finalize__
| None.
|
| __array_interface__
| Array protocol: Python side.
|
| __array_priority__
| Array priority.
|
| __array_struct__
| Array protocol: C-struct side.
|
| base
| Base object if memory is from some other object.
|
| Examples
| --------
| The base of an array that owns its memory is None:
|
| >>> x = np.array([1,2,3,4])
| >>> x.base is None
| True
|
| Slicing creates a view, whose memory is shared with x:
|
| >>> y = x[2:]
| >>> y.base is x
| True
|
| ctypes
| An object to simplify the interaction of the array with the ctypes
| module.
|
| This attribute creates an object that makes it easier to use arrays
| when calling shared libraries with the ctypes module. The returned
| object has, among others, data, shape, and strides attributes (see
| Notes below) which themselves return ctypes objects that can be used
| as arguments to a shared library.
|
| Parameters
| ----------
| None
|
| Returns
| -------
| c : Python object
| Possessing attributes data, shape, strides, etc.
|
| See Also
| --------
| numpy.ctypeslib
|
| Notes
| -----
| Below are the public attributes of this object which were documented
| in "Guide to NumPy" (we have omitted undocumented public attributes,
| as well as documented private attributes):
|
| .. autoattribute:: numpy.core._internal._ctypes.data
| :noindex:
|
| .. autoattribute:: numpy.core._internal._ctypes.shape
| :noindex:
|
| .. autoattribute:: numpy.core._internal._ctypes.strides
| :noindex:
|
| .. automethod:: numpy.core._internal._ctypes.data_as
| :noindex:
|
| .. automethod:: numpy.core._internal._ctypes.shape_as
| :noindex:
|
| .. automethod:: numpy.core._internal._ctypes.strides_as
| :noindex:
|
| If the ctypes module is not available, then the ctypes attribute
| of array objects still returns something useful, but ctypes objects
| are not returned and errors may be raised instead. In particular,
| the object will still have the ``as_parameter`` attribute which will
| return an integer equal to the data attribute.
|
| Examples
| --------
| >>> import ctypes
| >>> x
| array([[0, 1],
| [2, 3]])
| >>> x.ctypes.data
| 30439712
| >>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_long))
| <ctypes.LP_c_long object at 0x01F01300>
| >>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_long)).contents
| c_long(0)
| >>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_longlong)).contents
| c_longlong(4294967296L)
| >>> x.ctypes.shape
| <numpy.core._internal.c_long_Array_2 object at 0x01FFD580>
| >>> x.ctypes.shape_as(ctypes.c_long)
| <numpy.core._internal.c_long_Array_2 object at 0x01FCE620>
| >>> x.ctypes.strides
| <numpy.core._internal.c_long_Array_2 object at 0x01FCE620>
| >>> x.ctypes.strides_as(ctypes.c_longlong)
| <numpy.core._internal.c_longlong_Array_2 object at 0x01F01300>
|
| data
| Python buffer object pointing to the start of the array's data.
|
| dtype
| Data-type of the array's elements.
|
| Parameters
| ----------
| None
|
| Returns
| -------
| d : numpy dtype object
|
| See Also
| --------
| numpy.dtype
|
| Examples
| --------
| >>> x
| array([[0, 1],
| [2, 3]])
| >>> x.dtype
| dtype('int32')
| >>> type(x.dtype)
| <type 'numpy.dtype'>
|
| flags
| Information about the memory layout of the array.
|
| Attributes
| ----------
| C_CONTIGUOUS (C)
| The data is in a single, C-style contiguous segment.
| F_CONTIGUOUS (F)
| The data is in a single, Fortran-style contiguous segment.
| OWNDATA (O)
| The array owns the memory it uses or borrows it from another object.
| WRITEABLE (W)
| The data area can be written to. Setting this to False locks
| the data, making it read-only. A view (slice, etc.) inherits WRITEABLE
| from its base array at creation time, but a view of a writeable
| array may be subsequently locked while the base array remains writeable.
| (The opposite is not true, in that a view of a locked array may not
| be made writeable. However, currently, locking a base object does not
| lock any views that already reference it, so under that circumstance it
| is possible to alter the contents of a locked array via a previously
| created writeable view onto it.) Attempting to change a non-writeable
| array raises a RuntimeError exception.
| ALIGNED (A)
| The data and all elements are aligned appropriately for the hardware.
| WRITEBACKIFCOPY (X)
| This array is a copy of some other array. The C-API function
| PyArray_ResolveWritebackIfCopy must be called before deallocating
| to the base array will be updated with the contents of this array.
| UPDATEIFCOPY (U)
| (Deprecated, use WRITEBACKIFCOPY) This array is a copy of some other array.
| When this array is
| deallocated, the base array will be updated with the contents of
| this array.
| FNC
| F_CONTIGUOUS and not C_CONTIGUOUS.
| FORC
| F_CONTIGUOUS or C_CONTIGUOUS (one-segment test).
| BEHAVED (B)
| ALIGNED and WRITEABLE.
| CARRAY (CA)
| BEHAVED and C_CONTIGUOUS.
| FARRAY (FA)
| BEHAVED and F_CONTIGUOUS and not C_CONTIGUOUS.
|
| Notes
| -----
| The `flags` object can be accessed dictionary-like (as in ``a.flags['WRITEABLE']``),
| or by using lowercased attribute names (as in ``a.flags.writeable``). Short flag
| names are only supported in dictionary access.
|
| Only the WRITEBACKIFCOPY, UPDATEIFCOPY, WRITEABLE, and ALIGNED flags can be
| changed by the user, via direct assignment to the attribute or dictionary
| entry, or by calling `ndarray.setflags`.
|
| The array flags cannot be set arbitrarily:
|
| - UPDATEIFCOPY can only be set ``False``.
| - WRITEBACKIFCOPY can only be set ``False``.
| - ALIGNED can only be set ``True`` if the data is truly aligned.
| - WRITEABLE can only be set ``True`` if the array owns its own memory
| or the ultimate owner of the memory exposes a writeable buffer
| interface or is a string.
|
| Arrays can be both C-style and Fortran-style contiguous simultaneously.
| This is clear for 1-dimensional arrays, but can also be true for higher
| dimensional arrays.
|
| Even for contiguous arrays a stride for a given dimension
| ``arr.strides[dim]`` may be *arbitrary* if ``arr.shape[dim] == 1``
| or the array has no elements.
| It does *not* generally hold that ``self.strides[-1] == self.itemsize``
| for C-style contiguous arrays or ``self.strides[0] == self.itemsize`` for
| Fortran-style contiguous arrays is true.
|
| flat
| A 1-D iterator over the array.
|
| This is a `numpy.flatiter` instance, which acts similarly to, but is not
| a subclass of, Python's built-in iterator object.
|
| See Also
| --------
| flatten : Return a copy of the array collapsed into one dimension.
|
| flatiter
|
| Examples
| --------
| >>> x = np.arange(1, 7).reshape(2, 3)
| >>> x
| array([[1, 2, 3],
| [4, 5, 6]])
| >>> x.flat[3]
| 4
| >>> x.T
| array([[1, 4],
| [2, 5],
| [3, 6]])
| >>> x.T.flat[3]
| 5
| >>> type(x.flat)
| <class 'numpy.flatiter'>
|
| An assignment example:
|
| >>> x.flat = 3; x
| array([[3, 3, 3],
| [3, 3, 3]])
| >>> x.flat[[1,4]] = 1; x
| array([[3, 1, 3],
| [3, 1, 3]])
|
| imag
| The imaginary part of the array.
|
| Examples
| --------
| >>> x = np.sqrt([1+0j, 0+1j])
| >>> x.imag
| array([ 0. , 0.70710678])
| >>> x.imag.dtype
| dtype('float64')
|
| itemsize
| Length of one array element in bytes.
|
| Examples
| --------
| >>> x = np.array([1,2,3], dtype=np.float64)
| >>> x.itemsize
| 8
| >>> x = np.array([1,2,3], dtype=np.complex128)
| >>> x.itemsize
| 16
|
| nbytes
| Total bytes consumed by the elements of the array.
|
| Notes
| -----
| Does not include memory consumed by non-element attributes of the
| array object.
|
| Examples
| --------
| >>> x = np.zeros((3,5,2), dtype=np.complex128)
| >>> x.nbytes
| 480
| >>> np.prod(x.shape) * x.itemsize
| 480
|
| ndim
| Number of array dimensions.
|
| Examples
| --------
| >>> x = np.array([1, 2, 3])
| >>> x.ndim
| 1
| >>> y = np.zeros((2, 3, 4))
| >>> y.ndim
| 3
|
| real
| The real part of the array.
|
| Examples
| --------
| >>> x = np.sqrt([1+0j, 0+1j])
| >>> x.real
| array([ 1. , 0.70710678])
| >>> x.real.dtype
| dtype('float64')
|
| See Also
| --------
| numpy.real : equivalent function
|
| shape
| Tuple of array dimensions.
|
| The shape property is usually used to get the current shape of an array,
| but may also be used to reshape the array in-place by assigning a tuple of
| array dimensions to it. As with `numpy.reshape`, one of the new shape
| dimensions can be -1, in which case its value is inferred from the size of
| the array and the remaining dimensions. Reshaping an array in-place will
| fail if a copy is required.
|
| Examples
| --------
| >>> x = np.array([1, 2, 3, 4])
| >>> x.shape
| (4,)
| >>> y = np.zeros((2, 3, 4))
| >>> y.shape
| (2, 3, 4)
| >>> y.shape = (3, 8)
| >>> y
| array([[ 0., 0., 0., 0., 0., 0., 0., 0.],
| [ 0., 0., 0., 0., 0., 0., 0., 0.],
| [ 0., 0., 0., 0., 0., 0., 0., 0.]])
| >>> y.shape = (3, 6)
| Traceback (most recent call last):
| File "<stdin>", line 1, in <module>
| ValueError: total size of new array must be unchanged
| >>> np.zeros((4,2))[::2].shape = (-1,)
| Traceback (most recent call last):
| File "<stdin>", line 1, in <module>
| AttributeError: incompatible shape for a non-contiguous array
|
| See Also
| --------
| numpy.reshape : similar function
| ndarray.reshape : similar method
|
| size
| Number of elements in the array.
|
| Equal to ``np.prod(a.shape)``, i.e., the product of the array's
| dimensions.
|
| Notes
| -----
| `a.size` returns a standard arbitrary precision Python integer. This
| may not be the case with other methods of obtaining the same value
| (like the suggested ``np.prod(a.shape)``, which returns an instance
| of ``np.int_``), and may be relevant if the value is used further in
| calculations that may overflow a fixed size integer type.
|
| Examples
| --------
| >>> x = np.zeros((3, 5, 2), dtype=np.complex128)
| >>> x.size
| 30
| >>> np.prod(x.shape)
| 30
|
| strides
| Tuple of bytes to step in each dimension when traversing an array.
|
| The byte offset of element ``(i[0], i[1], ..., i[n])`` in an array `a`
| is::
|
| offset = sum(np.array(i) * a.strides)
|
| A more detailed explanation of strides can be found in the
| "ndarray.rst" file in the NumPy reference guide.
|
| Notes
| -----
| Imagine an array of 32-bit integers (each 4 bytes)::
|
| x = np.array([[0, 1, 2, 3, 4],
| [5, 6, 7, 8, 9]], dtype=np.int32)
|
| This array is stored in memory as 40 bytes, one after the other
| (known as a contiguous block of memory). The strides of an array tell
| us how many bytes we have to skip in memory to move to the next position
| along a certain axis. For example, we have to skip 4 bytes (1 value) to
| move to the next column, but 20 bytes (5 values) to get to the same
| position in the next row. As such, the strides for the array `x` will be
| ``(20, 4)``.
|
| See Also
| --------
| numpy.lib.stride_tricks.as_strided
|
| Examples
| --------
| >>> y = np.reshape(np.arange(2*3*4), (2,3,4))
| >>> y
| array([[[ 0, 1, 2, 3],
| [ 4, 5, 6, 7],
| [ 8, 9, 10, 11]],
| [[12, 13, 14, 15],
| [16, 17, 18, 19],
| [20, 21, 22, 23]]])
| >>> y.strides
| (48, 16, 4)
| >>> y[1,1,1]
| 17
| >>> offset=sum(y.strides * np.array((1,1,1)))
| >>> offset/y.itemsize
| 17
|
| >>> x = np.reshape(np.arange(5*6*7*8), (5,6,7,8)).transpose(2,3,1,0)
| >>> x.strides
| (32, 4, 224, 1344)
| >>> i = np.array([3,5,2,2])
| >>> offset = sum(i * x.strides)
| >>> x[3,5,2,2]
| 813
| >>> offset / x.itemsize
| 813
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| __hash__ = None
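A few of the ndarray methods and attributes documented in the help text above can be tried directly. The following is a minimal sketch that reuses the small arrays from the docstring examples; the variable names are illustrative only.

```python
import numpy as np

# sort() works in place; tolist() converts to nested Python lists
a = np.array([[1, 4], [3, 1]])
a.sort(axis=1)
sorted_rows = a.tolist()

# tobytes() returns the raw bytes in C order by default, or Fortran order with 'F'
x = np.array([[0, 1], [2, 3]], dtype='<u2')
raw_c = x.tobytes()
raw_f = x.tobytes('F')

# strides give the byte step per axis, so they locate an element in memory
y = np.arange(24, dtype=np.int32).reshape(2, 3, 4)
offset = sum(np.array((1, 1, 1)) * y.strides)
element = y[1, 1, 1]
print(sorted_rows, raw_c, raw_f, offset // y.itemsize, element)
```

Dividing the byte offset by `itemsize` gives the flat position of the element, which for this C-ordered `arange` array equals the element's value.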
# No index specified: a default integer index (0 .. n-1) is used
data = np.array(list('abcd'))
s = pd.Series(data)
print(s)
0 a
1 b
2 c
3 d
dtype: object
# Create a Series with an explicit index
s = pd.Series(list('qwer'), index=list('ABCD'))
print(s)
A q
B w
C e
D r
dtype: object
# Create a Series from a scalar; the value is repeated for every index label
print(pd.Series(5, index=[0, 1, 2, 3]))
0 5
1 5
2 5
3 5
dtype: int64
# Create a Series from a dict with dtype float and no explicit index (the dict keys become the index)
data = {'a':0,'b':1,'c':2}
s = pd.Series(data,dtype=float)
print(s)
a 0.0
b 1.0
c 2.0
dtype: float64
# Create a Series from a dict with an explicit index; labels that have no matching key in data are filled with NaN (missing value)
data = {'a':0,'b':1,'c':2}
s = pd.Series(data,index=['a','b','d','c'],dtype=float)
print(s)
a 0.0
b 1.0
d NaN
c 2.0
dtype: float64
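The NaN fill can be inspected and replaced afterwards. This is a minimal sketch under the same data as above; the names `missing` and `filled`, and the fill value -1, are assumptions made for illustration.

```python
import pandas as pd

data = {'a': 0, 'b': 1, 'c': 2}
# 'd' has no key in data, so its value is NaN and the dtype becomes float64
s = pd.Series(data, index=['a', 'b', 'd', 'c'])

# Labels whose value is missing
missing = s[s.isna()].index.tolist()

# Replace the missing value, then cast back to integers
filled = s.fillna(-1).astype(int)
print(missing)
print(filled)
```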
# Accessing the elements of a Series
# 1. By position (use .iloc, because this Series has string labels)
print(s.iloc[0])
0.0
# 2. By index label
print(s[['a','b']])
a 0.0
b 1.0
dtype: float64
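Positional and label-based access are easiest to keep apart with the explicit `.iloc` and `.loc` accessors. A minimal sketch, assuming a small Series similar to the one above:

```python
import pandas as pd

s = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])

first = s.iloc[0]              # by position
by_label = s.loc['b']          # by label
subset = s.loc[['a', 'c']]     # a list of labels returns a Series

# A label slice includes both end points; a position slice excludes the stop
label_slice = s.loc['a':'b']
pos_slice = s.iloc[0:1]
print(first, by_label, len(label_slice), len(pos_slice))
```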
# 3. Looking up a label that is not in the index raises a KeyError
print(s['g'])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
c:\python37-32\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
4728 try:
-> 4729 return libindex.get_value_box(s, key)
4730 except IndexError:
pandas\_libs\index.pyx in pandas._libs.index.get_value_box()
pandas\_libs\index.pyx in pandas._libs.index.get_value_at()
pandas\_libs\util.pxd in pandas._libs.util.get_value_at()
pandas\_libs\util.pxd in pandas._libs.util.validate_indexer()
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-30-df1755480940> in <module>
1 # 3. Looking up a label that is not in the index raises a KeyError
----> 2 print(s['g'])
c:\python37-32\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
1062 key = com.apply_if_callable(key, self)
1063 try:
-> 1064 result = self.index.get_value(self, key)
1065
1066 if not is_scalar(result):
c:\python37-32\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
4735 raise InvalidIndexError(key)
4736 else:
-> 4737 raise e1
4738 except Exception: # pragma: no cover
4739 raise e1
c:\python37-32\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
4721 k = self._convert_scalar_indexer(k, kind="getitem")
4722 try:
-> 4723 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4724 except KeyError as e1:
4725 if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'g'
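When a label may be absent, the KeyError can be avoided by testing membership first or by using `Series.get`, which returns a default instead of raising. A minimal sketch; the default value -1.0 is an arbitrary choice for illustration.

```python
import pandas as pd

s = pd.Series([0.0, 1.0], index=['a', 'b'])

has_g = 'g' in s.index          # membership test on the index
value = s.get('g')              # None when the label is missing
fallback = s.get('g', -1.0)     # or an explicit default

# Plain [] indexing still raises KeyError for an unknown label
try:
    s['g']
except KeyError as exc:
    caught = repr(exc)
print(has_g, value, fallback, caught)
```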
DataFrame
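# A DataFrame is a two-dimensional data structure: the data is arranged as a table of rows and columns
# Besides the data it carries row labels and column labels, which default to 0 .. n-1 (n = number of rows or columns)
# A DataFrame can be created from lists, dicts, Series, numpy.ndarray, or another DataFrame
# pandas.DataFrame(data, index, columns, dtype, copy)
# Create an empty DataFrame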
df = pd.DataFrame()
df
# Create a DataFrame from a list
# From a plain list
data = [1,2,3,4]
df = pd.DataFrame(data)
print(df)
0
0 1
1 2
2 3
3 4
# From a list of lists such as [[A1, B1], [A2, B2]], which gives two columns
data = [['Row1',12],['Row2',23],['Row3',34]]
df = pd.DataFrame(data)
df
      0   1
0  Row1  12
1  Row2  23
2  Row3  34
# Create with explicit column names (a list keeps the column order; only the numeric column is cast to float)
df = pd.DataFrame(data, columns=['Col1', 'Col2']).astype({'Col2': float})
df
   Col1  Col2
0  Row1  12.0
1  Row2  23.0
2  Row3  34.0
# Create a DataFrame from a NumPy ndarray
data = np.ones([5,6],dtype=int)
df = pd.DataFrame(data,columns=list('ABCDEF'))
df
   A  B  C  D  E  F
0  1  1  1  1  1  1
1  1  1  1  1  1  1
2  1  1  1  1  1  1
3  1  1  1  1  1  1
4  1  1  1  1  1  1
# Create a DataFrame from a dict of Series
d = {'one':pd.Series([1,2,3],index = ['A','B','C']),
'two':pd.Series([1,2,3,4,5],index = list('ABCDE'))}
df = pd.DataFrame(d)
df
   one  two
A  1.0    1
B  2.0    2
C  3.0    3
D  NaN    4
E  NaN    5
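The result above comes from index alignment: the row index is the union of the Series indexes, and labels missing from a Series become NaN, which in turn forces that column to float. A minimal sketch that checks this on the same data:

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['A', 'B', 'C']),
     'two': pd.Series([1, 2, 3, 4, 5], index=list('ABCDE'))}
df = pd.DataFrame(d)

row_labels = df.index.tolist()              # union of both indexes
n_missing = int(df['one'].isna().sum())     # D and E are missing in 'one'
dtypes = df.dtypes.astype(str).to_dict()    # 'one' is float64, 'two' stays int64
print(row_labels, n_missing, dtypes)
```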
# Create from a dict of lists
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
df
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42
# Specify the index
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
df
        Name  Age
rank1    Tom   28
rank2   Jack   34
rank3  Steve   29
rank4  Ricky   42
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
df
   a   b     c
0  1   2   NaN
1  5  10  20.0
# Same as above, but with an explicit index
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
df
        a   b     c
first   1   2   NaN
second  5  10  20.0
# Build the columns from the given column names; extra keys in data are ignored
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
df1
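The `columns` argument both selects and creates columns: keys not listed are dropped, and a listed name with no matching key yields a column of NaN. A minimal sketch; the column name `b1` is a made-up label used only to show the second case.

```python
import pandas as pd

data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]

# 'c' exists in data but is not requested, so it is ignored
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])

# 'b1' is requested but does not exist in data, so the column is all NaN
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
```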
# For nested data of the form {A: {B: ...}}, A becomes the column label, B the index label, and the rest the values
data = {'cat':{'name':['Tom','Jack','Steve', 'Ricky'],'Age':[8,4,9,2]},'dog':{'name':['Jon','Dany','Pipy'],'Age':[3,5,2]}}
df = pd.DataFrame(data)
df
                            cat                dog
name  [Tom, Jack, Steve, Ricky]  [Jon, Dany, Pipy]
Age                [8, 4, 9, 2]          [3, 5, 2]
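In the result above every cell holds a whole list, because the inner values are lists. With scalar inner values the same outer-key-to-column rule gives an ordinary table. A minimal sketch with made-up scalar data; `from_dict(..., orient='index')` is shown as one way to get one row per outer key instead.

```python
import pandas as pd

# Hypothetical scalar data: outer keys become columns, inner keys become the index
data = {'cat': {'name': 'Tom', 'age': 8},
        'dog': {'name': 'Jon', 'age': 3}}
df = pd.DataFrame(data)

# orient='index' turns the outer keys into row labels instead
by_animal = pd.DataFrame.from_dict(data, orient='index')
print(df)
print(by_animal)
```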
# Selecting elements of a DataFrame
# 1. Select by label
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
print()
df.loc['b']
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
one 2.0
two 2.0
Name: b, dtype: float64
# 2. Select by integer position
print(df)
print()
df.iloc[2]
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
one 3.0
two 3.0
Name: c, dtype: float64
# 3. Select by slicing
print(df)
print()
df[2:4]
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
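A plain slice such as `df[2:4]` selects rows by position and excludes the end point, while a label slice through `loc` includes both end points. The sketch below rebuilds a frame of the same shape under the illustrative name `df_sel` to show the difference.

```python
import pandas as pd

# Same shape as the frame above, rebuilt under an illustrative name
df_sel = pd.DataFrame(
    {'one': [1.0, 2.0, 3.0, float('nan')], 'two': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

by_position = df_sel[2:4]        # rows at positions 2 and 3 (end excluded)
by_iloc = df_sel.iloc[2:4]       # explicit positional form of the same slice
by_label = df_sel.loc['b':'c']   # label slice: both 'b' and 'c' are included

print(by_position)
print(by_label)
```

Using `iloc` or `loc` explicitly tends to be clearer than the bare slice, since the reader does not have to remember which rule applies.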
# Append rows (DataFrame.append was removed in pandas 2.0; use pd.concat instead)
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df = pd.concat([df, df2])
df
   a  b
0  1  2
1  3  4
0  5  6
1  7  8
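The result above keeps each frame's original row labels, so 0 and 1 each appear twice. `pd.concat`, which replaces `DataFrame.append` (removed in pandas 2.0), can renumber the rows with `ignore_index=True`. A short sketch, with `top`, `bottom`, `kept`, and `renumbered` as illustrative names:

```python
import pandas as pd

top = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
bottom = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# Default: original labels are kept, so the index is 0, 1, 0, 1
kept = pd.concat([top, bottom])

# ignore_index=True discards the old labels and renumbers 0..n-1
renumbered = pd.concat([top, bottom], ignore_index=True)

print(kept)
print(renumbered)
```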
# Drop rows (drop(1) removes every row whose label is 1)
print(df)
print()
df = df.drop(1)
print()
df
a b
0 1 2
1 3 4
0 5 6
1 7 8
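Because `drop` works on labels, a frame with duplicate labels loses every matching row. The sketch below shows this and one possible way to remove a single row by position instead; the names `dup`, `dropped_by_label`, and `one_row_removed` are illustrative.

```python
import pandas as pd

# Frame with duplicate row labels, as produced by concatenating two frames
dup = pd.DataFrame({'a': [1, 3, 5, 7], 'b': [2, 4, 6, 8]}, index=[0, 1, 0, 1])

# drop(1) removes both rows labeled 1
dropped_by_label = dup.drop(1)

# To remove only the second row, make the labels unique first
one_row_removed = dup.reset_index(drop=True).drop(1)

print(dropped_by_label)
print(one_row_removed)
```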
Common DataFrame attributes
dtypes, empty, ndim, values, shape, T (transpose)
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Minsu','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
df = pd.DataFrame(d)
# empty attribute
df.empty
False
# dtypes attribute
df.dtypes
Name object
Age int64
Rating float64
dtype: object
# ndim attribute
df.ndim
2
# values attribute
df.values
array([['Tom', 25, 4.23],
['James', 26, 3.24],
['Ricky', 25, 3.98],
['Vin', 23, 2.56],
['Steve', 30, 3.2],
['Minsu', 29, 4.6],
['Jack', 23, 3.8]], dtype=object)
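The attribute list above also names `shape` and `T`, which are not demonstrated. A brief sketch on a smaller, made-up frame (`people` and `transposed` are illustrative names):

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Tom', 'James'],
    'Age': [25, 26],
    'Rating': [4.23, 3.24],
})

# shape is (number of rows, number of columns)
print(people.shape)

# T swaps rows and columns: column names become the row labels
transposed = people.T
print(transposed)

# With mixed column types, the underlying array falls back to dtype object
print(people.to_numpy().dtype)
```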
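Differences between the loc, iloc, and ix indexers of a DataFrame
# 1. loc indexes only by row and column labels
#    iloc indexes only by the integer positions of rows and columns
#    ix accepted both styles, but it was deprecated in pandas 0.20 and removed in pandas 1.0; use loc or iloc instead
# 2. If both the row and column indexers are lists or an equivalent form (such as :), the result is a DataFrame
#    If exactly one of the row and column indexers is a single value, the result is a Series
#    If both the row and column indexers are single values, the result is a single element of the DataFrame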
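The return-type rules above can be checked directly. This is a minimal sketch on a made-up frame named `df_idx`; the other variable names are also illustrative.

```python
import pandas as pd

df_idx = pd.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6]}, index=['a', 'b', 'c'])

block = df_idx.loc[['a', 'b'], :]   # list + slice -> DataFrame
row = df_idx.loc['b', :]            # single label + slice -> Series
cell = df_idx.loc['b', 'two']       # two single labels -> scalar
same_cell = df_idx.iloc[1, 1]       # same element, addressed by position

print(type(block).__name__, type(row).__name__, cell, same_cell)
```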
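Commonly used pandas functions
# reindex: conform the data to a new index
# reindex_like: conform to the same index as another object
# rename: relabel rows and columns
# sort_index: sort by row labels
# sort_values: sort by values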
unsorted_df=pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],columns=['col2','col1'])
print (unsorted_df)
col2 col1
1 0.588919 -1.286545
4 -0.216037 0.229273
6 0.980793 0.251861
2 -0.179507 0.115554
3 0.226075 0.446773
5 -2.184631 0.040103
9 1.593689 0.378275
8 0.778931 -0.134482
0 1.294250 -0.086114
7 -0.756915 -0.446006
# reindex (labels missing from the original, such as column 'B' here, are filled with NaN)
N=20
df = pd.DataFrame({
'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
'x': np.linspace(0,stop=N-1,num=N),
'y': np.random.rand(N),
'C': np.random.choice(['Low','Medium','High'],N).tolist(),
'D': np.random.normal(100, 10, size=(N)).tolist()
})
print(df)
print()
df_reindexed = df.reindex(index=[0,2,5], columns=['A', 'C', 'B'])
print (df_reindexed)
A x y C D
0 2016-01-01 0.0 0.661369 Low 83.894715
1 2016-01-02 1.0 0.790472 Medium 116.237968
2 2016-01-03 2.0 0.302724 Low 90.430196
3 2016-01-04 3.0 0.751360 Low 102.215610
4 2016-01-05 4.0 0.519902 High 102.216478
5 2016-01-06 5.0 0.438849 Medium 106.463023
6 2016-01-07 6.0 0.885001 Low 89.814096
7 2016-01-08 7.0 0.109956 Medium 90.270910
8 2016-01-09 8.0 0.389458 Low 98.465227
9 2016-01-10 9.0 0.318306 High 118.776363
10 2016-01-11 10.0 0.152488 Low 98.696870
11 2016-01-12 11.0 0.485482 High 116.381673
12 2016-01-13 12.0 0.387862 Medium 114.534515
13 2016-01-14 13.0 0.637408 Low 94.842515
14 2016-01-15 14.0 0.456157 Medium 97.441336
15 2016-01-16 15.0 0.015865 Low 100.749580
16 2016-01-17 16.0 0.833705 Medium 115.324600
17 2016-01-18 17.0 0.093276 Medium 96.512771
18 2016-01-19 18.0 0.797762 Low 95.039926
19 2016-01-20 19.0 0.336154 High 103.888419
A C B
0 2016-01-01 Low NaN
2 2016-01-03 Low NaN
5 2016-01-06 Medium NaN
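As the output shows, `reindex` fills labels that do not exist in the original with NaN. The `fill_value` parameter supplies a different default. A small sketch with made-up data; `scores`, `with_nan`, and `with_default` are illustrative names.

```python
import pandas as pd

scores = pd.DataFrame({'A': [10, 20, 30], 'C': ['Low', 'High', 'Low']})

# Row label 5 and column 'B' do not exist, so they are filled with NaN
with_nan = scores.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])

# fill_value replaces the NaN default for newly created labels
with_default = scores[['A']].reindex(index=[0, 2, 5], fill_value=0)

print(with_nan)
print(with_default)
```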
# reindex_like
df1 = pd.DataFrame(np.random.randn(10,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(7,3),columns=['col1','col2','col3'])
df1 = df1.reindex_like(df2)
df1
       col1      col2      col3
0  0.052837 -0.253690 -0.696232
1  1.249527 -1.210593  0.397398
2 -0.527549  0.336701  1.796881
3  1.126255 -1.054539  0.773854
4  0.821721  0.400086 -1.051927
5 -0.490465  0.472505  0.545293
6 -0.652554 -0.858469  0.099429
# rename
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
print(df1)
print()
print ("After renaming the rows and columns:\n")
df1.rename(columns={'col1' : 'c1', 'col2' : 'c2'},
index = {0 : 'apple', 1 : 'banana', 2 : 'durian'})
col1 col2 col3
0 -0.932208 -0.140297 1.568057
1 0.343363 -0.466544 -1.395083
2 -0.313693 1.194218 -1.123901
3 1.051939 0.734287 -0.459811
4 0.389540 1.178237 -0.729928
5 1.581532 0.455418 -0.004001
After renaming the rows and columns:
              c1        c2      col3
apple  -0.932208 -0.140297  1.568057
banana  0.343363 -0.466544 -1.395083
durian -0.313693  1.194218 -1.123901
3       1.051939  0.734287 -0.459811
4       0.389540  1.178237 -0.729928
5       1.581532  0.455418 -0.004001
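Note that `rename` returns a new object and leaves the original unchanged, and labels not listed in the mapping are kept as they are. A minimal sketch with illustrative names (`original`, `renamed`):

```python
import pandas as pd

original = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Only the listed labels change; 'col2' and row 1 are left alone
renamed = original.rename(columns={'col1': 'c1'}, index={0: 'apple'})

print(original.columns.tolist())
print(renamed)
```

To keep the result, assign it back to a name, as in `original = original.rename(...)`.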
# sort_index
help(unsorted_df.sort_index)
Help on method sort_index in module pandas.core.frame:
sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, by=None) method of pandas.core.frame.DataFrame instance
Sort object by labels (along an axis).
Parameters
----------
axis : {0 or 'index', 1 or 'columns'}, default 0
The axis along which to sort. The value 0 identifies the rows,
and 1 identifies the columns.
level : int or level name or list of ints or list of level names
If not None, sort on values in specified index level(s).
ascending : bool, default True
Sort ascending vs. descending.
inplace : bool, default False
If True, perform operation in-place.
kind : {'quicksort', 'mergesort', 'heapsort'}, default 'quicksort'
Choice of sorting algorithm. See also ndarray.np.sort for more
information. `mergesort` is the only stable algorithm. For
DataFrames, this option is only applied when sorting on a single
column or label.
na_position : {'first', 'last'}, default 'last'
Puts NaNs at the beginning if `first`; `last` puts NaNs at the end.
Not implemented for MultiIndex.
sort_remaining : bool, default True
If True and sorting by level and index is multilevel, sort by other
levels too (in order) after sorting by specified level.
Returns
-------
sorted_obj : DataFrame or None
DataFrame with sorted index if inplace=False, None otherwise.
# Sort by row labels
sorted_df=unsorted_df.sort_index()
print (sorted_df)
# Sort by row labels in descending order
sorted_df = unsorted_df.sort_index(ascending=False)
print (sorted_df)
# Sort by column names
sorted_df=unsorted_df.sort_index(axis=1)
print (sorted_df)
col2 col1
0 1.294250 -0.086114
1 0.588919 -1.286545
2 -0.179507 0.115554
3 0.226075 0.446773
4 -0.216037 0.229273
5 -2.184631 0.040103
6 0.980793 0.251861
7 -0.756915 -0.446006
8 0.778931 -0.134482
9 1.593689 0.378275
col2 col1
9 1.593689 0.378275
8 0.778931 -0.134482
7 -0.756915 -0.446006
6 0.980793 0.251861
5 -2.184631 0.040103
4 -0.216037 0.229273
3 0.226075 0.446773
2 -0.179507 0.115554
1 0.588919 -1.286545
0 1.294250 -0.086114
col1 col2
1 -1.286545 0.588919
4 0.229273 -0.216037
6 0.251861 0.980793
2 0.115554 -0.179507
3 0.446773 0.226075
5 0.040103 -2.184631
9 0.378275 1.593689
8 -0.134482 0.778931
0 -0.086114 1.294250
7 -0.446006 -0.756915
# sort_values
# Sort by col1 in descending order
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by='col1',ascending=False)
print (sorted_df)
col1 col2
0 2 1
1 1 3
2 1 2
3 1 4
# Sort by col1 first, then by col2
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by=['col1','col2'])
print (sorted_df)
col1 col2
2 1 2
1 1 3
3 1 4
0 2 1
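When sorting by several columns, `ascending` also accepts one flag per column, and `kind='mergesort'` requests a stable sort so that ties keep their original order. The following sketch reuses the same data under the illustrative names `unsorted`, `mixed`, and `stable`.

```python
import pandas as pd

unsorted = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

# col1 ascending, then col2 descending within equal col1 values
mixed = unsorted.sort_values(by=['col1', 'col2'], ascending=[True, False])

# A stable sort keeps rows with equal col1 in their original relative order
stable = unsorted.sort_values(by='col1', kind='mergesort')

print(mixed)
print(stable)
```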
Commonly used string functions
s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu'])
s
0 Tom
1 William Rick
2 John
3 Alber@t
4 NaN
5 1234
6 SteveMinsu
dtype: object
s.str.lower()
0 tom
1 william rick
2 john
3 alber@t
4 NaN
5 1234
6 steveminsu
dtype: object
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Minsu','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
df = pd.DataFrame(d)
df
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Minsu   29    4.60
6   Jack   23    3.80
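# 1 lower() converts the strings in the Series/Index to lowercase.
# 2 upper() converts the strings in the Series/Index to uppercase.
# 3 len() computes the length of each string.
# 4 strip() removes whitespace (including newlines) from both sides of each string in the Series/Index.
# 5 split(' ') splits each string on the given pattern.
# 6 cat(sep=' ') concatenates the Series/Index elements using the given separator.
# 8 contains(pattern) returns a boolean for each element: True if it contains the pattern, otherwise False.
# 9 replace(a,b) replaces a with b.
# 10 repeat(value) repeats each element the specified number of times.
# 11 count(pattern) returns the number of occurrences of the pattern in each element.
# 12 startswith(pattern) returns True if the element in the Series/Index starts with the pattern.
# 13 endswith(pattern) returns True if the element in the Series/Index ends with the pattern.
# 17 islower() checks whether all characters in each string of the Series/Index are lowercase; returns a boolean.
# 18 isupper() checks whether all characters in each string of the Series/Index are uppercase; returns a boolean.
# 19 isnumeric() checks whether all characters in each string of the Series/Index are numeric; returns a boolean.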
df.Name.str.upper()
0 TOM
1 JAMES
2 RICKY
3 VIN
4 STEVE
5 MINSU
6 JACK
Name: Name, dtype: object
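A few of the string methods listed above, applied through the `.str` accessor, might look like the sketch below. The sample data and variable names are made up for illustration; note how missing values pass through as NaN unless a method is told otherwise.

```python
import numpy as np
import pandas as pd

names = pd.Series(['Tom', ' William Rick ', 'John', np.nan, '1234'])

lengths = names.str.len()          # NaN stays NaN
stripped = names.str.strip()       # remove leading and trailing whitespace
has_space = stripped.str.contains(' ', na=False)  # na=False counts missing values as no match
parts = stripped.str.split(' ')    # each element becomes a list of pieces
digits_only = names.str.isnumeric()

print(stripped)
print(has_space)
print(parts)
```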
Iterating in pandas
N=20
df = pd.DataFrame({
'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
'x': np.linspace(0,stop=N-1,num=N),
'y': np.random.rand(N),
'C': np.random.choice(['Low','Medium','High'],N).tolist(),
'D': np.random.normal(100, 10, size=(N)).tolist()
})
df
             A     x         y       C           D
0   2016-01-01   0.0  0.446246     Low   90.082811
1   2016-01-02   1.0  0.820547  Medium   85.204472
2   2016-01-03   2.0  0.482619  Medium  105.942642
3   2016-01-04   3.0  0.718412     Low  115.687592
4   2016-01-05   4.0  0.331059  Medium  102.752829
5   2016-01-06   5.0  0.538200     Low   98.576102
6   2016-01-07   6.0  0.513321  Medium  104.510604
7   2016-01-08   7.0  0.035004     Low   91.549099
8   2016-01-09   8.0  0.732639     Low   85.738154
9   2016-01-10   9.0  0.129754     Low  106.541511
10  2016-01-11  10.0  0.985311  Medium  102.242245
11  2016-01-12  11.0  0.116828    High  101.988366
12  2016-01-13  12.0  0.479230     Low  113.952426
13  2016-01-14  13.0  0.412539  Medium   95.093407
14  2016-01-15  14.0  0.469513     Low  107.428139
15  2016-01-16  15.0  0.552472    High   99.081998
16  2016-01-17  16.0  0.434911     Low   93.127380
17  2016-01-18  17.0  0.591642  Medium  117.855380
18  2016-01-19  18.0  0.801644     Low  127.432539
19  2016-01-20  19.0  0.555657    High   81.427083
# Iterate over the column names
for col in df:
    print (col)
A
x
y
C
D
# Iterate over the rows as arrays of values
for row_data in df.values:
    print(row_data)
[Timestamp('2016-01-01 00:00:00') 0.0 0.44624633059211527 'Low'
90.08281134807373]
[Timestamp('2016-01-02 00:00:00') 1.0 0.8205471491612922 'Medium'
85.20447173046472]
[Timestamp('2016-01-03 00:00:00') 2.0 0.4826193547059088 'Medium'
105.94264179882202]
[Timestamp('2016-01-04 00:00:00') 3.0 0.7184117272123155 'Low'
115.68759211724687]
[Timestamp('2016-01-05 00:00:00') 4.0 0.3310586774340063 'Medium'
102.75282946947281]
[Timestamp('2016-01-06 00:00:00') 5.0 0.5382003539908004 'Low'
98.57610178066079]
[Timestamp('2016-01-07 00:00:00') 6.0 0.5133208064440394 'Medium'
104.5106036836102]
[Timestamp('2016-01-08 00:00:00') 7.0 0.03500435923546885 'Low'
91.54909932326467]
[Timestamp('2016-01-09 00:00:00') 8.0 0.7326388691009159 'Low'
85.73815394265921]
[Timestamp('2016-01-10 00:00:00') 9.0 0.1297542568969341 'Low'
106.54151098431808]
[Timestamp('2016-01-11 00:00:00') 10.0 0.9853111321167449 'Medium'
102.24224501648254]
[Timestamp('2016-01-12 00:00:00') 11.0 0.11682758077167632 'High'
101.98836610106864]
[Timestamp('2016-01-13 00:00:00') 12.0 0.4792303697943716 'Low'
113.95242649298567]
[Timestamp('2016-01-14 00:00:00') 13.0 0.41253852157312554 'Medium'
95.09340708407247]
[Timestamp('2016-01-15 00:00:00') 14.0 0.4695131476336729 'Low'
107.42813873921236]
[Timestamp('2016-01-16 00:00:00') 15.0 0.5524721315517419 'High'
99.0819978391391]
[Timestamp('2016-01-17 00:00:00') 16.0 0.4349111111142554 'Low'
93.12738041265543]
[Timestamp('2016-01-18 00:00:00') 17.0 0.5916418035431593 'Medium'
117.85538021367091]
[Timestamp('2016-01-19 00:00:00') 18.0 0.8016441672459498 'Low'
127.43253872465036]
[Timestamp('2016-01-20 00:00:00') 19.0 0.555657043090727 'High'
81.4270829436166]
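Besides looping over `df` and `df.values`, pandas offers `items()`, `iterrows()`, and `itertuples()` for iteration. The sketch below uses a small made-up frame named `small`; the list names are illustrative.

```python
import pandas as pd

small = pd.DataFrame({'x': [0.0, 1.0], 'C': ['Low', 'High']})

# items(): (column name, column Series) pairs
column_names = [name for name, column in small.items()]

# iterrows(): (index label, row Series) pairs; each row is converted to a Series
row_labels = [label for label, row in small.iterrows()]

# itertuples(): one namedtuple per row, with columns as attributes
pairs = [(t.x, t.C) for t in small.itertuples(index=False)]

print(column_names, row_labels, pairs)
```

For large frames, vectorized operations are usually preferable to any of these loops; when a loop is needed, `itertuples()` is generally faster than `iterrows()`.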
pandas documentation in Chinese