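Python Basics (2): Data Types

Python Basics, Chapter 2

Binary

Character Encoding

Basic Data Types: Numbers

Basic Data Types: Strings

Basic Data Types: Lists

Basic Data Types: Tuples

Mutable and Immutable Data Types, and Hashing

Basic Data Types: Dictionaries

Basic Data Types: Sets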

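Binary

Binary is a number system used in computing. A binary number is written with only two digits, 0 and 1; its base is 2, and its carry rule is "carry one on reaching two". The system was described by the German mathematician and philosopher Leibniz at the start of the 18th century. Today's computer systems are essentially all binary, and integers are mostly stored in two's-complement form. In hardware, a binary digit behaves like a tiny switch: "on" stands for 1 and "off" stands for 0.

Converting between binary and decimal

Counting from the right and starting at 0, the nth binary digit is worth 2 to the power n in decimal.

The bucket-filling method

To convert decimal to binary, first write out the value each position stands for. Then, working from the largest value down, put a 1 in every position whose value still fits into what is left of the decimal number, and a 0 everywhere else.
To convert binary to decimal, use the same table in reverse: add up the place values of every position that holds a 1.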

Decimal  128   64   32   16    8    4    2    1

20         0    0    0    1    0    1    0    0
200        1    1    0    0    1    0    0    0
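
The bucket-filling method can be sketched in a few lines of Python. This is only an illustration: the names `PLACE_VALUES`, `to_binary`, and `to_decimal` are made up for this example, and the sketch assumes an 8-bit value between 0 and 255.

```python
# Place values of an 8-bit number, largest first (the "buckets").
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def to_binary(n):
    """Decimal -> binary string: fill every bucket that still fits."""
    bits = []
    for value in PLACE_VALUES:
        if n >= value:
            bits.append('1')
            n -= value          # what is left to distribute
        else:
            bits.append('0')
    return ''.join(bits)

def to_decimal(bits):
    """Binary string -> decimal: add the place value of every 1 bit."""
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == '1')

print(to_binary(20))           # 00010100
print(to_binary(200))          # 11001000
print(to_decimal('11001000'))  # 200
```

In practice the built-in `format(n, '08b')` and `int(bits, 2)` do the same job.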

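Character Encoding

Converting between decimal and binary only solves the problem of making numbers understandable to a computer. How do we get a computer to understand text?
The answer is an indirect route: since numbers can already be converted to binary, we only need a way to turn text into numbers, and text can then be represented in binary as well.
But how is text turned into numbers? Simply by convention.
We agree on a table that pairs each character with a number. The table acts as a translator: given a number we can look up its character, and vice versa.

ASCII

The ASCII table

ASCII (American Standard Code for Information Interchange) is a character encoding based on the Latin alphabet, used mainly for modern English and other Western European languages. It is the most widely used single-byte encoding and is equivalent to the international standard ISO/IEC 646.

Because computers were invented in the United States, at first only 128 characters (codes 0 to 127) were encoded: upper- and lower-case English letters, digits, and some symbols. This table is called ASCII. For example, the uppercase letter A is encoded as 65 and the lowercase letter z as 122. The further 128 codes (128 to 255) are known as extended ASCII.

So the table pairing letters, symbols, and digits with numbers already exists, and from those decimal numbers we can produce binary strings.
For example: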

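A space corresponds to the number 32, which is 100000 in binary (note that the character 0 and the integer 0 are different things: the character 0 has code 48)
A check mark √ corresponds to the number 251 in one common extended-ASCII table, which is 11111011 in binary

Question: if we want to output two spaces and a check mark, the binary would be 10000010000011111011, but how do we know where one character ends and the next begins? The solution is simple: since there are only 256 characters in total, the longest code is 11111111, eight digits. So every code is written with 8 binary digits, padded on the left with 0s where necessary.

With that rule, the two spaces and the check mark are written 001000000010000011111011, and a reader only has to take 8 digits at a time to recover the binary value of each character.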
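
As a rough sketch of the fixed-width idea, the snippet below writes every character as exactly 8 bits and then reads the stream back 8 bits at a time. The helper names `encode_bits` and `decode_bits` are invented for this example, and it assumes plain ASCII input.

```python
def encode_bits(text):
    """Encode ASCII text as a string of bits, 8 bits per character."""
    return ''.join(format(ord(ch), '08b') for ch in text)

def decode_bits(bits):
    """Read the bit string back 8 bits at a time."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return ''.join(chr(int(chunk, 2)) for chunk in chunks)

bits = encode_bits('Hi ')
print(bits)                      # 010010000110100100100000
print(repr(decode_bits(bits)))   # 'Hi '
```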

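Here, each 0 or 1 occupies one bit, the smallest unit of information in a computer. Every 8 bits make up one byte, the smallest unit of storage.

    bit                 the smallest unit of information in a computer
    8 bits = 1 byte     the smallest unit of storage; 1 byte is abbreviated 1B
    1KB = 1024B
    1MB = 1024KB
    1GB = 1024MB
    1TB = 1024GB
    1PB = 1024TB
    ......

Having covered ASCII, an English-speaking programmer is more or less done. As a Chinese programmer, though, doesn't it feel as if something is still missing?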

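GBK and GB2312

For us, clearly, what matters most is being able to display Chinese, yet the ASCII table does not contain so much as a single radical. We therefore need a table that maps Chinese characters to numbers. As we saw, one byte can represent at most 256 characters, which is obviously not enough for Chinese, so two bytes are used instead, and the encoding must not clash with ASCII. China therefore defined the GB2312 encoding to cover Chinese characters; GBK is a later extension of it.

This raises a new problem: the world has well over a hundred languages, so how should the encoding be handled when a game or film from abroad is brought into China?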

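Unicode

This is why Unicode came about. Unicode brings all languages together in a single character set, which removes the garbled text caused by conflicting national encodings.

The Unicode standard keeps evolving. The most common characters fit in two bytes each; rarer characters need four. Modern operating systems and most programming languages support Unicode directly.
Now let's sort out the difference between ASCII and Unicode:

ASCII uses 1 byte per character, while Unicode, in its simple fixed-width form, usually uses 2 bytes.

The letter A in ASCII is 65 in decimal, 01000001 in binary;

The character 0 in ASCII is 48 in decimal, 00110000 in binary;

The Chinese character "中" is outside the ASCII range; in Unicode it is 20013 in decimal, 01001110 00101101 in binary.

As you might guess, encoding the ASCII letter A in Unicode only requires padding with zeros in front, so the Unicode encoding of A is 00000000 01000001.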
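
These numbers can be checked in Python 3 with the built-ins `ord()` and `chr()`. The snippet is a small illustration; the variable names are arbitrary.

```python
# ord() maps a character to its Unicode code point; chr() is the inverse.
code_points = {ch: ord(ch) for ch in 'A0中'}

for ch, code_point in code_points.items():
    # Show the code point in decimal and as 16 binary digits.
    print(ch, code_point, format(code_point, '016b'))

back_to_char = chr(20013)   # the character with code point 20013
```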

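A new problem now appears. Switching everything to Unicode makes the garbled-text problem go away, but if your text is mostly English, Unicode takes twice as much space as ASCII, which is wasteful for storage and transmission.

UTF-8

So, in the interest of saving space, UTF-8 was introduced as a variable-length encoding of Unicode. UTF-8 encodes each Unicode character as 1 to 4 bytes depending on the size of its number: common English letters take 1 byte, Chinese characters usually take 3 bytes, and only rare characters take 4 bytes. (The original design allowed up to 6 bytes, but the current standard limits UTF-8 to 4.) If the text you are transmitting contains a lot of English, UTF-8 saves space: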

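    Character   ASCII       Unicode              UTF-8
    A           01000001    00000000 01000001    01000001
    中          n/a         01001110 00101101    11100100 10111000 10101101

The table shows an additional benefit of UTF-8: ASCII can be regarded as a subset of UTF-8, so a large body of legacy software that only supports ASCII keeps working on UTF-8 data.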
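
The UTF-8 column can be reproduced with `str.encode()`. This is a minimal check in Python 3; the variable names are illustrative only.

```python
for ch in ['A', '中']:
    encoded = ch.encode('utf-8')                       # str -> bytes
    as_bits = ' '.join(format(byte, '08b') for byte in encoded)
    print(ch, len(encoded), as_bits)
# A 1 01000001
# 中 3 11100100 10111000 10101101

# Pure ASCII text yields exactly the same bytes under ASCII and UTF-8.
ascii_compatible = 'hello'.encode('ascii') == 'hello'.encode('utf-8')
```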

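With the relationship between ASCII, Unicode, and UTF-8 clear, we can summarize how character encoding typically works in a computer system:
In memory, text is handled as Unicode; when it needs to be saved to disk or transmitted, it is converted to UTF-8.

When you edit a file in a text editor, the UTF-8 bytes read from the file are converted to Unicode characters in memory, and when you save, the Unicode text is converted back to UTF-8 and written to the file.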
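
In Python 3 this workflow corresponds to the `str` type (Unicode text in memory) and the `bytes` type (encoded data on disk or on the wire). A small round trip, with made-up variable names:

```python
text = 'Python 中文'               # str: Unicode text in memory

data = text.encode('utf-8')       # bytes: what is written to disk or sent
restored = data.decode('utf-8')   # bytes -> str when reading it back

print(len(text), len(data))       # 9 13 (each Chinese character takes 3 bytes)
```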

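Windows (Simplified Chinese editions) defaults to GBK

macOS and Linux default to UTF-8

Python 2 defaults to ASCII for source files

Python 3 defaults to UTF-8 for source files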

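Basic Data Types: Numbers

Booleans

The bool type has only two values: True and False

bool is grouped with the numbers because we conventionally use 1 for True and 0 for False; in Python, bool is in fact a subclass of int, so True == 1 and False == 0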
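
A short sketch of what this numeric behaviour looks like in practice (variable names are arbitrary):

```python
is_int_subclass = issubclass(bool, int)   # bool derives from int
total = True + True                       # booleans take part in arithmetic

# Summing comparison results counts how many of them are True.
count_positive = sum(x > 0 for x in [3, -1, 5])

print(is_int_subclass, total, count_positive)   # True 2 2
```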

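Integers

Integers in Python belong to the int type. They are written in decimal by default, but binary, octal, and hexadecimal notation are also supported.

Base conversion

Although computers only understand binary, Python displays numbers in decimal by default to match our habits. It also provides built-in functions for conversion. For example, bin() converts a decimal integer to binary; the result is a string prefixed with 0b to mark it as a binary number.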

>>> num = 132
>>> bin(num)
'0b10000100'

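Since decimal can be converted to binary, the same principle works for other bases. Python provides oct() and hex() for converting to octal and hexadecimal respectively. Octal results are prefixed with 0o and hexadecimal results with 0x.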

>>> num = 129
>>> oct(num)
'0o201'
>>> hex(num)
'0x81'
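
Going the other way, from a string in some base back to an integer, can be done with `int(text, base)`. The following is a small sketch using the same numbers as above:

```python
values = [
    int('10000100', 2),     # binary digits
    int('0b10000100', 2),   # the 0b prefix is accepted when the base is 2
    int('201', 8),          # octal
    int('0x81', 16),        # hexadecimal
]
print(values)               # [132, 132, 129, 129]

# Integer literals can be written directly in these bases as well.
literals = (0b10000100, 0o201, 0x81)

# format() gives the digits without the prefix, optionally zero-padded.
padded = format(132, '08b')
```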

Modulo (remainder)

>>> 16%5
1

divmod

>>> divmod(16,3)
(5, 1)          # 5 is the quotient, 1 is the remainder
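
divmod(a, b) returns the same pair as (a // b, a % b). One detail worth knowing is that Python floors the quotient, which matters for negative numbers. A brief sketch:

```python
quotient, remainder = divmod(16, 3)
same_as_operators = (quotient, remainder) == (16 // 3, 16 % 3)

# Floor division rounds toward negative infinity, so the remainder
# takes the sign of the divisor.
negative_case = divmod(-16, 3)

print(quotient, remainder, negative_case)   # 5 1 (-6, 2)
```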

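Floating-point numbers

A floating-point number is a digital representation of a number from a particular subset of the rationals, used in computers to approximate an arbitrary real number. Specifically, the value is obtained by multiplying an integer or fixed-point number (the significand, or mantissa) by an integer power of some base (usually 2 in computers), much like scientific notation in base 10.

Python's float corresponds to the decimals of mathematics, but stored as a binary approximation with finite precision

In arithmetic, combining an integer with a float gives a float

Why floating-point numbers

Floats are decimals. They are called floating-point because, written in scientific notation,
the position of the decimal point can vary. For example,
1.23x10^9 and 12.3x10^8 are equal.
Floats can be written in ordinary notation, such as 1.23, 3.14, -9.01, and so on. For very large or very small floats, scientific notation is more practical, with e standing in for the 10:
1.23x10^9 is 1.23e9, or 12.3e8; 0.000012 can be written 1.2e-5, and so on.
Integers and floats are stored differently inside the computer: integer arithmetic is always exact, whereas float arithmetic may carry rounding errors.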
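
The notation described above can be tried directly. A minimal sketch, with arbitrary variable names:

```python
a = 1.23e9          # 1.23 x 10^9
b = 12.3e8          # 12.3 x 10^8, the same value with the point moved
same_value = (a == b)

small = 1.2e-5      # 0.000012
mixed = 3 + 0.5     # int combined with float gives a float

print(same_value, small == 0.000012, type(mixed).__name__)   # True True float
```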

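Why decimals are imprecise

A Python float carries about 15 to 17 significant decimal digits, and digits beyond that range cannot be trusted.

First, this problem is not specific to Python; other languages have it too

Second, the imprecision arises because converting a decimal fraction to binary can produce an infinitely repeating expansion, which has to be cut off and rounded.

For example, the fractional part 0.2 of 11.2 becomes the infinitely repeating 0.00110011001100110011... in binary

Single precision stores 23 bits of this mantissa (the preceding 9 bits hold the sign and the exponent); Python's float is double precision, with 52 mantissa bits and 12 bits for sign and exponent. 0.6 likewise repeats forever.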
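
The classic demonstration of this rounding is 0.1 + 0.2. The sketch below also shows one common way to compare floats, `math.isclose()`, which is a suggestion of this example rather than something the text above prescribes.

```python
import math

total = 0.1 + 0.2

print(total == 0.3)               # False: both sides carry rounding error
print(math.isclose(total, 0.3))   # True: equal within a small tolerance

# Asking for more digits than a float holds reveals the stored approximation.
print(format(0.2, '.20f'))
```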

This raises a question: what do we do when a calculation needs more precision (more than 16 digits)?

# This calls for getcontext() and Decimal() from the decimal module
>>> a = 3.141592653513651054608317828332
>>> a
3.141592653513651
>>> from decimal import *
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
>>> getcontext().prec = 50      # set the precision to 50 significant digits
>>> a = Decimal(1)/Decimal(3)   # Decimal arithmetic honours the precision; a long float literal is already rounded before Decimal sees it
>>> a
Decimal('0.33333333333333333333333333333333333333333333333333')
>>> a = 3.141592653513651054608317828332
>>> a
3.141592653513651
>>> Decimal(a)
Decimal('3.141592653513650912344701282563619315624237060546875')
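
To keep every digit of a long decimal, one option is to build the Decimal from a string rather than from a float. The sketch below uses `localcontext()` so the precision change stays local; that is a choice made for this example, not part of the session above.

```python
from decimal import Decimal, localcontext

digits = '3.141592653513651054608317828332'

exact = Decimal(digits)               # from text: every digit is kept
from_float = Decimal(float(digits))   # from a float: already rounded to a double

with localcontext() as ctx:
    ctx.prec = 50                     # 50 significant digits inside this block
    third = Decimal(1) / Decimal(3)

print(exact)
print(from_float)
print(third)
```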

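Complex numbers

A complex number (complex) consists of a real part and an imaginary part.

To understand complex numbers we first need imaginary numbers: a number whose square is negative is called an imaginary number.

A complex number is a number that can be written as a+bi, where a and b are real numbers and i is the imaginary unit (the square root of -1). In a+bi, a is called the real part and b the imaginary part.

When the imaginary part is zero, the complex number is a real number; when the imaginary part is non-zero, it is called an imaginary number.

Note: Python writes the imaginary unit as j rather than i, and the letter may be upper or lower case.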
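
A brief illustration of complex numbers in Python (variable names are arbitrary):

```python
z = 3 + 4j

parts = (z.real, z.imag)      # real and imaginary parts, both floats
magnitude = abs(z)            # distance from the origin
same = (3 + 4J == z)          # upper-case J is accepted too
square = 1j * 1j              # the imaginary unit squared

print(parts, magnitude, same, square)   # (3.0, 4.0) 5.0 True (-1+0j)
```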

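Basic Data Types: Strings

Defining and creating strings

A string is an ordered sequence of characters used to store and represent text. Whatever appears between '' or "" or """ """ is a string

Creating one: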

s = 'Hello XiaoYafei！Hello Python!'

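String properties and common operations

Properties

1. The characters are defined in left-to-right order and are accessed by index starting from 0; a string is ordered

    str      =           hello
    index                01234

Addendum:
1. Neither single nor double quotes cancel the special meaning of escape sequences. To have every character inside the quotes taken literally, put r in front of the opening quote, for example:

name = r'pytho\tn'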
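
The effect of the r prefix is easiest to see by comparing lengths. A small sketch, assuming the example is meant to contain the escape sequence \t:

```python
plain = 'pytho\tn'    # \t is interpreted as one tab character
raw = r'pytho\tn'     # backslash and t stay as two ordinary characters

print(len(plain), len(raw))   # 7 8
print(raw)                    # pytho\tn
```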

In an IDE such as PyCharm, holding Ctrl and left-clicking on str shows its source stub:

class str(object):
    """
    str(object='') -> str
    str(bytes_or_buffer[, encoding[, errors]]) -> str
    
    Create a new string object from the given object. If encoding or
    errors is specified, then the object must expose a data buffer
    that will be decoded using the given encoding and error handler.
    Otherwise, returns the result of object.__str__() (if defined)
    or repr(object).
    encoding defaults to sys.getdefaultencoding().
    errors defaults to 'strict'.
    """
    def capitalize(self): # real signature unknown; restored from __doc__
        """
        S.capitalize() -> str
        
        Return a capitalized version of S, i.e. make the first character
        have upper case and the rest lower case.
        """
        return ""

    def casefold(self): # real signature unknown; restored from __doc__
        """
        S.casefold() -> str
        
        Return a version of S suitable for caseless comparisons.
        """
        return ""

    def center(self, width, fillchar=None): # real signature unknown; restored from __doc__
        """
        S.center(width[, fillchar]) -> str
        
        Return S centered in a string of length width. Padding is
        done using the specified fill character (default is a space)
        """
        return ""

    def count(self, sub, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.count(sub[, start[, end]]) -> int
        
        Return the number of non-overlapping occurrences of substring sub in
        string S[start:end].  Optional arguments start and end are
        interpreted as in slice notation.
        """
        return 0

    def encode(self, encoding='utf-8', errors='strict'): # real signature unknown; restored from __doc__
        """
        S.encode(encoding='utf-8', errors='strict') -> bytes
        
        Encode S using the codec registered for encoding. Default encoding
        is 'utf-8'. errors may be given to set a different error
        handling scheme. Default is 'strict' meaning that encoding errors raise
        a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
        'xmlcharrefreplace' as well as any other name registered with
        codecs.register_error that can handle UnicodeEncodeErrors.
        """
        return b""

    def endswith(self, suffix, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.endswith(suffix[, start[, end]]) -> bool
        
        Return True if S ends with the specified suffix, False otherwise.
        With optional start, test S beginning at that position.
        With optional end, stop comparing S at that position.
        suffix can also be a tuple of strings to try.
        """
        return False

    def expandtabs(self, tabsize=8): # real signature unknown; restored from __doc__
        """
        S.expandtabs(tabsize=8) -> str
        
        Return a copy of S where all tab characters are expanded using spaces.
        If tabsize is not given, a tab size of 8 characters is assumed.
        """
        return ""

    def find(self, sub, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.find(sub[, start[, end]]) -> int
        
        Return the lowest index in S where substring sub is found,
        such that sub is contained within S[start:end].  Optional
        arguments start and end are interpreted as in slice notation.
        
        Return -1 on failure.
        """
        return 0

    def format(self, *args, **kwargs): # known special case of str.format
        """
        S.format(*args, **kwargs) -> str
        
        Return a formatted version of S, using substitutions from args and kwargs.
        The substitutions are identified by braces ('{' and '}').
        """
        pass

    def format_map(self, mapping): # real signature unknown; restored from __doc__
        """
        S.format_map(mapping) -> str
        
        Return a formatted version of S, using substitutions from mapping.
        The substitutions are identified by braces ('{' and '}').
        """
        return ""

    def index(self, sub, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.index(sub[, start[, end]]) -> int
        
        Return the lowest index in S where substring sub is found, 
        such that sub is contained within S[start:end].  Optional
        arguments start and end are interpreted as in slice notation.
        
        Raises ValueError when the substring is not found.
        """
        return 0

    def isalnum(self): # real signature unknown; restored from __doc__
        """
        S.isalnum() -> bool
        
        Return True if all characters in S are alphanumeric
        and there is at least one character in S, False otherwise.
        """
        return False

    def isalpha(self): # real signature unknown; restored from __doc__
        """
        S.isalpha() -> bool
        
        Return True if all characters in S are alphabetic
        and there is at least one character in S, False otherwise.
        """
        return False

    def isdecimal(self): # real signature unknown; restored from __doc__
        """
        S.isdecimal() -> bool
        
        Return True if there are only decimal characters in S,
        False otherwise.
        """
        return False

    def isdigit(self): # real signature unknown; restored from __doc__
        """
        S.isdigit() -> bool
        
        Return True if all characters in S are digits
        and there is at least one character in S, False otherwise.
        """
        return False

    def isidentifier(self): # real signature unknown; restored from __doc__
        """
        S.isidentifier() -> bool
        
        Return True if S is a valid identifier according
        to the language definition.
        
        Use keyword.iskeyword() to test for reserved identifiers
        such as "def" and "class".
        """
        return False

    def islower(self): # real signature unknown; restored from __doc__
        """
        S.islower() -> bool
        
        Return True if all cased characters in S are lowercase and there is
        at least one cased character in S, False otherwise.
        """
        return False

    def isnumeric(self): # real signature unknown; restored from __doc__
        """
        S.isnumeric() -> bool
        
        Return True if there are only numeric characters in S,
        False otherwise.
        """
        return False

    def isprintable(self): # real signature unknown; restored from __doc__
        """
        S.isprintable() -> bool
        
        Return True if all characters in S are considered
        printable in repr() or S is empty, False otherwise.
        """
        return False

    def isspace(self): # real signature unknown; restored from __doc__
        """
        S.isspace() -> bool
        
        Return True if all characters in S are whitespace
        and there is at least one character in S, False otherwise.
        """
        return False

    def istitle(self): # real signature unknown; restored from __doc__
        """
        S.istitle() -> bool
        
        Return True if S is a titlecased string and there is at least one
        character in S, i.e. upper- and titlecase characters may only
        follow uncased characters and lowercase characters only cased ones.
        Return False otherwise.
        """
        return False

    def isupper(self): # real signature unknown; restored from __doc__
        """
        S.isupper() -> bool
        
        Return True if all cased characters in S are uppercase and there is
        at least one cased character in S, False otherwise.
        """
        return False

    def join(self, iterable): # real signature unknown; restored from __doc__
        """
        S.join(iterable) -> str
        
        Return a string which is the concatenation of the strings in the
        iterable.  The separator between elements is S.
        """
        return ""

    def ljust(self, width, fillchar=None): # real signature unknown; restored from __doc__
        """
        S.ljust(width[, fillchar]) -> str
        
        Return S left-justified in a Unicode string of length width. Padding is
        done using the specified fill character (default is a space).
        """
        return ""

    def lower(self): # real signature unknown; restored from __doc__
        """
        S.lower() -> str
        
        Return a copy of the string S converted to lowercase.
        """
        return ""

    def lstrip(self, chars=None): # real signature unknown; restored from __doc__
        """
        S.lstrip([chars]) -> str
        
        Return a copy of the string S with leading whitespace removed.
        If chars is given and not None, remove characters in chars instead.
        """
        return ""

    def maketrans(self, *args, **kwargs): # real signature unknown
        """
        Return a translation table usable for str.translate().
        
        If there is only one argument, it must be a dictionary mapping Unicode
        ordinals (integers) or characters to Unicode ordinals, strings or None.
        Character keys will be then converted to ordinals.
        If there are two arguments, they must be strings of equal length, and
        in the resulting dictionary, each character in x will be mapped to the
        character at the same position in y. If there is a third argument, it
        must be a string, whose characters will be mapped to None in the result.
        """
        pass

    def partition(self, sep): # real signature unknown; restored from __doc__
        """
        S.partition(sep) -> (head, sep, tail)
        
        Search for the separator sep in S, and return the part before it,
        the separator itself, and the part after it.  If the separator is not
        found, return S and two empty strings.
        """
        pass

    def replace(self, old, new, count=None): # real signature unknown; restored from __doc__
        """
        S.replace(old, new[, count]) -> str
        
        Return a copy of S with all occurrences of substring
        old replaced by new.  If the optional argument count is
        given, only the first count occurrences are replaced.
        """
        return ""

    def rfind(self, sub, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.rfind(sub[, start[, end]]) -> int
        
        Return the highest index in S where substring sub is found,
        such that sub is contained within S[start:end].  Optional
        arguments start and end are interpreted as in slice notation.
        
        Return -1 on failure.
        """
        return 0

    def rindex(self, sub, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.rindex(sub[, start[, end]]) -> int
        
        Return the highest index in S where substring sub is found,
        such that sub is contained within S[start:end].  Optional
        arguments start and end are interpreted as in slice notation.
        
        Raises ValueError when the substring is not found.
        """
        return 0

    def rjust(self, width, fillchar=None): # real signature unknown; restored from __doc__
        """
        S.rjust(width[, fillchar]) -> str
        
        Return S right-justified in a string of length width. Padding is
        done using the specified fill character (default is a space).
        """
        return ""

    def rpartition(self, sep): # real signature unknown; restored from __doc__
        """
        S.rpartition(sep) -> (head, sep, tail)
        
        Search for the separator sep in S, starting at the end of S, and return
        the part before it, the separator itself, and the part after it.  If the
        separator is not found, return two empty strings and S.
        """
        pass

    def rsplit(self, sep=None, maxsplit=-1): # real signature unknown; restored from __doc__
        """
        S.rsplit(sep=None, maxsplit=-1) -> list of strings
        
        Return a list of the words in S, using sep as the
        delimiter string, starting at the end of the string and
        working to the front.  If maxsplit is given, at most maxsplit
        splits are done. If sep is not specified, any whitespace string
        is a separator.
        """
        return []

    def rstrip(self, chars=None): # real signature unknown; restored from __doc__
        """
        S.rstrip([chars]) -> str
        
        Return a copy of the string S with trailing whitespace removed.
        If chars is given and not None, remove characters in chars instead.
        """
        return ""

    def split(self, sep=None, maxsplit=-1): # real signature unknown; restored from __doc__
        """
        S.split(sep=None, maxsplit=-1) -> list of strings
        
        Return a list of the words in S, using sep as the
        delimiter string.  If maxsplit is given, at most maxsplit
        splits are done. If sep is not specified or is None, any
        whitespace string is a separator and empty strings are
        removed from the result.
        """
        return []

    def splitlines(self, keepends=None): # real signature unknown; restored from __doc__
        """
        S.splitlines([keepends]) -> list of strings
        
        Return a list of the lines in S, breaking at line boundaries.
        Line breaks are not included in the resulting list unless keepends
        is given and true.
        """
        return []

    def startswith(self, prefix, start=None, end=None): # real signature unknown; restored from __doc__
        """
        S.startswith(prefix[, start[, end]]) -> bool
        
        Return True if S starts with the specified prefix, False otherwise.
        With optional start, test S beginning at that position.
        With optional end, stop comparing S at that position.
        prefix can also be a tuple of strings to try.
        """
        return False

    def strip(self, chars=None): # real signature unknown; restored from __doc__
        """
        S.strip([chars]) -> str
        
        Return a copy of the string S with leading and trailing
        whitespace removed.
        If chars is given and not None, remove characters in chars instead.
        """
        return ""

    def swapcase(self): # real signature unknown; restored from __doc__
        """
        S.swapcase() -> str
        
        Return a copy of S with uppercase characters converted to lowercase
        and vice versa.
        """
        return ""

    def title(self): # real signature unknown; restored from __doc__
        """
        S.title() -> str
        
        Return a titlecased version of S, i.e. words start with title case
        characters, all remaining cased characters have lower case.
        """
        return ""

    def translate(self, table): # real signature unknown; restored from __doc__
        """
        S.translate(table) -> str
        
        Return a copy of the string S in which each character has been mapped
        through the given translation table. The table must implement
        lookup/indexing via __getitem__, for instance a dictionary or list,
        mapping Unicode ordinals to Unicode ordinals, strings, or None. If
        this operation raises LookupError, the character is left untouched.
        Characters mapped to None are deleted.
        """
        return ""

    def upper(self): # real signature unknown; restored from __doc__
        """
        S.upper() -> str
        
        Return a copy of S converted to uppercase.
        """
        return ""

    def zfill(self, width): # real signature unknown; restored from __doc__
        """
        S.zfill(width) -> str
        
        Pad a numeric string S with zeros on the left, to fill a field
        of the specified width. The string S is never truncated.
        """
        return ""

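Common operations

Are strings immutable?

>>> a = 'xiaoyafei'     # assignment: the name a now refers to an object in memory
>>> id(a)               # the identity (in CPython, the address) of that object
139742341438256     
>>> a = 'xiaobaba'         # "modify" a
>>> id(a)                   # look at the identity again
139742341436848

Why do we say strings are immutable? The two ids clearly differ: the "modification" did not change the original string but pointed a at a different object, and the value xiaoyafei, no longer referenced, becomes eligible for garbage collection. A string itself can never be changed in place.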
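
A more direct way to see immutability is to try changing a character in place. A short sketch:

```python
a = 'xiaoyafei'

try:
    a[0] = 'X'                 # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

b = a.upper()                  # methods return a new string instead

print(error)
print(a, b)                    # xiaoyafei XIAOYAFEI
```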

swapcase() converts lowercase to uppercase and uppercase to lowercase

>>> str = 'Hello World'
>>> str.swapcase()
'hELLO wORLD'

capitalize() uppercases the first character and lowercases the rest

>>> str = 'helLo WORLD'
>>> str.capitalize()
'Hello world'

casefold() converts everything to lowercase, more aggressively than lower(), for caseless comparison

>>> str = 'HELLO WORLD!'
>>> str.casefold()
'hello world!'
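
For plain English text casefold() and lower() give the same result. The difference shows up with characters such as the German ß, as this small sketch illustrates:

```python
word = 'Straße'

lowered = word.lower()        # ß is already lowercase, so it stays
folded = word.casefold()      # ß is folded to ss

matches = folded == 'STRASSE'.casefold()

print(lowered, folded, matches)   # straße strasse True
```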

center() centers the string and pads both sides with a fill character

>>> str = 'hello world'
>>> str.center(50,'*')
'*******************hello world********************'

count() counts how many times a substring occurs

>>> str = 'abcdfdawadsqasacsasasaa'
>>> str.count('a')
9
# a start and end position can also be given
>>> str.count('a',0,10)
3

endswith() tests whether the string ends with the given suffix; returns True or False

>>> str = 'hello world!python'
>>> str.endswith('python')
True
>>> str.endswith('hello')
False

expandtabs() expands tab characters into spaces

>>> a = 'a\tb'
>>> a
'a\tb'
>>> print(a)
a       b
>>> a.expandtabs(20)
'a                   b'

find() searches for a substring; returns its index, or -1 if not found

>>> name = 'xiaoyafei'
>>> name.find('o')
3
>>> name.find('g')
-1
# the search range can also be limited
>>> name.find('a',4,9)
5

format() formats a string

>>> info = 'my name is {0},i am {1} years old.'
>>> info.format('xiaoyafei',22)
'my name is xiaoyafei,i am 22 years old.'
# named fields can be used as well
>>> info = 'my name is {name},my age is {age}.'
>>> info.format(name = 'xiaoyafei',age=22)
'my name is xiaoyafei,my age is 22.'

index() returns the index of a substring; raises an error if not found

>>> s = 'hello tairan and welcome!'
>>> s.index('t')
6
>>> s.index('g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
# a start (and optionally an end) position can also be given
>>> s
'hello tairan and welcome!'
>>> s.index('a', 8)
10

isalnum() tests whether every character is alphanumeric (a letter or a digit)

>>> a = '123abc'
>>> a.isalnum()
True
>>> b = '123!abc'
>>> b.isalnum()
False

isalpha() tests whether the string contains only letters (no digits)

>>> a = 'abc123'
>>> a.isalpha()
False
>>> b = 'abcdefg'
>>> b.isalpha()
True

isdecimal() tests whether every character is a decimal digit

>>> a = '111'
>>> a.isdecimal()
True
# a second way
>>> a.isdigit()
True

isidentifier() tests whether the string is a valid identifier (usable as a variable name)

>>> a = '111'
>>> a.isidentifier()
False           # a name cannot start with a digit
>>> b = 'Menu'
>>> b.isidentifier()
True

islower() tests whether all cased characters are lowercase

>>> str = 'HelloWorld'
>>> str.islower()
False
>>> str2 = 'helloworld'
>>> str2.islower()
True

isnumeric() tests whether the string contains only numeric characters

>>> a = '1233'
>>> a.isnumeric()
True
>>> b = 'abc123'
>>> b.isnumeric()
False
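
isdecimal(), isdigit(), and isnumeric() agree on ordinary digits but differ on other Unicode characters. The sketch below compares them on a superscript two and a one-half sign; the choice of sample characters is this example's own.

```python
samples = ['123', '²', '½', 'abc']

# For each sample: (isdecimal, isdigit, isnumeric)
results = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}

for sample, flags in results.items():
    print(repr(sample), flags)
# '123' (True, True, True)
# '²' (False, True, True)
# '½' (False, False, True)
# 'abc' (False, False, False)
```

Of the three, isdecimal() is the strictest and isnumeric() the most permissive.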

isspace() tests whether the string contains only whitespace

>>> a = ' '
>>> a.isspace()
True

title() returns a title-cased string, with the first letter of every word in uppercase

>>> str = 'hello world python tairan '
>>> str.title()
'Hello World Python Tairan '

upper() converts the whole string to uppercase

>>> namew = 'xiaoyafei'
>>> namew.upper()
'XIAOYAFEI'

join() joins the strings in a list into one string, with the given string as separator between elements

>>> names = ['lipeng','likun','lipeng']
>>> str = '---'
>>> str.join(names)
'lipeng---likun---lipeng'
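
join() is the counterpart of split(), and it only accepts strings, so other values have to be converted first. A small sketch with illustrative names:

```python
names = ['lipeng', 'likun', 'lipeng']

joined = '---'.join(names)
parts = joined.split('---')              # split() undoes the join

numbers = [1, 2, 3]
csv_line = ','.join(str(n) for n in numbers)   # convert non-strings first

print(joined)      # lipeng---likun---lipeng
print(csv_line)    # 1,2,3
```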

ljust() left-justifies the string, padding on the right with the given character up to the total width

>>> name = 'xiaoyafei'
>>> name.ljust(50,'_')
'xiaoyafei_________________________________________'

lower() converts the whole string to lowercase

>>> name = 'XIAOYAFEI'
>>> name.lower()
'xiaoyafei'

strip() removes whitespace (spaces and newlines included) from both ends

>>> str = '\n hello world              '
>>> str.strip()
'hello world'
# lstrip() strips only the left side
# rstrip() strips only the right side

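maketrans() builds a translation table (a simple substitution cipher)

>>> str_in = '1234567890'       # the original characters
>>> str_out = '!@#$%^&*()'        # the characters they are replaced with
>>> table = str.maketrans(str_in,str_out)       # map each character of str_in to the one at the same position in str_out
>>> s = '572428582'         # the text to encode
>>> s.translate(table)      # apply the table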
'%&@$@*%*@'
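
Swapping the two arguments of maketrans() gives the reverse table, so the encoded text can be decoded again. A minimal sketch; `encode_table` and `decode_table` are names chosen for this example.

```python
str_in = '1234567890'
str_out = '!@#$%^&*()'

encode_table = str.maketrans(str_in, str_out)
decode_table = str.maketrans(str_out, str_in)   # the mapping reversed

secret = '572428582'.translate(encode_table)
restored = secret.translate(decode_table)

print(secret)     # %&@$@*%*@
print(restored)   # 572428582
```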

partition() splits the string into three parts at the first occurrence of the separator

>>> str
'hello world'
>>> str.partition('o')
('hell', 'o', ' world')

replace() replaces a substring; by default every occurrence is replaced

>>> s = 'hello world,hello tairan'
>>> s.replace('llo','LLO')
'heLLO world,heLLO tairan'
# replace only the first occurrence
>>> s
'hello world,hello tairan'
>>> s.replace('llo','LLO',1)
'heLLO world,hello tairan'

rsplit() splits the string on the given separator, working from the right

>>> s
'hello world,hello tairan'
>>> s.rsplit('o')
['hell', ' w', 'rld,hell', ' tairan']

splitlines() splits on line boundaries and returns a list

>>> str = 'a\nb\nc\nd\n'
>>> str.splitlines()
['a', 'b', 'c', 'd']

startswith() tests whether the string starts with the given prefix

>>> str
'hello,xiaoyafei'
>>> str.startswith('hel')
True

zfill() pads the string on the left with zeros up to the given width (the original string stays on the right)

>>> str = 'hello tairan'
>>> str.zfill(50)
'00000000000000000000000000000000000000hello tairan'

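Basic Data Types: Lists

Defining and creating lists

Definition: items inside [ ] are separated by commas and stored by index; they may be of any data type, and each position holds one element

Creating a list

list_test = ['张三','李四','xiaoyafei',22]
or
list_test = list(['zhangsan','lisi','wanger'])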

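Characteristics of lists

Properties:

Can hold multiple values

Elements are defined in left-to-right order and accessed by index starting from 0; a list is ordered

    list    =       ['张三','李四','校长']
    index               0      1      2

The value at a given index can be changed; a list is mutable

Common list operations

The examples below use several different lists, so read carefully! They also name a variable list, which shadows the built-in list type; that is harmless in a throwaway session but best avoided in real code.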

append() adds one element to the end of the list

>>> list = ['xiaoyafei']
>>> list.append('lipeng')
>>> list
['xiaoyafei', 'lipeng']

Slicing

# slice from index 2 up to index 4; the start is included, the end (5) is not
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[2:5]
['zhangsan', 'lisi', 'wanger']

# slice from index 2 to the end; with the end omitted, the last element is included
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[2:]
['zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']

# slice from the first element to the last
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[:]
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']

# slice the last two elements; the last element is included
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[-2:]
['zhaoliu', 'likun']

# step: take every second element, i.e. the even indices 0, 2, 4, ...
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[::2]
['xiaoyafei', 'zhangsan', 'wanger', 'zhaoliu']
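
The general slice form is [start:stop:step]. As a short sketch of the step part, using a variable named names rather than list:

```python
names = ['xiaoyafei', 'lipeng', 'zhangsan', 'lisi',
         'wanger', 'mazi', 'zhaoliu', 'likun']

even = names[::2]             # indices 0, 2, 4, 6
odd = names[1::2]             # indices 1, 3, 5, 7
backwards = names[::-1]       # a negative step walks from the end

print(even)
print(odd)
print(backwards[0])           # likun
```

Slicing always returns a new list and leaves names unchanged.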

count() counts how many times a value occurs

>>> list1
[2, 2, 3, 4, 1, 2]
>>> list1.count(2)
3
>>> list2 = ['lipeng','likun','lipeng','weiwenwu','lihang','lisiru','lipeng']
>>> list2.count('lipeng')
3

index() returns the index of the first matching element

>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list.       # press Tab to list the available attributes
list.__add__(           list.__dir__(           list.__getattribute__(  list.__imul__(          list.__lt__(            list.__reduce_ex__(     list.__setitem__(       list.clear(             list.insert(            
list.__class__(         list.__doc__            list.__getitem__(       list.__init__(          list.__mul__(           list.__repr__(          list.__sizeof__(        list.copy(              list.pop(               
list.__contains__(      list.__eq__(            list.__gt__(            list.__iter__(          list.__ne__(            list.__reversed__(      list.__str__(           list.count(             list.remove(            
list.__delattr__(       list.__format__(        list.__hash__           list.__le__(            list.__new__(           list.__rmul__(          list.__subclasshook__(  list.extend(            list.reverse(           
list.__delitem__(       list.__ge__(            list.__iadd__(          list.__len__(           list.__reduce__(        list.__setattr__(       list.append(            list.index(             list.sort(              
>>> list.index('lipeng')
1

Getting the value at a given index

>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[2]
'zhangsan'

insert() inserts an element

>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
# insert an element before the element at index 3
>>> list.insert(3,'xixihaha')
>>> list
['xiaoyafei', 'lipeng', 'zhangsan', 'xixihaha', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']

Modifying a list

>>> list2
['lipeng', 'likun', 'lipeng', 'weiwenwu', 'lihang', 'lisiru', 'lipeng']
>>> list2[2] = 'xixihaha'
>>> list2
['lipeng', 'likun', 'xixihaha', 'weiwenwu', 'lihang', 'lisiru', 'lipeng']

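# To modify several consecutive elements at once, assign to a slice. The right-hand side is iterated, so a string is split into characters, and the list grows or shrinks to fit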
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi', 'wanger', 'mazi', 'zhaoliu', 'likun']
>>> list[5:] = 'xixihaha'
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi', 'x', 'i', 'x', 'i', 'h', 'a', 'h', 'a']
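
To replace a slice with whole strings instead of single characters, the right-hand side can be wrapped in a list. A small sketch with made-up data:

```python
names = ['a', 'b', 'c', 'd']
names[2:] = ['xixihaha']      # the tail is replaced by one element

letters = ['a', 'b', 'c']
letters[1:] = 'xy'            # a bare string is iterated character by character

print(names)      # ['a', 'b', 'xixihaha']
print(letters)    # ['a', 'x', 'y']
```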

pop() removes the last element of the list

>>> list2
['lipeng', 'likun', 'xixihaha', 'weiwenwu', 'lihang', 'lisiru']
>>> list2.pop()
'lisiru'        # pop() returns the removed element, which the interpreter echoes
>>> list2
['lipeng', 'likun', 'xixihaha', 'weiwenwu', 'lihang']

remove() removes the first element equal to the given value

>>> list2
['lipeng', 'likun', 'xixihaha', 'weiwenwu', 'lihang']
>>> list2.remove('likun')       # remove() returns None, so nothing is echoed
>>> list2
['lipeng', 'xixihaha', 'weiwenwu', 'lihang']

# del can be used as well
>>> list2
['lipeng', 'xixihaha', 'weiwenwu', 'lihang']
>>> del list2[1]
>>> list2
['lipeng', 'weiwenwu', 'lihang']

# deleting several elements at once
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi', 'x', 'i', 'x', 'i', 'h', 'a', 'h', 'a']
>>> del list[5:]
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi']

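    The difference between pop(), remove(), and del:
        pop() removes by index (the last element by default) and returns the removed element, which the interpreter echoes; remove() removes the first element equal to a given value and returns None; del is a statement that removes an index or slice and returns nothing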
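
The three ways of deleting can be compared side by side. A sketch using a variable named names:

```python
names = ['lipeng', 'likun', 'xixihaha', 'weiwenwu', 'lihang']

last = names.pop()                  # by position (default: last); returns it
first = names.pop(0)                # an explicit index works too
result = names.remove('xixihaha')   # by value; returns None
del names[0]                        # by index or slice; a statement, no value

print(last, first, result)          # lihang lipeng None
print(names)                        # ['weiwenwu']
```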

Looping over a list

>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi']
>>> for i in list:
...   print(i)
... 
xiaoyafei
lipeng
xixihaha
xixihaha
lisi

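    The difference between for and while:
        while can loop forever, for example while True, whereas a for loop over a list is bounded by the number of elements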

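sort() sorts in place; strings are ordered by character code (ASCII order for these letters)

# duplicate elements are kept: sorting does not remove them
>>> list3 = ['a','b','c','d','e','A','D','c','f','g']
>>> list3.sort()        # sort() returns None, so nothing is echoed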
>>> list3
['A', 'D', 'a', 'b', 'c', 'c', 'd', 'e', 'f', 'g']
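
Because uppercase letters have smaller codes than lowercase ones, 'D' sorts before 'a'. The sketch below shows two common variations: a case-insensitive sort with a key function, and removing duplicates with set() before sorting. Both are additions for illustration.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'A', 'D', 'c', 'f', 'g']

returned = letters.sort()                       # sorts in place, returns None
ignoring_case = sorted(letters, key=str.lower)  # new list, compared in lowercase
unique_sorted = sorted(set(letters))            # set() drops the duplicate 'c'

print(letters)
print(ignoring_case)
print(unique_sorted)
```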

reverse() reverses the whole list in place

>>> list3
['A', 'D', 'a', 'b', 'c', 'c', 'd', 'e', 'f', 'g']
>>> list3.reverse()
>>> list3
['g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']

Concatenating two lists

>>> list 
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi']
>>> list3
['g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']
>>> list+list3
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi', 'g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']

# alternatively, use extend(), which modifies the list in place
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi']
>>> list3
['g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']
>>> list.extend(list3)      # extend() also returns None, so nothing is echoed
>>> list
['xiaoyafei', 'lipeng', 'xixihaha', 'xixihaha', 'lisi', 'g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']

clear() empties the list

>>> list3
['g', 'f', 'e', 'd', 'c', 'c', 'b', 'a', 'D', 'A']
>>> list3.clear()
>>> list3
[]

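Deep copy and shallow copy

The lists are redefined here!

# assign list to list3 (no copy is made)
>>> list
['zhangsan', 'xiaoyafei']
>>> list3 = list
>>> id(list)            # both names report the same id
2661905413128
>>> id(list3)
2661905413128
>>> list[1] = 'lidapeng'    # so the two names share one and the same list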
>>> list3
['zhangsan', 'lidapeng']
>>> list
['zhangsan', 'lidapeng']

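First, a quick review of variables:

    name = 'xiaoyafei'      # creates the string xiaoyafei in memory and binds the name name to it
    name2 = name        # name2 now refers to that same object, whose value is xiaoyafei
    >>> id(name)        # the ids are identical, which proves it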
    2292157129392
    >>> id(name2)
    2292157129392

If we want list and list2 to be separate, we need the copy() method:

# Now list and list2 are independent list objects
>>> list = ['zhangsan','lisi']
>>> list
['zhangsan', 'lisi']
>>> list2 = list.copy()     # list.copy() returns a new list, bound here to list2
>>> list2
['zhangsan', 'lisi']
>>> list[1] = 'xiaoyafei'
>>> list2
['zhangsan', 'lisi']

# Check the ids of the two lists
>>> id(list)
2661905413128
>>> id(list2)
2661906326600

Key point 2--shallow copy

# Redefine the lists list and list2, then use append() to add list2 to list as a nested element
>>> list = ['zhangsan','lisi','wangwu']
>>> list2 = ['beijing','shanghai','guangzhou']
>>> list.append(list2)
>>> list
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai', 'guangzhou']]

# Now we call copy()
>>> list3 = list.copy()
>>> list3
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai', 'guangzhou']]
>>> id(list)        # the ids differ, so will a change to list also show up in list3?
1708042040328
>>> id(list3)
1708042953736

# Let's modify it and see
>>> list[1] = 'xixihaha'        # replace the element at index 1 of list; list3 does not change
>>> list
['zhangsan', 'xixihaha', 'wangwu', ['beijing', 'shanghai', 'guangzhou']]
>>> list3
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai', 'guangzhou']]

# Let's modify it once more
>>> list[3][0] = '珠海'
>>> list
['zhangsan', 'xixihaha', 'wangwu', ['珠海', 'shanghai', 'guangzhou']]
>>> list3
['zhangsan', 'lisi', 'wangwu', ['珠海', 'shanghai', 'guangzhou']]
# Something odd happened: why did list3 change too, when it did not in the previous test?

# The reason:
>>> id(list[3])     # the nested list has the same id in both lists, so a change to it is seen through both
1708042061768
>>> id(list3[3])
1708042061768

Key point 3--deep copy

# Here the lists are redefined once again
>>> list = ['zhangsan','lisi','wangwu']
>>> list2 = ['beijing','shanghai']
>>> list.append(list2)
>>> list
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai']]

# A deep copy requires importing the copy module from the standard library
>>> list
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai']]
>>> import copy     # import the standard-library copy module
>>> list3 = copy.deepcopy(list)
>>> list3
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai']]
>>> id(list)
1255945745416
>>> id(list3)
1255948140680

# Let's try it
>>> list[1] = 'xixihaha'        # replace the element at index 1 of list; list3 is unchanged
>>> list
['zhangsan', 'xixihaha', 'wangwu', ['beijing', 'shanghai']]
>>> list3
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai']]

# Try once more
>>> list[3][0] = '珠海'
>>> list
['zhangsan', 'xixihaha', 'wangwu', ['珠海', 'shanghai']]
>>> list3
['zhangsan', 'lisi', 'wangwu', ['beijing', 'shanghai']]
So why did list3 not change along with list this time?
The explanation follows.

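The difference between deep copy and shallow copy

A shallow copy creates a new outer object, but its elements are references to the same objects the original holds. If the copied object contains a mutable child object, a change to that child's contents is visible through both the original and the copy

A deep copy creates a new object and recursively copies everything inside it, so nothing is shared with the original

So for a list that contains a nested list: after a shallow copy, replacing a top-level element in one list does not affect the other, but changing an element of the nested list shows up in both. After a deep copy, neither kind of change affects the other list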
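
The whole comparison fits in one short script. This is a sketch with illustrative data; the `is` operator checks whether two names refer to the very same object, which is what the id() comparisons above were showing.

```python
import copy

original = ['zhangsan', 'lisi', ['beijing', 'shanghai']]
shallow = original.copy()          # new outer list, shared nested list
deep = copy.deepcopy(original)     # new outer list and new nested list

print(shallow[2] is original[2])   # True
print(deep[2] is original[2])      # False

original[2][0] = 'zhuhai'          # change inside the nested list
original[0] = 'wangwu'             # replace a top-level element

print(shallow)   # ['zhangsan', 'lisi', ['zhuhai', 'shanghai']]
print(deep)      # ['zhangsan', 'lisi', ['beijing', 'shanghai']]
```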

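Basic data types - tuples

Defining a tuple

Similar to a list, except that [] becomes (). A tuple is often called a read-only list

Characteristics of tuples

Can hold multiple values

Immutable: the tuple itself cannot be changed, but if it holds mutable objects (a list, for example), those objects can still be modified

Elements are defined in left-to-right order and accessed by index starting from 0; tuples are ordered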
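
The second characteristic is easy to misread, so here is a minimal sketch of it. The tuple `record` and its contents are invented for the example.

```python
record = ('zhangsan', [90, 85])

record[1].append(77)        # allowed: the list inside the tuple is mutable
print(record)               # ('zhangsan', [90, 85, 77])

try:
    record[1] = [100]       # not allowed: a tuple slot cannot be reassigned
except TypeError as exc:
    print('TypeError:', exc)
```

The tuple still refers to the same two objects throughout; only the contents of the list changed.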

Uses of tuples

Tell the reader explicitly that this data must not be modified

Database connection settings and similar configuration

Creating a tuple

    ages = (11,22,33,44,55)
    or
    ages = tuple([11,22,33,44,55])    # tuple() takes a single iterable, not separate arguments
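
A few related forms, sketched with illustrative names. The one-element case is the usual stumbling block: it is the comma, not the parentheses, that makes a tuple.

```python
ages = tuple([11, 22, 33])   # from a list
letters = tuple('abc')       # from a string: one element per character
single = (11,)               # trailing comma: a one-element tuple
not_a_tuple = (11)           # just the integer 11 in parentheses

print(ages, letters)         # (11, 22, 33) ('a', 'b', 'c')
print(type(single).__name__, type(not_a_tuple).__name__)   # tuple int
```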

Common tuple operations

Indexing

>>> ages = (11,22,33,44,55)
>>> ages[0]
11
>>> ages[3]
44

Slicing

>>> ages
(11, 22, 33, 44, 55)
>>> ages[1:2]
(22,)
>>> ages[1:]
(22, 33, 44, 55)
>>> ages[::2]
(11, 33, 55)

Looping

>>> ages
(11, 22, 33, 44, 55)
>>> for i in ages:
...  print(i)
...
11
22
33
44
55

len() length

>>> ages
(11, 22, 33, 44, 55)
>>> len(ages)
5

in membership test

>>> ages
(11, 22, 33, 44, 55)
>>> 11 in ages
True

Finally, modifying a tuple

>>> ages
(11, 22, 33, 44, 55)
>>> ages[1] = 'libaba'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment      # fails immediately: a tuple cannot be modified

Mutable and immutable data types, and hash

    Mutable types           Immutable types
    list                    number
                            string
                            tuple

Let's look at what mutable and immutable mean:

Lists

>>> list =  [1,2,3,4]
>>> id(list)
1602240328712
>>> list.append(5)
>>> list
[1, 2, 3, 4, 5]
>>> id(list)
1602240328712

Numbers

>>> a = 1
>>> id(a)
1664904208
>>> a += 1
>>> a
2
>>> id(a)
1664904240

Strings

# Example 1
>>> str = 'hello'
>>> str[1] = 'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
# Example 2
>>> str = 'hello'
>>> id(str)
1602241176272
>>> str += ' world'
>>> str
'hello world'
>>> id(str)
1602241242992

Tuples

>>> tup = (1,2,3,4)
>>> tup[1] = 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

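So the key distinction is whether the object can be changed in place: a mutable object keeps the same id when its contents change, whereas "changing" an immutable value actually produces a new object with a different id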

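hash

Hashing takes input of arbitrary length and, through a hash algorithm, turns it into a fixed-length output; that output is the hash value

This conversion is a compressing mapping: the space of hash values is usually much smaller than the space of inputs, and different inputs may hash to the same output, so the input cannot be uniquely determined from its hash value

Put simply, it is a function that compresses a message of any length into a fixed-length message digest

Characteristics

A hash value is computed from the contents of the value being hashed, so that value must stay fixed; this is why Python's built-in mutable containers, such as list, cannot be hashed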
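
A quick way to see which values can be hashed is to try it and catch the error. `is_hashable` below is a hypothetical helper written for this sketch, not a built-in.

```python
def is_hashable(value):
    """Return True if hash(value) succeeds (illustrative helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True

print(is_hashable('hello'))        # True
print(is_hashable((1, 2, 3)))      # True
print(is_hashable([1, 2, 3]))      # False
print(is_hashable((1, [2, 3])))    # False: the tuple contains a list
print(hash((1, 2, 3)) == hash((1, 2, 3)))   # True: equal values, equal hashes
```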

Uses

File signatures

MD5 digests (a one-way digest rather than encryption; it cannot be reversed)

Password verification

Syntax

>>> hash('xiaoyafei')
-1204730465301308513
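
Note that for strings the built-in hash() is salted per interpreter process by default, so the number above will differ from run to run. For the uses listed earlier (file signatures, digests), the standard-library hashlib module gives stable results. The sketch below uses SHA-256 as an assumption; hashlib.md5 has the same interface.

```python
import hashlib

digest = hashlib.sha256('xiaoyafei'.encode('utf-8')).hexdigest()
longer = hashlib.sha256(('xiaoyafei' * 1000).encode('utf-8')).hexdigest()

print(len(digest), len(longer))   # 64 64: fixed length whatever the input size
print(digest == hashlib.sha256(b'xiaoyafei').hexdigest())   # True: same input, same digest
print(digest == longer)           # False
```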

Basic data types - dictionaries

Defining a dictionary

The dictionary is the only built-in mapping type in Python.

>>> info = {
...     'stu1':'zhangsan',
...     'stu2':'lisi',
...     'stu3':'wangwu'
... }

# Another way is to call dict()
>>> person = dict({'name':'zhangsan'})

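Characteristics of dictionaries

key-value structure

Keys must be hashable (in practice, an immutable type) and must be unique

Can hold any number of values; values can be modified and need not be unique

Unordered (since Python 3.7 a dictionary preserves insertion order, but its elements still cannot be accessed by position)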
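
These rules can be checked directly. A minimal sketch with made-up keys and values:

```python
# str, tuple and int keys all work; the value 'zhangsan' appears twice
info = {'stu1': 'zhangsan', ('room', 101): 'lisi', 3: 'zhangsan'}

info['stu1'] = 'wangwu'     # same key again: the value is replaced
print(len(info))            # 3

try:
    info[['stu', 4]] = 'x'  # a list is unhashable, so it cannot be a key
except TypeError as exc:
    print('TypeError:', exc)

print(list(info))           # ['stu1', ('room', 101), 3] on Python 3.7+
```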

Common dictionary operations

Adding

>>> info
{'stu1': 'zhangsan', 'stu2': 'lisi', 'stu3': 'wangwu'}
>>> info['stu4'] = 'xixihaha'
>>> info
{'stu1': 'zhangsan', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}

Modifying

>>> info
{'stu1': 'zhangsan', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}
>>> info['stu1'] = 'student1'           # only the value is changed
>>> info
{'stu1': 'student1', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}

Removing duplicates with a set

# A set removes duplicates automatically
>>> s = {1,1,2,2,3,3,4,4}
>>> s
{1, 2, 3, 4}

Converting a list to a set

>>> list = [1,2,3,4,5,6,7]
>>> s = set(list)
>>> s
{1, 2, 3, 4, 5, 6, 7}

in: the standard syntax for checking whether a key exists

>>> info
{'stu1': 'student1', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}
>>> 'stu1' in info
True

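Getting a value with get(): returns the value if the key exists, otherwise returns None (which the interactive prompt does not display)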

>>> info
{'stu1': 'student1', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}
>>> info.get('stu1')
'student1'

# With [], the value is returned if the key exists; otherwise a KeyError is raised
>>> info['stu1']
'student1'
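
Both lookups side by side for a missing key, as a small sketch. The key 'stu9' and the fallback 'unknown' are illustrative.

```python
info = {'stu1': 'student1', 'stu2': 'lisi'}

print(info.get('stu1'))              # student1
print(info.get('stu9'))              # None
print(info.get('stu9', 'unknown'))   # unknown (second argument is the fallback)

try:
    info['stu9']
except KeyError as exc:
    print('KeyError:', exc)          # KeyError: 'stu9'
```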

pop() removes a key

>>> info
{'stu1': 'student1', 'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}
>>> info.pop('stu1')             # returns the value that was removed
'student1'
>>> info
{'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha'}

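popitem() removes and returns the most recently inserted key-value pair (Python 3.7+; in earlier versions the pair removed was arbitrary)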

>>> info
{'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu1': 'zhangsan', 'stu5': 'lihaha'}
>>> info.popitem()      
('stu5', 'lihaha')
>>> info.popitem()
('stu1', 'zhangsan')
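
To make the last-in, first-out behaviour explicit, the sketch below builds a dictionary step by step and then empties it. It assumes Python 3.7 or later.

```python
info = {}
info['stu2'] = 'lisi'
info['stu1'] = 'zhangsan'
info['stu5'] = 'lihaha'      # inserted last, so removed first

last = info.popitem()
previous = info.popitem()
print(last)       # ('stu5', 'lihaha')
print(previous)   # ('stu1', 'zhangsan')
print(info)       # {'stu2': 'lisi'}

info.popitem()
try:
    info.popitem()           # nothing left to remove
except KeyError:
    print('popitem() on an empty dictionary raises KeyError')
```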

discard() removes an element from a set

>>> s
{1, 2, 3, 4, 5, 6, 7}
>>> s.discard(6)
>>> s.discard(10)           # no error even though 10 is not in the set
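
Sets also have a remove() method, which differs from discard() only in how it treats a missing element. A minimal sketch:

```python
s = {1, 2, 3}

s.discard(10)          # absent element: silently ignored
try:
    s.remove(10)       # absent element: raises KeyError
    raised = False
except KeyError:
    raised = True

s.remove(3)            # present element: removed, same as discard(3)
print(raised)          # True
print(s)               # {1, 2}
```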

del deletes the whole dictionary (the name no longer exists afterwards)

>>> person
{'name': 'zhangsan'}
>>> del person
>>> person
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'person' is not defined

clear() empties the set

>>> s
{1, 2, 3, 4, 5, 7}
>>> s.clear()
>>> s
set()

Printing all keys

>>> info
{'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu5': 'lihaha', 'stu6': 'zhangsan'}
>>> info.keys()
dict_keys(['stu2', 'stu3', 'stu4', 'stu5', 'stu6'])

Printing all values

>>> info.values()
dict_values(['lisi', 'wangwu', 'xixihaha', 'lihaha', 'zhangsan'])

items() gives each key and value as a tuple, which lets the dictionary be converted to a list of pairs

>>> info
{'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu5': 'lihaha', 'stu6': 'zhangsan'}
>>> info.items()
dict_items([('stu2', 'lisi'), ('stu3', 'wangwu'), ('stu4', 'xixihaha'), ('stu5', 'lihaha'), ('stu6', 'zhangsan')])

update() merges another dictionary into this one: existing keys are overwritten, missing keys are added

>>> info
{'stu2': 'lisi', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu5': 'lihaha', 'stu6': 'zhangsan'}
>>> info2
{'stu2': '北京', 'stu8': '上海', 'stu9': '广州'}
>>> info.update(info2)
>>> info
{'stu2': '北京', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu5': 'lihaha', 'stu6': 'zhangsan', 'stu8': '上海', 'stu9': '广州'}

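setdefault(): if the key exists nothing is changed and its current value is returned; if it does not exist, the key is created with the given value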

>>> info2
{'stu2': '北京', 'stu8': '上海', 'stu9': '广州'}
>>> info2.setdefault(2,'new2')
'new2'
>>> info2
{'stu2': '北京', 'stu8': '上海', 'stu9': '广州', 2: 'new2'}
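
The transcript above only shows the "key is missing" case. The sketch below covers both cases and adds a common use of setdefault(), grouping items into lists; the sample data is invented.

```python
info2 = {'stu2': 'beijing'}

print(info2.setdefault('stu2', 'new'))   # beijing: existing value is kept
print(info2.setdefault('stu8', 'new'))   # new: key created with the default
print(info2)                             # {'stu2': 'beijing', 'stu8': 'new'}

# Group names by city without checking first whether the key exists
groups = {}
for name, city in [('a', 'bj'), ('b', 'sh'), ('c', 'bj')]:
    groups.setdefault(city, []).append(name)
print(groups)                            # {'bj': ['a', 'c'], 'sh': ['b']}
```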

fromkeys() does not add anything to the original dictionary; it returns a new dictionary built from the given keys

>>> info.fromkeys(['a','b','c','d'],['xiaoyafei'])
{'a': ['xiaoyafei'], 'b': ['xiaoyafei'], 'c': ['xiaoyafei'], 'd': ['xiaoyafei']}
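
One thing to watch when the value is mutable, as the list is in the call above: every key receives the same object, not a copy. The sketch below shows the effect and one possible alternative, a dictionary comprehension.

```python
shared = dict.fromkeys(['a', 'b'], [])
shared['a'].append(1)        # the single shared list is modified
print(shared)                # {'a': [1], 'b': [1]}

# A comprehension evaluates [] once per key, giving separate lists
separate = {key: [] for key in ['a', 'b']}
separate['a'].append(1)
print(separate)              # {'a': [1], 'b': []}
```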

Looping over a dictionary

>>> info
{'stu2': '北京', 'stu3': 'wangwu', 'stu4': 'xixihaha', 'stu5': 'lihaha', 'stu6': 'zhangsan', 'stu8': '上海', 'stu9': '广州'}
>>> for i in info:
...     print(i,info[i])
...
stu2 北京
stu3 wangwu
stu4 xixihaha
stu5 lihaha
stu6 zhangsan
stu8 上海
stu9 广州

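# Another way is items(), which yields key-value pairs directly (in Python 3 it returns a view, so no list is built; in Python 2 it built a full list, which was slower)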
>>> for k,v in info.items():
...     print(k,v)
...
stu2 北京
stu3 wangwu
stu4 xixihaha
stu5 lihaha
stu6 zhangsan
stu8 上海
stu9 广州

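Sets

Creating a set

set = {'xiaozhang','laowang','dage'}
Some might say this is obviously a dictionary, so let's check with type()
>>> set = {'xiaozhang','laowang','dage'}
>>> type(set)
<class 'set'>
So an empty {} is a dictionary, and {} containing key: value pairs is also a dictionary, but {} containing plain elements is a set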
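
Since {} is taken by the empty dictionary, an empty set has to be written set(). The sketch below uses illustrative names and deliberately avoids calling a variable `set`, because that would shadow the built-in and make set() unusable afterwards.

```python
empty_dict = {}
empty_set = set()
names = {'xiaozhang', 'laowang'}   # elements only: a set
pairs = {'xiaozhang': 1}           # key: value pairs: a dictionary

for value in (empty_dict, empty_set, names, pairs):
    print(type(value).__name__)
# dict
# set
# set
# dict
```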

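A collection made up of one or more definite elements is called a set

The elements of a set have three characteristics

Definiteness (elements must be hashable)

Distinctness (duplicates are removed)

Unorderedness (elements have no particular order)

Note: sets exist for removing duplicates and for relational operations

An example to illustrate relational operations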

  Define the people who bought the iPhone 7 and the people who bought the iPhone 8
    i7 = {'xiaozhang','xiangwang','xiaoli'}
    i8 = {'xiaoli','laozhao','laowang'}

Intersection intersection()

# People who bought both
>>> l_p  = i7&i8
>>> l_p
{'xiaoli'}

# This also works
>>> i7
{'xiaoli', 'xiangwang', 'xiaozhang'}
>>> i8
{'xiaoli', 'laozhao', 'laowang'}
>>> i7.intersection(i8)
{'xiaoli'}

Difference difference()

# People who bought only the iPhone 7
>>> i7.difference(i8)
{'xiangwang', 'xiaozhang'}

# People who bought only the iPhone 8
>>> i8.difference(i7)
{'laowang', 'laozhao'}

Union union()

# People who bought the iPhone 7 or the iPhone 8 (or both)
>>> i7.union(i8)
{'laozhao', 'laowang', 'xiaozhang', 'xiaoli', 'xiangwang'}
>>> i8.union(i7)
{'laozhao', 'laowang', 'xiaozhang', 'xiaoli', 'xiangwang'}

# This also works
>>> i7|i8
{'laozhao', 'laowang', 'xiaozhang', 'xiaoli', 'xiangwang'}

Symmetric difference symmetric_difference(): everything outside the intersection

# In other words: people who bought only the i7 or only the i8
>>> i7 = {'xiaozhang','xiangwang','xiaoli'}
>>> i8 = {'xiaoli','laozhao','laowang'}
>>> i7.symmetric_difference(i8)
{'laowang', 'xiaozhang', 'xiangwang', 'laozhao'}
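
Each of the four methods has an operator form. The sketch below reuses the i7 and i8 sets and checks that the pairs agree; sorted() is used only to make the printed order predictable, since sets are unordered.

```python
i7 = {'xiaozhang', 'xiangwang', 'xiaoli'}
i8 = {'xiaoli', 'laozhao', 'laowang'}

print((i7 & i8) == i7.intersection(i8))           # True
print((i7 | i8) == i7.union(i8))                  # True
print((i7 - i8) == i7.difference(i8))             # True
print((i7 ^ i8) == i7.symmetric_difference(i8))   # True

print(sorted(i7 - i8))   # ['xiangwang', 'xiaozhang']
print(sorted(i7 ^ i8))   # ['laowang', 'laozhao', 'xiangwang', 'xiaozhang']
```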

Subset

# Think of it this way: Beijing is a subset of China, and China is a superset of Beijing
>>> s = {1,2,3}
>>> s2 = {1,2,3,4,5,6}
>>> s.issubset(s2)
True

# The operator form also works
>>> s<=s2
True

Superset

>>> print(s,s2)
{1, 2, 3} {1, 2, 3, 4, 5, 6}
>>> s2.issuperset(s)
True

# The operator form also works
>>> s2>=s
True
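
The comparison operators come in two strengths: <= and >= correspond to issubset() and issuperset(), while < and > additionally require the two sets to differ (a proper subset or superset). A small sketch:

```python
s = {1, 2, 3}
s2 = {1, 2, 3, 4, 5, 6}

print(s <= s2, s.issubset(s2))      # True True
print(s2 >= s, s2.issuperset(s))    # True True
print(s < s2)                       # True: s is a proper subset of s2

# A set is a subset of itself, but not a proper subset
print(s <= s, s < s)                # True False
```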

isdisjoint() checks whether two sets have no elements in common

>>> set1 = {1,2,3}
>>> set2 = {7,8,9}
>>> set1.isdisjoint(set2)
True

difference_update() computes the difference and stores it back in set1

>>> set1
{1, 2, 3, -1, -2}
>>> set2
{1, 2, 3, 7, 8, 9}
>>> set1.difference(set2)
{-1, -2}
>>> set1.difference_update(set2)
>>> set1
{-1, -2}
>>> set2
{1, 2, 3, 7, 8, 9}

Original article: https://www.cnblogs.com/xiaoyafei/p/8904015.html