• Spark Programming: Fundamental Operations


    max

    max(key=None)

    Find the maximum item in this RDD.

    Parameters: key – a function used to generate the key for comparing

    Example:

    mean

    mean()

    Compute the mean of this RDD’s elements.

    min

    min(key=None)

    Find the minimum item in this RDD.

    Parameters: key – a function used to generate the key for comparing

    name/setName

    name()

    setName(name)

    Assign a name to the RDD, or return the RDD's current name.

    Example:

    others

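    sc.parallelize(): creates an RDD from a local collection; using xrange (range in Python 3) is recommended when the input is a range

    getNumPartitions(): returns the number of partitions

    sc.emptyRDD(): returns an empty RDD

    glom(): returns the elements of each partition grouped into one list per partition

    collect(): returns all elements as a list (normally to the driver program)

    Example: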

    sc.textFile(path): reads a file and returns an RDD (see Actions II for details)

    Official signature: textFile(name, minPartitions=None, use_unicode=True)

    Supported sources: reads a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and returns it as an RDD of strings.

    Example (reading a local file):

  • Original article: https://www.cnblogs.com/loadofleaf/p/5090134.html