• Spark Scala: replacing strings in a DataFrame with na.replace


    Create an example DataFrame:

    scala> val df = sc.parallelize(Seq(
         |   (0,"cat26","cat26"),
         |   (1,"cat67","cat26"),
         |   (2,"cat56","cat26"),
         |   (3,"cat8","cat26"))).toDF("Hour", "Category", "Value")

    Method 1: pass a Scala Map

    scala> df.na.replace("*", Map[Any, Any](
         |      "cat26" -> "cat23"
         |    )).show()
    +----+--------+-----+
    |Hour|Category|Value|
    +----+--------+-----+
    |   0|   cat23|cat23|
    |   1|   cat67|cat23|
    |   2|   cat56|cat23|
    |   3|    cat8|cat23|
    +----+--------+-----+

    Examples from the official Spark source, org/apache/spark/sql/DataFrameNaFunctionsSuite.scala
    (here "name" is a column name):

    df.na.replace("name", Map(
            "Bob" -> "Bravo",
            "Alice" -> null
          ))
    
    df.na.replace("*", Map[Any, Any](
         false -> null
       ))
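    The examples above replace values either in one named column or in every column via "*". As a minimal sketch, assuming a local SparkSession (the app name and the val names `onlyCategory` and `both` are illustrative, and `Seq(...).toDF` stands in for the `sc.parallelize` call used earlier), the replacement can also be limited to one column or to an explicit list of columns:

```scala
import org.apache.spark.sql.SparkSession

// Local session for experimentation; inside spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("na-replace-sketch") // illustrative name
  .getOrCreate()
import spark.implicits._

// Same sample data as the article's example DataFrame.
val df = Seq(
  (0, "cat26", "cat26"),
  (1, "cat67", "cat26"),
  (2, "cat56", "cat26"),
  (3, "cat8", "cat26")
).toDF("Hour", "Category", "Value")

// Replace only in the "Category" column; "Value" is left untouched.
val onlyCategory = df.na.replace("Category", Map("cat26" -> "cat23"))

// Replace in an explicit list of columns via the Seq[String] overload.
val both = df.na.replace(Seq("Category", "Value"), Map("cat26" -> "cat23"))

onlyCategory.show()
both.show()
```

    With a homogeneous map such as Map("cat26" -> "cat23") the type parameter is inferred as String, so the explicit Map[Any, Any] annotation is only needed when keys and values have mixed types (for example when mapping a value to null).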

    Method 2: pass a Guava ImmutableMap

    Replace 0 with 9 in the hour column:

    scala> import com.google.common.collect.ImmutableMap

    scala> df.na.replace("hour", ImmutableMap.of(0, 9)).show()
    +----+--------+-----+
    |Hour|Category|Value|
    +----+--------+-----+
    |   9|   cat26|cat26|
    |   1|   cat67|cat26|
    |   2|   cat56|cat26|
    |   3|    cat8|cat26|
    +----+--------+-----+

    Replace "cat26" with "cat222" in all columns ("*"):

    scala> df.na.replace("*", ImmutableMap.of("cat26", "cat222")).show()
    +----+--------+------+
    |Hour|Category| Value|
    +----+--------+------+
    |   0|  cat222|cat222|
    |   1|   cat67|cat222|
    |   2|   cat56|cat222|
    |   3|    cat8|cat222|
    +----+--------+------+

    Example from the official Spark source, org/apache/spark/sql/DataFrameNaFunctions.scala:

    * {{{
    *   import com.google.common.collect.ImmutableMap;
    *
    *   // Replaces all occurrences of 1.0 with 2.0 in column "height".
    *   df.na.replace("height", ImmutableMap.of(1.0, 2.0));
    *
    *   // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
    *   df.na.replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));
    *
    *   // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
    *   df.na.replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
    * }}}
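    The overload used in Method 2 takes a java.util.Map, and Guava's ImmutableMap is one implementation of that interface. As a hedged sketch, assuming a local SparkSession and no Guava on the classpath, a plain java.util.HashMap should work the same way. The second pair ("cat8" -> "cat888") and the val names `mapping` and `replaced` are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession
import java.util.{HashMap => JHashMap}

// Local session for experimentation; inside spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("na-replace-java-map") // illustrative name
  .getOrCreate()
import spark.implicits._

// Same sample data as the article's example DataFrame.
val df = Seq(
  (0, "cat26", "cat26"),
  (1, "cat67", "cat26"),
  (2, "cat56", "cat26"),
  (3, "cat8", "cat26")
).toDF("Hour", "Category", "Value")

// Build a java.util.Map with two replacement pairs (the second is hypothetical).
val mapping = new JHashMap[String, String]()
mapping.put("cat26", "cat222")
mapping.put("cat8", "cat888")

// "*" applies the string replacements to the string columns; Hour is an Int column.
val replaced = df.na.replace("*", mapping)
replaced.show()
```

    A mutable HashMap is convenient when the replacement pairs are assembled at runtime, for example from a lookup table, whereas ImmutableMap.of is limited to a small fixed number of pairs.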

  • Original article: https://www.cnblogs.com/v5captain/p/14846377.html