• Spark optimization: data structures (reducing memory usage)


    The official documentation puts it this way:

    The first way to reduce memory consumption is to avoid the Java features that add overhead, such as pointer-based data structures and wrapper objects. There are several ways to do this:
    1. Design your data structures to prefer arrays of objects, and primitive types, instead of the standard Java or Scala collection classes (e.g. HashMap). The fastutil library provides convenient collection classes for primitive types that are compatible with the Java standard library.
    2. Avoid nested structures with a lot of small objects and pointers when possible.
    3. Consider using numeric IDs or enumeration objects instead of strings for keys.
    4. If you have less than 32 GB of RAM, set the JVM flag -XX:+UseCompressedOops to make pointers be four bytes instead of eight. You can add these options in spark-env.sh.

    In short, prefer primitive types and arrays over complex collection classes such as HashMap or LinkedList, because those take up considerably more memory.
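
    As an illustration (not from the original post), here is a minimal Scala sketch of the difference; the element count is arbitrary, and the fastutil class mentioned in the comment is only needed when a map is genuinely required:

```scala
// Memory-heavy: a java.util.HashMap boxes every key and value into Integer
// objects (each with its own object header) and allocates a separate node per entry.
val boxed = new java.util.HashMap[Integer, Integer]()
for (i <- 0 until 1000000) boxed.put(i, i * 2)

// Memory-light: a single primitive Int array, 4 bytes per element,
// stored contiguously with no per-element objects or pointers.
// (If a map is really needed, fastutil's Int2IntOpenHashMap keeps keys/values primitive.)
val doubled = new Array[Int](1000000)
for (i <- doubled.indices) doubled(i) = i * 2
```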

    Next, avoid nested structures built from many small objects and pointers where possible: every small object carries its own object header (roughly 16 bytes) and every reference adds another pointer, so deeply nested structures pay that overhead once per tiny object.
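
    A rough sketch of what "flattening" such a structure can look like; the Point/Polygon names are made up purely for illustration:

```scala
// Pointer-heavy: every Point is its own heap object, and the List adds one
// cons cell (another object with two pointers) per element on top of that.
case class Point(x: Double, y: Double)
case class Polygon(points: List[Point])

// Flatter: the same coordinates held in two primitive arrays inside a single
// object, so there is one header instead of one per point and per list cell.
class FlatPolygon(val xs: Array[Double], val ys: Array[Double])

val nested = Polygon(List(Point(0, 0), Point(1, 0), Point(1, 1)))
val flat   = new FlatPolygon(Array(0.0, 1.0, 1.0), Array(0.0, 0.0, 1.0))
```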

    Next, prefer numeric IDs or enumeration types over strings for keys, because a Java String carries an object header and a backing character array on top of the text itself, so string keys take noticeably more memory.
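
    A minimal, hypothetical sketch of replacing String keys with numeric IDs via a small dictionary (in a real Spark job such a dictionary would typically be broadcast to the executors):

```scala
// Example records keyed by country name (hypothetical data).
val records = Seq("China" -> 10, "Japan" -> 20, "China" -> 30, "Germany" -> 40)

// Build a String -> Int dictionary once, then key everything by the compact Int.
val dict: Map[String, Int] = records.map(_._1).distinct.zipWithIndex.toMap
val encoded: Seq[(Int, Int)] = records.map { case (country, v) => (dict(country), v) }
// encoded == Seq((0, 10), (1, 20), (0, 30), (2, 40))
```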

    If your nodes have less than 32 GB of RAM, set the JVM flag -XX:+UseCompressedOops so that object pointers take 4 bytes instead of 8. The limit is 32 GB because compressed 32-bit pointers are scaled by the JVM's 8-byte object alignment, so they can address at most 2^32 × 8 bytes = 32 GB of heap. The option can be added in spark-env.sh.
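
    The docs suggest putting the flag in spark-env.sh; an equivalent sketch is to pass it through the executor JVM options in SparkConf (the application name here is made up):

```scala
import org.apache.spark.SparkConf

// Ask executor JVMs to use 4-byte compressed object pointers.
// The driver JVM needs the same flag on its own command line if desired.
val conf = new SparkConf()
  .setAppName("compressed-oops-demo")
  .set("spark.executor.extraJavaOptions", "-XX:+UseCompressedOops")
```

    Note that on recent HotSpot JVMs compressed oops are already enabled by default for heaps under roughly 32 GB, so setting the flag explicitly mainly documents the intent.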

  • Original article: https://www.cnblogs.com/hark0623/p/4515213.html