• Fixing a DataNode that fails to start on one cluster node (java.net.BindException: Address already in use)



    Symptom:

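    On one node in the cluster, the DataNode service shut down immediately after being started. No DataNode log could be found on the operating system (possibly the log file was deleted automatically after the failed start), but fortunately the error log could still be viewed in the management UI: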

     
     
     
    Expanding the error entry shows the details (see the log at the end of this post).
     
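    The DataNode port is 50010, but netstat -ntulp | grep 50010 shows nothing on that port. Note that the -l flag restricts the output to listening sockets, so a lingering connection that is not in the listening state would not appear there.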
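
    When netstat shows no listener, one way to check whether the port is really unavailable is to attempt the bind directly. The following is a minimal Python sketch, not part of HDFS or of the original post; the function name can_bind and its parameters are hypothetical, and it only reports whether a plain TCP bind succeeds at the moment it runs.

```python
import errno
import socket


def can_bind(port: int, host: str = "0.0.0.0", reuse_addr: bool = False) -> bool:
    """Return True if a TCP socket can currently bind to (host, port).

    can_bind is a hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if reuse_addr:
            # Optionally request address reuse before binding.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Same condition Java reports as "Address already in use".
                return False
            raise
        return True
```

    For example, can_bind(50010) returning False on the affected node would be consistent with the BindException, even if netstat lists no listener.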

    Analysis:

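    Cause: after an application crashes, it may leave a lingering socket behind. To reuse that socket's address early, the program that binds to it must set the SO_REUSEADDR flag on its socket, and according to the referenced post the HDFS DataNode does not do so. The workaround is to bind to the port with an application that does set SO_REUSEADDR and then stop that application. The netcat tool can be used for this.
    Solution: install nc, use it to occupy port 50010, then stop nc. After that, the DataNode started normally.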
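
    The same bind-then-release step can be sketched in Python for hosts where nc is not installed. This is an assumption-laden stand-in for nc -l 50010, not something the original post or the referenced article tested: the function name claim_and_release is hypothetical, and whether it clears the stuck state in the same way as netcat has not been verified.

```python
import socket


def claim_and_release(port: int, host: str = "0.0.0.0") -> None:
    """Bind to (host, port) with SO_REUSEADDR set, listen, then close.

    claim_and_release is a hypothetical helper; it mirrors the idea of
    starting and then stopping `nc -l <port>`.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Set the flag before bind so the lingering address can be reused.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(1)
    # Leaving the with-block closes the socket and frees the port again.
```

    On the affected node this would be called as claim_and_release(50010) before starting the DataNode again.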

     
     

    Reference link:
    http://www.nosql.se/2013/10/hadoop-hdfs-datanode-java-net-bindexception-address-already-in-use/
    Reference text:
    After an application crashes it might leave a lingering socket, so to reuse that socket early you need to set the socket flag SO_REUSEADDR when attempting to bind to it to be allowed to reuse it. The HDFS datanode doesn't do that, and I didn't want to restart the HBase regionserver (which was locking the socket with a connection it hadn't realized was dead). The solution was to bind to the port with an application that sets SO_REUSEADDR and then stop that application, I used netcat for that:
    # nc -l 50010


    2017-02-17 20:54:52,250 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Shutdown complete.
    2017-02-17 20:54:52,251 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:444)
        at sun.nio.ch.Net.bind(Net.java:436)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at com.cloudera.io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
        at com.cloudera.io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
        at com.cloudera.io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
        at com.cloudera.io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
        at com.cloudera.io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
        at com.cloudera.io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
        at com.cloudera.io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
        at com.cloudera.io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
        at com.cloudera.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
        at com.cloudera.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at com.cloudera.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at com.cloudera.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
    2017-02-17 20:54:52,262 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
    2017-02-17 20:54:52,264 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down DataNode at cdh1/192.168.5.78




  • Original article: https://www.cnblogs.com/xiaohe001/p/6427413.html