• Hadoop HDFS: fixing local debug access when the cluster has both internal and public IPs


    Background:
    A test cluster was set up on cloud VMs and loaded with some data, with the intent of running debug sessions from a local IDEA project. The local IDEA, however, could not connect to the test cluster.
    Inside the cluster, the hosts mappings use internal IPs (internal IP to hostname); from the local machine, the cluster is only reachable through its public IPs.
    The local IDE has no route to the internal IPs, so it fails with the 60000 ms connect timeout below: the client is told to connect to an internal IP, which can never succeed from outside the cluster network.
    The error log:

    WARN BlockReaderFactory: I/O error constructing remote block reader.
    org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.10:9866]
    	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
    	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
    	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    	at java.io.DataInputStream.read(DataInputStream.java:100)
    	at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
    	at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
    	at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
    	at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:206)
    	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:244)
    	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:47)
    	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:277)
    	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:214)
    	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
    	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
    	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
    	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    	at org.apache.spark.scheduler.Task.run(Task.scala:109)
    	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    19/08/21 12:14:05 WARN DFSClient: Failed to connect to /10.0.0.10:9866 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.10:9866]
    org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.0.10:9866]
    	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
    	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
    	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
    	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
    	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
    	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:847)
    	at java.io.DataInputStream.read(DataInputStream.java:100)
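Before changing any configuration, the diagnosis can be confirmed from the local machine by probing the DataNode transfer port on both addresses. A small sketch (10.0.0.10:9866 is the DataNode address from the log above; the public IP you probe would be your own cluster's):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From the laptop, the internal DataNode address from the log is expected
# to be unreachable (times out), matching the ConnectTimeoutException:
# port_reachable("10.0.0.10", 9866)   -> expected False from outside the VPC
```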
    

    The fix is to add the following to the hdfs-site.xml under the IDE project's resources directory. It tells the HDFS client to connect to DataNodes by hostname instead of the internal IP returned by the NameNode:

    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
        <description>only config on clients</description>
    </property>
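The property alone is not enough: the local machine must also be able to resolve the DataNode hostnames to their public IPs, for example via local hosts-file entries. A sketch with placeholder hostnames and public IPs (not from the original post):

```
# /etc/hosts on the local machine (C:\Windows\System32\drivers\etc\hosts on Windows)
# Map each DataNode hostname to its PUBLIC IP
203.0.113.10  hadoop-node1
203.0.113.11  hadoop-node2
203.0.113.12  hadoop-node3
```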
    

    Problem solved.
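If editing hdfs-site.xml in resources is inconvenient, the same client property can be passed through Spark's `spark.hadoop.*` prefix, which forwards it into the Hadoop Configuration the job uses. A minimal sketch (the jar and class names are placeholders, not from the original post):

```shell
spark-submit \
  --conf spark.hadoop.dfs.client.use.datanode.hostname=true \
  --class com.example.DebugJob \
  target/debug-job.jar
```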

  • Original post: https://www.cnblogs.com/jiangxiaoxian/p/11388016.html