
Error when reading a file on HDFS from the local environment: BlockReaderFactory: I/O error constructing remote block reader.


Background:
While developing the offline data warehouse for an online-education project in IntelliJ IDEA, the *.log data on HDFS has to be ETL'd and loaded into the dwd-layer tables, i.e. ods => dwd. If the job is run in local mode (local[*]), this raises the question of how the local test environment (on the public network) communicates with the HDFS file system on the cloud server; put simply, the local environment has to read data that lives on the cloud server.
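For context, a minimal sketch of such a local-mode read is shown below. Only the HDFS path is taken from the job log further down; the class name and the count() action are illustrative assumptions, not the project's actual ETL code. The cluster's core-site.xml/hdfs-site.xml also need to be on the local classpath so that the nameservice mycluster can be resolved.

import org.apache.spark.sql.SparkSession

object OdsToDwdLocalTest {
  def main(args: Array[String]): Unit = {
    // Local mode: driver and executors all run on the local test machine.
    val spark = SparkSession.builder()
      .appName("ods-to-dwd-local-test")
      .master("local[*]")
      .getOrCreate()

    // Read the raw log directly from the cloud cluster's HDFS (path taken from the job log below).
    val raw = spark.read.textFile("hdfs://mycluster/warehouse/online_education/ods/baseadlog.log")

    // Stand-in for the real ETL: any action that pulls block data has to reach the DataNodes.
    println(raw.count())

    spark.stop()
  }
}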

The error:

21/06/03 13:41:28 INFO HadoopRDD: Input split: hdfs://mycluster/warehouse/online_education/ods/baseadlog.log:0+285
21/06/03 13:42:29 WARN BlockReaderFactory: I/O error constructing remote block reader.
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/172.18.39.51:9866]
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3090)
	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:778)
	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:693)
	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:617)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:841)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:889)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
	at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
	at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
	at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
	at org.apache.hadoop.mapred.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:208)
	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:246)
	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:48)
	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:293)
	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:224)
	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
	at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:227)
	at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:116)
	at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:109)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:858)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:123)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Problem analysis:

  • Is the cloud server's firewall off? Yes! Configuring passwordless SSH login and turning off the firewall were the very first steps when the cluster was set up.
  • Is the cluster actually running? Yes! Check the process status with jps, or open the WebUI (the port depends on the Hadoop version: 9870 or 50070). Note that if hiveserver2 and metastore are not started, the error reported would be a different one.
  • The same job has run before against a locally hosted virtual-machine cluster without this problem, so the most likely cause is the network: the local VMs and the local test environment sit on the same LAN, whereas the cloud cluster does not.

Solution:
The earlier log lines show that the local test environment can reach the NameNode, and the NameNode replies with the IP addresses of the DataNodes holding the data. The guess is that the local test environment then cannot reach those DataNodes through the returned IP addresses. So how can this be fixed?
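This guess can be verified quickly: the timeout in the stack trace is against the DataNode's internal address 172.18.39.51:9866, which is not routable from outside the cloud provider's network. Below is a minimal connectivity check, assuming only that address from the error message; it uses plain JDK sockets and is not part of the original troubleshooting steps.

import java.net.{InetSocketAddress, Socket}

object DataNodeReachability {
  def main(args: Array[String]): Unit = {
    val socket = new Socket()
    try {
      // Same internal IP and data-transfer port that the stack trace timed out on.
      socket.connect(new InetSocketAddress("172.18.39.51", 9866), 5000)
      println("DataNode is reachable from this machine")
    } catch {
      case e: Exception =>
        // From the local test environment this fails, mirroring the ConnectTimeoutException.
        println(s"DataNode is NOT reachable: ${e.getMessage}")
    } finally {
      socket.close()
    }
  }
}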
Add the following property to the local hdfs-site.xml:

<property>
        <name>dfs.datanode.use.datanode.hostname</name>
        <value>true</value>
</property>

This property makes DataNodes use hostnames instead of the default IP addresses when talking to each other, i.e. it switches the communication from IP-based access to hostname-based access. For hostname-based access to help the local client, the DataNode hostnames also have to resolve on the local machine, for example via entries in the local hosts file pointing to the servers' public IP addresses.
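If editing the local hdfs-site.xml is inconvenient, the same idea can also be applied from code. The sketch below is an assumption about how it could be wired into the Spark job, not part of the original fix; it sets dfs.client.use.datanode.hostname, the client-side counterpart of the property above, so that the HDFS client in this JVM connects to DataNodes by hostname.

import org.apache.spark.sql.SparkSession

object HostnameAccessExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dwd-load-local-test")
      .master("local[*]")
      .getOrCreate()

    // Make the HDFS client connect to DataNodes by hostname instead of the
    // internal IPs handed back by the NameNode.
    spark.sparkContext.hadoopConfiguration
      .set("dfs.client.use.datanode.hostname", "true")

    // The DataNode hostnames must still resolve locally, e.g. via the hosts file.
    val raw = spark.read.textFile("hdfs://mycluster/warehouse/online_education/ods/baseadlog.log")
    println(raw.count())

    spark.stop()
  }
}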