Spark job fails with java.io.IOException: Failed to send RPC xxxxxx to xxxx:xxx, but got no response. Marking as

Date: 2023-01-14 23:11:11

Tags: netty Marking java 4.0 xxx 43 error io Final

## Log output
```
Attempted to get executor loss reason for executor id 17 at RPC address 192.168.48.172:59070, but got no response. Marking as slave lost.
java.io.IOException: Failed to send RPC 9102760012410878153 to /192.168.48.172:59047: java.nio.channels.ClosedChannelException
    at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237) ~[spark-network-common_2.11-2.2.0.jar:2.2.0]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Caused by: java.nio.channels.ClosedChannelException
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[netty-all-4.0.43.Final.jar:4.0.43.Final]
```
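## Symptoms
The driver log shows an RPC communication error. The driver therefore treats the executor's heartbeat as timed out, and the executor is killed by YARN. There are two lines of investigation for this problem: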
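1. The driver or executor is short of memory, so RPC communication stalls during GC and the heartbeat times out. To locate this:
   - Driver side: find the driver's pid, then run `jstat -gcutil pid` to check memory usage, or `jmap -heap pid` to inspect the heap.
   - Executor side: find the executor's pid (the Executors page of the Spark UI shows each executor's IP and port, which lead you to the server hosting the executor and to its pid), then check memory usage for that pid in the same way.
2. The clocks on the driver's server and the executor's server differ significantly. A difference of more than 1 minute should be corrected promptly. The root cause is simple: when the two servers' clocks are too far apart, communication that actually completes within 1 ms looks like a timeout to the driver, because the two Java processes compute different timestamps. Most articles currently offer only the first fix, simply adding executor memory, which does not necessarily solve the problem. Most clusters already have clock synchronization configured, so why would the clocks still drift so far apart? Check whether chronyd is enabled on the servers. If you use ntp, chronyd can interfere with it, and in that case you can disable chronyd.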
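For the first cause, the `jstat -gcutil` check can be condensed into a small helper. The sketch below is an illustration under stated assumptions, not the author's procedure: `gc_summary` is a hypothetical function name, and it reads `jstat -gcutil` output from standard input and prints the old-generation utilization and the full-GC count from the last sample. Columns are located by header name (`O`, `FGC`) because the column set varies between JDK versions.

```shell
#!/bin/sh
# Sketch only: gc_summary is a hypothetical helper, not part of the JDK.

# Read `jstat -gcutil` output on stdin and print
# "<old-gen utilization %> <full GC count>" from the last sample.
gc_summary() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to column numbers
         NF > 0  { old = $(col["O"]); fgc = $(col["FGC"]) }       # remember the most recent sample
         END     { print old, fgc }'
}

# Example usage on the host running the JVM (pid is a placeholder):
#   jps -lm | grep CoarseGrainedExecutorBackend   # assumed way to list executor JVMs
#   jstat -gcutil <pid> 1000 5 | gc_summary       # 5 samples, 1 second apart
```

An old-generation figure that stays close to 100 while the full-GC count keeps rising between runs would be consistent with the memory-pressure explanation above.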
To disable chronyd and switch to ntpd:
```
systemctl disable chronyd
systemctl stop chronyd
systemctl enable ntpd
systemctl start ntpd
```
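To confirm whether two hosts really disagree on the time, the epoch seconds reported by each can be compared. The following is a minimal sketch, assuming SSH access from the driver host to the executor host. The function names `clock_skew_seconds` and `skew_ok` and the host name `executor-host` are hypothetical, and the default threshold of 60 seconds simply mirrors the 1-minute rule of thumb above.

```shell
#!/bin/sh
# Sketch only: function names and host names are hypothetical.

# Print the absolute difference, in seconds, between two epoch timestamps.
clock_skew_seconds() {
    diff=$(( $1 - $2 ))
    echo "${diff#-}"          # strip a leading minus sign, if any
}

# Exit 0 when the skew is within the threshold (third argument,
# default 60 seconds), non-zero otherwise.
skew_ok() {
    [ "$(clock_skew_seconds "$1" "$2")" -le "${3:-60}" ]
}

# Example usage from the driver host (the measurement includes SSH
# latency, so differences of a second or two are noise):
#   local_ts=$(date +%s)
#   remote_ts=$(ssh executor-host date +%s)
#   skew_ok "$local_ts" "$remote_ts" ||
#       echo "clock skew: $(clock_skew_seconds "$local_ts" "$remote_ts")s"
```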

From： https://www.cnblogs.com/wanghy-keepcoding/p/17052761.html
