org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 29.1 failed 4 times, most recent failure: Lost task 1.3 in stage 29.1 (TID 466, magnesium, executor 4): java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at shade.com.datastax.spark.connector.google.common.base.Throwables.propagate(Throwables.java:160)
at com.datastax.driver.core.NettyUtil.newEventLoopGroupInstance(NettyUtil.java:136)
at com.datastax.driver.core.NettyOptions.eventLoopGroup(NettyOptions.java:99)
at com.datastax.driver.core.Connection$Factory. (Connection.java:769)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1410)
at com.datastax.driver.core.Cluster.init(Cluster.java:159)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:330)
at com.datastax.driver.core.Cluster.connect(Cluster.java:280)
at StreamingIntegrationKafkaBac$$anonfun$main$1$$anonfun$apply$1.apply(StreamingIntegrationKafkaBac.scala:155)
at StreamingIntegrationKafkaBac$$anonfun$main$1$$anonfun$apply$1.apply(StreamingIntegrationKafkaBac.scala:144)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

我是在SparkStreaming查询Cassandra时遇到这个报错的。

SRE实战 互联网时代守护先锋,助力企业售后服务体系运筹帷幄!一键直达领取阿里云限量特价优惠。
dataFrame.foreachPartition { part =>
  val poolingOptions = new PoolingOptions
  poolingOptions
    .setCoreConnectionsPerHost(HostDistance.LOCAL, 4)
    .setMaxConnectionsPerHost(HostDistance.LOCAL, 10)
  val cluster = Cluster
    .builder
    .addContactPoints("localhost")
    .withCredentials("cassandra", "cassandra")
    .withPoolingOptions(poolingOptions)
    .build
  val session = cluster.connect("keyspace")
  part.foreach { item =>
    // 业务逻辑
  }
  cluster.close()
  session.close()
}

每个批次中,首先检查cluster和session,是否都close,没有close会报这个错误。

若还没有解决,需要检查netty的版本。

推荐在IDEA中安装Maven Helper插件。然后把冲突的低版本netty相关的依赖删掉即可。

扫码关注我们
微信号:SRE实战
拒绝背锅 运筹帷幄