
Time: 2021-02-21 20:48:34

I am setting up a Spark Streaming project with Kinesis and when I try to connect to my Kinesis stream I am getting the following error from Spark:

ERROR ShardSyncTask: Caught exception while sync'ing Kinesis shards and leases
com.amazonaws.services.kinesis.clientlibrary.exceptions.internal.KinesisClientLibIOException: Parent shard shardId-000000000000 exists but not the child shard shardId-000000000002

When I post test data to this stream or read from it using the base Amazon libraries, I get no errors; this only occurs when I try to connect with Spark.


Below is the code that I am using for my tests:


val conf = new SparkConf().setMaster("local[2]").setAppName("KinesisCounter")
val ssc = new StreamingContext(conf, Seconds(1))
val rawStream = KinesisUtils.createStream(ssc, "dev-test", "kinesis.us-east-1.amazonaws.com", Duration(1000), InitialPositionInStream.TRIM_HORIZON, StorageLevel.MEMORY_ONLY)
rawStream.map(msg => new String(msg)).count.print

1 answer


How many shards do you have on Kinesis?
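If you are not sure, a rough sketch like the one below can list them. It assumes the AWS SDK for Java v1 is on the classpath, the default credentials chain is configured, and the stream is "dev-test" in us-east-1 as in the question; the object name ListShards and the helper isClosed are made up for illustration. describeStream returns shards in pages, so a stream with many shards would need more than one call.

```scala
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder
import scala.collection.JavaConverters._

object ListShards {
  // Open shards report a null ending sequence number; closed ones (after a split or merge) do not.
  def isClosed(endingSequenceNumber: String): Boolean = endingSequenceNumber != null

  def main(args: Array[String]): Unit = {
    // Assumption: default credentials chain, region taken from the endpoint in the question.
    val kinesis = AmazonKinesisClientBuilder.standard().withRegion("us-east-1").build()
    // Only the first page of shards is read here.
    val shards = kinesis.describeStream("dev-test").getStreamDescription.getShards.asScala
    println(s"shard count: ${shards.size}")
    shards.foreach { s =>
      val closed = isClosed(s.getSequenceNumberRange.getEndingSequenceNumber)
      println(s"${s.getShardId} parent=${s.getParentShardId} closed=$closed")
    }
  }
}
```

If shardId-000000000000 shows up as closed with children, the stream has been resharded at some point, and that is the parent/child relationship the error message is complaining about.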


what I would do is:


  1. Check the Kinesis region; make sure your application settings and the stream are in the same region.

  2. Delete the DynamoDB table that stores the Kinesis shard leases, and start all over again. The following is from the official documentation:

Changing the application name or stream name can lead to Kinesis errors in some cases. If you see errors, you may need to manually delete the DynamoDB table.

  3. Check your application code to see whether some settings are being changed while the code is running.
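For the region check and the table cleanup, something along these lines might do. It is only a sketch: it assumes the AWS SDK for Java v1, the default credentials chain, that the lease table is named after the Spark application name ("KinesisCounter" in the question), and that it lives in the same region as the stream. The object name ResetKinesisLeases and the helper regionFromEndpoint are hypothetical. Deleting the table throws away the stored checkpoints, so the application starts again from its configured initial position.

```scala
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder

object ResetKinesisLeases {
  // Pull the region out of an endpoint such as "kinesis.us-east-1.amazonaws.com".
  def regionFromEndpoint(endpoint: String): Option[String] = {
    val Pattern = """(?:https?://)?kinesis\.([a-z0-9-]+)\.amazonaws\.com.*""".r
    endpoint match {
      case Pattern(region) => Some(region)
      case _               => None
    }
  }

  def main(args: Array[String]): Unit = {
    // Endpoint copied from the question; compare the result with the region the stream was created in.
    val region = regionFromEndpoint("kinesis.us-east-1.amazonaws.com")
      .getOrElse(sys.error("unrecognised Kinesis endpoint"))
    // Assumption: the lease table carries the Spark app name and sits in this region.
    val dynamo = AmazonDynamoDBClientBuilder.standard().withRegion(region).build()
    dynamo.deleteTable("KinesisCounter")
    println(s"requested deletion of table KinesisCounter in $region")
  }
}
```

Stop the Spark job before running this, then start it again so the table is recreated from scratch.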

Hope it helps.



