
Time: 2022-05-29 22:18:53

Please keep in mind that this is an open question: I am not looking for one specific answer, just approaches and routes I can take.


Essentially, I am getting a CSV file from my AWS S3 bucket. I am able to get it successfully using:

AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());        
S3Object object = s3Client.getObject(
                  new GetObjectRequest(bucketName, key));

Now I want to populate a DynamoDB table using this JSON file.


I was confused, as I found all sorts of conflicting advice online.


Here is one suggestion. However, this approach only reads the file; it does not insert anything into the DynamoDB table.

Here is another suggestion. This approach is a lot closer to what I am looking for: it populates a table from a JSON file.

However, I was wondering: is there a generic way to read any JSON file and populate a DynamoDB table based on it? Also, which approach is best for my case?


Since I originally asked the question, I have done more work.


What I have done so far


I have a CSV file sitting in S3 that looks like this:


Lebron James,SF,41,12,11
Kyrie Irving,PG,41,7,5
Stephen Curry,PG,29,8,4
Klay Thompson,SG,31,5,5

I am able to successfully pick it up as an S3Object by doing the following:


AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
S3Object object = s3client.getObject(
        new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
InputStream objectData = object.getObjectContent();

Now I want to insert this into my DynamoDB table, so I am attempting the following:


AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();

DynamoDB dynamoDB = new DynamoDB(dbClient); 
//DynamoDB dynamoDB = new DynamoDB(client); 
Table table = dynamoDB.getTable("MyTable"); 

// After this point I have tried many JSON parsers and calls such as table.put(item), but nothing has worked. I would appreciate any help.


1 Solution



For CSV parsing, you can use a plain reader, as your file looks quite simple:


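    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.document.Item;
    import com.amazonaws.services.dynamodbv2.document.Table;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.GetObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;

    AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
    S3Object object = s3client.getObject(
                new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
    InputStream objectData = object.getObjectContent();

    AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();

    DynamoDB dynamoDB = new DynamoDB(dbClient);
    Table table = dynamoDB.getTable("MyTable");

    String line;
    String csvSplitBy = ",";

    // try-with-resources closes the reader (and the S3 stream it wraps) when done
    try (BufferedReader br = new BufferedReader(
                                new InputStreamReader(objectData, StandardCharsets.UTF_8))) {

        while ((line = br.readLine()) != null) {

            // use comma as separator
            String[] elements = line.split(csvSplitBy);

            try {
                table.putItem(new Item()
                    .withPrimaryKey("name", elements[0])
                    .withString("position", elements[1])
                    // withInt needs an int, so parse the CSV text first
                    .withInt("points", Integer.parseInt(elements[2].trim())));

                System.out.println("PutItem succeeded: " + elements[0]);

            } catch (Exception e) {
                // print the raw line; printing the array would only show its identity
                System.err.println("Unable to add user: " + line);
                System.err.println(e.getMessage());
            }
        }
    } catch (IOException e) {
        System.err.println("Unable to read the CSV file: " + e.getMessage());
    }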

Depending on the complexity of your CSV, you can use third-party libraries such as Apache Commons CSV or OpenCSV.
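To illustrate when a plain `split(",")` stops being enough, here is a minimal sketch of a splitter that honors double-quoted fields. The class name `CsvLineSplitter` and the quoted sample row are made up for illustration. It does not handle quoted fields that span several lines, which is one reason a dedicated library may be the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the original answer): splits one CSV line, honoring double quotes. */
final class CsvLineSplitter {

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    // A doubled quote inside a quoted field stands for one literal quote.
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"');
                        i++;
                    } else {
                        inQuotes = false;
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // An unquoted comma ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Row taken from the question's file.
        System.out.println(split("Lebron James,SF,41,12,11").size());
        // Made-up row with a comma inside the quoted name.
        System.out.println(split("\"James, Lebron\",SF,41,12,11").get(0));
    }
}
```

With the quoted row, `String.split(",")` would return six elements because it cuts the name in two, whereas this sketch keeps `James, Lebron` as a single field.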

I am leaving my original answer for parsing JSON below.


I would use the Jackson library and, following your code, do the following:


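    import java.io.InputStream;
    import java.util.Iterator;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.document.Item;
    import com.amazonaws.services.dynamodbv2.document.Table;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.GetObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;
    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
    S3Object object = s3client.getObject(
                new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
    InputStream objectData = object.getObjectContent();

    AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();

    DynamoDB dynamoDB = new DynamoDB(dbClient);
    Table table = dynamoDB.getTable("MyTable");

    // createParser and readTree throw IOException, so the enclosing
    // method must declare or handle it
    JsonParser parser = new JsonFactory().createParser(objectData);

    // assumes the file holds a JSON array of objects
    JsonNode rootNode = new ObjectMapper().readTree(parser);
    Iterator<JsonNode> iter = rootNode.iterator();

    ObjectNode currentNode;

    while (iter.hasNext()) {
        currentNode = (ObjectNode) iter.next();

        String lastName  = currentNode.path("lastName").asText();
        String firstName = currentNode.path("firstName").asText();
        int minutes      = currentNode.path("minutes").asInt();
        // read all attributes from your JSON file

        try {
            table.putItem(new Item()
                .withPrimaryKey("lastName", lastName, "firstName", firstName)
                .withInt("minutes", minutes));

            System.out.println("PutItem succeeded: " + lastName + " " + firstName);

        } catch (Exception e) {
            System.err.println("Unable to add user: " + lastName + " " + firstName);
            System.err.println(e.getMessage());
        }
    }
    parser.close();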

How you insert the records into your table will depend on your schema. I just used an arbitrary example, but this shows how to read your file and how to insert into the DynamoDB table.
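Since the mapping depends on the schema, one option is to keep the column-to-attribute mapping in a single place instead of hard-coding each `with...` call. The sketch below is only an illustration under assumptions: `RowMapper`, `Column` and `NBA_COLUMNS` are hypothetical names, and the attribute names `rebounds` and `assists` are guesses for the last two CSV columns, which the question does not label.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not from the original answer: maps a CSV row to named attributes. */
final class RowMapper {

    /** A column name plus a flag saying whether its value is numeric. */
    static final class Column {
        final String name;
        final boolean numeric;

        Column(String name, boolean numeric) {
            this.name = name;
            this.numeric = numeric;
        }
    }

    // "name", "position" and "points" follow the CSV answer above;
    // "rebounds" and "assists" are assumed names for the last two columns.
    static final Column[] NBA_COLUMNS = {
        new Column("name", false),
        new Column("position", false),
        new Column("points", true),
        new Column("rebounds", true),
        new Column("assists", true)
    };

    static Map<String, Object> toAttributes(String[] elements, Column[] columns) {
        if (elements.length != columns.length) {
            throw new IllegalArgumentException(
                "Expected " + columns.length + " fields but got " + elements.length);
        }
        // LinkedHashMap keeps the attributes in column order.
        Map<String, Object> attributes = new LinkedHashMap<>();
        for (int i = 0; i < columns.length; i++) {
            String raw = elements[i].trim();
            if (columns[i].numeric) {
                attributes.put(columns[i].name, Integer.valueOf(raw));
            } else {
                attributes.put(columns[i].name, raw);
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        String line = "Kyrie Irving,PG,41,7,5";
        System.out.println(toAttributes(line.split(","), NBA_COLUMNS));
    }
}
```

A map like this could then be turned into an item before calling `table.putItem`. I believe the SDK's document API offers `Item.fromMap` for that purpose, but check this against the SDK version you are using.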


Since you asked about different approaches, another possibility is to set up an AWS Pipeline.




