Please keep in mind that this is an open question: I am not looking for one specific answer, just approaches and routes I can take.
Essentially, I am getting a CSV file from my AWS S3 bucket. I am able to get it successfully using:
AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
S3Object object = s3Client.getObject(
new GetObjectRequest(bucketName, key));
Now I want to populate a DynamoDB table using this JSON file.
I am confused, as I found all sorts of advice online.
Here is one suggestion - however, this approach only reads the file; it does not insert anything into the DynamoDB table.
Here is another suggestion - this approach is a lot closer to what I am looking for: it populates a table from a JSON file.
However, I was wondering whether there is a generic way to read any JSON file and populate a DynamoDB table based on it. Also, which approach is best for my case?
Since I originally asked the question, I have done more work.
What I have done so far
I have a CSV file sitting in S3 that looks like this:
name,position,points,assists,rebounds
Lebron James,SF,41,12,11
Kyrie Irving,PG,41,7,5
Stephen Curry,PG,29,8,4
Klay Thompson,SG,31,5,5
I am able to successfully pick it up as an S3Object by doing the following:
AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
S3Object object = s3client.getObject(
new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
InputStream objectData = object.getObjectContent();
Now I want to insert this into my DynamoDB table, so I am attempting the following:
AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();
dbClient.setRegion(Region.getRegion(Regions.US_BLAH_1));
DynamoDB dynamoDB = new DynamoDB(dbClient);
//DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("MyTable");
// After this point I have tried many JSON parsers etc. and called table.put(item) etc., but nothing has worked. I would appreciate any help.
1 Answer
For CSV parsing, you can use a plain reader, as your file looks quite simple:
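import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;

AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
S3Object object = s3client.getObject(
    new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
InputStream objectData = object.getObjectContent();
AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();
dbClient.setRegion(Region.getRegion(Regions.US_BLAH_1));
DynamoDB dynamoDB = new DynamoDB(dbClient);
//DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("MyTable");
String line;
String csvSplitBy = ",";
// try-with-resources closes the reader (and the underlying S3 stream) when done
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(objectData, "UTF-8"))) {
    br.readLine(); // skip the header row (name,position,points,...)
    while ((line = br.readLine()) != null) {
        // use comma as separator
        String[] elements = line.split(csvSplitBy);
        try {
            table.putItem(new Item()
                .withPrimaryKey("name", elements[0])
                .withString("position", elements[1])
                // CSV fields are strings, so numeric columns must be parsed first
                .withInt("points", Integer.parseInt(elements[2]))
                // ..... (remaining columns elided in the original)
            );
            System.out.println("PutItem succeeded: " + elements[0]);
        } catch (Exception e) {
            // print the raw line; printing the array would only show its reference
            System.err.println("Unable to add user: " + line);
            System.err.println(e.getMessage());
            break;
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}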
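The loop above hard-codes the column names. Since the question asks about a more generic approach, here is a rough sketch that takes the attribute names from the header row instead. The class `CsvRowMapper` and its methods are hypothetical helpers I made up for illustration, not part of the AWS SDK, and the type inference (whole numbers become `Long`, everything else stays a `String`) is only an assumption about what you might want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the AWS SDK): turns CSV lines into
// attribute maps keyed by the names found in the header row.
class CsvRowMapper {

    // Assumption: a cell that parses as a whole number is stored as a Long,
    // anything else is kept as a trimmed String.
    static Object inferValue(String raw) {
        String s = raw.trim();
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException notANumber) {
            return s;
        }
    }

    // The first line is treated as the header; every later non-blank line becomes one map.
    static List<Map<String, Object>> mapRows(List<String> lines) {
        List<Map<String, Object>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String[] header = lines.get(0).split(",");
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length && i < cells.length; i++) {
                row.put(header[i].trim(), inferValue(cells[i]));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "name,position,points,assists,rebounds",
            "Lebron James,SF,41,12,11",
            "Kyrie Irving,PG,41,7,5");
        for (Map<String, Object> row : mapRows(lines)) {
            System.out.println(row);
        }
    }
}
```

Each resulting map could then be handed to the Document API, for example via `Item.fromMap(row)` followed by `table.putItem(...)`, as long as the map contains the table's key attribute(s). I have not tested that part against a real table, so treat it as a starting point.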
Depending on the complexity of your CSV, you can use third-party libraries such as Apache Commons CSV or OpenCSV.
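To show where a plain `split(",")` stops being enough: once a field contains a comma inside double quotes (say `"James, Lebron"`), the naive split cuts it in two. Below is a minimal quote-aware splitter as a sketch. `QuotedCsvSplitter` is a hypothetical name of mine, and it only handles quoted commas and doubled quotes on a single line, not fields that span several lines, which is one reason a real library is usually the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one CSV line while respecting double-quoted fields.
class QuotedCsvSplitter {

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // a doubled quote is an escaped quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c); // commas inside quotes are kept as data
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field separator
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"James, Lebron\",SF,41";
        System.out.println("naive split:       " + line.split(",").length + " fields");
        System.out.println("quote-aware split: " + split(line).size() + " fields");
    }
}
```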
I am leaving the original answer for parsing JSON below.
I would use the Jackson library and, following your code, do the following:
AmazonS3 s3client = new AmazonS3Client(/**new ProfileCredentialsProvider()*/);
S3Object object = s3client.getObject(
new GetObjectRequest("lambda-function-bucket-blah-blah", "nba.json"));
InputStream objectData = object.getObjectContent();
AmazonDynamoDBClient dbClient = new AmazonDynamoDBClient();
dbClient.setRegion(Region.getRegion(Regions.US_BLAH_1));
DynamoDB dynamoDB = new DynamoDB(dbClient);
//DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("MyTable");
JsonParser parser = new JsonFactory()
.createParser(objectData);
JsonNode rootNode = new ObjectMapper().readTree(parser);
Iterator<JsonNode> iter = rootNode.iterator();
ObjectNode currentNode;
while (iter.hasNext()) {
currentNode = (ObjectNode) iter.next();
String lastName = currentNode.path("lastName").asText();
String firstName = currentNode.path("firstName").asText();
int minutes = currentNode.path("minutes").asInt();
// read all attributes from your JSon file
try {
table.putItem(new Item()
.withPrimaryKey("lastName", lastName, "firstName", firstName)
.withInt("minutes", minutes));
System.out.println("PutItem succeeded: " + lastName + " " + firstName);
} catch (Exception e) {
System.err.println("Unable to add user: " + lastName + " " + firstName);
System.err.println(e.getMessage());
break;
}
}
parser.close();
How you insert the records into your table will depend on your schema. I just used an arbitrary example, but in any case this shows how to read your file and how to insert into the DynamoDB table.
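One more thing to consider: both examples call `putItem` once per record, which means one request per row. For larger files, DynamoDB's BatchWriteItem operation accepts up to 25 put or delete requests per call, so you may want to group your records first. The sketch below only shows the grouping step in plain Java; `BatchChunker` is a hypothetical helper of mine, and the actual batch call (plus retrying any unprocessed items the service returns) is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list of records into groups small enough for one batch write.
class BatchChunker {

    // BatchWriteItem accepts at most 25 put/delete requests per call.
    static final int MAX_BATCH_SIZE = 25;

    static <T> List<List<T>> chunk(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            // copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            records.add(i);
        }
        for (List<Integer> batch : chunk(records, MAX_BATCH_SIZE)) {
            System.out.println("batch of " + batch.size());
        }
    }
}
```

For a file as small as the one in the question, the per-row `putItem` loop is perfectly fine; batching only starts to matter when the file grows.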
Since you asked about different approaches, another possibility is to set up an AWS Data Pipeline.