I have an Excel file with some data (multiple columns and multiple rows). I want to import this data into my Core Data based database. I'm struggling to import it because the text in the cells contains line breaks.
I've tried the following:
1.) Export the Excel sheet to a tab-delimited text file
2.) Write an import routine in iOS that reads the tab-delimited text file using the following code:
NSCharacterSet *tabCharacterSet = [NSCharacterSet characterSetWithCharactersInString:@"\t"];
// Split the file contents into rows at every line break
NSArray *rows = [dataString componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
for (NSString *row in rows) {
    // Split each row into columns at every tab
    NSArray *columns = [row componentsSeparatedByCharactersInSet:tabCharacterSet];
}
Problem: I have 1000 rows with 15 columns each. The parsing routine returns more than 1000 rows and fewer than 15 columns. The line breaks within the cells are not handled properly by the parsing routine.
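The mismatch can be reproduced on a tiny input. The sketch below is plain C, which compiles unchanged inside an Objective-C source file. It assumes the export wraps cells that contain line breaks in double quotes and that every row ends with "\n"; both function names are made up for this illustration. It compares a naive count of line breaks with a count that ignores line breaks inside quotes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Naive count: every line break ends a row. This is effectively what
   splitting the whole file on newline characters does. */
size_t count_rows_naive(const char *text)
{
    size_t rows = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '\n') {
            ++rows;
        }
    }
    return rows;
}

/* Quote-aware count: a line break ends a row only when it is outside
   double quotes. A doubled quote ("") toggles the flag twice, so it
   leaves the inside/outside state unchanged. */
size_t count_rows_quote_aware(const char *text)
{
    size_t rows = 0;
    bool inside_quotes = false;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '"') {
            inside_quotes = !inside_quotes;
        } else if (*p == '\n' && !inside_quotes) {
            ++rows;
        }
    }
    return rows;
}
```

For the two-row sample "1\t\"first line\nsecond line\"\tx\n2\tplain\ty\n", the naive count sees three line breaks, while the quote-aware count reports the two real rows.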
I get the same results if I use
[NSCharacterSet characterSetWithCharactersInString:@"\r\n"];
or
[NSCharacterSet characterSetWithCharactersInString:@"\r"];
instead of
[NSCharacterSet newlineCharacterSet]
but it completely fails if I use
[NSCharacterSet characterSetWithCharactersInString:@"\n"];
How can I parse the Excel data properly?
Perhaps I can use regular expressions to get the line components? Any ideas or pointers?
UPDATE (Sample XLSX and Export File):
1 Answer
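The solution that worked for me uses the NSScanner class. The method below scans self, so it is meant to live in an NSString category, and it splits columns on commas, so it expects comma-separated (CSV) input.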
- (NSArray *)parseCSVFileString {
    NSMutableArray *rows = [NSMutableArray array];

    // Get newline character set
    NSMutableCharacterSet *newlineCharacterSet = (id)[NSMutableCharacterSet whitespaceAndNewlineCharacterSet];
    [newlineCharacterSet formIntersectionWithCharacterSet:[[NSCharacterSet whitespaceCharacterSet] invertedSet]];

    // Characters that are important to the parser
    NSMutableCharacterSet *importantCharactersSet = (id)[NSMutableCharacterSet characterSetWithCharactersInString:@",\""];
    [importantCharactersSet formUnionWithCharacterSet:newlineCharacterSet];

    // Create scanner, and scan string
    NSScanner *scanner = [NSScanner scannerWithString:self];
    [scanner setCharactersToBeSkipped:nil];
    while (![scanner isAtEnd]) {
        @autoreleasepool {
            BOOL insideQuotes = NO;
            BOOL finishedRow = NO;
            NSMutableArray *columns = [NSMutableArray arrayWithCapacity:10];
            NSMutableString *currentColumn = [NSMutableString string];
            while (!finishedRow) {
                NSString *tempString;
                if ([scanner scanUpToCharactersFromSet:importantCharactersSet intoString:&tempString]) {
                    [currentColumn appendString:tempString];
                }

                if ([scanner isAtEnd]) {
                    if (![currentColumn isEqualToString:@""]) [columns addObject:currentColumn];
                    finishedRow = YES;
                } else if ([scanner scanCharactersFromSet:newlineCharacterSet intoString:&tempString]) {
                    if (insideQuotes) {
                        // Add line break to column text
                        [currentColumn appendString:tempString];
                    } else {
                        // End of row
                        if (![currentColumn isEqualToString:@""]) [columns addObject:currentColumn];
                        finishedRow = YES;
                    }
                } else if ([scanner scanString:@"\"" intoString:NULL]) {
                    if (insideQuotes && [scanner scanString:@"\"" intoString:NULL]) {
                        // Replace double quotes with a single quote in the column string.
                        [currentColumn appendString:@"\""];
                    } else {
                        // Start or end of a quoted string.
                        insideQuotes = !insideQuotes;
                    }
                } else if ([scanner scanString:@"," intoString:NULL]) {
                    if (insideQuotes) {
                        [currentColumn appendString:@","];
                    } else {
                        // This is a column separating comma
                        [columns addObject:currentColumn];
                        currentColumn = [NSMutableString string];
                        [scanner scanCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString:NULL];
                    }
                }
            }
            if ([columns count] > 0) [rows addObject:columns];
        }
    }

    return rows;
}
Reference: http://www.macresearch.org/cocoa-scientists-part-xxvi-parsing-csv-data
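The method above hard-codes the comma as the column separator, while the question exports tab-delimited text. As a rough sketch of the same quote-tracking idea with a configurable delimiter, the plain C function below (it compiles as-is in an Objective-C source file) counts the fields in the first record of a buffer. The name count_fields_in_first_record is hypothetical, and the sketch assumes cells containing delimiters or line breaks are wrapped in double quotes, with embedded quotes doubled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts the fields in the first record of `text`, splitting on `delim`.
   Delimiters and line breaks inside double quotes belong to the field.
   Inside quotes, a doubled quote ("") is an escaped quote character.
   Returns 0 for an empty buffer. */
size_t count_fields_in_first_record(const char *text, char delim)
{
    bool inside_quotes = false;
    size_t fields = 1;

    if (text[0] == '\0') {
        return 0;
    }
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '"') {
            if (inside_quotes && p[1] == '"') {
                ++p; /* skip the second quote of an escaped pair */
            } else {
                inside_quotes = !inside_quotes; /* field opens or closes */
            }
        } else if (!inside_quotes) {
            if (*p == delim) {
                ++fields; /* separator outside quotes starts a new field */
            } else if (*p == '\n' || *p == '\r') {
                break; /* unquoted line break ends the record */
            }
        }
    }
    return fields;
}
```

A check like this could be run on the exported file before importing, to confirm that the first record yields the expected 15 columns when '\t' is passed as the delimiter.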