Imagine you have a table in a CSV file with this kind of layout:
    name,property1 [unit1],property2 [unit2]
    name1,4.5,2.3
    name2,3.2,7.4
    name3,5.5,6.1
I need to convert each row to this kind of JSON structure (i.e., for row 1):
    "name1": [
        {
            "properties": [
                {
                    "property_1": "_value_",
                    "unit": "unit1"
                },
                {
                    "property_2": "_value_",
                    "unit": "unit2"
                }
            ]
        }
    ]
On top of that, I am stuck on Qt 4.7 and can't update; I also can't install Qxt, so I'm relying on qt-json for the JSON parsing/encoding. Moreover, the CSV file is not created or maintained by me, so I can't change it either.
So with all of this, I realised I need a few things, so this is a kind of multiple question:
- How should I write the RegEx to read the unit in each column's header? Please note that the unit is enclosed in square brackets.
- Imagine I extract both the header row and the other rows into a QStringList, separating each column as a string. How can I sync all the bits of data in order to create the JSON structure I need in a QString? (I think I need it in a QString so I can dump each row to a different file, but I'm open to other options as well.)
Just one final note: I also need this to be somewhat scalable. The CSV files to which this will be applied are very heterogeneous in column count: some have 8 columns, others have 20.
I know it is not a good practice to post "multiquestions", but the thing is I'm feeling too overwhelmed with all of this, and because I have virtually no experience with Qt, I can't even define a plan to attack this. Hope someone can share some pointers. Thanks!
EDIT So, I've been thinking a little more about this and I don't actually know if this is a good idea/feasible but here is what I thought of:
- when going through the header row, I would check if each column string had a hit for the RegEx. If so, I would store the column index and the unit string in a list;
- then, when going through the other rows, in order to parse them into JSON, I would check whether each column's index matched one in the previous list, and if so, I would add the unit to the map (as the qt-json docs explain).
Does this make any sense? Can anyone mock up a skeleton I can work on for this?
I've managed to get a few things working so far, but still not working as it should. Right now I have managed to read properly from the CSV file, but the output isn't coming out right. Can anyone share some insight?
NOTE: the processLineFromCSV function returns a QStringList obtained like so: QStringList cells = line.split(separator_char);
NOTE2: the RegEx was obtained from this answer.
NOTE3: Check below for the type of output I'm getting. Right now I think the problem relates more to the usage of the qt-json lib than to the rest of the code, but any help is welcome! :)
The code so far:
    // headerList (QStringList) and exp (QRegExp) are declared elsewhere
    QFile file(csvfile);
    if (file.open(QIODevice::ReadOnly | QIODevice::Text))
    {
        bool first = true;
        QVariantMap map;
        QVariantMap propertyMap;
        QList<QVariant> generalList, propertiesList;

        while (!file.atEnd())
        {
            QString line = file.readLine();
            if (first == true) {
                headerList = processLineFromCSV(line, separator_char);
                first = false;
            } else {
                QStringList cellList = processLineFromCSV(line, separator_char);
                for (int i = 0; i < cellList.size(); i++)
                {
                    // check the header cell for "[unit]" string
                    // returns -1 if does not have the string
                    // if it has the string, it's stored in capturedUnits[1]
                    int test = exp.indexIn(headerList.at(i));
                    // store the captured units in a QStringList
                    QStringList capturedUnits = exp.capturedTexts();

                    if (test == -1) { // if header does not have a captured unit - general column
                        QString name = headerList.at(i);
                        QString sanitizeName = name.remove(exp.capturedTexts().at(0), Qt::CaseSensitive);
                        map[sanitizeName] = cellList.at(i);
                    } else { // if header string has a captured unit - property column
                        QString propertyName = headerList.at(i); // extract string in header
                        QString sanitizedPropertyName = propertyName.remove(exp); // remove the unit regex from the string
                        sanitizedPropertyName.remove(QChar('\n'), Qt::CaseSensitive); // clear newlines
                        if (sanitizedPropertyName.startsWith('"') && sanitizedPropertyName.endsWith('"')) {
                            // strip the surrounding quotes
                            sanitizedPropertyName = sanitizedPropertyName.mid(1, sanitizedPropertyName.size() - 2);
                        }

                        QString value = cellList.at(i); // extract string in value
                        QString sanitizedValue = value.remove(QChar('\n'), Qt::CaseSensitive); // clear newlines
                        if (sanitizedValue.startsWith('"') && sanitizedValue.endsWith('"')) {
                            // strip the surrounding quotes
                            sanitizedValue = sanitizedValue.mid(1, sanitizedValue.size() - 2);
                        }

                        propertyMap[sanitizedPropertyName] = sanitizedValue; // map the property: value pair
                        propertyMap["unit"] = capturedUnits.at(1); // map the unit: [unit] value pair

                        QByteArray general = QtJson::serialize(map); // serialize the pair for general column
                        QByteArray properties = QtJson::serialize(propertyMap); // serialize the pair for property column
                        QVariant genVar(general);
                        QVariant propVar(properties);
                        generalList.append(genVar);
                        propertiesList.append(propVar);
                    }
                }
            }
        }
        QByteArray finalGeneral = QtJson::serialize(generalList);
        QByteArray finalProperties = QtJson::serialize(propertiesList);
        qDebug() << finalGeneral;
        qDebug() << finalProperties;
    }
The output:
    "{ \"name\" : \"name1\" }",
    "{ \"name\" : \"name1\" }",
    "{ \"name\" : \"name2\" }",
    "{ \"name\" : \"name2\" }",
    "{ \"name\" : \"name3\" }",
    "{ \"name\" : \"name3\" }"

    "{ \"property1 \" : \"4.5\", \"unit\" : \"unit1\" }",
    "{ \"property1 \" : \"4.5\", \"property2 \" : \"2.3\", \"unit\" : \"unit2\" }",
    "{ \"property1 \" : \"3.2\", \"property2 \" : \"2.3\", \"unit\" : \"unit1\" }",
    "{ \"property1 \" : \"3.2\", \"property2 \" : \"7.4\", \"unit\" : \"unit2\" }",
    "{ \"property1 \" : \"5.5\", \"property2 \" : \"7.4\", \"unit\" : \"unit1\" }",
    "{ \"property1 \" : \"5.5\", \"property2 \" : \"6.1\", \"unit\" : \"unit2\" }"
3 Answers
This should be a good start for you:
    QString csv = "name,property1 [unit1],property2 [unit2],property3 [unit3]\n"
                  "name1,4.5,2.3\n" // sample rows taken from the question
                  "name2,3.2,7.4\n"
                  "name3,5.5,6.1\n";

    QStringList csvRows = csv.split('\n', QString::SkipEmptyParts);
    QStringList csvHeader = csvRows.takeFirst().split(',');
    csvHeader.removeFirst(); // drop the "name" column so header and value indices line up

    foreach(QString row, csvRows) {
        QStringList values = row.split(',');
        QString rowName = values.takeFirst();

        QVariantList properties;
        for(int i = 0; i < values.size(); i++) {
            QString value = values[i];
            QStringList propParts = csvHeader[i].split(' ');
            QString propName = propParts[0];
            QString propType = propParts[1].mid(1, propParts[1].size() - 2); // strip the brackets

            QVariantMap property;
            property[propName] = value;
            property["unit"] = propType;

            properties.append(property);
        }

        QVariantMap propertyObj;
        propertyObj["properties"] = properties;
        QVariantList propList;
        propList.append(propertyObj);

        QVariantMap root;
        root[rowName] = propList;

        QByteArray json = QtJson::serialize(root);
        qDebug() << json;

        // Now you can save json to a file
    }
Just seen your response to my comment. I don't have much experience with Qt either, but a quick outline....
Extract the data one line at a time and 'split' it into an array. If you are using CSV, you need to be sure that no data point contains a comma, or the split will result in a real mess. Check whether whoever extracted the data can use another, less common separator (e.g. a '|' is good). If your data is all numeric, that is great, but be wary of locales that use the comma as a decimal separator :(
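If the separator cannot be changed, one option is to split with awareness of double quotes, so that a comma inside a quoted cell is kept as data. The sketch below uses only the C++ standard library rather than Qt so that it stands alone; `splitCsvLine` is a hypothetical helper name, and it assumes the file quotes such cells in the usual CSV style (a doubled `""` meaning a literal quote). The same loop could be written over a QString.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line on `sep`, treating separators
// that appear inside double quotes as ordinary data.
// Assumption: a doubled quote ("") inside a quoted cell means one literal quote.
std::vector<std::string> splitCsvLine(const std::string& line, char sep = ',') {
    std::vector<std::string> cells;
    std::string cell;
    bool inQuotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inQuotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    cell += '"';   // escaped quote inside a quoted cell
                    ++i;
                } else {
                    inQuotes = false;   // closing quote
                }
            } else {
                cell += c;   // separators are plain data while quoted
            }
        } else if (c == '"') {
            inQuotes = true;   // opening quote, not part of the value
        } else if (c == sep) {
            cells.push_back(cell);
            cell.clear();
        } else {
            cell += c;
        }
    }
    cells.push_back(cell);   // last cell has no trailing separator
    return cells;
}
```

With this, a line such as `name1,"4,5",2.3` yields three cells instead of four, which also covers the decimal-comma case as long as those values are quoted.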
I hope that you have 1 'table' per file, if not you need to be able to 'identify' when a new table starts somehow, this could be interesting / fun - depends on your outlook ;).
At the end you will have a collection of 'string arrays' (a table of some sort); hopefully the first is your header info. If you have multiple tables, you will deal with them one at a time.
You should now be able to 'output' each table in good JSON format.
Getting your 'units' from the header rows: If you know in advance where they are located (ie the index in the array) you can plan for extracting the info (using a regex if you wish) in the correct index locations.
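When the unit positions are not known in advance, the header can be scanned once and the result kept per column index. The following is a minimal sketch of that idea using `std::regex` from the standard library instead of QRegExp, so that it is self-contained; `Column`, `parseHeaderCell` and `parseHeader` are hypothetical names, and the pattern is the same `\[([^\]]+)\]` that appears elsewhere in this thread.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record for one header column.
struct Column {
    std::string name;  // header text with the "[unit]" part removed and trimmed
    std::string unit;  // empty when the header carries no unit
};

// Parse a single header cell such as "property1 [unit1]".
Column parseHeaderCell(const std::string& cell) {
    static const std::regex unitRe(R"(\[([^\]]+)\])");
    Column col;
    std::smatch m;
    if (std::regex_search(cell, m, unitRe)) {
        col.unit = m[1].str();                           // text between the brackets
        col.name = m.prefix().str() + m.suffix().str();  // everything around the match
    } else {
        col.name = cell;                                 // general column, no unit
    }
    // Trim surrounding whitespace, including a trailing newline from readLine().
    const char* ws = " \t\r\n";
    const std::size_t first = col.name.find_first_not_of(ws);
    if (first == std::string::npos) {
        col.name.clear();
    } else {
        const std::size_t last = col.name.find_last_not_of(ws);
        col.name = col.name.substr(first, last - first + 1);
    }
    return col;
}

// Parse the whole header row; the result is indexed like the data cells.
std::vector<Column> parseHeader(const std::vector<std::string>& cells) {
    std::vector<Column> columns;
    for (const std::string& cell : cells) {
        columns.push_back(parseHeaderCell(cell));
    }
    return columns;
}
```

Because the result has one entry per column, it works the same for 8 or 20 columns: while walking a data row, cell `i` is a property column exactly when `columns[i].unit` is non-empty.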
Last point. If your CSV file is very long (hundreds of lines), just grab the first few lines into a new test file for quicker debugging. Once you are happy, enlarge it a bit and check the output format, and once you are happy that there are no other bugs, run it on the whole file. Likewise, if you have multiple tables in your file, start with the first one only, then add the first part of a second and test, add a third and test, and so on until you are happy.
A possibly better solution, after reading your comment about wanting some form of 'synchronisation'. NOTE: this may seem a little more complex, but I think it would be a more flexible solution in the end. Also does this data not exist in a DB somewhere (who gave it to you?), can they give you direct read access to the underlying DB and tables? if so, you can jump straight to the 'output each table to JSON' step.
Use an embedded DB (i.e. SQLite). Extract the first 'header' row and create a table in your DB that follows the info there (you should be able to add the info regarding units to the 'metadata', i.e. a description). If all your files are the same, you could even import all the data into the same single table, or auto-create a new table (assuming the same format) for each new file using the same create table statement.
I'm sure there is a 'csvimport' in SQLite (I haven't checked the docs yet, and haven't done this in a while) or someone has written a library that will do this.
Output each table to JSON format, again I'm sure someone has written a library for this.
Using the answer by ExplodingRat this is the final code: (without file creation at the end)
    QString csvfile = ui->lineEditCSVfile->text();
    QString separator_char = ui->lineEditSeparator->text();
    QRegExp exp("\\[([^\\]]+)\\]");

    QFile file(csvfile);
    if (!file.open(QIODevice::ReadOnly | QIODevice::Text))
        return;

    QString csv = file.readAll();
    QStringList csvRows = csv.split('\n', QString::SkipEmptyParts);
    QStringList csvHeader = csvRows.takeFirst().split(separator_char);
    csvHeader.removeFirst(); // drop the row-name column so header and value indices line up

    foreach(QString row, csvRows) {
        QStringList values = row.split(separator_char);
        QString rowName = values.takeFirst();

        QVariantList general;
        QVariantList properties;
        for(int i = 0; i < values.size(); i++) {
            QString value = values[i];
            int test = exp.indexIn(csvHeader[i]);
            //qDebug() << test;
            //qDebug() << csvHeader;
            QStringList capturedUnits = exp.capturedTexts();
            QString propName = csvHeader[i];
            //QString propName = csvHeader[i].remove(exp);
            //qDebug() <<"property name" << propName;

            if (test == -1) { // no unit in the header - general column
                QVariantMap property;
                property[propName] = value;
                general.append(property);
            } else { // unit found in the header - property column
                //QStringList propParts = csvHeader[i].split(' ');
                //QString propName = csvHeader[i].remove(exp);
                QString propType = capturedUnits[1];

                QVariantMap property;
                property[propName] = value;
                property["unit"] = propType;
                properties.append(property);
            }
        }

        QVariantMap propertyObj;
        propertyObj["properties"] = properties;
        QVariantList propList;
        propList.append(propertyObj);

        QVariantMap generalObj;
        generalObj["general"] = general;
        QVariantList generalList;
        generalList.append(generalObj);

        QVariantList fullList;
        fullList.append(generalList);
        fullList.append(propList);

        QVariantMap root;
        root[rowName] = fullList;

        QByteArray json = QtJson::serialize(root);
        qDebug() << json;

        // Now you can save json to a file
    }
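For anyone who wants to check the nesting without Qt or qt-json at hand, the `"rowName": [ { "properties": [ ... ] } ]` shape from the question can be written out by hand. This is only a sketch under stated assumptions: `quoteJson` and `rowToJson` are hypothetical names, only quotes and backslashes are escaped (control characters are not handled), and the three vectors are assumed to describe the property columns only, with the row-name column already removed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Wrap a string in double quotes, escaping quotes and backslashes.
// Assumption: the input holds no control characters that need \uXXXX escapes.
std::string quoteJson(const std::string& s) {
    std::string out = "\"";
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out + "\"";
}

// Build {"rowName":[{"properties":[{"name":"value","unit":"unit"},...]}]}.
// names[i], units[i] and values[i] all refer to the same property column.
std::string rowToJson(const std::string& rowName,
                      const std::vector<std::string>& names,
                      const std::vector<std::string>& units,
                      const std::vector<std::string>& values) {
    std::string json = "{" + quoteJson(rowName) + ":[{\"properties\":[";
    // Guard against ragged rows: only pair up columns present in all three.
    const std::size_t n = std::min(names.size(), std::min(units.size(), values.size()));
    for (std::size_t i = 0; i < n; ++i) {
        if (i > 0) json += ",";
        json += "{" + quoteJson(names[i]) + ":" + quoteJson(values[i]) +
                ",\"unit\":" + quoteJson(units[i]) + "}";
    }
    return json + "]}]}";
}
```

Calling `rowToJson("name1", {"property1", "property2"}, {"unit1", "unit2"}, {"4.5", "2.3"})` gives one object per property, each carrying its own `unit` key, which is the structure the question asked for. In the Qt 4.7 code, the QVariantMap and QVariantList nesting plays the same role and qt-json takes care of the escaping.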
