I'm importing a data file that contains 5 columns, for example:
1992-01-25T00:00:30.000Z|0.718|-0.758|-0.429|1.129
I know that scanf() lets you specify the data type it is scanning; in this case that would be %s and %f. My problem is the first column: I would like to scan it as a number, or split it into two columns, like so: 1992-01-25|00:00:30.000. Is fgets() an alternative?
Is there an efficient way to do this? I'm storing each column in an array and I have a search function for each array, so searching an array of strings would be a pain.
2 Solutions
#1
1
You can use fgets, strtok, and sscanf to parse the file.
- fgets reads a line from the file
- strtok breaks the line into substrings, using | as the separator
- sscanf parses each substring to convert it into a number
In the sample code below, the date fields are combined into a single integer. For example, "1992-01-25" becomes the decimal number 19920125. The time fields are combined so that the final result is the number of milliseconds since midnight.
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

bool parseFile(FILE *fpin)
{
    char line[256];
    while (fgets(line, sizeof(line), fpin) != NULL)
    {
        // get the date/time portion of the line
        char *dateToken = strtok(line, "|");
        if (dateToken == NULL)
            return false;

        // extract the floating point values from the line
        float value[4];
        for (int i = 0; i < 4; i++)
        {
            char *token = strtok(NULL, "|");
            if (token == NULL)
                return false;
            if (sscanf(token, "%f", &value[i]) != 1)
                return false;
        }

        // extract the components of the date and time;
        // all nine items must match, otherwise the line is malformed
        int year, month, day, hour, minute, second, millisec;
        char t, z;
        if (sscanf(dateToken, "%d-%d-%d%c%d:%d:%d.%d%c",
                   &year, &month, &day, &t,
                   &hour, &minute, &second, &millisec, &z) != 9)
            return false;

        // combine the components into a single number for the date and time
        int date = year * 10000 + month * 100 + day;
        int time = hour * 3600000 + minute * 60000 + second * 1000 + millisec;

        // display the parsed information
        printf("%d %d", date, time);
        for (int i = 0; i < 4; i++)
            printf(" %6.3f", value[i]);
        printf("\n");
    }
    return true; // the file was successfully parsed
}
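Once the date and time are integers, the searching problem from the question becomes plain integer comparison. The following is only a sketch of one way to do that, not part of the answer above: the `Row` struct and the `compare_rows`, `sort_rows`, and `find_row` names are hypothetical, and it assumes the rows are kept in one array of structs rather than one array per column.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one parsed row of the data file. */
typedef struct {
    int date;       /* YYYYMMDD, e.g. 19920125 */
    int time_ms;    /* milliseconds since midnight */
    float value[4]; /* the four floating-point columns */
} Row;

/* Order rows by date first, then by time of day. */
int compare_rows(const void *a, const void *b)
{
    const Row *x = a;
    const Row *y = b;
    if (x->date != y->date)
        return (x->date > y->date) - (x->date < y->date);
    return (x->time_ms > y->time_ms) - (x->time_ms < y->time_ms);
}

/* Sort the rows so that find_row() can binary-search them. */
void sort_rows(Row *rows, size_t count)
{
    qsort(rows, count, sizeof rows[0], compare_rows);
}

/* Return the row with the given date and time, or NULL if there is none.
   The array must already be sorted with sort_rows(). */
Row *find_row(Row *rows, size_t count, int date, int time_ms)
{
    Row key = { date, time_ms, { 0 } };
    return bsearch(&key, rows, count, sizeof rows[0], compare_rows);
}
```

For example, after filling an array inside the parsing loop and calling `sort_rows(rows, n)`, a call such as `find_row(rows, n, 19920125, 30000)` would look up the sample line from the question. If the file is already in chronological order, the sort step can probably be skipped.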
#2
1
If I were you, I'd make a structure that holds the table of data after parsing the file.
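// Placeholder limits (assumed values); size them for your data
#define MAX_NUM_ROWS 1000
#define MAX_NUM_COLS 6
#define MAX_COL_LEN  32
typedef struct {
    int num_rows;
    char table[MAX_NUM_ROWS][MAX_NUM_COLS][MAX_COL_LEN];
} YOUR_DATA;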
You should use fgets to parse that file, line by line. First tokenize the line on the 'T', and from then on tokenize it on '|', something like this:
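// needs <stdio.h>, <stdlib.h> and <string.h>;
// strsep() is a BSD/glibc extension, not ISO C
#define MAX_ROW_LEN 256   // assumed value
FILE *your_fp;
YOUR_DATA yourTable;
char line[MAX_ROW_LEN] = {0};
char *ptr = NULL, *field = NULL;
int row = 0, col = 0;

if ((your_fp = fopen("datafile.txt", "r")) == NULL) {
    perror("datafile.txt");
    exit(EXIT_FAILURE);
}

while (row < MAX_NUM_ROWS && fgets(line, sizeof(line), your_fp) != NULL) {
    line[strcspn(line, "\n")] = '\0';   // drop the newline kept by fgets
    ptr = line;
    col = 0;
    // the date is everything before the 'T'
    if ((field = strsep(&ptr, "T")) != NULL) {
        snprintf(yourTable.table[row][col], MAX_COL_LEN, "%s", field);
        col++;
    }
    // the time and the four values are separated by '|'
    while (col < MAX_NUM_COLS && (field = strsep(&ptr, "|")) != NULL) {
        snprintf(yourTable.table[row][col], MAX_COL_LEN, "%s", field);
        col++;
    }
    row++;
}
yourTable.num_rows = row;
fclose(your_fp);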
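Because strsep is not part of ISO C, it may be missing on some toolchains. The sketch below shows one possible portable substitute built on strcspn; the `split_fields` name and signature are made up for illustration. Note that it splits on every delimiter character at once, so passing "T|" assumes that no other 'T' appears in the line.

```c
#include <stddef.h>
#include <string.h>

/* Split `line` in place at any character found in `delims`.
   Stores up to max_fields pointers in fields[] and returns how many
   were stored. Unlike strtok(), empty fields are kept. */
size_t split_fields(char *line, const char *delims,
                    char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *p = line;

    while (count < max_fields) {
        fields[count++] = p;
        p += strcspn(p, delims);  /* advance to the next delimiter or the end */
        if (*p == '\0')
            break;                /* no more delimiters: last field stored */
        *p++ = '\0';              /* terminate this field, step past delimiter */
    }
    return count;
}
```

Called with the sample line and the delimiters "T|", this should yield six fields: the date, the time (still ending in 'Z'), and the four values.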
You will probably want to keep track of the number of rows in the table, etc. Then you can worry about converting the fields into the correct data types.
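As a rough illustration of that conversion step, the sketch below turns a stored text field into a number and rejects malformed input. The helper names `field_to_double` and `date_to_int` are hypothetical, and it assumes the date field has already been separated from the time, as in the table above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Convert one text field to a double.
   Returns false if the field is empty, has trailing junk, or is out of range. */
bool field_to_double(const char *field, double *out)
{
    char *end;
    errno = 0;
    double v = strtod(field, &end);
    if (end == field || errno == ERANGE)
        return false;
    /* allow trailing whitespace, such as a newline kept by fgets */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return false;
    *out = v;
    return true;
}

/* Convert "YYYY-MM-DD" to the integer YYYYMMDD.
   Returns false if the text is malformed or has extra characters. */
bool date_to_int(const char *field, int *out)
{
    int year, month, day;
    char extra;
    /* exactly three items must match; a fourth means trailing characters */
    if (sscanf(field, "%d-%d-%d%c", &year, &month, &day, &extra) != 3)
        return false;
    if (year < 0 || month < 1 || month > 12 || day < 1 || day > 31)
        return false;
    *out = year * 10000 + month * 100 + day;
    return true;
}
```

With the table layout above, column 0 would go through `date_to_int` and columns 2 to 5 through `field_to_double`; the time column would need its own parser along the same lines.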