Reading words from a file and returning them as char pointers

Time: 2021-06-17 21:21:40

I'm making a function char** read_from_file(char* fname, int * size) that reads all the words from a file fname and returns them as char**. My file only has 5 words, there is only one word per line. I then have another function print_strings(char** words, int num_words) that prints the strings.

I'm having 3 problems:

  1. When I am comparing the index to < *size I get "comparison between pointer and integer"

  2. I can't store the words in the **words

  3. I'm not sure how to return all the words.

This is my code:

void test_sort(char* fname){
    int i;   
    int num_words;
    char** words = read_from_file(fname, &num_words);

    printf("\n ORIGINAL data:\n");
    print_strings(words, num_words);
}

In Main:

int main(){    

    // test sorting array of string by string length
    test_sort("data.txt");
}

Reading Function

char** read_from_file(char* fname, int * size)  { 

    char** words = (char **)malloc(N_MAX);
    FILE *ifp;
    ifp = fopen(fname, "r");
    if(ifp == NULL){
        fprintf(stderr, "Can't open file\n");
        exit(1);
    }

    int index;

    while (!feof(ifp)){
        for(index = 0; index < size; index++)
        {
            fscanf(ifp,"%s", words[index]);
        }
    }
    fclose(ifp);

    return words;
}

1 Answer

#1


When you use a pointer-to-pointer-to-char (e.g. char **words;) to hold a collection of strings, you must first allocate memory for the array of pointers:

char **words = malloc (N_MAX * sizeof *words);

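as well as the storage for each string (array of characters) that each pointer points to, where C_MAX is the maximum length of a string including its terminating null character:

words[index] = malloc (C_MAX * sizeof **words);

or simply, because sizeof (char) is always 1:

words[index] = malloc (C_MAX);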
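The two allocation levels described above can be sketched together with the error checks that the one-line snippets leave out. This is only an illustration: alloc_words and free_words are hypothetical helper names, not part of the original code, and the sketch assumes every string gets a buffer of the same size.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate n pointers, each pointing to a zeroed
 * buffer of len chars. Returns NULL if any allocation fails. */
char **alloc_words (size_t n, size_t len)
{
    char **words = malloc (n * sizeof *words);   /* level 1: the pointers */
    if (words == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        words[i] = calloc (len, sizeof **words); /* level 2: one string */
        if (words[i] == NULL) {                  /* undo the partial work */
            while (i--)
                free (words[i]);
            free (words);
            return NULL;
        }
    }
    return words;
}

/* Hypothetical helper: release everything alloc_words handed out. */
void free_words (char **words, size_t n)
{
    if (words == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free (words[i]);                         /* each string first */
    free (words);                                /* then the pointer array */
}
```

A caller would pair the two, e.g. char **w = alloc_words (10, 64); followed later by free_words (w, 10);. Freeing in the reverse order (the pointer array first) would lose the only pointers to the strings.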

or, when the string to store already exists in a buffer, use strdup as a shortcut (strdup is a POSIX function declared in <string.h>). It both allocates enough memory to hold the string and copies the string (including the terminating null character) into the new block of memory, returning a pointer to the new block:

words[index] = strdup (buf);
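To make the "allocates and copies" behavior concrete, here is a rough equivalent written with malloc and memcpy. The name dup_string is hypothetical, chosen so that it does not clash with the real strdup; it is a sketch of the idea, not the library's actual implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a freshly allocated copy
 * of s, or NULL if the allocation fails. The caller must free() it. */
char *dup_string (const char *s)
{
    size_t len = strlen (s) + 1;      /* +1 for the terminating '\0' */
    char *copy = malloc (len);

    if (copy != NULL)
        memcpy (copy, s, len);        /* copies the '\0' as well */
    return copy;
}
```

Because the copy lives in its own block, the buffer that was read into can be reused for the next line without disturbing the strings already stored.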

A short example of your intended functions could be as follows (note that index is passed as a pointer below, so it must be dereferenced as *index to obtain its value):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N_MAX 10
#define C_MAX 64

void test_sort(char* fname);
char** read_from_file(char* fname, size_t *index);

int main (void) {    

    test_sort ("data.txt");
    return 0;
}

void test_sort (char* fname)
{
    size_t i = 0;  
    size_t num_words = 0;

    char **words = read_from_file (fname, &num_words);

    printf("\n ORIGINAL data:\n\n");
    // print_strings(words, num_words);
    for (i = 0; i < num_words; i++)
        printf ("   words[%2zu] : %s\n", i, words[i]);
    putchar ('\n');

    /* free allocated memory */
    for (i = 0; i < num_words; i++)
        free (words[i]);
    free (words);
}

char** read_from_file (char* fname, size_t *index)
{ 
    char **words = NULL;
    char buf[C_MAX] = {0};
    FILE *ifp = fopen (fname, "r");

    if (ifp == NULL){
        fprintf (stderr, "Can't open file\n");
        exit(1);
    }

    /* allocate N_MAX pointers, one per line to be stored */
    words = malloc (N_MAX * sizeof *words);
    if (words == NULL) {
        fprintf (stderr, "error: out of memory\n");
        exit(1);
    }

    *index = 0;
    while (fgets (buf, C_MAX, ifp))
    {
        char *p = buf;  /* strip trailing newline/carriage return */
        size_t len = strlen (p);
        while (len && (p[len-1] == '\r' || p[len-1] == '\n'))
            p[--len] = 0;

        /* strdup allocates and copies buf */
        words[*index] = strdup (buf);
        if (words[*index] == NULL) {
            fprintf (stderr, "error: out of memory\n");
            exit(1);
        }
        (*index)++;     /* count the line only once it is stored */

        if (*index == N_MAX) {
            fprintf (stderr, "warning: N_MAX words read.\n");
            break;
        }
    }
    fclose(ifp);

    return words;
}
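The pointer parameter is also where the "comparison between pointer and integer" message from the question comes from: the loop there compares index with size, which is an int *, rather than with *size. A minimal sketch of the difference follows; sum_first and count_positive are hypothetical functions made up for this illustration.

```c
/* Hypothetical example: sum the first *size entries of values.
 * size is a pointer, so the loop bound must be *size, not size. */
int sum_first (const int *values, const int *size)
{
    int total = 0;

    for (int index = 0; index < *size; index++)  /* int < int: fine */
        total += values[index];
    /* writing `index < size` instead would compare an int with an
       int *, which is what triggers the compiler diagnostic */
    return total;
}

/* Hypothetical example: report a count through an out-parameter. */
void count_positive (const int *values, int n, int *count)
{
    *count = 0;                       /* write through the pointer */
    for (int i = 0; i < n; i++)
        if (values[i] > 0)
            (*count)++;               /* parentheses: increment the value,
                                         not the pointer itself */
}
```

The caller passes the address of its own variable, e.g. count_positive (v, n, &c);, and reads the result from c afterwards. This is the same pattern that read_from_file uses to hand back the number of words alongside the returned char **.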

Input

$ cat data.txt
A quick
brown fox
jumps over
the lazy
dog.

Output

$ ./bin/read5str

 ORIGINAL data:

   words[ 0] : A quick
   words[ 1] : brown fox
   words[ 2] : jumps over
   words[ 3] : the lazy
   words[ 4] : dog.

Memory Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed when it is no longer needed. It is imperative that you use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory and to confirm that you have freed all the memory you have allocated. On Linux, valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems that there is no excuse not to check. There are similar memory checkers for every platform, and they are all simple to use: just run your program through one.

$ valgrind ./bin/read5str
==5507== Memcheck, a memory error detector
==5507== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==5507== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==5507== Command: ./bin/read5str
==5507==

 ORIGINAL data:

   words[ 0] : A quick
   words[ 1] : brown fox
   words[ 2] : jumps over
   words[ 3] : the lazy
   words[ 4] : dog.

==5507==
==5507== HEAP SUMMARY:
==5507==     in use at exit: 0 bytes in 0 blocks
==5507==   total heap usage: 7 allocs, 7 frees, 691 bytes allocated
==5507==
==5507== All heap blocks were freed -- no leaks are possible
==5507==
==5507== For counts of detected and suppressed errors, rerun with: -v
==5507== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

The salient parts above are that there were 7 allocs and 7 frees, and that All heap blocks were freed. In addition, the ERROR SUMMARY reports 0 errors from 0 contexts. You should see similar output every time. Let me know if you have additional questions.
