Using Scanner to read a text file and count the occurrences of each letter

So I have an assignment about arrays. It asks me to use a Scanner to read through a text file, count the occurrences of each letter, and store them in a table.

For example:

public class something {

    char[] alphabet = "abcdefghijklmnopqrstuvwxyz".toCharArray();

    public void displayTable() {
        for (int i = 0; i < alphabet.length; i++) {
            // count is a placeholder: the number of occurrences of alphabet[i]
            System.out.println(alphabet[i] + ":  " + count);
        }
    }
}

I don't know how to write the method that stores the occurrences of each letter.

It is supposed to be like:

public void countOccurrences (Scanner file) {
     //code to be written here
}

If the text file contains only one line, and that line is:

Hello World

The method should ignore any digits or symbols and output only the letters that appear in the text, as a table:

d: 1
e: 1
h: 1
l: 3
o: 2
r: 1
w: 1

I can't figure this out myself and any help is greatly appreciated!

Thanks, Shy

2 Solutions

#1

Simply use a Map. Read the inline comments for more info.

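// requires: import java.util.Map; import java.util.TreeMap;
// a TreeMap keeps its keys sorted, so the letters print in alphabetical order
Map<Character, Integer> treeMap = new TreeMap<>();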
// initialize with default value that is zero for all the characters
for (char i = 'a'; i <= 'z'; i++) {
    treeMap.put(i, 0);
}

char[] alphabet = "Hello World".toCharArray();

for (int i = 0; i < alphabet.length; i++) {
    // make it lower case
    char ch = Character.toLowerCase(alphabet[i]);
    // just get the value and update it by one
    // check for characters only
    if (treeMap.containsKey(ch)) {
        treeMap.put(ch, treeMap.get(ch) + 1);
    }
}

// print the count
for (char key : treeMap.keySet()) {
    int count = treeMap.get(key);
    if (count > 0) {
        System.out.println(key + ":" + treeMap.get(key));
    }
}

Output for "Hello World", ignoring case:

d:1
e:1
h:1
l:3
o:2
r:1
w:1

Read the file line by line. Iterate over all the characters of each line and update their occurrence counts in the Map.
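
To tie this back to the countOccurrences(Scanner file) signature from the question, here is a minimal sketch of the line-by-line version. The class name MapLetterCounter, the return type, and the sample string are my own assumptions for illustration; a Scanner over a String stands in for the file.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class MapLetterCounter {

    // Reads every line from the Scanner and counts the letters a-z, ignoring case.
    static Map<Character, Integer> countOccurrences(Scanner file) {
        Map<Character, Integer> counts = new TreeMap<>();
        while (file.hasNextLine()) {
            String line = file.nextLine();
            for (char raw : line.toCharArray()) {
                char ch = Character.toLowerCase(raw);
                // keep only a-z after lower-casing; digits, spaces and symbols are skipped
                if (ch >= 'a' && ch <= 'z') {
                    // insert 1 for a new letter, otherwise add 1 to the stored count
                    counts.merge(ch, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: a String source replaces the real file for this demo.
        // For a file, build the Scanner from a java.io.File instead.
        Map<Character, Integer> counts = countOccurrences(new Scanner("Hello World 123!"));
        counts.forEach((letter, count) -> System.out.println(letter + ":" + count));
    }
}
```

Unlike the snippet above, this version does not pre-fill the map with zeros, so only letters that actually occur end up in it and no count > 0 check is needed when printing.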

#2

Note that I assumed that the only possible letters are from a to z.

So what you can do is build an array that contains 26 entries (each entry holds the number of occurrences of the corresponding character).

Now what you need is a mapping letter -> index, with 'a' -> 0, 'b' -> 1, and so on up to 'z' -> 25.

How would you get the mapping?

The trick is that characters are a kind of integer: each character has an integer value associated with it (through the Unicode chart), so a is 97, b is 98 and so on.

In this case valueOfChar - 'a' does the mapping, and it's the key thing to understand. The result of this operation is guaranteed to be between 0 and 25, because you know that each character is between a and z (after converting the input to lower case).
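
Since real input will also contain upper-case letters, digits and symbols, the mapping needs a range check before it is used as an array index. A small sketch, where the class LetterIndex and the helper indexOf are hypothetical names of mine and -1 is my chosen marker for "not a letter":

```java
class LetterIndex {

    // Maps 'a'..'z' and 'A'..'Z' to 0..25; returns -1 for every other character.
    static int indexOf(char c) {
        if (c >= 'a' && c <= 'z') {
            return c - 'a'; // 'a' is 97, so 'a' - 'a' = 0 and 'z' - 'a' = 25
        }
        if (c >= 'A' && c <= 'Z') {
            return c - 'A'; // same idea for upper case, without converting first
        }
        return -1; // digits, spaces, punctuation and so on
    }

    public static void main(String[] args) {
        for (char c : new char[] {'a', 'B', 'z', '7', ' '}) {
            System.out.println("'" + c + "' -> " + indexOf(c));
        }
    }
}
```

Without such a check, a space or an upper-case letter would produce an index outside 0..25 and the array access would throw an ArrayIndexOutOfBoundsException.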

So to summarize:

1. For each line, split it by spaces.
2. Loop through each word to get its characters.
3. Convert each character to lower case (if you already lower-cased the whole line, this step is unnecessary).
4. Update the number of occurrences for this character in the array, using the mapping above.
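
The steps above can be sketched with the two methods from the question. This is only one possible arrangement: the class name ArrayLetterCounter and the countOf accessor are assumptions of mine, and I skip the explicit split on spaces because the a-z range check already discards spaces along with digits and symbols.

```java
import java.util.Scanner;

class ArrayLetterCounter {

    // one counter per letter: index 0 is 'a', index 25 is 'z'
    private final int[] occurrences = new int[26];

    // Reads all lines from the Scanner and updates the counters.
    void countOccurrences(Scanner file) {
        while (file.hasNextLine()) {
            String line = file.nextLine();
            for (char c : line.toCharArray()) {
                char lower = Character.toLowerCase(c);
                // only a-z maps to a valid index; everything else is ignored
                if (lower >= 'a' && lower <= 'z') {
                    occurrences[lower - 'a']++;
                }
            }
        }
    }

    // Prints only the letters that occurred at least once.
    void displayTable() {
        for (char c = 'a'; c <= 'z'; c++) {
            int count = occurrences[c - 'a'];
            if (count > 0) {
                System.out.println(c + ": " + count);
            }
        }
    }

    // Hypothetical accessor; expects a lower-case letter a-z.
    int countOf(char letter) {
        return occurrences[letter - 'a'];
    }

    public static void main(String[] args) {
        ArrayLetterCounter counter = new ArrayLetterCounter();
        // Assumption: a String source replaces the real text file for this demo.
        counter.countOccurrences(new Scanner("Hello World"));
        counter.displayTable();
    }
}
```

Keeping the counts in a field lets displayTable() stay parameterless, as in the question's skeleton.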

Here's a little example to show how the mapping works:

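public class Test {
    // one counter per letter: index 0 is 'a', index 25 is 'z'
    static final int[] occurrences = new int[26];

    public static void main(String[] args) {
        // the input must contain only lower-case a-z, otherwise c - 'a' falls outside 0..25
        String test = "helloworld";
        for (char c : test.toCharArray()) {
            occurrences[c - 'a']++;
        }
        // print only the letters that occurred at least once
        for (char c = 'a'; c <= 'z'; c++) {
            if (occurrences[c - 'a'] != 0) {
                System.out.println(c + " => " + occurrences[c - 'a']);
            }
        }
    }
}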

Output:

d => 1
e => 1
h => 1
l => 3
o => 2
r => 1
w => 1
