Word frequency counting for a file in Java: top N / top K

Date: 2021-10-09 17:21:00

Word count in Java: top N

Approach

  1. First count how many times each word occurs.
  2. Then rely on TreeSet's automatic ordering to sort the words by count.
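
The two steps can be sketched end to end as below. This is only a minimal sketch under some assumptions: the class name `TopNSketch`, the method `topN`, and the cut-off parameter `n` are hypothetical and do not appear in the code that follows, and the ordering (count descending, then word ascending) is written as an explicit comparator rather than through a `Comparable` entity class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class; not part of the original post.
class TopNSketch {

    /** Returns up to n "word=count" strings, most frequent first. */
    static List<String> topN(String text, int n) {
        // Step 1: count each whitespace-separated word.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }

        // Step 2: a TreeSet keeps its elements sorted by the comparator:
        // higher count first, ties broken by the word itself.
        Comparator<Map.Entry<String, Integer>> order = (a, b) -> {
            int cmp = Integer.compare(b.getValue(), a.getValue());
            return cmp != 0 ? cmp : a.getKey().compareTo(b.getKey());
        };
        TreeSet<Map.Entry<String, Integer>> sorted = new TreeSet<>(order);
        sorted.addAll(counts.entrySet());

        // Keep only the first n entries of the sorted set.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sorted) {
            if (result.size() == n) {
                break;
            }
            result.add(e.getKey() + "=" + e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topN("b a c a b a", 2)); // prints [a=3, b=2]
    }
}
```

Sorting every distinct word and then cutting the list is the simplest way to get a top N; for very large vocabularies a bounded heap may be cheaper, but that is a different technique from the TreeSet approach described here.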

The code

  1. wordcount

    // Needs these imports from java.util: HashMap, Map, Set, StringTokenizer, TreeSet.
    public void wordCount() {
        String route = "青岛路 青岛路 济阳路口东 济南路 宴坡路 ";
        Map<String, Integer> map = new HashMap<String, Integer>();

        // Step 1: count the occurrences of each space-separated word.
        StringTokenizer st = new StringTokenizer(route, " ");
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            Integer previous = map.get(word);
            map.put(word, previous == null ? 1 : previous + 1);
        }

        // Step 2: TreeSet keeps the entries ordered by StationEntity.compareTo.
        Set<StationEntity> set = new TreeSet<StationEntity>();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            set.add(new StationEntity(entry.getKey(), entry.getValue()));
            System.out.println("word " + entry.getKey() + ": " + entry.getValue());
        }

        System.out.println("Sorted output --");
        for (StationEntity station : set) {
            System.out.println(station.getName() + "=" + station.getCount());
        }
    }
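
The method above counts words in a hard-coded string, while the title talks about a file. One possible way to feed the same counting step from a file is sketched below; the class `FileWordCountSketch` and the method `countWords` are hypothetical names introduced only for illustration, and the input is taken as a `Reader` so the sketch can be tried without a real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper; the class and method names are illustrative only.
class FileWordCountSketch {

    /** Counts whitespace-separated words, reading the input line by line. */
    static Map<String, Integer> countWords(Reader source) {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Without an explicit delimiter, StringTokenizer splits on
                // space, tab, newline, carriage return and form feed.
                StringTokenizer st = new StringTokenizer(line);
                while (st.hasMoreTokens()) {
                    counts.merge(st.nextToken(), 1, Integer::sum);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // For a real file, pass java.nio.file.Files.newBufferedReader(path) instead.
        Map<String, Integer> counts = countWords(new StringReader("a b\nb c b\n"));
        System.out.println(counts.get("b")); // prints 3
    }
}
```

The resulting map can then be turned into `StationEntity` objects and added to the TreeSet exactly as in `wordCount()`.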

  2. StationEntity

    package com.dk.route;

    /**
     * Created by zzy on 15/11/28.
     */
    public class StationEntity implements Comparable<StationEntity> {
        private String name;
        private int count;

        public StationEntity(String name, int count) {
            this.name = name;
            this.count = count;
        }

        public int getCount() {
            return count;
        }

        public void setCount(int count) {
            this.count = count;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        /** Orders by count, highest first; equal counts are ordered by name. */
        @Override
        public int compareTo(StationEntity o) {
            // Integer.compare avoids the overflow that "count - o.count" can hit.
            int cmp = Integer.compare(o.count, count);
            return cmp != 0 ? cmp : name.compareTo(o.name);
        }
    }
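
The fallback to `name.compareTo` in `compareTo` matters more than it may look: a TreeSet treats two elements that compare as 0 as the same element, so an ordering based on the count alone would silently drop words that share a count. The sketch below illustrates this; `TieBreakSketch`, `WordCount`, and both comparators are hypothetical names used only for this demonstration.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class; not part of the original post.
class TieBreakSketch {

    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Compares counts only: entries with equal counts look like duplicates to TreeSet.
    static final Comparator<WordCount> COUNT_ONLY =
            (a, b) -> Integer.compare(b.count, a.count);

    // Same ordering as StationEntity.compareTo: count descending, then word ascending.
    static final Comparator<WordCount> COUNT_THEN_WORD = (a, b) -> {
        int cmp = Integer.compare(b.count, a.count);
        return cmp != 0 ? cmp : a.word.compareTo(b.word);
    };

    /** Adds three entries, two of which share a count, and returns the set size. */
    static int sizeAfterAdding(Comparator<WordCount> order) {
        TreeSet<WordCount> set = new TreeSet<>(order);
        set.add(new WordCount("a", 2));
        set.add(new WordCount("b", 1));
        set.add(new WordCount("c", 1)); // same count as "b"
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(COUNT_ONLY));      // prints 2
        System.out.println(sizeAfterAdding(COUNT_THEN_WORD)); // prints 3
    }
}
```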

  3. Output
