Hunspell: Introduction and Trial Run

Date: 2022-03-12 17:14:03

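1. Introduction

  Hunspell is a spell checker designed for languages with rich morphology and complex compound words. It was originally written for Hungarian.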

  Hunspell is free software, released under a GPL, LGPL and MPL tri-license.

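  Hunspell has interfaces and wrappers for the major platforms and programming languages. It is based on MySpell and is backward compatible with MySpell dictionaries. MySpell uses single-byte character encodings, whereas Hunspell can also use dictionaries encoded in Unicode UTF-8.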

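2. The following applications use Hunspell as their spell checker:

  Mac OS X 10.6 and later

  Eclipse, via Hunspell4Eclipse

  Google Chrome, the web browser developed by Google

  Evernote, a note-taking application

  LibreOffice and OpenOffice.org, open-source office suites

  Mozilla Firefox, Thunderbird and SeaMonkey

  Opera, a cross-platform web browser

  Scribus, a desktop publishing application

  Vim, a text editor

  WPS Office, an office suite developed in China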

3. Testing Hunspell with a Docker image:

  3.1 List the available dictionaries

[root@host---- hunspell]# docker run --rm tmaier/hunspell -D
SEARCH PATH:
.::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/root/.openoffice.org//user/wordbook:/root/.openoffice.org2/user/wordbook:/root/.openoffice.org2./user/w/lib/openoffice.org/basis3./share/dict/ooo:/opt/openoffice.org2./share/dict/ooo:/usr/lib/openoffice.org2./share/dict/ooo:/opt/openoffice.org2./share/dict/ooo:/usr/lib/openoffice.org2./shhare/dict/ooo:/opt/openoffice.org2./share/dict/ooo:/usr/lib/openoffice.org2./share/dict/ooo:/opt/openoffice.org2./share/dict/ooo:/usr/lib/openoffice.org2./share/dict/ooo
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
/usr/share/hunspell/en_CA
/usr/share/hunspell/de_DE_comb
/usr/share/hunspell/en_ZA
/usr/share/hunspell/en_US
/usr/share/hunspell/en_GB
/usr/share/hunspell/en_AU
/usr/share/hunspell/de_CH
/usr/share/hunspell/de_DE_neu
/usr/share/hunspell/en_NZ
/usr/share/hunspell/de_AT
/usr/share/hunspell/default
LOADED DICTIONARY:
/usr/share/hunspell/default.aff
/usr/share/hunspell/default.dic
Hunspell 1.6.
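
The dictionary names in the listing above are the values accepted by the -d option. A minimal sketch of a filter that extracts them, assuming the output keeps the layout shown above; the function name list_dicts is hypothetical, and hunspell may write this listing to stderr, hence the 2>&1 in the usage comment.

```shell
#!/bin/sh
# list_dicts (hypothetical helper): read `hunspell -D` output on stdin and
# print the bare dictionary names, one per line.
list_dicts() {
    awk '
        /^AVAILABLE DICTIONARIES/ { in_list = 1; next }  # names start after this header
        /^LOADED DICTIONARY/      { in_list = 0 }        # names end at this header
        in_list && NF {
            n = split($0, parts, "/")                     # keep only the last path component
            print parts[n]
        }
    '
}

# Usage (assumes the same image as above):
#   docker run --rm tmaier/hunspell -D 2>&1 | list_dicts
```

Any name this prints can be passed to -d, alone or as part of a comma-separated list.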

  3.2 Show the help text

[root@host---- hunspell]# docker run --rm -v $(pwd):/workdir tmaier/hunspell -u3 -i utf- -d de_DE_neu,en_US,de_CH -p words  -h
Usage: hunspell [OPTION]... [FILE]...
Check spelling of each FILE. Without FILE, check standard input.
- check only first field in lines (delimiter = tabulator)
-a Ispell's pipe interface
--check-url check URLs, e-mail addresses and directory paths
--check-apostrophe check Unicode typographic apostrophe
-d d[,d2,...] use d (d2 etc.) dictionaries
-D show available dictionaries
-G print only correct words or lines
-h, --help display this help and exit
-H HTML input file format
-i enc input encoding
-l print misspelled words (prints only the misspelled words)
-L print lines with misspelled words (prints each line that contains a misspelled word)
-m analyze the words of the input text
-n nroff/troff input file format
-O OpenDocument (ODF or Flat ODF) input file format
-p dict set dict custom dictionary
-r warn of the potential mistakes (rare words)
-P password set password for encrypted dictionaries
-s stem the words of the input text
-S suffix words of the input text
-t TeX/LaTeX input file format
-v, --version print version number
-vv print Ispell compatible version number
-w print misspelled words (= lines) from one word/line input.
-X XML input file format

Example: hunspell -d en_US file.txt # interactive spelling
         hunspell -i utf- file.txt # check UTF- encoded file
         hunspell -l *.odt # print misspelled words of ODF files

         # Quick fix of ODF documents by personal dictionary creation

         # Make a reduced list from misspelled and unknown words:

         hunspell -l *.odt | sort | uniq >words

         # Delete misspelled words of the file by a text editor.
         # Use this personal dictionary to fix the deleted words:

         hunspell -p words *.odt

Bug reports: http://hunspell.github.io/
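
The personal-dictionary workflow described at the end of the help text can be sketched against the Docker image used here. This is a sketch under stated assumptions, not a tested recipe for this image: the function name make_wordlist is hypothetical, and the usage comments assume that the image's working directory is the /workdir mount used in the other commands.

```shell
#!/bin/sh
# make_wordlist (hypothetical helper): reduce a stream of misspelled words
# (one per line, as printed by `hunspell -l`) to a sorted list without
# duplicates, mirroring the `sort | uniq` step in the help text.
make_wordlist() {
    sort | uniq
}

# Usage, following the help text (image and mount are assumptions):
#   docker run --rm -v "$(pwd)":/workdir tmaier/hunspell -d en_US -l test1.TXT | make_wordlist > words
#   # edit "words" by hand and delete the real misspellings, then:
#   docker run --rm -v "$(pwd)":/workdir tmaier/hunspell -d en_US -p words test1.TXT
```

The words left in the file are treated as correct on the second run, so only the entries deleted by hand are still reported.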

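  3.3 Check the spelling of a document (reporting the line of each misspelled word and a suggested replacement). Source file: test1.TXT (link: https://pan.baidu.com/s/17JRmtnebLblVsMG05CIm-w password: l3q9)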

[root@host---- hunspell]# docker run --rm -v $(pwd):/workdir tmaier/hunspell -u3 -i utf- -d de_DE_neu,en_US,de_CH -p words  test1.TXT
test1.TXT:: Locate: rans | Try: rand
test1.TXT:: Locate: wew | Try: woo
test1.TXT:: Locate: Sevenn | Try: Severn
test1.TXT:: Locate: cannt | Try: canny
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Lele | Try: Lee
test1.TXT:: Locate: Lele | Try: Lee
test1.TXT:: Locate: Lele | Try: Lee
test1.TXT:: Locate: Lele | Try: Lee
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: hasn | Try: has
test1.TXT:: Locate: isn | Try: sin
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: wasn | Try: wans
test1.TXT:: Locate: isn | Try: sin
test1.TXT:: Locate: isn | Try: sin
test1.TXT:: Locate: vomeronasal | Try: astronomer
test1.TXT:: Locate: didn | Try: did
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: weren | Try: were
test1.TXT:: Locate: wasn | Try: wans
test1.TXT:: Locate: wouldn | Try: would
test1.TXT:: Locate: weren | Try: were
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: cefepime | Try: timepiece
test1.TXT:: Locate: amikacin | Try: Kamikaze
test1.TXT:: Locate: Mmm | Try: Mm
test1.TXT:: Locate: kuai | Try: Kauai
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: isn | Try: sin
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: aren | Try: earn
test1.TXT:: Locate: shouldn | Try: should
test1.TXT:: Locate: whould | Try: would
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: Uhh | Try: Shh
test1.TXT:: Locate: Chh | Try: Ch
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: isn | Try: sin
test1.TXT:: Locate: ve | Try: be
test1.TXT:: Locate: exfoliator | Try: defoliator
test1.TXT:: Locate: didn | Try: did
test1.TXT:: Locate: didn | Try: did
test1.TXT:: Locate: Hmm | Try: Mm
test1.TXT:: Locate: ve | Try: be
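
Many entries in the report above repeat, so a per-word tally is easier to review than the raw list. A minimal sketch, assuming the "Locate: WORD | Try: SUGGESTION" layout shown above; the function name tally_misspellings is hypothetical.

```shell
#!/bin/sh
# tally_misspellings (hypothetical helper): read report lines of the form
#   FILE:LINE: Locate: WORD | Try: SUGGESTION
# on stdin and print "COUNT WORD", most frequent word first.
tally_misspellings() {
    sed -n 's/.*Locate: \([^ |]*\) |.*/\1/p' |   # keep only the flagged word
        sort | uniq -c |                          # count identical words
        sort -k1,1nr -k2,2 |                      # highest count first, then by word
        awk '{ print $1, $2 }'                    # drop the padding added by uniq -c
}

# Usage (assumes the command from section 3.3):
#   docker run --rm -v "$(pwd)":/workdir tmaier/hunspell -u3 -d en_US test1.TXT | tally_misspellings
```

Words that top the tally are good candidates either for a personal dictionary passed with -p or for a closer look at how the source text is written.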