File name: unicode-segmentation: grapheme cluster and word boundaries according to the UAX#29 rules
File size: 212KB
File format: ZIP
Last updated: 2024-05-22 05:23:46
Rust
Iterators that split strings on Grapheme Cluster or Word boundaries, according to the UAX#29 rules.
    use unicode_segmentation::UnicodeSegmentation;
    fn main() {
        let s = "a̐éö̲\r\n";
        let g = s.graphemes(true).collect::<Vec<&str>>();
        let b: &[_] = &["a̐", "é", "ö̲", "\r\n"];
        assert_eq!(g, b);
        let s = "The quick (\"brown\") fox can't jump 32.3 feet, right?";
        let w = s.unicode_words().collect::<V…
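To illustrate why grapheme clusters differ from code points, here is a minimal std-only sketch. It is not the crate's implementation and covers only two of the UAX#29 rules: "\r\n" stays together, and a combining mark attaches to the preceding character. The helper names `is_combining_mark` and `simple_graphemes` are hypothetical, and the mark check is limited to the Combining Diacritical Marks block (U+0300 to U+036F) as a simplifying assumption.

```rust
/// Hypothetical helper: true for code points in the Combining Diacritical
/// Marks block (U+0300..=U+036F). Assumption: the real rules use the much
/// larger Grapheme_Extend property, which this sketch does not cover.
fn is_combining_mark(c: char) -> bool {
    ('\u{0300}'..='\u{036F}').contains(&c)
}

/// Hypothetical helper: simplified grapheme splitting. Keeps "\r\n" together
/// and attaches combining marks to the preceding non-newline character.
fn simple_graphemes(s: &str) -> Vec<&str> {
    let mut clusters = Vec::new();
    let mut start = 0; // byte offset where the current cluster begins
    let mut prev: Option<char> = None;

    for (idx, c) in s.char_indices() {
        if let Some(p) = prev {
            let crlf = p == '\r' && c == '\n';
            let extends = is_combining_mark(c) && p != '\r' && p != '\n';
            if !(crlf || extends) {
                // Boundary found: close the current cluster before `c`.
                clusters.push(&s[start..idx]);
                start = idx;
            }
        }
        prev = Some(c);
    }
    if start < s.len() {
        clusters.push(&s[start..]); // final cluster
    }
    clusters
}

fn main() {
    // Same text as the excerpt above, written with explicit escapes.
    let s = "a\u{0310}e\u{0301}o\u{0308}\u{0332}\r\n";
    let g = simple_graphemes(s);
    assert_eq!(g, ["a\u{0310}", "e\u{0301}", "o\u{0308}\u{0332}", "\r\n"]);
    println!("{} code points, {} clusters", s.chars().count(), g.len());
}
```

Iterating with `chars()` alone would yield nine separate items for this string, while the clustered view yields four. For real text, the crate's `graphemes` iterator is the appropriate tool, since this sketch ignores emoji sequences, Hangul syllables, and most other rules.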
File preview:
unicode-segmentation-master
----.gitignore(42B)
----benches()
--------texts()
--------graphemes.rs(1KB)
----COPYRIGHT(321B)
----src()
--------grapheme.rs(29KB)
--------test.rs(8KB)
--------sentence.rs(14KB)
--------tables.rs(221KB)
--------testdata.rs(190KB)
--------word.rs(28KB)
--------lib.rs(11KB)
----.travis.yml(840B)
----fuzz()
--------.gitignore(25B)
--------fuzz_targets()
--------Cargo.toml(433B)
----Cargo.toml(880B)
----.github()
--------workflows()
----LICENSE-MIT(1KB)
----scripts()
--------unicode_gen_breaktests.py(6KB)
--------unicode.py(13KB)
----README.md(3KB)
----LICENSE-APACHE(11KB)