Is there a standard (preferably Apache Commons or similarly non-viral) library for doing "glob" type matches in Java? When I had to do similar in Perl once, I just changed all the "." to "\.", the "*" to ".*" and the "?" to "." and that sort of thing, but I'm wondering if somebody has done the work for me.
Similar question: Create regex from glob expression
12 answers
#1
34
There's nothing built-in, but it's pretty simple to convert something glob-like to a regex:
public static String createRegexFromGlob(String glob)
{
String out = "^";
for(int i = 0; i < glob.length(); ++i)
{
final char c = glob.charAt(i);
switch(c)
{
case '*': out += ".*"; break;
case '?': out += '.'; break;
case '.': out += "\\."; break;
case '\\': out += "\\\\"; break;
default: out += c;
}
}
out += '$';
return out;
}
This works for me, but I'm not sure if it covers the glob "standard", if there is one :)
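The converter above escapes only "." and "\", so a glob containing other regex metacharacters such as "(" or "+" would be misread. Below is a minimal sketch of one way to harden it. The names GlobToRegex and toRegex are hypothetical, and treating every other metacharacter as a literal is an assumption of this sketch rather than something the answer specifies.

```java
import java.util.regex.Pattern;

public class GlobToRegex {
    // Characters with special meaning in java.util.regex that this sketch
    // treats as literals (an assumption, not part of the original answer).
    private static final String META = "\\.[]{}()+-^$|";

    // Hypothetical helper: translate '*' and '?', escape everything in META.
    public static String toRegex(String glob) {
        StringBuilder out = new StringBuilder("^");
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                out.append(".*");
            } else if (c == '?') {
                out.append('.');
            } else if (META.indexOf(c) >= 0) {
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.append('$').toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(toRegex("report(*).txt"));
        System.out.println(p.matcher("report(final).txt").matches()); // true
        System.out.println(p.matcher("reportXfinal.txt").matches());  // false
    }
}
```

Using a StringBuilder instead of repeated String concatenation also avoids creating a new String for every character of the glob.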
Update by Paul Tomblin: I found a Perl program that does glob conversion, and adapting it to Java I ended up with:
private String convertGlobToRegEx(String line)
{
LOG.info("got line [" + line + "]");
line = line.trim();
int strLen = line.length();
StringBuilder sb = new StringBuilder(strLen);
// Remove beginning and ending * globs because they're useless
if (line.startsWith("*"))
{
line = line.substring(1);
strLen--;
}
if (line.endsWith("*"))
{
line = line.substring(0, strLen-1);
strLen--;
}
boolean escaping = false;
int inCurlies = 0;
for (char currentChar : line.toCharArray())
{
switch (currentChar)
{
case '*':
if (escaping)
sb.append("\\*");
else
sb.append(".*");
escaping = false;
break;
case '?':
if (escaping)
sb.append("\\?");
else
sb.append('.');
escaping = false;
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '@':
case '%':
sb.append('\\');
sb.append(currentChar);
escaping = false;
break;
case '\\':
if (escaping)
{
sb.append("\\\\");
escaping = false;
}
else
escaping = true;
break;
case '{':
if (escaping)
{
sb.append("\\{");
}
else
{
sb.append('(');
inCurlies++;
}
escaping = false;
break;
case '}':
if (inCurlies > 0 && !escaping)
{
sb.append(')');
inCurlies--;
}
else if (escaping)
sb.append("\\}");
else
sb.append("}");
escaping = false;
break;
case ',':
if (inCurlies > 0 && !escaping)
{
sb.append('|');
}
else if (escaping)
sb.append("\\,");
else
sb.append(",");
// reset the flag so an escaped comma does not escape the next character too
escaping = false;
break;
default:
escaping = false;
sb.append(currentChar);
}
}
return sb.toString();
}
I'm editing into this answer rather than making my own because this answer put me on the right track.
#2
48
Globbing is also planned for Java 7.
See FileSystem.getPathMatcher(String) and the "Finding Files" tutorial.
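A minimal sketch of how that API can be used to match plain file names, assuming Java 7 or later and the default file system. PathMatcherDemo and globMatches are hypothetical names chosen for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class PathMatcherDemo {
    // Hypothetical helper: true if fileName matches the glob on the default file system.
    public static boolean globMatches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        System.out.println(globMatches("*.{java,class}", "Foo.java")); // true
        // '*' does not cross directory boundaries, so this one does not match
        System.out.println(globMatches("*.java", "src/Foo.java"));     // false
    }
}
```

Note that the matcher works on Path objects, so the string being tested is parsed as a path first. That is convenient for file names but may be surprising for arbitrary text.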
#3
22
Thanks to everyone here for their contributions. I wrote a more comprehensive conversion than any of the previous answers:
/**
* Converts a standard POSIX Shell globbing pattern into a regular expression
* pattern. The result can be used with the standard {@link java.util.regex} API to
* recognize strings which match the glob pattern.
* <p/>
* See also, the POSIX Shell language:
* http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13_01
*
* @param pattern A glob pattern.
* @return A regex pattern to recognize the given glob pattern.
*/
public static final String convertGlobToRegex(String pattern) {
StringBuilder sb = new StringBuilder(pattern.length());
int inGroup = 0;
int inClass = 0;
int firstIndexInClass = -1;
char[] arr = pattern.toCharArray();
for (int i = 0; i < arr.length; i++) {
char ch = arr[i];
switch (ch) {
case '\\':
if (++i >= arr.length) {
sb.append('\\');
} else {
char next = arr[i];
switch (next) {
case ',':
// escape not needed
break;
case 'Q':
case 'E':
// extra escape needed
sb.append('\\');
default:
sb.append('\\');
}
sb.append(next);
}
break;
case '*':
if (inClass == 0)
sb.append(".*");
else
sb.append('*');
break;
case '?':
if (inClass == 0)
sb.append('.');
else
sb.append('?');
break;
case '[':
inClass++;
firstIndexInClass = i+1;
sb.append('[');
break;
case ']':
inClass--;
sb.append(']');
break;
case '.':
case '(':
case ')':
case '+':
case '|':
case '^':
case '$':
case '@':
case '%':
if (inClass == 0 || (firstIndexInClass == i && ch == '^'))
sb.append('\\');
sb.append(ch);
break;
case '!':
if (firstIndexInClass == i)
sb.append('^');
else
sb.append('!');
break;
case '{':
inGroup++;
sb.append('(');
break;
case '}':
inGroup--;
sb.append(')');
break;
case ',':
if (inGroup > 0)
sb.append('|');
else
sb.append(',');
break;
default:
sb.append(ch);
}
}
return sb.toString();
}
And the unit tests to prove it works:
/**
* @author Neil Traft
*/
public class StringUtils_ConvertGlobToRegex_Test {
@Test
public void star_becomes_dot_star() throws Exception {
assertEquals("gl.*b", StringUtils.convertGlobToRegex("gl*b"));
}
@Test
public void escaped_star_is_unchanged() throws Exception {
assertEquals("gl\\*b", StringUtils.convertGlobToRegex("gl\\*b"));
}
@Test
public void question_mark_becomes_dot() throws Exception {
assertEquals("gl.b", StringUtils.convertGlobToRegex("gl?b"));
}
@Test
public void escaped_question_mark_is_unchanged() throws Exception {
assertEquals("gl\\?b", StringUtils.convertGlobToRegex("gl\\?b"));
}
@Test
public void character_classes_dont_need_conversion() throws Exception {
assertEquals("gl[-o]b", StringUtils.convertGlobToRegex("gl[-o]b"));
}
@Test
public void escaped_classes_are_unchanged() throws Exception {
assertEquals("gl\\[-o\\]b", StringUtils.convertGlobToRegex("gl\\[-o\\]b"));
}
@Test
public void negation_in_character_classes() throws Exception {
assertEquals("gl[^a-n!p-z]b", StringUtils.convertGlobToRegex("gl[!a-n!p-z]b"));
}
@Test
public void nested_negation_in_character_classes() throws Exception {
assertEquals("gl[[^a-n]!p-z]b", StringUtils.convertGlobToRegex("gl[[!a-n]!p-z]b"));
}
@Test
public void escape_carat_if_it_is_the_first_char_in_a_character_class() throws Exception {
assertEquals("gl[\\^o]b", StringUtils.convertGlobToRegex("gl[^o]b"));
}
@Test
public void metachars_are_escaped() throws Exception {
assertEquals("gl..*\\.\\(\\)\\+\\|\\^\\$\\@\\%b", StringUtils.convertGlobToRegex("gl?*.()+|^$@%b"));
}
@Test
public void metachars_in_character_classes_dont_need_escaping() throws Exception {
assertEquals("gl[?*.()+|^$@%]b", StringUtils.convertGlobToRegex("gl[?*.()+|^$@%]b"));
}
@Test
public void escaped_backslash_is_unchanged() throws Exception {
assertEquals("gl\\\\b", StringUtils.convertGlobToRegex("gl\\\\b"));
}
@Test
public void slashQ_and_slashE_are_escaped() throws Exception {
assertEquals("\\\\Qglob\\\\E", StringUtils.convertGlobToRegex("\\Qglob\\E"));
}
@Test
public void braces_are_turned_into_groups() throws Exception {
assertEquals("(glob|regex)", StringUtils.convertGlobToRegex("{glob,regex}"));
}
@Test
public void escaped_braces_are_unchanged() throws Exception {
assertEquals("\\{glob\\}", StringUtils.convertGlobToRegex("\\{glob\\}"));
}
@Test
public void commas_dont_need_escaping() throws Exception {
assertEquals("(glob,regex),", StringUtils.convertGlobToRegex("{glob\\,regex},"));
}
}
#4
8
There are a couple of libraries that do glob-like pattern matching and are more modern than the ones listed:
There's Ant's DirectoryScanner and Spring's AntPathMatcher.
I recommend both over the other solutions, since Ant-style globbing has pretty much become the standard glob syntax in the Java world (Hudson, Spring, Ant and, I think, Maven).
#5
5
This is a simple glob implementation which handles * and ? in the pattern:
public class GlobMatch {
private String text;
private String pattern;
public boolean match(String text, String pattern) {
this.text = text;
this.pattern = pattern;
return matchCharacter(0, 0);
}
private boolean matchCharacter(int patternIndex, int textIndex) {
if (patternIndex >= pattern.length()) {
return false;
}
switch(pattern.charAt(patternIndex)) {
case '?':
// Match any character
if (textIndex >= text.length()) {
return false;
}
break;
case '*':
// * at the end of the pattern will match anything
if (patternIndex + 1 >= pattern.length()) {
return true;
}
// Probe forward, including the empty match at the end of the text,
// so that the rest of the pattern still has to match
while (textIndex <= text.length()) {
if (matchCharacter(patternIndex + 1, textIndex)) {
return true;
}
textIndex++;
}
return false;
default:
if (textIndex >= text.length()) {
return false;
}
String textChar = text.substring(textIndex, textIndex + 1);
String patternChar = pattern.substring(patternIndex, patternIndex + 1);
// Note the match is case insensitive
if (textChar.compareToIgnoreCase(patternChar) != 0) {
return false;
}
}
// End of pattern and text?
if (patternIndex + 1 >= pattern.length() && textIndex + 1 >= text.length()) {
return true;
}
// Go on to match the next character in the pattern
return matchCharacter(patternIndex + 1, textIndex + 1);
}
}
#6
5
I recently had to do it and used \Q and \E to escape the glob pattern:
private static Pattern getPatternFromGlob(String glob) {
return Pattern.compile(
"^" + Pattern.quote(glob)
.replace("*", "\\E.*\\Q")
.replace("?", "\\E.\\Q")
+ "$");
}
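A short usage sketch for the method above, copied into a hypothetical QuotedGlob class (and made public) so that it runs on its own. Note that this approach treats every * and ? as a wildcard, so there is no way to escape them inside the glob.

```java
import java.util.regex.Pattern;

public class QuotedGlob {
    // Same technique as above: quote the whole glob, then "unquote" around wildcards.
    public static Pattern getPatternFromGlob(String glob) {
        return Pattern.compile(
                "^" + Pattern.quote(glob)
                        .replace("*", "\\E.*\\Q")
                        .replace("?", "\\E.\\Q")
                + "$");
    }

    public static void main(String[] args) {
        Pattern p = getPatternFromGlob("data(1).*");
        System.out.println(p.pattern());                        // the generated regex
        System.out.println(p.matcher("data(1).csv").matches()); // true
        System.out.println(p.matcher("data1.csv").matches());   // false
    }
}
```

Because everything except the wildcards stays inside a \Q...\E section, characters such as "(" and "." need no individual escaping.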
#7
3
GlobCompiler/GlobEngine, from Jakarta ORO, looks promising. It's available under the Apache License.
#8
3
Similar to Tony Edgecombe's answer, here is a short and simple globber that supports * and ? without using regex, if anybody needs one.
public static boolean matches(String text, String glob) {
String rest = null;
int pos = glob.indexOf('*');
if (pos != -1) {
rest = glob.substring(pos + 1);
glob = glob.substring(0, pos);
}
if (glob.length() > text.length())
return false;
// handle the part up to the first *
for (int i = 0; i < glob.length(); i++)
if (glob.charAt(i) != '?'
&& !glob.substring(i, i + 1).equalsIgnoreCase(text.substring(i, i + 1)))
return false;
// recurse for the part after the first *, if any
if (rest == null) {
return glob.length() == text.length();
} else {
for (int i = glob.length(); i <= text.length(); i++) {
if (matches(text.substring(i), rest))
return true;
}
return false;
}
}
#9
2
I don't know about a "standard" implementation, but I know of a sourceforge project released under the BSD license that implemented glob matching for files. It's implemented in one file, maybe you can adapt it for your requirements.
#10
0
Long ago I was doing massive glob-driven text filtering, so I wrote a small piece of code (15 lines, no dependencies beyond the JDK). It handles only '*' (which was sufficient for me), but can easily be extended for '?'. It is several times faster than a pre-compiled regexp and does not require any pre-compilation (essentially it is a string-vs-string comparison every time the pattern is matched).
Code:
public static boolean miniglob(String[] pattern, String line) {
if (pattern.length == 0) return line.isEmpty();
else if (pattern.length == 1) return line.equals(pattern[0]);
else {
if (!line.startsWith(pattern[0])) return false;
int idx = pattern[0].length();
for (int i = 1; i < pattern.length - 1; ++i) {
String patternTok = pattern[i];
int nextIdx = line.indexOf(patternTok, idx);
if (nextIdx < 0) return false;
else idx = nextIdx + patternTok.length();
}
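String last = pattern[pattern.length - 1];
// The last token must end the line without overlapping text already consumed
return line.length() - last.length() >= idx && line.endsWith(last);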
}
}
Usage:
public static void main(String[] args) {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
try {
// read from stdin space separated text and pattern
for (String input = in.readLine(); input != null; input = in.readLine()) {
String[] tokens = input.split(" ");
String line = tokens[0];
String[] pattern = tokens[1].split("\\*+", -1 /* want empty trailing token if any */);
// check matcher performance
long tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
miniglob(pattern, line);
}
long tm1 = System.currentTimeMillis();
System.out.println("miniglob took " + (tm1-tm0) + " ms");
// check regexp performance
Pattern reptn = Pattern.compile(tokens[1].replace("*", ".*"));
Matcher mtchr = reptn.matcher(line);
tm0 = System.currentTimeMillis();
for (int i = 0; i < 1000000; ++i) {
mtchr.matches();
}
tm1 = System.currentTimeMillis();
System.out.println("regexp took " + (tm1-tm0) + " ms");
// check if miniglob worked correctly
if (miniglob(pattern, line)) {
System.out.println("+ >" + line);
}
else {
System.out.println("- >" + line);
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
#11
0
It may be a slightly hacky approach. I figured it out from the code of NIO2's Files.newDirectoryStream(Path dir, String glob). Note that a new Path object is created for every match. So far I have only been able to test this on a Windows file system, but I believe it should work on Unix as well.
// a file system hack to get a glob matching
PathMatcher matcher = ("*".equals(glob)) ? null
: FileSystems.getDefault().getPathMatcher("glob:" + glob);
if ("*".equals(glob) || matcher.matches(Paths.get(someName))) {
// do your stuff here
}
#12
-1
By the way, it seems as if you did it the hard way in Perl.
This does the trick in Perl:
my @files = glob("*.html");
# Or, if you prefer:
my @files = <*.html>;
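For comparison, a rough Java counterpart to the Perl one-liners above, assuming Java 7 or later. ListByGlob and list are hypothetical names, and the temporary directory in main exists only to make the sketch runnable.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListByGlob {
    // Hypothetical helper: names of the entries in dir that match the glob, sorted.
    public static List<String> list(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("globdemo");
        Files.createFile(dir.resolve("index.html"));
        Files.createFile(dir.resolve("notes.txt"));
        System.out.println(list(dir, "*.html")); // [index.html]
    }
}
```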