Java + regular expressions: how to check the type and number (last 6 digits) of a string such as "LOAD_filesourceB-01012008_000058.dat"

Time: 2020-12-15 19:29:50

How can I implement the following requirement with a regular expression?

I have a list of filenames as Strings:
LOAD_filesourceA-01012008-00001.dat
LOAD_filesourceB-01012008-00001.dat
LOAD_filesourceB-01012008-00003.dat
LOAD_filesourceA-01012008-00004.dat
LOAD_filesourceA-01012008-000055.dat
LOAD_filesourceB-01012008_000055.dat
...
LOAD_filesourceB-01012008_000058.dat
etc

After each file is loaded, it is moved into an archive directory, and I log the file type and the load number (the last 6 characters of the filename before the extension).
I have 2 pieces of info:
1- whether the file I wish to load is of type A or B
2- the number of the last loaded file, as an integer
Based on these, I would like to get the name of the next file, i.e. the file of the same type whose load number (the last 6 digits before the ".dat" part) is the next available number. Say the last loaded number was 12: I will then search for 13, and if that is not available, 14, 15, etc., until I have processed all files in that directory.

Given just a string like "LOAD_filesourceB-01012008_000058.dat", can I check that it is file type B and, assuming the last loaded file number was 57, that it satisfies the requirement of being number 58 (I mean, greater than 57)?

3 solutions

#1


LOAD_filesource(A|B)-[0-9]+-([0-9]+)\.dat

A or B will end up in group 1, the number of the file in group 2. Then parse group 2 as a decimal integer.
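
The check described above could be sketched as follows. This is only an illustration, not the answerer's code: the class name LoadFileCheck and the method isNextCandidate are made up for the example, and the pattern assumes that the final separator may be either a hyphen or an underscore, as in the sample names from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoadFileCheck {
    // Group 1: file type (A or B); group 2: all digits of the load number.
    // Assumption: the separator before the load number is '-' or '_'.
    private static final Pattern NAME =
            Pattern.compile("LOAD_filesource(A|B)-[0-9]+[_-]([0-9]+)\\.dat");

    /** Returns true if fileName has the given type and a load number above lastLoaded. */
    static boolean isNextCandidate(String fileName, String type, int lastLoaded) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return false; // not a load file name at all
        }
        // Base-10 parsing ignores leading zeros; a very long digit run would
        // throw NumberFormatException, which this sketch does not handle.
        int number = Integer.parseInt(m.group(2));
        return m.group(1).equals(type) && number > lastLoaded;
    }

    public static void main(String[] args) {
        String name = "LOAD_filesourceB-01012008_000058.dat";
        System.out.println(isNextCandidate(name, "B", 57)); // type B, 58 > 57
        System.out.println(isNextCandidate(name, "A", 57)); // wrong type
        System.out.println(isNextCandidate(name, "B", 58)); // 58 is not > 58
    }
}
```

Using matches() rather than find() means the whole name has to fit the pattern, so a name with extra text before or after it is rejected.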

#2


See this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Match {

    // Group 1 is the file type (A or B); group 2 is the 5- or 6-digit load number.
    Pattern pattern = Pattern.compile("LOAD_filesource(A|B)-[0-9]{8}[_-]([0-9]{5,6})\\.dat");

    String[] files = {
        "LOAD_filesourceA-01012008-00001.dat",
        "LOAD_filesourceB-01012008-00001.dat",
        "LOAD_filesourceB-01012008-00003.dat",
        "LOAD_filesourceA-01012008-00004.dat",
        "LOAD_filesourceA-01012008-000055.dat",
        "LOAD_filesourceB-01012008_000055.dat",
        "LOAD_filesourceB-01012008_000058.dat"
    };

    public static void main(String[] args) {
        new Match().run();
    }

    private void run() {
        for (String file : files) {
            Matcher matcher = pattern.matcher(file);

            // matches() must run before group(); arguments are evaluated left to right,
            // so it does here. group() would throw IllegalStateException for a name
            // that does not match, but every sample name matches.
            System.out.printf("%s %b %s %s%n", file, matcher.matches(), matcher.group(1), matcher.group(2));
        }
    }
}

with this output:

LOAD_filesourceA-01012008-00001.dat true A 00001
LOAD_filesourceB-01012008-00001.dat true B 00001
LOAD_filesourceB-01012008-00003.dat true B 00003
LOAD_filesourceA-01012008-00004.dat true A 00004
LOAD_filesourceA-01012008-000055.dat true A 000055
LOAD_filesourceB-01012008_000055.dat true B 000055
LOAD_filesourceB-01012008_000058.dat true B 000058

#3


I don't know whether it's intentional, but you have listed two different formats: one that uses a hyphen as the final separator and one that uses an underscore. If both are really supported, you would want:

LOAD_filesource(A|B)-[0-9]+[_-]([0-9]+)\.dat

Also, your six-digit number sometimes has only five digits (e.g. the 00001 in LOAD_filesourceA-...-00001.dat). The regular expression above accepts both, because it only requires that at least one digit be present.

Depending on how many files you're going to examine, you might be better off loading a directory listing once rather than checking, one number at a time, whether each candidate file exists. With an appropriate compare method, sorting your list could give you your files in an easy-to-work-with order.
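
A rough sketch of the directory-listing approach is below. It is not taken from any of the answers: the names NextLoadFile, loadNumber and next are hypothetical, the pattern is borrowed from answer #2, and the method assumes lastLoaded is never negative. Instead of sorting the whole list it picks the smallest qualifying load number, which gives the same "next file" result.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NextLoadFile {
    // Group 1: file type; group 2: 5- or 6-digit load number (pattern from answer #2).
    private static final Pattern NAME =
            Pattern.compile("LOAD_filesource(A|B)-[0-9]{8}[_-]([0-9]{5,6})\\.dat");

    /** Load number of a name of the given type, or -1 if the name does not qualify. */
    static int loadNumber(String fileName, String type) {
        Matcher m = NAME.matcher(fileName);
        if (m.matches() && m.group(1).equals(type)) {
            return Integer.parseInt(m.group(2));
        }
        return -1;
    }

    /**
     * Finds the name of the given type with the smallest load number above lastLoaded.
     * Assumes lastLoaded >= 0, so the -1 sentinel from loadNumber is always filtered out.
     */
    static Optional<String> next(List<String> names, String type, int lastLoaded) {
        return names.stream()
                .filter(n -> loadNumber(n, type) > lastLoaded)
                .min(Comparator.comparingInt(n -> loadNumber(n, type)));
    }

    public static void main(String[] args) {
        // In practice the names might come from java.io.File#list() on the load
        // directory; a fixed list keeps the sketch runnable anywhere.
        List<String> names = Arrays.asList(
                "LOAD_filesourceA-01012008-000055.dat",
                "LOAD_filesourceB-01012008_000058.dat",
                "LOAD_filesourceB-01012008_000055.dat",
                "README.txt");
        System.out.println(next(names, "B", 12).orElse("none"));
        System.out.println(next(names, "B", 55).orElse("none"));
        System.out.println(next(names, "B", 58).orElse("none"));
    }
}
```

Calling next again with the number just loaded walks through the remaining files of that type in ascending order, and an empty Optional signals that nothing is left to process.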
