Extracting numbers from a string pattern

Time: 2022-09-13 00:19:04

I have a file that contains details about a parameter called "leak". One line in the file gives this information. There are three types of leak: short, medium and long. Not every type is necessarily present at a given time. Below are examples of the leak information from 6 files. The pattern is type_of_leak(number_of_leaks).

For example:

leak:   short(4)    medium(11)  long(4)
leak:   short(6)
leak:   long(3)
leak:   medium(4)   
leak:   medium(1)   long(8)
leak:   short(1)    long(5)

I want to extract the three leak values in order and populate an integer array: element 0 holds the short leak count, element 1 the medium leak count, and element 2 the long leak count. If no leaks are present for a category, its value should be 0. Below is the code I'm using. It extracts the leaks, but when a leak count has more than one digit it extracts only the first digit.

int[] leaks = new int[3];

if(line.contains("leak:")){ //search for the line that starts with leak

    System.out.println(line);

    //short leaks
    if(line.contains("short")) {
        int index = line.indexOf("short");
        int numShortLeaks = Integer.parseInt((line.substring(index+6, index+7)));
        leaks[0] = numShortLeaks;
    }else {
        leaks[0] = 0; //no short leaks replace with zero                    
    }

    if(line.contains("medium")) {
        int index = line.indexOf("medium");
        int numMediumLeaks = Integer.parseInt((line.substring(index+7, index+8)));
        leaks[1] = numMediumLeaks;
    }else {
        leaks[1] = 0;                       
    }

    if(line.contains("long")) {
        int index = line.indexOf("long");
        int numLongLeaks = Integer.parseInt((line.substring(index+5, index+6)));
        leaks[2] = numLongLeaks;
    }else {
        leaks[2] = 0;                       
    }
}
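
For reference, the one-digit limit in the code above comes from the fixed-width calls such as line.substring(index+6, index+7), which always return exactly one character. A minimal sketch of the same indexOf approach that reads up to the closing parenthesis instead is shown below. The class name LeakSubstring and the helper countFor are hypothetical names introduced only for this illustration, and the sketch assumes each count is written as label(digits).

```java
final class LeakSubstring {
    // Hypothetical helper: returns the number inside "label(N)", or 0 if the label is absent.
    static int countFor(String line, String label) {
        int start = line.indexOf(label + "(");
        if (start < 0) {
            return 0; // this leak type does not appear on the line
        }
        int open = start + label.length();   // index of '('
        int close = line.indexOf(')', open); // index of the next ')'
        if (close < 0) {
            return 0; // malformed entry without a closing parenthesis
        }
        // Take everything between the parentheses, however many digits there are.
        // Integer.parseInt throws NumberFormatException if that text is not a valid int.
        return Integer.parseInt(line.substring(open + 1, close));
    }

    // Returns {short, medium, long} counts for one "leak:" line.
    static int[] parse(String line) {
        return new int[] {
            countFor(line, "short"),
            countFor(line, "medium"),
            countFor(line, "long")
        };
    }
}
```

Calling LeakSubstring.parse(line) on a line that contains "leak:" would give the three counts in the order short, medium, long.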

2 solutions

#1


1  

Use this regular expression:

leak:(?:\s+short\((\d+)\))?(?:\s+medium\((\d+)\))?(?:\s+long\((\d+)\))?

This captures the short, medium and long counts in groups 1, 2 and 3 respectively.

The group numbers stay fixed even when one or more of short, medium and long is missing, so group 3 is always the long value regardless of whether short or medium were present. A group whose leak type is absent is null.

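import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "leak:   short(16)    long(3)";
// Each leak type is an optional non-capturing group wrapping one capturing group for the digits.
Pattern pattern = Pattern.compile("leak:(?:\\s+short\\((\\d+)\\))?(?:\\s+medium\\((\\d+)\\))?(?:\\s+long\\((\\d+)\\))?");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {

  // Stick them in your array; a group is null when that leak type is absent.
  System.out.println("short " + matcher.group(1)); // 16
  System.out.println("medium " + matcher.group(2)); // null
  System.out.println("long  " + matcher.group(3)); // 3
}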
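
To go from the printed groups to the int array the question asks for, something like the following sketch could work. It reuses the regular expression above; the class name LeakRegex and the method parse are hypothetical names chosen for this example. Groups that did not participate in the match come back as null, so their slots keep the default value 0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LeakRegex {
    // Same pattern as above: groups 1, 2 and 3 hold the short, medium and long counts.
    private static final Pattern LEAK = Pattern.compile(
        "leak:(?:\\s+short\\((\\d+)\\))?(?:\\s+medium\\((\\d+)\\))?(?:\\s+long\\((\\d+)\\))?");

    // Returns {short, medium, long}; a missing type stays 0.
    static int[] parse(String line) {
        int[] leaks = new int[3]; // int arrays are zero-initialised in Java
        Matcher matcher = LEAK.matcher(line);
        if (matcher.find()) {
            for (int i = 0; i < leaks.length; i++) {
                String digits = matcher.group(i + 1); // null when the type is absent
                if (digits != null) {
                    leaks[i] = Integer.parseInt(digits);
                }
            }
        }
        return leaks;
    }
}
```

Compiling the pattern once as a constant avoids recompiling it for every line of the file.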

#2


0  

Take this line as an example:

leak: short(4) medium(11) long(4)

The code can be as simple as:

int leakIndex = line.indexOf("leak:");
if (leakIndex > -1) {
    // Got the data
    // 1. Skip past "leak:" and split on whitespace (tabs or spaces) to get
    //    tokens like short(4), medium(11) or long(4)
    final String[] dataLine = line.substring(leakIndex + "leak:".length()).trim().split("\\s+");
    // 2. Loop over the tokens to extract each value
    for (String data : dataLine) {
        // I suggest creating a helper method for the extraction
        // 3. Simple idea: replace every non-digit character with nothing, then parse the rest
        // TODO: handle NumberFormatException here
        if (data.contains("short")) {
            leaks[0] = Integer.parseInt(data.replaceAll("[^0-9]", ""));
        } else if (data.contains("medium")) {
            leaks[1] = Integer.parseInt(data.replaceAll("[^0-9]", ""));
        } else if (data.contains("long")) {
            leaks[2] = Integer.parseInt(data.replaceAll("[^0-9]", ""));
        }
    }
} else {
    // Not a leak line: skip
}

Note: storing a long value in an int loses data if the value is outside the int range.
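
The helper method suggested in the comments, together with the NumberFormatException handling left as a TODO, might look roughly like this. It is only a sketch: the names LeakTokens, digitsOf and parse are made up for the example, and falling back to 0 when a token cannot be parsed is an assumption about the desired behaviour, not something the question specifies.

```java
final class LeakTokens {
    // Hypothetical helper: strips every non-digit from a token such as "medium(11)"
    // and parses what is left. Returns 0 if nothing parseable remains.
    static int digitsOf(String token) {
        String digits = token.replaceAll("[^0-9]", "");
        if (digits.isEmpty()) {
            return 0;
        }
        try {
            return Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            return 0; // assumption: a value too large for an int is treated as 0
        }
    }

    // Returns {short, medium, long} counts for one line; all zeros if "leak:" is absent.
    static int[] parse(String line) {
        int[] leaks = new int[3];
        int leakIndex = line.indexOf("leak:");
        if (leakIndex < 0) {
            return leaks;
        }
        String rest = line.substring(leakIndex + "leak:".length()).trim();
        for (String token : rest.split("\\s+")) {
            if (token.startsWith("short")) {
                leaks[0] = digitsOf(token);
            } else if (token.startsWith("medium")) {
                leaks[1] = digitsOf(token);
            } else if (token.startsWith("long")) {
                leaks[2] = digitsOf(token);
            }
        }
        return leaks;
    }
}
```

If counts larger than Integer.MAX_VALUE are actually expected, switching the array to long[] and using Long.parseLong would avoid the fallback.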
