Using a regular expression to extract integers that precede certain given characters

Time: 2021-08-16 00:14:27

I'm trying to write a function that takes an input String, reads it line by line, and converts units of measurement between metric and imperial.

Obviously the actual conversion between miles/kilometers and kilograms/pounds is straightforward math, but I'm a bit stumped on the correct way to extract these integers so I can convert them.

To make things more difficult, the input will vary, and I'll need to recognize different formats: optional spaces between the integer and the unit of measurement, and different spellings (miles, mile, mi, km, kilometer, etc.).

So far I've got:

if (isMetric) {
    for (String line : input.split("[\\r\\n]+")) {

    }
    return input;
}

This reads each line. I'm thinking I might need a combination of String.substring and a regex, but I'm pretty new to this.

Any guidance or links to helpful articles would be much appreciated; I'm not looking for a complete solution here, of course!

Thanks a lot!

Edit:

An example, as requested:

Input:

I ran 50miles today, 1mile yesterday, and I also lifted a 20 pound and a 5lb weight!

Output:

I ran 80km today, 1.6km yesterday, and I also lifted a 9kg and a 2.2kg weight!

1 answer

#1


Here's a solution that will let you find all matches with or without spaces, and with different unit spellings.

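Note that the order of the alternatives in the patterns matters: Java's regex engine tries alternatives from left to right and takes the first one that matches, so a longer spelling must come before any shorter spelling that is a prefix of it (here, miles must come before mile, and mile before mil).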
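
To see why the ordering matters, here is a small sketch that runs the same input through two orderings of the alternatives. The class name AlternationOrderDemo and the helper firstUnit are made up for this illustration and are not part of the answer's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, only meant to illustrate alternation order.
class AlternationOrderDemo {
    // Returns the unit captured by the first match (group 2), or null if nothing matches.
    static String firstUnit(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String text = "603miles";
        // Shorter spelling first: the engine stops at "mil" and leaves "es" unmatched.
        System.out.println(firstUnit("(\\d+)\\s*(mil|mile|miles)", text));
        // Longer spelling first: the whole word "miles" is captured.
        System.out.println(firstUnit("(\\d+)\\s*(miles|mile|mil)", text));
    }
}
```

This prints mil and then miles, so with the shorter spelling first the unit is cut short even though the number is still found.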

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// \d+ matches one or more digits. \s* matches zero or more whitespace characters.
String milePattern = "(\\d+)\\s*(miles|mile|mil)";
String kmPattern = "(\\d+)\\s*(kilometers|km|kilometres)";

// Compile the patterns once (in real code, keep them in static final fields instead of recompiling on every call)
Pattern mileP = Pattern.compile(milePattern);
Pattern kmP = Pattern.compile(kmPattern);

// You can match one or multiple lines all the same.
String input = "I ran 1001km or 601 mile \n that is the same as 602 mil or 603miles or 1002 kilometers.";

// Create matcher instance on your input.
Matcher mileM = mileP.matcher(input);
// Iterate over all mile-matches (find will 'advance' each time you call it)
while (mileM.find()) {
    // Retrieve the value and the unit
    String amount = mileM.group(1);
    String unit = mileM.group(2);

    // You can also access some data about the match
    int idx = mileM.start();

    // And do whatever you need with it
    System.out.println("Found a mile value: " + amount + " with unit " + unit + " starting at index: " + idx);
}

You can run the same loop with the kilometer pattern (kmP). You could also combine both expressions into a single pattern if you want. Running both loops on my test input, I get the output:

Found a mile value: 601 with unit mile starting at index: 16
Found a mile value: 602 with unit mil starting at index: 47
Found a mile value: 603 with unit miles starting at index: 58
Found a km value: 1001 with unit km starting at index: 6
Found a km value: 1002 with unit kilometers starting at index: 70

You can then do whatever conversion you want, or rebuild the string with other units.
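
As a rough sketch of that last step, the loop can rebuild the string while it iterates by using Matcher.appendReplacement and Matcher.appendTail. Everything specific here is an assumption made for illustration: the class name ImperialToMetric, the combined pattern with the extra spellings (mi, pound, lb, lbs), the trailing \b, and the choice to round to one decimal place.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names and rounding policy are assumptions.
class ImperialToMetric {
    // One combined pattern: group 1 is the number, group 2 is the unit.
    // Longer spellings come first; \b keeps "mi" from matching inside "minutes".
    private static final Pattern MEASUREMENT = Pattern.compile(
            "(\\d+)\\s*(miles|mile|mi|pounds|pound|lbs|lb)\\b");

    static final double KM_PER_MILE = 1.609344;
    static final double KG_PER_POUND = 0.45359237;

    static String toMetric(String input) {
        Matcher m = MEASUREMENT.matcher(input);
        // StringBuffer works on every Java version; Java 9+ also accepts StringBuilder.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            double value = Double.parseDouble(m.group(1));
            boolean isDistance = m.group(2).startsWith("mi");
            double converted = value * (isDistance ? KM_PER_MILE : KG_PER_POUND);
            String replacement = format(converted) + (isDistance ? "km" : "kg");
            // Copies the text before the match, then the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    // Rounds to one decimal place and drops a trailing ".0".
    static String format(double v) {
        String s = String.format(Locale.ROOT, "%.1f", v);
        return s.endsWith(".0") ? s.substring(0, s.length() - 2) : s;
    }

    public static void main(String[] args) {
        System.out.println(toMetric(
                "I ran 50miles today, 1mile yesterday, and I also lifted a 20 pound and a 5lb weight!"));
    }
}
```

With this rounding, the example sentence from the question comes out as "I ran 80.5km today, 1.6km yesterday, and I also lifted a 9.1kg and a 2.3kg weight!", which differs slightly from the figures in the question because those were rounded differently. The opposite direction would follow the same shape with a kilometer/kilogram pattern and the inverse factors.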
