PHP: Splitting a string based on an array

Below is the data I'm trying to parse:

50‐59 1High300.00 Avg300.00
90‐99 11High222.00 Avg188.73
120‐1293High204.00 Avg169.33

The first section is a weight range, next is a count, followed by Highprice, ending with Avgprice.

As an example, I need to parse the data above into arrays that would look like this:

[0]50-59
[1]1
[2]High300.00
[3]Avg300.00

[0]90-99
[1]11
[2]High222.00
[3]Avg188.73

[0]120‐129
[1]3
[2]High204.00
[3]Avg169.33

I thought about creating an array of what the possible weight ranges can be but I can't figure out how to use the values of the array to split the string.

$arr = array("10-19","20-29","30-39","40-49","50-59","60-69","70-79","80-89","90-99","100-109","110-119","120-129","130-139","140-149","150-159","160-169","170-179","180-189","190-199","200-209","210-219","220-229","230-239","240-249","250-259","260-269","270-279","280-289","290-299","300-309");

Any ideas would be greatly appreciated.

4 Solutions

#1

Hope this will work:

    $string='50-59 1High300.00 Avg300.00
    90-99 11High222.00 Avg188.73
    120-129 3High204.00 Avg169.33';

    $requiredData=array();
    $dataArray=explode("\n",$string);
    $counter=0;
    foreach($dataArray as $data)
    {
        if(preg_match('#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#', $data,$matches))    
        {
            $requiredData[$counter][]=$matches[1];
            $requiredData[$counter][]=$matches[2];
            $requiredData[$counter][]=$matches[3];
            $requiredData[$counter][]=$matches[4];
            $counter++;
        }
    }
    print_r($requiredData);

#2

'#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#'
I don't think that will work because of the space in the regex between the weight and the count. What I'm struggling with is a row like this one, where there is no space: 120‐1293High204.00 Avg169.33. It needs to be parsed as [0]120‐129 [1]3 [2]High204.00 [3]Avg169.33

You are right. That can be remedied by limiting the number of weight digits to three and making the space optional.

'#^(\d+-\d{1,3}) *…

#3

$arr = array('50-59 1High300.00 Avg300.00', 
             '90-99 11High222.00 Avg188.73', 
             '120-129 3High204.00 Avg169.33');

foreach($arr as $str) {
    if (preg_match('/^(\d+-\d{1,3})\s*(\d+)(High\d+\.\d\d) (Avg\d+\.\d\d)/i', $str, $m)) {
        array_shift($m); //remove group 0 (ie. the whole match)
        $result[] = $m;
    }
}
print_r($result);

Output:

Array
(
    [0] => Array
        (
            [0] => 50-59
            [1] => 1
            [2] => High300.00
            [3] => Avg300.00
        )

    [1] => Array
        (
            [0] => 90-99
            [1] => 11
            [2] => High222.00
            [3] => Avg188.73
        )

    [2] => Array
        (
            [0] => 120-129
            [1] => 3
            [2] => High204.00
            [3] => Avg169.33
        )

)

Explanation:

/                   : regex delimiter
    ^               : beginning of string
    (               : start group 1
      \d+-\d{1,3}   : 1 or more digits, a dash, and 1 to 3 digits, i.e. the weight range
    )               : end group 1
    \s*             : 0 or more whitespace characters
    (\d+)           : group 2, i.e. the count
    (High\d+\.\d\d) : group 3, literal High followed by the price
    (space)         : a single literal space
    (Avg\d+\.\d\d)  : group 4, literal Avg followed by the price
/i                  : regex delimiter and case-insensitive modifier

To be more generic, you could replace High and Avg by [a-z]+
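
As a rough sketch of that more generic variant (not part of the original answer), the pattern below swaps the literal labels for [a-z]+ and, as an extra assumption, also makes the space before the second label optional. The variable names are illustrative. Note that the no-space heuristic stays ambiguous when the range end has fewer than three digits: a hypothetical row such as 90-9911High... would be read as range 90-991 and count 1.

```php
<?php
// Sketch only: [a-z]+ stands in for the literal High/Avg labels, and the
// whitespace before the second label is optional (an assumption).
$lines = [
    '50-59 1High300.00 Avg300.00',
    '90-99 11High222.00 Avg188.73',
    '120-1293High204.00 Avg169.33', // no space between range and count
];

$pattern = '/^(\d+-\d{1,3})\s*(\d+)([a-z]+\d+\.\d\d)\s*([a-z]+\d+\.\d\d)/i';

$result = [];
foreach ($lines as $line) {
    if (preg_match($pattern, $line, $m)) {
        array_shift($m);   // drop group 0 (the whole match)
        $result[] = $m;    // [range, count, label+price, label+price]
    }
}
print_r($result);
```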

#4

Let's get some resolution on this seemingly abandoned question:

This is a pattern you can trust (Pattern Demo):

^((\d{0,2})0\‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})

The other answers overlooked the digital pattern in the weight range substring. The range start integer always ends in 0, and the range end integer always ends in 9; the range always spans ten integers.

My pattern will capture the digits that precede the 0 in the starting integer and reference them immediately after the dash, then require that captured number to be followed by a 9.
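
To see the backreference in isolation, here is a small sketch that checks only the range part. It assumes ASCII hyphens for readability, and the variable names and candidate strings are made up for illustration. The (?:\1) wrapper keeps the backreference from being read together with the 9 that follows it.

```php
<?php
// Sketch: the digits before the 0 are captured, then required again
// after the dash, followed by a 9. Assumes ASCII hyphens.
$rangePattern = '/^(\d{0,2})0-(?:\1)9$/';

$candidates = ['50-59', '120-129', '0-9', '50-69', '120-139', '55-59'];

$valid = [];
foreach ($candidates as $candidate) {
    $ok = preg_match($rangePattern, $candidate) === 1;
    printf("%-8s %s\n", $candidate, $ok ? 'valid' : 'invalid');
    if ($ok) {
        $valid[] = $candidate;
    }
}
```

Only ranges whose start ends in 0 and whose end is the matching 9 pass, so 50-69, 120-139 and 55-59 are rejected.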

I want to point out that your sample input was a little bit tricky because your ‐ is not the standard - that is between the 0 and = on my keyboard. This was a sneaky little gotcha for me to solve.
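
If you would rather keep plain ASCII hyphens in the pattern, one option is to normalize the input first. This is only a sketch: it assumes the odd character is U+2010 (HYPHEN), and the "\u{2010}" escape needs PHP 7.0 or later. The second call is a broader alternative that maps any Unicode dash punctuation to the ASCII hyphen.

```php
<?php
// Sketch: replace U+2010 (HYPHEN) with the ASCII hyphen-minus before parsing.
// The sample row is taken from the question.
$raw = "120\u{2010}1293High204.00 Avg169.33";

// Targeted replacement of the one character.
$normalized = str_replace("\u{2010}", '-', $raw);

// Broader alternative: any dash punctuation (\p{Pd}) becomes "-".
$normalizedAny = preg_replace('/\p{Pd}/u', '-', $raw);

var_dump(strpos($raw, '-'));        // no ASCII hyphen in the raw input
var_dump(strpos($normalized, '-')); // ASCII hyphen now at offset 3
```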

Method (Demo):

$keys=['weight range','count','Highprice','Avgprice'];
$in='50‐59 1High300.00 Avg300.00
90‐99 11High222.00Avg188.73
120‐1293High204.00 Avg169.33';

$out=preg_match_all('/((\d{0,2})0\‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/',$in,$out)?array_diff_key($out,[0=>'',2=>'']):[];
// array_diff_key removes unwanted matching subarrays
// $out[1] holds one entry per matched row; use its index to pull that row's column from every group
foreach($out[1] as $i=>$v){
    $result[]=array_combine($keys,array_column($out,$i));
}
var_export($result);

Output:

array (
  0 => 
  array (
    'weight range' => '50‐59',
    'count' => '1',
    'Highprice' => '300.00',
    'Avgprice' => '300.00',
  ),
  1 => 
  array (
    'weight range' => '90‐99',
    'count' => '11',
    'Highprice' => '222.00',
    'Avgprice' => '188.73',
  ),
  2 => 
  array (
    'weight range' => '120‐129',
    'count' => '3',
    'Highprice' => '204.00',
    'Avgprice' => '169.33',
  ),
)
