Below is the data I'm trying to parse:
50‐59 1High300.00 Avg300.00
90‐99 11High222.00 Avg188.73
120‐1293High204.00 Avg169.33
The first section is a weight range, next is a count, followed by Highprice, ending with Avgprice.
As an example, I need to parse the data above into an array which would look like this:
[0]50-59
[1]1
[2]High300.00
[3]Avg300.00
[0]90-99
[1]11
[2]High222.00
[3]Avg188.73
[0]120‐129
[1]3
[2]High204.00
[3]Avg169.33
I thought about creating an array of the possible weight ranges, but I can't figure out how to use the values of that array to split the string.
$arr = array("10-19","20-29","30-39","40-49","50-59","60-69","70-79","80-89","90-99","100-109","110-119","120-129","130-139","140-149","150-159","160-169","170-179","180-189","190-199","200-209","210-219","220-229","230-239","240-249","250-259","260-269","270-279","280-289","290-299","300-309");
Any ideas would be greatly appreciated.
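One way the array-of-ranges idea could be made to work is to turn the array into a regex alternation, so that only a known range can match at the start of a row. This is a minimal sketch, not part of the original thread: the function name `parse_row` is hypothetical, the range list is shortened, and it assumes PHP 7.1 or later (for `"\u{2010}"` and the nullable return type). It also assumes the rows may contain the Unicode hyphen U+2010 seen in the sample data, which it converts to an ASCII hyphen first.

```php
<?php
// Shortened list of known weight ranges (the question's $arr has the full list).
$ranges = array("10-19", "20-29", "50-59", "90-99", "120-129");

// Hypothetical helper: split one row into [range, count, High..., Avg...].
function parse_row(string $line, array $ranges): ?array
{
    // Convert U+2010 HYPHEN to the ASCII hyphen so it can match the array values.
    $line = trim(str_replace("\u{2010}", "-", $line));

    // Quote each range and join with "|" so only a listed range can match.
    $alternation = implode('|', array_map(function ($r) {
        return preg_quote($r, '/');
    }, $ranges));

    $pattern = '/^(' . $alternation . ')\s*(\d+)(High[\d.]+)\s*(Avg[\d.]+)$/';
    if (!preg_match($pattern, $line, $m)) {
        return null; // row does not start with a known range
    }
    return array_slice($m, 1); // drop the full match, keep the four fields
}

print_r(parse_row("120\u{2010}1293High204.00 Avg169.33", $ranges));
```

Because the range is matched as a fixed string, the missing space between `120-129` and the count `3` is no longer ambiguous.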
4 solutions
#1
1
Hope this will work:
$string='50-59 1High300.00 Avg300.00
90-99 11High222.00 Avg188.73
120-129 3High204.00 Avg169.33';
$requiredData=array();
$dataArray=explode("\n",$string);
$counter=0;
foreach($dataArray as $data)
{
    if(preg_match('#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#', $data,$matches))
    {
        $requiredData[$counter][]=$matches[1];
        $requiredData[$counter][]=$matches[2];
        $requiredData[$counter][]=$matches[3];
        $requiredData[$counter][]=$matches[4];
        $counter++;
    }
}
print_r($requiredData);
#2
1
'#^([\d]+\-[\d]+) ([\d]+)([a-zA-Z]+[\d\.]+) ([a-zA-Z]+[\d\.]+)#'
I don't think that will work because of the space you have in the regex between the weight and count. The thing I'm struggling with is a row like this where there is no space.
120‐1293High204.00 Avg169.33
That needs to be parsed like [0]120‐129 [1]3 [2]High204.00 [3]Avg169.33
You are right. That can be remedied by limiting the number of weight digits to three and making the space optional.
'#^(\d+-\d{1,3}) *…
#3
0
$arr = array('50-59 1High300.00 Avg300.00',
    '90-99 11High222.00 Avg188.73',
    '120-129 3High204.00 Avg169.33');
$result = array(); // initialise so print_r() works even when nothing matches
foreach($arr as $str) {
    if (preg_match('/^(\d+-\d{1,3})\s*(\d+)(High\d+\.\d\d) (Avg\d+\.\d\d)/i', $str, $m)) {
        array_shift($m); // remove group 0 (i.e. the whole match)
        $result[] = $m;
    }
}
print_r($result);
Output:
Array
(
[0] => Array
(
[0] => 50-59
[1] => 1
[2] => High300.00
[3] => Avg300.00
)
[1] => Array
(
[0] => 90-99
[1] => 11
[2] => High222.00
[3] => Avg188.73
)
[2] => Array
(
[0] => 120-129
[1] => 3
[2] => High204.00
[3] => Avg169.33
)
)
Explanation:
/ : regex delimiter
^ : beginning of string
( : start group 1
\d+-\d{1,3} : 1 or more digits, a dash, and 1 up to 3 digits, i.e. the weight range
) : end group 1
\s* : 0 or more whitespace characters
(\d+) : group 2, i.e. the count
(High\d+\.\d\d) : group 3, literal High followed by the price
(Avg\d+\.\d\d) : group 4, literal Avg followed by the price (preceded by a literal space)
/i : regex delimiter and case-insensitive modifier
To be more generic, you could replace High and Avg by [a-z]+
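As a sketch of that more generic variant, the pattern below swaps the literal labels for `[a-z]+` and is run against a row with no space between the weight range and the count. The variable names are illustrative; the rows come from the question.

```php
<?php
// Sample rows from the question; the third has no space between range and count.
$rows = array(
    '50-59 1High300.00 Avg300.00',
    '90-99 11High222.00 Avg188.73',
    '120-1293High204.00 Avg169.33',
);

$result = array();
foreach ($rows as $row) {
    // [a-z]+ with the /i modifier accepts any alphabetic label in place of High/Avg.
    if (preg_match('/^(\d+-\d{1,3})\s*(\d+)([a-z]+\d+\.\d\d) ([a-z]+\d+\.\d\d)/i', $row, $m)) {
        array_shift($m); // drop group 0 (the whole match)
        $result[] = $m;
    }
}
print_r($result);
```

One limitation worth keeping in mind: `\d{1,3}` is greedy, so a row whose range ends in two digits and has no following space (for example a hypothetical `90-9911High...`) would be split as range `90-991` and count `1`. The three-digit cap only disambiguates ranges whose upper bound already has three digits.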
#4
0
Let's get some resolution on this seemingly abandoned question:
This is a pattern you can trust (Pattern Demo):
^((\d{0,2})0\‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})
The other answers overlooked the digit pattern in the weight range substring. The range start integer always ends in 0, and the range end integer always ends in 9; the range always spans ten integers.
My pattern will capture the digits that precede the 0 in the starting integer and reference them immediately after the dash, then require that captured number to be followed by a 9.
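The backreference idea can be tried in isolation. The sketch below is not from the answer: the function name `is_decade_range` is hypothetical, and it uses ASCII hyphens for simplicity. Wrapping the backreference as `(?:\1)` presumably serves to keep it from being read together with the digit `9` that follows it.

```php
<?php
// Hypothetical helper: true only for ranges of the form N0-N9 with identical leading digits.
function is_decade_range(string $range): bool
{
    // (\d{0,2}) captures the digits before the 0; (?:\1) requires the same digits after the dash.
    return preg_match('/^(\d{0,2})0-(?:\1)9$/', $range) === 1;
}

var_dump(is_decade_range('120-129')); // bool(true)
var_dump(is_decade_range('50-59'));   // bool(true)
var_dump(is_decade_range('120-139')); // bool(false): leading digits differ
```

This is what lets the pattern find the end of the range without relying on a space: after the repeated digits and the 9, whatever follows must be the count.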
I want to point out that your sample input was a little bit tricky because your ‐ is not the standard - that is between the 0 and = on my keyboard. This was a sneaky little gotcha for me to solve.
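If the input may contain either character, one option is to accept both in the pattern rather than rewriting the data. The following is only a sketch under that assumption: it uses the `/u` modifier so that `\x{2010}` (the Unicode HYPHEN from the sample input) is treated as a single character, and it requires PHP 7.0 or later for the `"\u{2010}"` string escape. The pattern itself is illustrative, not the answer's.

```php
<?php
// Character class accepts the ASCII hyphen-minus or U+2010 HYPHEN between the numbers.
$pattern = '/^(\d+[-\x{2010}]\d{1,3})\s*(\d+)(High\d+\.\d\d)\s*(Avg\d+\.\d\d)/u';

$lines = array(
    "120\u{2010}1293High204.00 Avg169.33", // U+2010 HYPHEN, as in the sample data
    "120-129 3High204.00 Avg169.33",       // ASCII hyphen-minus
);

foreach ($lines as $line) {
    if (preg_match($pattern, $line, $m)) {
        // bin2hex makes the otherwise invisible difference between the two hyphens visible.
        echo bin2hex($m[1]), ' => count ', $m[2], PHP_EOL;
    }
}
```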
Method (Demo):
$keys=['weight range','count','Highprice','Avgprice'];
$in='50‐59 1High300.00 Avg300.00
90‐99 11High222.00Avg188.73
120‐1293High204.00 Avg169.33';
$result=[];
if(preg_match_all('/((\d{0,2})0\‐(?:\2)9) ?(\d{1,3})High(\d{1,3}\.\d{2}) ?Avg(\d{1,3}\.\d{2})/',$in,$out)){
    // array_diff_key removes unwanted matching subarrays (the full match and group 2)
    $out=array_diff_key($out,[0=>'',2=>'']);
    // loop over the matches (rows), not over the capture groups
    foreach(array_keys($out[1]) as $i){
        // take the $i-th match from each remaining group
        $result[]=array_combine($keys,array_column($out,$i));
    }
}
var_export($result);
Output:
array (
  0 =>
  array (
    'weight range' => '50‐59',
    'count' => '1',
    'Highprice' => '300.00',
    'Avgprice' => '300.00',
  ),
  1 =>
  array (
    'weight range' => '90‐99',
    'count' => '11',
    'Highprice' => '222.00',
    'Avgprice' => '188.73',
  ),
  2 =>
  array (
    'weight range' => '120‐129',
    'count' => '3',
    'Highprice' => '204.00',
    'Avgprice' => '169.33',
  ),
)