I have a string like this, where each word is marked with a tag (FW, PRP, etc.) joined to it by an underscore:
Hi_FW !_.
My_PRP$ name_NN 's_POS Jim_NNP ._.
I_PRP 'm_VBP from_IN New_NNP Zealand_NNP ._.
This_DT is_VBZ my_PRP$ friend_NN ._.
His_PRP$ name_NN 's_POS Adam_NNP ._.
He_PRP 's_VBZ from_IN Australia_NNP ._.
This_DT is_VBZ my_PRP$ friend_NN too_RB ._.
Her_PRP$ name_NN 's_POS Paola_NNP ._.
She_PRP 's_VBZ from_IN Italy_NNP ._.
I need to break it into an array where each key is a word and its value is the corresponding tag:
[
"Hi" => "FW",
"My" => "PRP$",
"name" => "NN"
...
]
I assume I can somehow split this string on the delimiter _, but I can't seem to find a good way to then assemble the pieces into the array I need.
How can that be achieved?
3 solutions
#1
$lines = explode("\n", $string);
$tokens = array();
foreach ($lines as $line)
{
    // split each line into word_TAG tokens
    foreach (explode(' ', trim($line)) as $token)
        array_push($tokens, $token);
}
$result = array();
foreach ($tokens as $token)
{
    // [0] = word, [1] = tag; skip empty or malformed tokens
    $pair = explode('_', $token, 2);
    if (count($pair) === 2)
        $result[$pair[0]] = $pair[1];
}
#2
Let's assume we are reading from a file (data.txt). The following reads the contents of the file using fopen(), which can be omitted if your input is already a string.
This is a partial, naive implementation intended to give you a head start. It assumes very simple delimiters and calls preg_split() twice; see the comments:
<?php
$result = array();
$delimiter = '_';
$file_handle = fopen("data.txt", "r");
// fgets() returns false at end of file, which ends the loop
while (($line = fgets($file_handle)) !== false) {
    // e.g. My_PRP$ name_NN 's_POS Jim_NNP ._.
    // validations omitted
    // first split the line on whitespace
    // [0] = My_PRP$
    // [1] = name_NN
    $tokens = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        // then split each token once on the delimiter '_'
        // [0] = My
        // [1] = PRP$
        $pair = preg_split('/' . preg_quote($delimiter, '/') . '/', $token, 2);
        if (count($pair) === 2) {
            $result[$pair[0]] = $pair[1];
        }
    }
}
fclose($file_handle);
print_r($result);
?>
data.txt
Hi_FW !_.
My_PRP$ name_NN 's_POS Jim_NNP ._.
I_PRP 'm_VBP from_IN New_NNP Zealand_NNP ._.
This_DT is_VBZ my_PRP$ friend_NN ._.
His_PRP$ name_NN 's_POS Adam_NNP ._.
He_PRP 's_VBZ from_IN Australia_NNP ._.
This_DT is_VBZ my_PRP$ friend_NN too_RB ._.
Her_PRP$ name_NN 's_POS Paola_NNP ._.
She_PRP 's_VBZ from_IN Italy_NNP ._.
Hope this helps.
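If the input is already a string, the file handling can be dropped. The following is only a minimal sketch of that case: the helper name tagsFromString() and the sample string are assumptions made for illustration, not part of the answer above. It still uses preg_split() twice, once on whitespace and once on the delimiter.

```php
<?php
// Hypothetical helper (not from the original answer): builds the
// word => tag array from a string instead of a file handle.
function tagsFromString(string $text, string $delimiter = '_'): array
{
    $result = array();
    // Split on any run of whitespace, including line breaks.
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        // Split each token once on the delimiter: [0] = word, [1] = tag.
        $pair = preg_split('/' . preg_quote($delimiter, '/') . '/', $token, 2);
        if (count($pair) === 2) {
            $result[$pair[0]] = $pair[1];
        }
    }
    return $result;
}

// Sample taken from the first two lines of the question's input.
$sample = "Hi_FW !_.\nMy_PRP\$ name_NN 's_POS Jim_NNP ._.";
print_r(tagsFromString($sample));
```

Because the result is keyed by word, a word that appears more than once keeps only its last tag.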
#3
I would explode on whitespace and then on _:
<?php
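// split on any whitespace so that line breaks separate tokens too
$inputArray = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
$sentences = array();
foreach ($inputArray as $word){
    // [0] = word, [1] = tag
    $wordArray = explode("_", $word, 2);
    if (count($wordArray) === 2) {
        $sentences[$wordArray[0]] = $wordArray[1];
    }
}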
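One thing to watch with a word-keyed array: the sample input repeats words ('s appears as both POS and VBZ), and a later entry overwrites an earlier one. If every occurrence matters, a list of pairs may fit better. This is a rough sketch under that assumption; the function name wordTagPairs() is made up for the example.

```php
<?php
// Hypothetical wrapper around the two explode() steps; wordTagPairs()
// is a name chosen for this example only.
function wordTagPairs(string $input): array
{
    $pairs = array();
    // strtr() turns line breaks into spaces so explode(" ") sees every token.
    foreach (explode(" ", strtr($input, "\r\n", "  ")) as $word) {
        $parts = explode("_", $word, 2);
        if (count($parts) === 2) {
            // Append instead of keying by word, so repeated words are kept.
            $pairs[] = array($parts[0], $parts[1]);
        }
    }
    return $pairs;
}

// Two lines from the question's input in which 's carries different tags.
$input = "His_PRP\$ name_NN 's_POS Adam_NNP ._.\nHe_PRP 's_VBZ from_IN Australia_NNP ._.";
foreach (wordTagPairs($input) as $pair) {
    echo $pair[0], " => ", $pair[1], "\n";
}
```

Each element is a two-item array holding the word and its tag, in input order.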