解析数组的自定义格式
问题描述:
我试图将自定义格式的.txt文件转换为数组。解析数组的自定义格式
这里是第一线的样本:
[LINETYPE]S [STARTTIME]00:00:00
[LINETYPE]M [TITLE]There For You [PERFORMER]Martin Garrix & Troye Sivan [MUSICID]21120 [LABEL]
[LINETYPE]M [TITLE]Shut Up and Dance [PERFORMER]Walk The Moon [MUSICID]20634 [LABEL]
继续这样跌下去,用一个新的[LINETYPE] S为一天的每一个小时。
现在,我想达到的目标,是让每一个小时的阵列 - 包含另一个数组与支架作为键的每个值:
array("00:00:00" => array(
array("LINETYPE" => "M",
"TITLE" => "There For You",
"PERFORMER" => "Martin Garrix & Troye Sivan",
"MUSICID" => "21120",
"LABEL" => ""),
array("LINETYPE" => "M",
"TITLE" => "Shut Up And Dance",
"PERFORMER" => "Walk The Moon",
"MUSICID" => "20634",
"LABEL" => "")
), "01:00:00" => array(
[...]
);
的正则表达式来搜索每个小时应该是正确的,但似乎preg_split不支持多个捕获组。
到目前为止,我发现的所有答案都在一个更基本的层面上,如爆炸逗号。
这是我的代码到目前为止。
$test = get_musiclog("06082017");
$hours = array();
$regex = "(\[LINETYPE\]S\t\[STARTTIME\])((?:(?:[0-2][0-9])|(?:[2][0-3])|(?:[0-9])):(?:[0-5][0-9])(?::[0-5][0-9])?(?:\\s?)?)\n";
if (preg_match_all ("/".$regex."/is", $test, $matches)) {
foreach($matches[2] as $hour) {
$hours[$hour] = ""; // Music array
}
}
几乎没有任何头发留在头脑里想着这些问题。任何人都可以将我指向正确的方向吗?
- 我如何拆分字符串 “[LINETYPE] S [STARTTIME] XX:XX:XX” 和保持时间的关键?
- 如何知道结构是[键]值\ t而\ n是分隔符的其余元素?
- “数组内的数组”是否正确?
答
我的方法主要使用否定字符类来捕获标记之间的值。为了保持正则表达式相对较轻,可以使用可选的捕获组,以便像LINETYPE S
这样的短线仍匹配。
代码:(Pattern Demo)(PHP Demo)
$input='[LINETYPE]S [STARTTIME]00:00:00
[LINETYPE]M [TITLE]There For You [PERFORMER]Martin Garrix & Troye Sivan [MUSICID]21120 [LABEL]
[LINETYPE]M [TITLE]Shut Up and Dance [PERFORMER]Walk The Moon [MUSICID]20634 [LABEL]
[LINETYPE]M [TITLE]Subeme La Radio [PERFORMER]Enrique Iglesias [MUSICID]21105 [LABEL]
[LINETYPE]C [TITLE]AUTO S2 NON STOP [PERFORMER] [MUSICID]A2-6 [LABEL]
[LINETYPE]M [TITLE]Uptown Funk [PERFORMER]Mark Ronson & Bruno Mars [MUSICID]20533 [LABEL]
[LINETYPE]M [TITLE]Let It Go [PERFORMER]James Bay [MUSICID]20839 [LABEL]
[LINETYPE]S [STARTTIME]01:00:00
[LINETYPE]M [TITLE]There For You [PERFORMER]Martin Garrix & Troye Sivan [MUSICID]21120 [LABEL]
[LINETYPE]M [TITLE]Shut Up and Dance [PERFORMER]Walk The Moon [MUSICID]20634 [LABEL] ';
$pattern='/\[LINETYPE\]([A-Z])\s\[(?:TITLE|STARTTIME)\]([^[\s]*(?: [^[\s]+)*)(?:\s*\[PERFORMER\]([^[ ]*(?: [^[ ]+)*)\s+\[MUSICID\]([^[ ]+)\s+\[LABEL\](\S*(?: \S+)*))?/';
if(!preg_match_all($pattern,$input,$out, PREG_SET_ORDER)){
echo 'something went wrong';
}else{
foreach($out as $a){
if($a[1]=='S'){
$time=$a[2];
}else{
$result[$time][]=["LINETYPE"=>$a[1],"TITLE"=>$a[2],"PERFORMER"=>$a[3],"MUSICID"=>$a[4],"LABEL"=>$a[5]];
}
}
}
var_export($result);
输出:
array (
'00:00:00' =>
array (
0 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'There For You',
'PERFORMER' => 'Martin Garrix & Troye Sivan',
'MUSICID' => '21120',
'LABEL' => '',
),
1 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'Shut Up and Dance',
'PERFORMER' => 'Walk The Moon',
'MUSICID' => '20634',
'LABEL' => '',
),
2 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'Subeme La Radio',
'PERFORMER' => 'Enrique Iglesias',
'MUSICID' => '21105',
'LABEL' => '',
),
3 =>
array (
'LINETYPE' => 'C',
'TITLE' => 'AUTO S2 NON STOP',
'PERFORMER' => '',
'MUSICID' => 'A2-6',
'LABEL' => '',
),
4 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'Uptown Funk',
'PERFORMER' => 'Mark Ronson & Bruno Mars',
'MUSICID' => '20533',
'LABEL' => '',
),
5 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'Let It Go',
'PERFORMER' => 'James Bay',
'MUSICID' => '20839',
'LABEL' => '',
),
),
'01:00:00' =>
array (
0 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'There For You',
'PERFORMER' => 'Martin Garrix & Troye Sivan',
'MUSICID' => '21120',
'LABEL' => '',
),
1 =>
array (
'LINETYPE' => 'M',
'TITLE' => 'Shut Up and Dance',
'PERFORMER' => 'Walk The Moon',
'MUSICID' => '20634',
'LABEL' => '',
),
),
)
不要建立一个模式来匹配所有,使用* stream_get_line块读取文件*以字符串' [LINETYPE] S'(例如)作为第三个参数。 –
不确定''LINETYPE“=>”M“'是一个需要保留的相关信息,它似乎与文件格式本身更相关。 –