Converting an HTML table to text with Perl

Date: 2022-12-09 08:58:41

I have HTML table content that I am trying to convert to text with the same structure, using HTML::TreeBuilder and HTML::FormatText in Perl. I have tried this code:

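use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

my $raw_html = '';    # the HTML table markup goes here
# Parse the markup into a tree, then render the tree as plain text.
my $tree = HTML::TreeBuilder->new_from_content($raw_html);
print $tree->format(HTML::FormatText->new);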

The expected output is:

data1            data1_value

data2            data2_value

data3            data3_value

But the output I get is:

data1

data1_value

data2

data2_value

data3

data3_value

I would appreciate any suggestions.

1 solution

#1


The documentation of HTML::FormatText states "Formatting of HTML tables and forms is not implemented."

So you will need to find another approach. HTML::TableExtract is a likely candidate.
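
As a rough illustration of that suggestion, here is a minimal sketch using HTML::TableExtract. The question does not show its HTML, so the sample markup, the helper name table_to_text, and the column width of 17 are all assumptions made for this example, not part of the original post.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample input; the original post does not show its HTML.
my $raw_html = <<'HTML';
<table>
  <tr><td>data1</td><td>data1_value</td></tr>
  <tr><td>data2</td><td>data2_value</td></tr>
  <tr><td>data3</td><td>data3_value</td></tr>
</table>
HTML

# Hypothetical helper: returns one text line per table row,
# with each cell left-aligned and padded to $width characters.
sub table_to_text {
    my ($html, $width) = @_;

    my $te = HTML::TableExtract->new;    # no constraints: match every table
    $te->parse($html);
    $te->eof;                            # inherited from HTML::Parser

    my @lines;
    for my $table ($te->tables) {
        for my $row ($table->rows) {
            # Missing cells come back as undef; treat them as empty strings.
            my @cells = map { defined $_ ? $_ : '' } @$row;
            s/^\s+|\s+$//g for @cells;   # trim whitespace around each cell
            my $line = join '', map { sprintf '%-*s', $width, $_ } @cells;
            $line =~ s/\s+$//;           # drop padding after the last cell
            push @lines, $line;
        }
    }
    return @lines;
}

print "$_\n" for table_to_text($raw_html, 17);
```

The fixed width is the simplest choice here; if cell lengths vary a lot, one could instead compute each column's maximum length first and pad to that.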
