Pretty-printing XML files in Emacs

Date: 2022-12-01 14:01:51

I use Emacs (nxml-mode) to edit my XML files. The files are machine-generated, so their tags have no pretty formatting.

I have searched for a way to pretty-print an entire file with indentation and save it, but could not find an automatic one.

Is there a way? Or at least some editor on Linux that can do it?

15 answers

#1


24  

I use nXML mode for editing and Tidy when I want to format and indent XML or HTML. There is also an Emacs interface to Tidy.

#2


96  

You don't even need to write your own function: sgml-mode (a GNU Emacs core module) has a built-in pretty-printing function, sgml-pretty-print, which takes the region's beginning and end as arguments.

If you are cutting and pasting XML and find that your terminal chops the lines in arbitrary places, you can use this pretty printer, which fixes broken lines first.

#3


85  

If you only need pretty indenting without introducing any new line-breaks, you can apply the indent-region command to the entire buffer with these keystrokes:

C-x h
C-M-\
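
The two keystrokes above can also be wrapped in a single command. This is only a minimal sketch: the name my-indent-whole-buffer is hypothetical (it is not part of Emacs or nxml-mode), and it assumes the buffer is already in a mode with XML-aware indentation such as nxml-mode.

```elisp
;; Hypothetical helper, not a built-in command.
(defun my-indent-whole-buffer ()
  "Re-indent the entire buffer using the current major mode's rules.
No line breaks are added or removed; only leading whitespace changes."
  (interactive)
  (save-excursion
    (indent-region (point-min) (point-max))))
```

Call it with M-x my-indent-whole-buffer. Like C-x h followed by C-M-\, it does not insert new line breaks.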

If you also need to introduce line-breaks, so that opening and closing tags are on separate lines, you could use the following very nice elisp function, written by Benjamin Ferrari. I found it on his blog and hope it's ok for me to reproduce it here:

(defun bf-pretty-print-xml-region (begin end)
  "Pretty format XML markup in region. You need to have nxml-mode
http://www.emacswiki.org/cgi-bin/wiki/NxmlMode installed to do
this.  The function inserts linebreaks to separate tags that have
nothing but whitespace between them.  It then indents the markup
by using nxml's indentation rules."
  (interactive "r")
  (save-excursion
    (nxml-mode)
    (goto-char begin)
    ;; Match "><" with only spaces or tabs in between.  The bound is nil,
    ;; so the search continues to the end of the buffer.
    (while (search-forward-regexp ">[ \t]*<" nil t)
      (backward-char) (insert "\n"))
    (indent-region begin end))
  (message "Ah, much better!"))

This doesn't rely on an external tool like Tidy.

#4


33  

Emacs can run an arbitrary shell command on the region with M-|. If you have xmllint installed:

"M-| xmllint --format -" will format the selected region and show the result without changing the buffer

"C-u M-| xmllint --format -" will do the same, replacing the region with the output

#5


18  

Thanks to Tim Helmstedt above, I made something like this:

(defun nxml-pretty-format ()
  (interactive)
  (save-excursion
    ;; Replace the whole buffer with the output of xmllint.
    (shell-command-on-region (point-min) (point-max) "xmllint --format -" (buffer-name) t)
    (nxml-mode)
    ;; Re-indent the whole buffer with nxml's rules.
    (indent-region (point-min) (point-max))))

Fast and easy. Many thanks.

#6


13  

To introduce line breaks and then pretty-print:

M-x sgml-mode
M-x sgml-pretty-print

#7


7  

Here are a few tweaks I made to Benjamin Ferrari's version:

  • The search-forward-regexp didn't specify an end, so it would operate on everything from the beginning of the region to the end of the buffer (instead of the end of the region).
  • It now increments end properly, as Cheeso noted.
  • It would insert a break between <tag></tag>, which modifies its value. Yes, technically we're modifying the values of everything here, but an empty start/end pair is much more likely to be significant. It now uses two separate, slightly stricter searches to avoid that.

It still doesn't rely on an external tool like Tidy. However, it does require cl-lib for the cl-incf macro.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; pretty print xml region
(require 'cl-lib)

(defun pretty-print-xml-region (begin end)
  "Pretty format XML markup in region. You need to have nxml-mode
http://www.emacswiki.org/cgi-bin/wiki/NxmlMode installed to do
this.  The function inserts linebreaks to separate tags that have
nothing but whitespace between them.  It then indents the markup
by using nxml's indentation rules."
  (interactive "r")
  (save-excursion
    (nxml-mode)
    (goto-char begin)
    ;; split <foo><foo> or </foo><foo>, but not <foo></foo>
    (while (search-forward-regexp ">[ \t]*<[^/]" end t)
      (backward-char 2) (insert "\n") (cl-incf end))
    ;; split <foo/></foo> and </foo></foo>
    (goto-char begin)
    (while (search-forward-regexp "<.*?/.*?>[ \t]*<" end t)
      (backward-char) (insert "\n") (cl-incf end))
    (indent-region begin end nil)
    (normal-mode))
  (message "All indented!"))

#8


5  

One way to do it: suppose you have something in the format below.

<abc>     <abc><abc>   <abc></abc> </abc></abc>       </abc>

In Emacs, try

M-x nxml-mode
M-x replace-regexp RET  > *< RET >C-q C-j< RET 
C-M-\ to indent

This will indent the XML example above as follows:

<abc>
  <abc>
    <abc>
      <abc>
      </abc>
    </abc>
  </abc>
</abc>

In Vim you can do this with:

:set ft=xml
:%s/>\s*</>\r</g
ggVG=

Hope this helps.

#9


2  

  1. Emacs nxml-mode can work on the presented format, but you'll have to split the lines.
  2. For longer files that simply isn't worth it. Run this stylesheet (ideally with Saxon, which IMHO gets the line indents about right) against longer files to get a nice pretty print. For any elements where you want to retain whitespace, add their names alongside 'programlisting', as in 'programlisting yourElementName'.

HTH

#10


2  

I took Jason Viers' version and added logic to put xmlns declarations on their own lines. This assumes that you have xmlns= and xmlns: with no intervening whitespace.

(require 'cl-lib)  ; provides cl-incf

(defun cheeso-pretty-print-xml-region (begin end)
  "Pretty format XML markup in region. You need to have nxml-mode
http://www.emacswiki.org/cgi-bin/wiki/NxmlMode installed to do
this.  The function inserts linebreaks to separate tags that have
nothing but whitespace between them.  It then indents the markup
by using nxml's indentation rules."
  (interactive "r")
  (save-excursion
    (nxml-mode)
    ;; split <foo><bar> or </foo><bar>, but not <foo></foo>
    (goto-char begin)
    (while (search-forward-regexp ">[ \t]*<[^/]" end t)
      (backward-char 2) (insert "\n") (cl-incf end))
    ;; split <foo/></foo> and </foo></foo>
    (goto-char begin)
    (while (search-forward-regexp "<.*?/.*?>[ \t]*<" end t)
      (backward-char) (insert "\n") (cl-incf end))
    ;; put xml namespace decls on newline
    (goto-char begin)
    (while (search-forward-regexp "\\(<\\([a-zA-Z][-:A-Za-z0-9]*\\)\\|['\"]\\) \\(xmlns[=:]\\)" end t)
      (goto-char (match-end 0))
      ;; step back over "xmlns=" or "xmlns:" (6 characters)
      (backward-char 6) (insert "\n") (cl-incf end))
    (indent-region begin end nil)
    (normal-mode))
  (message "All indented!"))

#11


1  

Tidy looks like a good mode. I must take a look at it, and will use it if I really need all the features it offers.

Anyway, this problem had been nagging me for about a week and I wasn't searching properly. After posting, I started searching and found one site with an elisp function that does it pretty well. The author also suggests using Tidy.

Thanks for the answer, Marcel (too bad I don't have enough points to upmod you).

I will post about it soon on my blog. Here is a post about it (with a link to Marcel's site).

#12


1  

I use xml-reformat-tags from xml-parse.el. Usually you will want point at the beginning of the file when running this command.

Interestingly, the file is incorporated into Emacspeak. When I was using Emacspeak on a day-to-day basis, I thought xml-reformat-tags was an Emacs builtin. One day I lost it and had to search the internet for it, which is how I found the wiki page mentioned above.

I'm also attaching my code to load xml-parse. I'm not sure this is the best piece of Emacs code, but it seems to work for me.

(when (file-exists-p "~/.emacs.d/packages/xml-parse.el")
  ;; Bind load-path locally so the extra directory is only on it
  ;; while xml-parse is being loaded.
  (let ((load-path load-path))
    (add-to-list 'load-path "~/.emacs.d/packages")
    (require 'xml-parse)))

#13


1  

If you use spacemacs, just use command 'spacemacs/indent-region-or-buffer'.

M-x spacemacs/indent-region-or-buffer

#14


0  

I'm afraid I like Benjamin Ferrari's version much better. The internal pretty printer always places the end tag on a new line after the value, inserting an unwanted CR into the tag values.

#15


0  

As of 2017, Emacs already comes with this capability by default, but you have to add this little function to your ~/.emacs.d/init.el:

(require 'sgml-mode)

(defun reformat-xml ()
  (interactive)
  (save-excursion
    (sgml-pretty-print (point-min) (point-max))
    (indent-region (point-min) (point-max))))

Then just call M-x reformat-xml.

source: https://davidcapello.com/blog/emacs/reformat-xml-on-emacs/
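
Since the question asks for an automatic way, one possible follow-up is to run the command before every save. This is only a sketch under assumptions: it relies on the reformat-xml function defined in this answer, the helper name my-nxml-reformat-on-save is hypothetical, and rewriting whitespace on every save may be unwanted for whitespace-sensitive documents.

```elisp
;; Hypothetical helper; assumes `reformat-xml' from this answer is defined.
(defun my-nxml-reformat-on-save ()
  "Arrange for `reformat-xml' to run before each save in this buffer."
  ;; The final t makes the hook buffer-local, so other buffers are unaffected.
  (add-hook 'before-save-hook #'reformat-xml nil t))

;; Enable the behaviour in every buffer that enters nxml-mode.
(add-hook 'nxml-mode-hook #'my-nxml-reformat-on-save)
```

To turn it off again, remove my-nxml-reformat-on-save from nxml-mode-hook with remove-hook.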

