Calling wkhtmltopdf to generate a PDF from HTML

Date: 2022-01-31 05:50:27

I'm attempting to create a PDF file from an HTML file. After looking around a little, I found wkhtmltopdf, which looks perfect. I need to call this .exe from an ASP.NET server. I've tried:

    Process p = new Process();
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
    p.StartInfo.Arguments = "TestPDF.htm TestPDF.pdf";
    p.Start();
    p.WaitForExit();

No files are created on the server. Can anyone point me in the right direction? I put wkhtmltopdf.exe in the top-level directory of the site. Should it live somewhere else?


Edit: If anyone has better solutions to dynamically create pdf files from html, please let me know.

10 answers

#1


50  

Update:
My answer below creates the PDF file on disk. I then streamed that file to the user's browser as a download. Consider using something like Hath's answer below to get wkhtmltopdf to write to a stream instead and send that directly to the user; that bypasses a lot of file-permission issues.

My original answer:
Make sure you've specified an output path for the PDF that is writable by the ASP.NET process of IIS running on your server (usually NETWORK_SERVICE, I think).

Mine looks like this (and it works):

/// <summary>
/// Convert Html page at a given URL to a PDF file using open-source tool wkhtml2pdf
/// </summary>
/// <param name="Url"></param>
/// <param name="outputFilename"></param>
/// <returns></returns>
public static bool HtmlToPdf(string Url, string outputFilename)
{
    // assemble destination PDF file name
    string filename = ConfigurationManager.AppSettings["ExportFilePath"] + "\\" + outputFilename + ".pdf";

    // get proj no for header
    Project project = new Project(int.Parse(outputFilename));

    var p = new System.Diagnostics.Process();
    p.StartInfo.FileName = ConfigurationManager.AppSettings["HtmlToPdfExePath"];

    string switches = "--print-media-type ";
    switches += "--margin-top 4mm --margin-bottom 4mm --margin-right 0mm --margin-left 0mm ";
    switches += "--page-size A4 ";
    switches += "--no-background ";
    switches += "--redirect-delay 100";

    p.StartInfo.Arguments = switches + " " + Url + " " + filename;

    p.StartInfo.UseShellExecute = false; // needs to be false in order to redirect output
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true; // redirect all 3, as it should be all 3 or none
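    // StripFilenameFromFullPath was a helper not shown here; Path.GetDirectoryName does the same job
    p.StartInfo.WorkingDirectory = System.IO.Path.GetDirectoryName(p.StartInfo.FileName);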

    p.Start();

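    // stderr is redirected too, so drain it in the background;
    // otherwise a full stderr pipe could block the process
    p.ErrorDataReceived += (sender, e) => { };
    p.BeginErrorReadLine();

    // read the output here...
    string output = p.StandardOutput.ReadToEnd();

    // ...then wait up to 60 seconds for exit; ExitCode throws if the process is still running
    if (!p.WaitForExit(60000))
    {
        p.Kill();
        p.Close();
        return false;
    }

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close();

    // if 0 or 2, it worked (not sure about other values, I want a better way to confirm this)
    return (returnCode == 0 || returnCode == 2);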
}

#2


40  

I had the same problem when I tried using MSMQ with a Windows service, but it was very slow for some reason (the process part).

This is what finally worked:

private void DoDownload()
{
    // the second query-string parameter must be joined with '&', not a second '?'
    var url = Request.Url.GetLeftPart(UriPartial.Authority) + "/CPCDownload.aspx?IsPDF=False&UserID=" + this.CurrentUser.UserID.ToString();
    var file = WKHtmlToPdf(url);
    if (file != null)
    {
        Response.ContentType = "Application/pdf";
        Response.BinaryWrite(file);
        Response.End();
    }
}

public byte[] WKHtmlToPdf(string url)
{
    var fileName = " - ";
    var wkhtmlDir = "C:\\Program Files\\wkhtmltopdf\\";
    var wkhtml = "C:\\Program Files\\wkhtmltopdf\\wkhtmltopdf.exe";
    var p = new Process();

    p.StartInfo.CreateNoWindow = true;
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true;
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = wkhtml;
    p.StartInfo.WorkingDirectory = wkhtmlDir;

    string switches = "";
    switches += "--print-media-type ";
    switches += "--margin-top 10mm --margin-bottom 10mm --margin-right 10mm --margin-left 10mm ";
    switches += "--page-size Letter ";
    p.StartInfo.Arguments = switches + " " + url + " " + fileName;
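    p.Start();

    // stderr is redirected, so drain it; otherwise a full pipe could block wkhtmltopdf
    p.ErrorDataReceived += (s, e) => { };
    p.BeginErrorReadLine();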

    //read output
    byte[] buffer = new byte[32768];
    byte[] file;
    using(var ms = new MemoryStream())
    {
        while(true)
        {
            int read =  p.StandardOutput.BaseStream.Read(buffer, 0,buffer.Length);

            if(read <=0)
            {
                break;
            }
            ms.Write(buffer, 0, read);
        }
        file = ms.ToArray();
    }

    // wait for exit; ExitCode throws if the process is still running
    if (!p.WaitForExit(60000))
    {
        p.Kill();
        p.Close();
        return null;
    }

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close();

    return returnCode == 0 ? file : null;
}

Thanks Graham Ambrose and everyone else.

#3


16  

OK, so this is an old question, but an excellent one. And since I did not find a good answer, I made my own :) Also, I've posted this super simple project to GitHub.

Here is some sample code:

var pdfData = HtmlToXConverter.ConvertToPdf("<h1>SOO COOL!</h1>");

Here are some key points:

  • No P/Invoke
  • No creating of a new process
  • No file-system (all in RAM)
  • Native .NET DLL with intellisense, etc.
  • Ability to generate a PDF or PNG (HtmlToXConverter.ConvertToPng)

#4


6  

Check out the C# wrapper library (using P/Invoke) for the wkhtmltopdf library: https://github.com/pruiz/WkHtmlToXSharp

#5


5  

There are many reasons why this is generally a bad idea. How are you going to control executables that get spawned but end up living on in memory after a crash? What about denial-of-service attacks, or something malicious getting into TestPDF.htm?

My understanding is that the ASP.NET user account will not have the right to log on locally. It also needs the correct file permissions to access the executable and to write to the file system. You need to edit the local security policy and let the ASP.NET user account (maybe ASPNET) log on locally (it may be in the deny list by default). Then you need to edit the NTFS permissions for the other files. In a shared hosting environment it may be impossible to apply the configuration you need.

The best way to use an external executable like this is to queue jobs from the ASP.NET code and have some sort of service monitor the queue. Doing so protects you from all sorts of bad things happening. In my opinion the maintenance issues that come with changing the user account are not worth the effort, and whilst setting up a service or scheduled job is a pain, it's just a better design. The ASP.NET page should poll a result queue for the output, and you can present the user with a wait page in the meantime. This is acceptable in most cases.
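
The queue-and-poll design described above could be sketched roughly as follows. This is an in-process stand-in for the separate service the answer recommends, not that service itself; `PdfJob`, `PdfJobQueue`, and the `render` delegate are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical job description: which page to render and under what id.
public sealed class PdfJob
{
    public string Id;
    public string Url;
}

public sealed class PdfJobQueue : IDisposable
{
    private readonly BlockingCollection<PdfJob> _pending = new BlockingCollection<PdfJob>();
    private readonly ConcurrentDictionary<string, byte[]> _results = new ConcurrentDictionary<string, byte[]>();
    private readonly Task _worker;

    // 'render' stands in for whatever produces the PDF bytes (for example, a call to wkhtmltopdf).
    public PdfJobQueue(Func<string, byte[]> render)
    {
        _worker = Task.Run(() =>
        {
            // Single consumer: at most one conversion runs at a time.
            foreach (var job in _pending.GetConsumingEnumerable())
            {
                byte[] pdf;
                try { pdf = render(job.Url); }
                catch (Exception) { pdf = null; } // a null result marks a failed job
                _results[job.Id] = pdf;
            }
        });
    }

    // Called by the page that starts the conversion; returns the id to poll with.
    public string Enqueue(string url)
    {
        var id = Guid.NewGuid().ToString("N");
        _pending.Add(new PdfJob { Id = id, Url = url });
        return id;
    }

    // Called by the wait page: returns false until the job has finished.
    public bool TryGetResult(string id, out byte[] pdf)
    {
        return _results.TryRemove(id, out pdf);
    }

    // Stops accepting jobs and waits for the queued ones to finish.
    public void Dispose()
    {
        _pending.CompleteAdding();
        _worker.Wait();
        _pending.Dispose();
    }
}
```

Because only one worker consumes the queue, a burst of requests cannot spawn an unbounded number of converter processes, which is the main protection the answer is after.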

#6


4  

You can tell wkhtmltopdf to send its output to stdout by specifying "-" as the output file. You can then read the output from the process into the response stream and avoid the permission issues that come with writing to the file system.
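
A minimal sketch of that approach, assuming wkhtmltopdf is installed at whatever path you pass in. `PdfCapture`, `BuildArguments`, and `RunAndCaptureStdout` are hypothetical names, not part of any library.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class PdfCapture
{
    // Quotes the URL and passes "-" as the output file so the PDF goes to stdout.
    public static string BuildArguments(string url)
    {
        return "\"" + url + "\" -";
    }

    // Runs an executable and returns everything it wrote to stdout, or null on failure.
    public static byte[] RunAndCaptureStdout(string exePath, string arguments, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        using (var buffer = new MemoryStream())
        {
            // Drain stderr in the background so a full pipe cannot stall the child process.
            process.ErrorDataReceived += (sender, e) => { };
            process.BeginErrorReadLine();

            // Copy raw bytes; reading stdout as text would corrupt binary PDF data.
            // This call returns only when the child closes stdout.
            process.StandardOutput.BaseStream.CopyTo(buffer);

            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();
                return null;
            }
            return process.ExitCode == 0 ? buffer.ToArray() : null;
        }
    }
}
```

In a page handler the returned bytes could then be written with `Response.BinaryWrite`, as the `DoDownload` method in answer #2 does.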

#7


2  

Thanks for the question / answer / all the comments above. I came upon this when I was writing my own C# wrapper for WKHTMLtoPDF and it answered a couple of the problems I had. I ended up writing about this in a blog post - which also contains my wrapper (you'll no doubt see the "inspiration" from the entries above seeping into my code...)

http://icanmakethiswork.blogspot.de/2012/04/making-pdfs-from-html-in-c-using.html

Thanks again guys!

#8


0  

The ASP.NET process probably doesn't have write access to the directory.

Try telling it to write to %TEMP%, and see if it works.

Also, make your ASP.NET page echo the process's stdout and stderr, and check for error messages.
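
Both suggestions can be combined in one diagnostic helper. The sketch below is only an illustration: `PdfDiagnostics`, `NewTempPdfPath`, and `RunAndDescribe` are hypothetical names, and the executable path is whatever you pass in.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class PdfDiagnostics
{
    // Builds a unique output path under the temp directory of the account running the code.
    public static string NewTempPdfPath()
    {
        return Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".pdf");
    }

    // Runs the converter and returns its exit code plus anything it wrote to stderr.
    public static string RunAndDescribe(string exePath, string url, string outputPath, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = "\"" + url + "\" \"" + outputPath + "\"",
            UseShellExecute = false,       // required for stream redirection
            RedirectStandardError = true,  // only stderr is redirected, so one blocking read is safe
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Blocks until the child closes stderr, which normally happens when it exits.
            string errors = process.StandardError.ReadToEnd();
            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();
                return "timed out; stderr so far: " + errors;
            }
            return "exit code " + process.ExitCode + "; stderr: " + errors;
        }
    }
}
```

If the returned text is written into a page, HTML-encode it first (for example with `HttpUtility.HtmlEncode`), since it may contain markup from the rendered URL.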

#9


0  

Generally the return code is 0 if the PDF file was created correctly. If it was not created, the value is in the negative range.
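
Since the answers in this thread disagree about which exit codes mean success, one complementary check is to look at the output itself: a PDF file begins with the marker `%PDF-`. The helper below is a hypothetical sketch of that check (`PdfCheck.LooksLikePdf` is not part of any library), and it only confirms the header, not that the document is complete.

```csharp
using System;
using System.Text;

public static class PdfCheck
{
    // True when the data begins with the "%PDF-" marker that opens a PDF file.
    public static bool LooksLikePdf(byte[] data)
    {
        byte[] marker = Encoding.ASCII.GetBytes("%PDF-");
        if (data == null || data.Length < marker.Length)
        {
            return false;
        }
        for (int i = 0; i < marker.Length; i++)
        {
            if (data[i] != marker[i])
            {
                return false;
            }
        }
        return true;
    }
}
```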

#10


-1  

using System;
using System.Diagnostics;
using System.Web;

public partial class pdftest : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {

    }
    private void fn_test()
    {
        try
        {
            string url = HttpContext.Current.Request.Url.AbsoluteUri;
            Response.Write(url);
            ProcessStartInfo startInfo = new ProcessStartInfo();
            startInfo.FileName = 
                @"C:\PROGRA~1\WKHTML~1\wkhtmltopdf.exe";//"wkhtmltopdf.exe";
            startInfo.Arguments = url + @" C:\test"
                 + Guid.NewGuid().ToString() + ".pdf";
            Process.Start(startInfo);
        }
        catch (Exception ex)
        {
            string xx = ex.Message.ToString();
            Response.Write("<br>" + xx);
        }
    }
    protected void btn_test_Click(object sender, EventArgs e)
    {
        fn_test();
    }
}
