What options are available for defining a Python package with node.js dependencies?

Date: 2021-10-26 17:13:08

Currently, I have a few (unpublished) Python packages in local use, which I install (for development purposes) with a Bash script on Linux into an activated (otherwise "empty") virtual environment in the following manner:

cd /root/of/python/package
pip install -r requirements_python.txt # includes "nodeenv"
nodeenv -p # pulls node.js and integrates it into my virtual environment
npm i -g npm # update npm ...
cat requirements_node.txt | xargs npm install -g
pip install -e .

The background is that I have a number of node.js dependencies, JavaScript CLI scripts, which are called by my Python code.

Pros of current approach:

  • dead simple: relies on nodeenv for all required plumbing
  • can theoretically be implemented within setup.py with subprocess.Popen etc
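
The second point, driving the same steps from Python with subprocess, could look roughly like the sketch below. It is only an illustration under assumptions: the function names (`parse_requirements`, `build_install_commands`, `run_install`) are made up for this example, the file names are the ones used in the Bash script, and it still expects an activated virtual environment so that `nodeenv -p` and `npm` resolve to the right place.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def parse_requirements(text):
    """Return the non-empty, non-comment lines of a requirements-style file."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def build_install_commands(node_packages, python=sys.executable):
    """Mirror the Bash steps as argument lists, without running anything."""
    commands = [
        [python, "-m", "pip", "install", "-r", "requirements_python.txt"],
        ["nodeenv", "-p"],                # adds node.js to the active virtual environment
        ["npm", "install", "-g", "npm"],  # update npm itself
    ]
    if node_packages:
        commands.append(["npm", "install", "-g", *node_packages])
    commands.append([python, "-m", "pip", "install", "-e", "."])
    return commands


def run_install(root):
    """Run every step in order inside `root`, stopping at the first failure."""
    root = Path(root)
    node_packages = parse_requirements((root / "requirements_node.txt").read_text())
    for command in build_install_commands(node_packages):
        # Look the executable up on PATH at the moment it is needed, so that
        # tools installed by an earlier step (npm) can be located.
        executable = shutil.which(command[0]) or command[0]
        subprocess.run([executable, *command[1:]], cwd=root, check=True)
```

Calling `run_install("/root/of/python/package")` would then replace the Bash script. Keeping the command list separate from its execution makes the plan easy to inspect, but it does not remove any of the cons listed below.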

Cons of current approach:

  • Unix-like platforms with Bash only
  • "hard" to distribute my packages, say on PyPI
  • requires a virtual environment
  • has potentially "interesting" side effects if a package is installed globally
  • potentially interferes with a pre-existing configuration / "deployment" of nodeenv in the current virtual environment

What is the canonical (if there is any) or just a sane, potentially cross-platform approach of defining node.js dependencies for a Python package, making it publishable?

Why is this question even relevant? JavaScript is not just for web development (any more). There are also interesting (relevant) data processing tools out there. If you do not want to miss / ignore them, well, welcome to this particular form of hell.


I recently came across calmjs, which appears to be what I am looking for. I have not experimented much with it yet and it also appears to be a relatively young project.

I started an issue there asking a similar question.


EDIT (1): Interesting resource: JavaScript versus Research Computing - A Brief Guide for Those Who Regret That This Has Become Necessary


EDIT (2): I started an issue against nodeenv, asking how I could make a project depend on it.

2 solutions

#1


7  

(Disclaimer: I am the author of calmjs)

After mulling over this particular issue for another few days, I think this question actually encapsulates multiple problems, which may or may not be orthogonal to each other depending on one's point of view. Some of them are (the list is not exhaustive):

  1. How can a developer ensure that they have all the information required to install the package when given one?
  2. How does a project ensure that the ground it is standing on is solid (i.e. has all the dependencies required)?
  3. How easy is it for the user to install the given project?
  4. How easy is it to reproduce a given build?

For a single-language, single-platform project, the first question is trivially answered: just use whatever package management solution is implemented for that language (e.g. PyPI for Python, npm for Node.js). The other questions generally fall into place.

For a multi-language, multi-platform project, this is where it completely falls apart. Long story short, this is why projects generally have multiple sets of installation instructions for whatever version of Windows, Mac or Linux (of various mainstream distros) they support, especially for software in binary form. These address the third question, making installation easy for the end user (which usually ends up being doable, but not necessarily easy).

Developers and system integrators, who are definitely more interested in questions 2 and 4, likely want an automation script for whatever platform they are on. This is kind of what you already have, except that it only works on Linux, or wherever Bash is available. Now this also raises the question: how does one ensure Bash is available on the system? Some system administrators may prefer some other shell, so we are back to the same problem again, but instead of asking whether Node.js is there, we have to ask whether Bash is there. So this problem is basically unsolvable unless a line is drawn.

The first question hasn't really been addressed yet, and I am going to make this fun by asking it in this manner: given a package from npm that requires a Python package, how does one specify a dependency on PyPI? It turns out that such a project exists: nopy. I have not used it before, but at a casual glance it provides a specific way to record dependency information in the package.json file, which is the standard way for a Node.js package to convey information about itself. Do note that it has a non-standard way of managing Python packages; however, given that it uses whatever Python is available, it will probably do the right thing if a Python virtual environment is activated. Doing it this way also means that dependants of a Node.js package may have a way to figure out the Python dependencies declared by their Node.js dependencies. Note, though, that without something else on top of it (or some other ground/line), there is no way to assert from within the environment that it is guaranteed to do what needs to be done.

Naturally, coming back to Python, this question has been asked before (but not necessarily in a useful way specifically to you as the contexts are all different):

Anyway, calmjs only solves problem 1, i.e. it lets developers figure out the Node.js packages they need from a given Python package, and to a lesser extent it assists with problem 4; but without the guarantees of 2 and 3, the overall problem is not exactly solved.
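
To make the idea behind problem 1 concrete, here is a minimal, generic sketch of a Python package recording its Node.js dependencies as data and rendering them into a package.json that npm can consume. This is not the calmjs API; the helper names and the package names in the usage note are hypothetical and exist only for illustration.

```python
import json


def merge_node_dependencies(*declarations):
    """Merge several {package: version-range} mappings; later ones win."""
    merged = {}
    for declaration in declarations:
        merged.update(declaration)
    # Sort by package name so the generated file is stable across runs.
    return dict(sorted(merged.items()))


def render_package_json(name, dependencies):
    """Render a minimal package.json document as a string."""
    manifest = {"name": name, "private": True, "dependencies": dependencies}
    return json.dumps(manifest, indent=2)
```

For example, `render_package_json("my-python-package", merge_node_dependencies({"some-cli-tool": "^1.0.0"}))` yields a manifest that could be written to disk before running `npm install`. It only tells a developer what is needed; it guarantees nothing about whether Node.js itself is present.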

From the point of view of Python dependency management, there is no way to guarantee that the required external tools are available until their usage is attempted (it will either work or not work; the same holds from the Node.js side, as explained earlier. Thank you for your question on the issue tracker, by the way). If this particular guarantee is required, many system integrators would make use of their favorite operating-system-level package manager (e.g. dpkg/apt, rpm/yum, or whatever else on Linux, Homebrew on OS X, perhaps Chocolatey on Windows), but again this requires further dependencies to be installed. Hence, if multiple platforms are to be supported, there is no general solution unless one reduces the scope, or has some kind of standard continuous integration that generates working installation images, which one would then deploy onto whatever virtualisation services the organisation uses (just an example).
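
Since availability can only be established when the tool is actually needed, one pragmatic option is to check at call time and fail with a clear message. A minimal sketch, assuming the external tool is expected on PATH; `MissingToolError` and `require_tool` are names invented for this example:

```python
import shutil


class MissingToolError(RuntimeError):
    """Raised when a required external executable cannot be located."""


def require_tool(name):
    """Return the full path of `name` as found on PATH, or raise MissingToolError."""
    found = shutil.which(name)
    if found is None:
        raise MissingToolError(
            f"{name!r} was not found on PATH; install it first "
            "(for Node.js tools, for example via nodeenv or a system package manager)"
        )
    return found
```

A caller would use something like `node = require_tool("node")` right before spawning the JavaScript CLI. This does not provide the guarantee discussed above; it only turns a confusing failure into an explicit one.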

Without all the specific baselines, it is very difficult to give an answer to this question that satisfies all parties involved.

#2


5  

What you describe is certainly not the simplest problem. For Python alone, companies came up with all kinds of packaging methods (e.g. Twitter's pex, Spotify's dh-virtualenv, or even grocker, which shifts Python deployments into container space).

That said, one very hacky way I could think of would be:

  • Find a way to compile your Node apps into a single binary. There is pkg (a blogpost about it), which

[...] enables you to package your Node.js project into an executable that can be run even on devices without Node.js installed.

This way the Node tools would be taken care of.

  • Next, take these binary blobs and add them (somehow) as scripts to your Python package, so that they get distributed along with it and end up in a place where your actual Python code can pick them up and execute them.
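
The "pick them up and execute them" part could be sketched as follows. Everything here is an assumption made for illustration: the `bin/` directory inside the package, the `<tool>-<system>-<machine>` file naming scheme, and the function names are hypothetical, not a convention of pkg or of any packaging tool.

```python
import platform
import subprocess
from pathlib import Path


def bundled_binary_name(tool, system=None, machine=None):
    """Build the (hypothetical) file name of the binary for one platform."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    suffix = ".exe" if system == "windows" else ""
    return f"{tool}-{system}-{machine}{suffix}"


def run_bundled(tool, args, package_dir):
    """Run the binary shipped in <package_dir>/bin for the current platform."""
    binary = Path(package_dir) / "bin" / bundled_binary_name(tool)
    if not binary.is_file():
        raise FileNotFoundError(f"no bundled binary for this platform: {binary}")
    # Capture output as text so the calling Python code can process it.
    return subprocess.run(
        [str(binary), *args], check=True, capture_output=True, text=True
    )
```

Inside the package, `package_dir` would typically be `Path(__file__).parent`, and the binaries would need to be included in the distribution as package data.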

Upsides:

  • Users do not need any Node.js on their machine (which is probably what they expect when they just want to pip install something).
  • Your package gets more self-contained by including binaries.

Downsides:

  • Your Python package will include binaries, which is less common.
  • Including binaries means that you will have to prepare versions for all platforms. Not impossible, but more work.
  • You will have to expand your package creation pipeline (Makefile, setup.py, or other) a bit to make this simple and repeatable.
  • Your package gets significantly larger (which is probably the least of the problems today).
