How can I read a file that is inside my Python package?
My situation
A package that I load has a number of templates (text files used as strings) that I want to load from within the program. How do I specify the path to such a file?
Imagine I want to read a file from:
package\templates\temp_file
Some kind of path manipulation? Package base path tracking?
7 answers
#1
-6
[Added 2016-06-15: apparently this doesn't work in all situations. Please refer to the other answers.]
import os, mypackage
template = os.path.join(mypackage.__path__[0], 'templates', 'temp_file')
#2
82
Assuming your template is located inside your module's package at this path:
<your_package>/templates/temp_file
the correct way to read your template is to use the pkg_resources package from the setuptools distribution:
import pkg_resources
resource_package = __name__ # Could be any module/package name
resource_path = '/'.join(('templates', 'temp_file')) # Do not use os.path.join(), see below
template = pkg_resources.resource_string(resource_package, resource_path)
# or for a file-like stream:
template = pkg_resources.resource_stream(resource_package, resource_path)
Tip:
This will read data even if your distribution is zipped, so you may set zip_safe=True in your setup.py, and/or use the long-awaited zipapp packer from Python 3.5 to create self-contained distributions.
According to the Setuptools/pkg_resources docs, do not use os.path.join:
Basic Resource Access
Note that resource names must be /-separated paths and cannot be absolute (i.e. no leading /) or contain relative names like "..". Do not use os.path routines to manipulate resource paths, as they are not filesystem paths.
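To illustrate the two points above (resource names are '/'-separated, and reading still works when the distribution is zipped), here is a minimal sketch. It swaps pkg_resources for the standard library's pkgutil.get_data, which also takes a '/'-separated resource name, so that the example runs without setuptools. The package name zipped_demo_pkg and the archive name are made up for the demonstration.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zipped distribution containing a hypothetical package
# "zipped_demo_pkg" with one template file inside it.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "zipped_demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_demo_pkg/__init__.py", "")
    zf.writestr("zipped_demo_pkg/templates/temp_file", "Hello, $name!\n")

# Putting the archive on sys.path makes the package importable via zipimport.
sys.path.insert(0, archive)
importlib.invalidate_caches()

# The resource name is '/'-separated and relative to the package;
# os.path.join() is deliberately not used to build it.
resource_path = "/".join(("templates", "temp_file"))
template = pkgutil.get_data("zipped_demo_pkg", resource_path)  # bytes

print(template.decode("utf-8"))
```

Like pkg_resources.resource_string, pkgutil.get_data returns bytes, so decode the result if you need a str.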
#3
5
If you have this structure:
lidtk
├── bin
│ └── lidtk
├── lidtk
│ ├── analysis
│ │ ├── char_distribution.py
│ │ └── create_cm.py
│ ├── classifiers
│ │ ├── char_dist_metric_train_test.py
│ │ ├── char_features.py
│ │ ├── cld2
│ │ │ ├── cld2_preds.txt
│ │ │ └── cld2wili.py
│ │ ├── get_cld2.py
│ │ ├── text_cat
│ │ │ ├── __init__.py
│ │ │ ├── REAMDE.md <---------- say you want to get this
│ │ │ └── textcat_ngram.py
│ │ └── tfidf_features.py
│ ├── data
│ │ ├── __init__.py
│ │ ├── create_ml_dataset.py
│ │ ├── download_documents.py
│ │ ├── language_utils.py
│ │ ├── pickle_to_txt.py
│ │ └── wili.py
│ ├── __init__.py
│ ├── get_predictions.py
│ ├── languages.csv
│ └── utils.py
├── README.md
├── setup.cfg
└── setup.py
you need this code:
import pkg_resources
# __name__ in case you're within the package
# - otherwise it would be 'lidtk' in this example as it is the package name
path = 'classifiers/text_cat/REAMDE.md' # always use slash
filepath = pkg_resources.resource_filename(__name__, path)
I'm not too sure about the "always use slash" part. It might come from the setuptools documentation:
Also notice that if you use paths, you must use a forward slash (/) as the path separator, even if you are on Windows. Setuptools automatically converts slashes to appropriate platform-specific separators at build time.
In case you wonder where the documentation is:
- PEP 0365
- https://packaging.python.org/guides/single-sourcing-package-version/
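For comparison, a filesystem path to a nested resource can also be obtained without setuptools. The sketch below uses the standard library's importlib.resources (files and as_file, available from Python 3.9), which is not what this answer uses; it is shown only as an assumed alternative to resource_filename. The package demo_lidtk and the file notes.md are hypothetical stand-ins created in a temporary directory.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile

# Create a throwaway package "demo_lidtk" that mimics the nested layout above.
root = tempfile.mkdtemp()
doc_dir = os.path.join(root, "demo_lidtk", "classifiers", "text_cat")
os.makedirs(doc_dir)
open(os.path.join(root, "demo_lidtk", "__init__.py"), "w").close()
with open(os.path.join(doc_dir, "notes.md"), "w", encoding="utf-8") as fh:
    fh.write("# notes\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Walk down from the package root one path segment at a time.
resource = (
    importlib.resources.files("demo_lidtk") / "classifiers" / "text_cat" / "notes.md"
)

# as_file() yields a real filesystem path; for zipped packages it would
# extract the resource to a temporary file for the duration of the block.
with importlib.resources.as_file(resource) as filepath:
    text = filepath.read_text(encoding="utf-8")

print(filepath)
print(text)
```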
#4
2
Every Python module in your package has a __file__ attribute.
You can use it as:
import os
import mypackage

# mypackage.__file__ points at the package's __init__.py
templates_dir = os.path.join(os.path.dirname(mypackage.__file__), 'templates')
template_file = os.path.join(templates_dir, 'template.txt')
For egg resources see: http://peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources
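The __file__ approach can be tried end to end with a minimal sketch like the one below. It builds a hypothetical package file_attr_demo_pkg in a temporary directory and uses pathlib instead of os.path, which is purely a stylistic choice. Note that this only works when the package lives as plain files on disk; inside a zipped egg the computed path does not exist as a real file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway package with a templates/ directory next to __init__.py.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "file_attr_demo_pkg"
(pkg_dir / "templates").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "templates" / "template.txt").write_text("Dear $customer,\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
package = importlib.import_module("file_attr_demo_pkg")

# __file__ is the path of __init__.py, so its parent is the package directory.
templates_dir = Path(package.__file__).parent / "templates"
template_text = (templates_dir / "template.txt").read_text()

print(template_text)
```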
#5
0
Assuming you are using an egg file that is not extracted:
I "solved" this in a recent project by using a postinstall script that extracts my templates from the egg (zip file) to the proper directory in the filesystem. It was the quickest, most reliable solution I found, since working with __path__[0] can go wrong sometimes (I don't recall the name, but I came across at least one library that added something in front of that list!).
Also, egg files are usually extracted on the fly to a temporary location called the "egg cache". You can change that location using an environment variable, either before starting your script or even later, e.g.:
os.environ['PYTHON_EGG_CACHE'] = path
However, pkg_resources might do the job properly.
#7
-3
You should be able to import portions of your package's name space with something like:
from my_package import my_stuff
... you should not need to specify anything that looks like a filename if this is a properly constructed Python package (that's normally abstracted away).