Is there a recommended package for machine learning in Python?
I have previous experience implementing a variety of machine learning and statistical algorithms in C++ and MATLAB, but having done some work in Python, I'm curious about the packages available for Python.
14 Solutions
#1
43
AFAIK, Orange may be the best choice at the moment.
PyML is good too.
PyMC for Bayesian estimation.
There is also a book, "Machine Learning: An Algorithmic Perspective", which has lots of Python code examples; it may be worth reading.
And there is a blog post: Pragmatic Classification with Python.
Just my two cents.
#2
98
There is also scikit-learn (BSD-licensed, depending only on numpy and scipy). It includes various supervised learning algorithms such as:
- SVMs based on libsvm and liblinear, with scipy.sparse bindings for datasets with many features
- Bayesian methods
- HMMs
- L1 and L1+L2 regularized regression methods, aka Lasso and Elastic Net models, implemented with algorithms such as LARS and coordinate descent
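As a rough illustration of the supervised estimators listed above, here is a minimal sketch using a current scikit-learn release. The toy data, the variable names, and the `C` and `alpha` values are assumptions made up for this example, not something taken from the answer.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)

# Classification: a linear SVM fitted on a wide, sparse feature matrix.
n_samples, n_features = 40, 200
y_class = np.array([0, 1] * (n_samples // 2))
X_wide = rng.rand(n_samples, n_features)
X_wide[X_wide < 0.9] = 0.0      # zero out roughly 90% of the entries
X_wide[:, 0] = y_class          # toy setup: make column 0 informative
X_sparse = sparse.csr_matrix(X_wide)

svm = LinearSVC(C=1.0).fit(X_sparse, y_class)
svm_predictions = svm.predict(X_sparse)

# Regression: only the first two of ten features carry signal.
X_reg = rng.randn(100, 10)
true_coef = np.zeros(10)
true_coef[:2] = [3.0, -2.0]
y_reg = X_reg @ true_coef + 0.01 * rng.randn(100)

lasso = Lasso(alpha=0.1).fit(X_reg, y_reg)                    # L1 penalty
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_reg, y_reg)  # L1 + L2 penalty

print("SVM training accuracy:", (svm_predictions == y_class).mean())
print("Lasso coefficients:", np.round(lasso.coef_, 2))
print("Elastic Net coefficients:", np.round(enet.coef_, 2))
```

The estimators share the same `fit` / `predict` interface, so swapping one model for another usually means changing a single line.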
It also features unsupervised clustering algorithms such as:
- k-means++
- mean shift
- affinity propagation
- spectral clustering
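To give a feel for the clustering interface mentioned above, here is a small sketch that runs k-means with k-means++ initialization on two synthetic blobs. The data and parameter values are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)

# Two well-separated 2-D blobs of 50 points each (toy data).
blob_a = 0.3 * rng.randn(50, 2) + np.array([0.0, 0.0])
blob_b = 0.3 * rng.randn(50, 2) + np.array([5.0, 5.0])
X = np.vstack([blob_a, blob_b])

# init="k-means++" selects the k-means++ seeding strategy.
kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("cluster centers:\n", np.round(kmeans.cluster_centers_, 2))
print("points per cluster:", np.bincount(labels))
```

The other clustering estimators in `sklearn.cluster` (for example `MeanShift`, `AffinityPropagation`, and `SpectralClustering`) are used in a similar way, though each has its own parameters.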
It also provides other tools such as:
- feature extractors for text content (token and character n-grams, plus a hashing vectorizer)
- univariate feature selection
- a simple pipeline tool
- numerous implementations of cross-validation strategies
- performance metrics evaluation and plotting (ROC curve, AUC, confusion matrix, ...)
- a grid search utility for hyperparameter tuning using parallel cross-validation
- integration with joblib for caching partial results when working in an interactive environment (e.g. using IPython)
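Several of the tools listed above can be combined in a few lines. The sketch below chains a text feature extractor and a classifier in a pipeline, tunes them with a cross-validated grid search, and reports a confusion matrix. It assumes a current scikit-learn release; the tiny corpus, the labels, and the parameter grid are invented for this example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy corpus: six documents per class.
docs = [
    "the team won the football match",
    "a late goal decided the match",
    "the coach praised the team",
    "fans cheered the winning goal",
    "the striker scored in the final",
    "the league season starts with a football match",
    "the new laptop has a fast processor",
    "the software update fixed the bug",
    "the compiler reported a syntax error",
    "the server runs a new operating system",
    "the laptop battery and processor were upgraded",
    "a software bug crashed the server",
]
labels = ["sport"] * 6 + ["tech"] * 6

# Pipeline: token n-gram counts followed by a linear classifier.
pipeline = Pipeline([
    ("vec", CountVectorizer()),
    ("clf", LogisticRegression()),
])

# Parameters are addressed as <step name>__<parameter name>.
param_grid = {
    "vec__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

# n_jobs=1 keeps the sketch simple; n_jobs=-1 would run folds in parallel.
search = GridSearchCV(pipeline, param_grid, cv=3, n_jobs=1)
search.fit(docs, labels)

predictions = search.predict(docs)
cm = confusion_matrix(labels, predictions)

print("best parameters:", search.best_params_)
print("confusion matrix on the training documents:\n", cm)
```

Note that the confusion matrix here is computed on the training documents, so it only shows how the pieces fit together; it is not a realistic performance estimate.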
Each algorithm implementation comes with sample programs demonstrating its usage on either toy data or real-life datasets.
Also, the official source repository is hosted on GitHub, so please feel free to contribute bug fixes and improvements using the regular pull request feature for interactive code review.
#3
11
A general, user-friendly package is Orange, which is similar to Weka or RapidMiner, if you're familiar with those.
Other than that, there are a variety of packages and toolkits for various tasks. You should consult the Python packages listed on mloss as a starting point.
#4
7
You might want to look at:
http://www.shogun-toolbox.org/, which has interfaces for multiple languages, including Python. There's also http://www.pybrain.org/, which is (I believe) a native implementation of ML algorithms. Hope that helps.
#5
5
For Support Vector Machines, take a look at LibSVM, which has a Python interface, among others.
#6
5
The Deep Learning Tutorials describe how to develop and train deep neural networks. The library they use can even run on an Nvidia GPU if one is available.
#7
4
Possibly related questions on Stack Overflow:
Artificial Intelligence library in python.
What is the best artificial-intelligence library for Python?
#8
4
I gave Orange a try.
It's powerful, but if you go through the documentation, you will realize that the author has his own crazy style of writing Python. His code gets pretty cryptic if you are relatively new to Python, so I wouldn't recommend Orange unless you are already familiar with Python.
#10
2
I'm not sure you'd exactly call this machine learning, but the nltk package does Bayesian-style classification of text. You can use training data and test data to see that it is inferring rules about the data.
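For reference, a minimal sketch of that kind of text classification with nltk's naive Bayes classifier might look like the following. The bag-of-words `features` helper and the tiny training and test sets are hypothetical, written only for this example.

```python
from nltk.classify import NaiveBayesClassifier, accuracy


def features(text):
    """Hypothetical helper: map a text to a bag-of-words feature dict."""
    return {word: True for word in text.lower().split()}


# Toy labeled data: (feature dict, label) pairs.
train_set = [
    (features("great fun movie"), "pos"),
    (features("wonderful acting and great story"), "pos"),
    (features("really enjoyable and fun"), "pos"),
    (features("boring and slow movie"), "neg"),
    (features("terrible acting and awful story"), "neg"),
    (features("really dull and boring"), "neg"),
]
test_set = [
    (features("great story"), "pos"),
    (features("awful and boring"), "neg"),
]

classifier = NaiveBayesClassifier.train(train_set)

predicted = classifier.classify(features("a great and fun story"))
test_accuracy = accuracy(classifier, test_set)

print("predicted label:", predicted)
print("accuracy on the test set:", test_accuracy)
```

Calling `classifier.show_most_informative_features()` prints the features the classifier relies on most, which is one way to see what it has inferred from the training data.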
#11
2
This is a great list, compiled by SciPy, of many well-known Python packages, including machine-learning-related ones: Artificial intelligence & machine learning
#12
1
If you are looking for neural networks, the Python bindings for FANN are quite easy to use, and they come with tools to train your networks.
#13
1
Take a look at the Modular toolkit for Data Processing (MDP). It implements a couple of algorithms from machine learning and statistics and it's mature and well documented.