scikit-learn study notes

Date: 2022-08-22 23:51:41

References:

A concise tutorial on the Python machine learning library scikit-learn: random forests

http://nbviewer.jupyter.org/github/donnemartin/data-science-ipython-notebooks/blob/master/kaggle/titanic.ipynb

Using support vector machines (SVM) in Python (with examples)

Image classification based on SIFT features and SVM

scikit-learn (sklearn) 0.18 official documentation, Chinese edition

Just fourteen steps: mastering Python machine learning from scratch (with resources)

https://github.com/jakevdp/sklearn_pycon2015 

Official site: http://scikit-learn.org/stable/

Scikit-learn (sklearn): learning machine learning elegantly (莫烦 Python tutorial)

A concise tutorial on the Python machine learning library scikit-learn: the AdaBoost algorithm

http://www.docin.com/p-1775095945.html

https://www.bilibili.com/video/av22530538/?p=6 

Summary of object extraction from 3D point clouds

https://github.com/Fdevmsy/Image_Classification_with_5_methods

https://github.com/huangchuchuan/SVM-HOG-images-classifier

https://blog.csdn.net/always2015/article/details/47100713

DBSCAN: https://www.cnblogs.com/pinard/p/6208966.html

1. Using KNN

carto@cartoPC:~$ python
Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> from sklearn import datasets
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.neighbors import KNeighborsClassifier
>>> iris=datasets.load_iris()
>>> iris_X=iris.data
>>> iris_y=iris.target
>>> print(iris_X[:2,:])
[[ 5.1  3.5  1.4  0.2]
 [ 4.9  3.   1.4  0.2]]
>>> print(iris_y)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
>>> X_train,X_test,y_train,y_test=train_test_split(iris_X,iris_y,test_size=0.3)
>>> print(y_train)
[2 1 0 0 0 2 0 0 1 1 2 2 1 1 2 2 2 0 1 0 2 2 1 1 1 1 1 0 1 1 0 2 1 0 0 2 2
 0 0 2 1 0 0 2 1 2 1 2 1 1 1 2 1 2 0 2 0 1 1 2 1 0 1 2 2 0 2 2 1 0 1 1 2 2
 1 0 1 1 2 0 0 1 0 1 0 2 0 1 1 0 2 1 2 0 2 0 2 0 2 1 0 2 0 2 2]
>>> knn=KNeighborsClassifier()
>>> knn.fit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: fit() takes exactly 3 arguments (1 given)
>>> knn.fit(X_train,y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=5, p=2,
           weights='uniform')
>>> print(knn.predict(X_test))
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 1 2 0 0 1 0 0
 0 0 1 0 1 1 2 0]
>>> print(y_test)
[1 1 2 0 1 1 1 1 2 0 0 2 0 1 0 0 0 1 2 2 2 2 0 1 2 0 1 2 2 0 2 2 0 0 2 0 0
 0 0 1 0 1 1 2 0]
>>> 
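
The predicted and true labels printed above differ in only two places. As a quick check, the sketch below copies both arrays from the session into NumPy arrays and computes the fraction that match. The names `y_pred` and `y_true` are chosen here for illustration and do not appear in the session.

```python
import numpy as np

# Predicted labels, copied from the output of knn.predict(X_test) above.
y_pred = np.array([
    1, 1, 2, 0, 1, 1, 1, 1, 2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 2, 2, 2, 2, 0,
    1, 2, 0, 1, 2, 2, 0, 1, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 2, 0])

# True labels, copied from the output of print(y_test) above.
y_true = np.array([
    1, 1, 2, 0, 1, 1, 1, 1, 2, 0, 0, 2, 0, 1, 0, 0, 0, 1, 2, 2, 2, 2, 0,
    1, 2, 0, 1, 2, 2, 0, 2, 2, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 2, 0])

# Positions where the prediction disagrees with the true label.
mismatches = np.flatnonzero(y_pred != y_true)

# Accuracy is the fraction of positions that agree.
accuracy = np.mean(y_pred == y_true)

print("misclassified positions:", mismatches.tolist())  # [30, 34]
print("accuracy: %.3f" % accuracy)                      # 43 of 45 -> 0.956
```

In a live session, `knn.score(X_test, y_test)` reports the same mean accuracy directly. Because `train_test_split` was called without `random_state`, a rerun will draw a different split and may give a different figure.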

2. Using SVC

from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split


def load_data():
    # Load the iris data set and hold out 10% of the samples for testing.
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.10, random_state=0)
    return X_train, X_test, y_train, y_test


def test_LinearSVC(X_train, X_test, y_train, y_test):
    # Fit a linear SVM with default parameters, then print its
    # per-class weights and its accuracy on the held-out samples.
    cls = svm.LinearSVC()
    cls.fit(X_train, y_train)
    print('Coefficients:%s, intercept %s' % (cls.coef_, cls.intercept_))
    print('Score: %.2f' % cls.score(X_test, y_test))


if __name__ == "__main__":
    X_train, X_test, y_train, y_test = load_data()
    test_LinearSVC(X_train, X_test, y_train, y_test)

Running the script:

carto@cartoPC:~/python_ws$ python svmtest2.py
Coefficients:[[ 0.18424504  0.45123335 -0.80794237 -0.45071267]
 [-0.13381099 -0.75235247  0.57223898 -1.11494325]
 [-0.7943601  -0.95801711  1.31465593  1.8169808 ]], intercept [ 0.10956304  1.86593164 -1.72576407]
Score: 1.00
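
The output shows three rows of coefficients and three intercepts, one per iris class, because `LinearSVC` trains one-vs-rest classifiers by default. The sketch below illustrates how those numbers turn into a prediction: each sample is scored against every class as `X . coef_.T + intercept_`, and the class with the highest score wins. Setting `max_iter=10000` and `random_state=0` on the classifier is an assumption made here to keep the run quiet and repeatable; the original script uses the defaults, so the fitted weights may differ slightly from those printed above.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.10, random_state=0)

# max_iter and random_state are assumptions added for this sketch;
# the script above calls svm.LinearSVC() with default arguments.
cls = svm.LinearSVC(max_iter=10000, random_state=0)
cls.fit(X_train, y_train)

# One score per (sample, class): linear function of the four features.
scores = np.dot(X_test, cls.coef_.T) + cls.intercept_

# The predicted class is the one with the largest score.
manual_pred = cls.classes_[np.argmax(scores, axis=1)]

# Compare the hand-computed values with what the estimator reports.
print(np.allclose(scores, cls.decision_function(X_test)))
print(np.array_equal(manual_pred, cls.predict(X_test)))
```

Both checks should print `True`, which confirms that `coef_` and `intercept_` fully describe the fitted model.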