I've read this post about how to use OpenCV's HOG-based pedestrian detector: How can I detect and track people using OpenCV?
I want to use HOG for detecting other types of objects in images (not just pedestrians). However, the Python binding of HOGDetectMultiScale doesn't seem to give access to the actual HOG features.
Is there any way to use Python + OpenCV to extract the HOG features directly from any image?
7 Answers
#1
7
If you want fast Python code for HOG features, I've ported the code to Cython: https://github.com/cvondrick/pyvision/blob/master/vision/features.pyx
#2
118
In Python OpenCV you can compute a HOG descriptor like this:
import cv2
hog = cv2.HOGDescriptor()
im = cv2.imread(sample)  # 'sample' is the path to an image file
h = hog.compute(im)
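For reference, a minimal sketch (assuming a 64x128 crop and the default parameters; the file name is a placeholder) that shows the length of the resulting vector:

import cv2

# The default HOGDescriptor uses a 64x128 window, 16x16 blocks, 8x8 block
# stride, 8x8 cells and 9 bins, so one window yields 7 * 15 blocks * 4 cells
# * 9 bins = 3780 values.
hog = cv2.HOGDescriptor()
im = cv2.imread("person_64x128.png")  # hypothetical 64x128 crop
h = hog.compute(im)
print(h.size)   # 3780; the exact shape, (3780, 1) vs (3780,), depends on the OpenCV version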
#3
32
1. Get the built-in documentation: the following command in your Python console will help you learn the structure of the HOGDescriptor class:
import cv2
help(cv2.HOGDescriptor())
2. Example code: here is a snippet that initializes a cv2.HOGDescriptor with non-default parameters (the terms used here are standard terms that are well defined in the OpenCV documentation here):
import cv2

image = cv2.imread("test.jpg", 0)  # read as grayscale
winSize = (64, 64)
blockSize = (16, 16)
blockStride = (8, 8)
cellSize = (8, 8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize, blockSize, blockStride, cellSize, nbins,
                        derivAperture, winSigma, histogramNormType,
                        L2HysThreshold, gammaCorrection, nlevels)

# compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8, 8)
padding = (8, 8)
locations = ((10, 20),)
hist = hog.compute(image, winStride, padding, locations)
3. Reasoning: the resulting HOG descriptor has dimension 9 orientations × (4 corner cells that are normalized once + 6×4 edge cells that are normalized twice + 6×6 interior cells that are normalized four times) = 1764, since only one location was passed to hog.compute(). (A quick numeric check of this count is sketched after this list.)
4. Another way to initialize is from an XML file that contains all the parameter values:
hog = cv2.HOGDescriptor("hog.xml")
To get such an XML file you can do the following:
hog = cv2.HOGDescriptor()
hog.save("hog.xml")
and then edit the respective parameter values in the XML file.
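Continuing from the variables defined in step 2, a minimal sketch that checks the dimension claimed in step 3 (plain arithmetic, nothing OpenCV-specific beyond the arrays already computed):

# Blocks per 64x64 window: ((64 - 16) // 8 + 1) ** 2 = 49,
# values per block: 2 * 2 cells * 9 bins = 36, and 49 * 36 = 1764.
blocks_per_row = (winSize[0] - blockSize[0]) // blockStride[0] + 1
cells_per_block = (blockSize[0] // cellSize[0]) ** 2
expected = blocks_per_row ** 2 * cells_per_block * nbins
print(expected, hist.size)  # both print 1764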
#4
9
Despite the fact that there is a built-in method, as mentioned in previous answers:
hog = cv2.HOGDescriptor()
I would like to post a Python implementation you can find in OpenCV's examples directory, hoping it can be useful for understanding HOG functionality:
import cv2
import numpy as np
from numpy.linalg import norm

def hog(img):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    mag, ang = cv2.cartToPolar(gx, gy)
    bin_n = 16  # number of orientation bins
    bins = np.int32(bin_n * ang / (2 * np.pi))  # quantize angles into bin indices
    bin_cells = []
    mag_cells = []
    cellx = celly = 8
    for i in range(0, img.shape[0] // celly):
        for j in range(0, img.shape[1] // cellx):
            bin_cells.append(bins[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
            mag_cells.append(mag[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
    # per-cell histograms, weighted by gradient magnitude
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)
    # transform to Hellinger kernel
    eps = 1e-7
    hist /= hist.sum() + eps
    hist = np.sqrt(hist)
    hist /= norm(hist) + eps
    return hist
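A minimal usage sketch for the function above (the file name is a placeholder; any image that cv2.imread can load as grayscale will do):

img = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)
features = hog(img)
print(features.shape)  # 16 bins per 8x8 cell, concatenated over all full cells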
Regards.
#5
7
Here is a solution that uses only OpenCV:
import numpy as np
import cv2
import matplotlib.pyplot as plt

img = cv2.cvtColor(cv2.imread("/home/me/Downloads/cat.jpg"),
                   cv2.COLOR_BGR2GRAY)

cell_size = (8, 8)   # h x w in pixels
block_size = (2, 2)  # h x w in cells
nbins = 9            # number of orientation bins

# winSize is the size of the image cropped to a multiple of the cell size
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
                                  img.shape[0] // cell_size[0] * cell_size[0]),
                        _blockSize=(block_size[1] * cell_size[1],
                                    block_size[0] * cell_size[0]),
                        _blockStride=(cell_size[1], cell_size[0]),
                        _cellSize=(cell_size[1], cell_size[0]),
                        _nbins=nbins)

n_cells = (img.shape[0] // cell_size[0], img.shape[1] // cell_size[1])
hog_feats = hog.compute(img)\
               .reshape(n_cells[1] - block_size[1] + 1,
                        n_cells[0] - block_size[0] + 1,
                        block_size[0], block_size[1], nbins) \
               .transpose((1, 0, 2, 3, 4))  # index blocks by rows first
# hog_feats now contains the gradient amplitudes for each direction,
# for each cell of its group for each group. Indexing is by rows then columns.

gradients = np.zeros((n_cells[0], n_cells[1], nbins))

# count cells (border cells appear less often across overlapping groups)
cell_count = np.full((n_cells[0], n_cells[1], 1), 0, dtype=int)

for off_y in range(block_size[0]):
    for off_x in range(block_size[1]):
        gradients[off_y:n_cells[0] - block_size[0] + off_y + 1,
                  off_x:n_cells[1] - block_size[1] + off_x + 1] += \
            hog_feats[:, :, off_y, off_x, :]
        cell_count[off_y:n_cells[0] - block_size[0] + off_y + 1,
                   off_x:n_cells[1] - block_size[1] + off_x + 1] += 1

# Average gradients
gradients /= cell_count

# Preview
plt.figure()
plt.imshow(img, cmap='gray')
plt.show()

bin = 5  # angle is 360 / nbins * direction
plt.pcolor(gradients[:, :, bin])
plt.gca().invert_yaxis()
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()
I used HOG descriptor computation and visualization to understand the data layout and vectorized the loops over the groups.
#6
1
I would disagree with the argument of peakxu. The HOG detector is, in the end, "just" a rigid linear filter: any degrees of freedom in the "object" (i.e. persons) lead to blurring in the detector and are not actually handled by it. There is an extension of this detector using latent SVMs that does explicitly handle degrees of freedom by introducing structural constraints between independent parts (i.e. head, arms, etc.) as well as by allowing multiple appearances per object (i.e. frontal people and sideways people...).
Regarding the HOG detector in OpenCV: in theory you can upload another detector to be used with the features, but as far as I know you cannot get the features themselves. Thus, if you have a trained detector (i.e. a class-specific linear filter), you should be able to upload it into the detector (see the sketch below) and get OpenCV's fast detection performance. That said, it should be easy to hack the OpenCV source code to provide this access and to propose the patch back to the maintainers.
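A minimal sketch of plugging your own linear detector into cv2.HOGDescriptor via setSVMDetector (assuming you have already trained a linear SVM on HOG features of 64x128 windows; the coefficient array and image path are placeholders, not a real model):

import cv2
import numpy as np

hog = cv2.HOGDescriptor()  # default 64x128 window -> 3780-dimensional features

# Placeholder weights: in practice these come from your trained linear SVM
# (the primal coefficients followed by the bias term). All-zero weights will
# not produce meaningful detections; this only illustrates the API.
svm_coefficients = np.zeros(3780 + 1, dtype=np.float32)
hog.setSVMDetector(svm_coefficients)

img = cv2.imread("scene.jpg")  # hypothetical test image
rects, weights = hog.detectMultiScale(img)
print(rects)  # bounding boxes found by your own detector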
#7
-9
I would not recommend using HOG features for detecting objects other than pedestrians. In the original HOG paper by Dalal and Triggs, they specifically mention that their detector is built around pedestrian detection: it allows significant degrees of freedom in the limbs while using strong structural hints around the human body.
Instead, try looking at OpenCV's HaarDetectObjects. You can learn how to train your own cascades here; a minimal usage sketch follows.
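A minimal sketch of the cascade route, using the modern cv2.CascadeClassifier API (the successor to the legacy cvHaarDetectObjects). The face cascade bundled with OpenCV is used purely as an example, and the image path is a placeholder; in practice you would train a cascade for your own object class:

import cv2

# Load one of the pretrained cascades shipped with opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("scene.jpg")  # hypothetical test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale returns a list of (x, y, w, h) bounding boxes.
objects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in objects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)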