Creating trackbars to scroll large images in OpenCV Python

Date: 2022-05-24 23:48:52

I am trying to create scrollbars in a window created by OpenCV in Python. I know that I need to write the code that handles scrolling/panning, but I have no idea where to start, and I've looked everywhere. It is essential that I create the scrollbars in the OpenCV window itself instead of using some other GUI framework. Below is the code I am using to load an image and scale it (which works). Any help is appreciated. And please don't refer me to the OpenCV documentation on creating trackbars; I've read it and it doesn't help at all. Thanks!

import cv2
import numpy as np

# cv2.cv was removed in OpenCV 3; the cv2 namespace provides the same calls.
cv2.namedWindow('image', cv2.WINDOW_AUTOSIZE)
cv2.namedWindow('Control Window', cv2.WINDOW_AUTOSIZE)


print(" Zoom In-Out demo ")
print(" Press u to zoom in ")
print(" Press d to zoom out ")

img = cv2.imread('picture.jpg')
if img is None:  # imread returns None when the file cannot be read
    raise SystemExit("could not read picture.jpg")


while True:
    h, w = img.shape[:2]

    cv2.imshow('image', img)
    k = cv2.waitKey(10)

    if k == 27:  # ESC quits
        break

    elif k == ord('u'):  # Zoom in, make image double size
        img = cv2.pyrUp(img, dstsize=(2*w, 2*h))

    elif k == ord('d'):  # Zoom out, make image half the size
        # dstsize must be integers; (n + 1)//2 matches pyrDown's default size
        img = cv2.pyrDown(img, dstsize=((w + 1)//2, (h + 1)//2))

cv2.destroyAllWindows()

2 answers

#1


2  

I had the same need, so today I created a class from scratch that handles mouse clicks, pan, and zoom on an OpenCV window. It works like this:

  1. right-drag up or down to zoom
  2. right-click to center the view on the mouse
  3. drag the x and y trackbars to scroll
  4. when you initialize it, you can optionally pass in a function that will be called when the user left-clicks on a pixel

(As far as I can tell, OpenCV can't read the mouse wheel and can't create a vertical trackbar, so the user experience is a little non-intuitive but it works.)

# -*- coding: utf-8 -*-
import cv2
import numpy as np

class PanZoomWindow(object):
    """ Controls an OpenCV window. Registers a mouse listener so that:
        1. right-dragging up/down zooms in/out
        2. right-clicking re-centers
        3. trackbars scroll vertically and horizontally 
    You can open multiple windows at once if you specify different window names.
    You can pass in an onLeftClickFunction, and when the user left-clicks, this 
    will call onLeftClickFunction(y,x), with y,x in original image coordinates."""
    def __init__(self, img, windowName = 'PanZoomWindow', onLeftClickFunction = None):
        self.WINDOW_NAME = windowName
        self.H_TRACKBAR_NAME = 'x'
        self.V_TRACKBAR_NAME = 'y'
        self.img = img
        self.onLeftClickFunction = onLeftClickFunction
        self.TRACKBAR_TICKS = 1000
        self.panAndZoomState = PanAndZoomState(img.shape, self)
        self.lButtonDownLoc = None
        self.mButtonDownLoc = None
        self.rButtonDownLoc = None
        cv2.namedWindow(self.WINDOW_NAME, cv2.WINDOW_NORMAL)
        self.redrawImage()
        cv2.setMouseCallback(self.WINDOW_NAME, self.onMouse)
        cv2.createTrackbar(self.H_TRACKBAR_NAME, self.WINDOW_NAME, 0, self.TRACKBAR_TICKS, self.onHTrackbarMove)
        cv2.createTrackbar(self.V_TRACKBAR_NAME, self.WINDOW_NAME, 0, self.TRACKBAR_TICKS, self.onVTrackbarMove)
    def onMouse(self,event, x,y,_ignore1,_ignore2):
        """ Responds to mouse events within the window. 
        The x and y are pixel coordinates in the image currently being displayed.
        If the user has zoomed in, the image being displayed is a sub-region, so you'll need to
        add self.panAndZoomState.ul to get the coordinates in the full image."""
        if event == cv2.EVENT_MOUSEMOVE:
            return
        elif event == cv2.EVENT_RBUTTONDOWN:
            #record where the user started to right-drag
            self.mButtonDownLoc = np.array([y,x])
        elif event == cv2.EVENT_RBUTTONUP and self.mButtonDownLoc is not None:
            #the user just finished right-dragging
            dy = y - self.mButtonDownLoc[0]
            pixelsPerDoubling = 0.2*self.panAndZoomState.shape[0] #lower = zoom more
            changeFactor = (1.0+abs(dy)/pixelsPerDoubling)
            changeFactor = min(max(1.0,changeFactor),5.0)
            if changeFactor < 1.05:
                dy = 0 #this was a click, not a drag. So don't zoom, just re-center.
            if dy > 0: #moved down, so zoom out.
                zoomInFactor = 1.0/changeFactor
            else:
                zoomInFactor = changeFactor
#            print("zoomInFactor:", zoomInFactor)
            self.panAndZoomState.zoom(self.mButtonDownLoc[0], self.mButtonDownLoc[1], zoomInFactor)
        elif event == cv2.EVENT_LBUTTONDOWN:
            #the user pressed the left button. 
            coordsInDisplayedImage = np.array([y,x])
            if np.any(coordsInDisplayedImage < 0) or np.any(coordsInDisplayedImage >= self.panAndZoomState.shape[:2]):
                print("you clicked outside the image area")
            else:
                print("you clicked on", coordsInDisplayedImage, "within the zoomed rectangle")
                coordsInFullImage = self.panAndZoomState.ul + coordsInDisplayedImage
                print("this is", coordsInFullImage, "in the actual image")
                print("this pixel holds ", self.img[coordsInFullImage[0], coordsInFullImage[1]])
                if self.onLeftClickFunction is not None:
                    self.onLeftClickFunction(coordsInFullImage[0],coordsInFullImage[1])
        #you can handle other mouse click events here
    def onVTrackbarMove(self,tickPosition):
        self.panAndZoomState.setYFractionOffset(float(tickPosition)/self.TRACKBAR_TICKS)
    def onHTrackbarMove(self,tickPosition):
        self.panAndZoomState.setXFractionOffset(float(tickPosition)/self.TRACKBAR_TICKS)
    def redrawImage(self):
        pzs = self.panAndZoomState
        cv2.imshow(self.WINDOW_NAME, self.img[pzs.ul[0]:pzs.ul[0]+pzs.shape[0], pzs.ul[1]:pzs.ul[1]+pzs.shape[1]])

class PanAndZoomState(object):
    """ Tracks the currently-shown rectangle of the image.
    Does the math to adjust this rectangle to pan and zoom."""
    MIN_SHAPE = np.array([50,50])
    def __init__(self, imShape, parentWindow):
        self.ul = np.array([0,0]) #upper left of the zoomed rectangle (expressed as y,x)
        self.imShape = np.array(imShape[0:2])
        self.shape = self.imShape #current dimensions of rectangle
        self.parentWindow = parentWindow
    def zoom(self,relativeCy,relativeCx,zoomInFactor):
        #np.float and np.int were removed from NumPy; the built-in float and int work the same here
        self.shape = (self.shape.astype(float)/zoomInFactor).astype(int)
        #expands the view to a square shape if possible. (I don't know how to get the actual window aspect ratio)
        self.shape[:] = np.max(self.shape) 
        self.shape = np.maximum(PanAndZoomState.MIN_SHAPE,self.shape) #prevent zooming in too far
        c = self.ul+np.array([relativeCy,relativeCx])
        self.ul = c-self.shape//2 #integer division keeps the offsets usable as slice indices
        self._fixBoundsAndDraw()
    def _fixBoundsAndDraw(self):
        """ Ensures we didn't scroll/zoom outside the image. 
        Then draws the currently-shown rectangle of the image."""
#        print "in self.ul:",self.ul, "shape:",self.shape
        self.ul = np.maximum(0,np.minimum(self.ul, self.imShape-self.shape))
        self.shape = np.minimum(np.maximum(PanAndZoomState.MIN_SHAPE,self.shape), self.imShape-self.ul)
#        print "out self.ul:",self.ul, "shape:",self.shape
        yFraction = float(self.ul[0])/max(1,self.imShape[0]-self.shape[0])
        xFraction = float(self.ul[1])/max(1,self.imShape[1]-self.shape[1])
        cv2.setTrackbarPos(self.parentWindow.H_TRACKBAR_NAME, self.parentWindow.WINDOW_NAME,int(xFraction*self.parentWindow.TRACKBAR_TICKS))
        cv2.setTrackbarPos(self.parentWindow.V_TRACKBAR_NAME, self.parentWindow.WINDOW_NAME,int(yFraction*self.parentWindow.TRACKBAR_TICKS))
        self.parentWindow.redrawImage()
    def setYAbsoluteOffset(self,yPixel):
        self.ul[0] = min(max(0,yPixel), self.imShape[0]-self.shape[0])
        self._fixBoundsAndDraw()
    def setXAbsoluteOffset(self,xPixel):
        self.ul[1] = min(max(0,xPixel), self.imShape[1]-self.shape[1])
        self._fixBoundsAndDraw()
    def setYFractionOffset(self,fraction):
        """ pans so the upper-left zoomed rectangle is "fraction" of the way down the image."""
        self.ul[0] = int(round((self.imShape[0]-self.shape[0])*fraction))
        self._fixBoundsAndDraw()
    def setXFractionOffset(self,fraction):
        """ pans so the upper-left zoomed rectangle is "fraction" of the way right on the image."""
        self.ul[1] = int(round((self.imShape[1]-self.shape[1])*fraction))
        self._fixBoundsAndDraw()

if __name__ == "__main__":
    infile = "./testImage.png"
    myImage = cv2.imread(infile,cv2.IMREAD_ANYCOLOR)
    window = PanZoomWindow(myImage, "test window")
    key = -1
    while key != ord('q') and key != 27: # 27 = escape key
        #the OpenCV window won't display until you call cv2.waitKey()
        key = cv2.waitKey(5) #User can press 'q' or ESC to exit.
    cv2.destroyAllWindows()
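
The heart of the scrolling logic above is the mapping from a trackbar tick to a clamped upper-left corner of the visible rectangle. As a minimal sketch, the same arithmetic can be written as a pure function that is easy to check without opening a window. The helper name `viewport_from_ticks` and its parameters are assumptions made for illustration; they are not part of the class above or of OpenCV.

```python
import numpy as np

def viewport_from_ticks(img, view_h, view_w, y_tick, x_tick, ticks=1000):
    """Return the sub-image selected by two trackbar positions.

    Hypothetical helper for illustration: y_tick and x_tick are trackbar
    values in [0, ticks]; view_h and view_w are the visible size in pixels.
    """
    img_h, img_w = img.shape[:2]
    # The view can never be larger than the image itself.
    view_h = min(view_h, img_h)
    view_w = min(view_w, img_w)
    # Map each tick to an upper-left offset within the scrollable range.
    top = int(round((img_h - view_h) * y_tick / ticks))
    left = int(round((img_w - view_w) * x_tick / ticks))
    # Clamp in case a tick value falls outside [0, ticks].
    top = max(0, min(top, img_h - view_h))
    left = max(0, min(left, img_w - view_w))
    return img[top:top + view_h, left:left + view_w]

# A synthetic 400x600 "large image" in which each pixel stores its row index.
big = np.repeat(np.arange(400, dtype=np.int32)[:, None], 600, axis=1)
view = viewport_from_ticks(big, 100, 200, y_tick=1000, x_tick=0)
print(view.shape)   # (100, 200)
print(view[0, 0])   # 300: scrolled all the way down, so the view starts at row 300
```

In a real window, each trackbar callback would read both positions with cv2.getTrackbarPos, call a function like this, and pass the result to cv2.imshow.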

#2


1  

Because I am doing image processing on the image, like getting pixel information, and I will lose that ability if I encapsulate it in a widget or window provided by the GUI framework

That isn't true. You can always update the image after doing your processing. For example, look here and especially here.
These examples process images in OpenCV and put them in a PyQt GUI frame. I am sure you could do something similar with other GUI frameworks (I couldn't find anything for Tkinter). I think I have seen wxPython integrated in the past.

When you write your program, be sure to display a copy of the image. That way, the original image object stays changeable, and you can simply update the image in the GUI after each processing step. For example, here is some pseudo-code:

image=Image("myimage.png")
image.resize(100,400)
img=QImage(image)#similar to how pyqt would work
img.show()
image.invert_colors()
img=QImage(image)
img.show()

Of course, this is not what you will actually be writing; it is an abstraction of the idea.
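
To make the "display a copy" idea concrete without tying it to a particular GUI toolkit, here is a minimal sketch using only NumPy. `make_display_copy` is a hypothetical stand-in for whatever conversion the GUI framework needs (in PyQt, that would be building a QImage), and the pixel values are synthetic.

```python
import numpy as np

def make_display_copy(image):
    """Hypothetical stand-in for handing pixels to a GUI widget.

    Returns an independent copy, so later edits to `image` do not
    change what is already on screen until the display is refreshed.
    """
    return image.copy()

# Synthetic 2x2 grayscale image standing in for a loaded picture.
image = np.array([[0, 50], [100, 255]], dtype=np.uint8)

shown = make_display_copy(image)   # what the GUI would display

image[:] = 255 - image             # keep processing the original (invert colors)
unchanged = shown[0, 0]            # the displayed copy still holds the old value, 0

shown = make_display_copy(image)   # refresh the GUI with the new result
```

The processing code keeps full access to the pixel data in `image`; the GUI only ever sees snapshots of it.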

EDIT: In this case I would render the video (see this example & here), then take the image as a separate object, then render it (again as a third object) with PyQt. To catch the location of the mouse click, look at this question, and finally, map that point back onto the second object, which is the OpenCV image.
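
Referring a click in the GUI back to the OpenCV image amounts to undoing whatever scaling the widget applied. A minimal sketch, assuming the widget shows the whole image stretched to fill the widget exactly (no borders, no offset); the function name and parameters are illustrative and do not come from any toolkit:

```python
def widget_to_image_coords(click_x, click_y, widget_w, widget_h, img_w, img_h):
    """Map a click in widget pixels to (row, col) in the source image.

    Assumes the image fills the widget exactly (stretched, no borders).
    """
    col = int(click_x * img_w / widget_w)
    row = int(click_y * img_h / widget_h)
    # Clamp so a click on the far edge still lands on a valid pixel.
    col = max(0, min(col, img_w - 1))
    row = max(0, min(row, img_h - 1))
    return row, col

# A 1920x1080 image shown in a 960x540 widget: a click at (480, 270)
# corresponds to the center of the image.
row, col = widget_to_image_coords(480, 270, 960, 540, 1920, 1080)
print(row, col)  # 540 960
```

The resulting pair can index the OpenCV array directly as `image[row, col]`, since NumPy images are addressed row first.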
