I am new to Python, so please bear with me. I want to write Python code that solves captchas online. I am developing on Windows, not Linux, and right now I have several problems.
First, I cannot understand how my Python file will run on a live website and solve the captcha that the website shows.
Second, I have managed to put together some code, but I don't think it works properly, or at least not the way I want it to. When I run it from cmd, nothing happens.
Here is my code:
    from PIL import Image, ImageEnhance
    from pytesser import *
    from urllib import urlretrieve

    def get(link):
        # Download whatever the URL returns and save it as temp.png
        urlretrieve(link, 'temp.png')

    get('http://www.example.com/')

    # Upscale the image 5x to give the OCR more pixels to work with
    im = Image.open("temp.png")
    nx, ny = im.size
    im2 = im.resize((int(nx*5), int(ny*5)), Image.BICUBIC)
    im2.save("temp2.png")

    # Preview a version with 30% more contrast (shown only, not used below)
    enh = ImageEnhance.Contrast(im)
    enh.enhance(1.3).show("30% more contrast")

    # Turn every pixel that is not pure black into white
    imgx = Image.open('temp2.png')
    imgx = imgx.convert("RGBA")
    pix = imgx.load()
    for y in xrange(imgx.size[1]):
        for x in xrange(imgx.size[0]):
            if pix[x, y] != (0, 0, 0, 255):
                pix[x, y] = (255, 255, 255, 255)
    imgx.save("bw.gif", "GIF")

    # Resize, save as TIFF, then run OCR on the result
    original = Image.open('bw.gif')
    bg = original.resize((116, 56), Image.NEAREST)
    ext = ".tif"
    bg.save("input-NEAREST" + ext)
    image = Image.open('input-NEAREST.tif')
    print image_to_string(image)
Can someone please help me fix this code and explain how to use it, e.g. on a website?
1 Answer
I cannot understand how will my python file run on a live website.
I think I can help you understand. You don't run your Python script "on a live website". What you want is to run your Python script locally on your machine, but as part of a bigger program that behaves as an automated client designed to interact with the server whose captchas you're cracking.
Compare these two programs:
- Google Chrome is a human-guided web client, and it can interact with any web server.
- Your script is an automated client, and it can interact only with the web server you design it for.
Here's what I mean by specific design: you design your client to get captcha images from a specific URL for the web server, and to submit data in a format specific to the web server. Like this, for instance:
1. Loads the website you want with something like httplib performing an HTTP GET.
2. Extracts the captcha image and solves it.
3. Submits the form with the solved captcha string and the rest of your desired data, again with an HTTP client like httplib performing an HTTP POST. (A POST is the same thing that the "submit" button does when you fill out a form on a website.)
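The three steps above can be sketched roughly as follows. This is only a minimal sketch, written for Python 3, where httplib is named http.client. The paths /register and /captcha.png, the field names username and captcha, and the solve_captcha callback are placeholders invented for illustration; a real site will use its own URLs and field names, which you would read from its page source.

```python
import http.client
from urllib.parse import urlencode


def http_get(host, path):
    """Step 1: fetch a page or an image with an HTTP GET. Returns (status, body bytes)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def build_form_body(fields):
    """Encode form fields the way a browser does for a plain HTML form."""
    return urlencode(fields)


def http_post_form(host, path, fields):
    """Step 3: submit the form with an HTTP POST, like the "submit" button does."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("POST", path, body=build_form_body(fields), headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def run_client(host, solve_captcha):
    """Tie the steps together. solve_captcha is your OCR function (bytes -> str)."""
    _, page = http_get(host, "/register")             # hypothetical page path
    _, image_bytes = http_get(host, "/captcha.png")   # hypothetical image path
    answer = solve_captcha(image_bytes)               # step 2: your OCR code
    fields = {"username": "alice", "captcha": answer} # hypothetical field names
    return http_post_form(host, "/register", fields)
```

Nothing here touches the network until you call run_client yourself, for example run_client("www.example.com", my_ocr_function), where my_ocr_function wraps your preprocessing and image_to_string call.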
Your current script does part of #1, but it only fetches the image; it doesn't get the rest of the page. And if your preprocessing and image_to_string function work, then #2 is done.
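To cover the rest of #1, the client also needs to read the page's HTML and pull out the captcha image URL and the form details. Here is a rough sketch using only the Python 3 standard library. It assumes, purely for illustration, that the captcha is an img tag with id="captcha"; the sample HTML, the token field and the URLs are made up, and a real page will mark these things differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CaptchaPageParser(HTMLParser):
    """Collects the captcha image src, the first form's action and hidden inputs."""

    def __init__(self):
        super().__init__()
        self.captcha_src = None
        self.form_action = None
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("id") == "captcha":  # assumed marker
            self.captcha_src = attrs.get("src")
        elif tag == "form" and self.form_action is None:
            self.form_action = attrs.get("action")
        elif tag == "input" and attrs.get("type") == "hidden":
            # Hidden fields (session tokens etc.) must be sent back with the form
            self.hidden_fields[attrs.get("name")] = attrs.get("value", "")


def parse_captcha_page(html, page_url):
    """Return absolute URLs for the captcha image and form, plus hidden fields."""
    parser = CaptchaPageParser()
    parser.feed(html)
    captcha_url = urljoin(page_url, parser.captcha_src) if parser.captcha_src else None
    return {
        "captcha_url": captcha_url,
        "form_url": urljoin(page_url, parser.form_action or ""),
        "hidden_fields": parser.hidden_fields,
    }


# Made-up page, standing in for what an HTTP GET of the real page would return
sample = """
<form action="/register" method="post">
  <img id="captcha" src="/captcha.png?sid=42">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="captcha">
</form>
"""
info = parse_captcha_page(sample, "http://www.example.com/signup")
print(info)
```

The hidden fields matter because many sites tie a captcha to a session value, so the POST has to send them back together with the solved string.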