Captcha Recognition and Invoice Verification for the State Taxation Administration (国税总局)

Date: 2024-02-19 14:45:35


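OK, on to the main topic. The State Taxation Administration's captcha looks like the one shown below. It is made up of the digits 0-9, the 26 English letters, and Chinese characters, and the characters come in black, red, yellow, and blue. The models most commonly used for captcha recognition are CNN and CRNN.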
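For readers who have not met CRNN before: such models are usually trained with a CTC loss, so the network emits one prediction per image column and a decoding step turns that sequence into text. The post does not say how this particular service decodes its output, so the following is only a minimal sketch of the common greedy CTC decode. The alphabet, the blank index, and the sample frame ids are all made up for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, alphabet, blank=BLANK):
    """Collapse consecutive repeated predictions, then drop blanks."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


# Hypothetical alphabet: index 0 is the blank, the rest are characters.
alphabet = ["-", "Y", "2", "W"]
# Per-frame argmax ids as a CRNN might emit them for the text "Y2W".
print(ctc_greedy_decode([1, 1, 0, 2, 2, 2, 0, 0, 3], alphabet))  # Y2W
```

The blank is what lets the model output a doubled character: `[1, 0, 1]` decodes to "YY", while `[1, 1]` decodes to a single "Y".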


The Python recognition code is below. Use a POST request; a GET request returns the web help page instead:

import base64

import requests

# Read the captcha image as raw bytes.
with open('./tmp.jpg', 'rb') as f:
    img_bytes = f.read()

img_base64 = base64.b64encode(img_bytes)
# 'key' color codes: '00' black, '01' red, '02' yellow, '03' blue
data = {'image': str(img_base64, 'utf-8'), 'key': '04'}
result = requests.post('http://47.99.174.98:8808/captcha', json=data)
print(result.json())
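If you call the endpoint from several places, it can help to build the request body in one function. The sketch below is only an illustration: `build_captcha_payload` and the `COLOR_KEYS` names are hypothetical helpers, not part of the service, and the mapping covers only the four color codes listed in the comment above (the sample request sends '04', which is not among them, so it is left out here).

```python
import base64

# Color codes as listed in the post; the English names are illustrative.
COLOR_KEYS = {"black": "00", "red": "01", "yellow": "02", "blue": "03"}


def build_captcha_payload(img_bytes, color):
    """Return the JSON body for the /captcha endpoint."""
    if color not in COLOR_KEYS:
        raise ValueError(
            f"unknown color {color!r}; expected one of {sorted(COLOR_KEYS)}"
        )
    # Base64 output is pure ASCII, so decoding to str is safe.
    image = base64.b64encode(img_bytes).decode("ascii")
    return {"image": image, "key": COLOR_KEYS[color]}


payload = build_captcha_payload(b"abc", "red")
print(payload)  # {'image': 'YWJj', 'key': '01'}
```

The returned dict can be passed straight to `requests.post(url, json=payload)`. Passing a `timeout` as well is a good habit, since the service is remote.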

The Java recognition code is as follows:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;


public class CaptchaRecognize {
    static String captcha_url = "http://47.99.174.98:8808/captcha";

    // Read the whole image file and return its contents Base64-encoded.
    public static String getBase64(String imgFile) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(imgFile));
        return Base64.getEncoder().encodeToString(data);
    }

    public static void captchaPost() throws IOException {
        String imgBase64 = getBase64("./imgs/img001.jpg");
        // JSON body: the Base64 image plus the color code ("03" = blue).
        String data = "{\"image\":\"" + imgBase64 + "\",\"key\":\"03\"}";
        URL url = new URL(captcha_url);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setDoInput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Accept", "application/json");
        // try-with-resources flushes and closes the streams even on error.
        try (OutputStreamWriter out = new OutputStreamWriter(
                conn.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(data);
        }
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // {"code": "Y2W"}
            }
        }
    }

    public static void main(String[] args) throws IOException {
        captchaPost();
    }
}

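Of course, you can also skip the captcha and verify an invoice directly. The verification rules are the same as on the State Taxation Administration's official website. The code is below. Use a POST request, and the result comes back within seconds; a GET request returns the web help page instead: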

import requests

# fpdm: invoice code; fphm: invoice number; rq: issue date (YYYYMMDD);
# jym: last six digits of the check code
data = {'fpdm': '044001505121', 'fphm': '38507145', 'rq': '20180926', 'jym': '865375'}
result = requests.post('http://47.99.174.98:8808/fp', json=data)
print(result.json())
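The four fields are easy to get wrong by hand, especially `rq` and `jym`. Here is a minimal sketch of a helper that formats the date and cuts the check code down to its last six digits. `build_invoice_payload` is a hypothetical name, the `YYYYMMDD` date format is inferred from the sample value above, and the full check code in the usage line is invented for illustration (only its last six digits match the sample).

```python
from datetime import date


def build_invoice_payload(fpdm, fphm, issue_date, check_code):
    """Return the JSON body for the /fp endpoint."""
    # Keep digits only, so a check code copied with spaces still works.
    digits = "".join(ch for ch in check_code if ch.isdigit())
    if len(digits) < 6:
        raise ValueError("check code must contain at least six digits")
    return {
        "fpdm": fpdm,
        "fphm": fphm,
        "rq": issue_date.strftime("%Y%m%d"),
        "jym": digits[-6:],  # the service expects the last six digits
    }


payload = build_invoice_payload(
    "044001505121", "38507145", date(2018, 9, 26),
    "12345 67890 12348 65375",  # made-up full check code
)
print(payload["rq"], payload["jym"])  # 20180926 865375
```

Taking a `date` object rather than a string means an impossible date fails when the object is constructed, before any request is sent.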

The verification result is as follows: