Is there a way to make a script that automatically corrects scanned documents?

I often scan handwritten documents to send to colleagues, and need to make corrections to the digital file once it's scanned. (For example, I change mistakes I made on the original document to white.)

I am thinking of some script which can do the following:

Take a color scan image (say a tiff) as input, and make simple corrections automatically based on colored corrections in the image.

For example take the simplest case: I write only black on white. There is an area where I made mistakes so I draw a red closed circle (with a pen on the actual sheet of paper) around that area. Then I scan the image (or usually many of them). Now I would like the script to erase each of these areas in all of the images so my mistakes disappear in the resulting image.

Any ideas on how to do this in a Linux environment, e.g. with ImageMagick?

It looks like GIMP with Script-Fu could be the way to go; it should be powerful enough. Can somebody give me a hint by showing what the above example would look like in Script-Fu?

2 Solutions

#1

I'm thinking of a solution based on ImageMagick. It would need the following steps:

Find the color used to draw on the scanned document (from now on called the target color);

Find its x and y coordinates in the image;

Pass this position as a seed to a flood fill algorithm.

The following ImageMagick commands implement these steps:

Output all the unique colors in the picture. This is used to find the RGB components of the target color:
```
convert <image> -unique-colors -depth 8 txt:- > output.txt
```
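For a real scan, output.txt can contain a long list of colors, so picking the target color by eye may be tedious. As a rough sketch, the listing could be filtered with awk to print the color in which red most strongly dominates green and blue. The function name `reddest_color` and the scoring rule are assumptions for illustration, and the parsing assumes 8-bit values as produced with `-depth 8`.

```shell
# Hypothetical helper: read ImageMagick "txt:" output on stdin and print the
# R,G,B triple in which red most strongly dominates the other two channels.
# Assumes 8-bit lines such as: 3,0: (201,34,40)  #C92228  srgb(201,34,40)
reddest_color() {
  awk -F'[:()]' '
    /^#/ { next }                     # skip the header comment line
    {
      split($3, c, ",")               # c[1]=R, c[2]=G, c[3]=B
      r = c[1] + 0; g = c[2] + 0; b = c[3] + 0
      m = (g > b) ? g : b
      score = r - m                   # large when red dominates both channels
      if (!seen || score > best) { best = score; out = r "," g "," b; seen = 1 }
    }
    END { if (seen) print out }
  '
}

# Possible usage (needs ImageMagick installed):
# convert scan.tiff -unique-colors -depth 8 txt:- | reddest_color
```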
Write the coordinates and color of every pixel to a text file:
```
convert <image> txt:- > coord.txt
```
Find the x and y coordinates of the target color. Suppose the target color found in the unique-colors output was red:
```
grep red coord.txt
```
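A plain `grep red` only finds pixels whose color ImageMagick prints with a name containing "red", and pen ink in a scan is rarely that exact. A possible alternative, sketched here under assumptions, is to test the numeric channels instead. The function name `find_red_seed` and the default thresholds (R at least 150, G and B at most 100) are illustrative guesses that would need tuning for a given pen and scanner.

```shell
# Hypothetical helper: read ImageMagick "txt:" output on stdin and print the
# "x,y" position of the first pixel that looks red.
# $1 = minimum red value (default 150), $2 = maximum green/blue value (default 100).
# Assumes 8-bit lines such as: 12,34: (201,34,40)  #C92228  srgb(201,34,40)
find_red_seed() {
  awk -F'[:()]' -v rmin="${1:-150}" -v gbmax="${2:-100}" '
    /^#/ { next }                     # skip the header comment line
    {
      split($3, c, ",")               # c[1]=R, c[2]=G, c[3]=B
      if (c[1] + 0 >= rmin && c[2] + 0 <= gbmax && c[3] + 0 <= gbmax) {
        print $1                      # the "x,y" part before the colon
        exit
      }
    }
  '
}

# Possible usage on the file written above:
# find_red_seed < coord.txt
```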
Finally, use x and y as the seed for a flood fill that replaces the circled region with your desired color. In this case, I've used white to erase the region:
```
convert <image> -fill white -fuzz 13% \
        -draw 'color <x>,<y> floodfill' <image_floodfill_output>
```
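Since the question mentions many scans at once, the commands above could be chained in a small function and run in a loop. This is only a sketch under assumptions: the function name `erase_from_red_seed`, the red thresholds, and the file names are hypothetical, and it handles a single seed per image. As with the command above, the fill covers the connected pixels whose color matches the seed pixel within the fuzz tolerance.

```shell
# Hypothetical helper: find the first strongly red pixel in a scan and
# flood-fill from it with white, writing the result to a new file.
# $1 = input image, $2 = output image.
erase_from_red_seed() {
  in=$1
  out=$2
  # First pixel with R >= 150 and G, B <= 100 (thresholds are assumptions).
  seed=$(convert "$in" -depth 8 txt:- | awk -F'[:()]' '
    /^#/ { next }
    { split($3, c, ",") }
    c[1] + 0 >= 150 && c[2] + 0 <= 100 && c[3] + 0 <= 100 { print $1; exit }
  ')
  if [ -z "$seed" ]; then
    cp "$in" "$out"                   # no red mark found: keep the scan unchanged
    return 0
  fi
  convert "$in" -fill white -fuzz 13% -draw "color $seed floodfill" "$out"
}

# Possible batch usage (file names are placeholders):
# for f in scan_*.tiff; do erase_from_red_seed "$f" "clean_$f"; done
```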

The -fuzz parameter makes the fill tolerant of color variation: pixels within 13% of the seed pixel's color are treated as matching, so pixels that were originally red but were altered by scanner noise are replaced as well.

This tutorial gives more information about the floodfill function, such as how to replace the edge colors.

#2

I would suggest looking at a ScanSnap scanner (perhaps the ScanSnap 3100). The bundled software can do several things that may be helpful.

You may find that any software / script that you find will not work the way you'd like. It sounds like many of these edits are things that need to be seen with a human eye. Perhaps you could hire a personal assistant to make these corrections for you. :)
