Memory leak in GetUTF8Text when using Tesseract for OCR

Date: 2021-11-17 23:22:35

I am using Tesseract OCR to read business cards. I have a memory leak that I can't resolve, and I don't know how to approach it.

In my code:

tesseract->Recognize(NULL); 
char* utf8Text = tesseract->GetUTF8Text();

The GetUTF8Text() call is reported as a memory leak. Here is the log from the memory leak instrument:

tesseract::TessBaseAPI::GetUTF8Text()
operator new[](unsigned long) libstdc++.6.dylib
operator new(unsigned long) libstdc++.6.dylib
malloc libsystem_c.dylib

After enough of these leaks, the app crashes. GetUTF8Text is declared in baseapi.h. I think Tesseract is written in C++, which I don't know. Can anyone help? Or does anyone have a version of Tesseract without this leak?

2 answers

#1



According to the documentation I found in baseapi.h:

/**
 * The recognized text is returned as a char* which is coded
 * as UTF8 and must be freed with the delete [] operator.
 */
char* GetUTF8Text();

So you need to call delete [] on utf8Text when you are done with it.

tesseract->Recognize(NULL); 
char* utf8Text = tesseract->GetUTF8Text();
... // use utf8Text, or copy it if you need it later
delete [] utf8Text; // release the buffer returned by GetUTF8Text()
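
If remembering the delete [] on every code path is a concern, one option is to hand the pointer to a std::unique_ptr<char[]> right away and copy the text into a std::string. The sketch below illustrates that idea only; it does not link against Tesseract. FakeGetUTF8Text is a hypothetical stand-in that mimics the ownership rule quoted from baseapi.h (a buffer allocated with new[] that the caller must free), and TakeOwnedText is a helper name made up for this example. Treating a null pointer as "no text" is also an assumption of the sketch.

```cpp
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for tesseract::TessBaseAPI::GetUTF8Text().
// It returns a new[]-allocated buffer that the caller owns, which is the
// contract described in the baseapi.h comment quoted above.
char* FakeGetUTF8Text() {
    const char kText[] = "Jane Doe\nAcme Corp\n";
    char* buffer = new char[sizeof(kText)];
    std::memcpy(buffer, kText, sizeof(kText));  // copies the trailing '\0' too
    return buffer;
}

// Hypothetical helper: takes ownership of a new[]-allocated C string and
// returns a std::string copy. The unique_ptr<char[]> calls delete[] when it
// goes out of scope, so the buffer is released on every return path.
std::string TakeOwnedText(char* raw) {
    std::unique_ptr<char[]> owned(raw);
    if (!owned) {
        return std::string();  // assumption: treat a null result as "no text"
    }
    return std::string(owned.get());
}
```

With the real API, the call site would then look something like std::string text = TakeOwnedText(tesseract->GetUTF8Text()); and no manual delete [] is needed afterwards.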

#2



From the Tesseract documentation:

The recognized text is returned as a char* which is coded as UTF8 and must be freed with the delete [] operator.


Put differently: it's your responsibility to free the memory, so it's your leak and not Tesseract's.
