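Parsing Word documents with Java POI 3.9

Time: 2022-09-23 06:25:26
Hi everyone. There is very little material on parsing Word 2003 .doc files with Java POI, so I'm hoping someone can help out with a few ideas.

I want to build a question-bank website whose questions come from Word documents (electronic copies of exam papers).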

1. Locating pictures with Java POI
   HWPFDocument doc = new HWPFDocument(new FileInputStream("2011小学数学毕业模拟试卷1.doc"));
   List<Picture> pics = doc.getPicturesTable().getAllPictures(); // all pictures in the document

   int length = doc.characterLength();
   for (int m = 0; m < length - 1; m++) {
       Range range = new Range(m, m + 1, doc);
       for (int j = 0; j < range.numCharacterRuns(); j++) {
           CharacterRun cr = range.getCharacterRun(j);
           if (cr.getPicOffset() != -1) {} // does this mean a picture? If so, record which picture in the list it is
       }
   }
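
One way to think about the position problem is to number the picture anchors in the extracted text. The sketch below is plain Java with no POI calls. It assumes that the text returned by doc.getRange().text() keeps a U+0001 character where each inline picture sits, and that the anchors appear in the same order as the list from getAllPictures(). Both are assumptions to check against a real file, since embedded OLE objects may use the same character. PictureMarkers and the "[img:n]" token are made-up names.

```java
class PictureMarkers {
    /** Assumed anchor character for an inline picture in the extracted text. */
    static final char PICTURE_ANCHOR = '\u0001';

    /** Replaces the n-th anchor character (0-based) with the token "[img:n]". */
    static String markPictures(String text) {
        StringBuilder out = new StringBuilder();
        int index = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == PICTURE_ANCHOR) {
                out.append("[img:").append(index++).append(']');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "1. Compute \u0001 and compare it with \u0001.";
        System.out.println(markPictures(text));
    }
}
```

The token index can then be used to look up the matching entry in the picture list when the question is rendered.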

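2. Advanced math formulas in the math papers
   Idea 1: POI can extract them as WMF images (Windows Metafile), but WMF is not a format web pages can display.
   Idea 2: With MathType installed in Word, formulas edited with the MathType equation editor can be converted to MathML; I would then parse the mixed document of Word text and MathML.
   But how should the math formulas be stored in the database?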
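
One possible answer to the storage question, as a sketch only: MathML is XML text, so it can sit in an ordinary text column, with the question body holding a placeholder token for each formula. The "{{f:ID}}" token format and the QuestionRenderer class are hypothetical, not something from POI or MathType.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class QuestionRenderer {
    /** Replaces each "{{f:ID}}" token in the body with the MathML stored under ID. */
    static String render(String body, Map<String, String> formulas) {
        String html = body;
        for (Map.Entry<String, String> e : formulas.entrySet()) {
            html = html.replace("{{f:" + e.getKey() + "}}", e.getValue());
        }
        return html;
    }

    public static void main(String[] args) {
        // In a real system these strings would come from database rows.
        Map<String, String> formulas = new LinkedHashMap<>();
        formulas.put("1", "<math><mfrac><mn>1</mn><mn>2</mn></mfrac></math>");
        System.out.println(render("Simplify {{f:1}}.", formulas));
    }
}
```

Keeping the formulas separate from the body text means the same stored MathML could later be rendered by the browser or by a library such as MathJax without re-parsing the Word file.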

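3. Could other tools do this instead?
   1. I read online that C# is better at parsing Word documents (it is Microsoft's own format)?
   2. Do it with VBA?
   3. I don't know whether parsing PDF with iText would be somewhat simpler.

29 replies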

#1


Are you sure there is little material on parsing Word with Java?

#2


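What I have found all feels shallow and is not what I need. The official API docs for the Word part of POI also describe the methods only briefly.
Could anyone with hands-on experience chip in with some advice?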

#3


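import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Picture;
import org.apache.poi.hwpf.usermodel.Range;

InputStream input = null;
HWPFDocument doc = null;
String content = null;
try {
    input = new FileInputStream("2011小学数学毕业模拟试卷1.doc");
    doc = new HWPFDocument(input);
    content = doc.getRange().text();
    // Read the math formulas in the Word file into a List as WMF pictures
    List<Picture> picA = doc.getPicturesTable().getAllPictures();
    for (int i = 0; i < picA.size(); i++) {
        Picture p = picA.get(i);
        if (p != null) {
            FileOutputStream output = new FileOutputStream("F://" + i + "." + p.suggestFileExtension());
            try {
                p.writeImageContent(output);
            } finally {
                output.close();
            }
        }
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    if (input != null) {
        try { input.close(); } catch (IOException ignored) { }
    }
}

int pici = 1;
if (doc != null) { // doc stays null if loading failed above
    int length = doc.characterLength();
    for (int m = 0; m < length - 1; m++) {
        Range range = new Range(m, m + 1, doc);
        for (int j = 0; j < range.numCharacterRuns(); j++) {
            CharacterRun cr = range.getCharacterRun(j);
            if (cr.isOle2()) { // an OLE2 object; formulas edited with MathType are stored as OLE objects
                cr.replaceText(cr.text(), "Equation" + pici);
                // seems to replace only once
                pici++;
            }
        }
    }
}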

#4


Following along to learn.

#5


I have run into the same problem.

#6


Did you get this working in the end? I would like to ask you about it.

#7


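Replying to the original poster, about reading Word formulas from Java.

I happened to hit the same problem today. I used jacob to convert the Word file to HTML; the formulas in the Word file become images.
The HTML is then parsed with Jsoup.

The drawback is that with jacob the server has to run Windows, and Office 2007 has to be installed.


I spent the whole day searching and found nothing in Java that can work with Word formulas directly.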

#8


Take a look at OpenOffice. It is cross-platform, and formulas in the Word file are handled as images.
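
A rough sketch of what driving a headless office conversion from Java could look like. It uses the command-line options of LibreOffice (the OpenOffice descendant); Apache OpenOffice itself may need a different invocation, and "soffice" being on the PATH is an assumption. HeadlessConvert is a made-up class name.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

class HeadlessConvert {
    /** Builds the command line; "soffice" is assumed to be on the PATH. */
    static List<String> command(File doc, File outDir) {
        return Arrays.asList("soffice", "--headless", "--convert-to", "html",
                "--outdir", outDir.getPath(), doc.getPath());
    }

    /** Runs the conversion and returns the process exit code. */
    static int convert(File doc, File outDir) {
        try {
            Process p = new ProcessBuilder(command(doc, outDir)).inheritIO().start();
            return p.waitFor();
        } catch (Exception e) {
            throw new IllegalStateException("conversion failed to run", e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 2) {
            // Usage: java HeadlessConvert <input.doc> <output directory>
            System.out.println(convert(new File(args[0]), new File(args[1])));
        } else {
            // Without arguments, only show the command that would be run.
            System.out.println(command(new File("paper.doc"), new File("out")));
        }
    }
}
```

How the formulas come out in the generated HTML still has to be checked on a sample paper.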

#9


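On the first question: material may be scarce, but the official API docs do have examples, and the POI download also ships demos. Follow them and test what each one does.
On the second question: every symbol in a math formula should have a corresponding ASCII code, so you can just store the code numbers.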
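
To make the "store the code numbers" idea concrete, here is a small sketch. Math symbols such as π or √ are outside ASCII, so the numbers in question would be Unicode code points. A flat sequence of code points also carries no layout, so fractions, roots over expressions and similar two-dimensional structure would be lost. SymbolCodes is a made-up class name.

```java
class SymbolCodes {
    /** Returns the Unicode code points of the text, for example to store them as numbers. */
    static int[] codePoints(String text) {
        return text.codePoints().toArray();
    }

    public static void main(String[] args) {
        // "\u03C0 \u2248 3" is the text "pi, almost-equal sign, 3" written with escapes.
        for (int cp : codePoints("\u03C0 \u2248 3")) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```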

#10


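If the server runs Windows there are more options, such as the jacob approach mentioned in #7 or a plug-in you develop yourself. Otherwise use OpenOffice, though its support is not great either; applied styles sometimes do not render accurately.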

#11


Did you ever solve parsing formulas in Word? I saw earlier that VBA was suggested, but we need it to be cross-platform, so that is a real blow.

#12


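Quoting tlfu_12344's reply in #11:
Did you ever solve parsing formulas in Word? I saw earlier that VBA was suggested, but we need it to be cross-platform, so that is a real blow.


It was never implemented, so the whole thing fizzled out! Does anyone have a workable approach now?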

#13


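Quoting SHENZHOUCHEN91's reply in #12:
Quote: Quoting tlfu_12344's reply in #11:

Did you ever solve parsing formulas in Word? I saw earlier that VBA was suggested, but we need it to be cross-platform, so that is a real blow.


It was never implemented, so the whole thing fizzled out! Does anyone have a workable approach now?
Yes, I have an approach now and am working on it! I am parsing with OpenOffice; for docx files you have to modify the configuration files inside them yourself.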

#14


Problem solved:
[image] [image]

#15


Could the original poster describe the solution in detail? I need to build this soon.

#16


Quoting tlfu_12344's reply in #14:
Problem solved:
[image] [image]
How did you solve it? Please share, I urgently need this feature!

#17


Quoting tlfu_12344's reply in #14:
Problem solved:
[image] [image]
Contact email: 2213429531@qq.com

#18


Quoting tlfu_12344's reply in #14:
Problem solved:
[image] [image]

How did you solve it?

#19


Obviously the formulas were just captured as images.

#20


Quoting gangzai626919's reply in #19:
Obviously the formulas were just captured as images.
Obviously you don't understand MathML and OOXML!
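
For readers wondering what the OOXML route could involve: a .docx file is a zip archive, and equations typed with Word's built-in editor are stored in word/document.xml as oMath elements in the Office math namespace. The sketch below only counts them, using the JDK alone. It does not cover MathType OLE objects or the old binary .doc format, and OmmlCounter and sampleZip are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class OmmlCounter {
    static final String MATH_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math";
    static final String WORD_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts oMath elements in word/document.xml; returns -1 if that part is missing. */
    static int countEquations(InputStream docx) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals("word/document.xml")) continue;
                // Copy the entry into memory so the XML parser cannot close the zip stream.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) buf.write(chunk, 0, n);
                DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
                f.setNamespaceAware(true);
                Document dom = f.newDocumentBuilder().parse(new ByteArrayInputStream(buf.toByteArray()));
                return dom.getElementsByTagNameNS(MATH_NS, "oMath").getLength();
            }
            return -1;
        } catch (Exception e) {
            throw new IllegalStateException("could not read docx", e);
        }
    }

    /** Builds a minimal zip holding only word/document.xml, for demonstration. */
    static byte[] sampleZip(String documentXml) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry("word/document.xml"));
                zip.write(documentXml.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:document xmlns:w=\"" + WORD_NS + "\" xmlns:m=\"" + MATH_NS + "\">"
                + "<w:body><w:p><m:oMath/></w:p></w:body></w:document>";
        System.out.println(countEquations(new ByteArrayInputStream(sampleZip(xml))));
    }
}
```

Turning each oMath element into MathML for the browser is a separate step that this sketch does not attempt.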

#21


MathJax really is very good.

#22


Requesting a demo; a single code fragment is not enough to get this working.

#23


This reply was deleted by a moderator on 2014-05-09 08:29:56.

#24


Quoting guoluqiang's reply in #16:
Quote: Quoting tlfu_12344's reply in #14:

Problem solved:
[image] [image]
How did you solve it? Please share, I urgently need this feature!

How did you solve it? Please help.

#25


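Quoting chen6013143's reply in #22:
Requesting a demo; a single code fragment is not enough to get this working.

Did you manage to get it working? Please help.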

#26


private List<StartEnd> CutRange(List<StartEnd> fromList, List<StartEnd> cutList)
        {
            List<StartEnd> resultList = new List<StartEnd>();
            for (int i = 0; i < fromList.Count; i++)
            {
                bool SAVE = true;
                for (int j = 0; j < cutList.Count; j++)
                {
                    if (fromList[i].Start >= cutList[j].Start && fromList[i].End <= cutList[j].End)
                    {
                        SAVE = false;
                        if(fromList[i].Start == cutList[j].Start)
                        {
                            resultList.Add(cutList[j]);
                        }
                    }
                }
                if (SAVE)
                    resultList.Add(fromList[i]);
            }
            return resultList;
        }


        public string GetSelectionImg(string paperName)
        {
            cutTimes = 1;
            cutTimesCount = 0;

            const int MAX_height = 3000;
            //string timeResult = "";

            Range range = Globals.ThisAddIn.Application.Selection.Range;
            Range range2 = Globals.ThisAddIn.Application.Selection.Range;

            //timeResult = timeResult + "Time1:" + DateTime.Now.ToString() + "\n";
            string imgName = Guid.NewGuid().ToString()+".png";
            if(!Directory.Exists(Globals.ThisAddIn.exerciseJsonPath+paperName))
            {
                Directory.CreateDirectory(Globals.ThisAddIn.exerciseJsonPath + paperName);
            }


            double zoom = 0.33;
            const int imgWidth = 1188;
            

            Image imgTemp = Metafile.FromStream(new MemoryStream(range.EnhMetaFileBits));


            //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);


            if (MAX_height < imgTemp.Height)
            {
                Paragraphs paragraphs = range.Paragraphs;
                Tables tables = range.Tables;

                List<StartEnd> paragraphList = new List<StartEnd>();
                List<StartEnd> tableList = new List<StartEnd>();

                for(int i=0;i<paragraphs.Count;i++)
                {
                    Paragraph paragraph=paragraphs[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = paragraph.Range.Start;
                    startEnd.End = paragraph.Range.End;

                    paragraphList.Add(startEnd);
                }
                for(int i=0;i<tables.Count;i++)
                {
                    Table table=tables[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = table.Range.Start;
                    startEnd.End = table.Range.End;

                    tableList.Add(startEnd);
                }

                List<StartEnd> resultList = CutRange(paragraphList,tableList);

                List<StartEnd> finalImgRangeList = new List<StartEnd>();
                for (int i = 0; i < resultList.Count;i++ )
                {
                    StartEnd startendImg = new StartEnd();
                    startendImg.Start = resultList[i].Start;
                    startendImg.End = resultList[i].End;
                    for(int j=i;j<resultList.Count;j++)
                    {
                        range2.SetRange((int)resultList[i].Start,(int)resultList[j].End);
                        Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                        if(img.Height<MAX_height)
                        {
                            startendImg.End = resultList[j].End;
                            if (j == resultList.Count - 1)
                                i = j;
                        }
                        else
                        {
                            if(i==j)
                            {
                                MessageBox.Show("Please make sure no single paragraph or table is taller than one page");
                                Globals.ThisAddIn.Application.ActiveWindow.ScrollIntoView(range2);

                                return "";
                            }
                            i = j-1;
                            break;
                        }
                    }
                    finalImgRangeList.Add(startendImg);
                }
                cutTimes = finalImgRangeList.Count;


                //timeResult = timeResult + "Time2:" + DateTime.Now.ToString() + "\n";


                int allImgHeight = 0;
                int allImgWidth = 0;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start,(int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                    if (img.Width > allImgWidth)
                        allImgWidth = img.Width;

                    allImgHeight += img.Height;
                }
                    //for(int i=0;i<resultList.Count;i++)
                    //{
                    //    int start=(int)resultList[i].Start;
                    //    int end=(int)resultList[i].End;
                    //    range2.SetRange(start,end);
                    //    if (i % 2 == 0)
                    //        range2.HighlightColorIndex = WdColorIndex.wdRed;
                    //    else
                    //        range2.HighlightColorIndex = WdColorIndex.wdYellow;
                    //}

                //zoom = (double)imgWidth / (double)allImgWidth;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(allImgHeight * zoom));
                
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // create the Graphics object
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // Render at high quality and low speed.
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;
                int startPosition = 0;
                double oldZoom = zoom;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start, (int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));

                    if ((double)imgWidth / (double)img.Width < zoom)
                        zoom = (double)imgWidth / (double)img.Width;

                    gx.FillRectangle(new SolidBrush(System.Drawing.Color.Transparent), 0, startPosition, (int)(img.Width * zoom), (int)(img.Height * zoom));
                    gx.DrawImage(img, new System.Drawing.Rectangle(0, startPosition, (int)(img.Width * zoom), (int)(img.Height*zoom)));

                    startPosition += (int)(img.Height * zoom);
                    zoom = oldZoom;

                    cutTimesCount = i + 1;
                }

                //bmp = KiSharpen(bmp,(float)0.3);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            else
            {
                //zoom = (double)imgWidth / (double)imgTemp.Width;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(imgTemp.Height * zoom));
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // create the Graphics object
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // Render at high quality and low speed.
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;

                gx.FillRectangle(new SolidBrush(System.Drawing.Color.Transparent), 0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom));
                gx.DrawImage(imgTemp, new System.Drawing.Rectangle(0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom)));
                //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            //timeResult = timeResult + "Time3:" + DateTime.Now.ToString() + "\n";
            
            cutTimes = 1;
            cutTimesCount = 0;

            //MessageBox.Show(timeResult);
            return imgName;
        }
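
For readers following the Java side of the thread, here is the CutRange helper from the C# code above ported to plain Java as a sketch. It keeps the same logic: a paragraph span that lies inside a table span is dropped, and the table span is emitted once, at the paragraph that starts where the table starts. RangeCutter and the field names are made up; StartEnd mirrors the type used in the post, whose definition was not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RangeCutter {
    /** A character span with a start and an end position, mirroring StartEnd in the post. */
    static final class StartEnd {
        final int start;
        final int end;
        StartEnd(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Replaces the spans of fromList that fall inside a span of cutList with that span. */
    static List<StartEnd> cutRange(List<StartEnd> fromList, List<StartEnd> cutList) {
        List<StartEnd> result = new ArrayList<>();
        for (StartEnd from : fromList) {
            boolean keep = true;
            for (StartEnd cut : cutList) {
                if (from.start >= cut.start && from.end <= cut.end) {
                    keep = false;            // the paragraph lies inside a table
                    if (from.start == cut.start) {
                        result.add(cut);     // emit the table once, at its first paragraph
                    }
                }
            }
            if (keep) result.add(from);
        }
        return result;
    }

    public static void main(String[] args) {
        List<StartEnd> paragraphs = Arrays.asList(
                new StartEnd(0, 10), new StartEnd(10, 15), new StartEnd(15, 20), new StartEnd(20, 30));
        List<StartEnd> tables = Arrays.asList(new StartEnd(10, 20));
        System.out.println(cutRange(paragraphs, tables)); // prints [[0,10), [10,20), [20,30)]
    }
}
```

The result is a list of units (whole paragraphs or whole tables) that the rest of the C# code groups into images no taller than MAX_height.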

#27


Quoting guoluqiang's reply in #26 (the C# code above):

Clearly this is not cross-platform.

#28


Showing off without sharing anything, and mocking other people.

#29


Original poster, how exactly did you read the formulas out of Word?

#1


你确定java 解析word 的资料少?

#2


感觉都讲得很浅,并不是我想要的   poi word部分其官方的API,一些方法也讲得很简单
有过经验的朋友能一起来帮忙给个建议吗?

#3


InputStream input = null;  
HWPFDocument doc = null;
String content = null;
try {
input = new FileInputStream("2011小学数学毕业模拟试卷1.doc");
doc = new HWPFDocument(input);
content = doc.getRange().text();
        List<Picture> picA = doc.getPicturesTable().getAllPictures();// 把word中数学公式以wmf图片的格式读到List中
Picture p = null;
for (int i = 0; i < picA.size(); i++) {
    p = (Picture) picA.get(i);
    if(p!=null){
        FileOutputStream output = new FileOutputStream("F://" + i + "."+p.suggestFileExtension());
        p.writeImageContent(output); 
        output.close();
    }
}
} catch (Exception e) {
e.printStackTrace();
}

int pici = 1;
int length = doc.characterLength();
for (int m = 0; m < length - 1; m++) { 
Range range = new Range(m, m + 1, doc); 
for (int j = 0; j < range.numCharacterRuns(); j++) {
CharacterRun cr = range.getCharacterRun(j);
if (cr.isOle2()) {     // 表明是ole2对象,word文档数学公式由mathtype编辑的ole对象
cr.replaceText(cr.text(), "方程式"+pici);
                        //貌似只会替换一次
pici++;
}
}
}

#4


Java poi3.9解析word文档学习。

#5


我也遇到这样的问题

#6


大哥,现在实现了没有?小弟想请教一下

#7


回复楼主 关于java读取word 公式问题

刚好今天我也碰到一个这个样的问题, 我用的是jacob 将word 转成html, word里面的公式会变成图片。
html就用Jsoup 解析的。

缺陷是 用jacob 服务器必须是windows 的。 而且需要装office2007 


今天找了一天的资料。没发现java可以操作word 公式的。。 

#8


用openoffice看看,是跨平台的,word中的公司是作为图片处理的

#9


第一问,也许资料少,但是官方API里肯定有例子,下载后的POI方法中也有demo。照着学,挨个测试一下效果。
第二问,数学公式中的任何符号都有对应的ASSIC编码吧,存储编码数字就可以了

#10


服务器是windows的方法多些如7楼说的jacob\及自己开发的插件,其它的用 openoffice但支持的也不是很好,加的样式什么的有时显示不太准确

#11


关于解析word中公式你解决没,之前看有说是用VBA,可是我们是要求可跨平台的,这下伤不起了 Java poi3.9解析word文档

#12


引用 11 楼 tlfu_12344 的回复:
关于解析word中公式你解决没,之前看有说是用VBA,可是我们是要求可跨平台的,这下伤不起了 Java poi3.9解析word文档


后来因为没有实现,这个不了了之了!不知道现在有什么实现思路了吗?

#13


引用 12 楼 SHENZHOUCHEN91 的回复:
Quote: 引用 11 楼 tlfu_12344 的回复:

关于解析word中公式你解决没,之前看有说是用VBA,可是我们是要求可跨平台的,这下伤不起了 Java poi3.9解析word文档


后来因为没有实现,这个不了了之了!不知道现在有什么实现思路了吗?
嗯,有思路了,正在做!,用openoffice解析,对于docx文件,要自己改里面的配置文件

#14


问题已经解决:
Java poi3.9解析word文档Java poi3.9解析word文档

#15


楼主能详细说一下解决过程吗 我最近需要做这个东西。

#16


引用 14 楼 tlfu_12344 的回复:
问题已经解决:
Java poi3.9解析word文档Java poi3.9解析word文档
怎么解决的?大神快赐教!!!!最近急需这个功能

#17


引用 14 楼 tlfu_12344 的回复:
问题已经解决:
Java poi3.9解析word文档Java poi3.9解析word文档
联系方式Email:2213429531@qq.com

#18


引用 14 楼 tlfu_12344 的回复:
问题已经解决:
Java poi3.9解析word文档Java poi3.9解析word文档

怎么解决的?

#19


显然是把公式截成图片了啊

#20


引用 19 楼 gangzai626919 的回复:
显然是把公式截成图片了啊
显然你不懂mathml.ooxml!

#21


的确mathjax很不错 Java poi3.9解析word文档

#22


求demo 只给一段代码搞不好

#23


该回复于2014-05-09 08:29:56被管理员删除

#24


引用 16 楼 guoluqiang 的回复:
Quote: 引用 14 楼 tlfu_12344 的回复:

问题已经解决:
Java poi3.9解析word文档Java poi3.9解析word文档
怎么解决的?大神快赐教!!!!最近急需这个功能

怎么解决的呀,求帮助

#25


引用 22 楼 chen6013143 的回复:
求demo 只给一段代码搞不好

大神请问你弄好了吗,求帮助

#26


private List<StartEnd> CutRange(List<StartEnd> fromList, List<StartEnd> cutList)
        {
            List<StartEnd> resultList = new List<StartEnd>();
            for (int i = 0; i < fromList.Count; i++)
            {
                bool SAVE = true;
                for (int j = 0; j < cutList.Count; j++)
                {
                    if (fromList[i].Start >= cutList[j].Start && fromList[i].End <= cutList[j].End)
                    {
                        SAVE = false;
                        if(fromList[i].Start == cutList[j].Start)
                        {
                            resultList.Add(cutList[j]);
                        }
                    }
                }
                if (SAVE)
                    resultList.Add(fromList[i]);
            }
            return resultList;
        }


        public string GetSelectionImg(string paperName)
        {
            cutTimes = 1;
            cutTimesCount = 0;

            const int MAX_height = 3000;
            //string timeResult = "";

            Range range = Globals.ThisAddIn.Application.Selection.Range;
            Range range2 = Globals.ThisAddIn.Application.Selection.Range;

            //timeResult = timeResult + "Time1:" + DateTime.Now.ToString() + "\n";
            string imgName = Guid.NewGuid().ToString()+".png";
            if(!Directory.Exists(Globals.ThisAddIn.exerciseJsonPath+paperName))
            {
                Directory.CreateDirectory(Globals.ThisAddIn.exerciseJsonPath + paperName);
            }


            double zoom = 0.33;
            const int imgWidth = 1188;
            

            Image imgTemp = Metafile.FromStream(new MemoryStream(range.EnhMetaFileBits));


            //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);


            if (MAX_height < imgTemp.Height)
            {
                Paragraphs paragraphs = range.Paragraphs;
                Tables tables = range.Tables;

                List<StartEnd> paragraphList = new List<StartEnd>();
                List<StartEnd> tableList = new List<StartEnd>();

                for(int i=0;i<paragraphs.Count;i++)
                {
                    Paragraph paragraph=paragraphs[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = paragraph.Range.Start;
                    startEnd.End = paragraph.Range.End;

                    paragraphList.Add(startEnd);
                }
                for(int i=0;i<tables.Count;i++)
                {
                    Table table=tables[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = table.Range.Start;
                    startEnd.End = table.Range.End;

                    tableList.Add(startEnd);
                }

                List<StartEnd> resultList = CutRange(paragraphList,tableList);

                List<StartEnd> finalImgRangeList = new List<StartEnd>();
                for (int i = 0; i < resultList.Count;i++ )
                {
                    StartEnd startendImg = new StartEnd();
                    startendImg.Start = resultList[i].Start;
                    startendImg.End = resultList[i].End;
                    for(int j=i;j<resultList.Count;j++)
                    {
                        range2.SetRange((int)resultList[i].Start,(int)resultList[j].End);
                        Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                        if(img.Height<MAX_height)
                        {
                            startendImg.End = resultList[j].End;
                            if (j == resultList.Count - 1)
                                i = j;
                        }
                        else
                        {
                            if(i==j)
                            {
                                MessageBox.Show("请确定没有超过一页的段落或表格");
                                Globals.ThisAddIn.Application.ActiveWindow.ScrollIntoView(range2);

                                return "";
                            }
                            i = j-1;
                            break;
                        }
                    }
                    finalImgRangeList.Add(startendImg);
                }
                cutTimes = finalImgRangeList.Count;


                //timeResult = timeResult + "Time2:" + DateTime.Now.ToString() + "\n";


                int allImgHeight = 0;
                int allImgWidth = 0;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start,(int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                    if (img.Width > allImgWidth)
                        allImgWidth = img.Width;

                    allImgHeight += img.Height;
                }
                    //for(int i=0;i<resultList.Count;i++)
                    //{
                    //    int start=(int)resultList[i].Start;
                    //    int end=(int)resultList[i].End;
                    //    range2.SetRange(start,end);
                    //    if (i % 2 == 0)
                    //        range2.HighlightColorIndex = WdColorIndex.wdRed;
                    //    else
                    //        range2.HighlightColorIndex = WdColorIndex.wdYellow;
                    //}

                //zoom = (double)imgWidth / (double)allImgWidth;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(allImgHeight * zoom));
                
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // 创建Graphics对象 
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // 指定高质量、低速度呈现。  
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;
                int startPosition = 0;
                double oldZoom = zoom;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start, (int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));

                    if ((double)imgWidth / (double)img.Width < zoom)
                        zoom = (double)imgWidth / (double)img.Width;

                    gx.FillRectangle(new SolidBrush(System.Drawing.Color.Transparent), 0, startPosition, (int)(img.Width * zoom), (int)(img.Height * zoom));
                    gx.DrawImage(img, new System.Drawing.Rectangle(0, startPosition, (int)(img.Width * zoom), (int)(img.Height*zoom)));

                    startPosition += (int)(img.Height * zoom);
                    zoom = oldZoom;

                    cutTimesCount = i + 1;
                }

                //bmp = KiSharpen(bmp,(float)0.3);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            else
            {
                //zoom = (double)imgWidth / (double)imgTemp.Width;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(imgTemp.Height * zoom));
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // 创建Graphics对象 
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // Use high-quality, lower-speed rendering.
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;

                // Use the shared stock brush instead of allocating a SolidBrush that is never disposed.
                gx.FillRectangle(System.Drawing.Brushes.Transparent, 0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom));
                gx.DrawImage(imgTemp, new System.Drawing.Rectangle(0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom)));
                //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            //timeResult = timeResult + "Time3:" + DateTime.Now.ToString() + "\n";
            
            cutTimes = 1;
            cutTimesCount = 0;

            //MessageBox.Show(timeResult);
            return imgName;
        }
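
The splitting step in GetSelectionImg (merging consecutive paragraph/table ranges until the rendered image would reach MAX_height) does not depend on Word itself. For anyone on the POI side of this thread, here is a minimal Java sketch of that greedy grouping. Span, RangeGrouper and the heights array are hypothetical names. It also assumes each block's height is known up front and that heights simply add up, whereas the C# code re-renders the merged range through EnhMetaFileBits to measure it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the StartEnd class used in the C# code. */
final class Span {
    final int start;
    final int end;

    Span(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class RangeGrouper {
    /**
     * Greedily merges consecutive spans into groups whose summed height
     * stays strictly below maxHeight, mirroring the img.Height < MAX_height
     * check. heights[i] is the assumed rendered height of spans.get(i).
     */
    static List<Span> group(List<Span> spans, int[] heights, int maxHeight) {
        List<Span> groups = new ArrayList<>();
        int i = 0;
        while (i < spans.size()) {
            // A single block that already reaches the limit cannot be placed;
            // the C# code shows a message box and returns "" in this case.
            if (heights[i] >= maxHeight) {
                throw new IllegalArgumentException("block " + i + " does not fit within the height limit");
            }
            int total = heights[i];
            int j = i + 1;
            // Keep absorbing the next block while the group stays below the limit.
            while (j < spans.size() && total + heights[j] < maxHeight) {
                total += heights[j];
                j++;
            }
            groups.add(new Span(spans.get(i).start, spans.get(j - 1).end));
            i = j; // the next group starts at the first block that did not fit
        }
        return groups;
    }
}
```

With four blocks of heights 1000, 1500, 1000 and 500 and a limit of 3000, this yields two groups: the first two blocks together, then the last two.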

#27


Quoting guoluqiang's reply in #26:
private List<StartEnd> CutRange(List<StartEnd> fromList, List<StartEnd> cutList)
        {
            List<StartEnd> resultList = new List<StartEnd>();
            for (int i = 0; i < fromList.Count; i++)
            {
                bool SAVE = true;
                for (int j = 0; j < cutList.Count; j++)
                {
                    if (fromList[i].Start >= cutList[j].Start && fromList[i].End <= cutList[j].End)
                    {
                        SAVE = false;
                        if(fromList[i].Start == cutList[j].Start)
                        {
                            resultList.Add(cutList[j]);
                        }
                    }
                }
                if (SAVE)
                    resultList.Add(fromList[i]);
            }
            return resultList;
        }


        public string GetSelectionImg(string paperName)
        {
            cutTimes = 1;
            cutTimesCount = 0;

            const int MAX_height = 3000;
            //string timeResult = "";

            Range range = Globals.ThisAddIn.Application.Selection.Range;
            Range range2 = Globals.ThisAddIn.Application.Selection.Range;

            //timeResult = timeResult + "Time1:" + DateTime.Now.ToString() + "\n";
            string imgName = Guid.NewGuid().ToString()+".png";
            if(!Directory.Exists(Globals.ThisAddIn.exerciseJsonPath+paperName))
            {
                Directory.CreateDirectory(Globals.ThisAddIn.exerciseJsonPath + paperName);
            }


            double zoom = 0.33;
            const int imgWidth = 1188;
            

            Image imgTemp = Metafile.FromStream(new MemoryStream(range.EnhMetaFileBits));


            //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);


            if (MAX_height < imgTemp.Height)
            {
                Paragraphs paragraphs = range.Paragraphs;
                Tables tables = range.Tables;

                List<StartEnd> paragraphList = new List<StartEnd>();
                List<StartEnd> tableList = new List<StartEnd>();

                for(int i=0;i<paragraphs.Count;i++)
                {
                    Paragraph paragraph=paragraphs[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = paragraph.Range.Start;
                    startEnd.End = paragraph.Range.End;

                    paragraphList.Add(startEnd);
                }
                for(int i=0;i<tables.Count;i++)
                {
                    Table table=tables[i+1];
                    StartEnd startEnd = new StartEnd();
                    startEnd.Start = table.Range.Start;
                    startEnd.End = table.Range.End;

                    tableList.Add(startEnd);
                }

                List<StartEnd> resultList = CutRange(paragraphList,tableList);

                List<StartEnd> finalImgRangeList = new List<StartEnd>();
                for (int i = 0; i < resultList.Count;i++ )
                {
                    StartEnd startendImg = new StartEnd();
                    startendImg.Start = resultList[i].Start;
                    startendImg.End = resultList[i].End;
                    for(int j=i;j<resultList.Count;j++)
                    {
                        range2.SetRange((int)resultList[i].Start,(int)resultList[j].End);
                        Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                        if(img.Height<MAX_height)
                        {
                            startendImg.End = resultList[j].End;
                            if (j == resultList.Count - 1)
                                i = j;
                        }
                        else
                        {
                            if(i==j)
                            {
                                MessageBox.Show("Make sure no paragraph or table is longer than one page.");
                                Globals.ThisAddIn.Application.ActiveWindow.ScrollIntoView(range2);

                                return "";
                            }
                            i = j-1;
                            break;
                        }
                    }
                    finalImgRangeList.Add(startendImg);
                }
                cutTimes = finalImgRangeList.Count;


                //timeResult = timeResult + "Time2:" + DateTime.Now.ToString() + "\n";


                int allImgHeight = 0;
                int allImgWidth = 0;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start,(int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));
                    if (img.Width > allImgWidth)
                        allImgWidth = img.Width;

                    allImgHeight += img.Height;
                }
                    //for(int i=0;i<resultList.Count;i++)
                    //{
                    //    int start=(int)resultList[i].Start;
                    //    int end=(int)resultList[i].End;
                    //    range2.SetRange(start,end);
                    //    if (i % 2 == 0)
                    //        range2.HighlightColorIndex = WdColorIndex.wdRed;
                    //    else
                    //        range2.HighlightColorIndex = WdColorIndex.wdYellow;
                    //}

                //zoom = (double)imgWidth / (double)allImgWidth;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(allImgHeight * zoom));
                
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // create the Graphics object
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // Use high-quality, lower-speed rendering.
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;
                int startPosition = 0;
                double oldZoom = zoom;
                for (int i = 0; i < finalImgRangeList.Count; i++)
                {
                    range2.SetRange((int)finalImgRangeList[i].Start, (int)finalImgRangeList[i].End);
                    Image img = Metafile.FromStream(new MemoryStream(range2.EnhMetaFileBits));

                    if ((double)imgWidth / (double)img.Width < zoom)
                        zoom = (double)imgWidth / (double)img.Width;

                    // Use the shared stock brush so no undisposed SolidBrush is allocated on every iteration.
                    gx.FillRectangle(System.Drawing.Brushes.Transparent, 0, startPosition, (int)(img.Width * zoom), (int)(img.Height * zoom));
                    gx.DrawImage(img, new System.Drawing.Rectangle(0, startPosition, (int)(img.Width * zoom), (int)(img.Height*zoom)));

                    startPosition += (int)(img.Height * zoom);
                    zoom = oldZoom;

                    cutTimesCount = i + 1;
                }

                //bmp = KiSharpen(bmp,(float)0.3);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            else
            {
                //zoom = (double)imgWidth / (double)imgTemp.Width;
                System.Drawing.Bitmap bmp = new Bitmap(imgWidth, (int)(imgTemp.Height * zoom));
                System.Drawing.Graphics gx = System.Drawing.Graphics.FromImage(bmp); // create the Graphics object
                gx.InterpolationMode = InterpolationMode.HighQualityBicubic;
                // Use high-quality, lower-speed rendering.
                gx.SmoothingMode = SmoothingMode.HighQuality;
                gx.CompositingQuality = CompositingQuality.HighQuality;

                gx.CompositingMode = CompositingMode.SourceOver;
                gx.TextRenderingHint = System.Drawing.Text.TextRenderingHint.ClearTypeGridFit;

                // Use the shared stock brush instead of allocating a SolidBrush that is never disposed.
                gx.FillRectangle(System.Drawing.Brushes.Transparent, 0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom));
                gx.DrawImage(imgTemp, new System.Drawing.Rectangle(0, 0, (int)(imgTemp.Width * zoom), (int)(imgTemp.Height * zoom)));
                //imgTemp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
                bmp.Save(Globals.ThisAddIn.exerciseJsonPath + paperName + "\\" + imgName, System.Drawing.Imaging.ImageFormat.Png);
            }
            //timeResult = timeResult + "Time3:" + DateTime.Now.ToString() + "\n";
            
            cutTimes = 1;
            cutTimesCount = 0;

            //MessageBox.Show(timeResult);
            return imgName;
        }

Clearly this isn't cross-platform.
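
To be fair, only the rendering part (Range.EnhMetaFileBits plus System.Drawing) ties the code to Word on Windows. CutRange itself is plain list handling. A rough Java port of it is below, assuming StartEnd simply holds two int offsets (the quoted code does not show its field types) and with RangeCutter as a made-up class name:

```java
import java.util.ArrayList;
import java.util.List;

/** Assumed shape of the StartEnd class from the quoted C# code. */
final class StartEnd {
    final int start;
    final int end;

    StartEnd(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class RangeCutter {
    /**
     * Port of CutRange: every entry of fromList that lies inside some entry
     * of cutList is dropped, and the enclosing cutList entry is emitted once,
     * at the position of the fromList entry that shares its start offset.
     */
    static List<StartEnd> cutRange(List<StartEnd> fromList, List<StartEnd> cutList) {
        List<StartEnd> result = new ArrayList<>();
        for (StartEnd from : fromList) {
            boolean keep = true;
            for (StartEnd cut : cutList) {
                if (from.start >= cut.start && from.end <= cut.end) {
                    keep = false;
                    if (from.start == cut.start) {
                        result.add(cut); // e.g. the whole table replaces its first paragraph
                    }
                }
            }
            if (keep) {
                result.add(from);
            }
        }
        return result;
    }
}
```

For paragraphs (0,5), (5,8), (8,12), (12,20) and one table (5,12), the two paragraphs inside the table collapse into the table range, giving (0,5), (5,12), (12,20).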

#28


Showing off without sharing anything, and mocking other people on top of it.

#29


OP, how exactly did you plan to read the formulas out of the Word document?