How to extract the link URL from an Excel cell

Posted: 2022-01-15 21:19:34

I have a C# WebJob that downloads and then reads an Excel file. One of the columns contains links that I'd like to save in my database. I'm currently using ExcelDataReader to convert the Excel file to a DataSet and then looping through the rows to grab the data. After conversion, the column in question is just a string containing the link text, not the URL.

From other reading, it sounds like Excel stores hyperlinks separately from the cell value, so that information isn't preserved when the Excel file is converted to a DataSet.

I'm not set on using ExcelDataReader, but I'd like to find a way to extract these link URLs without having to pay for third-party software.

Here is the simple code I have so far, for reference:

// Open the workbook and load every sheet into a DataSet.
FileStream stream = File.Open(fileLocation, FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
excelReader.IsFirstRowAsColumnNames = true;

DataSet result = excelReader.AsDataSet();

int count = 0;

// Tables[...] already returns a DataTable, so Rows is read from it directly.
foreach (DataRow row in result.Tables["WorkSheetName"].Rows)
{
    var item = new myObject();

    item.Prop1 = long.Parse(row["Column3"].ToString());
    item.Prop2 = row["Column7"].ToString(); // The link; currently only the link text comes through

    this.myDbContext.myTable.Add(item);
    await this.myDbContext.SaveChangesAsync();

    count += 1;
}

1 Answer

#1


I was eventually able to get the hyperlink data by reading the Excel file with EPPlus.

Code:

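// Requires the EPPlus package (OfficeOpenXml) plus System.Data and System.Linq.
const int linkColumnIndex = 7; // zero-based index of the column that holds the hyperlinks

using (var pck = new ExcelPackage(excelFileStream))
{
    ExcelWorksheet ws = pck.Workbook.Worksheets.First();

    DataTable dt = new DataTable(ws.Name);
    int totalCols = ws.Dimension.End.Column;
    int totalRows = ws.Dimension.End.Row;
    int startRow = 3; // data starts on row 3; row 2 holds the column headers

    foreach (var headerCell in ws.Cells[2, 1, 2, totalCols])
    {
        dt.Columns.Add(headerCell.Text);
    }

    for (int rowNum = startRow; rowNum <= totalRows; rowNum++)
    {
        DataRow dr = dt.NewRow();
        foreach (var cell in ws.Cells[rowNum, 1, rowNum, totalCols])
        {
            // Index by the cell's own column: enumerating a range can skip empty
            // cells, so a running counter may drift away from the real column.
            int colIndex = cell.Start.Column - 1;
            if (colIndex == linkColumnIndex)
            {
                if (cell.Hyperlink != null)
                {
                    // AbsoluteUri throws on relative links, so fall back to the raw string.
                    dr[colIndex] = cell.Hyperlink.IsAbsoluteUri
                        ? cell.Hyperlink.AbsoluteUri
                        : cell.Hyperlink.OriginalString;
                }
            }
            else
            {
                dr[colIndex] = cell.Text;
            }
        }

        // Keep only rows that actually contain a link.
        if (!String.IsNullOrEmpty(dr[linkColumnIndex].ToString()))
        {
            dt.Rows.Add(dr);
        }
    }

    return dt;
}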
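
If you would rather avoid an extra library, the point from the question about hyperlinks being "stored elsewhere" can be used directly. An .xlsx file is a zip archive: the worksheet XML lists each link as a `hyperlink` element with a cell reference and a relationship id, and the actual URL sits in that sheet's `.rels` part. The following is only a rough sketch under some assumptions: `XlsxHyperlinks`, `Extract` and `FromFile` are names made up for this example, and the default part name `sheet1.xml` is a guess that may not match your workbook.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of ExcelDataReader or EPPlus.
public static class XlsxHyperlinks
{
    static readonly XNamespace Main =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static readonly XNamespace DocRel =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace PkgRel =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Maps cell references (e.g. "H3") to link targets, given the raw XML of a
    // worksheet part and of its matching .rels part.
    public static Dictionary<string, string> Extract(string sheetXml, string relsXml)
    {
        // Relationship id -> target URL.
        var targets = XDocument.Parse(relsXml)
            .Descendants(PkgRel + "Relationship")
            .Where(r => r.Attribute("Id") != null && r.Attribute("Target") != null)
            .ToDictionary(r => (string)r.Attribute("Id"),
                          r => (string)r.Attribute("Target"));

        var result = new Dictionary<string, string>();
        foreach (var link in XDocument.Parse(sheetXml).Descendants(Main + "hyperlink"))
        {
            var cellRef = (string)link.Attribute("ref");
            var relId = (string)link.Attribute(DocRel + "id");
            // Links without a relationship id (in-workbook links) are skipped.
            if (cellRef != null && relId != null
                && targets.TryGetValue(relId, out var url))
            {
                result[cellRef] = url;
            }
        }
        return result;
    }

    // Reads both parts out of the .xlsx archive. The part name is an assumption;
    // check xl/workbook.xml if your sheet is not stored as sheet1.xml.
    public static Dictionary<string, string> FromFile(string xlsxPath,
                                                      string sheetPart = "sheet1.xml")
    {
        using (var zip = new ZipArchive(File.OpenRead(xlsxPath), ZipArchiveMode.Read))
        {
            var sheet = zip.GetEntry("xl/worksheets/" + sheetPart);
            var rels = zip.GetEntry("xl/worksheets/_rels/" + sheetPart + ".rels");
            if (sheet == null || rels == null)
            {
                return new Dictionary<string, string>(); // no hyperlinks recorded
            }
            return Extract(ReadAll(sheet), ReadAll(rels));
        }
    }

    static string ReadAll(ZipArchiveEntry entry)
    {
        using (var reader = new StreamReader(entry.Open()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `XlsxHyperlinks.FromFile(fileLocation)` would give a dictionary keyed by cell reference, which you could join back to the rows you already read with ExcelDataReader. Two caveats: a `ref` can be a range such as "H3:H5" rather than a single cell, and links produced by a `HYPERLINK()` formula live in the cell formula, so this sketch will not find them.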
