Writing a large data frame containing UTF-8 characters to xlsx with the openxlsx package

Time: 2021-04-05 23:51:19

I am trying to write a fairly large data frame (more than 200 columns and more than 6000 rows) that includes Hebrew characters to Excel using the openxlsx package through Shiny.
For some reason, when I download the file I get a corrupt Excel file. When I try to open it, I get this message:

excel found unreadable content do you want to recover the contents of this workbook


and then:


excel was able to open the file by repairing or removing the unreadable content


Once I open the file, all the Hebrew characters are gone!
Trying to reproduce this issue, I found that if I write a smaller data frame, for instance [1:100, 1:100], it works and the Hebrew is there, but once I make the data frame larger it does not work.
Here is a link to the file I am using in the test code below, and here is the code I am using:

server.R

library(shiny)
library(openxlsx)

shinyServer(function(input, output) {
  # Read the test file and keep a subset of it
  datasetInput <- reactive({
    file_1 <- read.csv("../file1.txt", header = TRUE, stringsAsFactors = FALSE)
    file1 <- file_1[1:200, 1:200]  # with [1:100, 1:100] here it works fine!
    return(file1)
  })

  output$table <- renderTable({
    datasetInput()
  })

  output$downloadData <- downloadHandler(
    filename = function() { paste("download", "xlsx", sep = ".") },
    content = function(file) {
      # Build the workbook and save it to the path Shiny provides
      wb <- createWorkbook()
      print(class(datasetInput()))
      addWorksheet(wb = wb, sheetName = "Sheet 1", gridLines = FALSE)
      writeDataTable(wb = wb, sheet = 1, x = datasetInput())
      saveWorkbook(wb, file, overwrite = TRUE)
    }
  )
})

ui.R


shinyUI(pageWithSidebar(
  headerPanel('Download Example'),
  sidebarPanel(
    downloadButton('downloadData', 'Download')
  ),
  mainPanel(
    tableOutput('table')
  )
))

1 answer

#1



Thanks to the creator of this package, Alexander Walker, the issue was solved.
It appears the issue is due to an escape character, "\b" (backspace), in one of the strings:

> x <- read.csv("file1.txt")
> x[150,44]
[1] ÷øéîéðåìåâéä áäúîçåú áîãò ôåøðæé - îåñîê    \b
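
To locate affected cells in a larger file without inspecting them one by one, a scan along these lines may help. It is only a sketch: `find_control_chars` is a hypothetical helper name, and the pattern assumes the offending characters are the control characters that XML 1.0 disallows (xlsx is an XML-based format), of which "\b" is one.

```r
# Hypothetical helper: return the (row, col) positions of cells in a data
# frame whose text contains a control character not permitted by XML 1.0.
# Tab (\x09), newline (\x0A) and carriage return (\x0D) are allowed and skipped.
find_control_chars <- function(df) {
  pattern <- "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F]"
  hits <- vapply(df, function(col) {
    if (is.character(col)) {
      grepl(pattern, col, perl = TRUE)
    } else {
      rep(FALSE, length(col))  # non-character columns are never flagged
    }
  }, logical(nrow(df)))
  # Force a matrix shape so single-row data frames also work
  which(matrix(hits, nrow = nrow(df)), arr.ind = TRUE)
}

# Small made-up example: only the second row of column 1 contains "\b"
demo <- data.frame(a = c("ok", "bad\b"), b = c(1, 2), stringsAsFactors = FALSE)
find_control_chars(demo)
```

Each row of the result gives the row and column index of one affected cell.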

The fix is:


x <- read.csv("file1.txt", stringsAsFactors = FALSE)
wb <- createWorkbook()
addWorksheet(wb, "Sheet 1")

is_character_col <- which(sapply(x, class) %in% "character")
for (i in is_character_col) {
  x[[i]] <- gsub("\b", "", x[[i]], fixed = TRUE)
}

writeDataTable(wb, 1, x)
saveWorkbook(wb, "hopefully_fixed.xlsx")
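
The fix above targets "\b" specifically. If other control characters might be present in the data, a more general cleanup could look like the sketch below. `strip_control_chars` is a hypothetical helper name, not part of openxlsx, and it assumes that any control character disallowed by XML 1.0 could cause the same corruption. It uses base R only and would be applied to the data frame before calling writeDataTable.

```r
# Hypothetical helper: remove control characters that XML 1.0 does not allow
# from every character column of a data frame. Tab, newline and carriage
# return are kept because XML 1.0 permits them.
strip_control_chars <- function(df) {
  pattern <- "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F]"
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], gsub,
                       pattern = pattern, replacement = "", perl = TRUE)
  df
}

# Small made-up example
demo <- data.frame(a = c("x\by", "tab\there"), n = 1:2, stringsAsFactors = FALSE)
cleaned <- strip_control_chars(demo)
```

Non-character columns are returned unchanged, so the helper can be applied to a whole data frame in one call.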
