I have a column containing 1200-character strings. In each one, every four-character group is a number in hexadecimal, i.e. 300 hexadecimal numbers crammed into a 1200-character string, in every row. I need to convert each number to decimal and put it in its own column (300 new columns), named 1-300. Here's what one of the strings looks like:


                 [1]  0043003E803C0041004A...(etc...)

Here's what I've done so far:


decimal.fours <- function(x) {
    strtoi(substring(BigString[x], seq(1,1197,4), seq(4,1200,4)), 16L)
}
decimal.fours(1)
[1] 283   291   239   177 ...

But now I'm stuck. How can I output these individual numbers (and the remaining 296) into new columns? I have fifty rows/strings in total. It would be great to do them all at once, i.e. 300 new columns containing the split-up substrings from all 50 strings.


Obligatory tidyverse example:



Set up some data



library(tidyverse)

bet <- c(0:9, LETTERS[1:6]) # alphabet for hex digit sequences
i <- 8                      # number of rows
n <- 10                     # number of 4-hex-digit sequences

df <- tibble(
   some_other_col=LETTERS[1:i],
   big_str=map_chr(1:i, ~sample(bet, 4*n, replace=TRUE) %>% paste0(collapse=""))
)
df

## # A tibble: 8 × 2
##   some_other_col                                  big_str
##            <chr>                                    <chr>
## 1              A 432100D86CAA388C15AEA6291E985F2FD3FB6104
## 2              B BC2673D112925EBBB3FD175837AF7176C39B4888
## 3              C B4E99FDAABA47515EADA786715E811EE0502ABE8
## 4              D 64E622D7037D35DE6ADC40D0380E1DC12D753CBC
## 5              E CF7CDD7BBC610443A8D8FCFD896CA9730673B181
## 6              F ED86AEE8A7B65F843200B823CFBD17E9F3CA4EEF
## 7              G 2B9BCB73941228C501F937DA8E6EF033B5DD31F6
## 8              H 40823BBBFDF9B14839B7A95B6E317EBA9B016ED5

Do the manipulation


read_fwf(paste0(df$big_str, collapse="\n"),
         fwf_widths(rep(4, n)),
         col_types=paste0(rep("c", n), collapse="")) %>%
  mutate_all(strtoi, base=16) %>%
  bind_cols(df) %>%
  select(some_other_col, everything(), -big_str)
## # A tibble: 8 × 11
##   some_other_col    X1    X2    X3    X4    X5    X6    X7    X8    X9
##            <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int>
## 1              A 17185   216 27818 14476  5550 42537  7832 24367 54267
## 2              B 48166 29649  4754 24251 46077  5976 14255 29046 50075
## 3              C 46313 40922 43940 29973 60122 30823  5608  4590  1282
## 4              D 25830  8919   893 13790 27356 16592 14350  7617 11637
## 5              E 53116 56699 48225  1091 43224 64765 35180 43379  1651
## 6              F 60806 44776 42934 24452 12800 47139 53181  6121 62410
## 7              G 11163 52083 37906 10437   505 14298 36462 61491 46557
## 8              H 16514 15291 65017 45384 14775 43355 28209 32442 39681
## # ... with 1 more variables: X10 <int>



You can use read.fwf, which reads data where every column has a fixed width:


# an example vector of big strings
BigString = c("0043003E803C0041004A", "0043003E803C0041004A", "0043003E803C0041004A")

n = 5                  # number of columns in the result (300 in your real case)
as.data.frame(
      lapply(read.fwf(file = textConnection(BigString), 
                      widths = rep(4, n), 
                      colClasses = "character"), 
             strtoi, base = 16))

#  V1 V2    V3 V4 V5
#1 67 62 32828 65 74
#2 67 62 32828 65 74
#3 67 62 32828 65 74

If you'd like to keep your decimal.fours function, you can modify it as follows so that it takes a string rather than an index. Then call lapply to convert BigString into a list of integer vectors, which can be combined row by row with the do.call(rbind, ...) pattern and, if needed, converted to a data.frame:


decimal.fours <- function(x) {
    # x is a single 1200-character string; split it into 300 four-character groups
    strtoi(substring(x, seq(1,1197,4), seq(4,1200,4)), 16L)
}

do.call(rbind, lapply(BigString, decimal.fours))
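The hard-coded 1197/1200 bounds only suit strings of exactly 1200 characters. As a rough sketch of the same lapply plus do.call(rbind, ...) idea, the bounds can be derived from nchar() instead. The helper name hex_groups_to_int and its width argument are made up for this illustration, and the two sample strings are invented data:

```r
# Hypothetical helper (not from the original answer): split ONE string into
# fixed-width hex groups and convert each group to an integer.
hex_groups_to_int <- function(x, width = 4L) {
  # Assumes a single, non-empty string whose length is a multiple of `width`.
  stopifnot(length(x) == 1L, nchar(x) > 0L, nchar(x) %% width == 0L)
  starts <- seq(1L, nchar(x), by = width)
  strtoi(substring(x, starts, starts + width - 1L), base = 16L)
}

# Made-up sample data: two strings of five 4-digit hex groups each.
BigString <- c("0043003E803C0041004A", "FFFF00000001000A0010")

# One row per string, one column per hex group.
m <- do.call(rbind, lapply(BigString, hex_groups_to_int))
m
```

With real 1200-character strings the same call should give a matrix with 300 columns, one row per string.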



Just an attempt using base R:


BigString = c("0043003E803C0041004A", "0043003E803C0041004A", "0043003E803C0041004A")
df = data.frame(BigString)

t(sapply(df$BigString, function(x) strtoi(substring(x, seq(1, 1197, 4)[1:5],
                                                    seq(4, 1200, 4)[1:5]), base = 16),
         USE.NAMES = FALSE))
#     [,1] [,2]  [,3] [,4] [,5]
#[1,]   67   62 32828   65   74
#[2,]   67   62 32828   65   74
#[3,]   67   62 32828   65   74

# You can name the columns at the end, e.g. with `paste0("new_col", 1:300)`.
# [1:5] was used only for this example, because the strings here are just 20 characters long.
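To get the numbers back into the original data frame as columns named 1 to n (as the question asks), something along these lines might work. It is only a sketch: the id column and the sample strings are invented for illustration, and n would be 300 for the real data:

```r
# Made-up example data: an id column plus the big hex strings.
df <- data.frame(id = c("a", "b"),
                 BigString = c("0043003E803C0041004A", "FFFF00000001000A0010"),
                 stringsAsFactors = FALSE)

n <- 5                                    # 300 for the real 1200-character strings
starts <- seq(1, by = 4, length.out = n)  # 1, 5, 9, ...

# vapply returns an n x nrow(df) integer matrix; transpose to one row per string.
m <- t(vapply(df$BigString,
              function(s) strtoi(substring(s, starts, starts + 3), base = 16L),
              integer(n), USE.NAMES = FALSE))

# Bind the new columns next to the other columns and name them "1" ... "n".
out <- cbind(df["id"], as.data.frame(m))
names(out)[-1] <- as.character(seq_len(n))
out
```

Because the names are plain numbers, the columns have to be accessed as out[["1"]] or with backticks (out$`1`); a prefix such as paste0("new_col", 1:n) avoids that.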



