I am trying to scale the values in a matrix so that each column adds up to one. I have tried:
m = matrix(1:9, nrow=3, ncol=3, byrow=TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
colSums(m)
[1] 12 15 18
m = m/colSums(m)
           [,1]      [,2] [,3]
[1,] 0.08333333 0.1666667 0.25
[2,] 0.26666667 0.3333333 0.40
[3,] 0.38888889 0.4444444 0.50
colSums(m)
[1] 0.7388889 0.9444444 1.1500000
So obviously this doesn't work. I then tried this:
m = matrix(1:9, nrow=3, ncol=3, byrow=TRUE)  # start again from the original matrix
m = m/matrix(rep(colSums(m),3), nrow=3, ncol=3, byrow=TRUE)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
colSums(m)
[1] 1 1 1
So this works, but it feels like I'm missing something here. This can't be how it is routinely done. I'm certain I am being stupid here. Any help you can give would be appreciated.
Cheers, Davy
2 Answers
#1
See ?sweep, e.g.:
> sweep(m,2,colSums(m),`/`)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
Or you can transpose the matrix, so that colSums(m) gets recycled correctly. Don't forget to transpose back afterwards, like this:
> t(t(m)/colSums(m))
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
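The transpose is needed because R stores a matrix column by column and recycles a shorter vector in that same order. For the square matrix in the question, m/colSums(m) therefore divides element [i, j] by the i-th column sum rather than the j-th. A small sketch of this, using the 3x3 matrix from the question (the names cs, wrong and right are mine):

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)
cs <- colSums(m)   # column sums of m

# Recycling runs down each column, so for this square matrix
# element [i, j] ends up divided by cs[i], not cs[j]
wrong <- m / cs
stopifnot(isTRUE(all.equal(wrong[2, 1], m[2, 1] / cs[2])))

# After transposing, the columns of m become rows, so cs[j] lines up
# with column j; the outer t() restores the original orientation
right <- t(t(m) / cs)
stopifnot(isTRUE(all.equal(colSums(right), c(1, 1, 1))))
```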
Or you can use the function prop.table() to do basically the same:
> prop.table(m,2)
           [,1]      [,2]      [,3]
[1,] 0.08333333 0.1333333 0.1666667
[2,] 0.33333333 0.3333333 0.3333333
[3,] 0.58333333 0.5333333 0.5000000
The time differences are rather small. The sweep() function and the t() trick are the most flexible solutions; prop.table() is only for this particular case.
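As an illustration of that flexibility, sweep() accepts any margin and any binary function. The two examples below are my own, not from the original answer: one scales rows instead of columns, the other centers each column by subtracting its mean.

```r
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# MARGIN = 1 sweeps across rows: each row now sums to one
row_scaled <- sweep(m, 1, rowSums(m), `/`)
stopifnot(isTRUE(all.equal(rowSums(row_scaled), c(1, 1, 1))))

# A different function: subtract the column means to center each column
centered <- sweep(m, 2, colMeans(m), `-`)
stopifnot(isTRUE(all.equal(colMeans(centered), c(0, 0, 0))))
```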
#2
Per usual, Joris has a great answer. Two others that came to mind:
#Essentially your answer
f1 <- function() m / rep(colSums(m), each = nrow(m))
#Two calls to transpose
f2 <- function() t(t(m) / colSums(m))
#Joris
f3 <- function() sweep(m,2,colSums(m),`/`)
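Before comparing timings, it may be worth confirming that the three functions return the same matrix. A quick check, assuming a small random matrix m in the workspace (the size, the seed and the use of runif are arbitrary choices of mine):

```r
set.seed(1)
m <- matrix(runif(12), nrow = 3)   # 3 x 4, non-square on purpose

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# All three should agree, and every column should sum to one
stopifnot(isTRUE(all.equal(f1(), f2())),
          isTRUE(all.equal(f2(), f3())),
          isTRUE(all.equal(colSums(f1()), rep(1, ncol(m)))))
```

A non-square matrix is used here because mistakes in recycling order tend to go unnoticed when the number of rows equals the number of columns.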
Joris' answer is the fastest on my machine:
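> m <- matrix(rnorm(1e7), ncol = 10000)
> library(rbenchmark)
> # Caveat: f1, f2, f3 are passed without parentheses, so benchmark() evaluates the function objects rather than calling them
> benchmark(f1,f2,f3, replications=1e5, order = "relative")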
  test replications elapsed relative user.self sys.self user.child sys.child
3   f3       100000   0.386   1.0000     0.385    0.001          0         0
1   f1       100000   0.421   1.0907     0.382    0.002          0         0
2   f2       100000   0.465   1.2047     0.386    0.003          0         0
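To time the function calls themselves (note the parentheses, f1() rather than f1), base R's system.time() is enough. This is a sketch only, assuming a smaller matrix and far fewer repetitions than above so that it finishes quickly. The helper time_it is my own name, and the timings will differ from machine to machine.

```r
set.seed(1)
m <- matrix(rnorm(1e6), ncol = 1000)   # smaller than above so it runs quickly

f1 <- function() m / rep(colSums(m), each = nrow(m))
f2 <- function() t(t(m) / colSums(m))
f3 <- function() sweep(m, 2, colSums(m), `/`)

# Hypothetical helper: elapsed seconds for `reps` calls of f
time_it <- function(f, reps = 20) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

timings <- c(f1 = time_it(f1), f2 = time_it(f2), f3 = time_it(f3))
print(sort(timings))
```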