I have a data set, types, which is 100 rows x 61 columns. The first column indicates the Type_ID and the remaining 60 columns relate to time periods.
In another data step, calculate, I want to create a multi-dimensional array timeType{100,60} that is initialized by the 100 rows and 60 columns of data from types.
All the SAS documentation I've read so far initialises multi-dimensional arrays using explicit column references, and the array then forces the columns to fit the array dimensions.
How can I effectively read a dataset into the array?
3 solutions
#1
2
Very similar to Quentin's answer. I used the construct made famous by Dorfman, do _n_ = 1 by 1 until(end), for the reading loop.
* build a 100 x 60 test data set whose cells hold a running counter;
data types;
  do type_id = 1 to 100;
    array x x1-x60;
    do over x; global_counter+1; x = global_counter; end;
    output;
  end;
run;
data calculate(keep=testid result) typesread_check(keep=x1-x60);
  array matrix(100,60) _temporary_;
  * load matrix;
  do _n_ = 1 by 1 until (end);
    set types end=end;
    array x x1-x60; * this array is composed of variables that are filled by the SET operation;
    do _i_ = 1 to dim(x);
      matrix(_n_,_i_) = x(_i_);
    end;
  end;
  * unload matrix to check load with COMPARE;
  * can be commented out / removed after confidence established;
  do _n_ = 1 to 100;
    do _i_ = 1 to 60;
      x(_i_) = matrix(_n_,_i_);
    end;
    output typesread_check;
  end;
  * perform computations that output;
  testid = 1;
  result = sum(of matrix(*));
  output calculate;
run;
* output is zero rows of differences. That means the matrix was populated correctly;
proc compare noprint data=types(drop=type_id) compare=typesread_check out=diff outnoequal;
run;
If you want to keep the calculation results along with the type data, remove the _temporary_ option and use (keep=testid result matrix:) on the output data set to get output that has 6,000 additional columns corresponding to matrix(100,60).
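As a rough sketch of that variant (an illustration, not code taken from the answer): without _temporary_, an array declared with no element list makes SAS create ordinary variables, which should be named matrix1-matrix6000, so the matrix: prefix list in KEEP= picks them up. It assumes the types data set built above.

```sas
data calculate(keep=testid result matrix:);
  /* no _temporary_: the array elements become data set variables */
  array matrix(100,60);
  /* same loading loop as before */
  do _n_ = 1 by 1 until (end);
    set types end=end;
    array x x1-x60;
    do _i_ = 1 to dim(x);
      matrix(_n_,_i_) = x(_i_);
    end;
  end;
  testid = 1;
  result = sum(of matrix(*));
  output;
run;
```

The output is a single row holding testid, result and the 6,000 matrix columns.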
#2
1
I'm not big on arrays, and the code below won't win any style points, but it's a start.
The basic approach is to use a DoW loop to read all of the data into the two-dimensional array.
data have;
  do TypeID=1 to 5;
    p1=10*TypeID;
    p2=100*TypeID;
    p3=1000*TypeID;
    output;
  end;
run;
data _null_;
  *DOW loop to read data into array;
  do until (last);
    set have end=last;
    array timeType(5,3) _temporary_;
    array p{*} p:;
    row+1; *sum statement: row is retained and starts at 0;
    do col=1 to dim(p);
      timeType{row,col}=p{col};
    end;
  end;
  *PUT the values of the array to the log, for checking;
  do i=1 to dim1(timeType);
    do j=1 to dim2(timeType);
      put timeType{i,j}=;
    end;
  end;
  drop row col i j;
run;
#3
0
Use a HASH object instead. First, either transpose your rates into a normal (long) dataset instead of a "matrix", or just create it that way originally.
data rates;
  do type=1 to 3 ;
    do time=1 to 3 ;
      input rate @@ ;
      output;
    end;
  end;
cards;
.1 .2 .3 .4 .5 .6 .7 .8 .9
;
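If the rates start out in the wide layout from the question, a data step with an OUTPUT inside a loop is one way to reshape them into this long form. The sketch below assumes the wide data set is the types test data from answer #1 (type_id plus x1-x60); the name rates_long is made up for illustration.

```sas
data rates_long(keep=type time rate);
  set types;
  array x{*} x1-x60;
  type = type_id;
  /* write one output row per time period */
  do time = 1 to dim(x);
    rate = x{time};
    output;
  end;
run;
```

With 100 rows and 60 period columns this yields 6,000 rows, one per type and time combination.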
Now let's make some example data. I included one record that will not match any record in the rate table.
data have ;
  input type time amount expected;
cards;
1 2 100 20
2 3 100 60
4 5 100 .
;
Now load the rates into a HASH object and use the .FIND() method to locate the rate for the current TYPE and TIME combination.
data want ;
  set have ;
  if _N_ = 1 then do;
    declare hash h(dataset:"rates");
    rc = h.defineKey('type','time');
    rc = h.defineData('rate');
    rc = h.defineDone();
    call missing(rate);
    drop rc;
  end;
  if (0=h.find()) then payment=amount*rate;
run;
Results.
Obs    type    time    amount    expected    payment    rate
  1       1       2       100          20         20     0.2
  2       2       3       100          60         60     0.6
  3       4       5       100           .          .       .
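For comparison with the array approach the question asks about, the same lookup could be sketched with a temporary array loaded from rates. This is only an illustration, under the assumption that type and time are small positive integers that can serve directly as subscripts; the names want_array and r are made up here.

```sas
data want_array;
  array r(3,3) _temporary_;
  /* load the long rates table into the array once */
  if _n_ = 1 then do until (done);
    set rates end=done;
    r(type, time) = rate;
  end;
  set have;
  /* only look up combinations that fall inside the array bounds */
  if 1 <= type <= dim1(r) and 1 <= time <= dim2(r)
    then payment = amount * r(type, time);
  drop rate;
run;
```

Unlike the hash object, the array needs an explicit bounds check, because a subscript such as type=4 would otherwise raise an out-of-range error.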