Initialize a SAS multi-dimensional array from another data set

Time: 2022-05-28 18:43:48

I have a data set types which is 100 rows x 61 columns. The first column indicates the Type_ID and the remaining 60 columns relate to time periods.

In another data step, calculate, I want to create a multi-dimensional array timeType{100,60} that is initialized from the 100 rows and 60 columns of data in types.

All the SAS documentation I've read so far initialises multi-dimensional arrays using explicit column references, and the array then forces the columns to fit the array dimensions.

How can I effectively read a dataset into the array?

3 solutions

#1


2  

Very similar to Quentin's answer. For the reading loop I used the construct made famous by Dorfman: do _n_ = 1 by 1 until (end).

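data types;
  do type_id = 1 to 100;
    array x{*} x1-x60;
    * explicit loop instead of the obsolete DO OVER;
    do _i_ = 1 to dim(x);
      global_counter + 1;        * sum statement: running counter 1..6000;
      x{_i_} = global_counter;
    end;
    output;
  end;
  drop _i_ global_counter;       * keep only type_id and x1-x60;
run;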

data calculate(keep=testid result) typesread_check(keep=x1-x60);

  array matrix(100,60) _temporary_;

  * load matrix;
  do _n_ = 1 by 1 until (end);
    set types end=end;
    array x x1-x60;    * this array is composed of variables that are filled by the SET operation;
    do _i_ = 1 to dim(x);
      matrix(_n_,_i_) = x(_i_);
    end;
  end;

  * unload matrix to check load with COMPARE;
  * can be commented out / removed after confidence established;
  do _n_ = 1 to 100;
    do _i_ = 1 to 60;
      x(_i_) = matrix(_n_,_i_);
    end;
    output typesread_check;
  end;

  * perform computations that output;
  testid = 1;
  result = sum(of matrix(*));

  output calculate;
run;

* output is zero rows of differences. That means the matrix was populated correctly;
proc compare noprint data=types(drop=type_id) compare=typesread_check out=diff outnoequal;
run;

If you want to keep the calculation results alongside the type data, remove the _temporary_ option and use keep=testid result matrix: on the output data set. The output then has 6,000 additional columns (matrix1-matrix6000) corresponding to matrix(100,60).
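As a rough sketch of that variation (not tested against the asker's real data), the step below drops _temporary_ so that the array elements become ordinary variables. It assumes the TYPES data set built above; the output name calculate_wide is made up for illustration.

```sas
* Sketch only: assumes the TYPES data set created above (type_id, x1-x60);
* CALCULATE_WIDE is a hypothetical output name;
data calculate_wide(keep=testid result matrix:);

  * no _temporary_ option, so SAS creates variables matrix1-matrix6000;
  array matrix(100,60);

  * load matrix in a single pass of the DOW loop;
  do _n_ = 1 by 1 until (end);
    set types end=end;
    array x x1-x60;
    do _i_ = 1 to dim(x);
      matrix(_n_,_i_) = x(_i_);
    end;
  end;

  testid = 1;
  result = sum(of matrix(*));
  output;
run;
```

SAS stores a two-dimensional array in row-major order, so matrix(1,60) is the variable matrix60 and matrix(2,1) is matrix61.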

#2


1  

I'm not big on arrays, and below won't win any style points, but it's a start.

Basic approach is to use a DoW-loop to read all of the data into the two-dimensional array.

data have;
  do TypeID=1 to 5;
    p1=10*TypeID;
    p2=100*TypeID;
    p3=1000*TypeID;
    output;
  end;
run;

data _null_;
  *DOW loop to read data into array;
  do until (last);
    set have end=last;

    array timeType(5,3) _temporary_;
    array p{*} p:;

    row + 1;   * sum statement: increments row for each observation read;
    do col=1 to dim(p);
      timeType{row,col}=p{col};
    end;
  end;

  *PUT the values of the array to the log, for checking;
  do i=1 to dim1(timeType);
    do j=1 to dim2(timeType);
      put timeType{i,j}=;
    end;
  end;

  drop row col i j;
run;
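The question asks for the array to be available inside a separate calculate step. One way this might look, reusing the DOW-loop load above, is to fill the array once when _n_ = 1 and then read the data to be processed with a second SET statement. The data set txns and its variables period and amount are hypothetical, and the sketch assumes the rows of have are stored in TypeID order so that the row number equals TypeID.

```sas
* Hypothetical data to process: TXNS, PERIOD and AMOUNT are made-up names;
data txns;
  input TypeID period amount;
cards;
1 2 5
3 1 2
5 3 1
;

data calculate;
  array timeType(5,3) _temporary_;   * temporary arrays are retained automatically;

  * load the lookup array once, on the first iteration of the step;
  * assumes rows of HAVE are in TypeID order (row number = TypeID);
  if _n_ = 1 then do row = 1 by 1 until (last);
    set have(keep=p1-p3) end=last;
    array p{*} p1-p3;
    do col = 1 to dim(p);
      timeType{row,col} = p{col};
    end;
  end;

  * then read the rows to be processed, one per iteration;
  set txns;
  result = amount * timeType{TypeID, period};

  drop row col p1-p3;
run;
```

Because the load runs only on the first iteration, the cost of reading have is paid once no matter how many rows txns has. If the rows of have were not in TypeID order, indexing by TypeID from the lookup row itself (timeType{TypeID,col}) would be the safer choice.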

#3


0  

Use a HASH object instead. First, either transpose your rates from the "matrix" layout into a normal (long) dataset with one row per type and time, or just create them that way originally.

data rates;
  do type=1 to 3 ;
    do time=1 to 3 ;
      input rate @@ ;
      output;
    end;
  end;
cards;
.1 .2 .3  .4 .5 .6 .7 .8 .9
;
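If the rates already exist in the wide layout from the question, the transpose step could be sketched as below. The variable names Type_ID and x1-x60 are assumptions (the question only says there are 60 time-period columns), and rates_long is a made-up output name.

```sas
* Sketch only: assumes a wide data set TYPES with Type_ID and x1-x60;
* RATES_LONG is a hypothetical output name;
data rates_long(keep=type time rate);
  set types;
  array x{*} x1-x60;
  type = Type_ID;
  * write one output row per time period;
  do time = 1 to dim(x);
    rate = x{time};
    output;
  end;
run;
```

With 100 input rows and 60 period columns this produces 6,000 rows keyed by type and time, which is the shape the hash lookup expects.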

Now let's make some example data. I included one record that will not match any record in the rate table.

data have ;
  input type time amount expected;
cards;
1 2 100 20
2 3 100 60
4 5 100  .
;

Now load the rates into a HASH object and use the .FIND() method to locate the rate for the current TYPE and TIME combination.

data want ;
  set have ;
  if _N_ = 1 then do;
   declare hash h(dataset:"rates");
   rc = h.defineKey('type','time');
   rc = h.defineData('rate');
   rc = h.defineDone();
   call missing(rate);
   drop rc;
  end;
  if (0=h.find()) then payment=amount*rate;
run;

Results.

Obs    type    time    amount    expected    payment    rate

 1       1       2       100        20          20       0.2
 2       2       3       100        60          60       0.6
 3       4       5       100         .           .        .
