Creating a SELECT statement using a custom function

Time: 2021-09-10 22:29:55

UPDATE

Given this new approach using INTNX I think I can just use a loop to simplify things even more. What if I made an array:

data;
    array period [4] $ var1-var4 ('day' 'week' 'month' 'year');
run;

And then tried to make a loop for each element:

%MACRO sqlloop;
  proc sql;
    %DO k = 1 %TO dim(period);  /* in case I decide to drop something from the array later */
      %LET bucket = &period(k);
      CREATE TABLE output.t_&bucket AS (
        SELECT INTNX( "&bucket.", date_field, 0, 'E') AS test FROM table);
    %END;
  quit;
%MEND;
%sqlloop

This doesn't quite work, but it captures the idea I want. It could just run the query for each of those values in INTNX. Does that make sense?


I have a couple of prior questions that I'm merging into one. I got some really helpful advice on the others and hopefully this can tie it together.

I have the following function that creates a dynamic string to populate a SELECT statement in a SAS proc sql; code block:

proc fcmp outlib = output.funcs.test;
    function sqlSelectByDateRange(interval $, date_field $) $;
        day = date_field||" AS day, ";
        week = "WEEK("||date_field||") AS week, ";
        month = "MONTH("||date_field||") AS month, ";
        year = "YEAR("||date_field||") AS year, ";

        IF interval = "week" THEN
            do;
                day = '';
            end;
        IF interval = "month" THEN
            do;
                day = '';
                week = '';
            end;
        IF interval = "year" THEN
            do;
                day = '';
                week = '';
                month = '';
            end;
        where_string = day||week||month||year;
    return(where_string);
    endsub;
quit;

I've verified that this creates the kind of string I want:

data _null_;
    q = sqlSelectByDateRange('month', 'myDateColumn');
    put q =;
run;

This yields:

q=MONTH(myDateColumn) AS month, YEAR(myDateColumn) AS year,

This is exactly what I want the SQL string to be. From prior questions, I believe I need to call this function in a MACRO. Then I want something like this:

%MACRO sqlSelectByDateRange(interval, date_field);
  /* Code I can't figure out */
%MEND

PROC SQL;
  CREATE TABLE output.t AS (
    SELECT 
      %sqlSelectByDateRange('month', 'myDateColumn')
    FROM
      output.myTable
  );
QUIT;

I am having trouble understanding how to make the code call this macro and interpret as part of the SQL SELECT string. I've tried some of the previous examples in other answers but I just can't make it work. I'm hoping this more specific question can help me fill in this missing step so I can learn how to do it in the future.

3 Solutions

#1


3  

Two things:

First, you should be able to use %SYSFUNC to call your custom function.

%MACRO sqlSelectByDateRange(interval, date_field);
    %SYSFUNC( sqlSelectByDateRange(&interval., &date_field.) )
%MEND;

Note that you should not use quotation marks when calling a function via SYSFUNC. Also, you cannot use SYSFUNC with FCMP functions until SAS 9.2. If you are using an earlier version, this will not work.
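
One step that is easy to miss: the compiled function has to be on the CMPLIB search path before %SYSFUNC can find it. The following is a minimal sketch, assuming the function was stored with OUTLIB=output.funcs.test as in the question, so that the package data set is output.funcs:

```sas
/* Assumption: the function lives in the data set output.funcs
   (OUTLIB=output.funcs.test in the question's PROC FCMP step). */
options cmplib=output.funcs;

/* Under %SYSFUNC the arguments are plain text, so no quotation marks. */
%put %sysfunc(sqlSelectByDateRange(month, myDateColumn));
```

Writing the result to the log with %PUT is a quick way to check the generated text before embedding it in a query.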

Second, you have a trailing comma in your select clause. You may need a dummy column as in the following:

PROC SQL;
  CREATE TABLE output.t AS (
    SELECT 
      %sqlSelectByDateRange(month, myDateColumn)
      0 AS dummy
    FROM
      output.myTable
  );
QUIT;

(Notice that there is no comma before dummy, as the comma is already embedded in your macro.)
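
An alternative to the dummy column is to stop the function from emitting the trailing comma in the first place. The following is only a sketch of that idea: the function name sqlSelectListNoComma and the lengths 200 and 60 are assumptions chosen for illustration, not part of the question's code. It relies on CATX skipping blank arguments when it joins the pieces.

```sas
proc fcmp outlib=output.funcs.test;
    /* Hypothetical variant: returns the column list without a trailing comma. */
    function sqlSelectListNoComma(interval $, date_field $) $ 200;
        length day week month year $ 60;

        /* Build every piece first, without embedded commas. */
        day   = catx(' ', date_field, 'AS day');
        week  = cats('WEEK(',  date_field, ') AS week');
        month = cats('MONTH(', date_field, ') AS month');
        year  = cats('YEAR(',  date_field, ') AS year');

        /* Blank out the pieces that are finer than the requested interval. */
        if interval = 'week' then day = '';
        if interval = 'month' then do;
            day = ''; week = '';
        end;
        if interval = 'year' then do;
            day = ''; week = ''; month = '';
        end;

        /* CATX ignores blank arguments, so no dangling separator is produced. */
        return(catx(', ', day, week, month, year));
    endsub;
run;

options cmplib=output.funcs;

proc sql;
    create table output.t as
    select %sysfunc(sqlSelectListNoComma(month, myDateColumn))
    from   output.myTable;
quit;
```

With this variant the SELECT clause needs neither a dummy column nor a trailing asterisk.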


UPDATE

I read your comment on another answer:

I also need to be able to do it for different date ranges and on a very ad-hoc basis, so it's something where I want to say "by month from june to december" or "weekly for two years" etc when someone makes a request.

I think I can recommend an easier way to accomplish what you are doing. First, I'll create a very simple dataset with dates and values. The dates are spread across different days, weeks, months and years:

DATA Work.Accounts;

    Format      Opened      yymmdd10.
                Value       dollar14.2
                ;

    INPUT       Opened      yymmdd10.
                Value       dollar14.2
                ;

DATALINES;
2012-12-31  $90,000.00
2013-01-01 $100,000.00
2013-01-02 $200,000.00
2013-01-03 $150,000.00
2013-01-15 $250,000.00
2013-02-10 $120,000.00
2013-02-14 $230,000.00
2013-03-01 $900,000.00
RUN;

You can now use the INTNX function to create a third column to round the "Opened" column to some time period, such as a 'WEEK', 'MONTH', or 'YEAR' (see this complete list):

%LET Period = YEAR;

PROC SQL NOPRINT;

    CREATE TABLE Work.PeriodSummary AS
    SELECT   INTNX( "&Period.", Opened, 0, 'E' ) AS Period_End FORMAT=yymmdd10.
           , SUM( Value )                        AS TotalValue FORMAT=dollar14.
    FROM     Work.Accounts
    GROUP BY Period_End
    ;

QUIT;

Output for WEEK:

Period_End   TotalValue
2013-01-05     $540,000
2013-01-19     $250,000
2013-02-16     $350,000
2013-03-02     $900,000

Output for MONTH:

Period_End   TotalValue
2012-12-31      $90,000
2013-01-31     $700,000
2013-02-28     $350,000
2013-03-31     $900,000

Output for YEAR:

Period_End   TotalValue
2012-12-31      $90,000
2013-12-31   $1,950,000
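
To cover ad-hoc requests such as "by month from June to December", the INTNX query can be wrapped in a macro, and a second macro can loop over a list of intervals. This is only a sketch: the macro names period_summary and all_periods, their parameters, and the output table names are made up for illustration. It assumes the Work.Accounts dataset created above.

```sas
/* Hypothetical helper: summarise Work.Accounts by one interval,
   optionally restricted to a date range given as DATE9 text (e.g. 01JAN2013). */
%macro period_summary(period=MONTH, from=, to=, out=Work.PeriodSummary);
    proc sql noprint;
        create table &out. as
        select   intnx("&period.", Opened, 0, 'E') as Period_End format=yymmdd10.
               , sum(Value)                        as TotalValue format=dollar14.
        from     Work.Accounts
        %if %length(&from.) > 0 and %length(&to.) > 0 %then %do;
        where    Opened between "&from."d and "&to."d
        %end;
        group by Period_End
        ;
    quit;
%mend period_summary;

/* Hypothetical driver: one output table per interval in a space-separated list. */
%macro all_periods(periods=WEEK MONTH YEAR);
    %local k bucket;
    %do k = 1 %to %sysfunc(countw(&periods., %str( )));
        %let bucket = %scan(&periods., &k., %str( ));
        %period_summary(period=&bucket., out=Work.t_&bucket.)
    %end;
%mend all_periods;

/* Weekly totals for January and February 2013 only. */
%period_summary(period=WEEK, from=01JAN2013, to=28FEB2013, out=Work.Weekly)

/* One table each for WEEK, MONTH and YEAR over the whole dataset. */
%all_periods()
```

Keeping the interval list in a macro parameter rather than a DATA step array matters here, because macro code cannot read a DATA step array while the macro is generating the PROC SQL text.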

#2


2  

As Cyborg37 says, you probably should get rid of that trailing comma in your function. But note you do not really need to create a macro to do this, just use the %SYSFUNC function directly:

proc sql;
  create table output.t as
  select %sysfunc( sqlSelectByDateRange(month, myDateColumn) )
         * /* to avoid the trailing comma */
  from output.myTable;
quit;

Also, although this is a clever use of user-defined functions, it's not very clear why you want to do this. There are probably better solutions available that will not cause as much potential confusion in your code. User-defined functions, like user-written macros, can make life easier but they can also create an administrative nightmare.

#3


1  

I could make all sorts of guesses as to why you're getting errors, but fundamentally, don't do it this way. You can do exactly what you're trying to do in a data step that is much easier to troubleshoot and much easier to implement than a FCMP function which is really just trying to be a data step anyway.

Steps:
1. Create a dataset that has your possible date pulls. If you're using this a lot, you can put this in a permanent library that is defined in your SAS AUTOEXEC.
2. Create a macro that pulls the needed date strings from it.
3. If you want, use PROC FCMP to make this a function-style macro, using RUN_MACRO.
4. If you do that, use %SYSFUNC to call it.

Here is something that does this:

1:

data pull_list;
infile datalines dlm='|';
length query $50. type $8.;
input type $ typenum query $;
datalines;
day|1|&date_field. as day
week|2|week(&date_field.) as week
month|3|month(&date_field.) as month
year|4|year(&date_field.) as year
;;;;
run;

2:

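%macro pull_list(type=,date_field=);
/* NOTE: the next two assignments hard-code the values and override whatever
   is passed in; they look like leftover test values. */
%let date_field = datevar;
%let type = week;
proc sql noprint;
/* Collect the query text for the requested type and every coarser type. */
select query into :sellist separated by ',' 
from pull_list
where typenum >= (select typenum from pull_list where type="&type.");
quit;
%mend pull_list;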

3:

proc fcmp outlib = work.functions.funcs;
   function pull_list(type $,date_field $) $;
      rc = run_macro('pull_list', type,date_field);
      if rc eq 0 then return("&sellist.");
      else return(' ');
   endsub;
run;

4:

data test;
input datevar 5.;
datalines;
18963
19632
18131
19105
;;;;
run;
option cmplib = (work.functions);

proc sql;
select %sysfunc(pull_list(week,datevar)) from test;
quit;

One of the big advantages of this is that you can add additional types without having to worry about the function's code - just add a row to pull_list and it works. If you want to set it up to do that, I recommend using something other than 1,2,3,4 for typenum - use 10,20,30,40 or something so you have gaps (say, if "twoweek" is added, it would be between 2 and 3, and 25 is easier than 2.5 for people to think about). Create that pull_list dataset, put it on a network drive where all of your users can use it (if anybody beyond you uses it, or a personal one if not), and go from there.
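
The gapped numbering could look like the sketch below. It follows the same layout as the pull_list dataset above; the "twoweek" row and its expression are purely hypothetical and are there only to show a new type being slotted in between week and month.

```sas
/* Sketch: typenum uses steps of 10 so that new types can be inserted later.
   The twoweek row (typenum 25) is an illustrative assumption. */
data pull_list;
    infile datalines dlm='|';
    length type $8 query $50;
    input type $ typenum query $;
    datalines;
day|10|&date_field. as day
week|20|week(&date_field.) as week
twoweek|25|ceil(week(&date_field.)/2) as twoweek
month|30|month(&date_field.) as month
year|40|year(&date_field.) as year
;;;;
run;
```

Because the macro filters on typenum >= the requested type's typenum, no other code has to change when a row like this is added.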
