My company receives data from a client that cannot provide it in any directly importable format, so we have to import several reports that use a grouped layout like the one below. We have had to develop in-house methods to ungroup each report before importing it so that we get all of the data we need. Currently a member of my team uses MS Access / VBA to generate the needed detail records, but I want to move this to a server-based, automated process. We use SQL Server 2008 R2 for storage, and I would like to use SSIS to accomplish the task. Does anyone know of a way I can generate the detail records and import the data directly into SQL Server?
2 Solutions
#1
1
Hmm, well, you will definitely have to do some programmatic adjusting of the data set to add the group date to each detail line. I'm unsure how you will be importing the xlsx, but I would recommend first off using an SSIS package and doing the adjustments in a Script Task as the "best" way to do this. See here on how to handle Excel in SSIS Script Tasks.
If you don't know SSIS, or especially programming, though, your next best bet (in my opinion) is to import the data into a staging table, do the manipulations with T-SQL, and then insert that table into your main table. I did a SQL Fiddle of this here.
CREATE TABLE ActivitySummary
(
id int identity(1,1),
activity_date date,
activity varchar(100),
paid_time decimal(5,2),
unpaid_time decimal(5,2),
total_time decimal(5,2)
)
CREATE TABLE ActivitySummary_STG
(
id int identity(1,1),
activity_date date,
activity varchar(100),
paid_time decimal(5,2),
unpaid_time decimal(5,2),
total_time decimal(5,2)
)
GO
-- Simulate import of Excel sheet into staging table
truncate table ActivitySummary_STG;
GO
INSERT INTO ActivitySummary_STG (activity_date, activity, paid_time, unpaid_time, total_time)
select '8/14/17',null,null,null,null
UNION ALL
select null,'001 Lunch',0,4.4,4.4
UNION ALL
select null,'002 Break',4.2,0,4.2
UNION ALL
select null,'007 System Down',7.45,0,7.45
UNION ALL
select null,'019 End of Work Day',0.02,0,0.02
UNION ALL
select '8/15/17',null,null,null,null
UNION ALL
select null,'001 Lunch',0,4.45,4.45
UNION ALL
select null,'002 Break',6.53,0,6.53
UNION ALL
select null,'007 System Down',0.51,0,0.51
UNION ALL
select null,'019 End of Work Day',0.02,0,0.02
GO
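-- Walk the staging rows in id order: remember each group's date from its header row,
-- delete that header row, and stamp the remembered date onto the detail rows that follow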
declare @table_count int = (select COALESCE(count(id),0) from ActivitySummary_STG);
declare @counter int = 1;
declare @activity_date date,
@current_date date;
WHILE (@table_count > 0 AND @counter <= @table_count)
BEGIN
select @activity_date = activity_date
from ActivitySummary_STG
where id = @counter;
if (@activity_date is not null)
BEGIN
set @current_date = @activity_date;
delete from ActivitySummary_STG
where id = @counter;
END
else
BEGIN
update ActivitySummary_STG SET
activity_date = @current_date
where id = @counter;
END
set @counter += 1;
END
INSERT INTO ActivitySummary (activity_date, activity, paid_time, unpaid_time, total_time)
select activity_date, activity, paid_time, unpaid_time, total_time
from ActivitySummary_STG;
truncate table ActivitySummary_STG;
GO
select * from ActivitySummary;
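The row-by-row loop above can also be written as a single set-based statement. The following is only a sketch, not part of the original answer. It assumes the same two tables, and that the staging rows were loaded in file order so that id reflects the row order of the report. It would run in place of the WHILE loop and the INSERT that follows it. CROSS APPLY is available in SQL Server 2008 R2.

```sql
-- Sketch: set-based alternative to the WHILE loop (assumes id follows report row order).
INSERT INTO ActivitySummary (activity_date, activity, paid_time, unpaid_time, total_time)
SELECT hdr.activity_date, d.activity, d.paid_time, d.unpaid_time, d.total_time
FROM ActivitySummary_STG AS d
CROSS APPLY
(
    -- Nearest preceding header row: the last row above this one that carries a date
    SELECT TOP (1) h.activity_date
    FROM ActivitySummary_STG AS h
    WHERE h.id < d.id
      AND h.activity_date IS NOT NULL
    ORDER BY h.id DESC
) AS hdr
WHERE d.activity_date IS NULL   -- detail rows only; date header rows are skipped
ORDER BY d.id;

TRUNCATE TABLE ActivitySummary_STG;
```

One difference from the loop: CROSS APPLY drops any detail row that has no date header above it, whereas the loop would insert such a row with a null activity_date. Switching to OUTER APPLY would keep those rows.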
#2
1
I'd do it with a script component.
Total Data Flow:
ExcelSource --> Script Component (Transformation) --> Conditional Split --> SQL Destination
In script component:
Check ActivitySummary on InputColumns
Add ActivityDate as output column.
Open Script:
Outside of your row processing, add:
// Holds the most recent group date seen so far
public DateTime dte;
Inside row processing:
DateTime parsed;
// A row whose ActivitySummary parses as a date is a group header: remember its date
if (DateTime.TryParse(Row.ActivitySummary.ToString(), out parsed))
{dte = parsed;}
else
{Row.ActivityDate = dte;}
Then add a Conditional Split to remove the rows with a null ActivityDate (these are the date header rows).