I'm following the example walkthrough "Export to SQL from Application Insights using Stream Analytics". I am trying to export custom event dimensions (context.custom.dimensions in the JSON example below), which are added to the data file as a nested JSON array. How do I flatten the dimensions array at context.custom.dimensions for export to SQL?
JSON...
{
"event": [
{
"name": "50_DistanceSelect",
"count": 1
}
],
"internal": {
"data": {
"id": "aad2627b-60c5-48e8-aa35-197cae30a0cf",
"documentVersion": "1.5"
}
},
"context": {
"device": {
"os": "Windows",
"osVersion": "Windows 8.1",
"type": "PC",
"browser": "Chrome",
"browserVersion": "Chrome 43.0",
"screenResolution": {
"value": "1920X1080"
},
"locale": "unknown",
"id": "browser",
"userAgent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36"
},
"application": {},
"location": {
"continent": "North America",
"country": "United States",
"point": {
"lat": 38.0,
"lon": -97.0
},
"clientip": "0.115.6.185",
"province": "",
"city": ""
},
"data": {
"isSynthetic": false,
"eventTime": "2015-07-15T23:43:27.595Z",
"samplingRate": 0.0
},
"operation": {
"id": "2474EE6F-5F6F-48C3-BA43-51636928075A"
},
"user": {
"anonId": "BA05C4BE-1C42-482F-9836-D79008E78A9D",
"anonAcquisitionDate": "0001-01-01T00:00:00Z",
"authAcquisitionDate": "0001-01-01T00:00:00Z",
"accountAcquisitionDate": "0001-01-01T00:00:00Z"
},
"custom": {
"dimensions": [
{
"CategoryAction": "click"
},
{
"SessionId": "73ef454d-fa39-4125-b4d0-44486933533b"
},
{
"WebsiteVersion": "3.0"
},
{
"PageSection": "FilterFind"
},
{
"Category": "EventCategory1"
},
{
"Page": "/page-in-question"
}
],
"metrics": []
},
"session": {
"id": "062703E5-5E15-491A-AC75-2FE54EF03623",
"isFirst": false
}
}
}
3 Solutions
#1
6
A slightly more dynamic solution is to set up a temporary result set with a WITH clause:
WITH ATable AS (
SELECT
temp.internal.data.id as ID
,dimensions.ArrayValue.CategoryAction as CategoryAction
,dimensions.ArrayValue.SessionId as SessionId
,dimensions.ArrayValue.WebsiteVersion as WebsiteVersion
,dimensions.ArrayValue.PageSection as PageSection
,dimensions.ArrayValue.Category as Category
,dimensions.ArrayValue.Page as Page
FROM [analyticseventinputs] temp
CROSS APPLY GetElements(temp.[context].[custom].[dimensions]) as dimensions)
and then join on a unique key:
FROM [analyticseventinputs] Input
Left JOIN ATable CategoryAction on
Input.internal.data.id = CategoryAction.ID AND
CategoryAction.CategoryAction <> '' AND
DATEDIFF(day, Input, CategoryAction) BETWEEN 0 AND 5
The rather annoying part is that the DATEDIFF condition is required: joins are intended to combine two streams of data, but in this case you are only joining on the unique key, so I set it to a large value of 5 days. Compared to the other solution, this really only protects against the custom parameters arriving in a different order.
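To make the shape of the data concrete, here is a minimal Python sketch (plain Python, not Stream Analytics code) of what the CROSS APPLY step and the key-based join do conceptually. The function names `cross_apply_dimensions` and `collapse_by_id` are made up for illustration, and the event is trimmed from the sample in the question.

```python
# Illustrative only: plain Python standing in for the Stream Analytics query.
event = {
    "internal": {"data": {"id": "aad2627b-60c5-48e8-aa35-197cae30a0cf"}},
    "context": {"custom": {"dimensions": [
        {"CategoryAction": "click"},
        {"WebsiteVersion": "3.0"},
        {"Page": "/page-in-question"},
    ]}},
}

COLUMNS = ["CategoryAction", "WebsiteVersion", "Page"]


def cross_apply_dimensions(ev):
    """Return one row per array element, similar to CROSS APPLY GetElements."""
    rows = []
    for element in ev["context"]["custom"]["dimensions"]:
        row = {"ID": ev["internal"]["data"]["id"]}
        for col in COLUMNS:
            # Each element holds a single key, so the other columns stay empty.
            row[col] = element.get(col)
        rows.append(row)
    return rows


def collapse_by_id(rows):
    """Keep the non-empty value per column for each ID, as the joins do."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row["ID"], {"ID": row["ID"]})
        for col in COLUMNS:
            if row[col] is not None:
                target[col] = row[col]
    return list(merged.values())


rows = cross_apply_dimensions(event)
flat = collapse_by_id(rows)
print(len(rows), "rows after the cross apply")
print(flat[0])
```

Because the collapse step matches on the property name rather than on the array position, the order of the elements in the dimensions array does not matter.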
#2
5
Most tutorials online use CROSS APPLY or OUTER APPLY. However, that is not what you are looking for, because it puts each property on a different row. To overcome this, use the functions GetRecordPropertyValue and GetArrayElement as shown below. This flattens the properties into a single row.
SELECT
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 0), 'CategoryAction') AS CategoryAction,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 1), 'SessionId') AS SessionId,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 2), 'WebsiteVersion') AS WebsiteVersion,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 3), 'PageSection') AS PageSection,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 4), 'Category') AS Category,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 5), 'Page') AS Page
INTO
[outputstream]
FROM
[inputstream] MySource
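One thing to keep in mind with this approach is that each property is looked up at a fixed array index. The following Python sketch (not Stream Analytics code) contrasts a positional lookup with a lookup by name. The helpers `by_position` and `by_name` are hypothetical, and returning None for a missing index or property is an assumption of this sketch, so check how your own job behaves in those cases.

```python
def by_position(dimensions, index, name):
    """Rough stand-in for GetRecordPropertyValue(GetArrayElement(d, i), n)."""
    if index >= len(dimensions):
        return None  # assumption of this sketch: out of range yields None
    return dimensions[index].get(name)


def by_name(dimensions, name):
    """Scan every element for the property, ignoring its position."""
    for element in dimensions:
        if name in element:
            return element[name]
    return None


ordered = [
    {"CategoryAction": "click"},
    {"SessionId": "73ef454d-fa39-4125-b4d0-44486933533b"},
]
shuffled = list(reversed(ordered))

print(by_position(ordered, 0, "CategoryAction"))
print(by_position(shuffled, 0, "CategoryAction"))
print(by_name(shuffled, "CategoryAction"))
```

If the dimensions always arrive in the same order, the positional query is the simplest option. If the order can vary, a name-based approach is safer.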
#3
2
What schema do you have in SQL? Do you want a single row in SQL with all the dimensions as columns?
This might not be possible today. However, more Array/Record functions will be available in Azure Stream Analytics after July 30.
Then you will be able to do something like this:
SELECT
-- Each index must match the element's position in the dimensions array.
CASE
WHEN GetArrayLength(A.context.custom.dimensions) > 0
THEN GetRecordPropertyValue(GetArrayElement(A.context.custom.dimensions, 0), 'CategoryAction')
ELSE ''
END AS CategoryAction,
CASE
WHEN GetArrayLength(A.context.custom.dimensions) > 1
THEN GetRecordPropertyValue(GetArrayElement(A.context.custom.dimensions, 1), 'WebsiteVersion')
ELSE ''
END AS WebsiteVersion,
CASE
WHEN GetArrayLength(A.context.custom.dimensions) > 2
THEN GetRecordPropertyValue(GetArrayElement(A.context.custom.dimensions, 2), 'PageSection')
ELSE ''
END AS PageSection
FROM input A
If you want separate rows per dimension, you can use the CROSS APPLY operator.
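As a rough picture of the "separate rows per dimension" shape, the Python sketch below (not Stream Analytics code) turns the dimensions array into (event id, name, value) rows, which would fit a three-column SQL table. The function name `dimension_rows` and the three-column layout are assumptions made for illustration.

```python
def dimension_rows(event_id, dimensions):
    """Emit one (event_id, name, value) tuple per property in the array."""
    rows = []
    for element in dimensions:
        for name, value in element.items():
            rows.append((event_id, name, value))
    return rows


dims = [
    {"CategoryAction": "click"},
    {"PageSection": "FilterFind"},
    {"Page": "/page-in-question"},
]

for row in dimension_rows("aad2627b-60c5-48e8-aa35-197cae30a0cf", dims):
    print(row)
```

This layout needs no schema change when a new dimension is added, at the cost of having to pivot the rows when you query them.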