Forgive the n00b-ish question but I am new to data structures. I had been recently asked to aggregate a given array over another array and produce a tree based result.
请原谅我的问题,但我对数据结构并不熟悉。我最近被要求在另一个数组上聚合给定的数组并生成基于树的结果。
Can someone give me some pointers on how to attain this output?
有人能给我一些关于如何获得这个输出的建议吗?
INPUT
输入
var T = [
['COUNTRY', 'GENDER', 'MARITAL STATUS', 'SALES'],
['India', 'Female', 'Single', 2400],
['India', 'Male', 'Single', 5200],
['India', 'Female', 'Married', 4300],
['India', 'Male', 'Married', 3200],
['England', 'Female', 'Single', 1600],
['England', 'Female', 'Married', 2000],
['England', 'Male', 'Single', 4800],
['England', 'Male', 'Married', 6400],
];
var A = ['GENDER', 'MARITAL STATUS', 'COUNTRY'];
OUTPUT: Use 2*spaces for each leaf node.
输出:为每个叶节点使用2*空格。
TOTAL 29900
Female <Female Total>
Single <Single Female Total>
India <Single Female Total India>
England <Single Female Total England>
Married <Married Female Total>
India <Married Female Total India>
England <Married Female Total England>
Male <Male Total>
Single <Single Male Total>
India <Single Male Total India>
England <Single Male Total England>
Married <Married Male Total>
India <Married Male Total India>
England <Married Male Total England>
3 个解决方案
#1
6
The result can be represented by a nested object, each inner object is a sub tree with its total:
结果可以由一个嵌套对象表示,每个内部对象都是一个子树,其总数为:
{
total: 29900,
Female: {
total: 10300,
Single: {
total: 4000,
India: {
total: 2400
},
...
},
...
},
...
}
Just loop through all then entries, and add values to corresponding sub tree nodes.
只需遍历所有条目,并向相应的子树节点添加值。
For the output, you can use JSON.stringify
and strip unnecessary text from it.
对于输出,可以使用JSON。对不必要的文本进行字符串化和删除。
Warning: spoilers below
警告:以下剧透
const T = [
['COUNTRY', 'GENDER', 'MARITAL STATUS', 'SALES'],
['India', 'Female', 'Single', 2400],
['India', 'Male', 'Single', 5200],
['India', 'Female', 'Married', 4300],
['India', 'Male', 'Single', 3200],
['England', 'Female', 'Single', 1600],
['England', 'Female', 'Married', 2000],
['England', 'Male', 'Single', 4800],
['England', 'Male', 'Married', 6400],
]
const A = ['GENDER', 'MARITAL STATUS', 'COUNTRY']
function aggregate(T, A) {
const [fields, ...data] = T
const columns = A.map(name => fields.findIndex(it => name === it))
const count = fields.length - 1
const result = { total: 0 }
data.forEach(values => {
result.total += values[count]
//Go through the tree path, reduce is used here
//to avoid creating extra tracking variable for current position
columns.reduce((ref, index) => {
const key = values[index]
const next = ref[key] || (ref[key] = { total: 0 })
next.total += values[count]
return next
}, result)
})
return result
}
function pack(data) {
return Object.keys(data).reduce((result, key) => {
if (key !== 'total') {
const name = key + " " + data[key].total
result[name] = pack(data[key])
}
return result
}, {})
}
function format(result) {
return JSON.stringify(pack(result), null, 2)
.replace(/[^A-z0-9\n\s]/g, '')
.replace(/\n?\s*\n/g, '\n')
}
function output(it) {
const result = "TOTAL " + it.total + format(it)
console.log(result)
}
output(aggregate(T, A))
#2
2
A common approach for working with tree structures is to represent them as nested objects, as demonstrated by DarkKnight's answer, then create a string representation from this data structure.
使用树形结构的一种常见方法是将它们表示为嵌套对象,如DarkKnight的回答所示,然后从这个数据结构中创建一个字符串表示。
In the case presented by the OP, an alternative approach is to first sort the data, then create the string representation of the tree directly from the sorted data without needing any intermediate data structure of nested objects.
在OP提供的情况下,另一种方法是先对数据进行排序,然后直接从排序的数据创建树的字符串表示,而不需要任何嵌套对象的中间数据结构。
Given the array of columns to aggregate by,
给定列的数组,
['GENDER', 'MARITAL STATUS', 'COUNTRY']
we can sort the data by these columns:
我们可以按以下列对数据进行排序:
GENDER STATUS COUNTRY SALES
Female Single India 2400
Female Single England 1600
Female Married India 4300
Female Married England 2000
Male Single India 5200
Male Single England 4800
Male Married India 3200
Male Married England 6400
Looping backwards from the last row, while aggregating, we can build the string representation of the tree bottom up. The last row differs from the previous at level 3 (COUNTRY), which gives the following output:
从最后一行向后循环,在聚合时,我们可以构建树下向上的字符串表示。最后一行与第3级(国家)的前一行不同,第3级输出如下:
England 6400
The row before differs on both level 3 (COUNTRY) and 2 (MARITAL STATUS), prepended to the current output:
之前的行在第3级(国家)和第2级(婚姻状况)上都不相同,都是在当前输出之前:
Married 9600
India 3200
England 6400
After the row before that:
在那之前的那行之后:
England 4800
Married 9600
India 3200
England 6400
Then, the 5th row differs from the one before on all 3 levels:
那么,第五行在三个层面上都与前一行不同:
Male 19600
Single 10000
India 5200
England 4800
Married 9600
India 3200
England 6400
And so on, until the whole tree is represented:
等等,直到整个树被表示出来:
Total 29900
Female 10300
Single 4000
India 2400
England 1600
Married 6300
India 4300
England 2000
Male 19600
Single 10000
India 5200
England 4800
Married 9600
India 3200
England 6400
Below is working code (ES3 compliant), demonstrating the approach.
下面是工作代码(ES3兼容),演示了该方法。
var T = [
['COUNTRY', 'GENDER', 'MARITAL STATUS', 'SALES'],
['India', 'Female', 'Single', 2400],
['India', 'Male', 'Single', 5200],
['India', 'Female', 'Married', 4300],
['India', 'Male', 'Married', 3200],
['England', 'Female', 'Single', 1600],
['England', 'Female', 'Married', 2000],
['England', 'Male', 'Single', 4800],
['England', 'Male', 'Married', 6400],
];
var A = ['GENDER', 'MARITAL STATUS', 'COUNTRY'];
var valueField = 'SALES';
// sortBy GENDER ascending
// MARITAL STATUS descending
// COUNTRY descending
var sortDirection = [1, -1, -1];
function find(arr, val){
for(var i = 0 ; i < arr.length; i++){
if(arr[i] === val) return i;
}
return -1;
}
function buildTreeString(
data, aggregateBy, sortDirection, valueField
) {
var i, record,
value, level,
groupBy = [],
result = [],
sums = [],
total = 0;
// get column index of valueField
valueField = find(data[0], valueField);
// get column indexes to groupBy
for(var i = 0; i < aggregateBy.length; i++) {
groupBy[i] = find(data[0], aggregateBy[i]);
}
// sort data
data = data.slice(1)
.sort(function(a, b) {
var i, compare = 0, column;
for(i = 0; i < groupBy.length && !compare; i++) {
column = groupBy[i];
compare = a[column] < b[column] ?
sortDirection[i] :
(a[column] > b[column] ?
-sortDirection[i] : 0);
}
return compare;
});
// loop over data rows, output tree nodes
for(i = 0; i < data.length; i++) {
record = data[i];
value = record[valueField];
total += value;
//loop over columns, starting from deepest level
for(level = groupBy.length - 1; level > -1; level--) {
sums[level] = (sums[level] || 0) + value;
if(i == data.length - 1 ||
record[groupBy[level]] != data[i + 1][groupBy[level]]) {
//output tree node
result.push(
Array(level + 2).join(' ') +
record[groupBy[level]] + ' ' +
sums[level]);
//reset level aggregation
sums[level] = 0;
}
}
}
result.push('Total ' + total);
result.reverse();
return result.join('\n');
}
console.log(
buildTreeString(T, A, sortDirection, valueField)
);
#3
2
Tree is a good option to implement this question. You can also do the aggregation in one-time pass. The basic idea is
树是实现这个问题的一个很好的选择。您还可以一次性地进行聚合。基本思想是
-
sort the data by the given columns.
按给定列对数据进行排序。
-
loop the array, check the given column's value
循环数组,检查给定列的值。
2.1 aggregate the group count if column's value is as same as previous line
2.1如果列的值与前一行相同,则聚合组计数
2.2 output group name and count if column's value is different with previous line
2.2如果列的值与前一行不同,则输出组名和计数
Here is a example which I guided a CS student for his assignment, which is pretty similar to yours.
这是我指导一个CS学生完成作业的一个例子,和你们的作业很相似。
The method sumaryStage3 at Here implement the logic of steps 2.
这里的sumaryStage3方法实现步骤2的逻辑。
Pls ignore the code style and quality. It's not my code.
请忽略代码样式和质量。这不是我的代码。
#1
6
The result can be represented by a nested object, each inner object is a sub tree with its total:
结果可以由一个嵌套对象表示,每个内部对象都是一个子树,其总数为:
{
total: 29900,
Female: {
total: 10300,
Single: {
total: 4000,
India: {
total: 2400
},
...
},
...
},
...
}
Just loop through all then entries, and add values to corresponding sub tree nodes.
只需遍历所有条目,并向相应的子树节点添加值。
For the output, you can use JSON.stringify
and strip unnecessary text from it.
对于输出,可以使用JSON。对不必要的文本进行字符串化和删除。
Warning: spoilers below
警告:以下剧透
const T = [
['COUNTRY', 'GENDER', 'MARITAL STATUS', 'SALES'],
['India', 'Female', 'Single', 2400],
['India', 'Male', 'Single', 5200],
['India', 'Female', 'Married', 4300],
['India', 'Male', 'Single', 3200],
['England', 'Female', 'Single', 1600],
['England', 'Female', 'Married', 2000],
['England', 'Male', 'Single', 4800],
['England', 'Male', 'Married', 6400],
]
const A = ['GENDER', 'MARITAL STATUS', 'COUNTRY']
function aggregate(T, A) {
const [fields, ...data] = T
const columns = A.map(name => fields.findIndex(it => name === it))
const count = fields.length - 1
const result = { total: 0 }
data.forEach(values => {
result.total += values[count]
//Go through the tree path, reduce is used here
//to avoid creating extra tracking variable for current position
columns.reduce((ref, index) => {
const key = values[index]
const next = ref[key] || (ref[key] = { total: 0 })
next.total += values[count]
return next
}, result)
})
return result
}
function pack(data) {
return Object.keys(data).reduce((result, key) => {
if (key !== 'total') {
const name = key + " " + data[key].total
result[name] = pack(data[key])
}
return result
}, {})
}
function format(result) {
return JSON.stringify(pack(result), null, 2)
.replace(/[^A-z0-9\n\s]/g, '')
.replace(/\n?\s*\n/g, '\n')
}
function output(it) {
const result = "TOTAL " + it.total + format(it)
console.log(result)
}
output(aggregate(T, A))
#2
2
A common approach for working with tree structures is to represent them as nested objects, as demonstrated by DarkKnight's answer, then create a string representation from this data structure.
使用树形结构的一种常见方法是将它们表示为嵌套对象,如DarkKnight的回答所示,然后从这个数据结构中创建一个字符串表示。
In the case presented by the OP, an alternative approach is to first sort the data, then create the string representation of the tree directly from the sorted data without needing any intermediate data structure of nested objects.
在OP提供的情况下,另一种方法是先对数据进行排序,然后直接从排序的数据创建树的字符串表示,而不需要任何嵌套对象的中间数据结构。
Given the array of columns to aggregate by,
给定列的数组,
['GENDER', 'MARITAL STATUS', 'COUNTRY']
we can sort the data by these columns:
我们可以按以下列对数据进行排序:
GENDER STATUS COUNTRY SALES
Female Single India 2400
Female Single England 1600
Female Married India 4300
Female Married England 2000
Male Single India 5200
Male Single England 4800
Male Married India 3200
Male Married England 6400
Looping backwards from the last row, while aggregating, we can build the string representation of the tree bottom up. The last row differs from the previous at level 3 (COUNTRY), which gives the following output:
从最后一行向后循环,在聚合时,我们可以构建树下向上的字符串表示。最后一行与第3级(国家)的前一行不同,第3级输出如下:
England 6400
The row before differs on both level 3 (COUNTRY) and 2 (MARITAL STATUS), prepended to the current output:
之前的行在第3级(国家)和第2级(婚姻状况)上都不相同,都是在当前输出之前:
Married 9600
India 3200
England 6400
After the row before that:
在那之前的那行之后:
England 4800
Married 9600
India 3200
England 6400
Then, the 5th row differs from the one before on all 3 levels:
那么,第五行在三个层面上都与前一行不同:
Male 19600
Single 10000
India 5200
England 4800
Married 9600
India 3200
England 6400
And so on, until the whole tree is represented:
等等,直到整个树被表示出来:
Total 29900
Female 10300
Single 4000
India 2400
England 1600
Married 6300
India 4300
England 2000
Male 19600
Single 10000
India 5200
England 4800
Married 9600
India 3200
England 6400
Below is working code (ES3 compliant), demonstrating the approach.
下面是工作代码(ES3兼容),演示了该方法。
var T = [
['COUNTRY', 'GENDER', 'MARITAL STATUS', 'SALES'],
['India', 'Female', 'Single', 2400],
['India', 'Male', 'Single', 5200],
['India', 'Female', 'Married', 4300],
['India', 'Male', 'Married', 3200],
['England', 'Female', 'Single', 1600],
['England', 'Female', 'Married', 2000],
['England', 'Male', 'Single', 4800],
['England', 'Male', 'Married', 6400],
];
var A = ['GENDER', 'MARITAL STATUS', 'COUNTRY'];
var valueField = 'SALES';
// sortBy GENDER ascending
// MARITAL STATUS descending
// COUNTRY descending
var sortDirection = [1, -1, -1];
function find(arr, val){
for(var i = 0 ; i < arr.length; i++){
if(arr[i] === val) return i;
}
return -1;
}
function buildTreeString(
data, aggregateBy, sortDirection, valueField
) {
var i, record,
value, level,
groupBy = [],
result = [],
sums = [],
total = 0;
// get column index of valueField
valueField = find(data[0], valueField);
// get column indexes to groupBy
for(var i = 0; i < aggregateBy.length; i++) {
groupBy[i] = find(data[0], aggregateBy[i]);
}
// sort data
data = data.slice(1)
.sort(function(a, b) {
var i, compare = 0, column;
for(i = 0; i < groupBy.length && !compare; i++) {
column = groupBy[i];
compare = a[column] < b[column] ?
sortDirection[i] :
(a[column] > b[column] ?
-sortDirection[i] : 0);
}
return compare;
});
// loop over data rows, output tree nodes
for(i = 0; i < data.length; i++) {
record = data[i];
value = record[valueField];
total += value;
//loop over columns, starting from deepest level
for(level = groupBy.length - 1; level > -1; level--) {
sums[level] = (sums[level] || 0) + value;
if(i == data.length - 1 ||
record[groupBy[level]] != data[i + 1][groupBy[level]]) {
//output tree node
result.push(
Array(level + 2).join(' ') +
record[groupBy[level]] + ' ' +
sums[level]);
//reset level aggregation
sums[level] = 0;
}
}
}
result.push('Total ' + total);
result.reverse();
return result.join('\n');
}
console.log(
buildTreeString(T, A, sortDirection, valueField)
);
#3
2
Tree is a good option to implement this question. You can also do the aggregation in one-time pass. The basic idea is
树是实现这个问题的一个很好的选择。您还可以一次性地进行聚合。基本思想是
-
sort the data by the given columns.
按给定列对数据进行排序。
-
loop the array, check the given column's value
循环数组,检查给定列的值。
2.1 aggregate the group count if column's value is as same as previous line
2.1如果列的值与前一行相同,则聚合组计数
2.2 output group name and count if column's value is different with previous line
2.2如果列的值与前一行不同,则输出组名和计数
Here is a example which I guided a CS student for his assignment, which is pretty similar to yours.
这是我指导一个CS学生完成作业的一个例子,和你们的作业很相似。
The method sumaryStage3 at Here implement the logic of steps 2.
这里的sumaryStage3方法实现步骤2的逻辑。
Pls ignore the code style and quality. It's not my code.
请忽略代码样式和质量。这不是我的代码。