So here's my challenge. I have a spreadsheet that looks like this:
prod_id | pack | value | durable | feat | ease | grade | # of ratings
1       | 75   | 85    | 99      | 90   | 90   | 88    | 1
2       | 90   | 95    | 81      | 86   | 87   | 88    | 9
3       | 87   | 86    | 80      | 85   | 82   | 84    | 37
4       | 92   | 80    | 68      | 67   | 45   | 70    | 5
5       | 93   | 81    | 94      | 93   | 90   | 90    | 4
6       | 93   | 70    | 60      | 60   | 70   | 70    | 1
Each product has individual grade criteria (packaging through ease of use), an overall average grade, and the number of ratings the product received.
In my full data set, 68% of the products fall in the 80-89 grade range. I need the grades recalculated to take into account the number of ratings for each product, so that products that fall far below the overall average number of ratings are ranked lower (and receive a lower grade). Basically, a product with a grade of 84 and 100 ratings should rank higher than a product with a grade of 95 and only 5 ratings.
I hope this makes sense, thanks for any help in advance!
2 solutions
#1
I can't tell exactly without a calculator, but it looks like
Grade = AVG(pack, value, durable, feat, ease)
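For anyone who wants to check that guess without a calculator, here is a quick sketch in Python. The numbers are copied from the question's table; the `rows` layout and the `mean` helper are just my own scaffolding. On this sample, each mean lands within one point of the recorded grade.

```python
# Criteria scores (pack, value, durable, feat, ease) and the recorded grade,
# copied from the question's table. The dict layout is only for illustration.
rows = {
    1: ([75, 85, 99, 90, 90], 88),
    2: ([90, 95, 81, 86, 87], 88),
    3: ([87, 86, 80, 85, 82], 84),
    4: ([92, 80, 68, 67, 45], 70),
    5: ([93, 81, 94, 93, 90], 90),
    6: ([93, 70, 60, 60, 70], 70),
}


def mean(values):
    """Plain arithmetic mean, the same idea as AVG(...) in the spreadsheet."""
    return sum(values) / len(values)


for prod_id, (criteria, grade) in rows.items():
    avg = mean(criteria)
    print(f"product {prod_id}: mean of criteria = {avg:.1f}, recorded grade = {grade}")
```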
If that's the case, then you just have to define "fall far below the total average number of ratings". I'd weight by the number of standard deviations from the mean, which may or may not be a decent algorithm (I'm not a statistician). This means a product whose number of ratings is exactly the mean gets a weight of 1, and the weight goes up or down from there.
WeightedGrade = Grade * (1 + (Rating - AVG(H:H)) / STDEV(H:H))
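Here is a rough Python sketch of the idea as described in words: a weight of exactly 1 at the mean number of ratings, moving up or down by 1 for each standard deviation away from it. That reading of the weight (1 plus the z-score) is my assumption, and the names `weight` and `weighted_grade` are made up for the example. The data is the sample from the question.

```python
from statistics import mean, stdev

# Grades and rating counts copied from the question's table.
grades = {1: 88, 2: 88, 3: 84, 4: 70, 5: 90, 6: 70}
ratings = {1: 1, 2: 9, 3: 37, 4: 5, 5: 4, 6: 1}

counts = list(ratings.values())
avg_count = mean(counts)   # plays the role of AVG(H:H)
sd_count = stdev(counts)   # sample standard deviation, like STDEV(H:H)


def weight(n_ratings):
    """Assumed weight: 1 at the mean, +/- 1 per standard deviation away.

    Note: this goes negative for counts more than one standard
    deviation below the mean, so it may need clamping on other data.
    """
    return 1 + (n_ratings - avg_count) / sd_count


def weighted_grade(prod_id):
    return grades[prod_id] * weight(ratings[prod_id])


# Rank products from highest to lowest weighted grade.
ranking = sorted(grades, key=weighted_grade, reverse=True)
for prod_id in ranking:
    print(prod_id, round(weighted_grade(prod_id), 1))
```

On this sample, product 3 (37 ratings) comes out on top even though its raw grade is lower than those of products 1, 2 and 5.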
#2
What you need is a meaningful weighting algorithm. You can choose anything that makes sense to you, but the first thing to try, based on your requirements, is to multiply the raw grade by a weighting factor. Taking that factor as the product's # of ratings divided by the total # of ratings gives this:
prod id | raw grade | # ratings | weight      | weighted grade
1       | 88        | 1         | 0.01754386  | 1.543859649
2       | 88        | 9         | 0.157894737 | 13.89473684
3       | 84        | 37        | 0.649122807 | 54.52631579
4       | 70        | 5         | 0.087719298 | 6.140350877
5       | 90        | 4         | 0.070175439 | 6.315789474
6       | 70        | 1         | 0.01754386  | 1.228070175
total   |           | 57        |             |
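If you'd rather script it than do it in the spreadsheet, the table can be reproduced with a few lines of Python. This is only a sketch of the same arithmetic; the variable names are mine.

```python
# Raw grades and rating counts copied from the table.
grades = {1: 88, 2: 88, 3: 84, 4: 70, 5: 90, 6: 70}
ratings = {1: 1, 2: 9, 3: 37, 4: 5, 5: 4, 6: 1}

total_ratings = sum(ratings.values())  # 57 for this sample

# weight = product's share of all ratings; weighted grade = raw grade * weight
weights = {p: n / total_ratings for p, n in ratings.items()}
weighted = {p: grades[p] * weights[p] for p in grades}

for p in grades:
    print(p, grades[p], ratings[p], round(weights[p], 6), round(weighted[p], 4))
```

Since the weights are shares of the total, they add up to 1, so every weighted grade shrinks well below the raw grade.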
Not sure if this makes sense for your problem, but it does meet your requirements. Maybe you can normalize the weighted grades so that the top one (product 3 here) is 100 and scale the rest relative to it.
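The normalization could look something like this in Python. It's a sketch that assumes "scale the rest" means dividing each weighted grade by the largest one and multiplying by 100; the name `scaled` is just for the example.

```python
# Weighted grades from the table (raw grade * share of total ratings).
weighted = {
    1: 1.543859649,
    2: 13.89473684,
    3: 54.52631579,
    4: 6.140350877,
    5: 6.315789474,
    6: 1.228070175,
}

# Assumed normalization: the highest weighted grade becomes 100.
top = max(weighted.values())
scaled = {p: 100 * w / top for p, w in weighted.items()}

for p, s in sorted(scaled.items(), key=lambda item: item[1], reverse=True):
    print(p, round(s, 1))
```

On this sample the runner-up (product 2) only reaches about 25, because product 3 holds roughly 65% of all ratings, so you may want a gentler weight if that spread is too harsh.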
Have a look at "Collective Intelligence" for some other ideas.