One of the issues I've run into with Apache Spark is visualizing Decision Trees.
I can produce a tree using DecisionTree.trainClassifier, and I can get some rudimentary output using:
print(model.toDebugString())
But ideally, the current output:
If (feature 0 <= -35.0)
  If (feature 24 <= 176.0)
    Predict: 2.1
  If (feature 24 = 176.0)
    Predict: 4.2
  Else (feature 24 > 176.0)
    Predict: 6.3
Else (feature 0 > -35.0)
  If (feature 24 <= 11.0)
    Predict: 4.5
  Else (feature 24 > 11.0)
    Predict: 10.2
could be output as JSON, or some other parseable format, so that we could layer a D3 visualization library on top of it. Using the example above...
{
  "node": [
    {
      "name":"node1",
      "rule":"feature 0 <= -35.0",
      "children":[
        {
          "name":"node2",
          "rule":"feature 24 <= 176.0",
          "children":[
            {
              "name":"node4",
              "rule":"feature 20 < 116.0",
              "predict": 2.1
            },
            {
              "name":"node5",
              "rule":"feature 20 = 116.0",
              "predict": 4.2
            },
            {
              "name":"node5",
              "rule":"feature 20 > 116.0",
              "predict": 6.3
            }
          ]
        },
        {
          "name":"node3",
          "rule":"feature 0 > -35.0",
          "children":[
            {
              "name":"node7",
              "rule":"feature 3 <= 11.0",
              "predict": 4.5
            },
            {
              "name":"node8",
              "rule":"feature 3 > 11.0",
              "predict": 10.2
            }
          ]
        }
      ]
    }
  ]
}
1 Answer
I came across the project Decision-Tree-Visualization-Spark for visualizing a Decision Tree model.
It has two steps:
- Parse Spark Decision Tree output to a JSON format.
- Use the JSON file as an input to a D3.js visualization.
For the parser, check Dt.py.
The input to the function tree_json(tree) is your model's toDebugString() output.
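To make the parsing step concrete, here is a minimal sketch of how the debug string could be turned into nested JSON. It is not the code from Dt.py: the function name debug_string_to_dict, the synthetic "root" wrapper, and the sequential node names are all assumptions made for illustration. It also assumes that each nested level in the debug string is indented more deeply than its parent, and it uses a trimmed, strictly binary version of the example from the question as sample input.

```python
import json
import re

# Matches lines such as "If (feature 0 <= -35.0)" or "Else (feature 0 > -35.0)".
RULE_RE = re.compile(r"^(If|Else)\s*\((.*)\)\s*$")
# Matches lines such as "Predict: 2.1".
PREDICT_RE = re.compile(r"^Predict:\s*(\S+)")


def debug_string_to_dict(debug_string):
    """Hypothetical helper: build a nested dict from toDebugString() text.

    Assumes indentation depth encodes tree depth. Lines that are neither
    rules nor predictions (for example the model summary header) are skipped.
    """
    root = {"name": "root", "children": []}
    stack = [(-1, root)]  # (indent, node) pairs along the current branch
    counter = 0
    for raw in debug_string.splitlines():
        text = raw.strip()
        if not text:
            continue
        indent = len(raw) - len(raw.lstrip())
        rule = RULE_RE.match(text)
        predict = PREDICT_RE.match(text)
        if not rule and not predict:
            continue
        # Drop siblings and deeper nodes so the stack top is this line's parent.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if rule:
            counter += 1
            node = {"name": "node%d" % counter, "rule": rule.group(2), "children": []}
            parent["children"].append(node)
            stack.append((indent, node))
        else:
            # A prediction turns its parent rule into a leaf.
            parent.pop("children", None)
            parent["predict"] = float(predict.group(1))
    return root


# Sample input: a trimmed version of the output shown in the question.
SAMPLE = """DecisionTreeModel classifier of depth 2 with 7 nodes
  If (feature 0 <= -35.0)
   If (feature 24 <= 176.0)
    Predict: 2.1
   Else (feature 24 > 176.0)
    Predict: 6.3
  Else (feature 0 > -35.0)
   If (feature 24 <= 11.0)
    Predict: 4.5
   Else (feature 24 > 11.0)
    Predict: 10.2
"""

print(json.dumps(debug_string_to_dict(SAMPLE), indent=2))
```

With a real model you would pass model.toDebugString() instead of SAMPLE and write the result to a file for the D3.js step. The "name" / "children" layout is chosen because D3's hierarchy layouts commonly consume that shape, but the exact keys your visualization expects may differ.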