Visualizing Decision Trees in Apache Spark

Time: 2021-12-01 23:09:24

One of the issues I've run into with Apache Spark is visualizing decision trees.

I can produce a tree using DecisionTree.trainClassifier, and I can get some rudimentary output using:

print(model.toDebugString())

But ideally, the current output:

If (feature 0 <= -35.0)
  If (feature 24 <= 176.0)
    Predict: 2.1
  If (feature 24 = 176.0)
    Predict: 4.2
  Else (feature 24 > 176.0)
    Predict: 6.3
Else (feature 0 > -35.0)
  If (feature 24 <= 11.0)
    Predict: 4.5
  Else (feature 24 > 11.0)
    Predict: 10.2

could be output as JSON, or some other parseable format, so that we could layer a D3 visualization on top of it. Using the example above:

{
    "node": [
        {
            "name": "node1",
            "rule": "feature 0 <= -35.0",
            "children": [
                {
                    "name": "node2",
                    "rule": "feature 24 <= 176.0",
                    "children": [
                        {
                            "name": "node4",
                            "rule": "feature 20 < 116.0",
                            "predict": 2.1
                        },
                        {
                            "name": "node5",
                            "rule": "feature 20 = 116.0",
                            "predict": 4.2
                        },
                        {
                            "name": "node6",
                            "rule": "feature 20 > 116.0",
                            "predict": 6.3
                        }
                    ]
                },
                {
                    "name": "node3",
                    "rule": "feature 0 > -35.0",
                    "children": [
                        {
                            "name": "node7",
                            "rule": "feature 3 <= 11.0",
                            "predict": 4.5
                        },
                        {
                            "name": "node8",
                            "rule": "feature 3 > 11.0",
                            "predict": 10.2
                        }
                    ]
                }
            ]
        }
    ]
}
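
To illustrate why a nested structure like this is convenient, here is a minimal Python sketch that walks such a document and lists the chain of rules leading to each prediction. The helper name leaf_paths and the small DOC sample are made up for illustration; only the "node", "rule", "children" and "predict" keys come from the JSON above.

```python
import json


def leaf_paths(node, rules=()):
    """Yield (rules_from_root, prediction) for every leaf below `node`."""
    # Extend the rule chain with this node's own rule, if it has one.
    rules = rules + (node["rule"],) if "rule" in node else rules
    if "predict" in node:
        yield list(rules), node["predict"]
    for child in node.get("children", []):
        yield from leaf_paths(child, rules)


# A small hypothetical document in the same shape as the JSON above.
DOC = json.loads("""
{"node": [{"name": "node1", "rule": "feature 0 <= -35.0",
  "children": [
    {"name": "node2", "rule": "feature 24 <= 176.0", "predict": 2.1},
    {"name": "node3", "rule": "feature 24 > 176.0", "predict": 6.3}
  ]}]}
""")

# Collect one entry per leaf across all top-level nodes.
paths = [p for top in DOC["node"] for p in leaf_paths(top)]
for rules, prediction in paths:
    print(" AND ".join(rules), "->", prediction)
```

The same "name" plus "children" nesting is also what hierarchical D3 layouts typically consume, which is why this shape is a natural target.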

1 Answer

#1


3  

I came across the project Decision-Tree-Visualization-Spark, which is meant for visualizing Spark decision tree models.

It works in two steps:

  • Parse Spark Decision Tree output to a JSON format.
  • Use the JSON file as an input to a D3.js visualization.

For the parser, see Dt.py.

The input to the function tree_json(tree) is the string returned by your model's toDebugString().
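
To give an idea of what the parsing step involves, here is a minimal sketch of an indentation-based parser. It is not the project's Dt.py: the function name debug_string_to_tree, the synthetic "root" node and the sequential nodeN names are assumptions made for illustration, and it assumes the debug string uses only If (...), Else (...) and Predict: lines, with nesting expressed by indentation.

```python
import json
import re

# "If (...)" / "Else (...)" lines carry a rule; "Predict:" lines carry a leaf value.
RULE_RE = re.compile(r"^(\s*)(If|Else)\s*\((.*)\)\s*$")
PREDICT_RE = re.compile(r"^(\s*)Predict:\s*(\S+)\s*$")


def debug_string_to_tree(text):
    """Turn toDebugString()-style text into nested dicts (hypothetical helper)."""
    root = {"name": "root", "children": []}
    stack = [(-1, root)]  # (indentation, node); the root sits above every line
    counter = 0
    for line in text.splitlines():
        rule = RULE_RE.match(line)
        pred = PREDICT_RE.match(line)
        if not rule and not pred:
            continue  # blank lines and the model summary header are skipped
        indent = len((rule or pred).group(1))
        # Pop back to the closest node that is less indented than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if pred:
            parent["predict"] = float(pred.group(2))
            continue
        counter += 1
        node = {"name": f"node{counter}", "rule": rule.group(3)}
        parent.setdefault("children", []).append(node)
        stack.append((indent, node))
    return root


SAMPLE = """\
If (feature 0 <= -35.0)
  If (feature 24 <= 176.0)
    Predict: 2.1
  Else (feature 24 > 176.0)
    Predict: 6.3
Else (feature 0 > -35.0)
  If (feature 24 <= 11.0)
    Predict: 4.5
  Else (feature 24 > 11.0)
    Predict: 10.2
"""

print(json.dumps(debug_string_to_tree(SAMPLE), indent=2))
```

In practice you would pass model.toDebugString() instead of SAMPLE and write the result to a file for the D3 step.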
