in Machine Learning by (19k points)

I have been exploring scikit-learn, making decision trees with both entropy and gini splitting criteria, and exploring the differences.

My question is: how can I "open the hood" and find out exactly which attributes the trees are splitting on at each level, along with their associated information values, so I can see where the two criteria make different choices?

So far, I have explored the 9 methods outlined in the documentation. They don't appear to allow access to this information. But surely this information is accessible? I'm envisioning a list or dict that has entries for node and gain.

Thanks for your help and my apologies if I've missed something completely obvious.

1 Answer

by (33.1k points)

To see which attributes a decision tree splits on, you can draw the tree with the pydot library for Python. The resulting graph shows which attribute is tested at each node and which branch each outcome follows.
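Before drawing anything, it may be enough to print the splits as text. The sketch below is not part of the original answer: it assumes scikit-learn 0.21 or later (for `export_text`) and uses the bundled iris dataset and a depth limit of 3 purely for illustration. It fits one tree per criterion and prints the feature and threshold tested at every node so the two can be compared side by side.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative data only; substitute your own X, y and feature names.
iris = load_iris()


def tree_rules(criterion):
    """Fit a small tree with the given criterion and return its rules as text."""
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=0)
    clf.fit(iris.data, iris.target)
    return export_text(clf, feature_names=list(iris.feature_names))


for criterion in ("gini", "entropy"):
    print("criterion =", criterion)
    print(tree_rules(criterion))
```

Each printed line names the feature and threshold for one node, with indentation showing the depth, so differences between the gini and entropy trees show up as differing lines.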

Code to draw a graph using PYDot:

 

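from pydot import Dot, Edge

# This is a method of a graph class that the answer does not show. The class
# must provide self.nodes, self.sucs(node) and self.catch_edges; the method
# name and signature below are assumptions added to make the fragment complete.
def draw(self, name, dname, draw_branches=True):
    g = Dot()
    g.set_node_defaults(color='lightgray', style='filled', shape='box',
                        fontname='Courier', fontsize='10')
    for node in sorted(self.nodes, key=lambda x: x.num):
        if draw_branches and node.type.is_cond:
            # Conditional node: green edge for the true branch, red for false.
            g.add_edge(Edge(str(node), str(node.true), color='green'))
            g.add_edge(Edge(str(node), str(node.false), color='red'))
        else:
            # Any other node: one blue edge per successor.
            for suc in self.sucs(node):
                g.add_edge(Edge(str(node), str(suc), color='blue'))
        # Exception edges are drawn dashed for every node.
        for except_node in self.catch_edges.get(node, []):
            g.add_edge(Edge(str(node), str(except_node),
                            color='black', style='dashed'))
    # Write the rendered graph to <dname>/<name>.png.
    g.write_png('%s/%s.png' % (dname, name))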

The code writes a PNG file containing a graph similar to the following:

image
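For the "list or dict that has entries for node and gain" the question asks about, a fitted scikit-learn tree also exposes its internals as arrays on the `tree_` attribute. The following is a minimal sketch rather than part of the original answer: the helper name `split_report`, the iris data and the depth limit are assumptions made for illustration. The gain is computed here as the parent's impurity minus the sample-weighted impurity of its two children, expressed in the units of whichever criterion the tree was fitted with.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def split_report(clf, feature_names):
    """Return one dict per internal node: feature, threshold, impurity, gain."""
    t = clf.tree_
    rows = []
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == right:  # leaf node: both child indices are -1
            continue
        n = t.weighted_n_node_samples[node]
        n_left = t.weighted_n_node_samples[left]
        n_right = t.weighted_n_node_samples[right]
        # Impurity decrease achieved by this split.
        children = (n_left * t.impurity[left] + n_right * t.impurity[right]) / n
        rows.append({
            "node": node,
            "feature": feature_names[t.feature[node]],
            "threshold": float(t.threshold[node]),
            "impurity": float(t.impurity[node]),
            "gain": float(t.impurity[node] - children),
        })
    return rows


# Illustrative data only; substitute your own X, y and feature names.
iris = load_iris()
reports = {}
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=0)
    clf.fit(iris.data, iris.target)
    reports[criterion] = split_report(clf, iris.feature_names)
    print("criterion =", criterion)
    for row in reports[criterion]:
        print(row)
```

Node 0 is the root, and the `children_left` and `children_right` arrays give the indices of each node's children, so the two reports can be compared node by node to find where the criteria first choose different features or thresholds.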

Hope this answer helps.
