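1. What is recursion?

A function can call other functions from inside its body. If a function calls itself from inside its own body, it is called a recursive function.

PS: Calling another function from inside a function is not function nesting. Function nesting means defining a child function inside another function.
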
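The three cases distinguished above (calling another function, nesting a function definition, and recursion) can be told apart in a short sketch. All function names here are made up for illustration and do not come from the original post.

```python
def square(x):
    return x * x


def sum_of_squares(a, b):
    # Calls another function: an ordinary call, not nesting.
    return square(a) + square(b)


def make_greeting(name):
    # Defines a child function inside the body: this is function nesting.
    def decorate(text):
        return "** " + text + " **"
    return decorate("hello, " + name)


def countdown(n):
    # Calls itself: this is recursion.
    if n == 0:          # termination condition
        return [0]
    return [n] + countdown(n - 1)


print(sum_of_squares(3, 4))   # 25
print(make_greeting("tree"))  # ** hello, tree **
print(countdown(3))           # [3, 2, 1, 0]
```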
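Properties of recursion:

- A recursive function must have an explicit termination condition (a base case).
- Each deeper level of recursion should work on a smaller problem than the level before it.
- Consecutive calls are closely linked: each call prepares the ground for the next one (usually the output of one call becomes the input of the next).
- Recursion is not very efficient, and recursing too deeply causes a stack overflow. Function calls are implemented with a stack: every call pushes a stack frame and every return pops one. Because the stack is finite, too many nested recursive calls overflow it.
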
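In CPython the stack limit shows up as a RecursionError once the interpreter's recursion limit is exceeded, rather than as a crash. The sketch below is only an illustration: depth_sum is a hypothetical helper, and the default limit (commonly 1000) can differ between environments.

```python
import sys


def depth_sum(n):
    """Recursively add the integers 1..n, using one stack frame per level."""
    if n <= 0:              # termination condition
        return 0
    return n + depth_sum(n - 1)


limit = sys.getrecursionlimit()   # the interpreter's current recursion limit
print("recursion limit:", limit)

print(depth_sum(100))             # shallow enough to succeed

try:
    depth_sum(limit * 10)         # far deeper than the interpreter allows
except RecursionError as exc:
    print("recursion too deep:", type(exc).__name__)
```

sys.setrecursionlimit can raise the limit, but the underlying stack is still finite, so a loop is usually the safer choice for very deep problems.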
Consider an example that sums the integers from 1 to n in two ways:

    # Summation with a loop
    def sum1(n):
        '''
        Return the sum of the integers from 1 to n.
        '''
        total = 0
        for i in range(1, n + 1):
            total += i
        return total

    # Summation with recursion
    def sum2(n):
        '''
        Return the sum of the integers from 1 to n.
        '''
        if n > 0:
            return n + sum2(n - 1)  # the function calls itself
        else:
            return 0

    print("loop sum -->", sum1(100))
    print("recursive sum -->", sum2(100))
    # Both print 5050

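As the example shows, both versions produce the same sum. So what are the advantages and disadvantages of the recursive version compared with the loop?
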
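2. What are the advantages and disadvantages of recursive functions?

Advantages of recursion

- The definition is simple and the logic is clear. In theory, every recursive function can be rewritten as a loop, but the loop version is often less clear than the recursive one.

Disadvantages of recursion

- Too many nested recursive calls cause a stack overflow.
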
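To make the trade-off concrete, here is a small sketch that sums every number in a nested list twice: once recursively and once with a loop plus an explicit stack. Both function names are hypothetical. The recursive version mirrors the shape of the data, while the loop version needs extra bookkeeping but is not bounded by the call stack.

```python
def nested_sum_recursive(item):
    """Sum every number in an arbitrarily nested list, recursively."""
    if isinstance(item, list):
        return sum(nested_sum_recursive(child) for child in item)
    return item  # base case: a plain number


def nested_sum_loop(item):
    """Same result with a loop and an explicit stack instead of recursion."""
    total = 0
    stack = [item]                  # work that still has to be done
    while stack:
        current = stack.pop()
        if isinstance(current, list):
            stack.extend(current)   # defer the children for later
        else:
            total += current
    return total


data = [1, [2, 3, [4]], [[5], 6]]
print(nested_sum_recursive(data))  # 21
print(nested_sum_loop(data))       # 21
```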
3. Building a decision tree with a recursive function

Implement the function build_tree(rows). This is the function that actually builds the tree. Follow the steps below:

- Use a recursive function here.
- Find the best split using the method we implemented before, and store the information gain and the question in local variables.
- Define the termination condition. If there is no gain, i.e. gain == 0, return a leaf node, Leaf(rows).
- Otherwise, partition the rows at the current node using the best question (the Determine object that we got before).
- We use DFS (depth-first search) to build the tree, so build the true_branch recursively first.
- Then build the false_branch recursively.
- Finally, return a DecisionNode object, since the starting point is also a DecisionNode.

Notes:

- This function might take some time and thought. Be patient.
- You need to understand the logic behind our decision tree (DT) before you start. Talk to me if you do not feel confident enough.
- Look up recursive functions and depth-first search if necessary.

The code is as follows:

    def build_tree(rows):
        """
        Build the decision tree recursively.
        :param rows: a subset of our data set
        :return: a DecisionNode, or a Leaf when the recursion ends; the top-level call returns the whole tree
        """
        # Find the best split for this subset of the data.
        # build_tree_best_question is already an object and can be used directly.
        build_tree_best_gain, build_tree_best_question = find_best_split(rows)
        # Termination condition: when the information gain is 0, return Leaf(rows).
        if build_tree_best_gain == 0:
            return Leaf(rows)
        # Partition the rows using the best question.
        true_node, false_node = partition(rows, build_tree_best_question)
        # Depth-first: build the true branch first, then the false branch.
        left_tree = build_tree(true_node)
        right_tree = build_tree(false_node)
        # Otherwise return a DecisionNode.
        return DecisionNode(build_tree_best_question, left_tree, right_tree)

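The helpers that build_tree relies on (find_best_split, partition, Leaf, DecisionNode and the question object) are defined elsewhere in the course and are not shown here. To see the recursion run end to end, the sketch below supplies minimal stand-ins for them. Everything in it is an assumption made for illustration: the Question class replaces the Determine object, information gain is computed from Gini impurity, the last column of each row is treated as the label, and classify is an extra helper that walks the finished tree recursively.

```python
from collections import Counter


class Question:
    """Hypothetical stand-in for the question object: is row[column] >= value?"""
    def __init__(self, column, value):
        self.column = column
        self.value = value

    def match(self, row):
        return row[self.column] >= self.value


class Leaf:
    """Stand-in leaf: stores the label counts of the rows that reached it."""
    def __init__(self, rows):
        self.predictions = Counter(row[-1] for row in rows)


class DecisionNode:
    """Stand-in inner node: a question plus its two subtrees."""
    def __init__(self, question, true_branch, false_branch):
        self.question = question
        self.true_branch = true_branch
        self.false_branch = false_branch


def gini(rows):
    """Gini impurity of the labels (assumed to be the last column)."""
    counts = Counter(row[-1] for row in rows)
    return 1 - sum((c / len(rows)) ** 2 for c in counts.values())


def partition(rows, question):
    true_rows = [r for r in rows if question.match(r)]
    false_rows = [r for r in rows if not question.match(r)]
    return true_rows, false_rows


def find_best_split(rows):
    """Try every (column, value) question and keep the one with the highest gain."""
    best_gain, best_question = 0, None
    current = gini(rows)
    for column in range(len(rows[0]) - 1):
        for value in {row[column] for row in rows}:
            question = Question(column, value)
            true_rows, false_rows = partition(rows, question)
            if not true_rows or not false_rows:
                continue  # this question does not divide the rows
            p = len(true_rows) / len(rows)
            gain = current - p * gini(true_rows) - (1 - p) * gini(false_rows)
            if gain > best_gain:
                best_gain, best_question = gain, question
    return best_gain, best_question


def build_tree(rows):
    gain, question = find_best_split(rows)
    if gain == 0:                           # termination condition
        return Leaf(rows)
    true_rows, false_rows = partition(rows, question)
    true_branch = build_tree(true_rows)     # depth-first: true side first
    false_branch = build_tree(false_rows)   # then the false side
    return DecisionNode(question, true_branch, false_branch)


def classify(row, node):
    """Walk the tree recursively until a leaf is reached."""
    if isinstance(node, Leaf):
        return node.predictions
    branch = node.true_branch if node.question.match(row) else node.false_branch
    return classify(row, branch)


rows = [[1, "A"], [2, "A"], [8, "B"], [9, "B"]]
tree = build_tree(rows)
print(classify([1.5], tree))  # Counter({'A': 2})
print(classify([8.5], tree))  # Counter({'B': 2})
```

Each recursive call receives a strictly smaller, non-empty subset of rows, which is why the recursion is guaranteed to reach a leaf.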
JAN 1.9