Let’s solve another python exercise. Here it is:

*4.8 You are given a binary tree in which each node contains a value. Design an algorithm
to print all paths which sum up to that value. Note that it can be any path in the tree
– it does not have to start at the root.
*

I first scratched this ugly drawing.

For the sum 8 I expect the solution to be:

[[3, 2, 3], [3, 2, 3, -2, 2], [3, 4, 1], [4, 1, 3]]

### Representation and test

Let’s write the simplest representation and a test. The code in the test() method creates the tree in my drawing:

class TreeNode: def __init__(self, data, left=None, right=None): self.data=data self.left=left self.right=right def get_paths(node,sum): def test(): n1=TreeNode(-2, TreeNode(2)) n2=TreeNode(3,n1,TreeNode(2)) n3=TreeNode(2,n2,TreeNode(1)) n4=TreeNode(1, TreeNode(3)); n5=TreeNode(4,n4) n8=TreeNode(3,n3,n5) p=get_paths(n8,8) res=[[3, 2, 3], [3, 2, 3, -2, 2], [3, 4, 1], [4, 1, 3]] res.sort() p.sort() assert (p==res)

### First attempt

Here is what I first came up with: A recursive function to compute all paths that sum to a given value starting at the root node. We will not stop the recursion once we have found the sum because we can have negative values so we might find it again. We then just traverse the tree and call the above function for each node as the root:

def get_paths_with_root(node,s_val): "returns a list of lists of nodes. Each list contains a path of nodes of sum s_val" if not node: return [] found_solution=[] if (node.data==s_val): found_solution = [[node.data]] l_left=get_paths_with_root(node.left,s_val-node.data) if l_left: l_left=[ [node.data]+x for x in l_left] l_right=get_paths_with_root(node.right,s_val-node.data) if l_right: l_right=[ [node.data]+x for x in l_right] return found_solution+l_left+l_right def get_paths(tree_node, val): "returns all paths with sum=val in the tree" if not tree_node: return [] return get_paths_with_root(tree_node,val) + get_paths(tree_node.left, val) + get_paths(tree_node.right, val)

Well, this is it. It’s maybe not 100% pythonic but at least the test passes. What’s the complexity of this ? Is O(n^2).

If you’re not convinced, this is why:

We traverse the tree once with ` get_paths `

method and, for each node, we traverse the again the tree having the current node as root.

Suppose an operation consists in visiting a node, we will have n operations for the root node, n/2 operations for the 2 nodes on level 2, n/4 operations for each of the 4 nodes on level 3 etc, which leads us to n + 2*n/2 + 4*n/4… = n^2

#### Update

Well, mebubo was not convinced about this complexity being O(n^2). He’s right: n + 2*n/2 + 4*n/4… is not n^2 because it is not n+n+..n, n tims but n+n+..n k times where k = log(n) is the number of levels, therfore the complexity is O(n*log(n))

Could we do better ?

### A cleaner solution

We could traverse the tree and keep, for each iteration, the chain of nodes which led us to the current node. We will search for our sum in this chain, starting with the current node and looking up:

def get_paths_optimized(node, sum_val, node_chain=None): if not node_chain: node_chain=[] if not node: return [] s=0 res=[] node_chain_tmp = [node.data] + node_chain for i, val in enumerate(node_chain_tmp): s=s+val if s==sum_val: solution=(node_chain_tmp[:i+1]) solution.reverse() res.append(solution) return res + get_paths_optimized(node.left, sum_val, node_chain_tmp) + get_paths_optimized(node.right, sum_val, node_chain_tmp)

So what’s the complexity for this one ?

For each node, we will have k operations, where k is a path of nodes starting from the root to the current node. Therefore we have n*k operations to perform. K is at most the height of the tree, therefore, intuitively, k is log(n). You can find demonstration of this here . Or read the one in the Cracking the coding interview.

So, the complexity of this second approach is n*log(n).

That’s all folks.

Come on, surely n + 2*n/2 + 4*n/4… is not n^2 🙂

yes of course, it is n*log(n). Which seems reasonable even without any demonstration: at each level we will perform at most n operations (at each level we will visit less than all the nodes) and we have in average log(n) levels (for a balanced tree).