DFS and practice questions

Precautions

Btw if input is given like [“X591X”,”X1X5X”,”X231X”, “1XXX1”] and we should convert it to 2d graph representation like this

def convert_to_graph(lst):
    graph = []
    for row in lst:
        graph.append(list(row))
    return graph

Also really important for the boundary check of 2x2 graph.

The problem lies in the condition next_x >= len(graph) when checking the bounds for next_x in the BFS loop. It should be next_x >= len(graph[0]) since the length of each row in graph represents the number of columns.

basic DFS with backtracking

Let’s say we are given list [1,2,3] and want to create permutation where order matters so like [1,3,2], [2,1,3], [2,3,1] , etc.

I had 2 problems.

1) What do I iterate through in my DFS? The index of the list or the actual number in the list?? The way to mark index or number as visited will differ so I wasn’t sure

2) I struggled with what parameters to pass in for my recursive dfs function. Should I pass the next index (i+1) for DFS to go to the next index? But what about list index out of range error? Certainly i+1 goes beyond my list and this error will occur!

Doubts to clear

1) Actually either way is fine! But the way to mark index or number as visited differs but that is the only difference

With index traversal:

nums = [1,2,3]
visited = [False for _ in range(len(nums))]
ans = []
path =[]

def dfs(nums,path,visited):
    # index traversal
    for i in range(len(nums)):
        if len(path) == len(nums):
            ans.append(path.copy())
            return
        if not visited[i]:
            visited[i]=True
            path.append(nums[i])
            dfs(nums,path,visited)
            path.pop()
            visited[i]=False
            
dfs(nums,path,visited)
print(ans)

Here, I am accessing the index of nums list and marking the index of visited list as true or false. Don’t worry about backtracking yet we go in later.

With number traversal:

nums = [1,2,3]
ans = []
path =[]

def dfs(nums,path):
    for num in nums:
        if len(path) == len(nums):
            ans.append(path.copy())
            return
        if num not in path:
            path.append(num)
            dfs(nums,path)
            path.pop()
         
dfs(nums,path)
print(ans)

As you can see, we cannot do something like

visited = [False for _ in range(len(nums))]

def dfs(nums,path,visited):
    for i in nums:
        if len(path) == len(nums):
            ans.append(path)
            return
        if not visited[i]:
            visited[i]=True
            path.append(i)
            dfs(nums,path,visited)
            path.pop()

because I am accessing my boolean visited array with an actual number, not the index. So in this case, we cannot use this way of marking number as visited. Instead, there is a much simpler way - if num not in list ! This checks for any incoming iterated number if it is already present (i.e. visited) in my path list!

2) Actually, I was confused at the fundamental concept of not visited[index] or if num not in list. We don’t need to pass in the next index or value and frankly we don’t need to care because the boolean list or if num not in list automatically checks this for us.

Let’s say we don’t put index+1 as parameter for our recursive function. I am afraid it will recur back to index and go on infinite loop. Will it actually?

def dfs(nums,path,visited):
    # index traversal
    for i in range(len(nums)):
        if len(path) == len(nums):
            ans.append(path.copy())
            return
        if not visited[i]:
            visited[i]=True
            path.append(nums[i])
            dfs(nums,path,visited)
            path.pop()
            visited[i]=False

Let’s say i=1 and the corresponding number is 2 right. So we mark visited[1] = True and append 2 to our path. Then go to dfs(nums,path,visited). i will go from 0 again. But it WILL NOT go on infinite loop check because of our boolean check or if num not in list. So i will go to 1 and since it is already visited, it will go to 2.

See? We don’t need to pass index+1 as parameter.

Example

Let’s take this leetcode question as example

If we encounter the X sign in 2x2 board, we do a DFS search on that index.

BTW BFS is used to find SHORTEST distance.

2 conditions for DFS

1) Specify the end condition

If DFS search goes beyond the boundary of the board + very important if we meet an already visited node (for example in this q, visited node is marked as dot while we are looking for X)

2) Mark the current,visiting node as visited(true)

Or you will get StackOverflow error, which means recursion is never- ending.

To make it as visited, we can create a 2x2 boolean array. But try to mark on the given board itself to save space. For example, here we can just mark visited node the same way as the space not occupied by battleship, which is marked by dot.

DFS template

    def countBattleships(self, board):
        count = 0
        for i in range(len(board)):
            for j in range(len(board[0])):
                if board[i][j] == 'X':
                    self.dfs(i, j, board)
                    count += 1
        return count
    
    def dfs(self, i, j, board):
        if i < 0 or i >= len(board) or j < 0 or j >= len(board[0]):
            return
        if board[i][j] == '.':
            return
        board[i][j] = '.'
        self.dfs(i + 1, j, board)
        self.dfs(i - 1, j, board)
        self.dfs(i, j + 1, board)
        self.dfs(i, j - 1, board)

Practice questions

네트워크

https://school.programmers.co.kr/learn/courses/30/lessons/43162

First the list that is given to us needs to be processed to something like

[[1, 1, 0], [1, 1, 0], [0, 0, 1]] 
to graph[i] like graph[0]=[0,1] , graph[1]=[0,1], graph[2]=[2]

We can do this like this where if value is 1, there is an edge connecting the 2 nodes so we add that to the list at graph[i]

matrix = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
graph = {}

for i in range(len(matrix)):
    graph[i] = [j for j, value in enumerate(matrix[i]) if value == 1]

print(graph)

from collections import defaultdict
def solution(n, computers):
    graph = {}
    for i in range(len(computers)):
        graph[i] = [j for j, value in enumerate(computers[i]) if value == 1]

    print(graph)
    ans = 0
    visited=  [False for _ in range(len(graph))]
    for i in range(len(graph)):
      if not visited[i]:
        dfs(i,visited,graph)
        ans+=1
    # print(ans)
    return ans

def dfs(i,visited,graph):
  visited[i] = True
  for next_node in graph[i]:
    if not visited[next_node]:
      dfs(next_node,visited,graph)

타겟 넘버

https://school.programmers.co.kr/learn/courses/30/lessons/43165

DFS first

Initial thinking is like this

def solution(numbers, target):
  answer = 0
  
  dfs(0,0,numbers,target,0)
  print(answer)
  return answer
  
def dfs(index,curr_sum,numbers,target):
  if index==len(numbers) and curr_sum==target:
    answer+=1
  dfs(index+1,curr_sum+1)
  dfs(index+1,curr_sum-1)  

But how do you update answer value in python? You need to pass it to my dfs function as parameter for value to be updated.

Ok so let’s pass that answer as parameter like this

def solution(numbers, target):
    answer = 0
    dfs(0, 0, numbers, target, 0)
    print(answer)
    return answer

def dfs(index, curr_sum, numbers, target, answer):
    if index == len(numbers) and curr_sum == target:
        answer += 1
    dfs(index + 1, curr_sum + numbers[index], numbers, target, answer)
    dfs(index + 1, curr_sum - numbers[index], numbers, target, answer)

Now we get a index out of range error. But that is strange. I added a condition to stop recursion when index reaches the len(numbers).

Actually, if you look carefully, my condition does not work properly. Let’s take index = 5 and len(numbers) is also 5. Assuming that curr_sum is target too, we increment answer. Then, nothing stops the code from going to the next line and executing those 2 dfs functions that take index+1 as parameter, which goes beyond our list range.

To effectively stop those recursive calls once final index is reached, we need to add a return before those dfs calls.

if index == len(numbers):
    if curr_sum == target:
        answer += 1
    return

Now all seems fixed right? Until our answer value is still 0. Why?

In python, we have to understand mutable and immutable objects. Integers like our answer are immutable which means its value cannot be changed once it is declared. Since answer is declared as integer 0, we can’t change that. However, lists are mutable so we can update answer value if it is stored in a list like

def solution(numbers, target):
    answer = [0]  # Use a mutable list to store the answer count

    dfs(0, 0, numbers, target, answer)

    return answer[0]


def dfs(index, curr_sum, numbers, target, answer):
    if index == len(numbers):
        if curr_sum == target:
            answer[0] += 1
        return

    dfs(index + 1, curr_sum + numbers[index], numbers, target, answer)
    dfs(index + 1, curr_sum - numbers[index], numbers, target, answer)

There is a more common way like

def solution(numbers, target):
    return dfs(0, 0, numbers, target)

def dfs(index, curr_sum, numbers, target):
    count = 0
    if index == len(numbers):
        if curr_sum == target:
            return 1
        return 0

    
    count += dfs(index + 1, curr_sum + numbers[index], numbers, target)
    count += dfs(index + 1, curr_sum - numbers[index], numbers, target)
    return count

We can also use global keyword like here to modify a global variable inside and outside our function. In Python, integers are indeed immutable, which means their values cannot be changed once they are created. However, when you use the global keyword inside a function, it allows you to modify the value of a global variable from within that function.

def solution(numbers, target):
    global answer
    answer = 0
    dfs(0, 0, numbers, target)
    print(answer)
    return answer

def dfs(index, curr_sum, numbers, target):
    global answer
    if index == len(numbers):
        if curr_sum == target:
            answer += 1
        return 
    dfs(index + 1, curr_sum + numbers[index], numbers, target)
    dfs(index + 1, curr_sum - numbers[index], numbers, target)

You can also use nonlocal like

def solution(numbers, target):
  answer = 0

  def dfs(index, curr_sum):
    nonlocal answer
    if index == len(numbers):
      if curr_sum == target:
        answer += 1
      return
    dfs(index + 1, curr_sum + numbers[index])
    dfs(index + 1, curr_sum - numbers[index])

  dfs(0, 0)
  print(answer)
  return answer

BFS version

Taking note of DFS version, I tried thinking of BFS version

Wrong inital code

def solution(numbers, target):
  queue =deque()
  queue.append((0,0))
  count = bfs(queue,numbers,target)
  print(count)
  return count

def bfs(queue,numbers,target):
  count =0
  while queue:
    curr_index,curr_sum = queue.popleft()
    if curr_index==len(numbers):
      if curr_sum==target:
        count+=1
      break
    queue.append((curr_index+1,curr_sum+numbers[curr_index+1]))
    queue.append((curr_index+1,curr_sum-numbers[curr_index+1]))

Again im getting index out of range error.

Now firstly, we shouldnt be using break because that means all the elements waiting in our queue will be processed once we meet that curr_index==len(numbers) condition cuz it breaks out of the loop completely. Instead, we should use continue to process the other data waiting in our queue.

Secondly, index out of range error.While we are intercepting well the first parameter of queue.append((curr_index+1,curr_sum+numbers[curr_index+1])), the second parameter is the issue. Why? numbers[curr_index+1]. This surely will go beyond the length of our nums list.

Finally, we should be returning count in our bfs function because we want that count value

So correct solution is

from collections import deque

def solution(numbers, target):
    queue = deque()
    queue.append((0, 0))
    count = bfs(queue, numbers, target)
    print(count)
    return count

def bfs(queue, numbers, target):
    count = 0
    while queue:
        curr_index, curr_sum = queue.popleft()
        if curr_index == len(numbers):
            if curr_sum == target:
                count += 1
            continue
        queue.append((curr_index+1, curr_sum+numbers[curr_index]))
        queue.append((curr_index+1, curr_sum-numbers[curr_index]))
    return count

전력망을 둘로 나누기

https://school.programmers.co.kr/learn/courses/30/lessons/86971 This was similiar to 네트워크 problem so I tried solving it through dfs where I remove that wire from graph and do dfs search on that modified graph.

my wrong code

from collections import deque,defaultdict
import math


def solution(n, wires):
  graph = defaultdict(list)
  for wire in wires:
    graph[wire[0]].append(wire[1])
    graph[wire[1]].append(wire[0])
  print(graph)
  visited = [False for _ in range(n+1)]
  count =n
  for wire in wires:
    graph[wire[0]].remove(wire[1])
    graph[wire[1]].remove(wire[0])
    for i in range(1,len(graph)+1):
      if not visited[i] and graph[i]:
        count = min(count, dfs(visited, graph, i))
      # Restore the removed edges
    graph[wire[0]].append(wire[1])
    graph[wire[1]].append(wire[0])
  print(count)
  return count

def dfs(visited,graph,i):
  visited[i]=True
  count=1
  for next_node in graph[i]:
    if not visited[next_node]:
      count += dfs(visited,graph,next_node)
  return count

But the problem is that removing wire from my graph resulted in empty list for a given key and dfs still traverses through that. So that is the wrong part. I needed a different way to mark the disconnection, apart from this remove() way.

Also another problem was using the same visited list each time I iterated through wires. I should have instantiated a new one each time for each wire.

For the first problem, I googled and some used a 2d list like this check_link. (or you can create graph each time you iterate through wires but inefficient)

# 없앤 간선인지 아닌지 체크 값이 들어있는 리스트
check_link = [[True]*(n+1) for _ in range(n+1)]

for a, b in wires:
  visited = [False]*(n+1)

  check_link[a][b] = False # a-b 간선 끊기
  cnt_a = DFS(a, graph, visited, check_link)
  cnt_b = DFS(b, graph, visited, check_link)
  check_link[a][b] = True # a-b 간선 다시 연결

I fixed it like this

def solution(n, wires):
    graph = defaultdict(list)
    for wire in wires:
        graph[wire[0]].append(wire[1])
        graph[wire[1]].append(wire[0])
    print(graph)

    check_link = [[True for _ in range(n+1)] for _ in range(n+1)]
    count = n
    for wire in wires:
        visited = [False for _ in range(n+1)]
        check_link[wire[0]][wire[1]] = False
        count1= dfs(visited,check_link,graph,wire[0])
        count2=dfs(visited,check_link,graph,wire[1])
        count = min(abs(count1-count2),count)
            # Restore the removed edges
        check_link[wire[0]][wire[1]] = True
    print(count)
    return count

def dfs(visited,check_link,graph,node):
    visited[node]=True
    count=1
    for next_node in graph[node]:
        if not visited[next_node] and check_link[node][next_node]:
            count += dfs(visited,check_link, graph,next_node)
    return count

BFS and union find version (union find problem actually)

모음사전

https://school.programmers.co.kr/learn/courses/30/lessons/84512

Actually this can be solved via DFS but I learned some useful methods like product(), which does cartesian product.

First with just pure looping,

def solution(word):
    vowels = ['A', 'E', 'I', 'O', 'U']
    word_dict = {}  
    index = 1  

    for w1 in vowels:
        word_dict[w1] = index
        index += 1
        if w1 == word:
            return word_dict[w1]
        for w2 in vowels:
            word_dict[w1 + w2] = index
            index += 1
            if w1 + w2 == word:
                return word_dict[w1 + w2]
            for w3 in vowels:
                word_dict[w1 + w2 + w3] = index
                index += 1
                if w1 + w2 + w3 == word:
                    return word_dict[w1 + w2 + w3]
                for w4 in vowels:
                    word_dict[w1 + w2 + w3 + w4] = index
                    index += 1
                    if w1 + w2 + w3 + w4 == word:
                        return word_dict[w1 + w2 + w3 + w4]
                    for w5 in vowels:
                        word_dict[w1 + w2 + w3 + w4 + w5] = index
                        index += 1
                        if w1 + w2 + w3 + w4 + w5 == word:
                            return word_dict[w1 + w2 + w3 + w4 + w5]

    return 0  

With product() and repeat parameter, which allows how many repeats of same character in the word you are forming. Here we varied that repeat parameter via a for loop

from itertools import product

def solution(word):
    words = []
    for i in range(1, 6):
        for c in product(['A', 'E', 'I', 'O', 'U'], repeat=i):
            print(c)
            words.append(''.join(list(c)))

    words.sort()
    return words.index(word) + 1
solution("AAAE")

Lastly, via DFS

def dfs(cnt, w, word_list, words):
    if cnt == 5:
        return
    for i in range(len(words)):
        word_list.append(w + words[i])
        dfs(cnt + 1, w + words[i], word_list, words)

def solution(word):
    answer = 0
    word_list = []
    words = 'AEIOU'
    dfs(0, "", word_list, words)
    return word_list.index(word) + 1

solution("AAAE")

숫자 변환하기

https://school.programmers.co.kr/learn/courses/30/lessons/154538

initial inefficient code that had runtime issues

from collections import defaultdict, deque, Counter
def solution(x, y, n):
    queue =deque()
    queue.append((x,0))
    while queue:
        cur_val,cur_count= queue.popleft()
        if cur_val==y:
            # print(cur_count)
            return cur_count
        if cur_val>y:
            continue
        queue.append((2*cur_val,cur_count+1))
        queue.append((3*cur_val,cur_count+1))
        queue.append((cur_val+n,cur_count+1))
    return -1

improvement would be to use a visited list or memoisation so that we don’t calculate the same value twice

from collections import deque

def solution(x, y, n):
    visited = [0] * 1000001

    q = deque()
    q.append((x, 0))
    visited[x] = 1
    while q:
        num, cnt = q.popleft()
        if num == y:
            return cnt
        for next_num in (num + n, num * 2, num * 3):
            if next_num <= y and not visited[next_num]:
                visited[next_num] = 1
                q.append((next_num, cnt + 1))

    return -1