DFS and practice questions
Precautions
Btw if input is given like [“X591X”,”X1X5X”,”X231X”, “1XXX1”] and we should convert it to 2d graph representation like this
def convert_to_graph(lst):
graph = []
for row in lst:
graph.append(list(row))
return graph
Also really important for the boundary check of 2x2 graph.
The problem lies in the condition next_x >= len(graph) when checking the bounds for next_x in the BFS loop. It should be next_x >= len(graph[0]) since the length of each row in graph represents the number of columns.
basic DFS with backtracking
Let’s say we are given list [1,2,3] and want to create permutation where order matters so like [1,3,2], [2,1,3], [2,3,1] , etc.
I had 2 problems.
1) What do I iterate through in my DFS? The index of the list or the actual number in the list?? The way to mark index or number as visited will differ so I wasn’t sure
2) I struggled with what parameters to pass in for my recursive dfs function. Should I pass the next index (i+1) for DFS to go to the next index? But what about list index out of range error? Certainly i+1 goes beyond my list and this error will occur!
Doubts to clear
1) Actually either way is fine! But the way to mark index or number as visited differs but that is the only difference
With index traversal:
nums = [1,2,3]
visited = [False for _ in range(len(nums))]
ans = []
path =[]
def dfs(nums,path,visited):
# index traversal
for i in range(len(nums)):
if len(path) == len(nums):
ans.append(path.copy())
return
if not visited[i]:
visited[i]=True
path.append(nums[i])
dfs(nums,path,visited)
path.pop()
visited[i]=False
dfs(nums,path,visited)
print(ans)
Here, I am accessing the index of nums list and marking the index of visited list as true or false. Don’t worry about backtracking yet we go in later.
With number traversal:
nums = [1,2,3]
ans = []
path =[]
def dfs(nums,path):
for num in nums:
if len(path) == len(nums):
ans.append(path.copy())
return
if num not in path:
path.append(num)
dfs(nums,path)
path.pop()
dfs(nums,path)
print(ans)
As you can see, we cannot do something like
visited = [False for _ in range(len(nums))]
def dfs(nums,path,visited):
for i in nums:
if len(path) == len(nums):
ans.append(path)
return
if not visited[i]:
visited[i]=True
path.append(i)
dfs(nums,path,visited)
path.pop()
because I am accessing my boolean visited array with an actual number, not the index. So in this case, we cannot use this way of marking number as visited. Instead, there is a much simpler way - if num not in list ! This checks for any incoming iterated number if it is already present (i.e. visited) in my path list!
2) Actually, I was confused at the fundamental concept of not visited[index] or if num not in list. We don’t need to pass in the next index or value and frankly we don’t need to care because the boolean list or if num not in list automatically checks this for us.
Let’s say we don’t put index+1 as parameter for our recursive function. I am afraid it will recur back to index and go on infinite loop. Will it actually?
def dfs(nums,path,visited):
# index traversal
for i in range(len(nums)):
if len(path) == len(nums):
ans.append(path.copy())
return
if not visited[i]:
visited[i]=True
path.append(nums[i])
dfs(nums,path,visited)
path.pop()
visited[i]=False
Let’s say i=1 and the corresponding number is 2 right. So we mark visited[1] = True and append 2 to our path. Then go to dfs(nums,path,visited). i will go from 0 again. But it WILL NOT go on infinite loop check because of our boolean check or if num not in list. So i will go to 1 and since it is already visited, it will go to 2.
See? We don’t need to pass index+1 as parameter.
Example
Let’s take this leetcode question as example
If we encounter the X sign in 2x2 board, we do a DFS search on that index.
BTW BFS is used to find SHORTEST distance.
2 conditions for DFS
1) Specify the end condition
If DFS search goes beyond the boundary of the board + very important if we meet an already visited node (for example in this q, visited node is marked as dot while we are looking for X)
2) Mark the current,visiting node as visited(true)
Or you will get StackOverflow error, which means recursion is never- ending.
To make it as visited, we can create a 2x2 boolean array. But try to mark on the given board itself to save space. For example, here we can just mark visited node the same way as the space not occupied by battleship, which is marked by dot.
DFS template
def countBattleships(self, board):
count = 0
for i in range(len(board)):
for j in range(len(board[0])):
if board[i][j] == 'X':
self.dfs(i, j, board)
count += 1
return count
def dfs(self, i, j, board):
if i < 0 or i >= len(board) or j < 0 or j >= len(board[0]):
return
if board[i][j] == '.':
return
board[i][j] = '.'
self.dfs(i + 1, j, board)
self.dfs(i - 1, j, board)
self.dfs(i, j + 1, board)
self.dfs(i, j - 1, board)
Practice questions
네트워크
https://school.programmers.co.kr/learn/courses/30/lessons/43162
First the list that is given to us needs to be processed to something like
[[1, 1, 0], [1, 1, 0], [0, 0, 1]]
to graph[i] like graph[0]=[0,1] , graph[1]=[0,1], graph[2]=[2]
We can do this like this where if value is 1, there is an edge connecting the 2 nodes so we add that to the list at graph[i]
matrix = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
graph = {}
for i in range(len(matrix)):
graph[i] = [j for j, value in enumerate(matrix[i]) if value == 1]
print(graph)
from collections import defaultdict
def solution(n, computers):
graph = {}
for i in range(len(computers)):
graph[i] = [j for j, value in enumerate(computers[i]) if value == 1]
print(graph)
ans = 0
visited= [False for _ in range(len(graph))]
for i in range(len(graph)):
if not visited[i]:
dfs(i,visited,graph)
ans+=1
# print(ans)
return ans
def dfs(i,visited,graph):
visited[i] = True
for next_node in graph[i]:
if not visited[next_node]:
dfs(next_node,visited,graph)
타겟 넘버
https://school.programmers.co.kr/learn/courses/30/lessons/43165
DFS first
Initial thinking is like this
def solution(numbers, target):
answer = 0
dfs(0,0,numbers,target,0)
print(answer)
return answer
def dfs(index,curr_sum,numbers,target):
if index==len(numbers) and curr_sum==target:
answer+=1
dfs(index+1,curr_sum+1)
dfs(index+1,curr_sum-1)
But how do you update answer value in python? You need to pass it to my dfs function as parameter for value to be updated.
Ok so let’s pass that answer as parameter like this
def solution(numbers, target):
answer = 0
dfs(0, 0, numbers, target, 0)
print(answer)
return answer
def dfs(index, curr_sum, numbers, target, answer):
if index == len(numbers) and curr_sum == target:
answer += 1
dfs(index + 1, curr_sum + numbers[index], numbers, target, answer)
dfs(index + 1, curr_sum - numbers[index], numbers, target, answer)
Now we get a index out of range error. But that is strange. I added a condition to stop recursion when index reaches the len(numbers).
Actually, if you look carefully, my condition does not work properly. Let’s take index = 5 and len(numbers) is also 5. Assuming that curr_sum is target too, we increment answer. Then, nothing stops the code from going to the next line and executing those 2 dfs functions that take index+1 as parameter, which goes beyond our list range.
To effectively stop those recursive calls once final index is reached,
we need to add a return
before those dfs calls.
if index == len(numbers):
if curr_sum == target:
answer += 1
return
Now all seems fixed right? Until our answer value is still 0. Why?
In python, we have to understand mutable and immutable objects. Integers like our answer are immutable which means its value cannot be changed once it is declared. Since answer is declared as integer 0, we can’t change that. However, lists are mutable so we can update answer value if it is stored in a list like
def solution(numbers, target):
answer = [0] # Use a mutable list to store the answer count
dfs(0, 0, numbers, target, answer)
return answer[0]
def dfs(index, curr_sum, numbers, target, answer):
if index == len(numbers):
if curr_sum == target:
answer[0] += 1
return
dfs(index + 1, curr_sum + numbers[index], numbers, target, answer)
dfs(index + 1, curr_sum - numbers[index], numbers, target, answer)
There is a more common way like
def solution(numbers, target):
return dfs(0, 0, numbers, target)
def dfs(index, curr_sum, numbers, target):
count = 0
if index == len(numbers):
if curr_sum == target:
return 1
return 0
count += dfs(index + 1, curr_sum + numbers[index], numbers, target)
count += dfs(index + 1, curr_sum - numbers[index], numbers, target)
return count
We can also use global
keyword like here to modify a global variable inside
and outside our function. In Python, integers are indeed immutable, which means their values cannot be changed once they are created. However, when you use the global keyword inside a function, it allows you to modify the value of a global variable from within that function.
def solution(numbers, target):
global answer
answer = 0
dfs(0, 0, numbers, target)
print(answer)
return answer
def dfs(index, curr_sum, numbers, target):
global answer
if index == len(numbers):
if curr_sum == target:
answer += 1
return
dfs(index + 1, curr_sum + numbers[index], numbers, target)
dfs(index + 1, curr_sum - numbers[index], numbers, target)
You can also use nonlocal
like
def solution(numbers, target):
answer = 0
def dfs(index, curr_sum):
nonlocal answer
if index == len(numbers):
if curr_sum == target:
answer += 1
return
dfs(index + 1, curr_sum + numbers[index])
dfs(index + 1, curr_sum - numbers[index])
dfs(0, 0)
print(answer)
return answer
BFS version
Taking note of DFS version, I tried thinking of BFS version
Wrong inital code
def solution(numbers, target):
queue =deque()
queue.append((0,0))
count = bfs(queue,numbers,target)
print(count)
return count
def bfs(queue,numbers,target):
count =0
while queue:
curr_index,curr_sum = queue.popleft()
if curr_index==len(numbers):
if curr_sum==target:
count+=1
break
queue.append((curr_index+1,curr_sum+numbers[curr_index+1]))
queue.append((curr_index+1,curr_sum-numbers[curr_index+1]))
Again im getting index out of range error.
Now firstly, we shouldnt be using break because that means all the elements waiting in our queue will be processed once we meet that curr_index==len(numbers) condition cuz it breaks out of the loop completely. Instead, we should use continue to process the other data waiting in our queue.
Secondly, index out of range error.While we are intercepting well the first parameter of queue.append((curr_index+1,curr_sum+numbers[curr_index+1])), the second parameter is the issue. Why? numbers[curr_index+1]. This surely will go beyond the length of our nums list.
Finally, we should be returning count in our bfs function because we want that count value
So correct solution is
from collections import deque
def solution(numbers, target):
queue = deque()
queue.append((0, 0))
count = bfs(queue, numbers, target)
print(count)
return count
def bfs(queue, numbers, target):
count = 0
while queue:
curr_index, curr_sum = queue.popleft()
if curr_index == len(numbers):
if curr_sum == target:
count += 1
continue
queue.append((curr_index+1, curr_sum+numbers[curr_index]))
queue.append((curr_index+1, curr_sum-numbers[curr_index]))
return count
전력망을 둘로 나누기
https://school.programmers.co.kr/learn/courses/30/lessons/86971 This was similiar to 네트워크 problem so I tried solving it through dfs where I remove that wire from graph and do dfs search on that modified graph.
my wrong code
from collections import deque,defaultdict
import math
def solution(n, wires):
graph = defaultdict(list)
for wire in wires:
graph[wire[0]].append(wire[1])
graph[wire[1]].append(wire[0])
print(graph)
visited = [False for _ in range(n+1)]
count =n
for wire in wires:
graph[wire[0]].remove(wire[1])
graph[wire[1]].remove(wire[0])
for i in range(1,len(graph)+1):
if not visited[i] and graph[i]:
count = min(count, dfs(visited, graph, i))
# Restore the removed edges
graph[wire[0]].append(wire[1])
graph[wire[1]].append(wire[0])
print(count)
return count
def dfs(visited,graph,i):
visited[i]=True
count=1
for next_node in graph[i]:
if not visited[next_node]:
count += dfs(visited,graph,next_node)
return count
But the problem is that removing wire from my graph resulted in empty list for a given key and dfs still traverses through that. So that is the wrong part. I needed a different way to mark the disconnection, apart from this remove() way.
Also another problem was using the same visited list each time I iterated through wires. I should have instantiated a new one each time for each wire.
For the first problem, I googled and some used a 2d list like this check_link. (or you can create graph each time you iterate through wires but inefficient)
# 없앤 간선인지 아닌지 체크 값이 들어있는 리스트
check_link = [[True]*(n+1) for _ in range(n+1)]
for a, b in wires:
visited = [False]*(n+1)
check_link[a][b] = False # a-b 간선 끊기
cnt_a = DFS(a, graph, visited, check_link)
cnt_b = DFS(b, graph, visited, check_link)
check_link[a][b] = True # a-b 간선 다시 연결
I fixed it like this
def solution(n, wires):
graph = defaultdict(list)
for wire in wires:
graph[wire[0]].append(wire[1])
graph[wire[1]].append(wire[0])
print(graph)
check_link = [[True for _ in range(n+1)] for _ in range(n+1)]
count = n
for wire in wires:
visited = [False for _ in range(n+1)]
check_link[wire[0]][wire[1]] = False
count1= dfs(visited,check_link,graph,wire[0])
count2=dfs(visited,check_link,graph,wire[1])
count = min(abs(count1-count2),count)
# Restore the removed edges
check_link[wire[0]][wire[1]] = True
print(count)
return count
def dfs(visited,check_link,graph,node):
visited[node]=True
count=1
for next_node in graph[node]:
if not visited[next_node] and check_link[node][next_node]:
count += dfs(visited,check_link, graph,next_node)
return count
BFS and union find version (union find problem actually)
모음사전
https://school.programmers.co.kr/learn/courses/30/lessons/84512
Actually this can be solved via DFS but I learned some useful methods like product(), which does cartesian product.
First with just pure looping,
def solution(word):
vowels = ['A', 'E', 'I', 'O', 'U']
word_dict = {}
index = 1
for w1 in vowels:
word_dict[w1] = index
index += 1
if w1 == word:
return word_dict[w1]
for w2 in vowels:
word_dict[w1 + w2] = index
index += 1
if w1 + w2 == word:
return word_dict[w1 + w2]
for w3 in vowels:
word_dict[w1 + w2 + w3] = index
index += 1
if w1 + w2 + w3 == word:
return word_dict[w1 + w2 + w3]
for w4 in vowels:
word_dict[w1 + w2 + w3 + w4] = index
index += 1
if w1 + w2 + w3 + w4 == word:
return word_dict[w1 + w2 + w3 + w4]
for w5 in vowels:
word_dict[w1 + w2 + w3 + w4 + w5] = index
index += 1
if w1 + w2 + w3 + w4 + w5 == word:
return word_dict[w1 + w2 + w3 + w4 + w5]
return 0
With product() and repeat parameter, which allows how many repeats of same character in the word you are forming. Here we varied that repeat parameter via a for loop
from itertools import product
def solution(word):
words = []
for i in range(1, 6):
for c in product(['A', 'E', 'I', 'O', 'U'], repeat=i):
print(c)
words.append(''.join(list(c)))
words.sort()
return words.index(word) + 1
solution("AAAE")
Lastly, via DFS
def dfs(cnt, w, word_list, words):
if cnt == 5:
return
for i in range(len(words)):
word_list.append(w + words[i])
dfs(cnt + 1, w + words[i], word_list, words)
def solution(word):
answer = 0
word_list = []
words = 'AEIOU'
dfs(0, "", word_list, words)
return word_list.index(word) + 1
solution("AAAE")
숫자 변환하기
https://school.programmers.co.kr/learn/courses/30/lessons/154538
initial inefficient code that had runtime issues
from collections import defaultdict, deque, Counter
def solution(x, y, n):
queue =deque()
queue.append((x,0))
while queue:
cur_val,cur_count= queue.popleft()
if cur_val==y:
# print(cur_count)
return cur_count
if cur_val>y:
continue
queue.append((2*cur_val,cur_count+1))
queue.append((3*cur_val,cur_count+1))
queue.append((cur_val+n,cur_count+1))
return -1
improvement would be to use a visited list or memoisation so that we don’t calculate the same value twice
from collections import deque
def solution(x, y, n):
visited = [0] * 1000001
q = deque()
q.append((x, 0))
visited[x] = 1
while q:
num, cnt = q.popleft()
if num == y:
return cnt
for next_num in (num + n, num * 2, num * 3):
if next_num <= y and not visited[next_num]:
visited[next_num] = 1
q.append((next_num, cnt + 1))
return -1