Remove the minimum number of invalid parentheses in order to make the input string valid. Return all possible results.
Note: The input string may contain letters other than the parentheses ( and ).
Example 1:
Input: "()())()"
Output: ["()()()", "(())()"]
Example 2:
Input: "(a)())()"
Output: ["(a)()()", "(a())()"]
Example 3:
Input: ")("
Output: [""]
A naive solution is to enumerate all possible subsequences of the opening/closing parentheses using DFS, then check whether each one is balanced and uses the minimum number of removals. This is very inefficient: its runtime is always O(N * 2^M), where N is the length of S and M is the number of parentheses. The problem is that this solution does not use the internal structure of S to prune its search space.
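The balance check that the naive approach runs on every candidate can be sketched as below. This is only an illustration; the class name BalanceCheck and the method name isBalanced are hypothetical and do not appear in the solution further down.

```java
class BalanceCheck {
    // Returns true if every ')' closes an earlier unmatched '('
    // and no '(' is left open at the end. Other characters are ignored.
    static boolean isBalanced(String s) {
        int open = 0; // number of '(' still waiting for a match
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                open++;
            } else if (c == ')') {
                if (open == 0) {
                    return false; // a ')' with nothing to match
                }
                open--;
            }
        }
        return open == 0;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("(a)()()")); // true
        System.out.println(isBalanced(")("));      // false
    }
}
```

Each call costs O(N), which is where the N factor in O(N * 2^M) comes from.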
A better approach is to do DFS with pruning. Notice that every final result string removes the same number of opening parentheses as every other result, and likewise the same number of closing parentheses. We can find out exactly how many opening and closing parentheses must be removed with a single O(N) scan. Using this information, we derive the following algorithm.
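As a small sketch of that O(N) scan, the counting step can be isolated as follows. The class name RemovalCount and the method name countRemovals are made up for this illustration; the logic mirrors the first loop of the full solution below.

```java
import java.util.Arrays;

class RemovalCount {
    // Returns {leftRemove, rightRemove}: how many '(' and how many ')'
    // must be deleted to make s valid with the fewest removals.
    static int[] countRemovals(String s) {
        int leftRemove = 0, rightRemove = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                leftRemove++;          // unmatched until a ')' pairs with it
            } else if (c == ')') {
                if (leftRemove > 0) {
                    leftRemove--;      // pairs with an earlier '('
                } else {
                    rightRemove++;     // nothing to pair with, must be removed
                }
            }
        }
        return new int[] {leftRemove, rightRemove};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countRemovals("()())()"))); // [0, 1]
        System.out.println(Arrays.toString(countRemovals(")(")));      // [1, 1]
    }
}
```

For Example 1 the scan reports that exactly one ')' has to go, so the DFS never needs to explore branches that remove a '(' or more than one ')'.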
1. For each '(', as long as we have more to remove, remove the current one and recurse on the next character. Then recurse on the next character again with the current character kept.
2. For each ')', as long as we have more to remove, remove the current one and recurse on the next character. Then, if we have kept more '(' than ')' so far, recurse on the next character with the current ')' kept. We do this extra check to avoid keeping a ')' that has no preceding '(' to match. Example: ")()("
3. For non-parentheses letters, simply recurse on the next character.
The base condition is reached at the end of S: if we've removed the required number of opening and closing parentheses, we add the current subsequence to the final result.
One key point here is that we initially save all results in a set to remove duplicates. It is easy to see that we can get duplicate answer strings when there are adjacent parentheses of the same kind, since removing any one of them produces the same string. One way to get around this is to only remove the first parenthesis of each such cluster. But this does not work when more than one parenthesis must be removed from the cluster, so extra logic would be needed to handle all cases. Given that the worst-case runtime is already O(N * 2^M), using a set makes the implementation easier without sacrificing much runtime.
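To see where the duplicates come from, here is a tiny illustration using the input "())", where exactly one ')' must be removed. The class DuplicateDemo and the helper removeOne are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateDemo {
    // Builds one candidate per occurrence of target by deleting that occurrence.
    static List<String> removeOne(String s, char target) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == target) {
                out.add(s.substring(0, i) + s.substring(i + 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> all = removeOne("())", ')');
        Set<String> unique = new LinkedHashSet<>(all);
        System.out.println(all);    // [(), ()]
        System.out.println(unique); // [()]
    }
}
```

Deleting either of the two adjacent ')' gives the same string, so the list holds two copies while the set keeps one.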
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Solution {
    public List<String> removeInvalidParentheses(String s) {
        // O(N) scan: count how many '(' and ')' must be removed.
        int leftRemove = 0, rightRemove = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ')') {
                if (leftRemove > 0) {
                    leftRemove--;
                } else {
                    rightRemove++;
                }
            } else if (s.charAt(i) == '(') {
                leftRemove++;
            }
        }
        Set<String> ans = new HashSet<>();
        boolean[] remove = new boolean[s.length()];
        dfsSolve(ans, s, remove, 0, leftRemove, rightRemove, 0, 0);
        return new ArrayList<>(ans);
    }

    // leftRemove/rightRemove: parentheses still to be removed.
    // leftCount/rightCount: parentheses kept so far.
    private void dfsSolve(Set<String> ans, String s, boolean[] remove, int currIdx,
                          int leftRemove, int rightRemove, int leftCount, int rightCount) {
        if (currIdx == s.length()) {
            if (leftRemove == 0 && rightRemove == 0) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < s.length(); i++) {
                    if (!remove[i]) {
                        sb.append(s.charAt(i));
                    }
                }
                ans.add(sb.toString());
            }
            return;
        }
        if (s.charAt(currIdx) == '(') {
            // Option 1: remove this '(' if we still have removals left.
            if (leftRemove > 0) {
                remove[currIdx] = true;
                dfsSolve(ans, s, remove, currIdx + 1, leftRemove - 1, rightRemove, leftCount, rightCount);
                remove[currIdx] = false;
            }
            // Option 2: keep this '('.
            dfsSolve(ans, s, remove, currIdx + 1, leftRemove, rightRemove, leftCount + 1, rightCount);
        } else if (s.charAt(currIdx) == ')') {
            // Option 1: remove this ')' if we still have removals left.
            if (rightRemove > 0) {
                remove[currIdx] = true;
                dfsSolve(ans, s, remove, currIdx + 1, leftRemove, rightRemove - 1, leftCount, rightCount);
                remove[currIdx] = false;
            }
            // Option 2: keep this ')' only if a kept '(' is available to match it.
            if (leftCount > rightCount) {
                dfsSolve(ans, s, remove, currIdx + 1, leftRemove, rightRemove, leftCount, rightCount + 1);
            }
        } else {
            // Non-parenthesis characters are always kept.
            dfsSolve(ans, s, remove, currIdx + 1, leftRemove, rightRemove, leftCount, rightCount);
        }
    }
}