Close

Mohit Athwani

MA
Leetcode – 10. Regular Expression Matching (Dynamic Programming)

This article is in continuation to my previous article solving the same problem with recursion.

The fact that this problem could be solved with recursion made me think about the optimal substructure of this problem. As it turns out, this problem has one and can be solved using dynamic programming.

To solve using dynamic programming, we will construct a table of dimension NxM, where N is the length of the string and M is the length of the pattern.

So, given the string abc and patter a•c, we construct a table:

a c
a [0,0] [0,1] [0,2]
b [1,0] [1,1] [1,2]
c [2,0] [2,1] [2,2]

With the table above, we are missing one very important aspect of the problem. The base case!

To accommodate for the base case which is explained in my previous post, we need a table of size (N+1)x(M+1)

a c
T F F F
a F [1,1] [1,2] [1,3]
b F [2,1] [2,2] [2,3]
c F [3,1] [3,2] [3,3]

In the above table, only [0,0] is True and all other values in the corresponding row and column are False satisfying the base case.

Solving for [1,1], we see that s[i] == p[j], so T[i][j] will be True iff the previous state of the problem, i.e empty string satisfies empty pattern T[i - 1][j - 1] is True.

Solving for [1,2], we see that p[j] == '•', so T[i][j] will be True iff the previous state of the problem, i.e empty string satisfies pattern a• is True.

Solving for [1,3], s[i] != p[j] hence T[i][j] = False.

a c
T F F F
a F T F F
b F [2,1] [2,2] [2,3]
c F [3,1] [3,2] [3,3]

Solving for [2,1] is similar to solving for [1,3]

Solving for[2,2], we see that p[j] == '•', so T[i][j] will be True iff the previous state of the problem, i.e. string a satisfies pattern a i.e. T[i - 1][j - 1].

Moving on, our final table looks something like:

a c
T F F F
a F T F F
b F F T F
c F F F T

We see a pattern arising here,

T[i][j] = True when (s[i] == p[j] || p[j] == '•')

Let’s make things a little more complicated. Let’s try to solve the pattern a*b with the string aaab. Like before, the initial state of the table would look like this:

a * b
T F F F
a F [1,1] [1,2] [1,3]
a F [2,1] [2,2] [2,3]
a F [3,1] [3,2] [3,3]
b F [4,1] [4,2] [4,3]

If you think about it, the cell in [0,2] should be True since an empty string can technically satisfy the pattern a*, so this is another special condition that needs to be accounted for when solving this problem using dynamic programming. Another example for this special case would be a pattern like a*b*c*.

Let’s look another example where the string is abbc and pattern is a•*c. The matrix for this example would look like:

a * c
T F F F F
a F [1,1] [1,2] [1,3] [1,4]
b F [2,1] [2,2] [2,3] [2,4]
b F [3,1] [3,2] [3,3] [3,4]
c F [4,1] [4,2] [4,3] [4,4]

Solving for [1,1] is easy and can be taken from the recurrence relation derived above.

Solving for [1,2] is also easy and can be derived from the recurrence relation above.

Solving for [1,3] is a bit more complicated since p[j] == '*'. There are two ways the pattern a•* can be satisfied:

  1. If the string from “`0. . .i“` matches the pattern from “`0. . .j-2“`, i.e considering 0 occurrences of the character before the “`*“`
  2. If the previous state of the string satisfies the current pattern and “`s[i] == p[j-1] || p[j – 1] == “•”“`

In the case of [1,3], we check to see if the string a satisfies the pattern a, in this case it is True.

Solving for [1,4] is pretty straight forward.

At the end of this iteration, our matrix looks like:

a * c
T F F F F
a F T F T F
b F [2,1] [2,2] [2,3] [2,4]
b F [3,1] [3,2] [3,3] [3,4]
c F [4,1] [4,2] [4,3] [4,4]

Solving for [2,1] is similar to what we’ve seen before. The string ab cannot match the pattern a.

Solving for [2,2], the first recurrence relation holds, i.e ab can match a• iff a matches a.

[2,3] is where things get a little complicated but we should be able to solve it with the help of the 2 conditions listed above.

Does the string ab match the pattern a? Condition 1 is False.

Is string[i] == pattern[j - 1] || pattern[j - 1] == "•" ? True. So does the string a satisfy the pattern a•*? True. Therefore, [2,3] is True.

Solving for [4,3] you’ll realize that using the 2 conditions above, we get the value True, but in reality it should be False. I guess this is a flaw of the algorithm. So moving on, solving for [4,4] is pretty straight forward as well.

At this point, our table should look like:

a * c
T F F F F
a F T F T F
b F F T T F
b F F F T F
c F F F T T

To account for the case where the pattern is a* or a*b*c* and similar, we have to do some preparation before we start generating actual table values. If you think about it, an empty can string can satisfy the patterns above, so in the 0th row of the matrix we should do if p[j] == '*' { T[0][j] = T[0][j-2] } and this will set the value to True at all the columns where there is a * and if the empty string can satisfy the pattern.

Space and Time Complexity:

Other than the space and time complexity taken up by the Swift String API to generate substrings, the time and space and complexity of this algorithm is O(NM)

The problem is taken from Leetcode

Solution hosted on Github

Leave a comment

Facebook plugin is not configured. Please add a website for the facebook comments and a number of comments from the Dashboard > Suarez > Additional Options > Facebook.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: