given two sequences you want to find the longest common subsequence. formally, if you have sequences and , you want to find indices and to satisfy:
- for all , and
- is maximized
note that subsequences of a string do not have to be contiguous: they are not substrings.
this is a canonical dynamic programming problem, where you can iteratively build a solution from solutions to sub-problems, and these sub-problems overlap. here, the table you build is the length of the longest common subsequence of a prefix of and . so we populate our table:
these correspond to the recursive cases:
- last character doesnt match: recursively find LCS with either last of or last of chopped off
- last character matches: include this in a candidate LCS, and recursively find LCS with last of and chopped off
this can be thought of as taking steps in the “edit graph” between two strings.