$29
The class random_generator has been updated (random_generator.h and ran-
dom_generator.cc) by a member function which generates random strings of a fixed
length using the a given number of characters from the alphabet, starting with “a”.
• char* random.string.m(int n, int no_ch)
The function allocates n + 1 characters. The first n characters (0 …n — 1) are chosen at random using the first no_ch characters from the alphabet starting with “a” (e.g., for no_ch = 4 the characters are randomly chosen out of {“a”,”b”,”c”,”d”}). The n-th character is set to 0 in order to mark the end of the string.
Dynamic programming
The dynamic programming Smith-Waterman algorithm is matching sequences recursively defined as follows, given X = XI,… , xn (along table rows) and Y = ye, … , y,n (along table columns)
M(i, 0) = 0, for all 0 < i < n M(0,j) =0, for all 0 < j <m
M(i — 1, j — 1) + 2 if xi = y;
M(i — 1, j — 1) — 1 if x, 0 yi
M(i’ -1) — max M(i — 1, j) — 1 if “-” is inserted into Y M(i, j — 1) — 1 if “-” is inserted into X
The function M(i,j) defines a so called matching score for the partial sequences X, and it,. If in the recursive definition of M the maximum value is due to the third or fourth line, you have to insert the character “-” into either X or Y in order to reconstruct the matching sequences X’ and Y’. Similar to the LCS problem we need only need a table to store the M(i, j) values, but an additional table that allows us to later generate X’ and Y’ from X and Y.
Example 1
Example 2
Example 3
x
x’
y,
v
abababda
cacacccbab
cdbaabbdca
a-bababda
ca-cacccbab
c-dba–abbdca
acbabab-a
cadaadcc—
cadcacca-bd–
acbababa
bccadaadcc
cadcaccabd
M(n,m)
12
4
5
Example 4
Example 5
x
X : r
Y
caacbdacca
dcacccbbba
caacb-dacc-a
dcacccb-bba
c–cbcd-ccba
dca–badba
bccbcdccba
aadcabadba
M(n,m)
7
1. Implement the bottom-up version of the Smith-Waterman algorithm given by the recursive definition of the function M (as seen on the slides).
2. Implement the top-down with memoization version of the Smith-Waterman algorithm given by the recursive definition of the function
Notes:
• How do you initialize the necessary tables given the definition of Keep in mind that you have to able to determine whether or not you already computed a table value (memoization).
• Values could be negative, but is there a limit for how small they can get?
3. Implement the function void PRINT-SEQ-ALIGN-X ( …) that takes a number of parameters and then recursively prints the matching sequence that is derived from Implement a separate function void PRINT-SEQ-ALIGN-Y( . ) that recursively prints the matching sequence that is derived from Y.
4. Find the maximum alignment for X = dcdcbacbbb and Y = acdccabdbb by using the Smith-Waterman algorithm (see slides). Execute the pseudocode algorithm and fill the necessary tables H and P in a bottom-up fassion. Reconstruct the strings X’ and Y’ using the tables H and
(7+20+8+15 = 50 points)
Exercise 15.1-2 Show, by means of a counterexample, that the following “greedy” strategy does not always determine an optimal way to cut rods. Define the density of a rod of length i to be that is, its value per inch. The greedy strategy for a rod of length 71 cuts off a first piece of length i, where 1 < i < n, having maximum density. It then continues by applying the greedy strategy to the remaining piece of length n
(7 points)
Exercise 15.1-5 The Fibonacci numbers are defines by recurrence (3.22). Give an 0(n) time dynamic-programming algorithm to compute the n-th Fibonacci number. Draw the subproblem graph. How many vertices and edges are in the graph?
(8 points)
Exercise 15.4-1 Determine an LCS of (1, 0, 0, 1, 0, 1, 0,1) and
(0, 1,0,1,1,0,1,1,0).