CS590 homework 4 – Dynamic Programming, Greedy Algorithms

Starting from:

$35

Home

The class random_generator has been updated (random_generator.h and ran-

dom_generator.cc) by a member function which generates random strings of a fixed

length using the a given number of characters from the alphabet, starting with “a”.

• char* random.string.m(int n, int no_ch)

The function allocates n + 1 characters. The first n characters (0 …n — 1) are chosen at random using the first no_ch characters from the alphabet starting with “a” (e.g., for no_ch = 4 the characters are randomly chosen out of {“a”,”b”,”c”,”d”}). The n-th character is set to 0 in order to mark the end of the string.

Dynamic programming

The dynamic programming Smith-Waterman algorithm is matching sequences recursively defined as follows, given X = XI,… , xn (along table rows) and Y = ye, … , y,n (along table columns)

M(i, 0) = 0, for all 0 < i < n M(0,j) =0, for all 0 < j <m

M(i — 1, j — 1) + 2 if xi = y;

M(i — 1, j — 1) — 1 if x, 0 yi

M(i’ -1) — max M(i — 1, j) — 1 if “-” is inserted into Y M(i, j — 1) — 1 if “-” is inserted into X

The function M(i,j) defines a so called matching score for the partial sequences X, and it,. If in the recursive definition of M the maximum value is due to the third or fourth line, you have to insert the character “-” into either X or Y in order to reconstruct the matching sequences X’ and Y’. Similar to the LCS problem we need only need a table to store the M(i, j) values, but an additional table that allows us to later generate X’ and Y’ from X and Y.

Example 1

Example 2

Example 3

x’

abababda

cacacccbab

cdbaabbdca

a-bababda

ca-cacccbab

c-dba–abbdca

acbabab-a

cadaadcc—

cadcacca-bd–

acbababa

bccadaadcc

cadcaccabd

M(n,m)

Example 4

Example 5

X : r

caacbdacca

dcacccbbba

caacb-dacc-a

dcacccb-bba

c–cbcd-ccba

dca–badba

bccbcdccba

aadcabadba

M(n,m)

1. Implement the bottom-up version of the Smith-Waterman algorithm given by the recursive definition of the function M (as seen on the slides).

2. Implement the top-down with memoization version of the Smith-Waterman algorithm given by the recursive definition of the function

Notes:

• How do you initialize the necessary tables given the definition of Keep in mind that you have to able to determine whether or not you already computed a table value (memoization).

• Values could be negative, but is there a limit for how small they can get?

3. Implement the function void PRINT-SEQ-ALIGN-X ( …) that takes a number of parameters and then recursively prints the matching sequence that is derived from Implement a separate function void PRINT-SEQ-ALIGN-Y( . ) that recursively prints the matching sequence that is derived from Y.

4. Find the maximum alignment for X = dcdcbacbbb and Y = acdccabdbb by using the Smith-Waterman algorithm (see slides). Execute the pseudocode algorithm and fill the necessary tables H and P in a bottom-up fassion. Reconstruct the strings X’ and Y’ using the tables H and

(7+20+8+15 = 50 points)

Exercise 15.1-2 Show, by means of a counterexample, that the following “greedy” strategy does not always determine an optimal way to cut rods. Define the density of a rod of length i to be that is, its value per inch. The greedy strategy for a rod of length 71 cuts off a first piece of length i, where 1 < i < n, having maximum density. It then continues by applying the greedy strategy to the remaining piece of length n

(7 points)

Exercise 15.1-5 The Fibonacci numbers are defines by recurrence (3.22). Give an 0(n) time dynamic-programming algorithm to compute the n-th Fibonacci number. Draw the subproblem graph. How many vertices and edges are in the graph?

(8 points)

Exercise 15.4-1 Determine an LCS of (1, 0, 0, 1, 0, 1, 0,1) and

(0, 1,0,1,1,0,1,1,0).