$24
Purpose:
The purpose of this lab is to implement a word searching application using optimal BST.
General Requirements:
Suppose that we are designing an application to translate text from English to Chinese. For each occurrence of each English word in the text, we need to look up its Chinese equivalent. One way to perform these lookup operations is to build an optimal BST with English words as keys. The file of words you read from will be data.txt. You may hard code the file name if you wish.
Optimal BST:
Assume we are going to translate “to be or not to be”. The keys will be sorted in ascending order based on its ASCII values. The probability of each key is calculated by its frequency in the sentence.
ASCII(be) < ASCII(or) < ASCII(to) < ASCII(not)
P(be) = 0.333
P(or) = 0.167
P(to) = 0.333
P(not) = 0.167
Thus, given {be, or, to, not} with p1 = 0.333, p2 = 0.167, p3 = 0.333, p4 = 0.167.
Pseudocode:
for I = 1 to n do
Ci,i = Pi;
ti,I = i;
endfor;
//initialization
for m=1 to n-1 do //compare Ci,j in increasing m
for i=1 to n-m do
j=i+m;
sum=0; //computing sum(Pi,…,Pj)
for l = i to j do
sum=sum+Pl;
endfor;
Ci,j=min{Ci,k-1 + Ck+1,j} + sum
ti,j=k;
endfor;
endfor;
Expected Results:
data.txt: to be or not to be
Here is the optimal BST:
to
/ \
be not
\
or
The minimal cost is: 1.833.
Submission:
Follow the conventions below to facilitate grading:
Source Code
Place all your source files (*.cpp, *.hpp) and input files in a single folder with no subfolders.
Name your folder using the convention Lastname_LabX (e.g., Smith_Lab06).
Include a functioning Makefile inside the folder. (The makefile should also include the clean command.)
Verify that your code runs on lab Linux machines before submission.
Compressed File
Compress using .zip, .rar, or .tar.gz.
Name your file using the convention Lastname_LabX (e.g., Smith_Lab06.zip).
Email
Use the following subject for your email: Lastname_LabX (e.g., Smith_Lab06).
Send you code to l290w868@ku.edu if you are in one of Lei’s sections or to dhwanipandya1401@ku.edu if you are in one of Dhwani’s sections.
Anytime you have a question about the lab, put the word question in the subject of the email.