$24
It might happen that you will nd solutions, either partial or complete, on the web, on sites such as StackOver ow. Feel free to use these websites, but
Remember that we know that these websites exist.
We have tools to compare your code and detect when it looks suspiciously identical to some-one else’s.
If you copy-paste code you don’t understand, you’re missing the point of this homework and the reason why it was given.
Deliverables
The main deliverable of this homework is a zip le containing three les: ex1.py, ex2.py and ex3.py.
The les must be stored in a directory called LastNameFirstName. For example, if your name is
Donald Duck, you’ll submit a zip le that contains:
DuckDonald/
ex1.py
ex2.py
ex3.py
The zip le name must be called CS4341_HW1_LastnameFirstname.zip, where Lastname and Firstname are your last name and rst name. For example: CS4341_HW1_DuckDonald.zip.
If you don’t comply with these guidelines, each mistake will drop 10% of the points you earned. Example mistakes: le is not a zip, but a rar or a tar.gz; once unzipped, there is no directory with your name; there are missing/extra les in the directory; the les are the right ones, but the names are wrong.
About the code itself:
The names of the functions in the code skeletons provided below are mandatory — these functions must be implemented and named as indicated. You can write extra functions, if you think it’s a good idea.
The les you decide to submit contain only the requested code and nothing else. For example, remove any print() statement you added for your own debugging and remove any testing code you might have.
Be sure to comment your code thoroughly and add docstrings in your de nitions of extra functions and classes.
All the code must be written with a resonably recent version of Python 3, which is Python 3.7.2 currently. Any version after 3.6 would be OK.
Again, each mistake in code style will cost you 10% of your points.
The reason why we want you follow these guidelines strictly is that, if you do, we’ll be able to make Exercise 1: Loops and Arrays (30 points)
Connect Four is a turn-based two-player game in which you win if you can string four of your tokens in a horizontal, vertical, or diagonal sequence. Complete this Python class which, given a board con guration, calculates the outcome of a board con guration. Fill in this template:
class C o n n e c t F o u r :
def __init__ ( self , board , w , h ):
""" Class c o n s t r u c t o r """
# Board data
self . board = board
# Board width
self . w = w
# Board height
self . h = h
def isLineAt ( self , x , y , dx , dy ):
""" Return True if a line of id en ti cal tokens exists starting at (x , y ) in d ir ec tio n ( dx , dy ) """
# Your code here
def i s A n y L i n e A t ( self , x , y ):
Return True if a line of id en ti cal symbols exists starting at (x , y ) in any di re ct io n """
return ( self . isLineAt (x ,
y ,
1 ,
0)
or
# H o r i z o n t a l
self . isLineAt (x ,
y ,
0 ,
1)
or
#
Vertical
self . isLineAt (x ,
y ,
1 ,
1)
or
#
Diagonal
up
self . isLineAt (x ,
y ,
-1 ,
-1))
#
Diagonal
down
def g e t O u t c o m e ( self ):
""" Returns the winner of
the
game :
1
for Player
1 , 2 for Player 2 , and
for no winner """
Your code here , use i s A n y L i n e A t ()
def p r i n t O u t c o m e ( self ):
""" Prints the winner of the game """
= self . g e t O u t c o m e () if o == 0:
print ("No winner ") else :
print (" Player ", o, " won ")
Example output:
W = 7
H = 6
BOARD1 = [
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0]
]
c = ConnectFour (BOARD1 , W, H)
c . p r i n t O u t c o m e ()
# No winner
BOARD2 = [
[1 ,1 ,1 ,1 ,2 ,1 ,0] ,
[0 ,2 ,2 ,2 ,0 ,0 ,0] ,
[0 ,0 ,2 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0] ,
[0 ,0 ,0 ,0 ,0 ,0 ,0]
]
= C o n n e c t F o u r ( BOARD2 , W , H )
. p r i n t O u t c o m e ()
# Player 1 won
Note that the board is stored upside down with respect to what is normally seen in the original game.
Exercise 2: Recursion (40 points)
Write a recursive function that checks whether two strings passed as input are one the reverse of the other. Fill in this template.
def i sr ev er se ( s1 , s2 ):
# Your code here
Example output:
print ("<empty ,
<empty
- ", isreverse ("", ""))
#
True
print ("a, a
- ",
isreverse ("a", "a"))
#
True
print ("ab , ba - ",
isreverse ("ab",
"ba"))
#
True
print ("abc ,
cba
-
",
isreverse ("abc",
"cba "))
#
True
print ("abcd ,
cba
-
",
isreverse (" abcd ", "cba "))
#
False
Exercise 3: Files and Data Structures (30 points)
This exercise asks you to write a program that computes the Jaccard index of two text les. This coe cient is used in Natural Language Processing as a crude estimate of the similarity of two text les.
To calculate the Jaccard index, also known as “intersection over union”, you rst parse each le and produce a set of tokens for each of them (words1 and words2). Note: it is a set, not a list, because we do not care about duplicates within a speci c le.
Then, you calculate the index with
jintersection(words1,words2)j
junion(words1,words2)j
(1)
where intersection(words1,words2) is a set containing the words in common between the two texts, and union(words1,words2) is the union of the two sets. Finally, the j j operator indicates the number of elements in a set (i.e., len() in Python).
To test your code, you’ll need some text dump. Project Gutenberg (https://www.gutenberg.org/)
has thousands of books in various formats, including plain text. For example:
Alice in Wonderland
Through the looking glass
Fill in this template:
def wordset ( fname ):
""" Returns the set of words c o r r e s p o n d i n g to the given file """
# Your code here
def jaccard ( fname1 , fname2 ):
""" Ca lcu la te Jaccard index """
# Your code here - call wordset ()
Example output:
FNAME1 = " a l i c e _ a s c i i . txt "
FNAME2 = " g l a s s _ a s c i i . txt "
print (" Jaccard index between ", FNAME1 , "and", FNAME2 , jaccard (FNAME1 , FNAME2 ))
# Jaccard index between a l i c e _ a s c i i . txt and g l a s s _ a s c i i . txt 0 . 1 5 2 8 8 1 4 5 7 1 5 1 3 8 8 4 8
Implementation Notes
Converting a text le from UTF8 to ASCII
The books in Project Gutenberg are encoded in UTF8, which is more cumbersome to parse than plain ASCII. Since the point of the exercise is not to make parsing di cult, but to review data structures and some Python syntax, focus your text parsing on ASCII text.
You can either take text that is already in ASCII format, or, under Linux and Mac, you can convert UTF8 to ASCII with this command typed in the terminal (the $ indicates the Bash command prompt, it’s not part of the command itself):
$ iconv -f UTF8 -t ASCII -c a l i c e _ u t f 8 . txt a l i c e _ a s c i i . txt
The command above assumes that the text you want to convert is stored in alice_utf8.txt. The option -c ignores any character in the UTF8 le that cannot be converted to ASCII. The conversion won’t be perfect, but it’s not a big deal.
Constructing the word list of an ASCII le
To construct the word list from an ASCII le, lter out anything that is not an alphabetic charac-ter. A fast and simple way to achieve this is to employ a regular expression, as explained in this StackOver ow answer. The functions str.lower() and str.split() will also be useful.