
Lab Assignment 4

    1. You are allowed to form a group of two to do this lab assignment.

    2. You are strongly encouraged to bring your own laptop to the lab, with Anaconda1 and PyCharm2 installed. You do not have to attend the lab session if reading this assignment is enough for you to know what you are required to do.

    3. Only Python 3.x is acceptable. You need to specify your python version as the first line in your script. For example, if your scripts are required to run in Python 3.6, the following line should appear in the first line of your scripts:

#python_version == '3.6'

    4. For those of you using the Windows PCs in SHB 924A (NOT recommended) with your CSDOMAIN account3, please log in and open “Computer” on the desktop to check whether an “S:” drive is there. If not, click “Map network drive”, use “S:” for the drive letter, fill in the path \\ntsvr1\userapps and click “Finish”. Then open the “S:” drive, open the Python3 folder, and click the “IDLE (Python 3.7 64-bit)” shortcut to start doing the lab exercises. You will also receive a paper document; if anything has changed, the paper document takes precedence.
    5. Your code should only contain specified functions. Please delete all the debug statements (e.g. print) before submission.

Exercise 1 (20 marks)

Let a be the list of values produced by range(1, 11).

    1. Using the function map with a lambda argument, write an expression that will produce each of the following:

        ◦ A list of the square roots, rounded down, of the corresponding values in the original list. The expected output is the following list: (5 marks)
[1, 1, 1, 2, 2, 2, 2, 2, 3, 3]

        ◦ A list where each element is larger by one than the corresponding element in the original list. The expected output is the following list: (5 marks)
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

1An open data science platform powered by Python. https://www.continuum.io/downloads

2

3A non-CSE student should ask the TA for a CSDOMAIN account.

CSCI 2040 A/B    Lab Assignment 4    Page 2

Note that you should use lambda arguments in this part.
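As a minimal sketch of the map-with-lambda pattern, the following triples each value of a. This is a different task from the two graded expressions, and the name tripled is only illustrative. It assumes a is built with list(range(1, 11)). In Python 3, map returns a lazy iterator, so list() is needed to obtain a list.

```python
a = list(range(1, 11))

# map applies the lambda to every element of a;
# list() turns the resulting iterator into a list.
tripled = list(map(lambda v: 3 * v, a))
print(tripled)  # [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]
```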

    2. Write a list comprehension that will produce each of the following:

        ◦ A list containing the values in the original list that are less than or equal to 7. The expected output is the following list: (5 marks)
[1, 2, 3, 4, 5, 6, 7]

        ◦ A list containing the squares of the odd values in the original list. The expected output is the following list: (5 marks)
[1, 9, 25, 49, 81]

Note that in this exercise, you are required to write only one line of code for each expression.
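For reference, a one-line list comprehension can both filter and transform. The sketch below uses a different condition from the graded ones: it keeps the multiples of 3 and negates them. The name negated_multiples is only illustrative.

```python
a = list(range(1, 11))

# "if v % 3 == 0" filters the elements; "-v" transforms each element kept.
negated_multiples = [-v for v in a if v % 3 == 0]
print(negated_multiples)  # [-3, -6, -9]
```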

We will manually check the correctness of your answer.

Save your script for this exercise in p1.py

Exercise 2 (20 marks)

x is a list of strings. For example,

x = ['python is cool',
     'pythom is a large heavy-bodied snake',
     'The python course is worse taking',
     'python python python python']

In this example, with n = 20 and m = 4, there is 1 low-frequency occurrence of the word 'python' (case sensitive) in total: only the strings in x whose length is more than 20 and in which 'python' appears fewer than 4 times are counted.

To count the low-frequency occurrences of a given word str in the strings that are long enough, use filter and reduce to write a function low_freq_word_count(x, str, n, m) in the functional programming paradigm.

This function takes as inputs a list of strings x, the string to find str, and two numbers n and m. The returned value is the total number of occurrences of str over the strings in x whose length is more than n (> n) and in which str appears fewer than m times (< m). The function should be in the following format:

def low_freq_word_count(x, str, n, m):
    # your code here

Hint:

    • You could use filter to drop the strings that are not long enough or in which the word appears too often, and use reduce to do the summation.

    • str.count(sub) returns the number of occurrences of substring sub in string str.

Note that you are required to use filter and reduce in your code. Not using them will result in a deduction from your grade for this exercise.

Save your script for this exercise in p2.py
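The filter-then-reduce pattern can be sketched on a different, smaller task: counting the letter 'a' in the words that are longer than 4 characters. The names words, long_words and total_a are illustrative assumptions and not part of the required function. Note that in Python 3, reduce must be imported from functools.

```python
from functools import reduce

words = ['apple', 'fig', 'banana', 'kiwi', 'cherry']

# filter keeps only the words that satisfy the predicate (length > 4).
long_words = filter(lambda w: len(w) > 4, words)

# reduce folds the remaining words into one number, starting from 0.
total_a = reduce(lambda acc, w: acc + w.count('a'), long_words, 0)
print(total_a)  # 4  (1 in 'apple', 3 in 'banana', 0 in 'cherry')
```

Passing the initial value 0 to reduce keeps the result well defined even when filter leaves no elements.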

Exercise 3 (40 marks)

Visualization is widely used for data analysis. In this exercise, you will use Python scripts to draw 5 kinds of figures. All the scripts for this exercise should be in a single file, p3.py. The matplotlib package is useful for this exercise; it can be imported as follows4:

import matplotlib.pyplot as plt

Note: your grade for this exercise depends on the readability of your figures. All the plots should be clear to read, and all the plots must contain x-label and y-label.

Histogram (10 marks)

Histogram5 is a way to visualize the distribution of continuous variables.

    • Write a script in p3.py to plot a histogram of the random numbers in random_numbers, generated by the script below. The x-axis is the value of the random numbers, and the y-axis is the probability density. Hint: You could use hist() in matplotlib.
    • Save your figure as histogram.png. Hint: You could use savefig() in matplotlib.

import numpy as np

random_numbers = np.random.normal(0, 1, 1000)

Pie chart (10 marks)

Pie chart6 is a way to visualize the distribution of categorical data.

    • Write a script in p3.py to plot a pie chart of the number of students in 3 selected colleges in CUHK in 2019. You can select any 3 colleges in CUHK. The data for the number of students can be found at https://www.iso.cuhk.edu.hk/images/publication/facts-and-figures/2019/html5/english/10/. The categories in the pie chart are the colleges in CUHK, and there should be label text for each category on the figure. Hint: You may use pie() in matplotlib.

    • Save your figure as pie.png. Hint: You could use savefig() in matplotlib.


4https://matplotlib.org/

5https://en.wikipedia.org/wiki/Histogram
6https://en.wikipedia.org/wiki/Pie_chart




Bar chart (10 marks)

Bar chart is also suitable to visualize categorical variables.

    • Write a script in p3.py to plot a bar chart of the number of students in all 9 colleges in CUHK in 2019. There should be label text for each college on the figure. Hint: You could use bar() or barh() in matplotlib.

    • Save your figure as bar.png. Hint: You could use savefig() in matplotlib.

Scatter plot and line chart (10 marks)

Scatter plot7 and line chart8 are common ways to visualize the relationship between two continuous variables. Suppose we have two lists of numbers x_list and y_list generated by the following scripts.

import numpy as np

x_list = np.linspace(0, 1, 100)

y_list = x_list + np.random.rand(100)

    • Write a script in p3.py to draw a scatter plot of x_list (x-axis) and y_list (y-axis). Draw a line chart for the function y = x in the same figure. Hint: You may use plot() and scatter() in matplotlib, and you could set the alpha blending value to make the dots more transparent.

    • You are encouraged to try the marker='*' and color='red' options in scatter() and the linestyle='dashed' option in plot().

    • Save your figure as scatter_line.png. Hint: You could use savefig() in matplotlib.

Exercise 4 (20 marks)

Mastering a programming language is not only about the syntax, but also requires one to know the programming style. In this exercise, you will get a sense of the Pythonic way of programming. In a nutshell, a Pythonic way of programming is to utilize Python’s features that are designed to make a programmer’s life easier. Here are some examples:

    1. Creating list of lists (using list comprehension).

Suppose you want a 2-dimensional array that is a list of 4 empty lists. Since Python has no declaration for a 2-dimensional array, you need to construct it from lists. The wrong way is to append the same list 4 times (why it is wrong9): all 4 entries then refer to one list object, so changing one of them changes them all.



7https://en.wikipedia.org/wiki/Scatter_plot

8https://en.wikipedia.org/wiki/Line_chart
9http://cryptroix.com/2016/10/25/python-call-object/





# wrong code
list = []
list_of_lists = []
for i in range(4):
    list_of_lists.append(list)

The ugly code runs an explicit for-loop.

# correct but "ugly" code
list_of_lists = []
for i in range(4):
    list_of_lists.append([])

The Pythonic code has only one line that utilizes list comprehension.

# Pythonic code
list_of_lists = [[] for _ in range(4)]
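The difference between sharing one list and creating four lists can be checked directly. In this small sketch, the names shared, aliased and independent are illustrative only.

```python
shared = []
aliased = [shared for _ in range(4)]   # four references to ONE list object
aliased[0].append(1)
print(aliased)       # [[1], [1], [1], [1]]  (every "row" changed)

independent = [[] for _ in range(4)]   # four distinct list objects
independent[0].append(1)
print(independent)   # [[1], [], [], []]  (only the first row changed)
```

The expression [] is evaluated once per iteration in the second comprehension, which is why each row is a separate object.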

    2. Opening and reading a file

Suppose you need to process the contents of a file, line by line. The following is the ugly code; with it, you may forget to read a new line in the while-loop or forget to close the file.

# "ugly" code
file = open('some_file_name')
line = file.readline()
while line:
    # do something with the line
    line = file.readline()  # you may forget this
file.close()  # you may forget this

In a Pythonic way, we use with, which automatically closes the file after use, and we run a for-loop directly over the file.

# Pythonic code
with open('some_file_name') as file:
    for line in file:
        # do something with the line
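A self-contained sketch of the with pattern follows. It first writes a throwaway file with the standard tempfile module, an assumption made only so that the example runs anywhere. It then reads the file line by line and confirms that the file is closed once the with block ends.

```python
import os
import tempfile

# Create a small temporary file so that the example does not depend on
# any existing file (the file name is generated, not part of the lab).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\nthird\n')
    path = tmp.name

# Iterate directly over the file object; each item is one line.
with open(path) as file:
    lines = [line.rstrip('\n') for line in file]

closed_after_with = file.closed  # True: the with block closed the file
os.remove(path)                  # clean up the temporary file
print(lines, closed_after_with)  # ['first', 'second', 'third'] True
```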

    3. Chained comparison

# "ugly" code
if 0 <= x and x <= 100:
    x = x + 1

# Pythonic code
if 0 <= x <= 100:
    x += 1
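The two forms give the same result; the chained form simply writes the middle operand once. The following sketch checks this on a few boundary values. The helper names in_range_explicit and in_range_chained are hypothetical, and the explicit form appears only for comparison.

```python
def in_range_explicit(x):
    # the longer form, shown only for comparison
    return 0 <= x and x <= 100

def in_range_chained(x):
    # chained comparison: reads like the mathematical 0 <= x <= 100
    return 0 <= x <= 100

samples = [-1, 0, 50, 100, 101]
results = [in_range_chained(v) for v in samples]
agree = all(in_range_explicit(v) == in_range_chained(v) for v in samples)
print(results, agree)  # [False, True, True, True, False] True
```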

    4. Conditional operator

# "ugly" code
if 0 <= x and x <= 100:
    y = x + 1
else:
    y = x - 1

# Pythonic code
y = x+1 if 0 <= x <= 100 else x-1

    5. Multiple assignment

# "ugly" code
x = 1
y = 2

# Pythonic code
x, y = 1, 2
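Multiple assignment is also handy beyond initialization. As a brief sketch, it can swap two variables without a temporary one and can unpack a function that returns a pair. The built-in divmod is used here purely as an illustration.

```python
x, y = 1, 2
x, y = y, x                         # swap: the right-hand side is evaluated first
quotient, remainder = divmod(17, 5) # unpack the (quotient, remainder) pair
print(x, y, quotient, remainder)    # 2 1 3 2
```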

More examples can be found in many online posts by searching “Pythonic”10.
In this exercise, you need to write a function named get_average_grades in p4.py that takes the name of the grading file as input and returns a list of the average grades for each lab assignment of the Python course. A prototype of your function can be

def get_average_grades(filename='grades.csv'):
    # your code here
    return average_grades_list

By default, the grades are recorded in an input file named grades.csv, in the same folder as your scripts. Each line in this file records the grades of one student in the past lab assignments, separated by commas (this is called a “CSV” file). For example, with 3 students and 4 lab assignments, grades.csv has the following contents:

60,61,62.5,-1

-1,70,75,73

80,-1,87.5,-1





10https://medium.com/the-andela-way/idiomatic-python-coding-the-smart-way-cc560fa5f1d6




Here, if a student does not submit a lab assignment, the grade is recorded as -1. For example, the student in the first row has grades 60, 61 and 62.5 for the first three lab assignments respectively, and the “-1” indicates that this student did not submit the fourth lab assignment.

The average grade for a lab assignment is the average over all the students who submitted that lab assignment. For the above example, the average grades for lab assignments 1, 2, 3 and 4 are 70, 65.5, 75 and 73. The output is a list of the average grades for each lab assignment (each number should be a float, which will be compared against the sample answer).

The return value of get_average_grades for the above example should be a Python list:

[70, 65.5, 75, 73]
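The averaging rule can be checked by hand on the sample data. The sketch below is only an illustration of the rule, not the required get_average_grades: it reads the three sample rows from an in-memory string instead of grades.csv, and it assumes every lab assignment has at least one submission. The names sample, rows, columns and averages are hypothetical.

```python
import csv
import io

# The sample contents from the text, held in memory instead of a file.
sample = "60,61,62.5,-1\n-1,70,75,73\n80,-1,87.5,-1\n"

# One list of floats per student (row).
rows = [[float(cell) for cell in row]
        for row in csv.reader(io.StringIO(sample))]

# zip(*rows) regroups the grades by lab assignment (column);
# grades equal to -1 mark a missing submission and are skipped.
columns = [[g for g in column if g != -1] for column in zip(*rows)]
averages = [sum(col) / len(col) for col in columns]
print(averages)  # [70.0, 65.5, 75.0, 73.0]
```

For instance, lab assignment 2 has submitted grades 61 and 70, so its average is (61 + 70) / 2 = 65.5.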

Your scripts should not contain any of the 5 kinds of “ugly” code mentioned above. 4 marks will be deducted for each kind of “ugly” code found in your scripts. Your scripts can be in any style that avoids the “ugly” code above; you are NOT required to use the Pythonic code.

Submission rules

    1. Please name the functions and script files with the exact names specified in this assignment and test all your scripts. Any script that has a wrong name or a syntax error will not be marked.

    2. For each group, please pack all your script files as a single archive named as

<student-id1>_<student-id2>_lab4.zip

For example, 1155012345_1155054321_lab4.zip, i.e., just replace <student-id1> and <student-id2> with your own student IDs. If you are doing the assignment alone, just leave out <student-id2>, e.g., 1155012345_lab4.zip.

    3. Upload the zip file to Blackboard (https://blackboard.cuhk.edu.hk):

        ◦ Only one member of each group needs to upload the archive file.

        ◦ Subject of your file should be <student-id1>_<student-id2>_lab4 if you are in a two-person group or <student-id1>_lab4 if not.

        ◦ No later than 23:59 on Friday, Nov. 27, 2020

    4. Students in the same group will get the same marks. Marks will be deducted if you do not follow the submission rules. Any student or group caught plagiarizing will get a score of 0!





