Homework 4 Networks, Crowds, and Markets Solution

Starting from:

~~$35~~

$29

Home

Part 1: Voting

1. In the American presidential elections, while the popular vote is used up to the state level, the electoral college decides the winner at the national level. Assuming there are only two candidates, is this system strategy-proof? Does it elect a Condorcet winner? Justify your answers. (Note: You can assume for simplicity that each state gets a single \vote" in a national election, and the state then runs an election by popular vote with however many people are in that state to determine which vote to cast at the national level.)

2. Suppose we want to run a popular vote election between two candidates A and B. There are 1,000,000 eligible voters and suppose 52% of them prefer A to B. Instead of running a full election where every person casts a vote, we poll m randomly selected people (with replacement for simplicity) and ask them to report their preferred candidate. The result of the poll is the majority of the responses received.

(a) Compute a value of m so that the result of the poll is incorrect with probability at most 1%? (Use the Cherno bound in the book, show your work.)

(b) Let n be the number of people in the population, be de ned such that (1=2 + ) n prefer A to B, and let be the desired accuracy (so the probability the result is incorrect is at most ). Write m as a function of n, , and .

If the number of people in the population increased by a factor of 10, how would that a ect m? If decrease by a factor of 2, how would that a ect m? If we want to increase our con dence by a factor of 10, how would that change m? If = 1=n (so 1 person would be the deciding vote), what would this imply about m given your bound from above?

(c) In practice, what might be wrong with the above assumptions (i.e. why might we not we use polls to run our elections)?

Part 2: Stable Matchings

3. For the following setting, nd (a) the male-optimal and (b) the female-optimal stable match-ing. For each part, simulate the Gale-Shapley algorithm (i.e. in words, clearly indicate what happens at each step) to show that it arrives at the matching you nd.

Females
Preferences

Males
Preferences

A
X>W>Y >Z

W
D>B>C>A

B
X>W>Y >Z

X
D>B>A>C

C
X>W>Z>Y

Y
C>B>D>A

D
Y >W>Z>X

Z
D>B>C>A

4. Prove that there exists a non-bipartite matching setting (where every individual has prefer-ences over all other individuals) for which no stable matching exists.

Part 3: Beliefs

5. There is a test for a certain disease that has a 15% false positive rate and a 25% false negative rate. (So, if someone has the disease, there is a 75% chance the test will return positive; if they do not, there is an 85% chance it will return negative.) If 1% of the population has this disease, what is the probability that someone who tests positive for the disease actually has it? (Use Bayes’ Rule; show your work.)

6. In the \foolishness of crowds" example from section 16.2 of the notes, let’s assume that people decide that their own evidence should instead be weighed c times as heavily as others’ prior guesses, where c is an integer greater than 1. Now what is the probability that a cascade occurs and everyone is incorrect (in terms of , where the probability that a player receives correct evidence is 1=2 + )?

7. Recall the \muddy children" example covered in class and in the notes. Section 17.2 of the notes gives a formal argument that Claim 17.1 (that, if there are m muddy children, they will answer \yes" on and not before round m) holds for the case where there are two total children.

(a) Now draw the knowledge network for the case where there are three total children.

(b) Using a similar argument to the case of two children, use your network and the formal \possible worlds" model of knowledge to formally show that Claim 17.1 holds for the case when there are three total children (and any non-zero number of muddy children).

Part 4: PageRank and Social Networks

The objective of this question is to use the PageRank algorithm as a way to determine how \in uential" a node is in a social network based on its in-links from in uential nodes. For this question, we provided a template in python called ‘hw4.py’. You must use the template to submit your code, as we will grade your code in a (partially) automated way. You should submit the hw4.py le, not a .pynb or other python le.

8. Design an algorithm that runs the iterative -scaled PageRank algorithm for a speci ed num-ber n of rounds on a given directed graph, with = 1=7. Run it (with n = 10) on the examples in gures 15.1 (both left and right) and 15.2 (the two disjoint triangle graph), as well as at least two other simple test cases with at least 10 nodes.

9. Now we’ll run PageRank on the Facebook data. [http://snap.stanford.edu/data/egonets-Facebook.html].

The le is called \facebook combined.txt.gz"; remember that it has 4,039 nodes.

(a) Once again, remember that this is an undirected graph! Before running your algorithms from the previous problem, implement a transformation into a directed graph, i.e. each undirected edge corresponds to two di erent directed edges.

(b) Now, run the PageRank algorithms from the last problem on this new graph. You shouldn’t need n to be much higher than 10-20 for the algorithm to converge to a xed point.

(c) Where did most of the score tend to end up in your experiments? Look at the nodes that have the highest or lowest scores; is there a consistent pattern among your trials?

3

(d) Intuitively explain your results in terms of a measure of in uence in a social network. Do you think that this is an accurate measurement? How could we try to improve it (for instance, by incorporating link strengths or other measures of popularity)?

Part 5: Essay Question

(This problem should be completed individually and not in a group. However, it will be graded based on completion.)

Write 1/2 to 1 page discussing or analyzing one of the following prompts using any of the concepts taught in this class:

Read the following New York Times article adapting a recent work from Nobel Prize winners Esther Du o and Abhijit Banerjee: https://www.nytimes.com/2019/10/26/opinion/sunday/du o-banerjee-economic-incentives.html The article makes the claim that people aren’t driven by nancial incentives as much as you would assume. In particular, they cite a study that claims that people believe that \Every-one else responds to incentives, but I dont." Why might this undermine certain assumptions made in this class? How can we model this situation using ideas from this class?

Watch the following speech from Sacha Baron Cohen: https://www.youtube.com/watch?v=ymaWq5yZIYM

The speech discusses the relevance of the spread of (mis)information in social network. For example, he makes the claim that \fake news outperforms real news because lies spread faster than truth." How can use tools from this class to explain this phenomena? How could we use tools from this class to identify the spread of misinformation?

Pick one or more recent news article (within the last few months) related to networks, markets, or beliefs to analyze and discuss. (In particular, you may look at one of the above examples and discuss a di erent aspect if you wish.)

4