$24
1. Consider the following alternative to potentially achieve -di erential privacy for releasing the mean of a database X 2 f0; 1gn. So we have n users each with one attribute and we want
P
to release the count of users with this attribute so f(X) = i Xi. As studied in class, this query has S1(f) = 1 so we can use Laplacian noise. But what if instead of Laplacian noise we consider the following strategy for achieving privacy: 1. Sample a uniform real-value in the interval [ C= ; C= ] (for some constant C). 2. Release f(X) + Z.
Is the above mechanism -di erentially private? (For some xed constant C.) [4 points]
2. In class we stated without proof that the exponential mechanism gets a reasonable guarantee for preserving the utility of the released id. In particular, we stated in class without proof that (using notation from class), if we use exponential mechanism for a utility function with sensitivity u,
P r[u(ME(X; u; R)) OP Tu(X)
u
(ln jRj + t)] e t:
Prove the above statement. [4 points]
[Hint: Try to reason that every item with as small utility as above is quite unlikely, and then use that the total number of items is at most jRj.]
3. Suppose your database is the income numbers of n individuals. What scheme would you use to release the median income in the dataset to satisfy -di erential privacy while achieving a good guarantee on error?
1
Concretely, suppose the each persons income is an integer in f0; 1; 2; : : : ; Ng. So the database X 2 [N]n and the query we are trying to release privately is M edian(X). While this is not needed, if it helps for you, you can suppose that all incomes are distinct and that n is odd.
• What is the sensitivity of this query as a function of N? [2 points]
• What would happen if you use Laplacian mechanism given this sensitivity bound? [2 points]
• Describe a score function or utility function for which the arg max would exactly be the median and one whose sensitivity does not depend on N; n. Note the remarkable gains you would get by now using exponential mechanism with this score function to release the median instead! [2 points]
• Describe a score function for which the arg max would exactly be the 90’th percentile income level among the people in the database and one whose sensitivity is still inde-pendent of N; n. [2 points]
2