1. (11 pts) In the folder "statisticians", you can find data on statisticians’ publications in 4 journals during 2003-2012. Look at the ReadMe file in the folder to understand the data set. You can also explore the paper at https://arxiv.org/abs/1410.2840 for a detailed analysis if you want. We will explore the data set in class and test basic visualizations. You are expected to extend the analysis as follows:
1. Visualize the word cloud of the abstracts over three periods (2003-2005, 2006-2008, 2009-2012). Do you observe any trend? (2 pts)
2. Ego network exploration: among the three famous statisticians "Peter J Bickel", "Jianqing Fan", and "David Dunson", who has the largest number of collaborators? How many collaborators does each of them have? (1 pt)
3. Pick one of the three persons for the following analysis:
(a) Visualize how the person’s collaboration network changes over the three periods above (the person and all of his/her collaborators only, as an induced network). Comment on or visualize any expansion pattern. Make sure you visualize the three networks in the three periods in the same reasonable layout, even though they have different nodes. You need to figure out a way to do this. (Network visualization over time 2 pts, reasonable pattern summary or visualization 2 pts)
(b) Visualize the word cloud from his/her articles in the data set over the three periods. Is the trend for this person similar to the global trend? (Visualization 2 pts, comparison 1 pt)
(c) Use Google Scholar to find all the paper titles of this person. Again, visualize the word cloud from the titles over time, for the person’s whole academic career. Does the period of 2003-2012 seem to match what you observe from (c)? (Hint: you do not have to segment the data by each year; just check the word cloud before 2003 and the cloud after 2012.) (1 pt)
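One possible way to approach the "same layout" requirement in 3(a) is to compute node positions once, on the union of the nodes from all three periods, and reuse them for every plot. The sketch below is only an illustration under stated assumptions: the record format `(year, authors)`, the toy names, and the helper functions `coauthor_edges`, `ego_network`, and `shared_layout` are all hypothetical and do not come from the data set (see the ReadMe file for the real format). It also takes "collaborators" to mean co-authors within the given period, which is one reading of the task.

```python
import math
from itertools import combinations

# Hypothetical toy records (year, authors); the real data format is
# described in the ReadMe file and will differ.
PAPERS = [
    (2003, ["Ego", "A"]),
    (2004, ["A", "B"]),
    (2007, ["Ego", "B", "C"]),
    (2007, ["A", "C"]),
    (2010, ["Ego", "D"]),
    (2011, ["C", "D"]),
    (2011, ["E", "F"]),
]
PERIODS = [(2003, 2005), (2006, 2008), (2009, 2012)]


def coauthor_edges(papers, start, end):
    """Return the set of co-author pairs for papers published in [start, end]."""
    edges = set()
    for year, authors in papers:
        if start <= year <= end:
            # Sorting makes each undirected pair appear in one canonical order.
            edges.update(combinations(sorted(set(authors)), 2))
    return edges


def ego_network(papers, ego, start, end):
    """Induced network on the ego and the ego's co-authors in [start, end]."""
    edges = coauthor_edges(papers, start, end)
    nodes = {ego}
    for u, v in edges:
        if ego in (u, v):
            nodes.update((u, v))
    # Keep every edge whose two endpoints are both in the node set.
    induced = {(u, v) for u, v in edges if u in nodes and v in nodes}
    return nodes, induced


def shared_layout(all_nodes, ego):
    """Put the ego at the origin and all other nodes on the unit circle."""
    others = sorted(n for n in all_nodes if n != ego)
    pos = {ego: (0.0, 0.0)}
    for i, name in enumerate(others):
        angle = 2 * math.pi * i / len(others)
        pos[name] = (math.cos(angle), math.sin(angle))
    return pos


networks = [ego_network(PAPERS, "Ego", s, e) for s, e in PERIODS]
# Positions are computed once on the union, so a node never moves between plots.
union_nodes = set().union(*(nodes for nodes, _ in networks))
layout = shared_layout(union_nodes, "Ego")

for (start, end), (nodes, edges) in zip(PERIODS, networks):
    print(f"{start}-{end}: {len(nodes) - 1} collaborators, {len(edges)} edges")
```

The `layout` dictionary can then be passed as fixed positions to a plotting tool, for example as the `pos` argument of `networkx.draw`, drawing only the nodes and edges present in each period. A circular layout is just the simplest choice; any layout computed on the union graph serves the same purpose.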