We address the research challenges of privacy-preserving WebID analytics on the decentralized Social Web. We first argue why personal data management should rely on open, decentralized control rather than closed, centralized control. We then present a policy-aware architecture in which a data owner hand-picks a trusted data controller to mask the personally identifiable information (PII) and other sensitive social relationships of the owner's WebID, so that only anonymous RDF(S) linked datasets are available for analytics. Moreover, we advocate using an R and Hadoop integration paradigm, called RHadoop, for effective hybrid WebID analytics of large-scale social network linked datasets. Finally, we propose various types of semantics-enabled policies that invoke the RHadoop hybrid WebID analytics and further balance data utility and protection on the privacy-aware Social Web.
Special sessions on Big Data Analytics, 2014 IEEE Web Intelligence Conference, 11-14 August, 2014, Warsaw, Poland