Giusy Di Lorenzo
Optimizing cities by mining pervasive data
Abstract: Over the past decade, the development of digital networks and operations has produced an unprecedented wealth of information reflecting various aspects of urban life. These accumulated digital traces are valuable sources of data for capturing the pulse of the city in astonishing temporal and spatial detail, and could be used to make urban systems more efficient.
The deep penetration of mobile phones offers cities the ability to opportunistically monitor citizens' interactions and use data-driven insights to better plan and manage services. In this context, transit operators can leverage pervasive mobile sensing to better match observed demand for travel with their service offerings. With large-scale data on mobility patterns, operators can move away from the costly and resource-intensive four-step transportation planning processes prevalent in the West to a more data-centric view that places the instrumented user at the center of development. In this framework, using mobile phone data to perform transit analysis and optimization represents a new frontier with significant societal impact, especially in developing countries.
In this talk I will show how mobile phone data can be used to optimize the planning of a city-wide public transit network, and present a real system that has been tested for the city of Abidjan, Ivory Coast, with a focus on improving the existing SOTRA transit network.
Web dynamics-aware search: predicting the future to keep up with the present
Abstract: Search engines need to serve users the content that is interesting and important to them at the very moment they issue a query. However, partly because of limits on the computational resources needed to gather and analyze data frequently, and partly because search engine logs lack statistics that adequately reflect users' real-world interests, it is impossible to always know what users need at each and every second. This lack of knowledge inevitably affects millions of queries. To make its effect less critical, search engines need to predict what is interesting to users now, using the data they gathered in the recent past.
In my talk I will give an overview of a few tasks where this problem is noticeable and demonstrate how it is solved by search engines, and by Yandex in particular. I will focus on three specific challenges: 1) predicting which content users will be interested in in the future, and adapting link analysis algorithms based on browsing trails accordingly; 2) predicting where the new content that users will want in the future will appear, and adapting crawling algorithms accordingly; and 3) predicting how users will search for interesting content in the future, and adapting query autocompletion algorithms accordingly.
Exploiting opinion dynamics to reach near-perfect sentiment classification
Abstract: Opinions and remarks spread fast through social media networks. Some die fast, some explode in a matter of minutes, others grow for months. In the meantime, the Websays pipeline assigns sentiment labels to mentions, and analysts and clients correct them in an active-learning setting. By exploiting the temporal dynamics of this process, Websays can reach near-perfect sentiment classification in real time on the portion of opinions that matter to clients.