SIGIR 2011 Workshop: Internet Advertising (IA2011)

Important Dates

Submission deadline: June 20, 2011
Notification of Acceptance: July 10, 2011
Camera ready: July 18, 2011
Workshop Date: July 28, 2011

Program


8:30-8:40 Opening Remarks
8:40-9:40 Keynote: A Perspective on Targeted Advertising: Principles, Implementation, Controversies
Andrei Broder
9:40-10:20 Invited talk: Attribution and Marketing Effectiveness in Display Advertising with Unreliable Cookies Using Bayesian Kalman Filtering
Ram Akella
10:20-10:50 Coffee Break
10:50-11:20 Contributed talks: Information Retrieval and Machine Learning for Internet Advertising
10:50-11:05 Contributed talk
Learning User Behaviors for Advertisements Click Prediction [slides]
11:05-11:20 Contributed talk
Learning to Active Learn with Applications in the Online Advertising Field of Look-Alike Modeling [slides]
11:20-12:00 Invited talk: Story of Phoenix Nest [slides]
Yang Liu
12:00-14:00 Lunch: Sunshine Cafe, Beijing Hotel, Sponsored by Baidu
A buffet lunch will be provided for all registered attendees and invited speakers.
14:00-14:40 Invited talk: Display Advertising: Opportunities & Challenges
Baogang Yao
14:40-15:00 Invited talk: Data-Driven Display Advertising
Xuehua Shen
15:00-15:45 Contributed talks: Beyond Traditional Advertising
15:00-15:15 Contributed talk
Effective Blog Advertising by Understanding Blogger's Emotions & Needs [slides]
15:15-15:30 Contributed talk
Mobile Advertising: Triple-win for Consumers, Advertisers and Telecom Carrier [slides]
15:30-15:45 Contributed talk
Classifying Business Marketing Messages on Facebook [slides]
15:45-16:10 Coffee Break
16:10-16:50 Invited talk: How effective is Targeted Advertising? [slides]
Ayman Farahat and Michael Bailey
16:50-17:30 Panel Discussions

Accepted Papers

Learning User Behaviors for Advertisements Click Prediction [pdf]
Chieh-Jen Wang (National Taiwan University), Hsin-Hsi Chen (National Taiwan University)

Effective Blog Advertising by Understanding Blogger's Emotions & Needs [pdf]
Yao-sheng Chang (National Cheng Kung Univ.), Wen-hsiang Lu (CSIE, NCKU)

Mobile Advertising: Triple-win for Consumers, Advertisers and Telecom Carrier [pdf]
Chia-Hui Chang (National Central University), Kuan-Hua Huo (National Central University)

Optimizing Display Advertisements Based on Historic User Trails [pdf]
Neha Gupta (UMD), Udayan Khurana (UMD), Tak Lee (UMD), Sandeep Nawathe (Tumri Inc.)

Ranking of New Sponsored Online Ads Using Semantically Related Historical Ads [pdf]
Hamed Neshat (Simon Fraser University), Mohamed Hefeeda (Simon Fraser University)

Classifying Business Marketing Messages on Facebook [pdf]
Bei Yu (Syracuse University), Linchi Kwok (Syracuse University)

Learning to Active Learn with Applications in the Online Advertising Field of Look-Alike Modeling
James Shanahan (Church and Duncan Group Inc.)


Internet advertising, a form of advertising that utilizes the Internet and World Wide Web to deliver marketing messages and attract customers, has seen exponential growth since its inception over 15 years ago, resulting in a $65 billion market worldwide in 2008; it has been pivotal to the success of the World Wide Web.

The dramatic growth of Internet advertising poses great challenges to the information retrieval community and calls for new technologies to be developed. Internet advertising is a complex problem. It comes in different formats (including search advertising, display advertising, social network advertising, and in-app/in-game advertising). It involves multiple parties (i.e., advertisers, users, publishers, and ad platforms such as ad exchanges), which interact with each other harmoniously but exhibit a conflict of interest when it comes to risk and revenue objectives. It is highly dynamic in terms of the rapid change of user information needs, non-stationary bids of advertisers, and the frequent modifications of ad campaigns. It is very large scale, with billions of keywords, tens of millions of ads, billions of users, and millions of advertisers, where events such as clicks and actions can be extremely rare. In addition, the field lies at the intersection of information retrieval, machine learning, economics, optimization, distributed systems and information science, all very advanced and complex fields in their own right.

For such a complex problem, conventional technologies and evaluation methodologies are not sufficient, and the development of new algorithms and theories is sorely needed.

The goal of this workshop is to survey the state of the art in Internet advertising, and to discuss future directions and challenges in research and development. We expect the workshop to help develop a community of researchers who are interested in this area, and to foster future collaboration and exchanges.

Possible topics include:
  • IR and advertising
    • CTR prediction
    • Relevancy studies for advertising
    • Behavior targeting and audience selection
    • Ad selection and ranking
    • Ad taxonomy construction and alignment
    • Ad classification and clustering
  • Evaluation and benchmarks
    • Human labeling for ads
    • Evaluation metrics for ad effectiveness
    • Public benchmarks for academic research
    • Experimental design (considering second order effects)
  • Beyond traditional advertising
    • In game advertising
    • In app advertising
    • Mobile advertising
    • Social advertising
    • Advertising on four screens
    • Ad Exchanges and RTB: expressing constraints and forecasting
  • Others
    • Credit assignment
    • Privacy protection
    • Auction theory
    • Mechanism design
    • Bid and campaign optimization
The above list is not exhaustive, and we welcome submissions on highly related topics too.

Invited Talks

In alphabetical order

Attribution and Marketing Effectiveness in Display Advertising with Unreliable Cookies Using Bayesian Kalman Filtering
Ram Akella (Stanford University and University of California)
Evaluating and enhancing the effectiveness of marketing campaigns is an important and current problem in online display advertising. The main objective of advertisers and the advertising network is to increase the number of commercial actions, under the Cost-Per-Action (CPA) business model. From an ad network perspective, it is often not possible to obtain click-stream and other information pertinent to advertising from within the website, due to privacy constraints. Moreover, a significant number of users do not have reliable cookies: estimates from the ad network put the share at around 15% of users. Thus, establishing a direct correspondence between ads and individual user actions is often difficult and possibly infeasible.
In this talk, I will present the approaches we have been developing to associate commercial actions with marketing campaigns in the absence of cookies. I will describe a time series approach to address this key problem using a Dynamic Linear Model (DLM), also called Bayesian Kalman Filtering, to model the effects of impressions on the number of actions, aggregated over all users. The model includes a persistence of commercial actions several periods after user exposure to ad impressions, assuming decay in the impact. Our model supports several campaigns being run simultaneously, together with any base-DLM model to describe commercial actions without incorporating impressions. To estimate the parameters of the model, we use a Gibbs-sampling-based approach. The interest in this approach derives from the fact that much of the previous literature has focused primarily on clicks, rather than actions.
Finally, I will discuss the limitations inherent in inferring causality, the current measures, and challenges in running experiments.
The context and test-bed for this work is large-scale production data (termed "Big Data" in recent times) from AOL, using the internal Hadoop cluster.
Joint work with Joel Barajas, in collaboration with Marius Holtan, Brad Null, Jaimie Kwon (AOL)
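As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a dynamic linear model filtered with a Kalman filter, in which aggregated actions depend on a slowly drifting base level plus a geometrically decayed sum of past impressions. It is not the speakers' implementation: the function names, the single-campaign setup, and the assumption that the decay rate and the variances are known are all hypothetical simplifications (the abstract states that the actual parameters are estimated with a Gibbs-sampling-based approach).

```python
def adstock(impressions, decay):
    """Geometrically decayed carry-over of past impressions (assumed form)."""
    out, carry = [], 0.0
    for x in impressions:
        carry = x + decay * carry
        out.append(carry)
    return out


def kalman_filter(actions, exposure, obs_var=1.0, level_var=0.01, prior_var=100.0):
    """Filter y_t = level_t + beta * exposure_t + noise.

    The state is [level, beta]: the level follows a random walk and beta is
    constant. Returns the final posterior mean [level, beta].
    """
    m = [0.0, 0.0]                            # posterior mean of [level, beta]
    P = [[prior_var, 0.0], [0.0, prior_var]]  # posterior covariance
    for y, a in zip(actions, exposure):
        # Predict step: only the level receives process noise.
        P[0][0] += level_var
        f = [1.0, a]                          # observation vector
        # P @ f, innovation variance, and one-step forecast error.
        Pf = [P[0][0] * f[0] + P[0][1] * f[1],
              P[1][0] * f[0] + P[1][1] * f[1]]
        S = f[0] * Pf[0] + f[1] * Pf[1] + obs_var
        err = y - (f[0] * m[0] + f[1] * m[1])
        # Update step: Kalman gain, mean, and covariance.
        K = [Pf[0] / S, Pf[1] / S]
        m = [m[0] + K[0] * err, m[1] + K[1] * err]
        P = [[P[i][j] - K[i] * Pf[j] for j in range(2)] for i in range(2)]
    return m


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the same model (made-up numbers).
    impressions = [(7 * t) % 11 for t in range(60)]
    exposure = adstock(impressions, decay=0.5)
    actions = [5.0 + 0.5 * a for a in exposure]
    level, beta = kalman_filter(actions, exposure)
    print("estimated base level:", level)
    print("estimated effect per unit of decayed exposure:", beta)
```

Because the model works on aggregate time series only, it needs no user-level cookie information, which is the property the abstract emphasizes.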
About speaker
Ram Akella is Professor of Information Systems and Technology Management, and Director of the Center for Large Scale Analytics and Smart Services (CLASS) at the University of California, including KISMT; his appointment at Stanford is in Informatics and Medicine. His current interests span Social Networks and Dynamic Bayesian Recommender Systems and Personalization (Google), Computational Advertising (with Yahoo), Online Marketing (AOL Faculty Award), Query Intent Detection in Sponsored Search (Microsoft Faculty Award), Social Media & Marketing Analytics (Crowd Science, Serendio), Social Game Analytics (Claritics), Machine Learning, Interactive Information Retrieval (SAP), Information Extraction and Service Analytics (Cisco), Data, Text, Image and Video Mining (NASA), Healthcare Analytics (IBM Faculty Award, CITRIS), and Energy Analytics (CITRIS). His broader interests span Financial Engineering, Supply Chain, Innovation, Product-Service Analytics, Pricing and Costing. He has worked with over 200 firms in domains ranging from semiconductors, PCs, and software, to automotive and food. Ram has straddled Engineering, Management, Computer Science, Information Sciences, and Medicine through appointments at Carnegie Mellon (Associate Professor between the Business School and Computer Science), and then at MIT, Berkeley and Stanford, including forming Technology and Information Management (TIM) at UCSC as Founding Director/Chair. He followed up his BS and PhD in EECS at IIT Madras and IISc, Bangalore, with postdoctoral appointments at Harvard and MIT. His editorial and program committee roles span IEEE, INFORMS, and ACM. His students have gone on to be faculty and department Chairs at major schools such as University of Michigan Ann Arbor, Northwestern University, Columbia University, NYU, USC, and the London Business School, or executives, entrepreneurs, VCs, and lawyers (AT Kearney, BCG, Juniper, Intel Capital, Spoke, MerchantCircle, Cooley LLP, etc.).
His industry board memberships include the Unit Trust of India Ventures. He enjoys receiving stock from appreciative executives.

A Perspective on Targeted Advertising: Principles, Implementation, Controversies
Andrei Broder (Yahoo!)
Online user interaction is becoming increasingly personalized both via explicit means (customizations, options, add-ons, skins, apps, etc.) and via implicit means, that is, deep data mining of user activities that allows automated personalized content and experiences, e.g. individualized top news stories, personalized ranking of search results, personal "radio stations" that capture idiosyncratic tastes from past choices, individually recommended purchases, and so on. On the other hand, the vast majority of providers of content and services (e.g. portals, search engines, social sites) are supported by advertising, which, at its core, is just a different type of information. Thus, not surprisingly, online advertising is becoming increasingly personalized as well, supported by the emerging new discipline of Computational Advertising.
The central problem of Computational Advertising is to find the "best match" between a given user in a given context and a suitable advertisement. The context could be a user entering a query in a search engine ("sponsored search"), a user reading a web page ("content match" and "display ads"), a user communicating via instant-messaging or via e-mail, a user interacting with a portable device, and many more. The information about the user can vary from scarily detailed to practically nil. The number of potential advertisements might be in the billions. Thus, depending on the definition of "best match", this problem leads to a variety of massive optimization and search problems with complicated constraints. The solution to these problems provides the scientific and technical foundations of the online advertising industry, which, according to E-Marketer, achieved $26B in revenue in 2010 in the US alone, for the first time exceeding newspaper advertising revenue at "only" $22.8B.
The focus of this talk is targeted advertising, a form of personalized advertising whereby advertisers specify the features of their desired audience, either explicitly, by specifying characteristics such as demographics, location, and context, or implicitly by providing examples of their ideal audience. A particular form of targeted advertising is behavioral targeting, where the desired audience is characterized by its past behavior. We will discuss how targeted advertising fits the optimization framework above, present some of the mechanisms by which targeted and behavioral advertising are implemented, and briefly survey the controversies surrounding behavioral advertising as a potential infringement on user privacy. We will conclude with some speculations about the future of personalized advertising and interesting areas of research.
Note: This talk represents the personal opinions of the author and does not necessarily reflect the views of Yahoo! Inc.
About speaker
Andrei Broder is a Yahoo! Fellow and Vice President for Computational Advertising. He also serves as chief scientist of Yahoo!'s Advertising Product Group. Previously he was an IBM Distinguished Engineer and the CTO of the Institute for Search and Text Analysis in IBM Research. From 1999 until 2002 he was Vice President for Research and Chief Scientist at the AltaVista Company. He graduated Summa cum Laude from the Technion, the Israel Institute of Technology, and obtained his M.Sc. and Ph.D. in Computer Science at Stanford University. His current research interests are centered on computational advertising, web technologies, context-driven information supply, and randomized algorithms. Broder has authored more than a hundred papers and was awarded thirty-two patents. He is a member of the US National Academy of Engineering, a fellow of ACM and of IEEE, and past chair of the IEEE Technical Committee on Mathematical Foundations of Computing.

How effective is Targeted Advertising?
Ayman Farahat and Michael Bailey (Yahoo!)
Advertisers are demanding more accurate estimates of the impact of targeted advertisements. Traditional measures of the effectiveness of targeted advertising ignore a potential selection bias if there is heterogeneity in the response to a targeted advertisement in the population that is correlated with the targeting criterion. In the first part of this talk, we formulate the evaluation of advertising effectiveness as a treatment effect problem. We review a number of econometric techniques used to estimate the average treatment effect. In the second part of the talk, we present a large-scale natural experiment for evaluating the impact of targeted ads. We show that, on average, naive estimates of the impact of targeted ads are orders of magnitude higher than bias-corrected estimates.
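To make the selection-bias argument concrete, here is a small sketch on made-up counts. Targeting shows the ad mostly to an "interested" segment that converts more often even without the ad, so a naive exposed-versus-unexposed comparison overstates the effect. The correction shown is plain stratification by segment, which is only one simple adjustment and assumes the segment is the sole confounder; it is not claimed to be the technique used in the talk, and all names and numbers are hypothetical.

```python
# (segment, exposed) -> (users, conversions); all counts are invented.
COUNTS = {
    ("interested", True): (9000, 990),
    ("interested", False): (1000, 100),
    ("other", True): (1000, 20),
    ("other", False): (9000, 90),
}


def naive_effect(counts):
    """Difference in conversion rate between all exposed and all unexposed users."""
    totals = {True: [0, 0], False: [0, 0]}
    for (_, exposed), (users, conversions) in counts.items():
        totals[exposed][0] += users
        totals[exposed][1] += conversions
    return totals[True][1] / totals[True][0] - totals[False][1] / totals[False][0]


def stratified_effect(counts):
    """Average of within-segment differences, weighted by segment size."""
    segments = sorted({segment for segment, _ in counts})
    population = sum(users for users, _ in counts.values())
    effect = 0.0
    for segment in segments:
        exposed_users, exposed_conv = counts[(segment, True)]
        control_users, control_conv = counts[(segment, False)]
        diff = exposed_conv / exposed_users - control_conv / control_users
        effect += diff * (exposed_users + control_users) / population
    return effect


if __name__ == "__main__":
    print("naive estimate:     ", naive_effect(COUNTS))
    print("stratified estimate:", stratified_effect(COUNTS))
```

In this toy data the ad adds one percentage point of conversion in each segment, yet the naive comparison attributes most of the segments' baseline difference to the ad.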
About speakers
Ayman Farahat is a MarketPlace architect with Yahoo!. In this role, Ayman provides domain expertise and guidance on marketplace design and yield optimization for brand display marketplaces, balancing utilities over advertisers, consumers, and publishers to promote long-term marketplace health. In particular, Ayman has been focusing on cross-media measurements, attribution and optimization of branded advertising. Prior to Yahoo!, Ayman was chief scientist at Admob, where he worked on mobile advertising.

Story of Phoenix Nest
Yang Liu (Baidu)
Phoenix Nest is Baidu's online marketing professional edition, and was fully launched on December 1, 2009. Since then, it has been a major driver of Baidu's revenue growth. This talk will tell some of the stories from the history of Phoenix Nest and share the experience and lessons we learned.
About speaker
Yang is currently an engineering director at Baidu, leading the development of Baidu's search ads platform, with a main focus on the ad-serving backend and targeting quality. From 2006 to 2010, he was a tech lead at Google, working on AdSense targeting quality. Before that, he worked as a staff engineer at Sybase and as an engineering manager at a startup. Yang received his bachelor's and master's degrees from Shanghai Jiaotong University.

Data-Driven Display Advertising
Xuehua Shen (Pinyou)
In this talk, I will give an overview of the exciting display advertising industry in the USA and China and the differences between the two markets. The talk will focus on innovations and trends in data-driven behavior targeting. Large-scale data collection, processing, modeling, and application based on the Hadoop cloud-computing platform will be discussed.
About speaker
Xuehua Shen received his B.S. in Computer Science from Nanjing University, China, and his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign, USA. His dissertation work was on personalized search. After his Ph.D., he worked on search quality at Google, on personalized search and a live experiment platform for search quality based on real user interactions. He then worked at BlueKai, a display advertising startup in Silicon Valley, using the Hadoop cloud-computing platform for personalized ads and predictive modeling. He is now co-founder and CTO of Pinyou, a data-driven display advertising startup in Beijing.

Display Advertising: Opportunities & Challenges
Baogang Yao (Managing Director, Microsoft adCenter China)
Display advertising, though considered by many as a 'maturing' form of online advertising, is still an area with lots of growth opportunities as more and more offline ad spend moves online. However, this has not been a smooth migration, since significant challenges still remain. This talk will cover the main growth opportunities as well as the top obstacles that are slowing down this shift. It will also give an overview of Microsoft advertising products, as well as how Microsoft R&D teams in China are contributing to these products.
About speaker
Baogang Yao is currently the managing director of Microsoft Ad Platform China, leading the development efforts on various technologies for online advertising. Baogang joined Microsoft headquarters in 1996 as a technical lead and later development manager, and worked for several business groups including Windows. He joined the Microsoft Advanced Technology Center (ATC) as director of engineering in 2004, responsible for the product development of instant messaging, web search, and online advertising. From 2007 to 2009, Baogang held the position of assistant managing director and director of engineering for Google China, overseeing the search and advertising technologies for the global and local markets of Google. After the two years at Google, he returned to Microsoft to lead Microsoft Ad Platform China. Baogang received his bachelor's degree from Fudan University in 1992 and his master's degree from UIUC in 1996.

Paper Submissions

Submissions to the IAD workshop should be in the format of short papers: 4-6 pages formatted in the SIGIR style. The submission does not need to be blind. Please upload submissions in PDF to

Accepted papers will be made available online at the workshop website. In addition, we plan to invite extended versions of selected papers for a special issue of a top-tier information retrieval journal (under discussion).

Program Committee

Aris Anagnostopoulos (Sapienza University of Rome)
Misha Bilenko (Microsoft)
Wenkui Ding (Tsinghua University)
Marcus Fontoura (Google)
Bin Gao (Microsoft Research Asia)
Vanja Josifovski (Yahoo! Research)
Sandeep Pandey (Yahoo! Research)
Kunal Punera (Yahoo! Research)
Benyah Shaparenko (Google NYC)
Dirk Van den Poel (University Gent)
Jun Wang (University College London)
Hwanjo Yu (POSTECH)

Organizers


Tie-Yan Liu (Microsoft Research Asia)
Tao Qin (Microsoft Research Asia)
James G. Shanahan (Independent Consultant)

Contact Us

taoqin AT microsoft DOT com