July 22, 2011
Room 111 (CCIB)


IJCAI's Industry Day will take place on Friday, July 22nd and is intended to provide a forum to discuss how industry is leveraging, or could leverage, machine learning and AI techniques to have an impact in the real world. This year we have an impressive and fascinating program. If you are interested in knowing more about how Watson works (and won at Jeopardy!), the large-scale machine learning algorithms that Google uses, or how machine learning and AI are helping Microsoft, Telefónica and Yahoo! in their businesses, please come to the Industry Day. In addition to speakers from large companies, we will feature speakers from start-ups and relevant smaller and local companies.


If you wish to attend the Industry Day, please register on the conference site for the corresponding pass. The Industry Day pass gives access to the room hosting the event, to the exhibits room, and to the coffee breaks and the closing event of IJCAI 2011. The number of attendees is limited and the price of the pass is 50€.


08:00 - 08:30
08:30 - 08:40
08:40 - 09:20  Learning, Inference, and Action in the Open World
09:20 - 10:00  Are Telcos poised to become a Data Business?
10:00 - 10:30  Coffee Break
10:30 - 11:10  Large Scale Machine Learning At Google: Some Examples and Challenges
11:10 - 11:50  Social Network Analysis in Industry
11:50 - 12:30  Top Lessons Learned Developing, Deploying and Operating Real-World Recommender Systems
12:30 - 13:00  Panel 1
13:00 - 14:30  Lunch Break
14:30 - 15:10  What is Watson?
15:10 - 15:50  The Power of Prediction



Learning, Inference, and Action in the Open World
Eric Horvitz, Microsoft Research

Eric Horvitz is a Distinguished Scientist at Microsoft Research. His interests span theoretical and practical challenges in developing systems that perceive, learn, and reason. He has been elected a Fellow of the AAAI, the American Association for the Advancement of Science, and the American Academy of Arts and Sciences. He was recently president of the AAAI and is now serving as its Immediate Past President. He received his PhD and MD degrees from Stanford.

Methods for learning, reasoning, and decision making under uncertainty with probabilistic graphical models lie at the heart of a two-decade rolling revolution in AI. I will discuss methods that we have employed to learn graphical models from data, and share highlights from the development of several applications, including efforts in transportation, healthcare, operating systems, and sustainability. Finally, I'll address opportunities in developing new kinds of competencies by weaving together sets of perceptual, learning, and reasoning components.
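The core operation behind the graphical models the abstract mentions is probabilistic inference: combining a prior with evidence to get a posterior. A minimal sketch, with a hypothetical two-node network and made-up numbers (not from the talk):

```python
# Toy two-node Bayesian network: Rain -> WetGrass (illustrative numbers only).
p_rain = 0.3
p_wet_given = {True: 0.9, False: 0.2}   # P(WetGrass=true | Rain)

# Inference by enumeration: P(Rain=true | WetGrass=true) via Bayes' rule.
joint_true = p_rain * p_wet_given[True]            # P(rain, wet grass)
joint_false = (1 - p_rain) * p_wet_given[False]    # P(no rain, wet grass)
posterior = joint_true / (joint_true + joint_false)
```

Observing wet grass raises the probability of rain above its prior, which is exactly the kind of evidence-driven belief update that scales up, via learned network structures, to the transportation and healthcare applications the talk covers.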



Are Telcos poised to become a Data Business?
Carlos Domingo, Telefonica Research

Carlos Domingo is the CEO of Telefónica I+D; he joined the company in 2006 as Director of Internet & Multimedia and Director of the Barcelona Center. Since then, he has implemented a new way of working and a new understanding of R&D in a big telco. Carlos has served on the board of directors of Jajah since 2010. He holds an MSc in computer science from the Tokyo Institute of Technology, a PhD in computer science from the Polytechnic University of Catalonia, and postgraduate business studies from the Stanford Graduate School of Business. With more than 15 years of experience in the IT and telecommunications world, he has spent much of his career in Japan and the USA, serving as VP of Celartem Technologies and eventually becoming President & CEO of its Seattle subsidiary after the merger of Extensis, LizardTech and DiamondSoft. In 2008 he received the 'National Award for the Professional Career' granted by the Asociación de Ingenieros en Informática (Computer Science Engineers Association) in Spain.


For some years now we have heard the phrase "Data is the new Oil". This metaphor implies (i) that data is raw and that extracting value from it requires a data refinery (data mining); (ii) that refining and exploiting data is not without risks (privacy and data protection); and (iii) that the companies with access to the data have a significant competitive advantage in the market. In this talk, I will elaborate on all three aspects and argue that telcos are well positioned to become the new data refineries, although there are other likely candidates as well, such as the big internet players.



Large Scale Machine Learning At Google: Some Examples and Challenges
Samy Bengio, Research Scientist in Machine Learning, Google

Samy Bengio (PhD in computer science, University of Montreal, 1993) has been a research scientist at Google since 2007. Before that, from 1999 he was a senior researcher in statistical machine learning at the IDIAP Research Institute, where he supervised PhD students and postdoctoral fellows. His research interests span many areas of machine learning such as large-scale online learning, support vector machines, time series prediction, mixture models, speech recognition, multi-channel and asynchronous sequence processing, multi-modal (face and voice) person authentication, brain-computer interfaces, and document retrieval. He is Associate Editor of the Journal of Computational Statistics, is on the editorial boards of the Journal of Machine Learning Research and the Machine Learning Journal, has been general chair of the Workshops on Machine Learning for Multimodal Interactions (MLMI 2004, 2005 and 2006), programme chair of the IEEE Workshop on Neural Networks for Signal Processing (NNSP 2002), and on the programme committee of several international conferences such as NIPS, ICML and ECML.

Machine learning research has often concentrated on ever more complex algorithms to extract the best models out of a limited number of training examples. In recent years, though, with the availability of cheaper computing resources, industries have started collecting very large databases (millions or even billions of examples, thousands or even millions of features, thousands or more classes, etc.). Trying to apply classical machine learning algorithms to these large datasets is often a challenge. In this talk, I'll describe a few examples and challenges we recently faced at Google in order to adapt or develop machine learning algorithms for our ever larger datasets and problems. This includes examples such as speech recognition, machine translation, image and video annotation, natural language processing and more.
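When a dataset is too large to hold in memory, a classical batch learner stops being an option; one common answer is streaming (online) learning that touches each example once. The sketch below is a generic one-pass stochastic-gradient logistic regression on synthetic data, not Google's actual system:

```python
import math
import random

def sgd_logistic(stream, n_features, lr=0.1):
    """One-pass SGD for logistic regression over a stream of
    (sparse_features, label) pairs; never holds the dataset in memory."""
    w = [0.0] * n_features
    for feats, y in stream:                   # feats: dict {index: value}, y in {0, 1}
        z = sum(w[i] * v for i, v in feats.items())
        p = 1.0 / (1.0 + math.exp(-z))        # predicted probability
        g = p - y                             # gradient of log-loss w.r.t. z
        for i, v in feats.items():
            w[i] -= lr * g * v                # update only the active features
    return w

# Tiny synthetic stream: feature 0 signals the positive class,
# feature 1 signals the negative class.
random.seed(0)
stream = [({0: 1.0}, 1) if random.random() < 0.5 else ({1: 1.0}, 0)
          for _ in range(2000)]
w = sgd_logistic(stream, n_features=2)
```

Because each update touches only the features present in the current example, the cost per example is proportional to its sparsity rather than to the total feature count, which is what makes this style of algorithm practical at the scales the abstract describes.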



Social Network Analysis in Industry
Françoise Soulié Fogelman, VP Innovation, KXEN

Françoise Soulié Fogelman is VP Innovation at KXEN, responsible for leading KXEN's innovation program, working with Research & Development, Product Development, Sales and Marketing to help promote KXEN's offering. She is also in charge of managing KXEN's University Program. She has over 30 years of experience in data mining and CRM, from both an academic and a business perspective. Prior to KXEN, she directed the first French research team on neural networks at Paris 11 University, where she was a professor of computer science. She then co-founded Mimetics, a start-up selling products and services based on neural network technology, and became its Chief Scientific Officer. After that she started the Data Mining and CRM group at Atos Origin and, most recently, created and managed the CRM Agency for Business & Decision, a French IS company specialized in Business Intelligence and CRM. Ms Soulié Fogelman holds a master's degree in mathematics from the École Normale Supérieure and a PhD in computer science from the University of Grenoble. She has advised over 20 PhD students on data mining, has authored more than 100 scientific papers and books, and has been an invited speaker at many academic and business events.

I will first introduce the practical constraints on data mining projects in industry, due to large volumes of data and the ever faster pace of the economy. I will then present recent developments in using data mining on "networked data", that is, on data which can be represented as a social graph. I will show how at KXEN we have developed a social network analysis approach allowing our customers to build and analyze very large social networks (tens of millions of nodes). I will illustrate this approach with real-world applications in telecommunications and credit card fraud.
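The "networked data" the talk refers to can be made concrete with a toy example: building a social graph from call records and computing a simple per-node measure. This is only an illustration of the idea, not KXEN's product; the records, names, and the use of degree centrality are invented here:

```python
from collections import defaultdict

# Toy call-detail records: (caller, callee) pairs — in production these
# would number in the tens of millions.
calls = [("ana", "bob"), ("ana", "carla"), ("bob", "carla"),
         ("carla", "dan"), ("eve", "dan")]

graph = defaultdict(set)
for a, b in calls:                 # build an undirected social graph
    graph[a].add(b)
    graph[b].add(a)

# Degree centrality: a simple measure of a subscriber's social importance,
# usable as an input feature for, e.g., churn or fraud models.
degree = {node: len(neigh) for node, neigh in graph.items()}
top = max(degree, key=degree.get)
```

Graph-derived features like these are what allow per-customer models (telecoms churn, credit card fraud) to exploit who is connected to whom, rather than treating each record in isolation.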



Top Lessons Learned Developing, Deploying and Operating Real-World Recommender Systems
Marc Torrens, Chief Innovation Officer, Strands

In 2004, Marc co-founded Strands, Inc., where he currently serves as Chief Innovation Officer. He brings expertise in bridging the gap between the outputs of scientific research and the inputs needed by real-world Internet applications in the finance industry. He enjoys overcoming new business challenges with state-of-the-art technology in the areas of Artificial Intelligence, Usability, Data Visualization, and Product Design. Prior to Strands, Marc co-founded Iconomic Systems SA in 1999 and served the company as Chief Technology Officer. Iconomic Systems developed software for planning travel on the Internet. The company was sold to i:FAO AG (Germany) in 2001.

He is co-inventor of several patents for Iconomic Systems and Strands. Marc has also published more than 20 refereed papers on his work in international conferences and journals on Artificial Intelligence and Usability. He also regularly participates in conference committees (IJCAI, ECAI, AAAI, ACM EC, ACM RecSys, UMAP, and others) and has chaired several workshops and conferences, recently including the ACM International Conference on Recommender Systems 2010. Marc Torrens received an MSc degree in Computer Science from the Universitat Politècnica de Catalunya (UPC), Barcelona, Catalonia, in 1997, and a PhD in Computer Science from the École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, in 2002. He also holds a postgraduate Certificate in Entrepreneurship, organized by CREATE (Chair of Entrepreneurship at EPFL).

Strands develops products that help people find information online that they want and need. Strands offers product recommendation services for eCommerce, interactive tools for personal finance management, and personal interest and lifestyle-oriented social discovery solutions. Strands also operates a personal finance management platform and a training log and information source for active people. In this talk, Strands' Chief Innovation Officer, Marc Torrens, PhD, will discuss Strands' "Top 10 Lessons Learned" from their experience building and deploying recommender systems for a wide range of customers. The lessons range from customer relations and marketing, to business planning, to the technical. As recommender technology becomes ubiquitous online, and even overshadows search in many commercial settings, Strands has found these "Top 10 Lessons Learned" continue to be valuable guidelines.



What is Watson?
David Gondek, IBM Watson Research Center

Computer systems that can directly and accurately answer questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Open domain question answering holds tremendous promise for facilitating informed decision making over vast volumes of natural language content. Applications in business intelligence, healthcare, customer support, enterprise knowledge management, social computing, science and government would all benefit from deep language processing. The DeepQA project is aimed at exploring how advancing and integrating Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), massively parallel computation and Knowledge Representation and Reasoning (KR&R) can advance open-domain automatic Question Answering. One proof-point in this challenge is to develop a computer system that can successfully compete against top human players at the Jeopardy! quiz show. Attaining champion-level performance in Jeopardy! requires a computer system to rapidly and accurately answer rich open-domain questions, and to assess its own confidence in a category or question. The system must deliver high degrees of precision and confidence over a very broad range of knowledge and natural language content within a very short 3-second response time. The need for speed and high precision demands a massively parallel computing platform capable of generating, evaluating and combining thousands of hypotheses and their associated evidence. In this talk I will introduce the audience to the Jeopardy! Challenge and the techniques used to tackle it.
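The "combining thousands of hypotheses and their associated evidence" step can be pictured as merging several evidence scores per candidate answer into a single confidence, then answering only when that confidence clears a threshold. A simplified stand-in for DeepQA's learned answer-ranking stage, with scorer names, weights, and numbers invented for illustration:

```python
import math

def confidence(evidence_scores, weights, bias):
    """Combine per-scorer evidence for one candidate answer into a single
    confidence via a logistic model (a toy stand-in for a learned merger)."""
    z = bias + sum(w * s for w, s in zip(weights, evidence_scores))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical scorers: passage match, answer-type match, popularity.
weights, bias = [2.0, 1.5, 0.5], -2.0
candidates = {
    "Toronto": [0.2, 0.1, 0.9],
    "Chicago": [0.9, 0.8, 0.7],
}
conf = {ans: confidence(s, weights, bias) for ans, s in candidates.items()}
best = max(conf, key=conf.get)
buzz = conf[best] > 0.5    # only "buzz in" when confidence clears a threshold
```

The threshold is what turns an answer ranker into a Jeopardy! player: a wrong buzz costs points, so a well-calibrated confidence matters as much as picking the top candidate.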



The Power of Prediction
Ricardo Baeza-Yates, Yahoo! Research

Ricardo Baeza-Yates is VP of Yahoo! Research for Europe, the Middle East and Latin America, leading the labs in Barcelona, Spain and Santiago, Chile, as well as supervising the newer lab in Haifa, Israel. Until 2005 he was director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile, and ICREA Professor at the Department of Technology of the Universitat Pompeu Fabra in Barcelona, Spain. He is co-author of the best-selling book Modern Information Retrieval, published in 1999 by Addison-Wesley with a second edition in 2010, as well as co-author of the second edition of the Handbook of Algorithms and Data Structures (Addison-Wesley, 1991) and co-editor of Information Retrieval: Algorithms and Data Structures (Prentice-Hall, 1992), among more than 200 other publications.

He received the Organization of American States award for young researchers in exact sciences (1993) and, with two Brazilian colleagues, the COMPAQ prize for the best Brazilian CS research article (1997). In 2003 he became the first computer scientist to be elected to the Chilean Academy of Sciences. In 2007 he was awarded the Graham Medal for innovation in computing, given by the University of Waterloo to distinguished alumni. In 2009 he received the Latin American distinction for contributions to CS in the region and became an ACM Fellow. In 2011 he also became an IEEE Fellow.


In Web search engines, performance and user experience are key aspects. One technique for improving performance is to use machine learning to predict whether a future event will occur and, assuming the prediction is correct, to use a different algorithm for each predicted case. We show a few examples of this technique in query processing, related to caching, index selection and the lack of good answers. Similarly, the same idea is used to determine the intention behind a query and then trigger different user experiences.
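The predict-then-branch pattern the abstract describes can be sketched for the caching case: predict whether a query will be seen again, and only spend cache space on queries predicted to repeat. Everything here is invented for illustration (the predictor, the cache, the admission rule); real systems use learned features of the query rather than a raw repeat count:

```python
def predict_repeats(query, history_counts):
    """Hypothetical predictor: guess a query will recur if it has appeared
    before (a stand-in for a trained classifier over query features)."""
    return history_counts.get(query, 0) > 0

class AdmissionCache:
    """Result cache that only admits entries the predictor expects to be
    reused — predict first, then branch to a different policy per case."""
    def __init__(self):
        self.store, self.history = {}, {}

    def get(self, query, compute):
        if query in self.store:
            return self.store[query]           # fast path: cache hit
        result = compute(query)                # slow path: full evaluation
        if predict_repeats(query, self.history):
            self.store[query] = result         # admit only predicted repeats
        self.history[query] = self.history.get(query, 0) + 1
        return result

cache = AdmissionCache()
cache.get("ijcai 2011", str.upper)   # first sighting: computed, not admitted
cache.get("ijcai 2011", str.upper)   # second sighting: now admitted
hit = "ijcai 2011" in cache.store
```

The payoff is that one-off ("singleton") queries, which dominate search traffic, never evict entries that will actually be reused; when the predictor is wrong the system still returns a correct result, just more slowly.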

