Saturday, January 19, 2013

Are You Up to the Challenge? Big Data and SBP 2013 Conference in DC

For the past several years I was on the SBP (Social Computing, Behavioral-Cultural Modeling and Prediction) Conference Committee and enjoyed working with a great group of colleagues, with the grand finale being the conference itself. I was responsible for the tutorials and also helped to identify several keynoters over the years. Together with Dr. Patrick Qiang, I also gave a tutorial at SBP 2010.

This year, since I am on sabbatical and have been and will be out of the US a lot, I stepped down from the committee.

The SBP 2013 Conference will take place in Washington DC, April 2-5, and the keynoters are wonderful: Dr. Bernardo Huberman of HP, Dr. Michelle Gelfand of the U. of Maryland, and Dr. Myron Gutman of NSF. The tutorials should also be great.

But what really caught my interest, and the news is starting to circulate via various e-lists, is the Challenge Problem, with assistance from none other than Dr. Alex (Sandy) Pentland of MIT, who was one of the tutorial givers that I had invited for the SBP 2011 Conference. He also was one of the plenary speakers at the Northeast Regional INFORMS Conference at UMass Amherst that I was involved in (and, I do admit, it was wonderful). Dr. Pentland had also spoken in our UMass Amherst INFORMS Speakers Series.

The deadline for the challenge problem is January 31, 2013, so time is tight!

Challenge Problem

International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction

SBP is offering a challenge problem for the second time in 2013. We organize this challenge to encourage researchers to combine the power of big data and the power of systems thinking in the field of social computing, behavioral-cultural modeling, and prediction. These datasets are brought to you by the MIT Human Dynamics Laboratory, with special thanks to Professor Alex (Sandy) Pentland and his research team.

Problem Description

Cell phones afford a convenient platform to advance the understanding of social dynamics and influence, because of their pervasiveness, sensing capabilities, and computational power. Many applications have emerged in recent years in mobile health, mobile banking, location based services, media democracy, and social movements. With these new capabilities, we can potentially identify exact points and times of infection for diseases, determine who most influences us to gain weight or become healthier, know exactly how information flows among employees and how productivity is affected in our work spaces, and understand how rumors spread.

There remain, however, significant challenges to making mobile phones the essential tool for conducting social science research and for supporting mobile commerce with a solid social science foundation. Perhaps the greatest challenge is the lack of data in the public domain. There is a need for data large and extensive enough to capture the disparate facets of human behavior and interactions. Another major challenge lies in the interdisciplinary nature of conducting social science research with mobile phones. Software engineers need to work collaboratively alongside social scientists and data miners in various fields.

In an attempt to address these challenges, we have worked with the MIT Human Dynamics Laboratory to release several mobile data sets in "Reality Commons" that contain the dynamics of several communities of about 100 people each. We invite researchers:
  • To propose and submit their own applications of the data to demonstrate the scientific and business values of these data sets,
  • To suggest how to meaningfully extend these experiments to larger populations, or
  • To develop the math that fits agent-based models or systems dynamics models to larger populations.

The problem itself will be open-ended and encourage approaches from different disciplines, encompassing a range of applications using this data, including:
  • Social network analysis
  • Data visualization
  • Simulation studies
  • Predictive modeling
  • Qualitative studies to supplement existing quantitative work
  • Creative new applications of the data
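
As a rough illustration of the first item in the list, social network analysis, here is a minimal sketch of turning proximity observations into a weighted contact network. The record layout (two person IDs and a timestamp) and the sample values are hypothetical and chosen only for illustration; they are not the actual Reality Commons schema, and the function names are my own.

```python
from collections import Counter, defaultdict

# Hypothetical proximity records: (person_a, person_b, timestamp_seconds).
# This layout is an assumption for illustration, not the real data format.
records = [
    ("p1", "p2", 0),
    ("p1", "p2", 60),
    ("p2", "p3", 60),
    ("p1", "p3", 120),
    ("p2", "p3", 180),
]

def build_contact_network(records):
    """Count how often each unordered pair of people was observed together."""
    weights = Counter()
    for a, b, _timestamp in records:
        if a != b:  # ignore self-observations
            weights[tuple(sorted((a, b)))] += 1
    return weights

def weighted_degree(weights):
    """Sum the edge weights attached to each person."""
    degree = defaultdict(int)
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return dict(degree)

network = build_contact_network(records)
degree = weighted_degree(network)
```

Here the weighted degree is just the number of pairwise observations a person appears in, which is about the simplest measure of how connected someone is. A real analysis would need to handle time windows, sensor noise, and the privacy restrictions attached to the data.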

The DataSet

Data center dynamics: The data contain the performance, behavior, and interpersonal interactions of participating employees at a Chicago-area data server configuration firm for one month. It is the first data set to contain the performance and dynamics of a real-world organization with a temporal resolution of a few seconds. The sensor data were collected by Daniel Olguin, Ben Waber, Tamie Kim, and Alex Pentland in 2007 using Sociometric Badges.
Social Evolution in an undergraduate dormitory: The data contain surveys and sensor data about the diffusion of political opinions, diet, exercise, obesity, eating habits, epidemiological contagion, and depression and stress, from 70 residents of an undergraduate dormitory. These residents represent 80% of the total population.
Friends and Family dataset: The Friends and Family experiment was designed to study (a) how people make decisions, with emphasis on the social aspects involved, and (b) how we can empower people to make better decisions using personal and social tools. The subjects were members of a young-family residential living community adjacent to a major research university in North America. All members of the community are couples, and at least one person in each family is affiliated with the university. The community is composed of over 400 residents, approximately half of whom have children. The sensor data in this data set were collected using the funf open-source sensing platform for Android phones.
Reality Mining dataset: The data contain the dynamics of 75 students/faculty in the MIT Media Laboratory, and 25 incoming students at the MIT Sloan business school adjacent to the Media Laboratory. The Reality Mining experiment, conducted in 2004, was the first to study community dynamics by tracking a sufficient number of people with their personal mobile phones, and resulted in one of the most complete mobile data sets with rich personal behavior and interpersonal interactions. Prior to this experiment, cell phones were not powerful enough to track people.

Data access

Data is accessible from
Users will be asked to fill out a short form and agree to privacy and data use restrictions.

Submission and Evaluation

Submissions will be evaluated based on theoretical grounding as well as use of evidence. Winners will be selected by an interdisciplinary committee of researchers and will be recognized at the conference with 1st, 2nd, and 3rd prizes. The winners will introduce the idea briefly at the conference to the audience and/or give a quick demo. Challenge organizers intend to organize winning entries into a special issue of an appropriate journal.
The submissions (6 pages) should be formatted according to the Springer-Verlag LNCS/LNAI guidelines. Sample LaTeX2e and WORD files are available from Submissions for the challenge can be made here.

Important Dates

  • Submission deadline: January 31, 2013 (23:59 PST)
  • Notification of Winners: March 2, 2013

Challenge Problem Co-chairs

Nitin Agarwal, University of Arkansas
Wen Dong, MIT Media Lab

Prior work using this data

A BibTeX file of references to prior publications using this data is available for download (right-click to 'save as'): sbp2013challenge.bib
BibTeX files can be read by most bibliographic software packages as well as LaTeX. Many free converters are available.

2012 challenge problem winners

Congratulations to winners of the first SBP challenge problem:

  • Matthew Lease, School of Information, University of Texas at Austin "Discovering and Navigating Memes in Social Media"
  • Masoud Makrehchi, Research Scientist, Thomson Reuters, Toronto, Canada "Conflict Thermometer: Predicting Social Conflicts by Analyzing Language Gap in Polarized Social Media."