Friday, September 21, 2012

What We Learned from a Big Data Decathlete and Isenberg Alum

Today, the UMass Amherst INFORMS (Institute for Operations Research and the Management Sciences) Student Chapter had the pleasure of hosting Dr. Davit Khachatryan of PricewaterhouseCoopers (PwC) in Arlington, Virginia.

He received a PhD from UMass Amherst, with a concentration in Management Science at the Isenberg School of Management, just two years ago and has since worked on some fascinating projects.

His presentation today at the Isenberg School, "Show Me the Data: My Experience as a Statistical Consultant," captivated the audience, which included undergrads, MBAs, and graduate students from both the College of Engineering and the Isenberg School.

Big data is a very hot topic these days, not only because of the volumes being captured from RFID technology, ATMs, cell phones, social media, etc., but also because the cost of storing data has dropped sharply from only a few years ago. Huge amounts of data are now available for analysis, if one can make sense of them.

Dr. Khachatryan explained that the cost of working with real data is that real data is dirty, and that a large part of the time devoted to a project typically goes to cleaning up and organizing the client's data so that one can then get to the interesting work of modeling and predictive analytics.

Clients now want reusable, robust, and automatic solutions, rather than a one-time solution. He spoke a lot about using SAS and SQL and the importance of computer programming.

In addition, he spoke about testing models that clients use and the existence of "model governance" boards in corporations that check the models.

The applications that he discussed (anonymized, for obvious reasons) included two in healthcare and one in finance -- checking whether the "anti-money laundering" rules that a bank has in place actually work. Interestingly, when he programmed the rules and ran the model over a long time horizon's worth of data, the results differed from what the bank had caught in terms of such transactions. I loved the idea of forensics in this application domain.

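For readers curious what that kind of check might look like in code, here is a minimal sketch in Python. It is not Dr. Khachatryan's method or any bank's actual rules: the single threshold rule, the Transaction fields, the threshold value, and the sample data are all hypothetical, made up purely for illustration. The idea is simply to re-run the written rules over historical transactions and compare the hits with what was flagged at the time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout; a real engagement would use the client's schema.
    tx_id: str
    account: str
    day: date
    amount: float


THRESHOLD = 10_000.0  # hypothetical reporting threshold, for illustration only


def rule_large_amount(tx):
    """Hypothetical rule: flag any transaction at or above the threshold."""
    return tx.amount >= THRESHOLD


def apply_rules(transactions, rules):
    """Return the IDs of transactions that trigger at least one rule."""
    return {tx.tx_id for tx in transactions if any(rule(tx) for rule in rules)}


def compare(rule_hits, bank_flagged):
    """Split the two sets of IDs into agreements and discrepancies."""
    return {
        "missed_by_bank": rule_hits - bank_flagged,
        "flagged_without_rule": bank_flagged - rule_hits,
        "agreed": rule_hits & bank_flagged,
    }


# Made-up history and made-up list of what the bank flagged at the time.
history = [
    Transaction("t1", "A", date(2012, 1, 5), 12_500.0),
    Transaction("t2", "A", date(2012, 1, 6), 300.0),
    Transaction("t3", "B", date(2012, 2, 1), 10_000.0),
    Transaction("t4", "C", date(2012, 3, 9), 9_900.0),
]
bank_flagged = {"t1", "t4"}

report = compare(apply_rules(history, [rule_large_amount]), bank_flagged)
for name, ids in report.items():
    print(name, sorted(ids))
```

Any transaction that lands in "missed_by_bank" or "flagged_without_rule" is a discrepancy worth a closer look, which is where the forensics comes in.
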
Dr. Khachatryan emphasized the importance of managing relationships with clients, the importance of communication and writing skills, and also that a lot of work that needs to be done is not necessarily easy or fun.

His one-hour presentation began at 2 PM, with a nice reception (and lunch) preceding it. I left at about 3:45 and the discussions were still going strong.

It was a terrific educational experience hearing about what it takes to be a data decathlete (knowing multiple regression, understanding time series, knowing about neural networks, being adept at computer programming, listening carefully to clients' needs, building bridges with clients, and being a good team player).