I work in the areas of:
• big data: deriving insights from large amounts of data
• human data interaction: making our interactions with data easier
• impactful domains: creating impactful data science applications
Our copula-based wind resource estimation work has been covered by MIT News. To read about it:
"Siting wind farms more quickly and cheaply" See MIT news here
"Abnormal" New Wind Farm System Chops Months off Timeline
Tina Casey, CleanTechnica.
"New Wind Speed Prediction Method Could Lead To Cheaper Wind Farms",
Dianne Depra, Tech Times.
The IJCAI paper is available here (results using sensor data from the top of MOS, Boston).
A previous NIPS paper is here (results on data collected from Indiana, Maine, and Nebraska).
Our MOOC dropout/stopout prediction work has been covered by MIT News and other outlets. To read about it:
See news here
Coverage in Campus Technology
Coverage in InformationWeek
A previous blog on edX
The current paper on transfer learning can be accessed here
Previous papers leading up to the transfer learning approach:
Feature Factory: Crowd Sourcing Feature Discovery.
Kalyan Veeramachaneni, Una-May O'Reilly, Kiarash Adl. WIP session at ACM Learning @Scale, 2015.
Likely to stop? Predicting Stopout in Massive Open Online Courses
Colin Taylor*, Kalyan Veeramachaneni, Una-May O'Reilly, arXiv report, August 2014
Towards Feature Engineering at Scale for Data from Massive Open Online Courses
Kalyan Veeramachaneni, Una-May O'Reilly, Colin Taylor, arXiv report, July 2014
Bio: Kalyan is a Research Scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. His primary research interests are in machine learning and building large-scale statistical models that enable discovery from large amounts of data. His research is at the intersection of big data, machine learning, and data science. He co-leads a group called Any Scale learning for all. The group is interested in big data science and machine learning, and comprises 20 members: postdoctoral fellows, graduate students (MEng, S.M., and Ph.D.), and undergraduate students.
Current research: During the past three years I have set out to answer a seemingly simple question: “Why does it take so much time to process, analyze, and derive insights from data?” I ventured into a number of domains (education, medicine, and energy) and designed several novel approaches. This has allowed me to identify critical issues at the very foundation of the way we interact with data, work around barriers, and materialize insights. Consequently, I have founded multiple long-term projects with a vision of making human interaction with data easier. In addition to simply scaling machine learning approaches, novel methods and systems were required. These novel methods include scaling of processes that have a “human in the loop”, identification and storage of intermediate pre-processed data structures for reuse, and the creation of interfaces to exploit such intermediate structures. Ultimately, this has led me to design approaches and methods for automating much of the data science endeavor.
I currently lead projects with three major research themes.