Feature Labs Unveils New Language to Help Data Scientists Define Prediction Problems in Big Data Sets

Defining a good prediction problem is a critical step before starting the label-segment-feature process of big data analysis, since that process can take six to eight months to complete. With that in mind, data scientists and domain experts typically spend a couple of months defining the prediction problem.

But what if computers could define prediction problems in days instead of months? A new company spun off from MIT’s Laboratory for Information and Decision Systems has introduced a language called Trane that promises to do just that.

MIT News describes the commercialization of data analysis research that started in the lab of Kalyan Veeramachaneni. Together with master’s graduates Max Kanter and Benjamin Schreck, Veeramachaneni formed Feature Labs.

The group introduced a marketing example: predicting whether a customer would buy a new product based on the person’s three most recent purchases. To use Trane, the data scientists set up time-series data in tables with columns representing measurements and the times they were made. Row operations compared the measurements using mathematical parameters.
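The table layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `purchases` data, the column names, and the helpers `last_n_purchases` and `greater_than` are hypothetical and are not part of Trane itself.

```python
from datetime import date

# Hypothetical purchase log: each row is one measurement plus the time it was made.
purchases = [
    {"customer": "a", "time": date(2018, 1, 5), "product": "pen", "amount": 3.0},
    {"customer": "a", "time": date(2018, 2, 1), "product": "ink", "amount": 12.5},
    {"customer": "b", "time": date(2018, 2, 3), "product": "pad", "amount": 7.0},
    {"customer": "a", "time": date(2018, 3, 9), "product": "pad", "amount": 7.0},
    {"customer": "a", "time": date(2018, 4, 2), "product": "bag", "amount": 40.0},
]

def last_n_purchases(rows, customer, n=3):
    """Return the customer's n most recent rows, oldest first."""
    own = sorted(
        (r for r in rows if r["customer"] == customer),
        key=lambda r: r["time"],
    )
    return own[-n:]

def greater_than(rows, column, threshold):
    """Illustrative row operation: keep rows whose value exceeds a numeric parameter."""
    return [r for r in rows if r[column] > threshold]

recent = last_n_purchases(purchases, "a")
large_recent = greater_than(recent, "amount", 10.0)
```

Here `recent` holds the three latest purchases for customer `a`, and `large_recent` narrows them to purchases above an arbitrary threshold of 10.0.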

Trane then exhaustively searches combinations of such operations to determine what questions can be asked of the data. To test its utility, the researchers limited the number of sequential operations that could be performed on the data set to five. Those operations were drawn from a set of 11 column operations and six row operations.
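To give a feel for what an exhaustive search over operation sequences involves, here is a minimal Python sketch. The operation names in `ROW_OPS` and `COLUMN_OPS` are hypothetical placeholders, the sets are deliberately smaller than those used in the study, and the enumeration is plain brute force; it is not Trane's actual algorithm, which the article does not describe in detail.

```python
from itertools import product

# Hypothetical operation names, chosen only for illustration.
ROW_OPS = ["greater_than", "less_than", "equals"]
COLUMN_OPS = ["sum", "mean", "count", "last"]

def enumerate_sequences(ops, max_len):
    """Yield every ordered sequence of operations of length 1..max_len."""
    for length in range(1, max_len + 1):
        yield from product(ops, repeat=length)

def search_space_size(n_ops, max_len):
    """Number of sequences the brute-force enumeration will produce."""
    return sum(n_ops ** length for length in range(1, max_len + 1))

all_ops = ROW_OPS + COLUMN_OPS
candidates = list(enumerate_sequences(all_ops, max_len=2))
```

Each tuple in `candidates` is one candidate sequence of operations. This sketch makes no attempt to discard sequences that would not form a meaningful question, and the count grows quickly with the length limit, which is one reason to cap the number of sequential operations.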

Even with those limits, the language reproduced every question that the scientists had asked, plus hundreds of others that they had not. The use of Trane should enable domain experts to specify their problems much more precisely, according to Kiri Wagstaff, an AI and machine learning expert from NASA who was not involved with the research.

