Feature Labs Unveils New Language to Help Data Scientists Define Prediction Problems in Big Data Sets

Defining a good prediction problem is a critical step before starting the label-segment-feature process of big data analysis, since that process can take 6-8 months to complete. With that in mind, data scientists and domain experts typically spend a couple of months defining prediction problems.

But what if computers could define prediction problems in just days instead of months? A new company spun off from MIT’s Laboratory for Information and Decision Systems has introduced a language called Trane that promises to do just that.

MIT News describes the commercialization of data analysis research started in the lab of Kalyan Veeramachaneni. Veeramachaneni and master’s graduates Max Kanter and Benjamin Schreck formed Feature Labs.

The group introduced a marketing example: predicting whether a customer would buy a new product based on the person’s three most recent purchases. To use Trane, the data scientists set up time-series data in tables with columns representing measurements and the times they were made. Row operations compared the measurements using mathematical parameters.

Trane then exhaustively searches combinations of such operations to determine what questions can be asked of the data. To test its utility, the researchers limited the number of sequential operations that could be performed on the data set to five, with the operations drawn from a set of 11 column operations and six row operations.
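The exhaustive search described above can be pictured with a small sketch. This is not Trane's actual syntax or API, which the article does not show. The table contents, the operation names, and the helper functions below are all hypothetical, chosen only to illustrate how chaining a few row and column operations already produces many candidate prediction problems.

```python
from itertools import product

# Toy time-series table: one dict per measurement (all values are made up).
purchases = [
    {"customer": "a", "time": 1, "amount": 20.0},
    {"customer": "a", "time": 2, "amount": 35.0},
    {"customer": "a", "time": 3, "amount": 5.0},
    {"customer": "b", "time": 1, "amount": 50.0},
    {"customer": "b", "time": 2, "amount": 10.0},
]

# Hypothetical row operations: each keeps a subset of the rows.
ROW_OPS = {
    "all_rows": lambda rows: rows,
    "amount_gt_15": lambda rows: [r for r in rows if r["amount"] > 15],
}

# Hypothetical column operations: each reduces the "amount" column to one number.
COLUMN_OPS = {
    "count": lambda rows: float(len(rows)),
    "sum": lambda rows: sum(r["amount"] for r in rows),
    "max": lambda rows: max((r["amount"] for r in rows), default=0.0),
}


def enumerate_problems(max_row_ops=2):
    """List every chain of 1..max_row_ops row operations ending in one column operation."""
    problems = []
    for n in range(1, max_row_ops + 1):
        for row_seq in product(ROW_OPS, repeat=n):
            for col in COLUMN_OPS:
                problems.append(row_seq + (col,))
    return problems


def compute_label(problem, rows):
    """Apply the row operations in order, then the final column operation."""
    *row_seq, col = problem
    for name in row_seq:
        rows = ROW_OPS[name](rows)
    return COLUMN_OPS[col](rows)


def labels_by_customer(problem, table):
    """Group the table by customer and compute one label per customer."""
    grouped = {}
    for r in table:
        grouped.setdefault(r["customer"], []).append(r)
    return {c: compute_label(problem, rows) for c, rows in grouped.items()}


problems = enumerate_problems()
print(len(problems))  # 18 candidate problems from only 2 row ops and 3 column ops
print(labels_by_customer(("amount_gt_15", "sum"), purchases))
```

Even this toy setup yields 18 candidate problems, which hints at why a cap on the number of sequential operations keeps the search manageable.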

Even with those limits, the language reproduced every question that the scientists had asked, along with hundreds of others that they had not. Trane should enable domain experts to specify their problems much more precisely, according to Kiri Wagstaff, an AI and machine learning expert at NASA who was not involved with the research.