MIT researchers have taught a computer to predict people's actions better than people themselves

Researchers at MIT have developed Data Science Machine, an algorithm for analyzing data sets that can pick out, from the available parameters, those relevant to predicting future trends. In tests, its forecasts were more accurate than those of most of the people who had previously been given similar tasks, MIT News reports.

To test the system, MIT researchers used three separate competitions in which, in addition to the computer, 906 teams of people took part.

Data Science Machine outperformed 615 of them. In one competition the accuracy of its predictions was 96% of the leader's, in the second 94%, and in the third 87%. The "human" teams spent months developing their prediction techniques, while Data Science Machine took between two and 12 hours.

In one competition, the teams had to use data on students' visits to the MIT site to estimate the probability that a student would drop out within the next ten days. The main factors turned out to be how long before the deadline a student starts working on an assignment, and how much time the student spends on the site studying a particular subject compared with classmates. The MIT site did not record these figures directly, but Data Science Machine, like other participants in the experiment, was able to derive them by analyzing the entire data set.

In another competition, the task was to predict how effective the placement of wind power plants would be, based on data from weather stations. Data Science Machine produced a forecast for the power plants two years ahead that was three times more accurate than those of professional consultants and analysts in the energy sector.

Data Science Machine's main task is to select the variables that matter most for the analysis. Programmers do not specify them by hand: the algorithm determines them itself by analyzing correlations in the data and applying machine learning.
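The article does not say exactly how the correlation analysis is carried out. As a rough illustration only, the Python sketch below ranks candidate variables by the absolute value of their Pearson correlation with the outcome being predicted. The function names, variable names and sample numbers are hypothetical and are not taken from Data Science Machine.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no information about the target.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(candidates, target):
    """Return candidate names, strongest absolute correlation first."""
    scores = {
        name: abs(pearson(values, target))
        for name, values in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical candidate variables for five students.
CANDIDATES = {
    "hours_before_deadline": [48, 30, 12, 2, 1],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
# Hypothetical outcome: 1 if the student dropped out, 0 otherwise.
TARGET = [0, 0, 0, 1, 1]

RANKING = rank_by_correlation(CANDIDATES, TARGET)
print(RANKING)
```

A filter like this only captures linear, one-variable-at-a-time relationships, so it should be read as a simplified stand-in for the selection step rather than a description of it.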

For example, the system may have at its disposal several databases with purchase records. One of them might have two columns: an item number and its price. Another might hold the list of items bought by a particular customer.

By comparing the two databases, the system detects the matching item numbers and builds a relationship between the tables. Based on that relationship, Data Science Machine calculates the total cost of an order, the average cost per order, the minimum cost per order and other variables that may help with future predictions. The algorithm then sorts and combines these variables, making predictions on a small data set and gradually improving its accuracy.
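The purchase example can be sketched in a few lines of Python. This is a minimal illustration under assumed data, not the system's actual code: the two tables, their contents and the function name order_features are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical first table: item number -> price.
PRICES = {101: 2.50, 102: 10.00, 103: 4.00}

# Hypothetical second table: one (order id, item number) row per item bought.
ORDER_ITEMS = [
    ("A", 101),
    ("A", 102),
    ("B", 103),
    ("B", 103),
    ("B", 101),
]


def order_features(prices, order_items):
    """Join the tables on item number, then aggregate prices per order."""
    costs_by_order = defaultdict(list)
    for order_id, item_number in order_items:
        # The shared item number is the relationship between the two tables.
        costs_by_order[order_id].append(prices[item_number])

    return {
        order_id: {
            "total": sum(costs),
            "average": mean(costs),
            "minimum": min(costs),
        }
        for order_id, costs in costs_by_order.items()
    }


FEATURES = order_features(PRICES, ORDER_ITEMS)
for order_id, values in FEATURES.items():
    print(order_id, values)
```

Each aggregate becomes one candidate variable per order. In this sketch the set of aggregates (sum, mean, minimum) is fixed by hand, whereas the article describes the system as generating and combining such variables on its own.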

19 October 2015

