Predicting Events and Data Mining - Forward to the Future
An interesting service for monitoring open sources of information has appeared on the Web - Recorded Future.
It aggregates information from more than 150,000 different media sources, keeps an archive going back up to 5 years, and lets you analyze that archive to extract knowledge about the possible consequences of an incident and about future events.
The author of the service is Chris Holden, who kindly offered to let us use Recorded Future free of charge, although the full functionality is available only on a commercial basis.
For example, the service currently runs continuous monitoring of more than 8,000 political leaders from countries around the world, letting you keep track of where a prominent figure is going to travel and why. Good analysis of these events can sometimes reveal connections in international relations and, by examining the travel history of a selected person, suggest the most likely scenarios for how those relations will develop.
The most interesting cases demonstrating the system's capabilities are shown in the following application examples:
The service is applicable well beyond the analysis of the geopolitical situation, terrorism and protest activity. It is equally suitable for monitoring corporate news and information about competing companies, their products and how they are covered in the press.
The analytics let you track events such as the emergence of a new technology, the signing of contracts, or changes to a company's board of directors or key personnel. Together with the ability to evaluate emotional coloring (positive, negative), this already makes it a very powerful and convenient analytical tool:
Futures - “What Apple has outlined for 2012/2013”
Forecast of protest activity against the Russian Federation in August 2012
Example of creating a query (Python):
import json
import sys
import time
import urllib.parse
import urllib.request
import zlib


def query(q, usecompression=True):
    """Send the JSON query string q to the API and return the parsed JSON response."""
    try:
        url = 'http://api.recordedfuture.com/ws/rfq/instances?%s'
        if usecompression:
            url = url + '&compress=1'
        data = None
        for i in range(3):  # make up to three attempts
            try:
                data = urllib.request.urlopen(url % urllib.parse.urlencode({"q": q})).read()
                if usecompression:
                    data = zlib.decompress(data)
                break
            except Exception:
                print("Retrying failed API call.", file=sys.stderr)
                time.sleep(1)
        res = json.loads(data)
        if res['status'] != "SUCCESS":
            print("Error", str(res['errors']), file=sys.stderr)
        return res
    except Exception as e:
        # Any failure (including three failed attempts) is reported as a FAILURE result.
        print(str(e))
        return {'status': 'FAILURE', 'errors': str(e)}
The idea behind the service is very simple: dates in various notations (numeric, verbal) are extracted from all sources, and the events associated with them are recorded. The service also analyzes when exactly each event is expected to occur (“soon”, “in a few months”, “in the distant future”). It constantly sends updates on the most interesting areas to track.
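To make the date-extraction idea more concrete, here is a minimal sketch in Python. It is not the service's actual algorithm, only a toy illustration under stated assumptions: the function name extract_event_dates, the two date patterns, and the day offsets assigned to vague phrases such as “soon” are all hypothetical choices made for this example.

```python
import re
from datetime import date, timedelta

# Hypothetical offsets (in days) for vague phrases; the values are arbitrary assumptions.
VAGUE_OFFSETS = {
    "soon": 14,
    "in a few months": 90,
    "in the distant future": 3650,
}

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# Numeric notation: 2012-08-15
NUMERIC_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
# Verbal notation: August 2012
MONTH_YEAR = re.compile(r"\b(" + "|".join(MONTH_NAMES) + r")\s+(\d{4})\b", re.IGNORECASE)


def extract_event_dates(sentence, published):
    """Return (date, precision) pairs for every date expression found in the sentence.

    `published` is the publication date of the source; vague phrases are
    resolved relative to it.
    """
    found = []
    for year, month, day in NUMERIC_DATE.findall(sentence):
        try:
            found.append((date(int(year), int(month), int(day)), "day"))
        except ValueError:
            continue  # skip impossible dates such as 2012-13-45
    for name, year in MONTH_YEAR.findall(sentence):
        found.append((date(int(year), MONTHS[name.lower()], 1), "month"))
    lowered = sentence.lower()
    for phrase, days in VAGUE_OFFSETS.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.append((published + timedelta(days=days), "vague"))
    return found
```

Each result carries a precision label ("day", "month" or "vague"), so later analysis can treat a firm calendar date differently from a rough estimate derived from a phrase like “soon”.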
Processing this information falls on the programmer’s shoulders, except for the “positive”/“negative” evaluation, which the service provides. Such a resource makes it possible to build a fairly powerful and effective tool for competitive analysis and to use it for BI purposes.
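As an illustration of the kind of processing left to the programmer, the sketch below aggregates sentiment per company from a list of already-parsed records. The record layout (the keys "entity", "positive" and "negative") is purely hypothetical and is not the service's response format; the real fields would have to be mapped onto this shape first.

```python
from collections import defaultdict


def summarize_by_entity(records):
    """Count mentions and compute the average net sentiment for each entity.

    Each record is a dict with the hypothetical keys "entity", "positive"
    and "negative"; missing scores are treated as 0.0.
    """
    totals = defaultdict(lambda: {"mentions": 0, "positive": 0.0, "negative": 0.0})
    for rec in records:
        entry = totals[rec["entity"]]
        entry["mentions"] += 1
        entry["positive"] += rec.get("positive", 0.0)
        entry["negative"] += rec.get("negative", 0.0)

    summary = {}
    for entity, entry in totals.items():
        mentions = entry["mentions"]
        summary[entity] = {
            "mentions": mentions,
            # Average of (positive - negative) over all mentions of the entity.
            "net_sentiment": (entry["positive"] - entry["negative"]) / mentions,
        }
    return summary
```

A summary like this could feed a BI dashboard directly, for example to compare how often competing companies are mentioned and in what tone.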