Every day, private companies generate large amounts of data, and the public sector makes openly available information that can be used to discover and create new applications. Add the potential of cloud computing, and a new world of possibilities opens up for developers.
However, although Big Data may seem like a recent boom, the processing of large volumes of data has been around for decades. The problem was that these tasks required dedicated hardware, such as supercomputers, which was of course very expensive. They also required special software and developers with a range of programming and analytical skills.
Over the years, the collection and analysis of large amounts of data has become a more affordable and common task for companies. This has generated a growing demand for systems and programming environments that run on commodity hardware and can be operated by programmers and analysts without extraordinary skills. Nowadays there are many options for managing information and new ways to capture the value of data.
Because of this, and according to AppCircus (a global platform supporting the development of mobile apps), over the next few years we will see a surge in applications that offer new and better functionalities to users based on the analysis of Big Data, such as systems with increasingly accurate music recommendations, very detailed meteorological data or a customised mobile experience depending on the user’s preferences.
As the variety and volume of data grow, so do the opportunities to create new applications that leverage the Big Data ecosystem. This is evidenced by competitions like Innova Challenge Big Data, which aims to find, in collaboration with the developer community, new and innovative ways to harness the possibilities offered by massive data processing. Some examples are already a reality, such as the winner of the last edition, Qkly, a trip planner that reduces waiting times in stores and payment services. This application uses information from the API of BBVA (which opened its data for this competition) to automatically produce crowding forecasts. It shows customers estimates of the times of day at which a particular place is most crowded, and lets them create instant optimised schedules to avoid queues.
According to AppCircus, “the mobile applications sector is offering users increasingly useful, simple and fast apps, a process in which a key determinant will be the ability of developers to implement solutions that make it possible to systematize and analyze data from several sources in order to convert it into useful information through Big Data tools.” This trend is expected to keep growing in 2014.
More than ever, developers need access to analysis tools within Big Data development environments. Here is a small selection of the many tools available.
Hadoop
Hadoop is perhaps the most popular of these tools. It is a Java-based framework that allows distributed processing of large data sets across clusters of computers using simple programming models. Rather than relying on hardware for high availability, this software library is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
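To give a feel for what a "simple programming model" can look like, here is a minimal sketch of a MapReduce-style word count, the model most commonly associated with Hadoop. It runs on a single machine in plain Python and does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

In a real cluster the map and reduce functions would run in parallel on different machines, with the framework handling the grouping step and any retries after a node fails. The sketch only shows how little logic the developer has to write.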
Google BigQuery and Prediction API
Other tools you may find helpful are Google BigQuery and the Prediction API. The first allows you to query large volumes of data interactively, in close to real time. You can access BigQuery from a browser or a command-line tool, or by making calls to the BigQuery REST API through client libraries for languages such as Java, PHP or Python. There are also several third-party tools that can interact with BigQuery, such as data visualisation tools.
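As an illustration of the REST route, the sketch below builds (but does not send) a query request using only the Python standard library. It assumes the `jobs.query` endpoint of version 2 of the BigQuery REST API and an OAuth 2.0 bearer token obtained elsewhere; the project ID, the SQL and the token are placeholders, and `build_query_request` is a hypothetical helper, not part of any Google library.

```python
import json
import urllib.request

# Assumed base URL of the BigQuery REST API, version 2.
BIGQUERY_BASE = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(
    project_id: str, sql: str, access_token: str
) -> urllib.request.Request:
    """Build a POST request for a synchronous query; nothing is sent here."""
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BIGQUERY_BASE}/projects/{project_id}/queries",
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace them with a real project and token.
    request = build_query_request("my-project", "SELECT 1", "ACCESS_TOKEN")
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

In practice most developers would use one of the official client libraries rather than hand-built requests, since those also take care of authentication and paging through results.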
The Google Prediction API, for its part, helps you recognise patterns in your application's data and user feedback. For example, Google uses it to offer spam detection, recommendation engines and sentiment analysis. Google also describes all the steps you must follow to include these features in your app.
Amazon Kinesis
Amazon Kinesis is a service for processing streaming data in real time on a massive scale. It allows you to collect and process hundreds of terabytes of data per hour from hundreds of thousands of data sources, so you can easily write applications that process information in real time from sources such as website visits, marketing and financial information, manufacturing tools and social networks, as well as measurement data and operating logs.
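To make "processing information in real time" more concrete, here is a minimal sketch of the kind of logic a stream consumer might run: counting website visits per page over a sliding time window. It is plain Python and does not call the Kinesis API. The record shape (a timestamp in seconds plus a page path) and the `SlidingWindowCounter` class are assumptions made for illustration, and events are assumed to arrive in timestamp order.

```python
from collections import Counter, deque
from typing import Deque, Dict, Tuple

# Hypothetical record shape: (timestamp in seconds, page path).
Event = Tuple[float, str]


class SlidingWindowCounter:
    """Count events per key over the last `window_seconds` of a stream."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Event] = deque()
        self._counts: Counter = Counter()

    def add(self, timestamp: float, key: str) -> None:
        """Record one event, then discard anything older than the window."""
        self._events.append((timestamp, key))
        self._counts[key] += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Events are assumed to be in time order, so the oldest is at the left.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, old_key = self._events.popleft()
            self._counts[old_key] -= 1
            if self._counts[old_key] == 0:
                del self._counts[old_key]

    def counts(self) -> Dict[str, int]:
        """Return the current per-key counts inside the window."""
        return dict(self._counts)


if __name__ == "__main__":
    visits = SlidingWindowCounter(window_seconds=60)
    for ts, page in [(0, "/home"), (10, "/home"), (30, "/pricing"), (65, "/home")]:
        visits.add(ts, page)
    print(visits.counts())
```

A real consumer would read batches of records from the stream instead of a hard-coded list, but the per-record update would look much the same.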