BBVA API Market
Taking advantage of Dark Data is a complicated task. The first step is to identify what data a business has stored but is not analyzing; the second is to estimate the potential of that data before embarking on development work to extract it.
Developing our own utilities that perfectly fit our needs can be excessively labor-intensive, especially if we cannot tell in advance what value we will ultimately get, whether in immediate revenue or in added value for other parts of the business. Fortunately, there are multiple tools and APIs for working with and exploring this mass of data.
A clear example of Dark Data is the content of the videos that many platforms host. Analysis usually focuses on the metadata surrounding the video, such as the title, date, duration or tags generated automatically or applied by humans.
With OpenWhisk you can analyze the content of each scene in a video. The application extracts individual shots and, in parallel, identifies what happens in each of them: who appears, what text is shown, what is depicted, what objects can be seen, and so on.
This is what IBM calls Dark Vision. Once data about each scene is available, the video can be searched, categorized and improved on the basis of what it actually shows rather than its metadata alone.
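The shot-by-shot approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Dark Vision's code or the OpenWhisk API: frames are assumed to be plain lists of pixel brightness values, and the names split_into_shots, describe_shot and analyze_video, as well as the threshold of 50, are hypothetical. In a real system describe_shot would call a visual recognition service instead of computing a brightness average.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shots(frames, threshold=50.0):
    """Group consecutive frames into shots.

    A new shot starts when the mean absolute difference between two
    neighbouring frames exceeds `threshold` (an assumed value).
    """
    if not frames:
        return []
    shots = []
    current = [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            shots.append(current)  # close the shot at the cut
            current = []
        current.append(cur)
    shots.append(current)
    return shots


def describe_shot(shot):
    """Stand-in for a recognition service: summarize the middle frame."""
    key_frame = shot[len(shot) // 2]
    return {
        "frames": len(shot),
        "mean_brightness": sum(key_frame) / len(key_frame),
    }


def analyze_video(frames):
    """Split the video into shots and describe each shot in parallel."""
    shots = split_into_shots(frames)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(describe_shot, shots))


# Toy "video": three dark frames followed by two bright ones.
frames = [[10, 10, 10, 10]] * 3 + [[200, 200, 200, 200]] * 2
results = analyze_video(frames)
```

Each shot is independent of the others, which is why the per-shot analysis can be handed to a pool of workers, or to separate serverless functions, without any coordination between them.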
Researchers at Stanford University in California created DeepDive, another system for extracting structured data. Its main advantage is that it creates SQL tables from data extracted from documents. Several universities and research groups have used the platform to categorize completely disorganized bodies of data, with surprising results.
It is a qualitative leap compared with other platforms and software that rely on identifying the data manually at the outset. DeepDive automates much of the process with machine learning. It lets the team in charge of the analysis define the objectives to be achieved instead of programming concrete, specific tasks. Once these objectives are clear, the system begins analysis and extraction.
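To make the idea of turning documents into SQL tables concrete, here is a minimal sketch using only Python's standard library. It does not use DeepDive or its API: a hand-written regular expression stands in for the extractors that DeepDive learns, and the documents, the employment table and the build_table function are all invented for illustration.

```python
import re
import sqlite3

# Invented sample documents (unstructured text).
documents = {
    "memo-1": "Ana Ruiz works at Acme. Luis Vega works at Globex.",
    "memo-2": "Marta Gil works at Initech.",
}

# Hand-written pattern standing in for a learned extractor.
PATTERN = re.compile(r"([A-Z][a-z]+ [A-Z][a-z]+) works at ([A-Z][a-z]+)")


def build_table(docs):
    """Extract (person, company) pairs and store them in a SQL table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employment (doc_id TEXT, person TEXT, company TEXT)"
    )
    for doc_id, text in docs.items():
        rows = [(doc_id, person, company)
                for person, company in PATTERN.findall(text)]
        conn.executemany("INSERT INTO employment VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = build_table(documents)
rows = conn.execute(
    "SELECT person, company FROM employment ORDER BY person"
).fetchall()
```

Once the facts are in a table, the rest of the analysis is ordinary SQL, which is the practical benefit the article attributes to this kind of extraction.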
The developers of DeepDive designed it to tolerate inaccuracies and to make sense of ambiguous data. For example, it understands that two terms are the same even though one contains spelling mistakes.
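One simple way to treat misspelled terms as the same is approximate string matching. The sketch below uses SequenceMatcher from Python's standard difflib module; it is not how DeepDive resolves ambiguity (the article does not describe that mechanism), and the same_term and canonical helpers and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def same_term(a, b, threshold=0.85):
    """Return True when two terms are similar enough to be treated as one.

    The similarity ratio runs from 0.0 to 1.0; the threshold is an
    assumed value that would need tuning on real data.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def canonical(term, known_terms, threshold=0.85):
    """Map a possibly misspelled term to the first known term it matches."""
    for known in known_terms:
        if same_term(term, known, threshold):
            return known
    return term  # nothing close enough: keep the term as it is


known_cities = ["Madrid", "Barcelona"]
fixed = canonical("Barcelonna", known_cities)
unknown = canonical("Lisbon", known_cities)
```

Here "Barcelonna" is mapped to "Barcelona", while "Lisbon" is left untouched because it does not resemble any known term closely enough.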
Experts in Dark Data say the first step is “restoring the context”: analyzing each piece of data by recreating the situation that existed before it was stored. These techniques can greatly improve the chances of a successful analysis.
Each business is different, and the Dark Data generated by a bank is very different from that of a law firm, a social network or an e-commerce site. “Lighting up” dark data poses many technical challenges, and solutions can range from applying a better methodology to existing development to hiring a dedicated specialist team if the hidden value is expected to be huge.
In fact, the best situation is for data to remain structured from the moment it is gathered, so that it never becomes Dark Data through technical negligence. If the technical resources are in place, no data should be written off as lost once stored.
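As a rough illustration of keeping data structured from the moment it is gathered, the sketch below wraps each raw input with its context (where it came from, when, and what it represents) before storing it. The Record fields, the ingest function, the "key=value;key=value" input format and the sample values are all hypothetical; a real system would use whatever schema fits its own data.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Record:
    source: str       # system that produced the data
    captured_at: str  # ISO 8601 timestamp of capture
    kind: str         # what the payload represents
    payload: dict     # the parsed, structured content


def ingest(raw_line, source, captured_at, kind):
    """Parse a raw 'key=value;key=value' line and attach its context."""
    fields = dict(
        pair.split("=", 1) for pair in raw_line.split(";") if "=" in pair
    )
    return Record(source=source, captured_at=captured_at,
                  kind=kind, payload=fields)


record = ingest(
    "customer=123;branch=Madrid",
    source="branch-kiosk",
    captured_at="2017-05-04T10:15:00Z",
    kind="branch_visit",
)
# Serialize with the context included, ready to be stored.
stored = json.dumps(asdict(record), sort_keys=True)
```

Because the context travels with the data, nobody has to reconstruct it later, which is exactly the work that “restoring the context” would otherwise require.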