Thursday, October 16, 2014

Making Use of Big Data


A lady pregnant with twins gets a call from the local supermarket about a 50% offer on baby items on Thursday between 10 o’clock in the morning and midday. On arriving at the supermarket, she is led to a section where all the baby items suited for twins have been carefully arranged. She has an easy time picking the necessary items and heads to the till. As she walks out, she asks, “How did you know I was going to give birth to twins?” The manager informs her that they had noticed her shopping trends: she would buy baby items in twos in all the supermarkets of the chain. Your favourite supermarket might do this for you soon. This, in essence, is customer profiling based on Big Data analysis.
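The manager's reasoning can be sketched in a few lines of Python. This is only an illustration: the till records, the customer IDs, the function name `likely_expecting_twins` and both thresholds are made up for the example and do not describe how any real supermarket profiles its customers.

```python
from collections import defaultdict

# Hypothetical till records: (customer_id, store, item_category, quantity).
PURCHASES = [
    ("C001", "Store A", "baby", 2),
    ("C001", "Store B", "baby", 2),
    ("C001", "Store A", "groceries", 5),
    ("C001", "Store C", "baby", 2),
    ("C002", "Store A", "baby", 1),
    ("C002", "Store B", "baby", 2),
    ("C003", "Store B", "groceries", 3),
]


def likely_expecting_twins(purchases, min_purchases=3, min_share=0.8):
    """Return customers whose baby-item purchases are mostly in pairs.

    Both thresholds are illustrative assumptions, not retailer figures.
    """
    totals = defaultdict(int)  # baby-item purchases per customer
    pairs = defaultdict(int)   # of those, how many were bought in twos
    for customer, _store, category, quantity in purchases:
        if category != "baby":
            continue
        totals[customer] += 1
        if quantity == 2:
            pairs[customer] += 1
    return sorted(
        customer
        for customer, total in totals.items()
        if total >= min_purchases and pairs[customer] / total >= min_share
    )


print(likely_expecting_twins(PURCHASES))
```

Requiring a minimum number of purchases keeps a single coincidental purchase of two items from triggering the offer.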

Citing an example closer to home, M-Shwari uses a credit scoring algorithm based on your credit history, as well as your usage of M-PESA and other Safaricom products, to determine your loan eligibility and maximum loan amount. In Tanzania, Vodacom Tanzania has partnered with a New York-based company, First Access, to enhance financial inclusion. First Access is a data analytics company that can predict risk for consumers who have never had a bank account or a credit score, using their prepaid mobile data. It offers the first instant risk-scoring tool for financial institutions serving low-income customers. Scores are authorized by subscribers in real time via text message, through an approval system that First Access has pioneered. The scores are based on demographic, geographic, financial and social network data from subscriber mobile records. Delivered in real time to any participating financial institution, they include a recommended loan size in the local currency and eligibility for instant disbursal.

In some cases, these analytics can be used to prevent calamities. During the recent Ebola outbreak in West Africa, Orange Telecom in Senegal handed over anonymised voice and text data from 150,000 mobile phones to Flowminder, a Swedish non-profit organisation, which was then able to draw up detailed maps of typical population movements in the region. Authorities could then see where best to set up treatment centres and, more controversially, the most effective ways to restrict travel in an attempt to contain the disease. In developed countries, port, train and flight data, together with number plate recognition, can help track potentially infected people and identify who they may have come into contact with.

To most of us, Big Data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications. Every day, we leave traces of data through social media, telephone calls, bank transactions, mobile money transactions, hospital visits, credit card transactions, supermarket tills, fast food restaurants, online trading and much more. The sheer size of the data generated on a daily basis is amazing. Google and eBay each generate approximately 100 petabytes (100,000 terabytes) of data per day. This compares with 600 terabytes per day at Facebook and 100 terabytes per day at Twitter. Volumes like these present challenges in analysis, capture, curation, search, sharing, storage, transfer and visualization, as well as the risk of privacy violations.
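To put these volumes side by side, the short Python sketch below converts them all to terabytes and compares each company with Twitter. The figures are simply the ones quoted above and have not been independently checked; the conversion assumes the decimal convention of 1 petabyte = 1,000 terabytes, and the names `daily_volume_tb` and `ratio_to` are made up for the example.

```python
TB_PER_PB = 1000  # decimal convention: 1 petabyte = 1,000 terabytes

# Daily data volumes in terabytes, as quoted in the paragraph above.
daily_volume_tb = {
    "Google": 100 * TB_PER_PB,
    "eBay": 100 * TB_PER_PB,
    "Facebook": 600,
    "Twitter": 100,
}


def ratio_to(baseline, volumes):
    """Express every volume as a multiple of the baseline company's volume."""
    base = volumes[baseline]
    return {name: tb / base for name, tb in volumes.items()}


for name, ratio in ratio_to("Twitter", daily_volume_tb).items():
    print(f"{name}: {daily_volume_tb[name]:,} TB per day, {ratio:g}x Twitter")
```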

In today's environment of data abundance and frequent data overload, Big Data analytics gives organizations the ability to discover unique insights. This improves decision making, which in turn helps them take advantage of opportunities, minimize risks and control costs. Big Data analytics is about asking new questions, formulating new hypotheses, improving customer experience, exploring and discovering, and making data-driven decisions. It is also about bringing together many different data sources and mining them to find patterns. The first task is to break down data silos in order to access all the data an organization stores in different places and often in different systems. A second task is to create platforms that can pull in unstructured data as easily as structured data. Specialized software tools and applications are then used for predictive analytics, data mining, text mining, forecasting and data optimization.
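The two tasks described above can be pictured with a toy example in Python. Everything in it is an assumption made for illustration: the two "silos", the customer IDs, the `build_profiles` function and the keyword count, which stands in for real text mining. Actual Big Data platforms work at a very different scale and with specialized tools.

```python
from collections import defaultdict

# Hypothetical silo 1: structured till transactions.
TRANSACTIONS = [
    {"customer_id": "C001", "amount": 1200},
    {"customer_id": "C002", "amount": 300},
    {"customer_id": "C001", "amount": 800},
]

# Hypothetical silo 2: unstructured free-text customer feedback.
FEEDBACK = [
    ("C001", "Long queue at the till, but great offers on baby items"),
    ("C002", "Could not find the baby section"),
]


def build_profiles(transactions, feedback, keyword="baby"):
    """Merge both sources into one profile per customer."""
    profiles = defaultdict(lambda: {"total_spend": 0, "keyword_mentions": 0})
    for row in transactions:
        profiles[row["customer_id"]]["total_spend"] += row["amount"]
    for customer_id, text in feedback:
        # Very crude text mining: count how often the keyword appears.
        profiles[customer_id]["keyword_mentions"] += text.lower().count(keyword)
    return dict(profiles)


for customer_id, profile in sorted(build_profiles(TRANSACTIONS, FEEDBACK).items()):
    print(customer_id, profile)
```

Keying both sources on the same customer identifier is what allows the silos to be joined; without a shared key, the records could not be matched reliably.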