A lady pregnant with
twins gets a call from the local supermarket about a 50% offer on baby items on
Thursday between 10 o’clock in the morning and midday. On arriving at the
supermarket, she is led to a section where all the baby items suited for twins
are carefully arranged. She has an easy time picking the necessary items and
heads to the till. As she walks out, she asks, “How did you know I was going to
give birth to twins?” The manager informs her that they had noticed her
shopping trends: she had been buying baby items in twos in all the supermarkets of
the chain. Maybe your favourite supermarket will do this for you soon. This,
in essence, is customer profiling from Big Data analysis.
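
To make the idea concrete, here is a minimal sketch of how such a profile might be derived from purchase records. It is only an illustration: the record layout, the sample data, the function name likely_expecting_twins and the thresholds min_purchases and min_share are all assumptions, not a description of any real supermarket's system.

```python
from collections import defaultdict

# Hypothetical purchase records: (customer_id, store, category, quantity).
PURCHASES = [
    ("c1", "store_a", "baby", 2),
    ("c1", "store_b", "baby", 2),
    ("c1", "store_a", "groceries", 5),
    ("c2", "store_a", "baby", 1),
    ("c2", "store_c", "baby", 2),
    ("c3", "store_b", "groceries", 3),
]


def likely_expecting_twins(purchases, min_purchases=2, min_share=0.8):
    """Return customers whose baby purchases are mostly in even quantities.

    The thresholds are illustrative assumptions, not tuned values.
    """
    totals = defaultdict(int)  # baby purchases per customer, across all stores
    pairs = defaultdict(int)   # baby purchases made in even quantities
    for customer, _store, category, quantity in purchases:
        if category != "baby":
            continue
        totals[customer] += 1
        if quantity % 2 == 0:
            pairs[customer] += 1
    return sorted(
        customer
        for customer, count in totals.items()
        if count >= min_purchases and pairs[customer] / count >= min_share
    )


flagged = likely_expecting_twins(PURCHASES)
print(flagged)
```

Note that the records are pooled across every store in the chain before any conclusion is drawn, which is what allowed the manager in the story to spot the pattern.
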
For an example
closer to home, M-Shwari uses a credit-scoring algorithm based on your credit
history as well as usage of M-PESA and other Safaricom products, to determine
your loan eligibility and maximum loan amount. In Tanzania, Vodacom Tanzania
has partnered with a New York-based company, First Access, to enhance financial
inclusion. First Access is a data analytics company that can predict risk for
consumers who have never had a bank account or a credit score, using their
prepaid mobile data. First Access offers the first instant risk-scoring tool
for financial institutions serving low-income customers. Scores are authorized
by subscribers in real time via text message through an approval system that
First Access has pioneered. The scores are based on demographic, geographic,
financial and social network data from subscriber mobile records. Delivered in
real time to any participating financial institution, they include a
recommended loan size in the local currency and eligibility for instant disbursal.
In some cases, these analytics can be used to
prevent calamities. During the recent Ebola outbreak in West Africa, Orange
Telecom in Senegal handed over anonymised voice and text data from 150,000
mobile phones to Flowminder, a Swedish non-profit organisation, which was then
able to draw up detailed maps of typical population movements in the region. Authorities
could then see where the best places were to set up treatment centres, and more
controversially, the most effective ways to restrict travel in an attempt to
contain the disease. In developed countries, port, train and flight data, along with
number plate recognition, can all help track potentially infected people and
identify whom they may have come into contact with.
To most of us, Big Data is an all-encompassing term for
any collection of data sets so large and complex that it becomes difficult to
process using traditional data processing applications. Every day, we leave
traces of data on social media, telephone calls, bank transactions, mobile
money transactions, hospital visits, credit card transactions, supermarket
tills, fast food restaurants, online trading and many more. The sheer volume of
data generated on a daily basis is staggering. Google and eBay each generate
approximately 100 petabytes, equal to 100,000 terabytes, of data per day. This
compares with Facebook at 600 terabytes and Twitter at 100 terabytes per
day. This volume presents challenges such as analysis, capture, curation, search,
sharing, storage, transfer, visualization, and privacy violations.
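
As a quick check of the figures quoted above, the short sketch below converts them to a common unit and compares them. It assumes decimal units (1 petabyte = 1,000 terabytes), which is the convention the 100,000-terabyte figure implies; the names TB_PER_PB, daily_volume_tb and ratio are illustrative only.

```python
TB_PER_PB = 1_000  # assumption: decimal (SI) units, 1 PB = 1,000 TB

# Daily volumes as quoted in the text, expressed in terabytes.
daily_volume_tb = {
    "Google": 100 * TB_PER_PB,
    "eBay": 100 * TB_PER_PB,
    "Facebook": 600,
    "Twitter": 100,
}


def ratio(a, b):
    """How many times more data company a generates per day than company b."""
    return daily_volume_tb[a] / daily_volume_tb[b]


for name, terabytes in daily_volume_tb.items():
    print(f"{name}: {terabytes:,} TB per day")

print(f"Google vs Facebook: about {ratio('Google', 'Facebook'):.0f} times")
print(f"Google vs Twitter: about {ratio('Google', 'Twitter'):.0f} times")
```
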
In today's environment of data abundance and
frequent data overload, Big Data analytics lets organizations discover unique
insights and improve decision making, so that they can take advantage of
opportunities, minimize risks, and control costs. Big Data
analytics is about asking new questions, formulating new
hypotheses, improving customer experience, exploring and discovering, and making
data-driven decisions. It is also about bringing together many
different data sources and mining them to find patterns. The first task is to
break down data silos to access all the data an organization stores in different
places and often in different systems. A second task is to create platforms
that can pull in unstructured data as easily as structured data. Specialized
software tools and applications are then used for predictive analytics, data
mining, text mining, forecasting, and data optimization.
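
The first task described above, bringing data out of separate silos, can be sketched in a few lines. This is a toy illustration under stated assumptions: the two "systems" are plain in-memory structures, and every name and value (crm_records, till_records, combine, spend_by_city, the customers and amounts) is hypothetical.

```python
# Hypothetical customer profiles held in one system, keyed by customer id.
crm_records = {
    "c1": {"name": "Amina", "city": "Nairobi"},
    "c2": {"name": "Brian", "city": "Mombasa"},
}

# Hypothetical till transactions held in a second, separate system.
till_records = [
    {"customer": "c1", "amount": 1200},
    {"customer": "c1", "amount": 800},
    {"customer": "c2", "amount": 300},
    {"customer": "c9", "amount": 50},  # no matching profile in the first system
]


def combine(crm, tills):
    """Join till transactions to customer profiles and total spend per customer."""
    combined = {cid: {**profile, "total_spend": 0} for cid, profile in crm.items()}
    for row in tills:
        profile = combined.get(row["customer"])
        if profile is not None:  # skip transactions that cannot be linked
            profile["total_spend"] += row["amount"]
    return combined


def spend_by_city(combined):
    """A simple pattern that only becomes visible once the two sources are joined."""
    totals = {}
    for profile in combined.values():
        city = profile["city"]
        totals[city] = totals.get(city, 0) + profile["total_spend"]
    return totals


combined = combine(crm_records, till_records)
city_totals = spend_by_city(combined)
print(city_totals)
```

Neither source alone can answer "how much is spent per city"; the question only becomes answerable after the join, which is the point of breaking down silos.
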