Big Data is data that, in addition to being massive in size, is of greater variety and complexity and is generated at high velocity. Collectively, volume, variety, and velocity are referred to as the three Vs of Big Data. However, the concept is relative and highly context-dependent: organizations that lack the ability to handle, store, or analyze their own data sets are in fact experiencing the Big Data phenomenon.
Big Data analytics, on the other hand, is a business strategy that uses technology to gain deeper insights into customers, partners, and businesses, and hence to achieve a competitive advantage. It involves working with data that, because of its size and variety, lies beyond the ability of typical database management systems to store, manage, and analyze. Big Data analytics thus has two dimensions. One is Big Data, which is characterized by the three ‘Vs’: data volume, velocity, and variety. The other is analytics, which refers to the ability to gain insights from data and make decisions by applying analytical methods from mathematics, statistics, data mining, machine learning, visualization, etc.
The emergence of Big Data is closely linked to advances in Information and Communication Technology (ICT). One indicator of this link is the digital footprint that people and things leave behind in forms such as sensor data, commercial transactions, public and private records stored by companies, photos, videos, and tweets, which are considered the main sources of Big Data.
Organizations that want to develop promising Big Data projects need to be engaged in successful ICT plans across different sectors, the most important of which are:
- Cloud computing: The ability to run programs on many distributed computers at the same time, where users pay only for what they use. The service became popular after Amazon introduced its Elastic Compute Cloud (EC2) in 2006. In cloud computing, resources are shared by multiple users and allocated on demand. The goal is to maximize the use of computing resources and increase efficiency. Cloud computing allows small companies to rent large-scale computational units to store and process their data. This is supported by the emergence of platform-as-a-service (PaaS) offerings for Big Data, which help small companies deploy and administer their clusters at reduced cost and effort.
- Internet of Things (IoT): A complex network that seamlessly connects people and things through the Internet. In theory, anything that can be connected (e.g. smart watches, cars, homes, thermostats, vending machines, servers…) will be connected in the near future by sensors and RFID tags, allowing networked objects to send data continuously over the Web from anywhere. The term was first used in 1999 by Kevin Ashton, an RFID expert. Processing the data gathered from IoT poses three challenges. The first is data collection, which involves gathering data from different sources. The second is data storage: collected data can be stored in a relational database, in the cloud, or in a NoSQL database (e.g. MongoDB or CouchDB). The third and toughest challenge is data analysis, that is, how to extract value from the huge amount of collected data. This requires developing applications that can analyze data for patterns, trends, critical points, and so on, which is a time-consuming task, particularly for real-time data processing.
- Artificial Intelligence: Artificial Intelligence (AI) is concerned with designing intelligent systems that exhibit characteristics associated with intelligence in human behavior. Areas stemming from AI include neural networks, time series prediction, classification, evolutionary computation, genetic programming, vision, robotics, expert systems, speech processing, planning, and natural language processing. Importantly, many of these AI methods are increasingly used to process the enormous amounts of data involved in Big Data analytics in a rapidly changing world. Applying AI to Big Data analytics can help businesses make sense of data, better detect correlations between factors, deal with the speed at which information changes, and gain insights from the information they have. This ultimately improves their decision-making.
- Networked systems: Many of the technologies involved in Big Data analytics (e.g. IoT, cloud services, and media distribution) are essentially part of complex networked systems. For Big Data enterprises whose workloads grow in both size and complexity, networked systems will play a crucial role in solving network traffic issues and ensuring that workloads are completed and insights are delivered on time.
- Mobile services: Personal mobility lets us encounter more people and visit more places, which in turn shapes the experiences we gather. Both our own mobility and that of the devices we use help broadcast the locations we temporarily visit and the situations we engage in. We are no longer in just one place or doing just one thing at a time. The sensor industry will continue to benefit from mobility by building sensors that become part of the wireless communication network and provide coverage in areas such as biology, air pollution, weather, moisture, and motion. These units will be installed on mobile devices, placed in the environment, or attached to the body, and will produce (with the help of embedded software and network systems) a large amount of data about body activities, user movements, and user interactions with other people.
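
The three IoT data-processing challenges named in the Internet of Things item above (collection, storage, and analysis) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the device names, the readings, the `readings` table, and the 30.0 threshold are hypothetical, and an in-memory SQLite database stands in for the relational storage option. A real deployment would collect from live sensors and would likely need a streaming or distributed system for real-time analysis.

```python
import sqlite3

# Hypothetical readings from two simulated sources:
# (device id, timestamp, measured temperature in degrees Celsius).
READINGS = [
    ("thermostat-1", 0, 20.5),
    ("thermostat-1", 1, 21.0),
    ("thermostat-1", 2, 35.2),
    ("vending-7", 0, 4.1),
    ("vending-7", 1, 4.3),
    ("vending-7", 2, 4.0),
]


def collect():
    """Step 1, collection: gather readings from the different sources.

    Here the 'sources' are a fixed list; real code would poll devices.
    """
    return list(READINGS)


def store(conn, readings):
    """Step 2, storage: persist the readings in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(device TEXT, ts INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", readings)
    conn.commit()


def analyze(conn, threshold=30.0):
    """Step 3, analysis: per-device averages and readings above a threshold."""
    averages = {
        device: round(avg, 2)
        for device, avg in conn.execute(
            "SELECT device, AVG(value) FROM readings "
            "GROUP BY device ORDER BY device"
        )
    }
    critical = conn.execute(
        "SELECT device, ts, value FROM readings "
        "WHERE value > ? ORDER BY device, ts",
        (threshold,),
    ).fetchall()
    return averages, critical


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
    store(conn, collect())
    averages, critical = analyze(conn)
    print("Average per device:", averages)
    print("Critical readings:", critical)
```

Even in this toy form, the analysis step is the only one that requires deciding what counts as a pattern or a critical point, which is consistent with the text's claim that analysis is the toughest of the three challenges.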