For organizations of all sizes, data management has shifted from an important competency to a critical differentiator that can separate market winners from has-beens. Fortune 1000 companies and government bodies are starting to benefit from the innovations of the web pioneers. These organizations are defining new initiatives and reevaluating existing strategies to examine how they can transform their businesses using Big Data. In the process, they are learning that Big Data is not a single technology, technique or initiative. Rather, it is a trend across many areas of business and technology.
Big Data refers to technologies and initiatives that involve data that is too diverse, fast-changing or massive for conventional technologies, skills and infrastructure to address efficiently. Put differently, the volume, velocity or variety of the data is too great.
But today, new technologies make it possible to realize value from Big Data. For example, retailers can track user web clicks to identify behavioral trends that improve campaigns, pricing and inventory. Utilities can capture household energy usage levels to predict outages and to encourage more efficient energy consumption. Governments and even Google can detect and track the emergence of disease outbreaks via social media signals. Oil and gas companies can use the output of sensors in their drilling equipment to make more efficient and safer drilling decisions.
"Big Data" describes data sets so large and complex that they are impractical to manage with traditional software tools. Specifically, Big Data relates to data creation, storage, retrieval and analysis that is remarkable in terms of volume, velocity, and variety:
Volume. A typical PC might have had 10 gigabytes of storage in 2000. Today, Facebook ingests 500 terabytes of new data every day, and a Boeing 737 will generate 240 terabytes of flight data during a single flight across the US. Smartphones continue to proliferate, along with the data they create and consume, and sensors embedded into everyday objects will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information, including video.
Velocity. Clickstreams and ad impressions capture user behavior at millions of events per second; high-frequency stock trading algorithms reflect market changes within microseconds; machine-to-machine processes exchange data between billions of devices; infrastructure and sensors generate massive log data in real time; online gaming systems support millions of concurrent users, each producing multiple inputs per second.
Variety. Big Data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
Traditional database systems were designed to address smaller volumes of structured data, fewer updates, and a predictable, consistent data structure. They were also designed to operate on a single server, which makes increased capacity expensive and finite. As applications have evolved to serve large volumes of users, and as application development practices have become agile, the traditional use of the relational database has become a liability for many companies rather than an enabling factor in their business.