by Sue Poremba on 22/05/12 at 4:35 pm
How big is big data?
In a single word: huge.
Scientists studying big data have moved far beyond terms like gigabyte or terabyte. Now they use words like exabyte (18 zeros) and quintillion.
In their paper “The World’s Technological Capacity to Store, Communicate, and Compute Information” (PDF), researchers Martin Hilbert and Priscila López found that the world’s technological per capita capacity to store information has roughly doubled every 40 months since the 1980s. IBM added: “Every day, we create 2.5 quintillion bytes of data, so much that 90 percent of the data in the world today has been created in the last two years alone.”
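To see what those figures imply, here is a rough back-of-the-envelope sketch. The constants come from the quotes above; the arithmetic, and applying the 90 percent claim to the daily rate, is my own illustration rather than anything from IBM or the researchers:

```python
# Back-of-the-envelope sketch of the figures quoted above.

DAILY_BYTES = 2.5e18            # IBM: 2.5 quintillion bytes created per day
TWO_YEARS_DAYS = 2 * 365

# Data created in the last two years at that daily rate
recent = DAILY_BYTES * TWO_YEARS_DAYS
print(f"Created in the last two years: {recent / 1e21:.2f} zettabytes")  # ~1.8 ZB

# If that really is 90 percent of everything ever created, the implied total is:
total = recent / 0.9
print(f"Implied total ever created: {total / 1e21:.2f} zettabytes")      # ~2 ZB

# Hilbert and Lopez: per capita storage capacity doubles roughly every 40 months
annual_growth = 2 ** (12 / 40) - 1
print(f"Implied annual growth in storage capacity: {annual_growth:.0%}")  # ~23%
```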
According to Gartner, big data covers three dimensions: volume, velocity and variety. Big data comes in one size: large. Much of it is time sensitive and has to be used as it arrives. And it comes in a variety of forms, from text to audio to video, and from sources like email and documents, cell phones and GPS, vinyl records and digital cameras, and more.
Hilbert and López consider the digital age to have begun in 2002, the year digital storage overtook analog storage. Their research, however, covered the years 1986 to 2007, by which point, as the researchers point out, 94 percent of the world’s stored information was in digital format.
The researchers also found, as reported on Physorg.com:
- “In 2007, humankind successfully sent 1.9 zettabytes of information through broadcast technology such as televisions and GPS. That’s equivalent to every person in the world reading 174 newspapers every day.” (A quick sketch of the arithmetic behind that comparison follows this list.)
- “In 2007, all the general-purpose computers in the world computed 6.4 x 10^18 instructions per second, in the same general order of magnitude as the number of nerve impulses executed by a single human brain. Doing these instructions by hand would take 2,200 times the period since the Big Bang.”
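Here is what the newspaper comparison in the first bullet implies per newspaper. The 2007 world population figure is my own assumption; the rest comes straight from the quote:

```python
# What "174 newspapers per person per day" implies, given 1.9 zettabytes broadcast in 2007.
# WORLD_POPULATION is an assumed figure (roughly the 2007 world population).

BROADCAST_BYTES = 1.9e21
WORLD_POPULATION = 6.6e9
NEWSPAPERS_PER_DAY = 174

per_person_per_day = BROADCAST_BYTES / WORLD_POPULATION / 365
per_newspaper = per_person_per_day / NEWSPAPERS_PER_DAY
print(f"Per person per day: {per_person_per_day / 1e6:.0f} MB")        # ~790 MB
print(f"Implied size of one newspaper: {per_newspaper / 1e6:.1f} MB")  # ~4.5 MB
```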
Perhaps surprisingly, the overall numbers have not changed much since 2007. According to the fifth annual IDC Digital Universe study released last June, 1.8 zettabytes of data was expected to be created in 2011. Or, as put in perspective by a Computerworld article, “the equivalent to every U.S. citizen writing 3 tweets per minute for 26,976 years.”
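Computerworld’s tweet comparison holds up to some simple arithmetic. In the sketch below, the population and tweet-size figures are my own assumptions (roughly 310 million U.S. citizens and 140 bytes per tweet), not numbers from the article:

```python
# Sanity check of the "3 tweets per minute for 26,976 years" comparison.
# US_POPULATION and TWEET_BYTES are assumptions, not figures from the article.

US_POPULATION = 310e6        # roughly the 2011 U.S. population
TWEET_BYTES = 140            # 140 characters at one byte each
TWEETS_PER_MINUTE = 3
MINUTES_PER_YEAR = 60 * 24 * 365
YEARS = 26_976

total_bytes = US_POPULATION * TWEETS_PER_MINUTE * MINUTES_PER_YEAR * YEARS * TWEET_BYTES
print(f"{total_bytes / 1e21:.2f} zettabytes")   # ~1.85 ZB, close to IDC's 1.8 ZB
```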
However, this number is expected to increase considerably over the coming decade, with IDC predicting that overall data will grow 50 times by 2020 as we find even more ways to create it. The bulk of that data, about 90 percent, will be unstructured, such as email and video.
And yet, compared to other big numbers in the natural world, the numbers in big data are but needles to larger haystacks. The amount of big data generated is still less than 1 percent of the information stored in the DNA molecules of a single human being.
About the Author: This post is by Rackspace blogger Sue Poremba. Rackspace Hosting is the service leader in cloud computing, and a founder of OpenStack, an open source cloud operating system. The San Antonio-based company provides Fanatical Support to its customers and partners, across a portfolio of IT services, including Managed Hosting and Cloud Computing.