Three Little Things: Big Data in a Nutshell
Big data is a big topic these days. Everyone’s talking about it. Many of us worry about what someone might learn about us from information that we don’t know is floating around out there. I think it’s a good news/bad news thing.
Bad news: Yes, there’s a whole lot of publicly available, personal information about you that you’re probably not aware of.
Good news: Much more of the personal data about you, about 80%, sits behind reasonably secure firewalls.
Bad news: That personal information lets companies target you with ads for the goods and services you are most likely to be interested in. Then again, this may be good news, particularly if you see an ad for something that you happen to need or want.
Good news: There is so much information sitting in databases that companies are drowning in it. Outside of the government, Google, and a handful of other companies, most organizations have no idea how to mine these bits and bytes to extract useful information.
Here are three key things about big data you should know:
1. “Big data” isn’t new. Companies you buy from, organizations you belong to, and the government have all been collecting information about you for decades. What’s new is that you know they’re doing it. There’s an Internet archive called the Wayback Machine that’s been around for nearly two decades. Want an eye-opener? Search for yourself and see what comes up.
2. People are getting smarter about analyzing information, and computers are getting powerful enough to crunch huge amounts of data quickly. We are able to make connections that we never could before. For a clever explanation of big data analytics, read “Using Metadata to Find Paul Revere.” It will give you some idea of how online dating sites and LinkedIn decide who you should connect with. (There’s some math, but it’s not bad. The modern cultural references lighten it up for you.)
3. The challenge with any kind of data, big or small, is being clear about what you want to learn, then defining the metrics for it accurately. Sounds like the same old problem we’ve been facing for a while, doesn’t it?
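If you want to see the second point in action, here is a minimal Python sketch in the spirit of the Paul Revere piece: given nothing but who belongs to which group, count the groups each pair of people shares and use that to suggest connections. The names, the groups, and the two helper functions (`shared_groups` and `suggest_connections`) are all made up for illustration. Real sites almost certainly use far more data and far more sophisticated scoring.

```python
# Hypothetical membership "metadata": person -> set of groups they belong to.
# None of these names or groups come from a real dataset.
memberships = {
    "Alice": {"Book Club", "Chess Club", "Choir"},
    "Bob": {"Chess Club"},
    "Carol": {"Book Club", "Choir"},
    "Dave": {"Hiking Group"},
}


def shared_groups(memberships):
    """Count how many groups each pair of people has in common."""
    people = sorted(memberships)
    counts = {}
    for i, a in enumerate(people):
        for b in people[i + 1:]:
            # Set intersection gives the groups both people belong to.
            counts[(a, b)] = len(memberships[a] & memberships[b])
    return counts


def suggest_connections(person, memberships):
    """Rank other people by the number of groups they share with `person`.

    People with no shared groups are left out. Ties are broken by name.
    """
    scores = {
        other: len(memberships[person] & groups)
        for other, groups in memberships.items()
        if other != person
    }
    candidates = [other for other, score in scores.items() if score > 0]
    return sorted(candidates, key=lambda other: (-scores[other], other))


if __name__ == "__main__":
    for pair, count in shared_groups(memberships).items():
        print(pair, count)
    print("Suggestions for Alice:", suggest_connections("Alice", memberships))
```

Notice that the sketch never looks at what anyone said or did, only at which groups they belong to. That is the unsettling part: membership lists alone are enough to start drawing connections.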
There you go–big data in a nutshell.