Let’s talk metadata. It’s a term I’d imagine most of us have at least heard, but what does it actually mean?
“Metadata” was hurled into the spotlight back in 2013. After Snowden revealed the extent of the US’s mass surveillance program, starting with a Guardian story about the gathering of phone records, the NSA higher-ups attempted to soothe the worried masses with assurances that they don’t listen in on calls, they “only gather metadata”. (Well, they tried to soothe the American masses. For those of us who aren’t US citizens, all of our call content is fair game.)
How soothed one should feel about someone “only” gathering one’s metadata entirely depends on what metadata actually is. And what someone can know about us based on that data.
And with that masterfully crafted segue, we arrive at our first subheading-slash-question:
The common description is that metadata is “data about data”. Which doesn’t really tell us very much, so let’s look at an example instead.
On Wikipedia’s metadata page there’s a picture of an ancient way of archiving information. I say ancient, but any of you who – like me – when you hear the phrase “can’t touch this” immediately start humming to yourselves “Daaa-da-da-da, da-da, da-da” are old enough to recognize this picture from visits to the library. I present: Exhibit A, card catalogues.
Drawer after drawer packed with index cards. On these cards there was information about all the books in the library. The cards didn’t contain the actual content of the books, rather they contained information about the books. The author, publication year, publisher, and so forth. Not the book itself, but information about the book. Metadata.
These days metadata is mostly generated and stored digitally. And we all generate it. Lots of it. Let’s take our cell phones as an example.
Regarding cell phone conversations, the conversation itself – what is said – isn’t metadata. But pretty much everything else you could think of is. For instance who you called, when you placed the call, how long the call lasted, and where you were during the call (which base station(s) your phone was connected to).
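To make that list a bit more concrete, here is a minimal sketch of what a single call record might look like as a data structure. The field names, phone numbers, and tower ID are all made up for illustration; they don't reflect any real carrier's or agency's format.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallRecord:
    """One phone call, described only by its metadata (hypothetical schema)."""
    caller: str            # who placed the call
    callee: str            # who was called
    started_at: datetime   # when the call was placed
    duration: timedelta    # how long the call lasted
    cell_tower_id: str     # rough location: the base station the phone used


# An invented example record.
record = CallRecord(
    caller="+46-555-0100",
    callee="+46-555-0199",
    started_at=datetime(2015, 3, 8, 21, 15),
    duration=timedelta(minutes=42),
    cell_tower_id="tower-17",
)

# Every field describes the call; none of them holds what was said.
field_names = [f.name for f in fields(record)]
print(field_names)
```

Note what's missing: there is no field for the conversation itself. Everything in the record is information about the call.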
Imagine that every time you made a call you had to inform me of whom you called, what time you called them, how long you spoke for, and where in the world you were when you made the call. The only thing you wouldn’t have to tell me was what you talked about.
And then imagine that this goes for every single call you make or receive (I gather everyone else’s metadata, too). And I gather this information around the clock, all day long, year after year. And I store all the information – and have software to analyze it with.
I’m going to be able to work out a surprising number of things about you. Who your family members are. Who you’re friends with. Who you work with. Who you spend time with, both privately and professionally. How often you call your mother (and if you remembered her last birthday). What your hobbies and interests are. If you make (or receive) any booty calls. And more.
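As a rough illustration of how little software it takes, here is a toy sketch of that kind of analysis. The call log is invented, the contacts are given readable labels instead of phone numbers, and the function names and the "late night" cut-off hours are my own assumptions, not anything a real system is known to use.

```python
from collections import Counter
from datetime import date, datetime

# (who was called, when the call started, duration in minutes) -- all invented
calls = [
    ("mum", datetime(2015, 3, 1, 18, 0), 25),
    ("mum", datetime(2015, 3, 8, 18, 5), 30),
    ("colleague", datetime(2015, 3, 2, 9, 30), 4),
    ("colleague", datetime(2015, 3, 3, 10, 15), 6),
    ("colleague", datetime(2015, 3, 4, 14, 0), 3),
    ("unknown-1", datetime(2015, 3, 7, 1, 40), 2),
]


def calls_per_contact(calls):
    """Count how many calls went to each contact."""
    return Counter(contact for contact, _, _ in calls)


def late_night_contacts(calls, start_hour=23, end_hour=5):
    """Contacts called between start_hour and end_hour (assumed cut-offs)."""
    return {
        contact
        for contact, started_at, _ in calls
        if started_at.hour >= start_hour or started_at.hour < end_hour
    }


def called_on(calls, contact, day):
    """Was this contact called at all on the given day?"""
    return any(c == contact and t.date() == day for c, t, _ in calls)


frequency = calls_per_contact(calls)
night = late_night_contacts(calls)
# Pretend 8 March is mum's birthday.
remembered_birthday = called_on(calls, "mum", date(2015, 3, 8))
```

Three tiny functions, and I already have a guess at who you work with, who you call in the middle of the night, and whether you rang your mother on her birthday. None of it required knowing a single word that was said.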
And those were just your calls. We can raise the stakes by including the recipients of any text messages and the times they were sent. (And I won’t even go into the staggering amount of further data I would have if I didn’t stop at your cell phone: e-mails, surfing habits, online purchases, what apps you use and what metadata they gather, etc.)
I don’t actually gather anyone’s metadata. So I can understand how you might be sceptical about my interpretation of how much I could learn about you based on metadata alone. Let’s turn, instead, to a quote by someone who should know a thing or two about it: Stewart Baker, former General Counsel at the NSA (source: Rusbridger in nybooks.com).
“Metadata absolutely tells you everything about somebody’s life. If you have enough metadata you don’t really need content.”
If you want to read more about (meta)data and privacy, I can highly recommend Bruce Schneier’s book Data and Goliath. One of the things Schneier covers in the book is some interesting research into metadata, in which researchers were given access to a chunk of metadata (with permission from the people involved in the study). Based on that data alone, they were able to identify that one participant had medical problems, one had recently had an abortion, and one had a (no-longer-quite-so-secret) cannabis-growing set-up. For more details, you can find the study on Web Policy. I’ll conclude with the researchers’ final remark from the study: phone metadata is highly sensitive.
So the next time you hear someone defend the gathering of your information by saying that they are “only gathering metadata”, remember that metadata is far from “only” metadata.
In Metadata Equals Surveillance, Schneier compiles a list of just some of the many articles arguing against trivializing metadata. For more, start with these pieces by Wired, The Guardian, The Electronic Frontier Foundation, Techdirt, The American Civil Liberties Union, The New Yorker, Ars Technica, and the Cato Institute.