Not Just Metadata: The NSA and the content of your text messages

On January 17, The Guardian revealed startling new information about the National Security Agency's (NSA) bulk collection of mobile phone users’ text messages.

The Guardian published heavily redacted portions of a slide presentation made by the NSA for its British counterpart agency, the Government Communications Headquarters (GCHQ). The slide show detailed a series of programs known as DISHFIRE, SPYDER, and MILKBONE that collect not only metadata but also so-called meta-content. Meta-content refers to location, time and user data collected by examining content. This direct revelation of meta-content collection proves without a doubt that President Obama was lying when he claimed “No one is reading your emails.”

The slide presentation called the content of text messages a “rich data set, high impact,” that leads to “analytic gems.” As of 2011, the NSA's cyber vacuums were slurping up over 194 million text messages every day. That works out to roughly 70.8 billion text messages per year (194 million times 365), and it can be assumed that the figure has only increased. The sharing of data between the NSA and GCHQ allows both agencies to skirt each country's laws on domestic spying, should they suddenly feel bound to respect them, which they apparently do not. The software likely to be behind the bulk scanning of content for so-called meta-content, such as the CIA-funded MetaCarta and Recorded Future, has frightening capabilities. This software scans the meta-content and then attaches tags to the data set through a program code-named PREFER.

The data garnered from text message meta-content includes missed calls, driving directions, names, electronic business cards, appointments, credit card transactions, flight departure notifications and all the other electronic humdrum of a busy modern life. This meta-content is then distilled to paint a rich picture of a person's habits, schedule, friends, and travel patterns. The metadata and meta-content are also intrinsically linked by both the unique identifier of the device and the unique identifier of the phone's SIM card. Altogether, these data sets tell a spy agency everything from your identity and itinerary to your social network and dining habits.

Observe the following hypothetical case: Sitting in my office, I arrange a luncheon meeting with a Free Press advertising client at a nearby eatery, the Olde Towne Café. The client calls back and I miss their call. They then text me for directions from their office. I respond quickly and confirm the time of the meeting. We meet and I order lunch. I pay with plastic and the Olde Towne Café texts a receipt to my phone as part of its discount plan, which rewards me if I charge five orders there. We eat, we talk, we leave and I text them a polite thank you. They respond that they will have the check to me via mail the following Wednesday and they would like me to confirm receipt via text at that time.

In this hypothetical case, metadata would only yield that person A and person B exchanged a series of texts from nearby cellular towers, then from the same cell tower, then silence. The meta-content provided by PREFER and its related programs, which likely include MetaCarta and Recorded Future, paints a much fuller picture of my hypothetical lunch meeting.

MetaCarta uses natural language processing to scan text documents, including my text messages, for place names. It does so with a very strong idea of context. The program can contextualize my use of the word “Madison” to realize I'm referring to Madison Street in Columbus, not Madison Avenue in New York and not Madison, Wisconsin. It can easily do this from a part of a text where I say “then turn right on Madison.”
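To make the idea of contextual place-name resolution concrete, here is a minimal sketch in Python. It is not MetaCarta's actual method, which is not public in this article; the gazetteer, the cue words, the `home_region` hint and the scoring weights are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteer: every entry, cue word and region label is an
# assumption made for this illustration, not real MetaCarta data.
GAZETTEER = {
    "madison": [
        {"place": "Madison Street, Columbus, OH", "region": "Columbus",
         "cues": {"turn", "right", "left", "street", "corner"}},
        {"place": "Madison Avenue, New York, NY", "region": "New York",
         "cues": {"turn", "right", "left", "avenue", "manhattan"}},
        {"place": "Madison, WI", "region": "Wisconsin",
         "cues": {"wisconsin", "flight", "fly", "lands", "capitol"}},
    ]
}


def resolve_places(text, home_region):
    """Pick the most plausible candidate for each ambiguous place name.

    Score = number of cue words found in the message, plus one point if
    the candidate lies in the sender's usual region (a hint that could
    come from metadata such as cell tower location).
    """
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    resolved = {}
    for name, candidates in GAZETTEER.items():
        if name not in tokens:
            continue

        def score(candidate):
            cue_hits = len(tokens & candidate["cues"])
            region_bonus = 1 if candidate["region"] == home_region else 0
            return cue_hits + region_bonus

        resolved[name] = max(candidates, key=score)["place"]
    return resolved


print(resolve_places("then turn right on Madison", home_region="Columbus"))
```

Driving words such as “turn” and “right” favor a street over a city, and the sender's usual region breaks the tie between the two streets. A real system would rely on far larger gazetteers and statistical models rather than a handful of hand-picked cue words.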

Through the use of MetaCarta alone, the spy agency has pulled the locations of my office, my client's office and the location of our meeting from a single series of texts. Since I received a receipt that includes a promotional discount based on visits, it becomes clear that I eat at this place often, and now it is linked in the MILKBONE database to me and vice-versa.

Recorded Future does the same thing with texts or tweets or web pages as MetaCarta does, except that Recorded Future contextualizes documents with time. It can extract that my client was running late from the content of our texts, when we were supposed to meet, and that there are two events (the mailing of the check and confirmation of receipt) the following Wednesday.
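The temporal side can be sketched the same way. The Python below is a toy illustration, not Recorded Future's actual technique: it finds weekday names in a message and resolves each to a calendar date relative to when the message was sent. The function names, the sample message and the sent date are all assumptions made for the example.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
WEEKDAY_PATTERN = re.compile(r"\b(" + "|".join(WEEKDAYS) + r")\b",
                             re.IGNORECASE)


def next_weekday(sent, weekday_name):
    """Return the first date strictly after `sent` that falls on the weekday."""
    target = WEEKDAYS.index(weekday_name.lower())
    # Always lands 1 to 7 days ahead, never on the sent date itself.
    days_ahead = (target - sent.weekday() - 1) % 7 + 1
    return sent + timedelta(days=days_ahead)


def extract_dates(text, sent):
    """Pair every weekday mentioned in `text` with a resolved calendar date."""
    return [(m.group(1), next_weekday(sent, m.group(1)))
            for m in WEEKDAY_PATTERN.finditer(text)]


# Hypothetical message and sent date (a Friday), for illustration only.
message = "I will have the check in the mail to you next Wednesday."
for word, when in extract_dates(message, sent=date(2014, 1, 17)):
    print(word, "->", when.isoformat())
```

Even this crude approach turns a casual phrase into a dated future event that can be attached to both phones. Commercial tools presumably handle far richer expressions, such as “running late” or “in half an hour,” which this sketch ignores.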

The bank accounts associated with the cards that we used to purchase our meals are now associated with our phones, and with each other. A great deal of information about both of us has been generated using two programs that were developed by private industry with CIA funding from its non-profit venture capital firm, In-Q-Tel.

The idea that billions of dollars have been spent over the years to determine who meets with their advertising clients over lunch has a chilling effect on free speech, free association, and civil society in general. What would be the effect on a free press or a free society if my hypothetical luncheon meeting had been not with a client but, as a reporter, with a source who wished to anonymously give me proof of war crimes or government corruption?

Nearly every day, new revelations about omnipresent surveillance come to light, and each day gives new evidence that casts doubt on the lies we are told. The Free Press will always continue to investigate and report, and in spite of all the impressive and expensive tools in the hands of the spy agencies, we still have the ability to protect our sources.

Date Originally Published: 
Wednesday, January 29, 2014
Article originally published at: 
Gerry Bello