Posts by Colin Fleming
-
Just to be clear, a yottabyte was only bandied around at the very beginning of the reporting on all this; Binney's estimate was 5 zettabytes (2-3 orders of magnitude less). 5 zettabytes still doesn't seem to be practical at the moment but may be within, say, a decade or two.
All this is mostly a distraction from Keith's original article though, and one thing is pretty clear - the NSA is about to get a truckload of storage, it faces no legal restrictions on recording anything that NZers do online, and the GCSB has at least some level of access to the data.
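As a rough check on the "2-3 orders of magnitude" figure, here is a minimal sketch of the arithmetic. It assumes SI decimal prefixes (1 ZB = 10^21 bytes, 1 YB = 10^24 bytes); the function name is just illustrative.

```python
import math

# Assumption: SI decimal prefixes, not binary (zebi/yobi) units.
ZETTABYTE = 10**21  # bytes
YOTTABYTE = 10**24  # bytes


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Return how many powers of ten separate two quantities."""
    return math.log10(larger / smaller)


binney_estimate = 5 * ZETTABYTE

# How many times larger a yottabyte is than Binney's 5 ZB estimate.
ratio = YOTTABYTE / binney_estimate
gap = orders_of_magnitude(YOTTABYTE, binney_estimate)

print(f"1 YB is {ratio:.0f}x larger than 5 ZB")
print(f"That is about {gap:.1f} orders of magnitude")
```

So a yottabyte is 200 times Binney's estimate, which sits between two and three orders of magnitude.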
-
OnPoint: BTW, the NZ Police can use…, in reply to
Interesting, I hadn't seen that, thanks. I think it's likely that the NSA achieves a much higher data density than private organisations. Google, for example, has much more efficient data centres than most private organisations because it designs everything itself, and it's reasonable to assume that the NSA does the same. But you're right, you wouldn't get four to six orders of magnitude there. Reading around a bit, it seems the yottabyte figure that was bandied around at the beginning was probably the total signal data to be processed, not stored.
It's still pretty clearly their end objective though, even if technically they can't do it right now. The operational life of a centre like this is probably 30 years, and it's reasonable to assume that 64TB drives will be commonplace in, say, 10 years. The amount of information they're interested in storing (human to human communication, basically) is also probably a pretty small percentage of all internet traffic.
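To put the drive-size point in perspective, here is a back-of-the-envelope sketch of how many drives 5 zettabytes would need. The 64TB figure is the one from my comment; the 4TB figure is my own assumption for a large drive you can buy today, and the calculation ignores redundancy, indexing and any compression.

```python
# Assumption: SI decimal prefixes throughout.
TERABYTE = 10**12   # bytes
ZETTABYTE = 10**21  # bytes


def drives_needed(capacity_bytes: int, drive_bytes: int) -> int:
    """Number of whole drives required to hold capacity_bytes (ceiling division)."""
    return -(-capacity_bytes // drive_bytes)


target = 5 * ZETTABYTE  # Binney's estimate

# Hypothetical future drive size mentioned in the post.
drives_64tb = drives_needed(target, 64 * TERABYTE)

# Assumed present-day drive size, for comparison only.
drives_4tb = drives_needed(target, 4 * TERABYTE)

print(f"64TB drives needed: {drives_64tb:,}")
print(f"4TB drives needed:  {drives_4tb:,}")
```

Even with 64TB drives that is still tens of millions of drives, which is why I'd call it an end objective rather than something they can do right now.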
-
I actually posted about this over on Pundit with respect to John Key's assertions that the GCSB bill won't allow wholesale spying - I'll paste it below because I think it's relevant (and I can't seem to link to a comment on Pundit):
I'm a little late to this discussion, not least because the GCSB bill is now sadly law. But I'd like to thank everyone involved for some unusually level-headed discussion on this topic and I think it's worth continuing the debate - maybe we'll get to argue for it to be repealed or amended one day.
I want to talk a little about the technical capabilities of current surveillance organisations, especially the NSA, and why this potentially makes many of John Key's assurances worthless. He has stated several times that the GCSB will not be spying wholesale on the NZ public, but as far as I know has refused to answer questions on whether the GCSB will face any legal restrictions on data on New Zealanders obtained from our intelligence partners, and the bill (again, as far as I know) contains no clarification on this.
The NSA has recently been caught out several times using interesting interpretations of words like "surveillance". Their current stance seems to be that they can collect data on everyone but it's not considered surveillance until a human looks at it. There's very little clarity on all this of course, since it's all secret. However, any legal niceties there relate only to Americans - they have absolutely no restrictions at all on their ability to store anything and everything on New Zealanders. Lt Gen Keith Alexander (head of the NSA) was quoted in one of the GCHQ documents leaked by Snowden as saying "why can't we collect all the signals, all the time?" and this is clearly the NSA's intention. Their new data centre in Utah is estimated by William Binney (ex-NSA operative turned whistleblower) to store 5 zettabytes - this is sufficient storage to store all worldwide internet traffic for about 7.5 years. Of course, they can probably fairly easily skim out YouTube and porn, which leads Binney to conclude that the new data centre (only one of two they are currently building) would be capable of storing all internet human-to-human communications for over 100 years and have plenty of space left over.
So unless we can get some legal protection to prevent the GCSB obtaining our data from the NSA, I think it's reasonable to assume that within, say, one to two years, all our online communications and activity will be stored and will probably be accessible to the GCSB.
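The two figures above (7.5 years for all traffic, over 100 years for human-to-human communication) can be related with some simple arithmetic. This is only a sketch: it assumes traffic volume stays constant from year to year, which it certainly won't, and the function and variable names are mine.

```python
# Assumption: SI decimal prefixes; traffic volume held constant over time.
ZETTABYTE = 10**21  # bytes

capacity = 5 * ZETTABYTE   # Binney's estimate for the Utah centre
years_all_traffic = 7.5    # years of total internet traffic it could hold

# Annual worldwide traffic implied by those two figures.
implied_annual_traffic = capacity / years_all_traffic


def max_share(capacity_bytes: float, annual_bytes: float, years: float) -> float:
    """Largest fraction of annual traffic that can be kept for `years` years."""
    return capacity_bytes / (annual_bytes * years)


# Fraction of traffic that could be kept if the store must last 100 years.
share_100_years = max_share(capacity, implied_annual_traffic, 100)

print(f"Implied traffic: {implied_annual_traffic / ZETTABYTE:.2f} ZB per year")
print(f"Share storable for 100 years: {share_100_years:.1%}")
```

In other words, the 100-year claim holds as long as human-to-human communication is no more than about 7.5% of all traffic, under these assumptions.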
I'm no expert on data mining and machine learning, although I am a software developer and understand it reasonably well. It's clearly incredibly powerful technology that could be used by law enforcement to do real good - given a single contact known to be a terrorist you can easily identify all their known associates and anyone they have communicated with online, ever. Using the storage I described above you could then go back and listen to all those communications, or more likely have them automatically analysed to identify suspicious keywords. The state of the art in voice transcription is actually pretty good now (my Google Voice account in the US emails me transcriptions of voice messages left for me - it's surprisingly good), and it's reasonable to assume that the NSA's state of the art is way ahead of even Google - they're by far the biggest employer of mathematicians in the US.
The main problem is that to look for patterns in the data you do need all the data, and I think these capabilities are just too powerful for governments to refuse - sooner or later total collection will happen whether we like it or not. John Key's assertions that the GCSB would require a lot of analysts to look at any mass-collected data are just not true these days. It may not be long before these agencies don't need analysts at all to detect keywords of any type in any kind of communication, and they'll be able to do this for any of your communications since total collection began. My personal suspicion is that the NSA will collect everything on everyone, analyse the data automatically, and then flag the results of that analysis to an analyst who would then get a warrant to look at the data. Note that this is a fairly significant change - the warrant would no longer be for future communications only, but for past and future communications.
Unless we can get a lot more clarity on the relationship between the GCSB and the NSA/GCHQ and on the legal restrictions on data obtained from them, John Key's assertions are basically meaningless. What's interesting is that, like tax law, purely national laws are becoming increasingly irrelevant as technology advances, and the law is clearly incapable of keeping up with the new surveillance techniques.
It's a frightening time.