Friday, May 24, 2013

"I added your numbers, and I have no idea what they are."

So it turns out there's a thesis arguing that polynomial-time, fully homomorphic encryption is possible. (Link is to the dumbed-down -- but still journal-published -- version that mortals like me are capable of understanding.)

It's hard to understate the significance of this. This means that it's possible for you to give someone your data in encrypted form, and for them to execute arbitrary operations and give it back to you, without ever knowing what the data is. That is, they transform an input ciphertext to and output ciphertext such that when you decrypt the output, you have the answer to your query about the data, but at no point did they decrypt it or learn what was inside.

In other words: "I just calculated the sum of the numbers you gave me, but I have no idea what the sum is, nor what any of the numbers are."

If it sounds impossible, it's not because you misunderstand it, but because that kind of thing shouldn't be possible -- how can you perform arbitrary operations on data without learning something about it? Sure, maybe there are edge cases, but a rich, Turing-complete set?

It would mean that "the cloud" can ensure your privacy, while *also* doing useful operations on your data (as the author, Craig Gentry, goes at great length to emphasize).

As best I can tell from the paper, here's the trick, and the intuition why it's possible:

1) The computation must be non-deterministic -- i.e. many encrypted outputs correspond to the correct decrypted output. This is the key part that keeps the computation provider from learning about the data.

2) The output must be fixed size, so you have a sort of built-in restriction of "limit to the first n bytes of the answer".

3) It does require a blowup in the computational resources expended to get the answer. However, as noted above, it's only a polynomial blowup. And thanks to comparative advantage, it can still make sense to offload the computation to someone else, for much the same reason that it makes sense for surgeons to hire secretaries even when the surgeon can do every secretarial task faster. (Generally, when the provider's opportunity cost of performing the task is less than yours.)

4) Finally, to be fully homomorphic -- capable of doing every computation, not just a restricted set of additions and such -- the encrypted computation has to find a way around the buildup of "noise" in the computation, i.e. properties of the output that put it outside the range of what can be decrypted (due to exceeding the modulus of the operation needed to extract the output). And to do that, in turn, its operation set must be sufficient to perform its own decryption.

I'm only about halfway through the paper, but it's been really enlightening to get the intuition behind why this kind of thing can work.