The Official Secrets Act is not to protect secrets, it is to protect officials.
– Sir Humphrey Appleby, Yes Minister
Imagine that you work for an organization in which secrecy is cherished and enforced. A sensitive document proving that your organization has engaged in unethical or illegal activities has recently been leaked to a major newspaper, provoking a scandal that blemishes your organization’s reputation. As expected, your organization is greatly displeased with the leak, and a team of plumbers is therefore assembled to find the leakers. If you are a plumber, you want to maximize your chances of finding the leakers, whereas if you are a leaker, you want to maximize your chances of evading the plumbers.
Suppose you are a plumber. You set up a classical canary trap in order to expose the leakers: you give slightly different versions of a sensitive document to different people, you wait until one of those versions is leaked to the press, and knowledge of which version got leaked allows you to determine who the leaker is.
For example, suppose further that you have five suspects. Let $D = \{d_1, d_2, d_3, d_4, d_5\}$ be the set of five documents (five slightly different versions of an original document), and let $S = \{s_1, s_2, s_3, s_4, s_5\}$ denote the set of five suspects. A canary trap is thus a bijective mapping $f \colon D \to S$ that assigns suspects to documents. Pictorially, this bijection can be represented by the following graph
Since a single suspect is assigned to each document (otherwise $f$ wouldn’t be a mapping), knowing which version is leaked allows the plumbers to find the leaker. Let us suppose that version $d_3$ is leaked to the press. Then, the plumbers immediately know that $f(d_3)$ is the leaker, as depicted below
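The canary trap above can be sketched in a few lines of Python. The document labels and suspect names are, of course, hypothetical; the point is that a bijection lets a single leaked version pinpoint a single suspect:

```python
# A canary trap as a bijective mapping: each document version
# is handed to exactly one suspect (names are hypothetical).
documents = ["d1", "d2", "d3", "d4", "d5"]
suspects = ["Alice", "Bob", "Carol", "Dave", "Eve"]

# The bijection f: D -> S, pairing each version with its recipient.
f = dict(zip(documents, suspects))

def identify_leaker(leaked_version: str) -> str:
    """Knowing which version leaked immediately identifies the leaker."""
    return f[leaked_version]

print(identify_leaker("d3"))  # -> Carol
```

Note that the bijectivity is what does all the work: if two suspects received the same version, the leaked version would only narrow the search, not end it.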
There is a problem, however. If the one who leaked the sensitive document to the press is not clueless, he will be conservative and assume that there is an ongoing counter-intelligence operation within the organization; therefore, he will keep a low profile and abstain from leaking any more documents. If none of the documents in $D$ are leaked, then the plumbers cannot determine who the leaker is. The canary trap fails.
The leaker can also steal someone else’s document and leak it to the press, so that the plumbers will hunt the wrong suspect. Note, however, that stealing another suspect’s version could leave an electronic trail (computer access is monitored, surveillance video cameras are everywhere) that could eventually serve to expose the actual leaker. Therefore, this countermeasure does not seem very wise to me.
Pursuit and Evasion
As the canary trap failed to expose the leaker, the plumbers take drastic measures and have the five suspects polygraph-tested. Suppose that all five polygraph tests were negative. The plumbers then conclude that either the leaker is not among the five suspects, or there’s at least one leaker among the suspects and he managed to cheat the polygraph test.
At this point, one may wonder how the plumbers were able to narrow down their search to only five suspects. Let us suppose that the information leaked to the newspaper could have come from three reports only, and that these three reports were available to five people (the same people who later became the five suspects, of course). The plumbers then build the following access graph
which assigns suspects to reports, so that one can know which suspects had access to a given report. Please do note that, contrary to what happened with the canary trap, one can now assign more than one suspect to each report! This suggests that we should use binary relations instead of mappings. However, I will exploit a mathematical loophole and use mappings nonetheless. Let $R = \{r_1, r_2, r_3\}$ be the set of three reports, and let $\mathcal{P}(S)$ be the power set of $S$ (i.e., the set of all subsets of $S$). Let $g \colon R \to \mathcal{P}(S)$ be the access mapping, which assigns a set of suspects to each report (note that the codomain of $g$ is $\mathcal{P}(S)$, not $S$).
For example, suppose that the newspaper that published the leak included an excerpt from report $r_1$ in their article. The plumbers then know that the leaker is in the subset $g(r_1)$, as illustrated below
The hunt for the leaker has thus been further narrowed down, and now the plumbers have only three suspects. However, if the plumbers find evidence in the newspaper article that the leaked report was $r_2$, and $g(r_2)$ happens to be a singleton, say $g(r_2) = \{s_4\}$, then they are able to immediately determine that the leaker was suspect $s_4$. A cautious, wise leaker might want to make sure he knows the access graph before leaking any document.
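Both scenarios can be sketched in Python; the report labels and access sets below are hypothetical, chosen so that one report narrows the search and another ends it:

```python
# The access mapping g: R -> P(S), assigning to each report the
# set of suspects who had access to it (assignments hypothetical).
g = {
    "r1": {"Alice", "Bob", "Carol"},
    "r2": {"Dave"},
    "r3": {"Bob", "Carol", "Eve"},
}

# If the article quotes report r1, three candidate suspects remain.
print(sorted(g["r1"]))  # -> ['Alice', 'Bob', 'Carol']

# If it quotes r2, whose access set is a singleton, the leaker is exposed.
if len(g["r2"]) == 1:
    (leaker,) = g["r2"]
    print(leaker)  # -> Dave
```

The dictionary of sets is just the access graph in another guise: each key is a report, and its value is the set of suspects on the other side of the graph's edges.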
What exactly does one learn from this post? From the (admittedly simplistic and superficial) analysis above, one can conclude that the leaker should strive to maximally confuse the plumbers by leaking files that a lot of people have access to. The more suspects the plumbers have to deal with, the more time the leaker buys. Perhaps an innocent suspect will fail the polygraph test. Would the plumbers, in their own self-interest, crucify the innocent one in order to avoid a potentially long hunt for the true leaker? This question is rhetorical, of course.
The plumbers can also gain some insight from the analysis, I believe. The lesson to be learned is that the access mapping $g$ is the most valuable asset a plumber can have. If $R$ is a set of sensitive reports and $L \subseteq R$ is the subset of leaked reports, then computing the intersection $\bigcap_{r \in L} g(r)$ should allow the plumbers to maximally narrow down the list of suspects. But from then on, the analysis presented in this post cannot help much. Polygraph-test all the suspects that are left, and hope there won’t be a lot of false positives!
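A minimal sketch of this intersection in Python, assuming a hypothetical access mapping: when several reports leak, only the suspects who had access to every one of them survive the cut.

```python
# Narrowing the suspect list by intersecting the access sets
# of all leaked reports (report labels and sets are hypothetical).
from functools import reduce

g = {
    "r1": {"Alice", "Bob", "Carol"},
    "r2": {"Dave"},
    "r3": {"Bob", "Carol", "Eve"},
}

def narrow_suspects(access, leaked):
    """Return the suspects who had access to every leaked report."""
    return reduce(set.intersection, (access[r] for r in leaked))

# Two leaks narrow five suspects down to two.
print(sorted(narrow_suspects(g, ["r1", "r3"])))  # -> ['Bob', 'Carol']
```

This is exactly why leaking widely-accessible reports buys the leaker time: the intersection shrinks only as fast as the access sets allow.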
It would not be unreasonable to think of this leakers versus plumbers game as a pursuit-evasion one: the leaker wants to maximize the number of suspects on the plumbers’ list, while the plumbers want to minimize the number of suspects as much as they can. The leaker tries to maximize the plumbers’ uncertainty, the plumbers try to minimize it. A word of caution is in order, however. Assumptions do matter a lot. For example, we assumed that the leaker had perfect knowledge of the access graph, but what if he does not? Then he risks falling into a canary trap. Mathematical reasoning does serve to clarify one’s thought, but let us not be so arrogant as to believe that a simple model can capture all of the real world’s complexity. Common sense should not be abandoned.
As you might have noticed, I make no moral judgements in this post. That is because I regard the leaking of classified information as amoral in itself: in certain circumstances, leaking is a moral imperative, while in other circumstances it might compromise a country’s national security. Secrecy often serves to protect a country, but it might also serve to protect the interests of a few people who, either due to malice or incompetence, failed to fulfil their duty.