The attack model described in this paper is that a computer in a high-security area has been compromised, and secret data is attempted to be transmitted to the outside world over a network. However, the high-security network carefully monitors its traffic to the outside such that it can not simply send the data out and expect to be undetected. In this paper they address a common attack on this scenario, a side-channel timing attack where the timing of legitimate packets is modified so that an outsider with access to timing information alone can retrieve the data that is leaked out. They experiment with three attacks in this paper, the first of which is IPCTC. This was the first covert timing channel that was developed in 2004. In the simplified version, there is an interval of time built into the system, say one second. If we want to covertly transmit a one bit, during the interval of one second a packet with arbitrary data will be sent out. Otherwise, no packets are sent out. Over many seconds data of arbitrary length can be sent out. In the full version, the interval is much smaller (and therefore the side-channel bandwidth is much higher), and the interval rotates among different values. The next attack is a time-replay covert timing channel attack. In this one, rather than sending out arbitrary data with a clearly non-standard timing distribution, legitimate traffic is sampled and divided up into traffic with large realistic delays and traffic with small realistic delays. In order to transmit a one bit, the delay between packets is a random one from the sampled traffic with large delays, whereas to transmit a zero bit the packet delay is a random one from the traffic with small realistic delays. In this way, given data with approximately as many zero bits as one bits the overall traffic pattern will resemble that of the traffic that was sampled. The final attack examined is JitterBug. This does not require a compromised computer but rather compromised hardware. An attack on keyboard data would proceed by introducing unnoticeable delays that vary based on the data we want to transmit out. Interactive sessions such as VNC or SSH pass this information along almost instantly thus preserving the introduced delays which can then be read outside of the network. This scheme allows a throughput of one bit per key press. An example would be say we want to transmit out the bits {0, 1, 0, 0, 1}. If our real-time window was set to 20ms, and the keyboard delays were {125ms, 64ms, 110ms, 72ms, 212ms}, a jitterbugged keyboard would change these delays to {140ms, 70ms, 120ms, 80ms, 230ms}. By checking the delay modulo 20ms, someone outside the network could retrieve the leaked data. Some detection methods are then discussed. They are basically divided into two broad classes: shape tests and regularity tests. Shape tests refer to first-order statistics about the packet delays such as mean, variance and distribution. Regularity tests, on the other hand, use higher-order statistics and correlations in the data. The key insight here is that when information is being leaked on a side-channel, the entropy of the timing of the packets will differ from legitimate traffic. Their proposed detection method attempts to measure entropy as well as conditional entropy (the inter-packet delay given the previous inter-packet delay, or previous several inter-packet delays). They can then compare the entropy and conditional entropy of legitimate traffic to observed traffic to determine if a side-channel leakage is occurring. Since they can not, in the mathematical sense, exactly measure entropy and conditional entropy with only a finite sequence of inter-packet delays, they approximate them using empirical probability density functions. The scheme essentially takes inter-packet delays and puts them into bins. Then, to determine entropy the distribution of packets in the bins can be compared. Using smaller bins is determined to be better as a measure of entropy. For conditional entropy, they use a similar scheme. However, they determine that larger bins work better for conditional entropy. Their final scheme uses both large bins for an entropy measurement and small bins for a conditional entropy measurement. If either of the tests fail, the traffic is determined to be malicious and leaking information through a timing side-channel. They evaluate their scheme on the three previously-mentioned timing attacks. Their training set consists of 80 gigabytes of sample data collected from a campus network. They set a false positive threshold of 1% by looking at the entropy and conditional entropy of legitimate traffic.They then compare their mechanism to the existing ones for detecting side-channel timing information leaks. For IPCTC, both entropy and conditional entropy are violated and true positive rate is 100% for both. For TRCTC, the entropy test's true positive rate is very low, 2% as expected because of the design of the attack. However, the conditional entropy true positive rate is 100%, which means the overall detection mechanism is successful. Finally, for JitterBug, the entropy test's true positive rate is 100% which the conditional entropy test is a low 4%. Finally, countermeasures are discussed, which mostly consist of lowering the side-channel's bandwidth to evade detection. In the future work section they plan to further explore countermeasures as well as quantify the degree to which the side-channel's bandwidth is reduced by having an entropy-detection mechanism present. Discussion Questions: Can timing information be used to send data into the compromised process rather than only out? Would their scheme detect it? Is incoming traffic timing as predictable as outgoing traffic timing? Is a 1% false positive rate ridiculously high? How do you predict a lower rate would reduce their ability to detect side-channel timing attacks? Jitterbug could use a smaller timing window to evade the entropy test. They claim that using a smaller timing window would decrease the capacity of JitterBug. How so? It seems like introducing a slight randomness into packets would eliminate, or at least significantly slow down the mentioned side-channel timing attacks. Is this true in general? Is this technique general enough to capture all yet-to-be-invented side channel timing information leaks? Packet ordering is another common side-channel attack. Do you think this entropy based approach could be generalized to work there as well? Side channel attacks within an operating system can occur too (to get around information flow security). Can an entropy-based approach work here? What would it be based on, system calls? Even within a single protocol, say SSH, large variations exist between the traffic patterns generated by different users. Do the entropy and conditional entropy remain similar even with different traffic patterns? Pros: Takes entropy metric from biology and reuses solution Looks to be applicable across domains, not just for inter-packet timing Good implementation details Good related work section Cons: Explanation of bin choices was weak Experimental detection uses too many packets to be practical; how well would it work with less packets? The argument about countermeasures is hand-wavy Votes: SA - 1 WA - 4 WR - 1 SR - 0