Automated, Sub-second Attack Signature Generation: A Basis for Building Self-Protecting Servers
Zhenkai Liang and R. Sekar
ACM Conference on Computer and Communications Security 2005

Summary

This paper presents an approach to protect against large-scale, repetitive attacks by automatically generating attack signatures from observation of ongoing attacks, then filtering out future attacks before they can compromise the target server. The signatures attempt to capture the underlying vulnerability, which gives them an advantage in blocking most other attacks that exploit the same vulnerability. Signature checking introduces a 10% overhead, and signatures are generated very quickly, which allows applications to restore service rapidly, protects against brute-force attacks, and enables rapid deployment of signatures around the Internet.

The paper's approach is to first detect an attack and then correlate the attack to its input. It takes advantage of the fact that memory exploits often result in corrupted pointer values, and detects when such a value is included in an attack packet. A signature is then generated based on the identified message type and the components of the attack packet, and all packets entering the system in the future are filtered against these signatures (a sketch of this detect/correlate/filter loop appears after the Cons list below).

Their method was implemented and evaluated on a Red Hat Linux distribution, and consists of internal components (which hook into protected programs by means of library interposition, also sketched below) and external components (which run in a separate process and implement the signature generation aspect). They evaluated their system against the three main types of memory exploits, using semi-current attacks that take advantage of buffer overflows, heap overflows, and format string exploits. They also evaluated the quality of their signatures by verifying the number of false-alarm signatures generated, and the robustness of the system with respect to polymorphic attacks.

Pros

- No external software required.
- Captures the underlying problems behind attacks, instead of the actual syntax of the attack.
- Signatures are generated very quickly, so service downtime is low.
- Short message signatures; signatures are conservative and err on the side of not denying service.
- Servers can log recent valid inputs they have received (testing phase).
- Enhances/preserves the effectiveness of address space randomization (protects ASR functionality).
- Focuses on present attack attempts, with detection after the first attack attempt.
- Signatures can be propagated to multiple other hosts (preventing flash worms from spreading).
- The message filter can run on a firewall.
- Attack detection is passive.
- Effective input correlation mechanism.
- Performance impact is minimal.

Cons

- Arrogant writing style with blatant grammatical errors.
- The rule set grows with every signature they create; it could get very large.
- They claim to use LD_PRELOAD in testing, but the dynamic linker ignores LD_PRELOAD for setuid/setgid binaries, so daemons that rely on such privilege transitions can't use their detection mechanism as described.
- They claim to handle repeated, multiple attacks, but scenarios were described where this might not hold. For example, signatures are based on the first filtered attack, so an attacker could send a huge exploit, back off by one byte on each attempt, and use the stack randomization mechanism to determine the address; this step-off element could let an attacker slip differently sized packets past the filter.
- They haven't described what is involved in the recovery period. Does it require human intervention?
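The internal components hook into protected programs by library interposition. Below is a minimal sketch of that idea, not the authors' code: it interposes on recv() via LD_PRELOAD and keeps a copy of the most recent input for later correlation. The recv() hook, the HELD_MAX bound, and the single-slot held-input log are assumptions for illustration.

```c
/* interpose.c -- illustrative sketch of library interposition, not the
 * paper's implementation.
 * Build: gcc -shared -fPIC -o interpose.so interpose.c -ldl
 * Use:   LD_PRELOAD=./interpose.so ./server
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define HELD_MAX 4096               /* hypothetical bound on the held-input log */
static unsigned char held[HELD_MAX];
static size_t held_len;

/* Our recv() shadows libc's; the dynamic linker resolves calls in the
 * protected program to this definition first. */
ssize_t recv(int fd, void *buf, size_t len, int flags)
{
    /* Look up the real recv() the first time through. */
    static ssize_t (*real_recv)(int, void *, size_t, int);
    if (!real_recv)
        real_recv = (ssize_t (*)(int, void *, size_t, int))
                        dlsym(RTLD_NEXT, "recv");

    ssize_t n = real_recv(fd, buf, len, flags);

    /* Keep a copy of the most recent input for attack/input correlation. */
    if (n > 0) {
        size_t keep = (size_t)n < HELD_MAX ? (size_t)n : HELD_MAX;
        memcpy(held, buf, keep);
        held_len = keep;
    }
    return n;
}
```

Note that the loader ignores LD_PRELOAD for setuid/setgid binaries, which is the limitation raised in the Cons list above.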
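To make the detect/correlate/filter loop from the summary concrete, here is a minimal sketch under assumptions of ours, not the paper's actual algorithm: detection is stood in for by a SIGSEGV handler (the discussion below notes the scheme installs its own signal handlers), correlation is a byte-level scan of a held input for the faulting pointer value, and the signature is a byte pattern plus a maximum-length bound. The helper names (make_signature, matches) and the demo values (0xdeadbeef, the 512-byte bound) are hypothetical.

```c
/* correlate.c -- illustrative sketch of detect/correlate/filter, not the
 * paper's algorithm.  Build: gcc -o correlate correlate.c */
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

struct signature {
    unsigned char pattern[sizeof(uintptr_t)]; /* corrupted pointer bytes */
    size_t pattern_len;
    size_t max_len;          /* recent benign messages were all shorter */
};

/* Correlate a faulting pointer value with a held input by scanning for
 * the pointer's in-memory byte representation. */
static int make_signature(uintptr_t fault_addr, const unsigned char *msg,
                          size_t msg_len, size_t benign_max,
                          struct signature *sig)
{
    unsigned char needle[sizeof(uintptr_t)];
    memcpy(needle, &fault_addr, sizeof(needle));
    if (memmem(msg, msg_len, needle, sizeof(needle)) == NULL)
        return 0;                        /* no correlation: don't generate */
    memcpy(sig->pattern, needle, sizeof(needle));
    sig->pattern_len = sizeof(needle);
    sig->max_len = benign_max;           /* conservative size criterion */
    return 1;
}

/* Filter: drop a message that exceeds the size bound or contains the
 * signature's byte pattern. */
static int matches(const struct signature *sig, const unsigned char *msg,
                   size_t msg_len)
{
    return msg_len > sig->max_len ||
           memmem(msg, msg_len, sig->pattern, sig->pattern_len) != NULL;
}

/* Stand-in for the detection step: a SIGSEGV handler recording the
 * faulting address before handing off for signature generation. */
static volatile uintptr_t g_fault_addr;
static void on_segv(int signo, siginfo_t *info, void *ctx)
{
    (void)signo; (void)ctx;
    g_fault_addr = (uintptr_t)info->si_addr;
    _exit(1);    /* a real system would notify the external component */
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* Synthetic attack input embedding the pointer value 0xdeadbeef. */
    unsigned char attack[600];
    memset(attack, 'A', sizeof(attack));
    uintptr_t addr = 0xdeadbeef;
    memcpy(attack + 100, &addr, sizeof(addr));

    struct signature sig;
    if (make_signature(addr, attack, sizeof(attack), 512, &sig))
        printf("signature generated: max_len=%zu\n", sig.max_len);
    printf("attack filtered: %s\n",
           matches(&sig, attack, sizeof(attack)) ? "yes" : "no");
    return 0;
}
```

On this toy input the attack is filtered both because it carries the corrupted pointer bytes and because it exceeds the conservative size bound; the paper's actual signatures are additionally tied to an identified message type.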
Discussion

How does the input get defined to a message type? The paper doesn't describe how this is handled. Are all inputs held? What happens when a message cannot be classified?

In testing, they report only 10% overhead. What is this 10% relative to? Is it related to the 100 held inputs they used in simulation, or to something more general?

The attacker doesn't have control over the order in which the server processes packets, so flooding the server might not be a successful attack, since flooding may not prevent execution of the first packet. This was brought up in response to a proposed attack: if the server maintains a size-n queue of held packets, send n-1 valid packets and then a malicious packet. This could also be addressed by increasing the current state buffer size.

Regarding firewall propagation, signatures are generated on hosts and pushed up to a firewall. Is this still possible if packets are protected by some encryption protocol? Probably not, since the firewall can't read the packets as they come in. Does the scheme even work with application-layer encryption? Again, likely not, since signature matching occurs before the application decrypts.

This scheme replaces the OS signal handlers with its own. Since they use their own handlers, flaws in user programs (e.g., bugs that cause segfaults) could create signatures that prevent applications from running.

Most of the analysis in this paper assumes that attack traffic and normal traffic have very different packet sizes. This is true for the most part, but not always. Similar work in this area uses average packet sizes and then filters out packets much larger than the average.

The advantage of this signature generation over just using ASR was addressed. The scheme is modular, meaning that ASR isn't required, but some detection technique is needed. Additionally, ASR-generated signatures can now be pushed up to firewalls if individual host protection is not a feasible administrative task (e.g., in a corporate environment). Also, once a signature is generated, the ASR-induced segfaults for that attack no longer have to be dealt with.

The paper was unclear about exactly how signatures are matched to packets as they arrive. Matching is done on content and size criteria, but the specifics of the content comparison were left out. Is it done by exact matching of binary expressions? By a regular-expression approach?

Votes

Strong Accept - 1
Accept - 9