Results 1 - 10
of
31
D.: Polyglot: automatic extraction of protocol message format using dynamic binary analysis
- In: CCS ’07: Proceedings of the 14th ACM conference on Computer and communications security
, 2007
"... Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation, without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on netwo ..."
Abstract
-
Cited by 46 (8 self)
- Add to MetaCart
Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation, without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information on network traces. In this paper we propose a new approach using program binaries. Our approach, shadowing, uses dynamic analysis and is based on a unique intuition—the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ. We compare our results with the manually crafted message format, included in Wireshark, one of the state-ofthe-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.
Discoverer: Automatic protocol reverse engineering from network traces
- In Proceedings of the 16th USENIX Security Symposium (Security’07
, 2007
"... Application-level protocol specifications are useful for many security applications, including intrusion prevention and detection that performs deep packet inspection and traffic normalization, and penetration testing that generates network inputs to an application to uncover potential vulnerabiliti ..."
Abstract
-
Cited by 33 (1 self)
- Add to MetaCart
Application-level protocol specifications are useful for many security applications, including intrusion prevention and detection that performs deep packet inspection and traffic normalization, and penetration testing that generates network inputs to an application to uncover potential vulnerabilities. However, current practice in deriving protocol specifications is mostly manual. In this paper, we present Discoverer, a tool for automatically reverse engineering the protocol message formats of an application from its network trace. A key property of Discoverer is that it operates in a protocol-independent fashion by inferring protocol idioms commonly seen in message formats of many application-level protocols. We evaluated the efficacy of Discoverer over one text protocol (HTTP) and two binary protocols (RPC and CIFS/SMB) by comparing our inferred formats with true formats obtained from Ethereal [5]. For all three protocols, more than 90 % of our inferred formats correspond to exactly one true format; one true format is reflected in five inferred formats on average; our inferred formats cover over 95 % of messages, which belong to 30-40 % of true formats observed in the trace. 1
Automatic Network Protocol Analysis
- Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS’08
, 2008
"... ..."
Grammar-based Whitebox Fuzzing
"... Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. However, the effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compile ..."
Abstract
-
Cited by 32 (1 self)
- Add to MetaCart
Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. However, the effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in the early processing stages, whitebox fuzzing rarely reaches parts of the application beyond the first stages. In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammarbased specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7. Results of experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53 % to 81% while using three times fewer tests.
Tupni: Automatic Reverse Engineering of Input Formats
- In Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS
, 2008
"... Recent work has established the importance of automatic reverse engineering of protocol or file format specifications. However, the formats reverse engineered by previous tools have missed important information that is critical for security applications. In this paper, we present Tupni, a tool that ..."
Abstract
-
Cited by 28 (2 self)
- Add to MetaCart
Recent work has established the importance of automatic reverse engineering of protocol or file format specifications. However, the formats reverse engineered by previous tools have missed important information that is critical for security applications. In this paper, we present Tupni, a tool that can reverse engineer an input format with a rich set of information, including record sequences, record types, and input constraints. Tupni can generalize the format specification over multiple inputs. We have implemented a prototype of Tupni and evaluated it on 10 different formats: five file formats (WMF, BMP, JPG, PNG and TIF) and five network protocols (DNS, RPC, TFTP, HTTP and FTP). Tupni identified all record sequences in the test inputs. We also show that, by aggregating over multiple WMF files, Tupni can derive a more complete format specification for WMF. Furthermore, we demonstrate the utility of Tupni by using the rich information it provides for zeroday vulnerability signature generation, which was not possible with previous reverse engineering tools.
Generic application-level protocol analyzer and its language
, 2005
"... The Shield project relied on application protocol analyzers to detect potential exploits of application vulnerabilities. We present the design of a second-generation generic application-level protocol analyzer (GAPA) that encompasses a domain-specific language and the associated run-time. We designe ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
The Shield project relied on application protocol analyzers to detect potential exploits of application vulnerabilities. We present the design of a second-generation generic application-level protocol analyzer (GAPA) that encompasses a domain-specific language and the associated run-time. We designed GAPA to satisfy three important goals: safety, real-time analysis and response, and rapid development of analyzers. We have found that these goals are relevant for many network monitors that implement protocol analysis. Therefore, we built GAPA to be readily integrated into tools such as Ethereal as well as Shield. GAPA preserves safety through the use of a memorysafe language for both message parsing and analysis, and through various techniques to reduce the amount of state maintained in order to avoid denial-of-service attacks. To support online analysis, the GAPA runtime uses a streamprocessing model with incremental parsing. In order to speed protocol development, GAPA uses a syntax similar to many protocol RFCs and other specifications, and incorporates many common protocol analysis tasks as built-in abstractions. We have specified 10 commonly used protocols in the GAPA language and found it expressive and easy to use. We measured our GAPA prototype and found that it can handle an enterprise client HTTP workload at up to 60 Mbps, sufficient performance for many end-host firewall/IDS scenarios. At the same time, the trusted code base of GAPA is an order of magnitude smaller than Ethereal. 1
Shieldgen: Automatic data patch generation for unknown vulnerabilities with informed probing
- In In Proceedings of 2007 IEEE Symposium on Security and Privacy
, 2007
"... In this paper, we present ShieldGen, a system for automatically generating a data patch or a vulnerability signature for an unknown vulnerability, given a zero-day attack instance. The key novelty in our work is that we leverage knowledge of the data format to generate new potential attack instances ..."
Abstract
-
Cited by 22 (2 self)
- Add to MetaCart
In this paper, we present ShieldGen, a system for automatically generating a data patch or a vulnerability signature for an unknown vulnerability, given a zero-day attack instance. The key novelty in our work is that we leverage knowledge of the data format to generate new potential attack instances, which we call probes, and use a zero-day detector as an oracle to determine if an instance can still exploit the vulnerability; the feedback of the oracle guides our search for the vulnerability signature. We have implemented a ShieldGen prototype and experimented with three known vulnerabilities. The generated signatures have no false positives and a low rate of false negatives due to imperfect data format specifications and the sampling technique used in our probe generation. Overall, they are significantly more precise than the signatures generated by existing schemes. We have also conducted a detailed study of 25 vulnerabilities for which Microsoft has issued security bulletins between 2003 and 2006. We estimate that ShieldGen can produce high quality signatures for a large portion of those vulnerabilities and that the signatures are superior to the signatures generated by existing schemes. 1.
Dispatcher: Enabling active botnet infiltration using automatic protocol reverse-engineering
- In CCS’09: of the 16th ACM conference on Computer and communications security
, 2009
"... Automatic protocol reverse-engineering is important for many security applications, including the analysis and defense against botnets. Understanding the command-and-control (C&C) protocol used by a botnet is crucial for anticipating its repertoire of nefarious activity and to enable active botnet i ..."
Abstract
-
Cited by 19 (3 self)
- Add to MetaCart
Automatic protocol reverse-engineering is important for many security applications, including the analysis and defense against botnets. Understanding the command-and-control (C&C) protocol used by a botnet is crucial for anticipating its repertoire of nefarious activity and to enable active botnet infiltration. Frequently, security analysts need to rewrite messages sent and received by a bot in order to contain malicious activity and to provide the botmaster with an illusion of successful and unhampered operation. To enable such rewriting, we need detailed information about the intent and structure of the messages in both directions of the communication despite the fact that we generally only have access to the implementation of one endpoint, namely the bot binary. Current techniques cannot enable such rewriting. In this paper, we propose techniques to extract the format of protocol messages sent by an
Melange: Creating a ”functional” internet
- In EuroSys
, 2007
"... Most implementations of critical Internet protocols are written in type-unsafe languages such as C or C++ and are regularly vulnerable to serious security and reliability problems. Type-safe languages eliminate many errors but are not used to due to the perceived performance overheads. We combine tw ..."
Abstract
-
Cited by 10 (4 self)
- Add to MetaCart
Most implementations of critical Internet protocols are written in type-unsafe languages such as C or C++ and are regularly vulnerable to serious security and reliability problems. Type-safe languages eliminate many errors but are not used to due to the perceived performance overheads. We combine two techniques to eliminate this performance penalty in a practical fashion: strong static typing and generative metaprogramming. Static typing eliminates run-time type information by checking safety at compile-time and minimises dynamic checks. Meta-programming uses a single specification to abstract the lowlevel code required to transmit and receive packets. Our domain-specific language, MPL, describes Internet packet protocols and compiles into fast, zero-copy code for both parsing and creating these packets. MPL is designed for implementing quirky Internet protocols ranging from the low-level: Ethernet, IPv4, ICMP and TCP; to the complex application-level: SSH, DNS and BGP; and even file-system protocols such as 9P. We report on fully-featured SSH and DNS servers constructed using MPL and our OCaml framework MELANGE, and measure greater throughput, lower latency, better flexibility and more succinct source code than their C equivalents OpenSSH and BIND. Our quantitative analysis shows that the benefits of MPL-generated code overcomes the additional overheads of automatic garbage collection and dynamic bounds checking. Qualitatively, the flexibility of our approach shows that dramatic optimisations are easily possible. 1.
Rethinking Hardware Support for Network Analysis and Intrusion Prevention
, 2006
"... The performance pressures on implementing effective network security monitoring are growing fiercely due to rising traffic rates, the need to perform much more sophisticated forms of analysis, the requirement for inline processing, and the collapse of Moore's law for sequential processing. Given the ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
The performance pressures on implementing effective network security monitoring are growing fiercely due to rising traffic rates, the need to perform much more sophisticated forms of analysis, the requirement for inline processing, and the collapse of Moore's law for sequential processing. Given these growing pressures, we argue that it is time to fundamentally rethink the nature of using hardware to support network security analysis. Clearly, to do so we must leverage massively parallel computing elements, as only these can provide the necessary performance. The key, however, is to devise an abstraction of parallel processing that will allow us to expose the parallelism latent in semantically rich, stateful analysis algorithms; and that we can then further compile to hardware platforms with different capabilities.

