Results 1 -
3 of
3
Incremental Learning of System Log Formats
"... System logs come in a large and evolving variety of formats, many of which are semi-structured and/or non-standard. As a consequence, off-the-shelf tools for processing such logs often do not exist, forcing analysts to develop their own tools, which is costly and time-consuming. In this paper, we pr ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
System logs come in a large and evolving variety of formats, many of which are semi-structured and/or non-standard. As a consequence, off-the-shelf tools for processing such logs often do not exist, forcing analysts to develop their own tools, which is costly and time-consuming. In this paper, we present an incremental algorithm that automatically infers the format of system log files. From the resulting format descriptions, we can generate a suite of data processing tools automatically. The system can handle large-scale data sources whose formats evolve over time. Furthermore, it allows analysts to modify inferred descriptions as desired and incorporates those changes in future revisions. 1
Forensic Triage for Mobile Phones with DEC0DE
"... We present DEC0DE, a system for recovering information from phones with unknown storage formats, a critical problem for forensic triage. Because phones have myriad custom hardware and software, we examine only the stored data. Via flexible descriptions of typical data structures, and using a classic ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
We present DEC0DE, a system for recovering information from phones with unknown storage formats, a critical problem for forensic triage. Because phones have myriad custom hardware and software, we examine only the stored data. Via flexible descriptions of typical data structures, and using a classic dynamic programming algorithm, we are able to identify call logs and address book entries in phones across varied models and manufacturers. We designed DEC0DE by examining the formats of one set of phone models, and we evaluate its performance on other models. Overall, we are able to obtain high performance for these unexamined models: an average recall of 97 % and precision of 80 % for call logs; and average recall of 93 % and precision of 52 % for address books. Moreover, at the expense of recall dropping to 14%, we can increase precision of address book recovery to 94 % by culling results that don’t match between call logs and address book entries on the same phone. 1
Incremental Learning of System Log Formats [Extended Abstract]
"... System logs come in a large and evolving variety of formats, many of which are semi-structured and/or non-standard. As a consequence, off-the-shelf tools for processing such logs often do not exist, forcing analysts to develop their own tools, which is costly and timeconsuming. In this paper, we pre ..."
Abstract
- Add to MetaCart
System logs come in a large and evolving variety of formats, many of which are semi-structured and/or non-standard. As a consequence, off-the-shelf tools for processing such logs often do not exist, forcing analysts to develop their own tools, which is costly and timeconsuming. In this paper, we present an incremental algorithm that automatically infers the format of system log files. From the resulting format descriptions, we can generate a suite of data processing tools automatically. The system can handle large-scale data sources whose formats evolve over time. Furthermore, it allows analysts to modify inferred descriptions as desired and incorporates those changes in future revisions. Categories and Subject Descriptors

