- Number of slides: 22
Prophiler: A fast filter for the large-scale detection of malicious web pages
Reporter: 鄭志欣
Advisor: Hsing-Kuo Pao
Date: 2011/03/31
Conference
• Davide Canali, Marco Cova, Giovanni Vigna and Christopher Kruegel, "Prophiler: a Fast Filter for the Large-Scale Detection of Malicious Web Pages", 20th International World Wide Web Conference (WWW 2011)
Outline
• Introduction
• Approach
• Implementation and Setup
• Evaluation
• Conclusion
Introduction
• Malicious web pages
– Drive-by download: JavaScript
– Compromising hosts
– Large-scale botnets
• Static analysis vs. dynamic analysis
– Dynamic analysis takes a lot of time.
– Static analysis reduces the resources required for large-scale analysis.
– URL blacklists (Google Safe Browsing)
– Honeyclients: Wepawet, PhoneyC, JSUnpack
– Combined?
• Quickly discard benign pages before forwarding the rest to the costly analysis tools (Wepawet).
Prophiler
• Prophiler uses static analysis techniques to quickly examine a web page for malicious content.
• HTML, JavaScript, URL information
• Model: built using machine-learning techniques
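The front-end role described on this slide can be sketched in a few lines. This is only an illustration: `prefilter` and `toy_static_check` are hypothetical names, and the string-matching check is a crude stand-in for the learned models, not Prophiler's actual classifier.

```python
from typing import Callable, Iterable, List


def prefilter(pages: Iterable[str], is_suspicious: Callable[[str], bool]) -> List[str]:
    """Keep only the pages that the cheap static check cannot clear."""
    return [page for page in pages if is_suspicious(page)]


def toy_static_check(page: str) -> bool:
    """Hypothetical stand-in for the learned models: flags two crude markers."""
    lowered = page.lower()
    return "<iframe" in lowered or "eval(" in lowered


pages = [
    "<html><body>Hello</body></html>",
    "<html><iframe src='http://example.test/x' width=0></iframe></html>",
    "<script>eval(unescape('%61'))</script>",
]

# Only the flagged pages would be handed to the costly dynamic analyzer.
forwarded = prefilter(pages, toy_static_check)
print(f"{len(forwarded)} of {len(pages)} pages go on to dynamic analysis")
```

Passing the check as a function keeps the filter independent of how the suspicion decision is made, so a trained model could replace the toy check without changing the pipeline.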
Approach
• Features
– Neko HTML Parser
– HTML, JavaScript, URL information
– Total features: 77
– New features: 17
• Models
Features
Reference Paper
• [26] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages with Static Heuristics. In Proceedings of the Australasian Telecommunication Networks and Applications Conference (ATNAC), 2008.
• [16] P. Likarish, E. Jung, and I. Jo. Obfuscated Malicious Javascript Detection using Classification Techniques. In Proceedings of the Conference on Malicious and Unwanted Software (Malware), 2009.
• [6] B. Feinstein and D. Peck. Caffeine Monkey: Automated Collection, Detection and Analysis of Malicious JavaScript. In Proceedings of the Black Hat Security Conference, 2007.
• [17] J. Ma, L. Saul, S. Savage, and G. Voelker. Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2009.
• [25] C. Seifert, I. Welch, and P. Komisarczuk. Identification of Malicious Web Pages Through Analysis of Underlying DNS and Web Server Relationships. In Proceedings of the LCN Workshop on Network Security (WNS), 2008.
Effectiveness of new features
• HTML (7)
– Number of elements containing suspicious content
– Number of iframes
– Number of elements with a small area
– The whitespace percentage of the web page
– The page length in characters
– The presence of meta refresh tags
– The percentage of scripts in the page
• JavaScript (4)
– Shellcode presence probability (J48)
– The presence of decoding routines
– The maximum string length
– The entropy of the scripts
• URL and Host (5)
– TLD of the URL
– The absence of a subdomain in the URL
– The TTL of the host's DNS A record
– The presence of a suspicious domain name or file name
– The presence of a port number in the URL
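A few of the listed features are simple enough to compute with the Python standard library. The sketch below is an assumption-laden illustration, not the authors' extractor: the function and key names are made up, and the subdomain test naively assumes a two-label registered domain (so hosts such as `example.co.uk` would be misjudged).

```python
import math
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _TagCounter(HTMLParser):
    """Counts iframes and meta refresh tags and collects inline script text."""

    def __init__(self):
        super().__init__()
        self.iframes = 0
        self.meta_refresh = 0
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "iframe":
            self.iframes += 1
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            self.meta_refresh += 1
        elif tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for empty input."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in Counter(text).values())


def html_features(page):
    """Illustrative subset of the HTML and JavaScript features."""
    parser = _TagCounter()
    parser.feed(page)
    parser.close()
    length = max(len(page), 1)
    script_text = "".join(parser.scripts)
    return {
        "iframes": parser.iframes,
        "meta_refresh": parser.meta_refresh,
        "page_length": len(page),
        "whitespace_pct": 100.0 * sum(ch.isspace() for ch in page) / length,
        "script_pct": 100.0 * len(script_text) / length,
        "script_entropy": shannon_entropy(script_text),
    }


def url_features(url):
    """Illustrative subset of the URL features (naive subdomain test)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    return {
        "tld": labels[-1] if labels else "",
        "no_subdomain": len(labels) <= 2,
        "has_port": parts.port is not None,
    }


page = ('<html><body><iframe src="x"></iframe>'
        '<meta http-equiv="refresh" content="0;url=http://a">'
        '<script>aabb</script></body></html>')
print(html_features(page))
print(url_features("http://example.com:8080/a.exe"))
```

Features such as the DNS TTL or the shellcode probability need network lookups or a trained model and are left out here.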
Discussion
• Assumptions
– First, the distribution of feature values for malicious examples is different from that of benign examples.
– Second, the datasets used for model training share the same feature distribution as the real-world data evaluated with the models.
• Trade-offs
– False negatives vs. false positives
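The false negative vs. false positive trade-off can be made concrete with a toy example. The scores and labels below are invented for illustration, and the thresholding scheme is an assumption rather than something stated on the slide; the rates use the usual definitions (false positives over benign pages, false negatives over malicious pages).

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) for one threshold.

    scores: classifier scores, higher means more suspicious.
    labels: True for malicious pages, False for benign pages.
    """
    flagged = [score >= threshold for score in scores]
    false_pos = sum(f and not y for f, y in zip(flagged, labels))
    false_neg = sum((not f) and y for f, y in zip(flagged, labels))
    benign = labels.count(False)
    malicious = labels.count(True)
    return false_pos / benign, false_neg / malicious


# Made-up data: three benign and three malicious pages.
scores = [0.1, 0.3, 0.45, 0.55, 0.8, 0.9]
labels = [False, False, True, False, True, True]

strict = rates_at_threshold(scores, labels, 0.7)   # flags few pages
lenient = rates_at_threshold(scores, labels, 0.2)  # flags many pages
print("threshold 0.7 -> FP %.2f, FN %.2f" % strict)
print("threshold 0.2 -> FP %.2f, FN %.2f" % lenient)
```

Lowering the threshold removes the missed malicious page at the cost of forwarding more benign ones. For a front-end filter, a page that is wrongly cleared is never examined again, so leaning toward fewer false negatives is one plausible choice.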
Implementation and Setup (cont.)
• Prophiler acts as a filter for our existing dynamic analysis tool, Wepawet.
• URL collection: Heritrix (tool), spam email
• Terms from Twitter, Google, and Wikipedia trends
• URLs collected: 2,000 URLs/day
Implementation and Setup
• The crawler fetches pages and submits them as input to Prophiler.
• Server:
– Ubuntu Linux x64 v9.10
– 8-core Intel Xeon processor and 8 GB of RAM
• In this configuration, the system can analyze 320,000 pages/day on average.
• The analysis must examine around 2 million URLs each day.
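To put the 320,000 pages/day figure in perspective, a quick back-of-the-envelope conversion gives the sustained rate the server has to keep up. The variable names are illustrative, and the calculation assumes the load is spread evenly over 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

pages_per_day = 320_000  # average throughput quoted on the slide

pages_per_second = pages_per_day / SECONDS_PER_DAY
seconds_per_page = SECONDS_PER_DAY / pages_per_day

print(f"{pages_per_second:.2f} pages/second")
print(f"{seconds_per_page:.2f} seconds of wall-clock time per page")
```

Under that even-load assumption the system handles a little under four pages per second, or roughly a quarter of a second of wall-clock time per page.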
Evaluation
• Total: 20 million web pages
Evaluation (cont.)
• Training set:
– 787 pages from Wepawet's database
– 51,171 pages from the top 100 Alexa websites
– Google Safe Browsing API, anti-virus, experts
– 10-fold cross-validation
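The 10-fold procedure mentioned on this slide can be sketched as an index split. This is a generic, unstratified version written for illustration; `k_fold_indices` is a hypothetical helper, and the slide does not say how the folds were actually drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled list gives k folds whose sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield train, test


n_samples = 787 + 51_171  # the two page counts listed on the slide
splits = list(k_fold_indices(n_samples, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train on {len(train)}, test on {len(test)}")
```

Each page is used for testing exactly once and for training in the other nine rounds. With classes as unbalanced as these counts, a stratified split that keeps the class ratio in every fold would be a reasonable refinement.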
Evaluation (cont.)
• Validation
– 153,115 pages
– Submitting them to Wepawet took 15 days
– Benign: 139,321 pages
– Malicious: 13,794 pages
– False positives: 10.4%
– False negatives: 0.54%
– Saves valuable resources
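The validation numbers can be turned into rough page counts. The slide does not state the denominators, so the sketch below assumes the conventional ones (false positives as a share of benign pages, false negatives as a share of malicious pages); the derived figures are estimates under that assumption, not results reported in the source.

```python
benign = 139_321
malicious = 13_794
total = benign + malicious  # matches the 153,115 pages on the slide

fp_rate = 0.104   # false positive rate from the slide
fn_rate = 0.0054  # false negative rate from the slide

# Assumption: FP rate is relative to benign pages, FN rate to malicious pages.
false_positives = benign * fp_rate
false_negatives = malicious * fn_rate
true_positives = malicious - false_negatives

forwarded = true_positives + false_positives  # pages sent on to Wepawet
print(f"benign pages forwarded by mistake: about {false_positives:,.0f}")
print(f"malicious pages missed: about {false_negatives:,.0f}")
print(f"share of pages forwarded: {forwarded / total:.1%}")
```

Under this reading, fewer than one page in five would still need dynamic analysis, which is where the resource saving on the slide would come from.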
Evaluation (cont.)
• Large-scale evaluation
– 18,939,908 pages
– Ran for 60 days
– 14.3% flagged as malicious
– 85.7% reduction of load on the back-end analyzer
– 1,968 malicious pages/day (by Wepawet)
– False positive rate: 13.7%
– False negative rate: 1%
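The load figures on this slide follow from simple arithmetic, sketched below with illustrative variable names. The per-day values assume the pages were spread evenly over the 60 days.

```python
total_pages = 18_939_908
days = 60
flagged_share = 0.143  # share of pages flagged as malicious by the filter

pages_per_day = total_pages / days
flagged_pages = total_pages * flagged_share
flagged_per_day = flagged_pages / days
load_reduction = 1.0 - flagged_share

print(f"pages processed per day: about {pages_per_day:,.0f}")
print(f"pages forwarded to the back end: about {flagged_pages:,.0f}")
print(f"forwarded per day: about {flagged_per_day:,.0f}")
print(f"load reduction: {load_reduction:.1%}")
```

The daily average of roughly 316,000 pages lines up with the 320,000 pages/day throughput quoted for the server, and the 85.7% reduction is simply the complement of the 14.3% flagged share.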
1,968 pages flagged as malicious by Wepawet every day
Evaluation (cont.)
• Comparison
– 15,000 web pages
– Malicious: 5,861 pages
– Benign: 9,139 pages
Conclusion
• We developed Prophiler, a filter that reduces the number of web pages that must be analyzed dynamically to identify malicious ones.
• We deployed the system as a front-end for Wepawet, with a very small false negative rate.