File(s) under embargo
Reason: This work needs to be published in a journal paper
A MACHINE LEARNING BASED WEB SERVICE FOR MALICIOUS URL DETECTION IN A BROWSER
Thesis posted on 12.12.2019 by Hafiz Muhammad Junaid Khan
Malicious URLs pose serious cyber-security threats to Internet users. It is critical to detect malicious URLs so that they can be blocked before users access them. In the past few years, several techniques have been proposed to differentiate malicious URLs from benign ones with the help of machine learning. Machine learning algorithms learn trends and patterns in a data-set and use them to identify anomalies. In this work, we attempt to find generic features for detecting malicious URLs by analyzing two publicly available malicious URL data-sets. To this end, we identify a list of substantial features that can be used to classify all types of malicious URLs. Then, we select the most significant lexical features by using Chi-Square and ANOVA based statistical tests. The effectiveness of these feature sets is then tested by using a combination of single and ensemble machine learning algorithms. We build a machine learning based real-time malicious URL detection system as a web service to detect malicious URLs in a browser. We implement a Chrome extension that intercepts the browser's URL requests and sends them to the web service for analysis. We also implement the web service, which classifies a URL as benign or malicious using the saved ML model. Finally, we evaluate the performance of the web service to test whether it is scalable.
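To make the notion of lexical features concrete, the following is a minimal sketch of how such features might be computed from a URL string alone. The feature names and the function `lexical_features` are illustrative assumptions; the abstract does not list the features actually used in the thesis.

```python
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features of a URL string.

    These features are assumptions made for this sketch, not the
    feature set selected in the thesis.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": int("@" in url),
        # Rough check for a dotted-quad host such as 192.168.0.1
        "host_is_ipv4": int(host.count(".") == 3 and host.replace(".", "").isdigit()),
        "uses_https": int(parts.scheme == "https"),
    }


if __name__ == "__main__":
    for name, value in lexical_features("http://192.168.0.1/login-update.php?id=42").items():
        print(f"{name}: {value}")
```

All of these values are non-negative counts or flags, which is the kind of input a Chi-Square test expects. One possible way to rank such features, assuming scikit-learn is available, is `SelectKBest` with the `chi2` or `f_classif` (ANOVA F-test) scoring functions; whether the thesis used that library is not stated.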