HoneySurfer: Intelligent Web-Surfing Honeypots
In Singapore’s evolving cyber landscape, 96% of organisations have suffered at least one cyber attack and 95% reported facing more sophisticated attacks within the past year, according to a 2019 report[1] by Carbon Black. More tools are therefore needed to counter the increasingly refined attacks mounted by malicious actors. Honeypots are effective tools for studying and mitigating these attacks. They work as decoy systems, typically deployed alongside real systems to capture and log an attacker’s activities. They are useful because they can actively detect potential attacks, help cybersecurity specialists study an attacker’s tactics and even misdirect attackers from their intended targets. Honeypots fall into two main categories: low-interaction honeypots, which merely emulate network services and internet protocols and allow only limited interaction with the attacker, and high-interaction honeypots, which emulate operating systems and allow much more interaction. Although honeypots are powerful tools, their value diminishes once attackers uncover their true identity. Attackers are becoming more skilled at doing so through system fingerprinting and analysis of a target’s network traffic, which hinders honeypots from capturing more experienced attackers. While substantial research has been done on defending against system fingerprinting scans (see 1.1 Related Work), little has been done to defend against network traffic analysis. As pointed out by Symantec[2][3], when attackers sniff the network traffic of a system, an absence of traffic raises a red flag and increases the likelihood of the honeypot’s true identity being discovered. In addition, since the main concern in honeypot deployment is the ability to attract and engage attackers for a substantial period of time, an increased ability to interest malicious actors is invaluable, and producing human-like network activity on a honeypot would appeal to more of them. Hence, this research aims to build an intelligent web-surfer that learns and simulates human web-surfing behaviour, creating evidence of human network activity that disguises honeypots as production systems and lures in more attackers interested in packet sniffing for malicious purposes.
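The abstract does not specify how the web-surfer learns, so the following is only a minimal illustrative sketch: a surfer that crawls outward from a seed page, choosing links (here uniformly at random, where a learned behaviour model would plug in) and pausing for log-normally distributed “think times” to mimic human dwell behaviour. The libraries (requests, BeautifulSoup) and names such as surf and pick_next_link are assumptions, not part of the original work.

```python
# Minimal illustrative sketch (not the authors' implementation): a decoy
# "web surfer" that follows links with human-like pauses so a honeypot's
# outbound traffic looks like ordinary browsing. The link-choice policy is
# a plain random walk; a learned behaviour model would replace pick_next_link().
import random
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def pick_next_link(links):
    """Placeholder policy: choose the next link uniformly at random."""
    return random.choice(links)


def surf(start_url, max_pages=20):
    url = start_url
    for _ in range(max_pages):
        resp = requests.get(url, timeout=10)
        soup = BeautifulSoup(resp.text, "html.parser")
        links = [urljoin(url, a["href"]) for a in soup.find_all("a", href=True)]
        if not links:
            break
        # Log-normal pause roughly mimics human dwell-time distributions.
        time.sleep(random.lognormvariate(1.5, 0.8))
        url = pick_next_link(links)


if __name__ == "__main__":
    surf("http://example.com")
```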
Enhancement of Online Stochastic Gradient Descent using Backward Queried Images
Stochastic gradient descent (SGD) is one of the preferred online optimization algorithms. However, one of its major drawbacks is its predisposition to forgetting previous data when optimizing over a data stream, also known as catastrophic interference. In this project, we attempt to mitigate this drawback by proposing a new low-cost approach that incorporates backward queried images into SGD during online training. Under this approach, for every new training sample arriving through the data stream, the neural network is also optimized on the corresponding backward queried image from the initial dataset. We compared the accuracy of the proposed method against SGD under a data stream of 50,000 training cases with 10,000 test cases on two MNIST-style datasets (Fashion and Kuzushiji). The proposed method yields substantial improvements in the performance of the neural network, classifying both datasets with higher mean, minimum, lower-quartile, median, and upper-quartile accuracy while maintaining a lower standard deviation in performance, demonstrating that our proposed algorithm can be a potential alternative to online SGD.
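As a concrete illustration of the update rule described above, here is a minimal PyTorch sketch. It assumes the backward queried images have been precomputed as one prototype tensor per class from the initial dataset (the abstract does not detail their construction), and names such as bq_images and online_step are illustrative rather than taken from the paper.

```python
# Sketch of one online update: for each streamed sample (x, y), take an SGD
# step on both the new sample and the backward-queried image of its class.
# Assumes `bq_images[c]` holds a precomputed prototype tensor for class c.
import torch
import torch.nn.functional as F


def online_step(model, optimizer, x, y, bq_images):
    """One online update on a single (x, y) pair plus its backward-queried image."""
    model.train()
    optimizer.zero_grad()
    x_bq = bq_images[y]                      # backward-queried prototype for this class
    inputs = torch.stack([x, x_bq])          # mini-batch of size 2
    targets = torch.tensor([y, y])
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```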
Predicting the Binding Affinity between Medicine and Estrogen Receptor Beta
Recent studies show that the probability of Taiwanese females developing breast cancer has risen dramatically over the past 30 years, and breast cancer patients in Taiwan are becoming both more numerous and younger. What makes the matter even more severe is that patients who take cancer-treating medicine suffer from serious side effects; some may even lose the ability to reproduce. We hope to develop a system that helps doctors and researchers develop new medicine for treating breast cancer; such medicines treat cancer tumours by attaching onto the affected cells’ receptors. After collecting MACCS data (converted from SMILES), the dataset is used to train the machine learning program. Because of insufficient training data, we used an ensemble method to build our model. Among the three basic ensemble techniques, max voting, averaging, and weighted averaging, we selected max voting for the predictions in this research. We created two separate datasets, positive and negative, which were later used as training data. Since the appropriate ratio of positive to negative samples in the training data was unclear, we compared 40 different ratios and evaluated the results. By comparing the accuracy of the resulting models, we found that a positive-to-negative ratio of 1:3000 gives the highest precision. After creating the final model by voting among the 1,000 generated models, we evaluated it using AUC, precision, and recall. The ultimate goal of this research is to help doctors and researchers shorten the process of developing and testing new medicines.
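A rough sketch of the pipeline described above, assuming RDKit for the SMILES-to-MACCS conversion and scikit-learn-style classifiers combined by hard (max) voting; the base model, the bootstrap sampling, and the ensemble size used here (10 rather than 1,000) are placeholders for brevity, not the authors’ actual choices.

```python
# Illustrative only: encode SMILES as MACCS keys, train several classifiers
# on resampled data, and combine them by hard (majority / "max") voting.
import numpy as np
from rdkit import Chem
from rdkit.Chem import MACCSkeys
from sklearn.tree import DecisionTreeClassifier


def smiles_to_maccs(smiles_list):
    """Encode SMILES strings as 167-bit MACCS fingerprints."""
    fps = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        fps.append(np.array(list(MACCSkeys.GenMACCSKeys(mol))))
    return np.vstack(fps)


def train_ensemble(X, y, n_models=10, seed=0):
    """Train n_models base classifiers on bootstrap resamples of (X, y)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=len(X), replace=True)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models


def max_vote_predict(models, X):
    """Hard majority vote over the ensemble members' binary predictions."""
    votes = np.stack([m.predict(X) for m in models])   # shape: (n_models, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

Evaluation with AUC, precision, and recall on a held-out set can then be done with sklearn.metrics.roc_auc_score, precision_score, and recall_score.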
Limited Query Black-box Adversarial Attacks in the Real World
We study the creation of physical adversarial examples, which are robust to real-world transformations, using a limited number of queries to the target black-box neural networks. We observe that robust models tend to be especially susceptible to foreground manipulations, which motivates our novel Foreground attack. We demonstrate that gradient priors are a useful signal for black-box attacks and therefore introduce an improved version of the popular SimBA. We also propose an algorithm for transferable attacks that selects the surrogates most similar to the target model. Our black-box attacks outperform the state-of-the-art approaches they are based on and support our belief that the concept of model similarity can be leveraged to build strong attacks in a limited-information setting.
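For reference, the SimBA baseline that the improved attack builds on can be summarised in a few lines. The sketch below follows the original pixel-basis formulation, with model_prob standing in for the black-box probability query; it does not include the gradient-prior or foreground modifications proposed here.

```python
# Compact sketch of vanilla SimBA (untargeted): step along random pixel
# directions and keep a perturbation only if it lowers the true-class
# probability returned by the black-box model.
import numpy as np


def simba(x, model_prob, epsilon=0.2, max_queries=1000):
    """Query-limited attack on an image x in [0, 1]; model_prob(x) -> true-class prob."""
    x_adv = x.copy()
    best_p = model_prob(x_adv)
    dims = np.random.permutation(x.size)        # random order of pixel directions
    queries = 0
    for d in dims:
        if queries >= max_queries:
            break
        for sign in (+1.0, -1.0):
            candidate = x_adv.copy()
            candidate.flat[d] = np.clip(candidate.flat[d] + sign * epsilon, 0.0, 1.0)
            p = model_prob(candidate)
            queries += 1
            if p < best_p:                      # keep the step only if it helps
                x_adv, best_p = candidate, p
                break
    return x_adv, best_p, queries
```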
A Person Re-identification based Misidentification-proof Person Following Service Robot
Two years ago, I attended a robot contest in which one of the missions required the robot to follow a pedestrian. At that time, I used the provided demo program to complete the task, but soon found two main issues: 1. The program follows the closest point read by the depth camera, so if I walk close to a wall, the robot is likely to ‘follow’ the wall instead. 2. The program also fails when another pedestrian crosses between the robot and the target. To address these two issues, we designed a procedure that uses YOLO object detection and person re-identification to re-identify the target and follow it continuously.
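A minimal sketch of the re-identification step in this procedure, assuming a detector that returns person bounding boxes (e.g., YOLO) and a re-ID network that embeds image crops; detect_persons, reid_embed, and the similarity threshold are placeholders rather than the actual implementation.

```python
# Sketch: among all detected people, follow only the one whose re-ID embedding
# is closest (cosine similarity) to the enrolled target's embedding.
import numpy as np

SIMILARITY_THRESHOLD = 0.7  # below this, treat the target as lost


def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def pick_target(frame, target_embedding, detect_persons, reid_embed):
    """Return the bounding box of the enrolled target, or None if not visible."""
    best_box, best_sim = None, SIMILARITY_THRESHOLD
    for box in detect_persons(frame):                 # (x1, y1, x2, y2) boxes from YOLO
        crop = frame[box[1]:box[3], box[0]:box[2]]
        sim = cosine_similarity(reid_embed(crop), target_embedding)
        if sim > best_sim:
            best_box, best_sim = box, sim
    return best_box                                   # follow this box; else stop or search
```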