REU@NMT CSE Research Projects

Project 1: PDF malware detection with visualization techniques

PDF (Portable Document Format) is a file format invented by Adobe for presenting, exchanging, and archiving documents independently of hardware, software, and operating systems. As one of the most widely used file formats, PDF has become a major vector for malware attacks. This is mainly due to the flexibility of the PDF file structure and its ability to embed different kinds of content, such as JavaScript code, encoded streams, and image objects. Attackers can exploit these features to embed malware in PDF files using tools like Metasploit. For example, it has been reported that currently popular ransomware can be hidden inside PDF documents to launch attacks. This project will investigate the use of visualization techniques for PDF malware detection, which has not been explored before. Various local and global image features will be tested to find promising ones for PDF malware detection.
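
As a rough illustration of where a visualization-based pipeline might start, the sketch below lays a file's raw bytes out as a grayscale image and computes one global feature (byte entropy) and one local feature (mean intensity per image row). The function names, the default image width of 16, and the choice of features are assumptions made for this example; the project description does not specify them.

```python
import math
from collections import Counter


def bytes_to_grid(data, width=16):
    """Lay the bytes out row by row as a grayscale image (one byte = one pixel, 0-255)."""
    if not data:
        raise ValueError("empty input")
    # Zero-pad so the last row is complete (padding policy is an assumption).
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def byte_entropy(data):
    """Global feature: Shannon entropy of the byte histogram, in bits (0 to 8)."""
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def row_means(grid):
    """Local feature: mean pixel intensity of each image row."""
    return [sum(row) / len(row) for row in grid]


if __name__ == "__main__":
    sample = b"%PDF-1.4 example bytes standing in for a real file"
    grid = bytes_to_grid(sample)
    print(len(grid), "rows of", len(grid[0]), "pixels")
    print("entropy (bits):", round(byte_entropy(sample), 3))
    print("row means:", [round(m, 1) for m in row_means(grid)])
```

In practice the bytes would come from reading a PDF file in binary mode, and the resulting features would be fed to a classifier; both steps are left out here.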

Project 2: Advanced machine learning techniques for botnet detection

A botnet is a collection of compromised internet-connected devices that are infected by malware and remotely controlled as a group by a botmaster. Owners are usually unaware that their devices are infected. Botnets are currently the main platform for cybercriminals to launch various cyber attacks, including distributed denial-of-service (DDoS) attacks, spam email campaigns, and the installation of spyware to steal sensitive private information. It was reported that botnet malware usage increased by 69.2% in the first quarter of 2017 over the previous quarter, which poses a great challenge to cybersecurity. This project will tackle botnet detection with advanced machine learning techniques. It will adopt network-based detection, which analyzes the characteristics of network flows to identify anomalous traffic between the infected devices and the Command & Control (C&C) server of the botnet. Bio-inspired algorithms will be employed to select a suitable subset of network flow-based features for botnet detection. Advanced machine learning techniques such as ensemble learning and cost-sensitive learning will be explored to improve detection performance.
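
One simple form of cost-sensitivity can be sketched as a decision rule applied to a classifier's output. Assuming some model already produces a probability p that a flow is botnet traffic, a flow is flagged when the expected cost of missing it, p * cost_fn, exceeds the expected cost of a false alarm, (1 - p) * cost_fp. The cost values and function names below are hypothetical, and this thresholding rule is only one of several cost-sensitive techniques the project could explore.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which flagging a flow has the lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)


def flag_flows(scores, cost_fp=1.0, cost_fn=9.0):
    """Flag each flow whose botnet probability exceeds the cost-based threshold."""
    threshold = cost_threshold(cost_fp, cost_fn)
    return [p > threshold for p in scores]


def total_cost(labels, flags, cost_fp=1.0, cost_fn=9.0):
    """Sum the misclassification costs (label 1 = botnet flow, 0 = benign flow)."""
    cost = 0.0
    for label, flagged in zip(labels, flags):
        if label == 1 and not flagged:
            cost += cost_fn  # missed botnet flow
        elif label == 0 and flagged:
            cost += cost_fp  # benign flow flagged by mistake
    return cost


if __name__ == "__main__":
    scores = [0.05, 0.2, 0.6]  # hypothetical classifier outputs
    labels = [0, 1, 1]
    flags = flag_flows(scores)
    print(flags, total_cost(labels, flags))
```

With a missed detection assumed to be nine times as costly as a false alarm, the threshold drops from the usual 0.5 to 0.1, so the flow scored 0.2 is flagged as well.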

Project 3: Efficient access control model for smart grids

Smart grids integrate cyber infrastructure with the legacy power grid to enable efficient power generation, delivery, and usage. However, the added cyber infrastructure increases the complexity of the system, which introduces new vulnerabilities that could be exploited by potential adversaries. Traditional access control methods for computer networks, such as mandatory access control (MAC), discretionary access control (DAC), and role-based access control (RBAC), are not suitable for smart grids because of their unique characteristics. Smart grids have multiple types of system roles (operators, engineers, technicians, managers, etc.) and multiple security domains (e.g., interconnected regional networks), even domains of domains. The primary security objective of smart grids is availability, while the traditional access control models focus only on confidentiality and integrity. The aim of this project is to develop a flexible, sustainable, and scalable access control model for smart grids. A security analysis will be performed to prove that the developed model is secure against threats and attacks.

Project 4: Privacy preserving for location-based services using spatial transformations

While mobile users would like to use location-based services (LBSs) to obtain answers to queries such as ‘Find me the nearest Italian restaurant with a rating > 3 on’, they would also like to preserve their privacy by not disclosing their exact location. This project will investigate hiding the exact location through a spatial transformation. Hilbert curves are space-filling curves that provide such a spatial transformation through hash functions with limited preservation of proximity in the domain. This provides a mapping from the (x, y)-coordinates of points of interest (e.g., restaurants) to non-negative integers. A trusted entity is therefore employed to transform (encrypt) the location of each point of interest into a corresponding integer and send these encoded locations to the location-based server, while sharing the parameters used for the transformation (the encryption key) only with the user (i.e., not with the LBS).
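
The mapping from (x, y)-coordinates to non-negative integers can be sketched with the standard iterative Hilbert-index computation. The key structure below (curve order, grid origin, and cell size) is a hypothetical stand-in for the transformation parameters; the project description does not say which parameters make up the key.

```python
from dataclasses import dataclass


def _rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def xy2d(order, x, y):
    """Map grid cell (x, y) to its index along a Hilbert curve on a 2**order grid."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d


@dataclass(frozen=True)
class HilbertKey:
    """Hypothetical key: the parameters a trusted entity might keep secret."""
    order: int    # grid is 2**order cells per side
    x0: float     # origin of the grid in map coordinates
    y0: float
    cell: float   # side length of one grid cell


def encode_location(key, x, y):
    """Snap a map coordinate to a grid cell and return its Hilbert index."""
    gx = int((x - key.x0) // key.cell)
    gy = int((y - key.y0) // key.cell)
    n = 1 << key.order
    if not (0 <= gx < n and 0 <= gy < n):
        raise ValueError("location falls outside the keyed grid")
    return xy2d(key.order, gx, gy)


if __name__ == "__main__":
    key_a = HilbertKey(order=2, x0=0.0, y0=0.0, cell=1.0)
    key_b = HilbertKey(order=2, x0=-1.0, y0=-1.0, cell=1.0)
    # The same map location encodes differently under different keys.
    print(encode_location(key_a, 0.5, 0.5), encode_location(key_b, 0.5, 0.5))
```

Shifting the origin by one cell is enough to change the integer assigned to the same location, which is why the parameters can act like a key.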

To query the LBS, the user would encrypt his/her own location and send that to the LBS, which would find the nearest point of interest in the encrypted space and return that location. The user then decrypts the returned location to find the actual location. There are two interesting research questions to be investigated. First, to what extent is the ‘nearby’ point of interest in the encrypted space actually close in the (x, y) space? How effective are heuristics such as the use of two orthogonal curves to mitigate non-proximate locations? Second, to what extent can an adversary perform the decryption given limited knowledge of the parameters used in the creation of the Hilbert curve? In particular, are there some parameters that are more crucial than others?
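
The first research question can be explored with a small experiment along the following lines. This is a minimal sketch, assuming a plain Hilbert curve on a 2**order grid with no secret offset or orientation, and a server that simply returns the encoded point of interest whose integer is closest to the encoded query. The function names are illustrative and do not come from the project.

```python
def _rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def xy2d(order, x, y):
    """Grid cell (x, y) -> Hilbert index (the 'encryption' step)."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d


def d2xy(order, d):
    """Hilbert index -> grid cell (x, y) (the 'decryption' step)."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def query_via_lbs(order, pois, user):
    """Run one query: the server only ever sees integers, never coordinates."""
    encoded_pois = [xy2d(order, x, y) for x, y in pois]   # done by the trusted entity
    encoded_user = xy2d(order, user[0], user[1])          # done by the user
    answer = min(encoded_pois, key=lambda d: abs(d - encoded_user))  # done by the LBS
    return d2xy(order, answer)                            # decoded by the user


def true_nearest(pois, user):
    """Ground truth: nearest point of interest by Euclidean distance."""
    return min(pois, key=lambda p: (p[0] - user[0]) ** 2 + (p[1] - user[1]) ** 2)


if __name__ == "__main__":
    pois = [(2, 0), (0, 3)]
    user = (1, 0)
    print("LBS answer:", query_via_lbs(2, pois, user))
    print("true nearest:", true_nearest(pois, user))
```

On a 4 by 4 grid, a user in cell (1, 0) with points of interest at (2, 0) and (0, 3) is sent to (0, 3), even though (2, 0) is the adjacent cell: the two neighbours sit at indices 1 and 14 on the curve. Repeating the comparison over many random users and points of interest would give an empirical measure of how often, and by how much, the encoded-space answer misses the true nearest neighbour.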

Project 5: Secure data logging for mobile devices

Users of modern smartphones store a lot of sensitive data on their devices. Attackers are interested in obtaining a foothold on these phones in order to take monetary and strategic advantage of the sensitive data. Another problem is that phones are often lost or stolen; when such events occur, data is revealed to potentially dangerous adversaries. When a lost or stolen phone is recovered, the owner would want to learn whether the sensitive data was accessed or changed. While the technology of database logging is well developed, the techniques typically do not apply to phones owing to their limited storage space. We have shown how tamper-resistant logging can be developed on devices like cell phones, with only a part of the actual log stored on the phone, and used to mitigate the above problems. This project will implement the scheme and experiment with the degree of tamper resistance and performance. Furthermore, the use of ideas from steganography to detect unauthorized access will be investigated.
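
The project's own logging scheme is not described here, so the sketch below uses a generic hash chain, a common building block for tamper-evident logs, purely as an illustration. Each entry's digest covers the previous digest, so changing an earlier entry invalidates every later one. Keeping only the tail of the chain plus the digest just before it is one assumed way to model "only a part of the actual log stored on the phone". All names are hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32  # fixed starting value for an empty chain


def chain_hash(prev_digest, entry):
    """Digest of one entry, bound to everything that came before it."""
    return hashlib.sha256(prev_digest + entry.encode("utf-8")).digest()


def append(log, entry):
    """Append an (entry, digest) pair to the log in place."""
    prev = log[-1][1] if log else GENESIS
    log.append((entry, chain_hash(prev, entry)))


def verify(log, start=GENESIS):
    """Recompute the chain from `start`; return False on the first mismatch."""
    prev = start
    for entry, digest in log:
        if chain_hash(prev, entry) != digest:
            return False
        prev = digest
    return True


if __name__ == "__main__":
    log = []
    for event in ["unlock", "open contacts", "read sms", "lock"]:
        append(log, event)

    # Keep only the last two entries on the device, anchored by the
    # digest of the entry just before them.
    anchor = log[1][1]
    tail = log[2:]
    print("tail verifies:", verify(tail, start=anchor))

    # Rewriting an earlier entry without fixing the digests is detected.
    tampered = list(log)
    tampered[1] = ("open photos", log[1][1])
    print("tampered log verifies:", verify(tampered))
```

A hash chain on its own only detects tampering if the anchor or the latest digest is also held somewhere the attacker cannot rewrite, for example off the device.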

Project 6: Developing user mental model through reasoned action approach against semantic attacks

Semantic attacks, which target the way we, as humans, assign meaning to content, have been challenging to handle in computer security. The underlying cause of semantic attacks is a semantic barrier between the user’s mental model and the system’s actual processing model. The SSLstripping attack is one such semantic attack. It takes advantage of the simple observation that most users do not explicitly type the secure address of a web page (https), but rather rely on either the browser or the target page to redirect them to a secure location. This opens the opportunity to strip users’ sessions of their security while giving the user the illusion of privacy. In this project, we will investigate how to better understand and develop the user’s mental processing model in the context of cybersecurity through the use of the reasoned action approach (RAA), which explains that a user’s behavior is determined by his/her intention to perform the behavior, and that the intention is, in turn, a function of attitudes towards the behavior, perceived norms (or social pressure), and perceived behavioral control (capacity and relevant skills/abilities). We will then conduct research on how to utilize the developed model to enhance the effectiveness, efficiency, and satisfaction of our previous solution against the SSLstripping attack (SSLight).