Motivation

Cyber-physical systems have become increasingly ubiquitous and have recently exploded in variety as a result of growing interest in the Internet of Things (IoT). These systems now range from telemetry devices and small-scale avionics to increasingly powerful smartphones and wearables. However, while the capabilities and performance of many of these systems have increased substantially over the years, security has failed to keep pace. A variety of papers address attacks targeting the network interfaces^[1][2] and software / firmware^[3] of these devices, but substantially less work addresses the detection and mitigation of compromised hardware components. Redundant systems are typically employed to protect against the failure of particular components^[1], but detecting the attack that may be responsible for such failures (controller corruption, in our case) remains largely unexplored. A goal of this project is to suggest a means of detecting some of these hardware attacks in real time using a deep reinforcement learning network that can be trained offline. We apply this approach to a well-studied system, a linear balancing inverted pendulum, and attempt to simulate detection. Additionally, if possible, we will attempt to port our simulation testbed to a physical testbed to work with real hardware components.

Objectives

The goal of this project is to train a controller that keeps a simulated inverted pendulum balanced in an upright position for an extended period of time and issues an alert when an attack is detected. The controller is a neural network trained using SARSA deep reinforcement learning. Should the delay prove non-detrimental, this simulation will be ported to a physical testbed that uses a linear channel slider and a DC motor as the actuators and an Intel Edison as the primary controller.
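
To make the SARSA update concrete, the sketch below shows a single on-policy update step in C. It is a simplified illustration rather than the project's implementation: it uses a small lookup table in place of the neural network, and the names `sarsa_update`, `N_STATES`, and `N_ACTIONS`, along with the table sizes, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N_STATES  4  /* hypothetical number of discretized pendulum states */
#define N_ACTIONS 2  /* hypothetical actions: push left, push right */

/*
 * One SARSA update on a tabular action-value function:
 *   Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
 * a_next is the action the current policy actually selected in s_next,
 * which is what makes the update on-policy.
 * Returns the temporal-difference error used for the update.
 */
double sarsa_update(double q[N_STATES][N_ACTIONS],
                    size_t s, size_t a, double reward,
                    size_t s_next, size_t a_next,
                    int terminal, double alpha, double gamma)
{
    assert(s < N_STATES && s_next < N_STATES);
    assert(a < N_ACTIONS && a_next < N_ACTIONS);

    /* No bootstrapping from the next state once the episode has ended. */
    double bootstrap = terminal ? 0.0 : gamma * q[s_next][a_next];
    double td_error = reward + bootstrap - q[s][a];

    q[s][a] += alpha * td_error;
    return td_error;
}
```

In a deep variant, the table lookup would be replaced by a network evaluation and the temporal-difference error would drive a gradient step on the network weights instead of a direct table write.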

Deliverables for the project include the code used to train and run simulations (primarily written in MATLAB and Simulink) as well as the code deployed to the target embedded device that controls the testbed actuators (all written in C). Photos, videos, and reports detailing intermediate progress and the final state of the project will be produced throughout the development cycle.

Current Progress

This project can be roughly divided into 9 phases:

Phase 1: Model the motion of the pendulum in the simulation environment
Phase 2: Replace the standard physical motion modeling with a neural network and train using SARSA deep reinforcement learning
Phase 3: Train the network while running simulations to generate a network that produces desired behavior (in this case balancing the pendulum) for at least a specified period of time
Phase 4: Introduce attacks into the simulation environment and refine the network to be robust to them
Phase 5: Develop and introduce a means of attack detection to the simulation
Phase 6: Set up the physical testbed and generate PWM signals from the embedded controller to demonstrate that it is in working condition
Phase 7: Translate simulation code and trained neural network to embedded controller
Phase 8: Carry out physical testing and record / report results
Phase 9: Refine and retest using findings from Phase 8

The project is now complete, subject to future iterations to improve the integrity and reproducibility of results.

References

[1] Alvaro A. Cardenas, Saurabh Amin, Bruno Sinopoli, Annarita Giani, Adrian Perrig, and Shankar Sastry, “Challenges for Securing Cyber Physical Systems” [Online]. Available: https://pdfs.semanticscholar.org/d514/97e5827cc00d9d00c26e27a769d42284cfba.pdf. [Accessed: 19-Nov-2017].

[2] Yilin Mo, Tiffany Hyun-Jin Kim, Kenneth Brancik, Dona Dickinson, Heejo Lee, Adrian Perrig, and Bruno Sinopoli, “Cyber–Physical Security of a Smart Grid Infrastructure” [Online]. Available: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6016202. [Accessed: 19-Nov-2017].

[3] Olga Gelbart, Eugen Leontie, Bhagirath Narahari, and Rahul Simha, “A compiler-hardware approach to software protection for embedded systems” [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0045790608000608. [Accessed: 19-Nov-2017].
