Since our project spans essentially two areas (the inverted pendulum control system and secure cyber-physical systems), we briefly review prior work in both.
The inverted pendulum is a well-studied physical system that often serves as the basis for an introduction to basic control theory principles. Its simplicity has resulted in many papers that address specific aspects of the system, including PID control and reinforcement learning (RL); a quick Google search will bring up a decently sized list. There are also several YouTube videos showing simulated or physical models, including [1], [2], and [3].
We chose to build the project on this basic system because of the extensive body of prior work we could use as a resource when tuning or debugging portions of our own system. However, it is worth noting that these variations on the inverted pendulum (chiefly PID control and RL approaches) are largely disjoint approaches to the balancing problem. Furthermore, to our knowledge there is virtually no prior work on implementing security for any of these systems. We therefore identified an opportunity to use RL and PID control in conjunction to implement a control system with an added layer of security on top.
As mentioned previously, most work concerning security in cyber-oriented systems deals exclusively with software or firmware. Hardware components are mainly called upon to add extra layers of protection to software protocols via encryption or other mechanisms [4]; however, the security of the hardware components themselves is largely unexplored.
Most of the explorations in this area deal primarily with sensor falsification. In particular, theory suggests that, assuming we know how to model the system dynamics, we can secure sensor readings by comparing the value read at any particular point in time to the value expected from the current system dynamics, and raising an alarm if the deviation from the expectation is found to be statistically significant [5]. However, several papers point out that this approach is flawed, since the dynamics of the system in question could also be known to the attacker. The adversary can then formulate an attack vector that does not deviate enough to be registered as statistically significant, or even train a machine learning algorithm to identify such vectors [6].
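The consistency check described above, and the way an informed attacker can slip under it, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the one-step model `predict_angle`, the noise level `sigma`, and the 3-sigma threshold are hypothetical and are not taken from [5] or from our system.

```python
def predict_angle(theta, omega, dt):
    """Hypothetical one-step model: the angle advances by angular velocity * dt."""
    return theta + omega * dt


def is_anomalous(measured, expected, sigma, threshold=3.0):
    """Flag a reading that deviates from the expectation by more than
    `threshold` standard deviations of the (assumed) sensor noise."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(measured - expected) / sigma > threshold


# Assumed operating point and sensor noise (radians, radians/second, seconds).
expected = predict_angle(theta=0.10, omega=0.50, dt=0.01)
sigma = 0.002

# An honest reading close to the prediction is accepted.
honest = is_anomalous(0.106, expected, sigma)

# A crude falsified reading far from the prediction is flagged.
crude = is_anomalous(0.120, expected, sigma)

# An attacker who knows the model and the threshold can inject a bias
# that stays just inside the acceptance band and is not flagged.
stealthy = is_anomalous(expected + 2.5 * sigma, expected, sigma)

print(honest, crude, stealthy)
```

The stealthy case is the weakness the text points to: the detector is only as strong as the secrecy of the model and threshold it relies on.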
A possible solution would be to hide the system dynamics completely, or at the very least to represent them in a way that obscures understanding. One way to do this is to model the dynamics with a neural network, which is the approach we decided to take for this project. Neural networks can be trained to approximate nearly any function given the right training and structure, and the entanglement of the weights between nodes, the biases of each node, and the large number of input parameters has consistently made these networks difficult to interpret. In this way, we can still represent our system dynamics while making the representation very difficult to decipher and attack.
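As a rough illustration of this idea, the sketch below trains a tiny one-hidden-layer network to approximate sin(θ), used here as a stand-in for the nonlinear term in pendulum dynamics. This is not the network or training data used in this project: the layer size, learning rate, training range, and all function names are hypothetical choices made to keep the example small and dependency-free.

```python
import math
import random


def init_params(hidden=8, seed=0):
    """Randomly initialise a 1-input, 1-output network with one tanh layer."""
    rng = random.Random(seed)
    return {
        "w1": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],  # input -> hidden
        "b1": [0.0] * hidden,                                    # hidden biases
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(hidden)],  # hidden -> output
        "b2": 0.0,                                               # output bias
    }


def hidden_activations(p, x):
    return [math.tanh(w * x + b) for w, b in zip(p["w1"], p["b1"])]


def predict(p, x):
    h = hidden_activations(p, x)
    return sum(w * a for w, a in zip(p["w2"], h)) + p["b2"]


def mse(p, samples):
    return sum((predict(p, x) - y) ** 2 for x, y in samples) / len(samples)


def train(p, samples, epochs=500, lr=0.05):
    """Plain stochastic gradient descent on squared error, updating p in place."""
    for _ in range(epochs):
        for x, y in samples:
            h = hidden_activations(p, x)
            err = sum(w * a for w, a in zip(p["w2"], h)) + p["b2"] - y
            for j, a in enumerate(h):
                grad_pre = err * p["w2"][j] * (1.0 - a * a)  # backprop through tanh
                p["w2"][j] -= lr * err * a
                p["w1"][j] -= lr * grad_pre * x
                p["b1"][j] -= lr * grad_pre
            p["b2"] -= lr * err


# Training data: angles in [-1, 1] rad paired with sin(theta).
samples = [(i / 10.0, math.sin(i / 10.0)) for i in range(-10, 11)]

params = init_params()
mse_before = mse(params, samples)
train(params, samples)
mse_after = mse(params, samples)

print("MSE before training:", mse_before)
print("MSE after training: ", mse_after)
print("Learned input weights:", params["w1"])
```

The learned weights reproduce the target function over the training range, yet nothing in the printed numbers reads as "sin(θ)", which is the kind of opacity the approach relies on. Whether that opacity holds up against a determined attacker is a separate question that this sketch does not address.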
[1] YouTube. "Learning to Swing-Up and Balance from Scratch". Dec 2017. Online Video Clip.
[2] YouTube. "PID control of an inverted pendulum using Arduino Mega 2560 - Odwrócone wahadło". Dec 2017. Online Video Clip.
[3] YouTube. "Control of Inverted Pendulum with Servo Pneumatics - Enfield Technologies". Dec 2017. Online Video Clip.
[4] "Protecting the IoT with Secure Hardware" [Online]. Available: https://www.digikey.com/en/articles/techzone/2017/mar/protecting-the-iot-with-secure-hardware
[5] Yilin Mo, Tiffany Hyun-Jin Kim, Kenneth Brancik, Dona Dickinson, Heejo Lee, Adrian Perrig, and Bruno Sinopoli, "Cyber–Physical Security of a Smart Grid Infrastructure" [Online]. Available: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6016202. [Accessed: 19-Nov-2017].
[6] Tommaso Dreossi, Alexandre Donzé, and Sanjit A. Seshia, "Compositional Falsification of Cyber-Physical Systems with Machine Learning Components" [Online]. Available: https://arxiv.org/pdf/1703.00978.pdf