Designing safe and adaptive time-critical fog-based systems

Abstract: Safety-critical systems in industrial automation, avionics, or automotive domains demand correct, timely, and predictable performance under all operating conditions, including faulty ones. Fault tolerance plays an important role in ensuring seamless system function even in the presence of failures. Typically, such systems run hard real-time applications, and hence timing violations can result in hazards. Fog computing is an adaptive paradigm that distributes computation and communication along the cloud-IoT continuum to reduce communication latencies, making it better suited to executing real-time applications. This requires enhancements to the network connecting the various sub-systems to support timely delivery of safety-critical messages. Traditionally, safety-critical systems are designed offline and are not reconfigured at runtime. The inherent adaptive properties of fog computing systems make them susceptible to timeliness violations, which can hinder safety guarantees. At the same time, adaptivity in the form of migrating computation and communication to different devices in the fog-cloud continuum can be used, through suitable design approaches, to make the system more fault-tolerant. In this work, we provide design approaches geared towards achieving safety and predictability of critical applications that run on adaptive fog computing platforms. To this end, we start by surveying safety considerations in fog computing systems and identifying key safety challenges. We then propose a design approach to improve predictability in an autonomous mobile robot use case in a factory setting designed using the fog computing paradigm. We then narrow our focus to time-sensitive networking (TSN) and propose a temporal redundancy-based fault-tolerance approach for time-sensitive messages.
Furthermore, we study the IEEE 802.1CB TSN protocol and suggest improvements to reduce the network congestion caused by replicated frames. As future work, we intend to include wireless aspects in the evaluation of timeliness guarantees for safety-critical applications. The emphasis will be on runtime failure scenarios and self-healing mechanisms based on online decisions taken in concert with offline guarantees.
