An Emulation-Based Performance Evaluation Methodology for Edge Computing and Latency Sensitive Applications

Abstract: Cloud Computing, with its globally accessible nature and virtually unlimited scalability, has revolutionized our daily lives since its widespread adoption in the early 2000s. It allows us to access our documents anywhere, keep in touch with friends, back up photos, and even remotely control our appliances. Despite this, Cloud Computing has limitations when it comes to novel applications requiring real-time processing or low latency. Applications such as Cyber-Physical Systems (CPSs) or mobile eXtended Reality (XR), which themselves have great transformative potential, are unable to run on the Cloud. Edge Computing is emerging as a potential solution to these limitations by bringing computation closer to the edge of the network, thereby reducing latency and enabling real-time decision making. The combination of Edge Computing and modern mobile network technologies such as 5G offers potential for massive deployments of latency-sensitive applications. However, scaling and understanding these deployments pose important challenges, such as the optimization of latency across multiple processing steps and trade-offs in the choice of wireless system, protocols, hardware, and algorithms. Existing approaches have so far been unsuccessful in capturing the complex effects arising from the interplay between network and compute in these systems. This dissertation addresses the challenge of evaluating the performance of Edge Computing, and of the applications enabled by this paradigm, with two key contributions to the literature. First, a methodological approach to experimentally studying the trade-offs between system responsiveness and resource consumption in latency-sensitive applications such as CPSs and XR is introduced. These applications and systems have characteristics, such as tight interaction with the physical world and the involvement of humans, that make them challenging to study through simulation or analytical modeling.
The approach presented in this thesis therefore involves emulating the client-side workload while keeping the real server-side process and network stack, so as to retain the realism of network and compute effects. Next, an exploration of the requirements for accuracy in the emulation is presented. This work discusses the extent to which accuracy in the emulation can open new avenues for the optimization of these systems. To this end, the first-ever realistic model of human timings for a particular class of Mobile Augmented Reality (MAR) applications is provided. The model is combined with a mathematical framework to study the potential for optimization in Edge Computing applications. Results indicate that the methodology introduced in this work offers advantages over existing methods by improving efficiency, repeatability, and replicability. By fully integrating workload components into the emulated software domain, this methodology reduces complexity while still capturing the effects of network and compute factors that are challenging to model. This approach thus represents an important contribution to the literature, as it constitutes a comprehensive method for the performance evaluation of Edge environments, encompassing both the application and the infrastructure. Furthermore, results from the exploration of the implications of realism in the emulation suggest that greater realism in client-side emulation can enable optimization approaches that would otherwise be infeasible. In particular, this work highlights the significance of considering human behavior and reactions, in addition to system-related metrics and performance optimizations, in the context of MAR.