Performance Studies of Fault-Tolerant Middleware

Abstract: The current trend in software engineering and application development is to take advantage of reusable software. Much effort is directed towards easing the development of complex, distributed, network-based applications with reusable components. To ease the task of distributed systems developers, one can use middleware, i.e. a software layer between the operating system and the application that handles distribution transparently. A crucial feature of distributed server applications is high availability: they must be able to continue operating even in the presence of crashes. Embedding fault tolerance mechanisms in the middleware on which the application runs offers the potential to reduce application code size, thereby reducing developer effort. Outage times due to server crashes can also be reduced, since failover is handled automatically by the middleware. However, a trade-off is involved: during failure-free periods, client requests are serviced with higher latency, because information has to be collected for the automatic failover. To characterize the real benefits of middleware, this trade-off needs to be studied. Unfortunately, to date, few trade-off studies of middleware that supports fault tolerance have been conducted on realistic cases. The contributions of the thesis are twofold: (1) insights based on empirical studies and (2) a theoretical analysis of components in a middleware equipped with fault tolerance mechanisms. In connection with part (1), the thesis describes the detailed implementation of two platforms based on CORBA (Common Object Request Broker Architecture) with fault tolerance capabilities: one built by following the FT-CORBA standard, where only application failures are taken care of, and a second obtained by implementing an algorithm that ensures uniform treatment of infrastructure and application failures.
Based on empirical studies of the availability/performance trade-off, several insights were gained, including the benefits and drawbacks of the two infrastructures. The studies were performed using a realistic (telecommunication) application set up to run on top of both extended middleware platforms. Further, the thesis proposes a technique to improve performance in the FT-CORBA-based middleware by exploiting application knowledge; aspect-oriented programming is used to enrich application code with fault tolerance mechanisms. In connection with part (2), the thesis models elements of an FT-CORBA-like architecture mathematically, in particular by using queuing theory. The model is then used to study the relation between different parameters. This provides the means to configure one middleware parameter, namely the checkpointing interval, so as to achieve maximal availability or minimal response time.
