Prof. Kishor Trivedi (Duke University, USA)
Topic: Parametric Uncertainty Propagation through Dependability Models
Talk Outline: Uncertainty propagation investigates how errors in the input parameters of a probability model affect the system output measure. In this paper, we present a moment-based approach to propagating the uncertainty of model input parameters. The approach requires only the first two moments of the model parameters, and it has a computational advantage over the closed-form, numerical and sampling-based approaches to uncertainty propagation. The paper presents the properties of the moment-based approach by comparing it with the existing Bayes estimation for uncertainty propagation in a simple reliability model. An availability model of a server with virtual machines is used to illustrate the applicability of our method to practical problems.
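As a rough illustration of what propagating only the first two moments can look like, the sketch below applies a standard Taylor-series (delta-method) approximation to a two-state availability model, A = mu / (lam + mu). This is an assumption made for illustration and not necessarily the formulation used in the talk: the model, the independence of the two parameters, and the names `availability` and `propagate_moments` are all hypothetical.

```python
def availability(lam, mu):
    """Steady-state availability of a two-state (up/down) model.

    lam: failure rate, mu: repair rate (same time unit for both).
    """
    return mu / (lam + mu)


def propagate_moments(lam_mean, lam_var, mu_mean, mu_var):
    """Approximate the mean and variance of availability from the means
    and variances of the failure and repair rates.

    Assumes the two parameters are independent (illustrative assumption).
    Mean uses a second-order Taylor expansion around the parameter means;
    variance uses the first-order (delta-method) terms.
    """
    s = lam_mean + mu_mean

    # First-order partial derivatives of A = mu / (lam + mu)
    dA_dlam = -mu_mean / s**2
    dA_dmu = lam_mean / s**2

    # Second-order partial derivatives (pure terms only; cross terms
    # vanish in the mean correction when the parameters are independent)
    d2A_dlam2 = 2.0 * mu_mean / s**3
    d2A_dmu2 = -2.0 * lam_mean / s**3

    mean = availability(lam_mean, mu_mean) + 0.5 * (
        d2A_dlam2 * lam_var + d2A_dmu2 * mu_var
    )
    var = dA_dlam**2 * lam_var + dA_dmu**2 * mu_var
    return mean, var


if __name__ == "__main__":
    # Hypothetical parameter estimates (rates per hour) with their variances
    mean, var = propagate_moments(
        lam_mean=0.001, lam_var=1e-8, mu_mean=0.5, mu_var=0.0025
    )
    print(f"approx. mean availability: {mean:.6f}")
    print(f"approx. std. deviation:    {var ** 0.5:.6f}")
```

Only four numbers (two means, two variances) enter the computation, which is what makes this style of approximation cheap compared with sampling the full parameter distributions; the price is that accuracy degrades when the parameter variances are large relative to the means.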
Short Bio: Kishor Trivedi heads the Duke High Availability Assurance Laboratory (DHAAL) and holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University. He is known as a leading international expert in the reliability and performability evaluation of dependable systems, and has made seminal contributions to stochastic modeling formalisms and their efficient solution. He is currently carrying out experimental research on software reliability during operation, where he is investigating software fault tolerance through environmental diversity. This work, which includes software bug classification, empirical study of real failure data and the associated theory of affordable software fault tolerance, has already gained significant attention.
Prof. Raimundo Macêdo (Federal University of Bahia - UFBA, Brazil)
Topic: Dependability in Hybrid Message-passing Distributed Systems
Talk Outline: Message-passing distributed system models assume certain timing properties. For example, synchronous models have known time bounds for process executions and message transmissions, whereas in asynchronous models the time bounds are unknown. Timing assumptions have implications for what can be computed; for instance, deterministic consensus is impossible in asynchronous systems with crash failures (the famous FLP impossibility result). In these two extreme examples of system models, the timing assumptions are valid for all processes and channels (space) and for the system's entire lifetime (time). We call these systems homogeneous in space and time. There are, however, heterogeneous or hybrid systems. A system can be hybrid in space, in time, or in both. For example, a typical partially synchronous system based on the Global Stabilization Time (GST) assumption is hybrid in time because it becomes synchronous after GST, but homogeneous in space because processes and channels maintain mutually consistent properties; that is, at any given time instant the system is either synchronous or asynchronous. In this talk, we discuss how to exploit heterogeneity in space and/or in time to enhance the dependability of some classes of real systems made of sub-systems with varied characteristics. For instance, in a system that we call partitioned synchronous, which is hybrid in space but homogeneous in time, we can implement perfect crash failure detection and deterministic consensus with a limited number of communication rounds. Moreover, we discuss how to exploit space heterogeneity to enhance dependability properties even when the system is hybrid in both time and space, or when processes are Byzantine.
Short Bio: Raimundo José de Araújo Macêdo is a Full Professor in the Computer Science Department at the Federal University of Bahia (UFBA) in Brazil and the head of the Distributed Systems Laboratory (LaSiD) at UFBA. He holds a Ph.D. in Computer Science from the University of Newcastle upon Tyne (England). He has participated in and coordinated several research projects with Brazilian and foreign research institutions, covering many aspects of dependable distributed systems (algorithms, architectures, and implementations). His main research topics include group communication, consensus, replication, failure detection, self-management, simulation, real-time systems, architectural frameworks, and cyber-physical systems. He has served on the program and organizing committees of many international and national conferences, including DSN, SRDS, DISC, Middleware, DAIS, LADC, SBRC, and SBESC. He is an IFIP Councillor, a Brazilian Computer Society (SBC) Director, and the General Chair of IEEE SRDS 2018.