Thesis etd-02232022-162403
Thesis type
PhD
Author
DONNARUMMA, CIRO
URN
etd-02232022-162403
Title
Safety-critical systems for railway applications: a real-time systems perspective
Scientific disciplinary sector
ING-INF/05
Degree programme
Istituto di Tecnologie della Comunicazione, dell'Informazione e della Percezione - PH.D. PROGRAMME IN EMERGING DIGITAL TECHNOLOGIES (EDT)
Committee
Supervisor Prof. BUTTAZZO, GIORGIO CARLO
Member Prof. CINQUE, MARCELLO
Member Dr. COLLA, VALENTINA
Chair Prof. BONDAVALLI, Andrea
Keywords
- railway
- interlocking
- CBI
- SSI
- real-time systems
- safety
Defense session start date
11/07/2022
Availability
partial
Abstract
Advances in computer architecture have delivered ever more computational power, enabling new applications for computer systems in many domains. The railway domain has also exploited the computational power of modern computing systems to implement in software functions such as railway signalling and interlocking (i.e., functions that supervise and control the railway network to guarantee the safe movement of trains), which were previously provided by electro-mechanical and relay-based systems. The migration from electro-mechanical and relay-based interlocking to Computer Based Interlocking (CBI), also known as Solid State Interlocking (SSI), has brought several benefits. For instance, the flexibility of computing systems has allowed railway engineers to design a single customizable interlocking platform able to interpret and enforce different safety rules (or safety logic). The same interlocking platform can thus be reused to control different segments of the railway network merely by changing its configuration, which reduces engineering costs. Moreover, aided by the performance growth of computer architectures, railway engineers have further reduced costs by designing CBI platforms that cover an ever larger portion of the railway, hence reducing the total number of CBIs needed to manage the whole network.
However, although the technological growth of railway systems has been driven by the steady improvement of computer architectures, it has been slowed down by the safety and predictability requirements typical of such systems. Indeed, these systems are commonly known as safety-critical systems because they play a crucial role in ensuring passengers' safety: they aim to reduce the probability of rail disasters almost to zero. Hence, due to their criticality, a framework of safety regulations has been established to drive their design and development processes in order to reduce their probability of failure. For instance, the three principal regulations for the railway domain adopted in Europe (and in many countries outside Europe) are EN 50126, EN 50128 and EN 50129, which were standardized by CENELEC, i.e., the European Committee for Electrotechnical Standardization. Therefore, to be deployed, a safety-critical system must comply with these regulations, and compliance must be established by an accredited third party, known as an independent safety assessor, through a meticulous certification process, at the end of which the system receives a certification of compliance.
Because certification is a meticulous and complex process, CBI systems have always been built upon well-established technologies that are known to be "certifiable". For instance, according to the procedures followed by Rete Ferroviaria Italiana S.p.a. (RFI), the Italian railway infrastructure manager, most CBI systems still use either single-core architectures or multicore architectures with only one core powered up, scheduled by the cyclic scheduling algorithm. However, this approach has several limitations, for instance:
1) It prevents CBI systems from following the technological progress of computing architectures, which is improving performance by increasing parallelism (i.e., the number of cores) rather than by increasing the performance of individual cores; and
2) It wastes much computational power by both (i) using only one core of the underlying platform, and (ii) using a scheduling algorithm that cannot reach full processor utilization.
On the other hand, proper use of the high computational power offered by current computing architectures would enable further cost reduction by allowing both (i) the integration of multiple functions on a single computing platform, and (ii) the improvement of CBIs to cover a wider portion of the railway. Motivated by this consideration, railway operators (e.g., RFI) are trying to overcome these limitations by looking for solutions that allow the certification of safety-critical systems for railway applications based on multicore computing platforms scheduled by a better-performing scheduling algorithm such as Fixed Priority.
This dissertation pursues this goal by focusing on two fundamental techniques typically used to implement safety-critical systems in compliance with safety regulations: voting in redundant architectures and online monitoring for fault detection.
First, this dissertation tackles the problem of voting in 2-out-of-2 redundant architectures from a scheduling perspective by presenting, analyzing and comparing two approaches for scheduling voting-related activities and managing inter-replica communication and synchronization under fixed priorities. The first approach is easy to use because it relies on passive waiting (i.e., task self-suspension), which is a well-known programming paradigm. The second approach is a novel idea based on the Logical Execution Time (LET) paradigm, which requires additional tasks for carrying out voting-related activities. Hence, the latter approach is less easy to use, but it performs better than the former. For the sake of clarity, both approaches are presented for single-core systems, but the one inspired by the LET paradigm can easily be generalized to multicore platforms. Then, the dissertation proposes a software architecture that enables the execution of online fault detection tests (specifically those for memories) on multicore systems managed by the fixed priority scheduling algorithm. Moreover, it proposes an optimization algorithm driven by schedulability analysis to find the optimal configuration of the diagnostic test, i.e., one that keeps the system schedulable while maximizing fault coverage.
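The 2-out-of-2 principle mentioned above can be illustrated with a minimal sketch. It only shows the voting decision itself, not the scheduling approaches studied in the dissertation; the names `vote_2oo2`, `actuate` and `SAFE_STATE` are hypothetical and chosen for illustration, and the fallback to a restrictive safe state on disagreement is an assumption about typical fail-safe behaviour.

```python
from typing import Optional

# Hypothetical placeholder for the restrictive output applied on disagreement.
SAFE_STATE = "SAFE_STATE"


def vote_2oo2(output_a: str, output_b: str) -> Optional[str]:
    """Return the agreed output, or None when the two replicas disagree."""
    if output_a == output_b:
        return output_a
    return None


def actuate(output_a: str, output_b: str) -> str:
    """Forward the agreed output; fall back to the safe state otherwise."""
    agreed = vote_2oo2(output_a, output_b)
    return agreed if agreed is not None else SAFE_STATE
```

In a 2-out-of-2 scheme a single disagreement is enough to withhold the output, which is why the sketch never tries to pick a "majority" value.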
Finally, this dissertation presents a hard real-time operating system architecture for multicore processors based on the partitioned fixed-priority algorithm, with a focus on compliance with CENELEC EN 50128, which concerns software for safety-critical railway applications.
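As a rough illustration of what schedulability analysis under partitioned fixed-priority scheduling involves, the sketch below applies the classical response-time recurrence R_i = C_i + sum over higher-priority tasks j of ceil(R_i / T_j) * C_j to each core independently. It is not the analysis developed in the dissertation: it assumes independent periodic tasks with deadlines equal to periods, no self-suspension, and no inter-core interference, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# A task is (wcet, period); the deadline is assumed equal to the period.
# Each core's task list is assumed sorted by decreasing priority.
Task = Tuple[int, int]


def response_time(tasks: List[Task], i: int) -> Optional[int]:
    """Worst-case response time of tasks[i], or None if it misses its deadline."""
    c_i, t_i = tasks[i]
    r = c_i
    while r <= t_i:
        # Interference from higher-priority tasks; -(-r // t) is an integer ceiling.
        nxt = c_i + sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        if nxt == r:
            return r  # fixed point reached
        r = nxt
    return None


def core_schedulable(tasks: List[Task]) -> bool:
    """True if every task on this core meets its deadline."""
    return all(response_time(tasks, i) is not None for i in range(len(tasks)))


def system_schedulable(cores: List[List[Task]]) -> bool:
    """Partitioned scheduling: each core is analyzed on its own."""
    return all(core_schedulable(core) for core in cores)
```

Because tasks never migrate under partitioned scheduling, the per-core check is the same one used for single-core systems, applied once per partition.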
This dissertation enriches the state of the art of safety-critical and hard real-time systems for railway applications with solutions to fundamental problems arising from the migration from well-known single-core systems based on the cyclic scheduling algorithm towards multicore systems based on the fixed priority algorithm. Furthermore, as safety regulations belonging to different domains share many aspects, the solutions devised for railway systems can also be used in other fields such as automotive and avionics.
All the solutions provided in this dissertation are accompanied by the results of experimental evaluations based on synthetic workloads, which assess their performance. Moreover, the proposed solutions have also contributed to the implementation of the prototype of the next-generation standard CBI platform for the Italian railway, which is currently under development by Rete Ferroviaria Italiana S.p.a.
Files
File name | Size |
---|---|
1 file is restricted at the author's request. |