Architectural and System-Level Fault Tolerance Strategies for Safety-Critical Embedded Processors: Integrating Lockstep Execution, Soft Error Resilience, and Recovery Mechanisms
Maria Mateo, Department of Electrical and Computer Engineering, University of Ljubljana, Slovenia
Abstract
The increasing integration density of semiconductor devices, coupled with the growing deployment of embedded processors in safety-critical domains such as automotive, aerospace, and industrial automation, has intensified concerns regarding system reliability and fault tolerance. This research investigates architectural and system-level strategies for enhancing fault resilience in embedded processors, focusing on lockstep execution, soft error mitigation, and recovery mechanisms. Drawing upon established literature, including advancements in dual-core and triple-core lockstep architectures, fault-tolerant soft-core processors, and hybrid hardware-software detection approaches, this study presents a comprehensive analysis of the effectiveness and limitations of these techniques. The research explores the implications of radiation-induced soft errors, particularly in advanced semiconductor technologies, and evaluates mitigation techniques ranging from hardware redundancy to checkpoint and rollback recovery systems. A detailed methodological framework is developed to analyze fault coverage, detection latency, and system overhead across multiple fault-tolerant configurations. Results indicate that while lockstep architectures provide robust error detection capabilities, they must be complemented by adaptive recovery mechanisms and embedded diagnostic features to address both transient and permanent faults effectively. The discussion highlights trade-offs between performance, cost, and reliability, emphasizing the need for hybrid approaches that integrate hardware redundancy with software-level resilience. This work contributes to the ongoing discourse on dependable computing by identifying key limitations in existing fault-tolerance strategies and proposing directions for future research, including adaptive resilience frameworks and machine-assisted fault prediction models.
Keywords
Fault tolerance, lockstep architecture, soft errors, embedded processors
References
Abdul Salam Abdul Karim. (2023). Fault-Tolerant Dual-Core Lockstep Architecture for Automotive Zonal Controllers Using NXP S32G Processors. International Journal of Intelligent Systems and Applications in Engineering, 11(11s), 877–885. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7749
Azambuja, J.R., et al. Exploring the limitations of software-only techniques in SEE detection coverage. Journal of Electronic Testing, 2011.
Baumann, R.C. Radiation-induced soft errors in advanced semiconductor technologies. IEEE Transactions on Device and Materials Reliability, 2005.
Bernon-Enjalbert, V., et al. Safety Integrated Hardware Solutions to Support ASIL D Applications, 2013.
Bowen, N.S., et al. Processor and memory based checkpoint and rollback recovery. Computer, 1993.
Entrena, L., Lindoso, A., Portela-García, M., Parra, L., Du, B., Sonza Reorda, M., Sterpone, L. Fault-tolerance techniques for soft-core processors using the Trace Interface. Springer, 2015.
Hanafi, A., Karim, M., Hammami, A.E. Dual-lockstep microblaze-based embedded system for error detection and recovery with reconfiguration technique. Proceedings of the Third World Conference on Complex Systems, 2015.
Iturbe, X., Venu, B., Ozer, E., Das, S. A Triple Core Lock-Step ARM Cortex-R5 Processor for Safety-Critical and Ultra-Reliable Applications. IEEE/IFIP International Conference on Dependable Systems and Networks Workshop, 2016.
Peña-Fernandez, M., et al. PTM-based hybrid error-detection architecture for ARM microprocessors. Microelectronics Reliability, 2018.
Portela-García, M. On the use of embedded debug features for permanent and transient fault resilience in microprocessors. Microprocessors and Microsystems, 2012.
Copyright License
Copyright (c) 2025 Maria Mateo

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright and Ethics:
- Authors are responsible for obtaining permission to use any copyrighted materials included in their manuscript.
- Authors are also responsible for ensuring that their research was conducted in an ethical manner and in compliance with institutional and national guidelines for the care and use of animals or human subjects.
- By submitting a manuscript to International Journal of Computer Science & Information System (IJCSIS), authors agree to transfer copyright to the journal if the manuscript is accepted for publication.