Modular online updates and upgrades for safety-critical systems
Safety-critical systems host a growing number of critical software functions that require high-performance hardware platforms. As software complexity and platform connectivity increase, Over-The-Air Software Updates (OTASU) are becoming necessary for modern embedded systems, because feature enhancements, safety and security fixes, and adaptations to other components are inevitable over a system's lifetime. Applying OTASU in the safety-critical domain, however, poses challenges. Our proposed approach addresses the key reliability challenges that OTASU brings to the critical domain together, with particular attention to safety, security, availability, and increasing platform complexity.
Offered Solution
We propose a new domain-independent software paradigm for the development, integration, and modular certification of software applications on mixed-critical cyber-physical systems that supports secure OTASU throughout the product lifecycle. This paradigm is implemented and demonstrated through a new proof-of-concept software architecture and development process that enables new applications (patches to existing features or feature enhancements) to be deployed remotely and on the fly on heterogeneous computing platforms as part of the DevOps cycle. In addition, we provide a strategy for future certification of the approach with respect to safety (e.g., IEC 61508, ISO 26262) and security (e.g., IEC 62443, ISO/SAE 21434). The strategy relies on specific concepts that build on composability, modularity, and observability as key properties to enable dynamic validation of safety and security properties, that is, validation after deployment in the operational environment.
Update cockpit
The Update Cockpit is used to define an application as an updatable entity and to attach the contract-based metadata needed for system integration. This metadata contains interface definitions as well as time and resource contracts. In addition, the Update Cockpit can be used to view and modify the entire system configuration (vehicle software) for a specific vehicle platform. Modifications (changes to software components) trigger virtual integration and compatibility checks at design time to confirm the validity of the new configuration.
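To make the idea of contract-based metadata concrete, the following is a minimal sketch of how such contracts and a design-time compatibility check could be represented. All class names, fields, and units (`AppContract`, `cpu_share`, `memory_kib`, and so on) are illustrative assumptions and do not describe the actual format used by the Update Cockpit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingContract:
    """Hypothetical time contract for a periodically activated application."""
    period_ms: float
    wcet_ms: float  # assumed worst-case execution time per activation


@dataclass(frozen=True)
class ResourceContract:
    """Hypothetical resource contract: upper bounds the application may consume."""
    cpu_share: float  # fraction of the platform's CPU capacity
    memory_kib: int


@dataclass(frozen=True)
class AppContract:
    """Hypothetical metadata attached to one updatable application."""
    name: str
    provides: frozenset  # interface names offered to other applications
    requires: frozenset  # interface names expected from other applications
    timing: TimingContract
    resources: ResourceContract


def check_configuration(apps, cpu_capacity, memory_kib):
    """Return a list of problems; an empty list means the configuration is valid."""
    problems = []
    provided = set()
    for app in apps:
        provided |= app.provides
    for app in apps:
        missing = app.requires - provided
        if missing:
            problems.append(f"{app.name}: unresolved interfaces {sorted(missing)}")
        if app.timing.wcet_ms > app.timing.period_ms:
            problems.append(f"{app.name}: WCET exceeds period")
    cpu = sum(app.resources.cpu_share for app in apps)
    if cpu > cpu_capacity:
        problems.append(f"CPU demand {cpu} exceeds capacity {cpu_capacity}")
    mem = sum(app.resources.memory_kib for app in apps)
    if mem > memory_kib:
        problems.append(f"memory demand {mem} KiB exceeds {memory_kib} KiB")
    return problems
```

In this sketch, replacing one `AppContract` in the list and calling `check_configuration` again corresponds to the check that a modification triggers at design time. A real check would cover far more than interface names and summed budgets.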
Update server
The update server provides simulation-based and analytical integration tests for time contracts and compatibility tests for resource contracts. If the integration or compatibility tests fail, diagnostic data (traces) are transmitted as feedback to the developers or system integrators. If the tests succeed, an update package is prepared and transmitted Over-The-Air (OTA) via a secure VPN connection.
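The pass/fail flow described above can be sketched as follows. The analytical timing test used here is a deliberately simple stand-in: it only checks that the total processor utilization of all periodic tasks on a single core does not exceed 1, which is a necessary condition for schedulability, not the analysis the update server actually performs. The names `Task` and `evaluate_update`, the manifest layout, and the memory check are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical periodic task derived from a time contract."""
    name: str
    period_ms: float
    wcet_ms: float  # assumed worst-case execution time


def utilization(tasks):
    """Total processor utilization of a set of periodic tasks."""
    return sum(t.wcet_ms / t.period_ms for t in tasks)


def evaluate_update(current, updated, free_memory_kib, required_memory_kib):
    """Return ('release', manifest) on success or ('feedback', diagnostics) on failure."""
    diagnostics = []

    # Analytical timing test (necessary condition only, single core assumed).
    u = utilization(list(current) + list(updated))
    if u > 1.0:
        diagnostics.append(f"timing: utilization {u:.2f} exceeds 1.00")

    # Resource compatibility test against the memory left on the target.
    if required_memory_kib > free_memory_kib:
        diagnostics.append(
            f"resources: needs {required_memory_kib} KiB, only {free_memory_kib} KiB free"
        )

    if diagnostics:
        # Failure: diagnostics go back to developers or system integrators.
        return "feedback", diagnostics

    # Success: describe the update package that would be sent to the vehicle.
    manifest = {"tasks": [t.name for t in updated], "utilization": round(u, 2)}
    return "release", manifest
```

A caller would forward the diagnostics to the developers in the `feedback` case and hand the manifest to the transport layer in the `release` case. The secure transmission itself is outside the scope of this sketch.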
Update middleware
The update middleware supports secure downloading of new update packages, data integration, validation, in-situ compatibility and resource checking based on the field configuration and health status of the vehicle, and deployment of updatable software containers and their associated monitor components on the (central) ECU. Updates and monitoring of end devices are optionally supported and fully controlled via the gateway. The update middleware uses libvirt as a common abstraction layer across various state-of-the-art hypervisors (KVM, Xen), containerization software (Docker), and hardware platforms.
Overview of a continuous development and integration loop in the automotive sector
① The designer uses fleet data to develop an update for an automotive function and defines contracts to describe the timing and resource requirements of the system.
② The update is tested for compatibility, timing, and resource requirements. Monitors are generated from the contracts and are later used to ensure ongoing functionality.
③ The update package is transferred to the vehicle via a wireless connection and tested again for compatibility and resource requirements. Thanks to modularization, it is then installed without affecting other running functions. Monitors continuously check the functions against the time and resource contracts set by the designer.
④ Rare events, whether they concern time and resource constraints or functional behavior, are reported back to the manufacturer and collected for the next design iteration.
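Steps ② to ④ rely on monitors derived from the contracts. A minimal sketch of such a monitor for a time contract is shown below. The class `TimingMonitor`, the `near_miss_ratio` threshold, and the event format are assumptions chosen for illustration; in particular, treating response times close to the deadline as reportable rare events is one possible interpretation, not the project's definition.

```python
from dataclasses import dataclass, field


@dataclass
class TimingMonitor:
    """Hypothetical runtime monitor generated from a time contract."""
    function: str
    deadline_ms: float
    near_miss_ratio: float = 0.9  # assumed threshold for reportable near misses
    events: list = field(default_factory=list)

    def observe(self, response_ms):
        """Record an event if the observed response time is a violation or near miss."""
        if response_ms > self.deadline_ms:
            self.events.append(("violation", response_ms))
        elif response_ms >= self.near_miss_ratio * self.deadline_ms:
            self.events.append(("near_miss", response_ms))

    def report(self):
        """Return collected events for upload to the manufacturer and reset the buffer."""
        payload = {"function": self.function, "events": list(self.events)}
        self.events.clear()
        return payload
```

For example, a monitor created as `TimingMonitor("lane_keeping", deadline_ms=10.0)` would stay silent for a 4 ms response, record a near miss at 9.5 ms, and record a violation at 12 ms. Calling `report()` then yields the data that flows back into the next design iteration.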