Partitioning helps build system reliability

Posted: 01 Oct 2004

Keywords: partition, ipsec, protocol, embedded, os

Networking capabilities are becoming a requirement for embedded systems. While the ability of systems to communicate with the rest of the world creates a huge number of new capabilities and useful features, it also opens the door to many security threats. Securing a networked system requires a high-availability, maximum-reliability RTOS combined with a secure method of transmitting information.

For all practical purposes, it is impossible to find and fix every bug in any reasonably complex application: the amount of testing and reviewing necessary to find every bug in a large application is prohibitive. If we accept that an application of any real complexity will always have bugs in it, how can we make our devices as reliable as is economically possible?

To create reliable applications, we must use an OS designed from the ground up to help create reliable systems. Note that we are talking about the reliability of the entire system, not of just the OS. One of the easiest ways to make a system more reliable is to partition it into independent subsystems. A failure in one subsystem should not affect the execution of the rest of the system in any way. This has long been a requirement in safety-critical systems, such as avionics control and medical systems.

Memory protection

The first step in properly partitioning an application is using memory protection. Most developers are familiar with the idea of memory protection: the OS uses the processor's memory management unit (MMU) to isolate applications in protected address spaces. With memory protection enabled, an application can only corrupt itself. Any attempt to read or write memory mapped into another address space will cause an MMU exception to be raised. The kernel will be notified of the attempted access and can handle it as it sees fit.

Although application A may not be able to directly read or write application B's memory, there are still ways it can cause application B to fail. For example, assume there are two tasks in a system. If the system designer wants the two tasks to share the processor time, both tasks would be given the same priority (round-robin scheduling). If task A needs 40 percent of the CPU time to execute properly, this design will work fine: there are two tasks in the system, so each task will get 50 percent of the execution time. But if task B creates an additional task, there are now three tasks in the system (A, B1, B2), each getting about 33 percent of the CPU time, and task A will no longer function properly. Although there is no bug in task A, the behavior of task B has caused task A to stop behaving properly. A more extreme case would be if there is an actual bug in task B and it creates 98 additional tasks (A plus B1 through B99). There are now 100 tasks in the system and each gets exactly 1 percent of the CPU time. At this point, the system will start to grind to a halt.
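The arithmetic in this example can be checked with a short sketch. It assumes pure round-robin scheduling at a single priority level with equal time slices; the function names are illustrative and not part of any RTOS API.

```python
def round_robin_share(task_count: int) -> float:
    """Fraction of CPU time each task gets when all share one priority level."""
    if task_count <= 0:
        raise ValueError("task_count must be positive")
    return 1.0 / task_count


def meets_requirement(task_count: int, required_share: float) -> bool:
    """True if an equal split still gives a task the share it needs."""
    return round_robin_share(task_count) >= required_share


# Task A needs 40 percent of the CPU. Try 2, 3 and 100 tasks in the system.
for count in (2, 3, 100):
    share = round_robin_share(count)
    print(f"{count:3d} tasks: {share:.0%} each, A is ok: {meets_requirement(count, 0.40)}")
```

Only the two-task case leaves task A with the 40 percent it needs.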

The cleanest solution would be to start each task with an execution "weight." If a task wants to create an additional task, it must give up some of its own weight. CPU time per priority level would be divided up based on the total weight at that priority level.
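A minimal sketch of the weight idea follows. The class and function names are hypothetical, and a real scheduler would enforce the weights in the kernel; this only shows how spawning a task redistributes the parent's share instead of diluting everyone else's.

```python
class WeightedTask:
    def __init__(self, name: str, weight: int) -> None:
        self.name = name
        self.weight = weight

    def spawn(self, name: str, weight: int) -> "WeightedTask":
        """Create a child task by giving up part of this task's own weight."""
        if not 0 < weight < self.weight:
            raise ValueError("child weight must be positive and below the parent's")
        self.weight -= weight
        return WeightedTask(name, weight)


def cpu_shares(tasks: list) -> dict:
    """Divide CPU time at one priority level in proportion to task weight."""
    total = sum(task.weight for task in tasks)
    return {task.name: task.weight / total for task in tasks}


task_a = WeightedTask("A", 50)
task_b1 = WeightedTask("B1", 50)
task_b2 = task_b1.spawn("B2", 25)  # B1 pays for B2 out of its own weight

shares = cpu_shares([task_a, task_b1, task_b2])
print(shares)  # A keeps half of the CPU; B1 and B2 split the other half
```

Because the total weight at the priority level never changes, task A keeps its 50 percent no matter how many tasks B creates.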

Another potential way for one task to interfere with the execution of another is through the use of system memory. Virtually every OS today (embedded or not) has one central memory pool for the entire system. The major disadvantage: any application can starve other applications and the kernel itself of memory. If application A has a bug that causes it to request all the memory in the system, every other task will be prevented from allocating any additional memory. Perhaps more frightening, the kernel itself won't be able to create additional kernel objects (tasks, semaphores, etc.). This is another way in which a single bug in one part of a system can cause other parts to stop behaving as designed. A classic attack that shows this problem is a Unix fork bomb.

A fork bomb is a process that just spawns other processes that are clones of itself. Each new process spawns new processes. The system quickly bogs down as thousands of processes are created. Each process requires new memory. Eventually the system crashes as all available memory is consumed.

One solution to this problem is to statically allocate a specific amount of memory to each part of the system. Each application can be guaranteed the minimum amount of memory it needs to function properly; any additional memory allocation can come from this central store of memory. This way, if application B uses all of its memory, its additional memory allocations will fail but other parts of the system (application A and the kernel) will be unaffected.
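The effect of per-partition memory limits can be simulated in a few lines. This is a sketch under simplifying assumptions: each partition has a fixed quota, memory is only counted and never actually reserved, and the names (Partition, QuotaExceeded) and sizes are invented for illustration.

```python
class QuotaExceeded(Exception):
    """Raised when a partition asks for more memory than it was given."""


class Partition:
    def __init__(self, name: str, quota_bytes: int) -> None:
        self.name = name
        self.quota = quota_bytes
        self.used = 0

    def allocate(self, size: int) -> None:
        """Account for an allocation, failing once the quota is exhausted."""
        if size <= 0:
            raise ValueError("size must be positive")
        if self.used + size > self.quota:
            raise QuotaExceeded(f"{self.name}: quota of {self.quota} bytes exhausted")
        self.used += size


kernel = Partition("kernel", 64 * 1024)
app_a = Partition("A", 128 * 1024)
app_b = Partition("B", 128 * 1024)

# Simulate a leak in application B: allocate until its own quota runs out.
try:
    while True:
        app_b.allocate(4096)
except QuotaExceeded as error:
    print("B failed:", error)

# Application A and the kernel are unaffected by B's failure.
app_a.allocate(4096)
kernel.allocate(1024)
```

The runaway allocator only exhausts its own budget, so the failure stays inside the partition that caused it.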

True partitioning of a system should include protecting not only the kernel and application code, but also the communication stack and associated device-driver code. Communication protocols are complex pieces of code rarely written by end users. As this software increasingly becomes commoditized, application developers are choosing to license rather than write their communications protocols. Unfortunately this means the internals of the stack are often poorly understood, and its reliability and behavior under all conditions cannot be guaranteed.

Time-to-market pressures are also forcing developers to use standardized controllers (Ethernet, serial etc.) and the device drivers provided by the manufacturer. Unfortunately, this again means using somebody else's code without being able to verify the correctness of it.

Typically the stack and the device drivers are linked with the kernel. Unfortunately, this means that a bug in somebody else's code can completely crash the entire system. If the TCP/IP stack or device driver is in its own address space (instead of being statically linked with the kernel), it cannot affect other applications. Also, it can be restarted and upgraded on the fly, without requiring an application reload.
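The restart-on-failure idea can be illustrated on a desktop OS with ordinary processes. This is a toy sketch, not how any particular RTOS does it: the "stack" is a hypothetical one-line child program that fails on its first launch, and the supervisor function name is invented.

```python
import subprocess
import sys

# Hypothetical stand-in for a network stack: exits with an error when its
# launch counter (passed as the first argument) is "0", and cleanly otherwise.
STACK_CODE = "import sys; sys.exit(1 if sys.argv[1] == '0' else 0)"


def supervise(max_restarts: int = 3) -> int:
    """Run the stand-in stack in its own process and restart it on failure.

    Returns the number of restarts that were needed.
    """
    for attempt in range(max_restarts + 1):
        result = subprocess.run([sys.executable, "-c", STACK_CODE, str(attempt)])
        if result.returncode == 0:
            return attempt
        print(f"stack exited with code {result.returncode}; restarting")
    raise RuntimeError("stack kept failing; giving up")


restarts = supervise()
print("restarts needed:", restarts)
```

The supervisor survives the child's failure because the two run in separate address spaces, which is the property the text attributes to a stack kept outside the kernel.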

Protecting communications

So far this article has discussed how to secure the actual application from bugs and malicious code, and concluded that a combination of memory protection, CPU time guarantees and guaranteed memory allocation will ensure that an application will not only be safe from other applications running on the same system, but also have the system resources necessary to run properly. Unfortunately, the work to create a reliable networked device does not stop there.

Ensuring secure and reliable communications means protecting the system against three main problems. First, our data packets could be modified in transit. Second, the remote side's real identity is unknown. Finally, transactions can be snooped and replayed. Any of these can compromise security and reliability, possibly allowing an attacker to insert malicious code into your application, take control of the device, or both.

The easiest solution is to add Internet Protocol Security (IPSec). It offers strong encryption of transferred data, so that an eavesdropper cannot read transmitted information. It provides integrity and authentication, discarding modified packets and verifying a peer's identity. Finally, IPSec provides replay protection by ignoring duplicate transmissions.
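Two of these properties, integrity checking and replay rejection, can be illustrated with the Python standard library. This is not IPSec and not its packet format: it is a simplified sketch that assumes a hypothetical pre-shared key, a 4-byte sequence number, and an HMAC-SHA256 tag, and it omits encryption entirely. Real IPSec also tracks sequence numbers with a sliding window rather than the unbounded set used here.

```python
import hashlib
import hmac

KEY = b"demo-shared-key"  # hypothetical pre-shared key, for illustration only
TAG_LEN = 32              # length of an HMAC-SHA256 digest in bytes


def protect(seq: int, payload: bytes, key: bytes = KEY) -> bytes:
    """Prefix a sequence number and append an integrity tag over both."""
    header = seq.to_bytes(4, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


class Receiver:
    def __init__(self, key: bytes = KEY) -> None:
        self.key = key
        self.seen = set()  # sequence numbers already accepted

    def accept(self, packet: bytes):
        """Return the payload, or None if the packet is altered or replayed."""
        if len(packet) < 4 + TAG_LEN:
            return None
        header, payload, tag = packet[:4], packet[4:-TAG_LEN], packet[-TAG_LEN:]
        expected = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # modified in transit, or sent without the key
        seq = int.from_bytes(header, "big")
        if seq in self.seen:
            return None  # duplicate transmission
        self.seen.add(seq)
        return payload


receiver = Receiver()
packet = protect(1, b"open valve")
print(receiver.accept(packet))  # first delivery is accepted
print(receiver.accept(packet))  # replayed copy is rejected
```

A packet whose payload is changed after tagging fails the HMAC comparison, and a captured packet sent a second time is dropped by the sequence-number check.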

IPSec allows this protection to be specified per packet, per socket, and per source and/or destination host. Its use is transparent to applications: they don't need to be modified to take advantage of the security IPSec provides. Old and new TCP/IP applications are automatically protected with no modification.

Although IPSec is available as an option in IPv4, the IPv6 specification makes support for it mandatory. IPv6 also provides automatic configuration (essential for embedded systems), and a huge increase in the number of IP addresses (from 2^32 to 2^128). This will become more essential as the number of networked devices grows: houses will have multiple networked devices, such as computers and gaming systems, and the average person will carry multiple networked devices, such as PDAs and cellphones.
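The size of the two address spaces can be confirmed with Python's standard ipaddress module, which reports the number of addresses in a network. The all-zero /0 networks used here simply stand for "every possible address."

```python
import ipaddress

# The /0 prefix covers the entire address space of each protocol version.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total == 2**32)                # True
print(ipv6_total == 2**128)               # True
print(ipv6_total // ipv4_total == 2**96)  # True
```

IPv6 therefore offers 2^96 times as many addresses as IPv4.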

The combination of IPSec and IPv6 provides networked devices with security, compatibility with future networking systems and easier configuration.

- Jeffery Hall

Regional Field Application Engineer

Green Hills Software Inc.
