Optimizing Embedded Applications Using DMA
By Sachin Gupta, Applications Engineer Sr, and Lakshmi Natarajan, Applications Engineer Sr, Cypress Semiconductor Corp.
Just like a human body is composed of many individual systems, an embedded system comprises multiple functions. Even a
basic mobile phone in today’s world includes calling facilities, messaging facilities, entertainment options like games, music
player, radio, camera, Bluetooth connectivity, and so on. Including so many different functions makes these systems quite
complex.
The CPU handles the arithmetic and decision-making operations. Many real-time systems include more than one
processor to handle these functions, with each processor interacting with the others. On the one hand, we add more CPUs to
share the load; on the other, the CPUs may waste time transferring data between themselves.
In an efficient real-time system, the CPU spends its time on as many useful tasks as possible, which yields better real-time
responsiveness and lower power consumption while still leaving room for future enhancements. To reduce the CPU time
wasted on data transfers, many systems include a peripheral that performs data transfers without involving the CPU.
This peripheral is called the Direct Memory Access (DMA) controller.
The DMA reduces CPU usage and hence raises system throughput. Earlier, the concept of
DMA was limited to computer applications such as PCs and servers. Due to the complexity of modern electronics and the need to
transfer large amounts of data, it has become an integral part of embedded applications as well.
Today, most high-end processors used for larger real-time systems such as automotive, avionics, and medical applications
integrate a DMA peripheral. This DMA can be used for the following types of data transfer, with availability varying from
controller to controller:
Memory to memory transfer
Memory to peripheral transfer
Peripheral to memory transfer
Peripheral to peripheral transfer
Data transfers can occupy a large percentage of CPU time in a system. Let’s consider a car dashboard system as an
example. A dashboard has to display many pieces of data that come from various subsystems. Figure 1 shows the block
diagram of a dashboard system with its subsystems.
Figure 1: Example of a real-time system and sub-systems
Each subsystem transmits its data to the central processor which controls the display. In this case, the central processor
has to receive a large amount of data, determine which subsystem transmitted each item, and update
the display accordingly. If the CPU has to handle data reception, decision-making, and display updates, system response time
will increase, resulting in a sluggish system.
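As a rough illustration of the CPU-only case, the sketch below shows the decision-making and display-update work for a batch of received messages. The subsystem names, the `message` layout, and the `display` array are all hypothetical and are not taken from the article; a real dashboard would receive the data over a bus or serial link first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subsystem identifiers; the article does not name them. */
enum subsystem { SUB_SPEED = 0, SUB_FUEL = 1, SUB_TEMP = 2, SUB_COUNT = 3 };

/* One received message: which subsystem sent it and the value to show. */
struct message {
    uint8_t  source;
    uint16_t value;
};

/* Latest value per subsystem, standing in for the display contents. */
static uint16_t display[SUB_COUNT];

/* CPU-only handling: for every message, identify the sender and update
 * the display. Messages from unknown senders are skipped.
 * Returns the number of messages accepted. */
size_t process_messages(const struct message *msgs, size_t count)
{
    size_t accepted = 0;

    assert(msgs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (msgs[i].source < SUB_COUNT) {
            display[msgs[i].source] = msgs[i].value;
            accepted++;
        }
    }
    return accepted;
}

/* Read back what the display currently shows for one subsystem. */
uint16_t displayed_value(enum subsystem s)
{
    return display[s];
}
```

Every pass through this loop is CPU time; if the CPU must also move each incoming byte into the message buffer itself, that cost comes on top of it.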
Published in EE Times Design (http://www.eetimes.com) November 2010
To improve system performance, developers can use a high-frequency CPU. This, however, comes at a tradeoff in terms of
power consumption and system cost. Alternatively, a DMA can free the CPU from these numerous data transfer operations.
The DMA handles all the data transfers, leaving the CPU to handle all the other tasks. The above example gives an initial idea
of why DMA is required in complex systems.
In a processor, all the peripherals, including memory, are connected to the CPU by buses. As shown in Figure 2, these
buses are the routes across which data is transferred between the peripherals (including memory) and the CPU. A processor
that includes a DMA has two masters on the bus: the CPU and the DMA. For instance, a simple C statement like “a = b + c”
involves CPU access to data in the memory. The CPU has to access the memory to read variables b and c, calculate the sum,
and update the value in the memory location for variable “a”. Each time the CPU wants to access the memory, it submits a request
to the bus, and the data is then read from or written to the memory location over the bus.
Figure 2: Example of a real-time system and subsystems
When a DMA is involved, it can access whichever peripherals the controller's features allow. For instance, consider a system
where the DMA is used to move received UART data to memory. In this case, the DMA has to access the UART registers and the
memory location through the bus. Since both the DMA and the CPU can access the bus, there is usually an arbitration mechanism
to handle bus access.
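The article does not say which arbitration scheme is used, and it varies between controllers. As one possible illustration, the sketch below models a round-robin arbiter for two bus masters: when both request the bus in the same cycle, the grant goes to whichever master did not receive the previous one. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bus_master { MASTER_NONE = -1, MASTER_CPU = 0, MASTER_DMA = 1 };

/* Arbiter state: remembers which master was granted the bus last. */
struct arbiter {
    enum bus_master last_granted;
};

/* Decide which master gets the bus for this cycle.
 * Assumed policy (not from the article): round-robin on contention. */
enum bus_master arbitrate(struct arbiter *arb, bool cpu_req, bool dma_req)
{
    enum bus_master grant;

    assert(arb != NULL);
    if (cpu_req && dma_req) {
        /* Both want the bus: alternate so neither master is starved. */
        grant = (arb->last_granted == MASTER_CPU) ? MASTER_DMA : MASTER_CPU;
    } else if (cpu_req) {
        grant = MASTER_CPU;
    } else if (dma_req) {
        grant = MASTER_DMA;
    } else {
        return MASTER_NONE;   /* idle cycle: history is left unchanged */
    }
    arb->last_granted = grant;
    return grant;
}
```

A fixed-priority scheme, where one master always wins on contention, would be an equally valid choice; round-robin is used here only because it makes the sharing of the bus easy to see.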
Let’s take a simple example of copying 10 bytes of data from an array in Flash to an array in SRAM to illustrate how much time
the DMA can save. When the copy is done by the CPU alone, the following steps are executed:
1. Copy the value in the Flash memory location to a general-purpose register (GPR), e.g. R0
2. Copy the GPR value to the Accumulator register
3. Copy the Accumulator value to the SRAM memory location
4. Check if all the bytes have been copied
5. If yes, end
6. If no, increment the Flash pointer
7. Increment the SRAM pointer
8. Go to Step 1
Even a simple Flash-to-SRAM copy therefore uses many instruction cycles, all of which occupy the CPU.
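The steps above can be sketched in C as follows. This is a minimal illustration, not the authors' code: ordinary arrays stand in for Flash and SRAM, the function name `cpu_copy` is hypothetical, and the exact register usage (GPR, Accumulator) depends on the CPU architecture and compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` bytes from src to dst one byte at a time, with the CPU
 * doing every load, store, comparison, and pointer increment.
 * Returns the number of bytes copied. */
size_t cpu_copy(uint8_t *dst, const uint8_t *src, size_t count)
{
    size_t copied = 0;

    assert((dst != NULL && src != NULL) || count == 0);
    while (copied < count) {      /* step 4: all bytes copied? */
        uint8_t tmp = *src;       /* steps 1-2: load byte into a register */
        *dst = tmp;               /* step 3: store byte to the destination */
        copied++;
        src++;                    /* step 6: increment the source pointer */
        dst++;                    /* step 7: increment the destination pointer */
    }                             /* step 8: back to the top of the loop */
    return copied;
}
```

Calling `cpu_copy(sram_data, flash_data, 10)` with two 10-byte arrays performs the whole transfer in software; every iteration costs several CPU instructions.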
If the same example is handled using DMA, the DMA hardware performs the transfer itself, thereby reducing the execution time.
Figure 3 shows how the DMA handles the transfer:
Figure 3: DMA handling of transfer
The DMA takes a few cycles for initialization, then the DMA hardware automatically handles the memory pointer increments.
The DMA hardware also checks for the completion of data transfer and signals the completion.
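To make this division of labor concrete, here is a small software model of a single DMA channel. It is only a sketch: the structure fields and function names are hypothetical and do not correspond to any particular controller's registers. `dma_setup` plays the role of the few initialization cycles, and each call to `dma_step` stands for one transfer performed by the hardware, including the pointer increments and the completion check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA channel's state. */
struct dma_channel {
    const uint8_t *src;        /* current source address */
    uint8_t       *dst;        /* current destination address */
    size_t         remaining;  /* bytes still to transfer */
    bool           done;       /* completion signal */
};

/* Initialization: the only work the CPU has to do in this model. */
void dma_setup(struct dma_channel *ch, uint8_t *dst,
               const uint8_t *src, size_t count)
{
    assert(ch != NULL);
    assert((dst != NULL && src != NULL) || count == 0);
    ch->src = src;
    ch->dst = dst;
    ch->remaining = count;
    ch->done = (count == 0);
}

/* One transfer, as the hardware would perform it without the CPU:
 * move a byte, advance both pointers, and check for completion.
 * Returns true once the whole transfer has finished. */
bool dma_step(struct dma_channel *ch)
{
    assert(ch != NULL);
    if (ch->remaining > 0) {
        *ch->dst++ = *ch->src++;
        ch->remaining--;
        if (ch->remaining == 0) {
            ch->done = true;   /* signal completion */
        }
    }
    return ch->done;
}
```

In real hardware the steps run in parallel with the CPU, which typically learns about completion through an interrupt or a status flag rather than by calling a function.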
Advantages of using DMA
The longer the CPU runs, the more power it consumes. Using the DMA to offload the CPU therefore reduces power
consumption.
The DMA works in parallel with the CPU, thereby simulating a multi-processing environment and effectively increasing the
CPU’s bandwidth.
Offloading the CPU results in more idle time for the CPU. This frees up available processing capacity for future product
enhancements.
Real-time examples for using DMA
Wave generation using a SoC can be a perfect example where DMA makes a tremendous difference. A wave generator needs
...