Embedded Systems Programming Course

Input/Output Devices

Introduction

A processor is not of much use unless you can pass data into and out of the CPU to external devices like serial ports, network interfaces, or displays. In this section I will talk about how I/O works on a microprocessor at a low level.

The Address Bus

Recall from our earlier discussion of microprocessors that every CPU has a number of pins that work together, called an address bus. The address bus is normally used to read or write to memory, most often RAM chips. Most modern microprocessors, however, use the address bus for more than just reading and writing memory.

By toggling a special pin, the CPU can switch from using the address bus for accessing RAM to using the address bus to talk to other semi-intelligent chips that are also connected to it. When used in this way, we are said to be using I/O port addressing instead of normal memory addressing. Sometimes a port will be referred to as a register, but I find this a bit confusing, since a register normally means an internal CPU register. The semi-intelligent device chips are only activated when they detect that the special I/O pin is asserted and the address bus holds the address assigned to that specific chip.

This is how most input and output occurs with devices like serial ports, parallel ports, and floppy, hard drive and other controllers. Once the CPU has placed the proper address on the address bus and asserted the special I/O pin, all RAM chips are temporarily disabled and the external I/O chips are read from or written to instead. The bytes of data are actually transferred on a second set of pins called the data bus.
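To make the role of the special I/O pin concrete, here is a minimal simulation in C. It is only a sketch: the arrays sim_ram and sim_ports and the function bus_cycle are hypothetical names invented for this illustration, and a real bus has timing and control signals that are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated system: 256 bytes of RAM and 256 I/O ports. */
static uint8_t sim_ram[256];
static uint8_t sim_ports[256];

/*
 * Simulate one bus cycle.
 *   io_pin   - true selects the I/O chips, false selects RAM
 *   is_write - true writes data to the target, false only reads it
 *   address  - value placed on the address bus
 *   data     - value placed on the data bus (ignored for reads)
 * Returns the value present at the target after the cycle.
 */
static uint8_t bus_cycle(bool io_pin, bool is_write, uint8_t address, uint8_t data)
{
    uint8_t *target = io_pin ? &sim_ports[address] : &sim_ram[address];

    if (is_write)
        *target = data;
    return *target;
}
```

Note that the same address value (say 0x10) refers to two different locations depending on the state of the I/O pin, which is exactly why RAM and I/O chips can share one address bus.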

In the case of modern PC-compatible personal computers, many I/O devices have fixed addresses. For example, the first serial port on a PC-compatible is conventionally found at I/O port address 0x3f8. If a second serial port is available, it will be located at port address 0x2f8, and so on. IBM defined these values when it first introduced the PC in 1981. Other manufacturers will of course use different port numbers in many cases.

Generally, you can only read or write a single byte of information from an I/O port, although some newer I/O chip devices allow 16, 32 or even 64 bit wide ports. To support more than one kind of information flow between the external device and the processor, multiple ports are often used. This is needed for example when communicating through a serial port to a modem. The modem can either transmit data out to another modem (output), or read data received from the other modem and place it into the I/O port. When used this way, the processor must have some way of telling the hardware device whether it wants to read data or send data. This is usually accomplished by sending a control word to another port that is monitored by the hardware device. One of the bits can be set (1) or cleared (0) to indicate the direction of data flow for the next operation.
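The direction bit in a control word can be sketched in C as follows. The bit position and the helper names (CTRL_DIR_BIT, ctrl_set_read and so on) are assumptions made up for this example; a real device's data sheet defines the actual layout.

```c
#include <stdint.h>

/* Hypothetical layout: bit 0 of the control word selects the direction.
 * 1 = next operation reads from the device, 0 = next operation writes. */
#define CTRL_DIR_BIT 0x01u

/* Return the control word with the direction bit set (read). */
static uint8_t ctrl_set_read(uint8_t ctrl)
{
    return (uint8_t)(ctrl | CTRL_DIR_BIT);
}

/* Return the control word with the direction bit cleared (write). */
static uint8_t ctrl_set_write(uint8_t ctrl)
{
    return (uint8_t)(ctrl & (uint8_t)~CTRL_DIR_BIT);
}

/* Non-zero if the control word requests a read. */
static int ctrl_is_read(uint8_t ctrl)
{
    return (ctrl & CTRL_DIR_BIT) != 0;
}
```

The helpers change only the direction bit and leave the other bits of the control word alone, since those bits usually carry unrelated settings.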

For example, the original IBM PC's parallel port really consists of two I/O ports. The first is used to transmit data out from the computer to an attached parallel port device (often a printer), while the second is used to read status information back from the device. In this case, one port is used for output, while the other port is used for input. The input side of the parallel port has pins that correspond to the current state of the attached device. One bit indicates whether the device is powered up, another whether it is ready to receive data, another may indicate that the printer is out of paper, etc.
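Decoding such a status byte might look something like the sketch below. The bit assignments here are hypothetical and chosen only for illustration; they are not the real PC parallel port layout, so consult the hardware documentation before relying on any particular bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit assignments, for illustration only. */
#define STATUS_POWERED   0x01u  /* device is powered up        */
#define STATUS_READY     0x02u  /* device can accept more data */
#define STATUS_PAPER_OUT 0x04u  /* printer has run out of paper */

/* True only when the device is on, ready, and has paper. */
static bool device_can_accept_data(uint8_t status)
{
    return (status & STATUS_POWERED) != 0 &&
           (status & STATUS_READY) != 0 &&
           (status & STATUS_PAPER_OUT) == 0;
}
```

A program would typically read the status port, pass the byte to a check like this, and only then write the next byte to the data port.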

The Data Bus

The data bus is nothing more than a series of pins on the processor that are used to get data into, or out of, the processor chip itself. All memory and I/O devices are connected to the data bus, but depending on the current state of the address bus and other control pins on the processor, only one chip can actually be connected to the data bus at any given moment.

Depending on the exact processor used, the data bus may be 4, 8, 16, 32 or perhaps 64 bits wide. A wider data bus allows the processor to read and write more bits of data in a single operation. This technique is used with PCI-based cards on PC-compatibles to achieve faster I/O operations for certain devices. In other cases, however, using more bits is a waste of time, because the device connected at the other end of the data bus only supports 4 or 8 bit transfers at a time. In this case it is very important to ignore the unused bits, generally by using a bit masking operation to force the unused bits to a zero value.
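The masking operation itself is a single AND. As a small sketch, assume a hypothetical 4-bit device wired to the low four lines of an 8-bit data bus; the function name read_4bit_device is invented for this example.

```c
#include <stdint.h>

/* The device drives only bits 0-3 in this sketch; bits 4-7 are
 * undefined on the bus, so force them to zero before using the value. */
static uint8_t read_4bit_device(uint8_t raw_bus_value)
{
    return (uint8_t)(raw_bus_value & 0x0Fu);
}
```

For a raw bus value of 0xA7, the upper nibble is discarded and the function returns 0x07.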

Careful hardware design is needed to ensure that no two chips attempt to write to the bus at the exact same moment. If this ever occurs, it is called bus contention. Normally, this will result in a locked, or frozen, computer. When two chips attempt to drive pins on the data bus high or low simultaneously, the values on the bus are in an indeterminate state and garbage is the result. This is one source for the term garbage-in, garbage-out.

Interrupt Requests

In addition to the processor using the data bus, address bus and special I/O pin to communicate with external devices, the external devices use another pin when they need the attention of the processor. This is referred to as an Interrupt Request Line or IRQ Line. For example, whenever you press a key on the keyboard, the keyboard controller device generally signals the main processor that a key is available by asserting the interrupt line. Some processors have a single IRQ line, which must be shared amongst all devices, while others have a series of IRQ lines.

When each device has its own unique IRQ line, determining which device needs attention is quite simple. The processor just determines which line was asserted and therefore knows which device needs attention. If there is just one IRQ line available, or several devices share one of the IRQ lines, then the processor must query each device to determine which one made the interrupt request. Once an interrupt request is received, the processor will temporarily stop executing the current program and run a special set of instructions designed to service the device. This set of instructions is called an interrupt handler or interrupt service routine (ISR), and the act of asserting one of the IRQ lines is called an interrupt request.
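Querying each device on a shared IRQ line can be sketched as a simple loop. The struct device type and its fields are hypothetical stand-ins for illustration; on real hardware the "pending" flag would be read from a status port on each device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device attached to a shared IRQ line. */
struct device {
    const char *name;
    bool irq_pending;    /* set by the device when it wants service */
    int serviced_count;  /* how many times we have serviced it      */
};

/*
 * Poll every device on the shared line and service the first one that
 * has a request pending. Returns the index of the device serviced,
 * or -1 if no device was requesting attention.
 */
static int service_shared_irq(struct device *devs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (devs[i].irq_pending) {
            devs[i].irq_pending = false;  /* acknowledge the request */
            devs[i].serviced_count++;
            return (int)i;
        }
    }
    return -1;
}
```

The order of the array acts as a crude priority scheme in this sketch: devices near the front are always checked first.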

The interrupt handler must be small and efficiently designed, since in some cases it could be invoked hundreds or maybe even thousands of times a second. Generally an interrupt handler performs the minimum amount of work necessary to service the device, and then exits. At that point, the processor returns to running the process that was interrupted as if nothing happened.

Many microprocessors use the lowest available memory addresses, starting at address 0, as an array of pointers to interrupt service routines. This is the way the 8086 family of processors works. When interrupt 0 is signaled, the processor retrieves the pointer stored at address 0, jumps to that location in memory and executes the code found there. In this way, many interrupt request lines can be handled efficiently. When the processor is initialized, after reset or upon power-up, one of the first tasks that must be performed is setting up this interrupt table to point to the interrupt handler code.
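In C, a table like this is naturally modeled as an array of function pointers. The sketch below is a software simulation only: the names vector_table, init_vector_table and dispatch_interrupt are hypothetical, the table size is arbitrary, and on a real processor the hardware performs the lookup and jump itself.

```c
#include <stddef.h>

#define NUM_VECTORS 8  /* arbitrary table size for this sketch */

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];
static int timer_ticks;

/* Handler installed for interrupts nobody cares about. */
static void default_handler(void)
{
}

/* Example handler: does the minimum work and returns. */
static void timer_handler(void)
{
    timer_ticks++;
}

/* Done once at reset: point every entry at a valid handler. */
static void init_vector_table(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++)
        vector_table[i] = default_handler;
    vector_table[0] = timer_handler;
}

/* What the hardware does when an interrupt is signaled:
 * fetch the pointer for that interrupt and jump to it. */
static void dispatch_interrupt(unsigned line)
{
    if (line < NUM_VECTORS && vector_table[line] != NULL)
        vector_table[line]();
}
```

Filling every entry with a default handler during initialization means that an unexpected interrupt runs harmless code instead of jumping to a garbage address.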

While the processor is busy handling an interrupt request, it generally cannot be interrupted again until the first request has been completed. For this reason, interrupts are generally disabled when first entering an interrupt handler. Most processors require that one of the first actions the interrupt handler performs is to enable interrupts again.

There are normally two different types of interrupt lines on a processor. The first is the kind we have been discussing up to this point, called maskable interrupts. Maskable in this case means that interrupts can be selectively enabled or disabled by the software. The other kind of interrupt is called a non-maskable interrupt. The software can never disable this kind of interrupt. It is most often used to perform the DRAM refresh on memory chips, which MUST occur at regular intervals in order to keep memory contents alive.

You will probably have to write a memory refresh handler in only a few cases, since hardware support chips are widely available that can perform memory refresh operations without the processor's involvement. If you do need to write your own memory refresh handler, keep in mind the following items. First, do not attempt to refresh all memory during every call to your handler. This will steal precious CPU time from other running programs. Second, you must calculate how often the refresh handler will be invoked and how much memory to refresh during each cycle, based on the amount of RAM involved and the decay rate of the RAM.
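The calculation in the second item can be sketched as follows, assuming the memory is refreshed one row at a time and that every row must be refreshed once within the RAM's retention time. The function name and its parameters are hypothetical, and the real figures must come from the data sheet for your memory chips.

```c
#include <stdint.h>

/*
 * How many rows must each call to the refresh handler cover so that
 * all rows are refreshed within the retention time?
 *   total_rows        - number of rows in the RAM
 *   retention_us      - how long a row holds its contents, in microseconds
 *   handler_period_us - time between calls to the handler, in microseconds
 * The result is rounded up, so we never refresh too slowly.
 */
static uint32_t rows_per_refresh_call(uint32_t total_rows,
                                      uint32_t retention_us,
                                      uint32_t handler_period_us)
{
    if (retention_us == 0)
        return total_rows;  /* degenerate input: refresh everything */

    uint64_t n = (uint64_t)total_rows * handler_period_us;
    return (uint32_t)((n + retention_us - 1) / retention_us);
}
```

With made-up figures of 512 rows, a retention time of 8000 microseconds and a handler that runs every 500 microseconds, each call would need to refresh 32 rows.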

Memory Mapped I/O

I/O port addressing is not the only way the processor can communicate with external devices, however. Another commonly used technique is called memory mapped I/O. In this case, instead of asserting the I/O pin and addressing a data port, the processor just accesses a memory address directly. The external device can have a small amount of RAM or ROM that the processor just reads or writes as needed.

This technique has been used for years with many devices that require large amounts of data, such as video adapters. Instead of feeding data a byte at a time through a couple of data I/O ports, the processor can handle large amounts of data quickly by just accessing special RAM. Many times dual-ported RAM is used when two processors must both be able to access the memory simultaneously.
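In C, memory mapped I/O usually comes down to writing through a volatile pointer. The sketch below uses an ordinary array, fake_video_ram, as a stand-in for device memory so that it can run anywhere; on real hardware the pointer would instead be set to a fixed address taken from the device documentation. The 80 by 25 screen size and all names are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SCREEN_COLS 80
#define SCREEN_ROWS 25

/* Stand-in for the device's memory (hypothetical, one byte per cell). */
static uint8_t fake_video_ram[SCREEN_COLS * SCREEN_ROWS];

/* volatile tells the compiler that every access matters and must not
 * be optimized away or reordered, because a device is watching. */
static volatile uint8_t *const video = fake_video_ram;

/* Store one character at the given row and column, if on screen. */
static void put_char_at(size_t row, size_t col, char c)
{
    if (row < SCREEN_ROWS && col < SCREEN_COLS)
        video[row * SCREEN_COLS + col] = (uint8_t)c;
}

/* Store a string starting at the given position. */
static void put_string_at(size_t row, size_t col, const char *s)
{
    while (*s != '\0')
        put_char_at(row, col++, *s++);
}
```

Notice that no special I/O instructions are involved: writing to the screen is just an ordinary memory store.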

Direct Memory Access

One technique that has been used for years to speed the transfer of data from main memory to an external device's memory is direct memory access (DMA). The processor on the external device executes DMA transfers without any assistance from the main processor. Obviously, the processors must cooperate for this to work. While the DMA transfer is in progress, the main processor is free to tend to other tasks, but it should not attempt to modify the information in the buffer being transferred until the transfer is complete.

The setup for this is fairly straightforward. The main processor generally initiates the DMA transfer by sending one or more instructions to a data port monitored by the external processor. The instructions must include a direction (to or from main memory), a memory address, and a length at a minimum. Keep in mind that many DMA enabled systems have a limit on the maximum size transfer allowed (often 64K).
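The three pieces of information listed above fit naturally into a small structure. This is only a sketch under stated assumptions: struct dma_request, dma_request_init and the 64K limit constant are hypothetical, and a real DMA controller defines its own command format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed maximum transfer size for this sketch (64K bytes). */
#define DMA_MAX_LENGTH 65536u

enum dma_direction {
    DMA_TO_DEVICE,    /* main memory -> external device */
    DMA_FROM_DEVICE   /* external device -> main memory */
};

/* The minimum information a DMA request must carry. */
struct dma_request {
    enum dma_direction direction;
    uint32_t address;  /* start of the buffer in main memory */
    uint32_t length;   /* number of bytes to transfer        */
};

/*
 * Fill in a request after checking it against the size limit.
 * Returns false, leaving *req untouched, if the length is zero
 * or larger than the controller can handle.
 */
static bool dma_request_init(struct dma_request *req,
                             enum dma_direction direction,
                             uint32_t address, uint32_t length)
{
    if (length == 0 || length > DMA_MAX_LENGTH)
        return false;

    req->direction = direction;
    req->address = address;
    req->length = length;
    return true;
}
```

A larger buffer would have to be split by the caller into several requests, each no bigger than the limit.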

Once the transfer is started, the main processor is free to tend to other tasks. The external processor will take over the address and data lines periodically and execute the DMA transfer. Once the transfer is complete, the external device usually notifies the main processor of this by raising an interrupt request.

DMA's main advantage is that the main processor does not have to transfer data into one of its registers and then save it to a memory address for each and every byte of data. Another advantage is that while the DMA transfer is in progress, the CPU is free to work on other tasks. This leads to an apparent overall increase in speed.