The computer you
are using to read this page uses a microprocessor to do
its work. The microprocessor is the heart of any normal
computer, whether it is a desktop
machine, a server
or a laptop. The
microprocessor you are using might be a Pentium, a K6, a
PowerPC, a Sparc or any of the many other brands and types of
microprocessors, but they all do approximately the same thing
in approximately the same way.
If you have ever wondered what the microprocessor in your
computer is doing, or if you have ever wondered about the
differences between types of microprocessors, then read on. In
this edition of HowStuffWorks,
you will learn how fairly simple digital logic techniques
allow a computer to do its job, whether its playing a game or
spell checking a document!
Microprocessor History A microprocessor --
also known as a CPU or central processing unit -- is a
complete computation engine that is fabricated on a single
chip. The first microprocessor was the Intel 4004, introduced
in 1971. The 4004 was not very powerful -- all it could do was
add and subtract, and it could only do that 4 bits at a
time. But it was amazing that everything was on one chip.
Prior to the 4004, engineers built computers either from
collections of chips or from discrete components (transistors
wired one at a time). The 4004 powered one of the first
portable electronic calculators.
The first microprocessor to make it into a home computer
was the Intel 8080, a complete 8-bit computer on one chip,
introduced in 1974. The first microprocessor to make a real
splash in the market was the Intel 8088, introduced in 1979
and incorporated into the IBM PC (which first appeared around
1982). If you are familiar with the PC market and its history,
you know that the PC market moved from the 8088 to the 80286
to the 80386 to the 80486 to the Pentium to the Pentium II to
the Pentium III to the Pentium 4. All of these microprocessors
are made by Intel and all of them are improvements on the
basic design of the 8088. The Pentium 4 can execute any piece
of code that ran on the original 8088, but it does it about
5,000 times faster!
The following table helps you to understand the differences
between the different processors that Intel has introduced
over the years.
Name |
Date |
Transistors |
Microns |
Clock speed |
Data width |
MIPS |
8080 |
1974 |
6,000 |
6 |
2
MHz |
8
bits |
0.64 |
8088 |
1979 |
29,000 |
3 |
5
MHz |
16
bits 8-bit bus |
0.33 |
80286 |
1982 |
134,000 |
1.5 |
6
MHz |
16
bits |
1 |
80386 |
1985 |
275,000 |
1.5 |
16
MHz |
32
bits |
5 |
80486 |
1989 |
1,200,000 |
1 |
25
MHz |
32
bits |
20 |
Pentium |
1993 |
3,100,000 |
0.8 |
60
MHz |
32
bits 64-bit bus |
100 |
Pentium
II |
1997 |
7,500,000 |
0.35 |
233
MHz |
32
bits 64-bit bus |
~300 |
Pentium
III |
1999 |
9,500,000 |
0.25 |
450
MHz |
32
bits 64-bit bus |
~510 |
Pentium
4 |
2000 |
42,000,000 |
0.18 |
1.5
GHz |
32
bits 64-bit bus |
~1,700 | Compiled from The
Intel Microprocessor Quick Reference Guide and TSCP
Benchmark Scores
Information about this table:
What's a Chip?A chip is also called an integrated
circuit. Generally it is a small, thin piece of silicon
onto which the transistors
making up the microprocessor have been etched. A chip
might be as large as an inch on a side and can contain
tens of millions of transistors. Simpler processors
might consist of a few thousand transistors etched
onto a chip just a few millimeters
square.
|
- The date is the year that the processor was first
introduced. Many processors are re-introduced at higher
clock speeds for many years after the original release date.
- Transistors is the number of transistors on the
chip. You can see that the number of transistors on a single
chip has risen steadily over the years.
- Microns is the width, in microns, of the smallest
wire on the chip. For comparison, a human hair is 100
microns thick. As the feature size on the chip goes down,
the number of transistors rises.
- Clock speed is the maximum rate that the chip can
be clocked at. Clock speed will make more sense in the next
section.
- Data Width is the width of the ALU. An 8-bit ALU
can add/subtract/multiply/etc. two 8-bit numbers, while a
32-bit ALU can manipulate 32-bit numbers. An 8-bit ALU would
have to execute four instructions to add two 32-bit numbers,
while a 32-bit ALU can do it in one instruction. In many
cases, the external data bus is the same width as the ALU,
but not always. The 8088 had a 16-bit ALU and an 8-bit bus,
while the modern Pentiums fetch data 64 bits at a time for
their 32-bit ALUs.
- MIPS stands for "millions of instructions per
second" and is a rough measure of the performance of a CPU.
Modern CPUs can do so many different things that MIPS
ratings lose a lot of their meaning, but you can get a
general sense of the relative power of the CPUs from this
column.
From this table you can see that, in
general, there is a relationship between clock speed and MIPS.
The maximum clock speed is a function of the manufacturing
process and delays within the chip. There is also a
relationship between the number of transistors and MIPS. For
example, the 8088 clocked at 5 MHz but only executed at 0.33
MIPS (about one instruction per 15 clock cycles). Modern
processors can often execute at a rate of two instructions per
clock cycle. That improvement is directly related to the
number of transistors on the chip and will make more sense in
the next section.
Inside a Microprocessor To understand how a
microprocessor works, it is helpful to look inside and learn
about the logic used to create one. In the process you can
also learn about assembly language -- the native
language of a microprocessor -- and many of the things that
engineers can do to boost the speed of a processor.
A microprocessor executes a collection of machine
instructions that tell the processor what to do. Based on the
instructions, a microprocessor does three basic things:
- Using its ALU (Arithmetic/Logic Unit), a microprocessor
can perform mathematical operations like addition,
subtraction, multiplication and division. Modern
microprocessors contain complete floating point processors
that can perform extremely sophisticated operations on large
floating point numbers.
- A microprocessor can move data from one memory
location to another.
- A microprocessor can make decisions and jump to a new
set of instructions based on those decisions.
There
may be very sophisticated things that a microprocessor does,
but those are its three basic activities. The following
diagram shows an extremely simple microprocessor capable of
doing those three things:
This is about as simple as a microprocessor gets. This
microprocessor has:
- An address bus (that may be 8, 16 or 32 bits
wide) that sends an address to memory
- A data bus (that may be 8, 16 or 32 bits wide)
that can send data to memory or receive data from memory
- An RD (read) and WR (write) line to tell
the memory whether it wants to set or get the addressed
location
- A clock line that lets a clock pulse sequence the
processor
- A reset line that resets the program counter to
zero (or whatever) and restarts execution
Let's
assume that both the address and data buses are 8 bits wide in
this example.
Here are the components of this simple microprocessor:
- Registers A, B and C are simply latches made out of
flip-flops. (See the section on "edge-triggered latches" in
How
Boolean Logic Works for details.)
- The address latch is just like registers A, B and C.
- The program counter is a latch with the extra ability to
increment by 1 when told to do so, and also to reset to zero
when told to do so.
- The ALU could be as simple as an 8-bit adder (see the
section on adders in How Boolean
Logic Works for details), or it might be able to add,
subtract, multiply and divide 8-bit values. Let's assume the
latter here.
- The test register is a special latch that can hold
values from comparisons performed in the ALU. An ALU can
normally compare two numbers and determine if they are
equal, if one is greater than the other, etc. The test
register can also normally hold a carry bit from the last
stage of the adder. It stores these values in flip-flops and
then the instruction decoder can use the values to make
decisions.
- There are six boxes marked "3-State" in the diagram.
These are tri-state buffers. A tri-state buffer can
pass a 1, a 0 or it can essentially disconnect its output
(imagine a switch that totally disconnects the output line
from the wire that the output is heading toward). A
tri-state buffer allows multiple outputs to connect to a
wire, but only one of them to actually drive a 1 or a 0 onto
the line.
- The instruction register and instruction decoder are
responsible for controlling all of the other components.
Although
they are not shown in this diagram, there would be control
lines from the instruction decoder that would:
- Tell the A register to latch the value currently on the
data bus
- Tell the B register to latch the value currently on the
data bus
- Tell the C register to latch the value currently on the
data bus
- Tell the program counter register to latch the value
currently on the data bus
- Tell the address register to latch the value currently
on the data bus
- Tell the instruction register to latch the value
currently on the data bus
- Tell the program counter to increment
- Tell the program counter to reset to zero
- Activate any of the six tri-state buffers (six separate
lines)
- Tell the ALU what operation to perform
- Tell the test register to latch the ALU's test bits
- Activate the RD line
- Activate the WR line
Coming into the instruction
decoder are the bits from the test register and the clock
line, as well as the bits from the instruction register.
RAM and ROM The previous section talked
about the address and data buses, as well as the RD and WR
lines. These buses and lines connect either to RAM or ROM --
generally both. In our sample microprocessor, we have an
address bus 8 bits wide and a data bus 8 bits wide. That means
that the microprocessor can address (28) 256 bytes of memory, and it can read
or write 8 bits of the memory at a time. Let's assume that
this simple microprocessor has 128 bytes of ROM starting at
address 0 and 128 bytes of RAM starting at address 128.
ROM
stands for read-only memory. A ROM chip is programmed with a
permanent collection of pre-set bytes. The address bus tells
the ROM chip which byte to get and place on the data bus. When
the RD line changes state, the ROM chip presents the selected
byte onto the data bus.
RAM
stands for random-access memory. RAM contains bytes of
information, and the microprocessor can read or write to those
bytes depending on whether the RD or WR line is signaled. One
problem with today's RAM chips is that they forget everything
once the power
goes off. That is why the computer needs ROM.
By the way, nearly all computers contain some amount of ROM
(it is possible to create a simple computer that contains no
RAM -- many microcontrollers
do this by placing a handful of RAM bytes on the processor
chip itself -- but generally impossible to create one that
contains no ROM). On a PC, the ROM is
called the BIOS (Basic
Input/Output System). When the microprocessor starts, it
begins executing instructions it finds in the BIOS. The BIOS
instructions do things like test the hardware in the machine,
and then it goes to the hard disk to fetch the boot
sector (see How Hard
Disks Work for details). This boot sector is another small
program, and the BIOS stores it in RAM after reading it off
the disk. The microprocessor then begins executing the boot
sector's instructions from RAM. The boot sector program will
tell the microprocessor to fetch something else from the hard
disk into RAM, which the microprocessor then executes, and so
on. This is how the microprocessor loads and executes the
entire operating
system.
Microprocessor Instructions Even the
incredibly simple microprocessor shown in the previous example
will have a fairly large set of instructions that it can
perform. The collection of instructions is implemented as bit
patterns, each one of which has a different meaning when
loaded into the instruction register. Humans are not
particularly good at remembering bit patterns, so a set of
short words are defined to represent the different bit
patterns. This collection of words is called the assembly
language of the processor. An assembler can
translate the words into their bit patterns very easily, and
then the output of the assembler is placed in memory for the
microprocessor to execute.
Here's the set of assembly language instructions that the
designer might create for the simple microprocessor in our
example:
- LOADA mem - Load register A from memory address
- LOADB mem - Load register B from memory address
- CONB con - Load a constant value into register B
- SAVEB mem - Save register B to memory address
- SAVEC mem - Save register C to memory address
- ADD - Add A and B and store the result in C
- SUB - Subtract A and B and store the result in C
- MUL - Multiply A and B and store the result in C
- DIV - Divide A and B and store the result in C
- COM - Compare A and B and store the result in
test
- JUMP addr - Jump to an address
- JEQ addr - Jump, if equal, to address
- JNEQ addr - Jump, if not equal, to address
- JG addr - Jump, if greater than, to address
- JGE addr - Jump, if greater than or equal, to
address
- JL addr - Jump, if less than, to address
- JLE addr - Jump, if less than or equal, to
address
- STOP - Stop execution
If you have read How C Programming
Works, then you know that this simple piece of C code will
calculate the factorial of 5 (where the factorial of 5 = 5! =
5 * 4 * 3 * 2 * 1 = 120):
At the end of the program's execution, the variable
f contains the factorial of 5.
A C compiler translates this C code into assembly
language. Assuming that RAM starts at address 128 in this
processor, and ROM (which contains the assembly language
program) starts at address 0, then for our simple
microprocessor the assembly language might look like this:
So now the question is, "How do all of these instructions
look in ROM?" Each of these assembly language instructions
must be represented by a binary number. For the sake of
simplicity, let's assume each assembly language instruction is
given a unique number, like this:
- LOADA - 1
- LOADB - 2
- CONB - 3
- SAVEB - 4
- SAVEC mem - 5
- ADD - 6
- SUB - 7
- MUL - 8
- DIV - 9
- COM - 10
- JUMP addr - 11
- JEQ addr - 12
- JNEQ addr - 13
- JG addr - 14
- JGE addr - 15
- JL addr - 16
- JLE addr - 17
- STOP - 18
The numbers are known as
opcodes. In ROM, our little program would look like
this:
You can see that seven lines of C code became 17 lines of
assembly language, and that became 31 bytes in ROM.
The instruction decoder needs to turn each of the opcodes
into a set of signals that drive the different components
inside the microprocessor. Let's take the ADD instruction as
an example and look at what it needs to do:
- During the first clock cycle, we need to actually load
the instruction. Therefore the instruction decoder needs to:
- activate the tri-state buffer for the program counter
- activate the RD line
- activate the data-in tri-state buffer
- latch the instruction into the instruction register
- During the second clock cycle, the ADD instruction is
decoded. It needs to do very little:
- set the operation of the ALU to addition
- latch the output of the ALU into the C register
- During the third clock cycle, the program counter is
incremented (in theory this could be overlapped into the
second clock cycle).
Every instruction can be broken
down as a set of sequenced operations like these that
manipulate the components of the microprocessor in the proper
order. Some instructions, like this ADD instruction, might
take two or three clock cycles. Others might take five or six
clock cycles.
Microprocessor Performance The number of
transistors available has a huge effect on the
performance of a processor. As seen earlier, a typical
instruction in a processor like an 8088 took 15 clock cycles
to execute. Because of the design of the multiplier, it took
approximately 80 cycles just to do one 16-bit multiplication
on the 8088. With more transistors, much more powerful
multipliers capable of single-cycle speeds become possible.
More transistors also allow for a technology called
pipelining. In a pipelined architecture, instruction
execution overlaps. So even though it might take five clock
cycles to execute each instruction, there can be five
instructions in various stages of execution simultaneously.
That way it looks like one instruction completes every clock
cycle.
Many modern processors have multiple instruction decoders,
each with its own pipeline. This allows for multiple
instruction streams, which means that more than one
instruction can complete during each clock cycle. This
technique can be quite complex to implement, so it takes lots
of transistors.
The trend in processor design has been toward full 32-bit
ALUs with fast floating point processors built in and
pipelined execution with multiple instruction streams. There
has also been a tendency toward special instructions (like the
MMX instructions) that make certain operations particularly
efficient. There has also been the addition of hardware virtual
memory support and L1 caching on
the processor chip. All of these trends push up the transistor
count, leading to the multi-million transistor powerhouses
available today. These processors can execute about one
billion instructions per second!
For more information on microprocessors and related topics,
check out the links on the next page.
Lots More Information!
Related HowStuffWorks
Articles
More Great Links!
|