PowerPC: lots of layers
The PowerPC chip is small, but it has lots of layers. The current PowerPC 601 is made only by IBM, in a fab in Burlington, Vermont. It is very difficult to get data on PowerPC, says Mashey, because it is a very aggressive, very proprietary effort, and it is hard to get a handle on what the parts cost to make. IBM is working with Motorola to get production of the part transferred into higher-volume, lower-cost processes. We suspect this is expensive, but we don’t know, says Mashey, although die sizes and metal layers give you pretty good guesses. PowerPC emphasises floating point performance over integer and is a reasonable architecture, according to Mashey.
The cost of going 64-bit
Meanwhile, MIPS and DEC have already paid the development price of going to 64-bit – a cost others will have to bear in future. MIPS has earned itself more die space for things like multiprocessor support. Other vendors have used the space gained from shrinking transistor and process sizes for things like more floating point performance. The next round of microprocessor developments will see all the architectures get more transistors per chip and standard CPUs drop in price, Mashey expects. Some will have to integrate some of the things we’ve already done. We’ll put more things into floating point, he says. He uses the example of SuperSparc to point out some of the pitfalls of trying to move too far too fast. It’s big, three-layer, BiCMOS and aggressive in many directions at once. There’s a rule of thumb I like to use which says the probability of success for a project is roughly one over two to the power of the number of risk factors. In other words, if you want a good chance of success, you’d better just take one risk and that’s about it.
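Written out as a back-of-envelope formula (this is simply Mashey’s heuristic put into symbols, not a published model):

    P(\text{success}) \approx \frac{1}{2^{\,n}}, \qquad n = \text{number of independent risk factors}

So a project that takes one risk has roughly even odds, while one that is aggressive in, say, three directions at once is down to about one chance in eight.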
All current projects are do-able in present technology
Mashey is confident all the MIPS R series projects now on the go, such as TFP, T5 Terminator, Terminator II and VRX, are do-able with semiconductor processes available today. There are a number of problems that stand in the way, however. One is the time needed to debug, find and fix problems. You can have produced thousands of good chips and still find a horrible bug, he says. It’s not as if the chip can’t add and subtract on the first try – if it can’t do that then you shoot all the engineers. Mashey has as weird a collection of examples of circumstances that have led to bugs as can be imagined – bugs that can surface even after all kinds of diagnostics, fine tuning and performance benchmarks have been run – but basically the chip forgets what it is doing. The problem is that such bugs are very hard to replicate and therefore to eliminate. Another problem is that the increasing speed and parallelisation of chip technology makes it more and more difficult to get the CPU to take an interrupt, stop and go off and do something else, without scrambling its brain. In Mashey’s words, it is like brakes on a car: the faster the car goes, the harder it is to stop in a straight line without killing people. Speculative execution processors, like MIPS’ next-generation T5 Terminator, need a clearer exception-handling mechanism than any chips have had in the past. The T5, for example, will be able to process stuff that is up to four branches ahead; Mashey says MIPS has a mechanism that will handle exceptions as well as undoing all the speculative processing. In MIPS’ case, the technology is more aggressive, but not something fundamentally different, he argues. This stuff gives you grey hair – I’m only 20 years old really – don’t get your children to do this kind of stuff, it’s bad for you, he warns.
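To make the undo requirement concrete, here is a deliberately simplified checkpoint-and-rollback sketch. It is a toy model written for illustration only – it is not a description of the T5’s actual mechanism, which Mashey does not detail, and the class and register names are invented.

    # Toy model of speculative execution with clean rollback.
    # The point it illustrates is the one in the text: work done past an
    # unresolved branch must be discarded cleanly if the prediction was
    # wrong (or an exception strikes on the speculative path).

    MAX_SPECULATIVE_BRANCHES = 4  # the article says T5 can run up to four branches ahead


    class ToyCPU:
        def __init__(self):
            self.registers = {}    # architectural state
            self.checkpoints = []  # saved state, one snapshot per unresolved branch

        def speculate_past_branch(self):
            """Take a checkpoint before running ahead of an unresolved branch."""
            if len(self.checkpoints) >= MAX_SPECULATIVE_BRANCHES:
                raise RuntimeError("cannot speculate more than four branches ahead")
            self.checkpoints.append(dict(self.registers))  # snapshot current state

        def execute(self, reg, value):
            """Update a register, speculatively or not."""
            self.registers[reg] = value

        def resolve_branch(self, predicted_correctly):
            """The oldest unresolved branch resolves: commit or roll back."""
            checkpoint = self.checkpoints.pop(0)
            if not predicted_correctly:
                # Misprediction (or exception on the speculative path):
                # discard everything done since the checkpoint, and any
                # younger speculation along with it.
                self.registers = checkpoint
                self.checkpoints.clear()


    if __name__ == "__main__":
        cpu = ToyCPU()
        cpu.execute("r1", 10)            # committed work
        cpu.speculate_past_branch()      # run ahead of an unresolved branch
        cpu.execute("r2", 99)            # speculative work
        cpu.resolve_branch(predicted_correctly=False)
        print(cpu.registers)             # {'r1': 10} -- the speculative update is gone

A real design would not copy the whole register file like this; techniques such as register renaming and reorder buffers achieve the same effect far more cheaply. The contract is the same, though: nothing done past an unresolved branch may become visible until the branch – and any exception met along the way – has been sorted out.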
The additional problems faced by those that desert CMOS for BiCMOS – and lagging interconnect technology
Although MIPS has done some pioneering work on mainstream 64-bit microprocessor technology, it remains wary of the latest BiCMOS techniques used in chips like the Texas SuperSparc and Intel Pentium. BiCMOS combines two kinds of transistor on one die: conventional CMOS transistors, which are dense and low-power and switch between two states – off or on, 0 or 1 – and bipolar transistors, which take more space and power but can switch and drive signals faster. We look at BiCMOS every time and reject it, says Mashey. The problem is that the shrinking of BiCMOS transistors lags roughly six months to a year behind the shrinking of CMOS transistors, he explains, and the size of the transistor is the main driving force behind what the clock frequency in MHz can be. There are uses for it [BiCMOS], says Mashey, we just find it gets to be expensive and difficult to make and difficult to shrink. SuperSparc and Pentium, he observes, have both encountered well-documented difficulties getting up to speed. As a technology BiCMOS is OK, it just doesn’t seem to work as well as people think. Gallium Arsenide? Well, the joke is that it is the technology of the future – and always will be. GaAs is difficult to make and very brittle, and although it is faster, the problem is that you can’t get enough transistors onto one chip, so you end up having to connect lots of chips with lots of wires – and the wires are slow. Indeed, the speed of light is a serious problem at this point, says Mashey, and there’s not much we can do about that. The problem is that a signal travels a shorter distance in a nanosecond down a wire than light does in a nanosecond in a vacuum – and people care about single nanoseconds. On a silicon CPU, people worry about whether the clock ticks the same at different corners of the chip. Imagine if you are building a Cray-type machine with thousands of GaAs chips and lots of wires – the physical universe is against you.
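For scale (these are standard physics ballpark figures, not numbers from Mashey): in one nanosecond light in a vacuum covers

    c \times 1\,\text{ns} = 3\times10^{8}\,\text{m/s} \times 10^{-9}\,\text{s} \approx 30\,\text{cm}

while a signal on a real wire typically manages perhaps half of that, and often much less on resistive on-chip wiring – the exact fraction depends on the wire and is a ballpark assumption here. A machine whose interconnect runs to metres of cable therefore burns many nanoseconds just moving signals around.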
The present rate of progress will not be sustainable
As far as the other performance limitations of microprocessor technology go, Mashey says the current rate of increase in CPU performance is unsustainable in the mid-term, and will drop over time. Once semiconductor technology reaches 500MHz and 0.1 micron parameters, around 1997-98, he expects there to be a slow-down, because transistor features will be getting smaller than the wavelength of visible light. The industry will have to digest this and a host of other infrastructure changes: it’ll be traumatic and will slow stuff down, says Mashey. The industry will have to move to X-ray or electron beam etching technology, he believes. And at the same time as increases in individual CPU performance drop from the 50% per year seen today to 20% or 30%, something else will happen, says Mashey. At the moment, decreasing transistor size means that a 0.8 micron CPU re-done in 0.4 micron technology will fit into a space one quarter of the size it was. For the moment, things like extra cache, a second floating point or integer unit, speculative execution or other more complicated devices can usefully take up some of the space gained. But can the whole thing be taken and shrunk again onto a chip a quarter of the size, in 0.1 micron technology, for example? Can we get another round of more complicated technology in? asks Mashey. I don’t think so. We’ll start seeing several CPUs per chip. The problem is that the wires used to link the transistors aren’t shrinking as fast as the transistors themselves. There are two kinds of limit on CPU speed. One is how fast the transistors can switch – and they switch faster as they get smaller. The other is how fast signals travel along the wires – the resistance-capacitance delay of the wires themselves. On a big chip with long wires, that is where all the delays will come from, argues Mashey. He says firms will put several CPUs together on a chip and keep the wire distances small by running only short links between them, or even between the different parts of the individual CPUs. By keeping the wire lengths as short as possible, the inherent slowness of the wires themselves won’t hurt too much. However, what you will not be able to do in 500MHz, 0.1 micron technology is use wires the size of those in a modern chip. The wires will stop you cold. And at the moment, we can’t figure out how to make the long wires go faster.
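The quarter-of-the-size claim follows directly from the geometry (a straightforward calculation, not an additional claim of Mashey’s): halving the feature size halves both dimensions of a layout, so

    A_{0.4\,\mu\text{m}} \approx \left(\frac{0.4}{0.8}\right)^{2} A_{0.8\,\mu\text{m}} = \frac{A_{0.8\,\mu\text{m}}}{4}

Halving the features again, to 0.2 micron, quarters the area once more, and going all the way to 0.1 micron cuts it by a further factor of 16 – which is why the question becomes what to fill the freed-up silicon with, and why the wires that do not shrink in step start to dominate.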