I asked some developers about GPUs, CPUs, PPUs, and REV recently.
Quote:
Wow, I could go on and on for a long time about what these three chips do. Here's the capsule version, and feel free to ask questions.
So, the most specialized chip first, the PPU. It does physics. You send it some information about the world and it does the complex math for physical simulation. Usually this simulation is kinematics, collisions, gravity, etc. With a dedicated PPU, it may also be fluid flow like water and smoke or soft body stuff like cloth or gelatin. Physics does not usually include optics and light. That's the GPU's territory.
Now, the GPU. The GPU takes in the objects to render and does all the math to convert them into the camera's space and light them, etc. Common GPU tasks are lighting, texturing, skinning, transparency, etc.
Finally, the CPU. The CPU does everything else. That's just too simple of a way to say it though.

The GPU and PPU are specialized processors. They take data formatted a certain way and process it according to very special rules. The CPU is more flexible. You can throw virtually anything at it and it will handle it. What's the tradeoff here? If you tried to do what the GPU does on the CPU, it would be orders of magnitude slower. On the other hand, the GPU is wholly incapable of some of the things that the CPU can do.
Now, regarding your overall question, here's the data flow of a game. Load up some data and run the game logic on it. The game logic may need physics data so it asks the PPU for that information. The PPU grabs the data over the bus and returns some answers. Once all that's done, the CPU moves some objects around and then tells the GPU to draw them in the right place. The GPU grabs the vertices and textures over the bus and does some drawing. The addition of more flexible GPUs and PPUs means that the CPU does less calculation and more administrative work. It hands stuff off to other processors and waits for answers. Sometimes, this is actually occurring a frame or two out of sync. For example, you see a frame on TV and enter controller commands whose results are actually rendered two frames later. You don't notice this in general because it's only 1/30th of a second.
As for the claim that such and such a console has way more power than some other console: don't believe the hype. GameCube has the most powerful CPU of the current generation even though Xbox has a higher clock speed, but that's not what the hype machine would have you believe. Wait for the consoles to come out. I'm pretty confident that PS3 will be the most powerful on paper but it will be theoretical power that is too damn complicated for anyone to actually harness in a game.
End Quote:
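To make the data flow in that quote a bit more concrete, here is a toy frame loop in Python. It is only a sketch: PhysicsUnit and GraphicsUnit are hypothetical stand-ins I made up for a PPU and a GPU, not any real console API, and the "frame or two out of sync" behaviour is imitated with a simple queue.

```python
from collections import deque


class PhysicsUnit:
    """Hypothetical stand-in for a PPU: applies gravity to each object."""

    def step(self, objects, dt):
        gravity = -9.8  # assumed constant, metres per second squared
        for obj in objects:
            obj["vy"] += gravity * dt
            obj["y"] += obj["vy"] * dt
        return objects


class GraphicsUnit:
    """Hypothetical stand-in for a GPU that draws frames a little late."""

    def __init__(self, latency_frames=2):
        self.latency_frames = latency_frames
        self.pending = deque()  # frames submitted but not yet drawn
        self.drawn = []         # frames that have reached the screen

    def submit(self, frame_number, objects):
        # Keep a snapshot so later physics steps do not change this frame.
        self.pending.append((frame_number, [dict(o) for o in objects]))
        if len(self.pending) > self.latency_frames:
            self.drawn.append(self.pending.popleft())


def run_frames(num_frames, dt=1.0 / 60.0):
    """The CPU acting as administrator: hand work off, collect answers."""
    ppu = PhysicsUnit()
    gpu = GraphicsUnit(latency_frames=2)
    objects = [{"y": 10.0, "vy": 0.0}]
    for frame in range(num_frames):
        objects = ppu.step(objects, dt)  # ask the "PPU" for physics answers
        gpu.submit(frame, objects)       # tell the "GPU" what to draw
    return gpu.drawn


if __name__ == "__main__":
    for frame_number, snapshot in run_frames(5):
        print(frame_number, snapshot)
```

With a latency of two frames, the frame submitted first only reaches the "screen" while the CPU is already working on the third one, which is the lag the developer describes.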
Then I asked him about CELL; this was all before E3.
Quote:
So, about CELL. Here's what I can say from what has been publicly disclosed. First though, a quick discussion of microprocessor architecture. I promise it won't be too bad.
So, as I mentioned about the CPU and GPU, they do different things. GPUs expect specially formatted data and perform on it in a certain way. CPUs accept general data and do general operations. That's not entirely true. CPUs generally have what's called a SIMD unit now. SIMD stands for Single Instruction Multiple Data. SIMD works on specially formatted data and runs much faster. GPUs, PPUs, and the SPEs in CELL are all SIMD units.
Now, if I have two numbers like 5 and 6, I can multiply them to get 30. Let's say we're on a computer where that takes 10 clock cycles. If I have a character with 1000 vertices, each of which has an X, Y, and Z coordinate, and I want to multiply those vertices by a number, I have to do 3000 multiplications at 10 clock cycles each for a total of 30000 clock cycles. Let's suppose instead that I pack each vertex's coordinates into a single vector that is (X, Y, Z, 1). There's a SIMD instruction that will multiply two vectors in the same number of clock cycles as one ordinary multiplication. So now, I can multiply 1000 vertices in 10000 clock cycles. It's three times faster thanks to SIMD.
Given that information, you can see that SIMD would be a big win to have on any chip if the data were formatted for SIMD. It just so happens that much of the data in games is uniquely formatted for SIMD usage. It's for this reason that every major game console since the Dreamcast has had some SIMD component. SSE, Paired Singles, Vector Units, etc.
So, how does this apply to CELL? CELL is a PowerPC chip with a bunch of SPEs, or Synergistic Processing Elements. Each SPE is a high speed SIMD processor with its own local memory. That means that CELL can process a whole bunch of mathematical data. It can move vertices around, calculate physics, and all sorts of cool stuff. The problem is that the data must be formatted to be processed that way, and sometimes there's just not a good way to format your data for SIMD.
If you took CELL and Xbox 360 and threw a big list of numbers at them and wrote the absolute fastest algorithm to multiply them together, I imagine that CELL would be faster. (It's all speculation at this point.) In a game context though, there are other considerations like RAM size, RAM speed, cache sizes, disc access speeds, etc.
CELL is going to allow people to do some pretty cool tricks with physics like fluid flow. It may allow some really cool procedural texturing or curved surface algorithms. Six months later, some really smart programmer will figure out how to do it on the other consoles. I don't think CELL will revolutionize the industry.
End Quote:
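The SIMD arithmetic in that second quote can be checked with a small Python model. This is not real SIMD code (Python runs it one number at a time); it only counts cycles using the figures from the quote, 10 cycles per multiply and the same 10 cycles for a four-wide vector multiply. The function names and the (s, s, s, 1) scale vector are my own choices for illustration.

```python
SCALAR_MULTIPLY_CYCLES = 10  # assumed cost of one ordinary multiply (from the quote)
VECTOR_MULTIPLY_CYCLES = 10  # assumed cost of one 4-wide multiply (from the quote)


def scale_scalar(vertices, factor):
    """Multiply X, Y and Z one at a time, counting cycles as we go."""
    cycles = 0
    result = []
    for x, y, z in vertices:
        result.append((x * factor, y * factor, z * factor))
        cycles += 3 * SCALAR_MULTIPLY_CYCLES  # three separate multiplies
    return result, cycles


def scale_packed(vertices, factor):
    """Pack each vertex as (X, Y, Z, 1) and model one vector multiply each."""
    scale = (factor, factor, factor, 1.0)  # leaves the trailing 1 untouched
    cycles = 0
    result = []
    for x, y, z in vertices:
        packed = (x, y, z, 1.0)
        # One modelled SIMD instruction: all four lanes multiplied together.
        result.append(tuple(a * b for a, b in zip(packed, scale)))
        cycles += VECTOR_MULTIPLY_CYCLES
    return result, cycles


if __name__ == "__main__":
    character = [(1.0, 2.0, 3.0)] * 1000  # 1000 vertices, as in the quote
    _, scalar_cost = scale_scalar(character, 2.0)
    _, packed_cost = scale_packed(character, 2.0)
    print(scalar_cost, packed_cost, scalar_cost / packed_cost)
```

The packing step is also where the catch from the quote shows up: the speedup only applies if the data can be laid out as four-wide vectors in the first place.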