Features Overview: Lightspeed Memory Architecture
We have seen a summary of what the nfiniteFX engine, comprising the Vertex Shader and Pixel Shader, is capable of, and we caught a glimpse of how the Pixel Shader saves on memory bandwidth. Next up is another term coined by NVIDIA, “Lightspeed Memory Architecture”, the name given to a series of significant enhancements in the GeForce 3 that improve efficiency within and between the PC subsystems: the microprocessor, the main memory, the AGP bus, the GPU and the graphics card's frame buffer memory.
NVIDIA has stated that advances in memory technology have been unable to keep pace with the computational throughput requirements of 3D graphics, so a number of key steps were taken in the GeForce 3’s design to make the most of the limited bandwidth available:
- Higher Order Surfaces
- Crossbar Memory Controller
- Occlusion Detection
- Z-Buffer Compression
Before we delve into the methods employed to alleviate the bandwidth issue, let us examine the challenges of creating an optimal card for gameplay.
Game Logic, Scene Management and Geometry Calculations
A 3D interactive video game typically involves the following components: game logic, scene management, geometry calculations and pixel rendering.
Game logic comprises elements like the game physics, artificial intelligence, sound and various other non-graphics functions that are necessary to create an engaging experience. Today, with dedicated 3D hardware handling most of the geometry transform and lighting, the CPU is freed up to serve the computational needs of game logic.
Next in line is scene management. For a long time, and even today, game developers have been tweaking their games for the right balance between immersive graphics and playability. Why so?
As the level of detail grows, the amount of scene data that needs to be rendered grows enormously and requires a lot of computational muscle to process.
Quite evidently, the graphics memory onboard the card is one of the most heavily stressed parts of the entire design: each pixel is rendered by reading from and writing to the colour and Z-buffers, as well as accessing the texture data loaded in memory. To sustain gameplay at an optimal 60 frames per second at 1024x768 with 32-bit rendering and an average depth complexity of 2.5, we arrive at a memory bandwidth requirement of 2.4GB/sec. Moving to a resolution of 1600x1200 sees the number jump to 5.8GB/sec, which greatly exceeds even the fastest double data rate (DDR) memory.
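The arithmetic behind these two figures can be sketched roughly as follows. The text does not state how many bytes of memory traffic each rendered pixel generates, so the 20 bytes used here is an assumption (for instance 4 bytes colour write, 4 bytes Z read, 4 bytes Z write and 8 bytes of texture reads) chosen because it reproduces the quoted numbers. The function name and parameters are likewise only illustrative.

```python
def memory_bandwidth_gb_per_s(width, height, fps=60,
                              depth_complexity=2.5, bytes_per_pixel=20):
    """Estimate graphics memory traffic in GB/sec (1 GB = 1e9 bytes).

    bytes_per_pixel=20 is an ASSUMPTION, not a figure from the article:
    e.g. 4 (colour write) + 4 (Z read) + 4 (Z write) + 8 (texture reads).
    """
    # Every screen pixel is rendered depth_complexity times per frame.
    pixels_per_second = width * height * fps * depth_complexity
    return pixels_per_second * bytes_per_pixel / 1e9


for w, h in [(1024, 768), (1600, 1200)]:
    print(f"{w}x{h}: {memory_bandwidth_gb_per_s(w, h):.1f} GB/sec")
```

Under that assumption the estimate comes to roughly 2.4GB/sec at 1024x768 and 5.8GB/sec at 1600x1200, in line with the figures above. A different per-pixel byte count would scale both results proportionally.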
The situation is further compounded by the inefficiency of overdraw: because game objects overlap in 3D space, pixels get rendered that are never displayed onscreen, as they are hidden behind nearer objects from the viewer's perspective.
As we will see later, through a unique method of compression and improvements to scene management and the memory controller, the GeForce 3 is able to ease these bottlenecks significantly.
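To make overdraw concrete, here is a minimal sketch that rasterises a few overlapping rectangles into a tiny depth buffer. It counts how many pixels were rendered against how many end up visible. The scene, the function name and the rectangle format are all made up for illustration, and it does not model how any real GPU works.

```python
def overdraw_stats(rects, width, height):
    """Return (pixels rendered, pixels visible) for a list of rectangles.

    Each rectangle is (x0, y0, x1, y1, depth); a smaller depth is nearer
    to the viewer. This is a hypothetical toy scene format.
    """
    nearest = [[None] * width for _ in range(height)]
    rendered = 0
    for x0, y0, x1, y1, depth in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                rendered += 1  # the pixel is rendered whether or not it is seen
                if nearest[y][x] is None or depth < nearest[y][x]:
                    nearest[y][x] = depth  # keep only the nearest surface
    visible = sum(d is not None for row in nearest for d in row)
    return rendered, visible


# Three overlapping surfaces on an 8x8 screen, drawn back to front.
scene = [(0, 0, 8, 8, 3.0), (2, 2, 8, 8, 2.0), (4, 4, 8, 8, 1.0)]
rendered, visible = overdraw_stats(scene, 8, 8)
print(f"depth complexity: {rendered / visible:.2f}")
print(f"rendered but never displayed: {rendered - visible} of {rendered} pixels")
```

The ratio of rendered to visible pixels is the depth complexity figure used in the bandwidth estimate. Anything above 1.0 represents memory traffic spent on pixels the viewer never sees.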