Creative 3DBlaster GeForce Annihilator - Page 5

Terminal Velocity
Okay, I guess we’ve come to the moment of truth. Does its speed increase over the TNT2 actually warrant the higher price tag? Well, to sum it all up in one line – I’m impressed. Let’s see what we get in the benchmarks below:

Q3Test 1.08
(benchmarked using “timedemo 1; demo q3demo1 or q3demo2”)

Core/Mem set at 140/190; “VSync Off” in the Q3 console and the control applet; all other features enabled (i.e. trilinear filtering, light flares on, lightmaps on, marks on walls ON, highest texture quality, etc.), with the texture depth set to match the screen depth being tested (i.e. 16-bit resolution + 16-bit textures; 32-bit resolution + 32-bit textures).

3D Mark 99 Max (all done with the n-polygon test, trilinear filtering)

Resolution / Depth / Z-Buffer            3DMarks    CPU 3DMarks
1024x768 / 16-bit / 16-bit Z-Buffer       5190        8670
1024x768 / 32-bit / 32-bit Z-Buffer       5018        8634
1280x1024 / 16-bit / 16-bit Z-Buffer      5069        8733
1280x1024 / 32-bit / 24-bit Z-Buffer      4023        8613

<Click here to download the actual 3D Mark database scores>

These are pretty respectable scores in my opinion. I wouldn’t call them jaw-dropping, but they are nevertheless decently faster than my V770 non-Ultra at 162/192. Having said that, with the hype surrounding this chip, I was led to believe that a constant 60fps at 1024x768/32-bit was achievable in Q3Test. Unfortunately, that wasn’t the case, but it’s still darn fast. At 1024x768/16-bit, it manages 68.4fps!! And trust me, I played Q3 online at this resolution for 15 minutes last night and I could actually feel the difference compared to my old TNT2 V770!! As a rule of thumb, whatever frame-rates were achievable on my V770 at 16-bit seem achievable with the GeForce at 32-bit.

In addition, the 3D Mark 99 Max scores show that D3D performance only drops off at 1280x1024/32-bit; the other settings appear saturated, with scores remaining roughly constant.

We should also bear in mind that these drivers may not be that mature, so perhaps 60 fps at 1024x768/32-bit in Q3Test will be possible in the near future? In any case, there are rumours of a new set of drivers to be released shortly that promises (fingers crossed) to double performance…

However, I wanted to address one of my main concerns: how severe the memory bandwidth bottleneck really is. I experimented to see how overclocking the core and memory scales the frame-rates in Q3Test. This is what I found, using 1024x768 as the base resolution (it being the most common resolution that people buying this card will probably use):

Demo 1 / 1024x768

            140/190*    140/180*    130/190*
32-bit        42.3        40.3        41.8
16-bit        68.4        67.1        67.9

* Indicates Core / Mem speed (MHz)

Based on the table above, one can deduce that overclocking the memory produces bigger gains than the same increments in core clock. And if you do the maths, frame-rates seem to increase roughly in proportion to memory speed. Hence, by extrapolating, hitting a 220MHz memory clock might theoretically offer around 49fps in Q3Test. Well, I guess we’ll just have to wait and see the results from other manufacturers (who may ship more overclockable memory) to prove this hypothesis.
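For those who want to check that extrapolation, here is a quick back-of-the-envelope sketch in Python (my own arithmetic, and it assumes frame-rate really does scale linearly with memory clock, which is far from guaranteed):

# Sketch only: assumes Q3Test frame-rate scales linearly with memory clock,
# using the measured 32-bit figure (42.3 fps at 190MHz mem) as the baseline.
baseline_fps = 42.3   # measured: Demo 1, 1024x768, 32-bit, core/mem 140/190
baseline_mem = 190.0  # MHz
target_mem = 220.0    # MHz -- a purely hypothetical memory overclock

projected_fps = baseline_fps * (target_mem / baseline_mem)
print(round(projected_fps, 1))  # prints 49.0, which is where the ~49 fps guess comes from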

* Updated Benchmarks (put up by Wilfred 6 Oct 99 0849hrs)
At your request, Wy Mun has gone on to perform more extensive benchmarks on the card. Some readers have claimed higher scores, roughly 5 fps more in Q3Test. Wy Mun was able to duplicate similar performance when he turned OFF trilinear filtering (which he had used in the tests above). Dropping to bilinear saw the Q3Test benchmarks improve by the margins our readers described.

If that is the case... check out these numbers he sent me.

More Q3Test (same system config as before)
Timedemo q3demo1; VSync OFF; all settings ON at 1024x768/32-bit (max texture quality, marks on walls, lightmap, etc.). It is interesting to note the performance hit from switching between filtering modes.

nVidia Tree Demo

  Simple Complex
Screen Resolution 1024x768 1024x768
Frame Count 1000 1000
Leaves 1273 4568
Branches 1272 4567
Polygons Per Frame 35820 128080
Frames Per Second 49.0918 14.2025
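As an aside, a little arithmetic on these Tree Demo figures (my own rough Python calculation, not something the demo itself reports, and it simply treats polygons per frame times frames per second as throughput) suggests the card is pushing roughly the same number of polygons per second in both scenes, which hints that this demo is geometry-limited rather than fill-rate limited:

# Rough sketch: polygons per frame x frames per second ~= polygon throughput.
simple_scene  = 35820 * 49.0918   # ~1.76 million polygons/sec
complex_scene = 128080 * 14.2025  # ~1.82 million polygons/sec
print(int(simple_scene), int(complex_scene))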

Exercizer Benchmarks
(I didn’t really do any extensive benchmarking here. I just ran each test and noted the range of scores shown at the bottom right-hand corner.)

First Screen (with turning logo) 58.1-58.6 fps
 
Texture Stress Test:
8MB      177-214 fps
12MB     152-165 fps
16MB     138-143 fps
20MB     121-125 fps
22.7MB   113-117 fps
24MB     5.9 fps
 
Light Stress
1 Light (min)    57.1-58.4 fps
8 Lights (max)   56.0-57.0 fps
 
Processor Stress
1000 particles 173 fps
2000 particles 92 fps
4000 particles 47 fps
6000 particles 30 fps
10000 particles 18 fps
 
Polygon Stress
4386 Polygons/Triangles 97-100 fps
8792 Polygons/Triangles 54-55 fps
13188 Polygons/Triangles 41-42 fps
17584 Polygons/Triangles 34-35 fps
21980 Polygons/Triangles 30 fps
26376 Polygons/Triangles 25-26 fps
52752 Polygons/Triangles 14 fps
 
Last Screen 77.0-77.4 fps

Tirtanium 1.50 benchmarks (taken from the bench.res file)
At the request of some readers, Wy Mun did some testing with Tirtanium, running the card through its paces in 32-bit. The scores here cover both the normal-detail and high-detail tests. Note that for Direct3D, VSync is ON.

[
PENTIUM III 558 128MB GeForce256 32MB SDRAM
DIRECT3D  H.DETAIL  N.TEX
Wymun's PC 124MHz Bus ABIT BH6 BX Chipset 3.35 Nvidia drivers
TIRTANIUM 1.5  COMPUTER WYMUNKON  WIN9X build:2222  DRIVER:display DEVICE:Direct3D HAL
freeVIDMEM 30108MB freeTEXTURE 50030MB   refreshrate: 75
VSync ON  RGB0888 ARGB8888
1024 x 768 x 32 fps: 22.6
]

[
PENTIUM III 558 128MB GeForce256 32MB SDRAM
OPENGL    H.DETAIL  N.TEX
Wymun's PC 124MHz Bus ABIT BH6 BX Chipset 3.35 Nvidia drivers
TIRTANIUM 1.5  COMPUTER WYMUNKON  WIN9X build:2222
GLvendor:NVIDIA Corporation  GLrenderer:GeForce 256/AGP/SSE  VSync OFF  DDMSwich OFF
1024 x 768 x 32 fps: 42.6
]

[
PENTIUM III 558 128MB GeForce256 32MB SDRAM
DIRECT3D  N.DETAIL  N.TEX
Wymun's PC 124MHz Bus ABIT BH6 BX Chipset 3.35 Nvidia drivers
TIRTANIUM 1.5  COMPUTER WYMUNKON  WIN9X build:2222  DRIVER:display DEVICE:Direct3D HAL
freeVIDMEM 30108MB freeTEXTURE 50031MB   refreshrate: 75 
VSync ON  RGB0888 ARGB8888
1024 x 768 x 32 fps: 28.8
]

[
PENTIUM III 558 128MB GeForce256 32MB SDRAM
OPENGL    N.DETAIL  N.TEX
Wymun's PC 124MHz Bus ABIT BH6 BX Chipset 3.35 Nvidia drivers
TIRTANIUM 1.5  COMPUTER WYMUNKON  WIN9X build:2222
GLvendor:NVIDIA Corporation  GLrenderer:GeForce 256/AGP/SSE  VSync OFF  DDMSwich OFF
1024 x 768 x 32 fps: 55.0
]

NB: For some reason, D3D VSync seems to stay enabled even though I’ve turned it off via the control panel and PowerStrip. The OpenGL scores, however, are very nice…

* More Benchmark Updates (Wy Mun 7 Oct 99 0849hrs)
I received various e-mails asking me to benchmark Q3 on a non-overclocked AGP bus, as there was speculation that this might be affecting performance in some strange way. In addition, many wanted to know how well the GPU holds up at different processor speeds. Hence, for those interested, I have included some results comparing a PIII-450 against the same chip overclocked to 558MHz.

NB: As I do not have any processor other than my PIII-450, I wasn’t able to test a Celeron or PII to note its performance difference…

PIII-450MHz vs O/C PIII-558MHz
VSync OFF; all settings ON (max texture quality, marks on walls, lightmap, trilinear filtering, etc.); core/mem set at 140/190.

It looks like CPU dependency is still apparent at lower resolutions (judging by the difference in frame-rates). However, once you reach about 1024x768 in 32-bit and 1280x960 in 16-bit, the differences start becoming insignificant. I will let you draw your own conclusions on how the GPU is handling things.


Default 120/166MHz vs 140/190MHz
In addition, many wanted to find out the Q3 speeds attainable at the default core/mem versus the overclocked settings. I have attached those findings here:

PIII-450 (not overclocked); VSync OFF; all settings ON (max texture quality, marks on walls, lightmap, trilinear filtering, etc.).

The biggest drop in performance is about 5.5 fps at the highest resolutions. The core/mem setting starts to make a notable difference at 1024x768 in 16-bit and 800x600 in 32-bit.


Vertex Lighting vs Lightmap Lighting
Another reader pointed out that the GeForce is optimized for vertex lighting (and not lightmap lighting), so he suggested I check whether the difference is significant. I've attached the results for your own digestion. However, it seems that switching from trilinear to bilinear filtering (see the earlier comparison) offers bigger gains…

PIII-558MHz; VSync OFF; all settings ON (max texture quality, marks on walls, trilinear filtering, etc.); core/mem set at 140/190.

It is interesting to note that at 640x480/32-bit, lightmap lighting is in fact faster than vertex lighting.


r_picmip 1 vs r_picmip 0
This last one came from a reader of NV News, and I've obliged by carrying out his suggested test. Here's his suggestion and the reasoning behind this benchmark (I've also sketched out his sums just after the quote):

Since Q3Test seems to be such a popular benchmark these days, you should know that when r_picmip is set to 0 (i.e. the texture detail slider is all the way up) and 32-bit textures are selected, you will run out of local RAM on video cards with 32MB of memory.

This is easy to verify with the imagelist command from the Q3 console. It will tell you how many texels are loaded for that map, and with r_picmip '0' you have approximately 7,000,000 pixels worth of textures. Multiply that by 33% to account for the texels required for mipmapping and add the two together, and you have around 9,310,000 pixels worth of textures. Multiply that by 3 (24 bits = 3 bytes, and even this is low, as some textures include alpha info and are 32-bit, so we could multiply by 4, but even 3 bytes per pixel illustrates the point) and voilà: 28MB in textures. It takes 3MB for a 1024x768x32 frame buffer, and we have at least two of those if we are double-buffering (we may even be triple-buffered, not sure), plus another 3MB for the Z-buffer. Oh my god: 28MB in textures + 6MB in frame buffers + 3MB in Z-buffer = 37MB. And our GeForce is then limited to the SAME bandwidth as our TNT2 because we are running across AGP. Gross.

Please see if he can run the 1024x768x32 Q3Test benches with r_picmip set to '1', like John Carmack does, for this same reason. We won't have to do this once GF256 drivers support DXTC and Q3 implements it, but for now this is the only way to get accurate results. On my CL TNT2 Ultra @ 175MHz it doesn't make a big difference, because it is already fill-rate limited in this screen mode in Q3 due to overdraw (especially with the gibs), but perhaps the GF256 is not.
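For what it's worth, his memory arithmetic does add up. Here is a quick Python sketch of the same sums (it uses his own round figures, so treat it as an estimate rather than a precise measurement):

# Sketch of the reader's estimate for r_picmip 0 with 32-bit textures in Q3Test.
texels = 7_000_000                    # approx. texels reported by Q3's 'imagelist' command
texels_with_mips = texels * 1.33      # ~9.31 million once mipmap levels are included
texture_bytes = texels_with_mips * 3  # 3 bytes per texel (deliberately ignoring alpha)

frame_buffers = 1024 * 768 * 4 * 2    # two 32-bit colour buffers at 1024x768 (~6MB)
z_buffer = 1024 * 768 * 4             # Z-buffer (~3MB, as per his figure)

total_mb = (texture_bytes + frame_buffers + z_buffer) / 1_000_000
print(round(total_mb))                # ~37 -- comfortably past the card's 32MB of local memory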

PIII-558 with the GeForce's core/mem at 140/190; 1024x768/32bpp; all other options set to max (except for texture detail, which is altered by the r_picmip value), i.e. trilinear filtering, marks on walls, lightmap, etc...

I hope these new rounds of testing answer some of the doubts and queries you have about the Annihilator's performance at this point in time. This page will be updated from time to time if we have significant progress to report.

Indy3D Benchmarks
Some readers have requested that Indy3D tests be carried out. This is a 3D graphics evaluation tool for MCAD, animation and simulation professionals. The auto-generated results page can be checked out here.
