

Nvidia GPUs through the ages: The history of Nvidia’s graphics cards

By Adrian Willings

Updated Mar 25, 2023

Nvidia was originally founded in 1993 but it wasn’t until 1995 that the company released its first graphics product - the NV1.

Things have changed a lot since then and we’re now spoiled with ray tracing, DLSS and other wonderful technologies that make our gaming experiences even more enjoyable.

We’re taking a look back through the history of Nvidia graphics cards and what these devices have looked like over the years.

Nvidia NV1 (1995)

(Image: Swaaye/Wikipedia)

The Nvidia NV1 launched in 1995 and was able to handle both 2D and 3D video. It was sold as the “Diamond Edge 3D” and even sported a Sega Saturn compatible joypad port.

Several Sega Saturn games were ported for PC including Panzer Dragoon and Virtua Fighter Remix. Those features alone weren’t enough to appeal to the market though, as the Saturn was struggling to compete with the original PlayStation.

The NV1 was off to a rough start, made worse by the release of Microsoft DirectX, which was incompatible with the GPU and left many games unable to run.

Nvidia RIVA 128 (1997)

(Image: Mathías Tabó/Wikipedia)

In 1997 Nvidia released the NV3, aka the Riva 128; Riva stood for “Real-time Interactive Video and Animation”. This graphics card offered both 2D and 3D acceleration along with polygon texture mapping.

At the time Nvidia’s rival 3dfx was dominating the market, but the NV3 had a 100MHz core/memory clock and essentially doubled the spec of 3dfx’s Voodoo 1.

There were two variants of the NV3 in the form of the Riva 128 and the Riva 128ZX. The latter was more powerful with 8MB of VRAM and a 250MHz clock speed.

The NV3 was much more successful than the company’s first GPU and helped Nvidia gain widespread popularity.

Nvidia NV4 (1998)

(Image: Tors/Wikipedia)

In 1998, Nvidia unleashed the NV4, aka the Riva TNT. This graphics card improved over the previous models by including support for 32-bit True Colour. The NV4 also had more RAM with 16MB of SDR SDRAM, which meant it offered greater performance too.

It was around this time that Nvidia started to make moves to regularly update graphics drivers to ensure good performance and compatibility for the end user, something the company is still doing to this day.

Nvidia’s Riva TNT was more affordable at the time than 3dfx’s Voodoo2, if a little slower in terms of performance. The driver support was key to NV4’s success.

Nvidia NV5 (1999)

(Image: Uwe Hermann/Wikipedia)

In 1999, Nvidia followed up the NV4 with the RIVA TNT2. NV5, as it was codenamed, brought a number of updates including 32-bit Z-buffer/stencil support, up to 32MB of VRAM and 2048 x 2048 texture support.

More importantly, this graphics card had improved clock speeds (up to 150+ MHz) which gave it as much as a 17 per cent performance boost over the previous model.

This card was a direct competitor to the 3dfx Voodoo3 and both were incredibly popular.

Nvidia GeForce 256 (1999)

(Image: Hyins/Wikipedia)

Late in 1999, Nvidia released what it pitched as the “world’s first GPU” in the form of the Nvidia GeForce 256.

This was a clever marketing move from Nvidia and the beginning of a love affair with GeForce-branded graphics cards for years to come.

It improved upon previous RIVA cards by increasing the pixel pipelines but also offered a big leap in performance for PC gaming.

This card supported up to 64MB of DDR SDRAM and operated up to 166MHz. As such it was 50 per cent faster than the NV5.

More importantly, the GeForce 256 also fully supported Direct3D 7 which meant it could power many of the best PC games available at the time.

Shortly after this launch 3dfx went bankrupt and ATI became Nvidia’s main competitor.

Nvidia GeForce2 (2000)

(Image: Hyins/Wikipedia)

Nvidia followed up the world’s first GPU with the aptly named GeForce2.

This GPU came in several different variants including the Ti, Pro, GTS and Ultra models. These were essentially NV11, 15 and 16 cards. Increasing pipelines and higher clock rates followed throughout the 2000 and 2001 releases.

The thing that made GeForce2 interesting was the start of support for multi-monitor setups.

It was also around this time that Nvidia acquired 3dfx.

Nvidia GeForce3 (2001)

(Image: Hyins/Wikipedia)

Shortly after GeForce2 came GeForce3. Codenamed NV20, this series of graphics cards represented Nvidia’s first DirectX 8 compatible GPUs.

There were three versions - the GeForce3, GeForce3 Ti 200, and GeForce3 Ti 500. This next GeForce GPU added programmable pixel and vertex shaders and multisample anti-aliasing into the mix.

A version of GeForce3 referred to as NV2A was used in the original Xbox console.

Nvidia GeForce FX series (2003)

(Image: Hyins/Wikipedia)

Leap forward a couple of generations to 2003 and the release of Nvidia’s GeForce FX series.

These were the fifth generation of the GeForce graphics cards and supported Direct3D 9 as well as a number of new memory technologies. Those included DDR2, GDDR2 and GDDR3, as well as Nvidia’s first attempt at a memory data bus wider than 128 bits.

Meanwhile, the GeForce FX 5800 made waves as the first GPU to be equipped with a large cooler - one so big that the card was often called the “dustbuster” because of the amount of fan noise it gave off.

Nvidia GeForce 6 series (2004)

(Image: Hyins/Wikipedia)

Shortly after the release of the GeForce FX series came the 6 series (aka NV40). GeForce 6 was the start of Nvidia pushing SLI technology allowing people to combine more than one graphics card for more power.

The flagship of this range was the GeForce 6800 Ultra, a graphics card which boasted 222 million transistors, 16-pixel superscalar pipelines and six vertex shaders. It had Shader Model 3.0 support and was compliant with both Microsoft DirectX 9.0c and OpenGL 2.0.

The series also featured Nvidia PureVideo technology and was able to decode H.264, VC-1, WMV and MPEG-2 videos with reduced CPU use.

As a result of all this, the GeForce 6 series was highly successful.

Nvidia GeForce 7 series (2005)

(Image: FxJ/Wikipedia)

In 2005 came the GeForce 7 series, including the highlight of the range, the 7800 GTX.

That card alone was a real powerhouse for the time and, with some clever cooling, Nvidia was able to push the clock speed to 550MHz. At the same time, the company also managed to reduce latency and increase the bus to 512 bit.

Interestingly, a version of the 7 series was crafted as the RSX Reality Synthesizer, the proprietary GPU co-created by Nvidia and Sony for the PlayStation 3.

Nvidia GeForce 8 series (2006)

(Image: Shooke/Wikipedia)

In 2006 Nvidia released the GeForce 8 series and unveiled its Tesla microarchitecture. This was the company’s first unified shader design and would be one of its most used architectures in future cards too.

The much-loved Nvidia GeForce 8800 GTX was the flagship of the range and was incredibly popular. It boasted 681 million transistors, had 768MB of GDDR3 memory and came with 128 shaders clocked at 575MHz.

Most significantly, this graphics card could run Crysis, which was all that PC gamers wanted at the time.

In 2006 things were also heating up as AMD purchased ATI for $5.4 billion, becoming a thorn in Nvidia’s side for years to come.

Nvidia GeForce 9 series (2008)

(Image: Sk/Wikipedia)

At the start of 2008 Nvidia released the GeForce 9 series. These cards continued to use the Tesla architecture but also added PCIe 2.0 support and improved colour and z-compression.

By this time Nvidia was pushing performance up a notch, with clock speeds hitting up to 675MHz while power consumption was reduced as well.

Nvidia GeForce 400 series (2010)

(Image: Amazon)

The next significant update came in 2010 when Nvidia launched the GeForce 400 series. This was when Nvidia revealed the Fermi microarchitecture, the company’s next major architecture.

This series also saw support for OpenGL 4.0 and Direct3D 11 and was a direct competitor to the Radeon HD 5000 series.

However, the series was let down by high running temperatures and power consumption, which caused a lot of grumbles from users.

Nvidia GeForce 600 series (2012)

(Image: HanyNAR)

The GeForce 600 series introduced Nvidia’s Kepler architecture which was designed to increase performance per watt while also improving upon the performance of the previous Fermi microarchitecture.

With Kepler, Nvidia managed to increase the memory clock to 6GHz. It also added GPU Boost which guaranteed the GPU would be able to run at a minimum clock speed, but also boost performance when needed until it reached a pre-defined power target.
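The boost behaviour described above - a guaranteed base clock, opportunistically raised until a power target or clock ceiling is hit - can be sketched as a toy control loop. This is an illustration only: the step size, the linear power model and all the numbers below are invented for the example, not Nvidia's actual algorithm.

```python
def boost_clock(base_mhz, max_boost_mhz, power_w, power_target_w, step_mhz=13):
    """Raise the clock from the guaranteed base until the next step would
    exceed either the power target or the maximum boost clock."""
    clock = base_mhz
    while (clock + step_mhz <= max_boost_mhz
           and power_w(clock + step_mhz) <= power_target_w):
        clock += step_mhz
    return clock

# Toy linear power model: 0.2 W per MHz (invented for illustration).
final = boost_clock(base_mhz=1006, max_boost_mhz=1110,
                    power_w=lambda mhz: 0.2 * mhz, power_target_w=215)
print(final)  # settles at 1071 MHz, never below the 1006 MHz base
```

The key property the article describes is visible here: the returned clock never drops below the base, and the boost stops as soon as the (hypothetical) power target would be exceeded.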

This range also supported both Direct3D 11 and Direct3D 12. It also introduced a new anti-aliasing method known as TXAA.

Nvidia GeForce 700 series (2013)

(Image: Marcus Burns)

In 2013 Nvidia kicked things up a notch with the 700 series, which was topped off with the insanely high-end enthusiast GTX Titan card.

This series was a refresh of the Kepler microarchitecture, but some later cards also featured the Fermi and Maxwell architectures.

The GTX Titan boasted 2688 CUDA cores, 224 TMUs and 6GB of RAM. By this time Nvidia was managing to squeeze as many as seven billion transistors into its GPUs.

The GeForce 700 series was designed to maximise energy efficiency but also included other features such as hardware H.264 encoding, a PCI Express 3.0 interface, support for DisplayPort 1.2 and HDMI 1.4a 4K x 2K video output, as well as GPU Boost 2.0.

Nvidia GeForce 900 series (2014)

(Image: Marcus Burns)

In 2014 Nvidia seemingly skipped a generation and went straight into the GeForce 900 series. This was the introduction to the Maxwell microarchitecture which offered improved graphics capabilities as well as better energy efficiency.

This series had the GeForce GTX 980 as its initial flagship, later followed by the 980 Ti and GeForce GTX TITAN X at the extreme end.

NVENC was also improved for this series and support appeared for Direct3D 12_1, OpenGL 4.6, OpenCL 3.0 and Vulkan 1.3.

Nvidia GeForce 10 series (2016)

(Image: Marcus Burns)

In 2016 Nvidia revealed the GeForce 10 series based on the Pascal microarchitecture. The GTX 1080 Ti is perhaps the most well-known of the GPUs from this series and cemented itself in history as perhaps one of the most significant cards Nvidia has released.

This graphics card dominated the market and offered such superb performance and energy efficiency for the money that it would often be referenced in comparisons with future GPUs.

The 10 series also included new features such as GPU Boost 3.0, a dynamic load balancing scheduling system, triple buffering and support for both DisplayPort 1.4 and HDMI 2.0b.

Nvidia GeForce 20 series (2018)

(Image: Marcus Burns)

The GeForce 20 series was released in 2018 and introduced the Turing microarchitecture to the world.

Notably, these graphics cards were the first generation of RTX cards and saw Nvidia pushing ray tracing as the main selling point.

The use of Tensor Cores and other enhancements helped these cards present a massive leap in graphical prowess, resulting in realistic lighting and convincing reflection effects in-game.

Improvements came at a cost though, with the RTX 2080 Ti retailing for an eye-watering $1,199 (compared to the $699 of the 1080 Ti).

Nvidia GeForce 30 series (2020)

(Image: Pocket-lint)

The GeForce 30 series succeeded the 20 series in 2020 and unfortunately became best known for simply being impossible to get hold of due to the silicon shortage.

Still, the Nvidia RTX 3080 was meant to retail for $699, making it much more affordable than the previous generation’s flagship, and the series brought significant improvements too.

The flagship GeForce RTX 3090 Ti, announced in March 2022, packed 10,752 CUDA cores, 78 RT-TFLOPs, 40 Shader-TFLOPs and 320 Tensor-TFLOPs of power. This powerful beast was also noted to be power-hungry, requiring at least 850 watts of power, meaning you’d likely need a 1,000-watt PSU to run it.
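As a rough sanity check on that 1,000-watt recommendation: a common rule of thumb (our assumption here, not something the article states) is to keep sustained load at or below about 85 per cent of the PSU's rating, which lands almost exactly on the quoted figure.

```python
def recommended_psu_watts(system_load_w, max_load_fraction=0.85):
    """Smallest PSU rating that keeps the sustained load at or under
    the given fraction of the PSU's capacity (rule-of-thumb headroom)."""
    return system_load_w / max_load_fraction

print(round(recommended_psu_watts(850)))  # 1000
```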

Nvidia GeForce 40 series (2022)

(Image: Nvidia)

Nvidia has constantly been trying to push the boundaries with each iteration. With the RTX 40 series, one user proved it was possible to run games at 13K resolution.

This was the first series of GPUs to use the 12VHPWR connector, which led to some problems, but otherwise the cards mostly had glowing reviews.

The flagship card of this range, the RTX 4090, had 24GB of GDDR6X, 16384 CUDA cores and some serious processing power. With these cards, Nvidia also introduced DLSS 3.




Overview: a brief history of Nvidia graphics cards

Since its founding, Nvidia has achieved remarkable success in the field of graphics processing units (GPUs).

The early years

  • 1993: Nvidia was founded with the aim of developing high-performance graphics chips.
  • 1995: Released the NV1, Nvidia’s first product, which integrated 2D and 3D graphics processing but failed to find market success due to technical limitations.

Entering the mainstream

  • 1997: Launched the RIVA 128, a hardware-accelerated 2D and 3D graphics card that gradually won market recognition.
  • 1999: Launched the GeForce 256, billed as “the world’s first true GPU”, introducing hardware rasterisation and transform and lighting (T&L).

The rise of the GeForce line

  • 2000: Released the GeForce2 GTS, further improving 3D performance and supporting higher resolutions.
  • 2002: Launched the GeForce4 series, boosting gaming performance and introducing new graphics features such as shader models.
  • 2006: Released the GeForce 8800 series, the first GPUs to support DirectX 10, marking the beginning of the modern GPU architecture.

CUDA and parallel computing

  • 2006: Introduced the CUDA architecture, extending GPUs beyond graphics into general-purpose computing and opening a new chapter in GPU computing.
  • 2010: Released the Fermi architecture, further expanding GPU compute capability and becoming a core component of scientific computing and deep learning.

The rise of deep learning and AI

  • 2012: Released the Kepler architecture, with optimised power efficiency, widely used for deep-learning training.
  • 2016: Launched the Pascal architecture, supporting a new generation of deep-learning frameworks and greatly improving compute performance.
  • 2018: Released the Turing architecture, introducing real-time ray tracing for the first time and raising rendering quality.
  • 2020: Launched the Ampere architecture, further pushing the boundaries of high-performance computing and AI.
  • 2022: Released the Ada Lovelace architecture, ……

Nvidia’s graphics-card history has been driven by constant technical innovation and shifting market demand. From basic graphics processing in the early days to today’s deep learning and real-time ray tracing, Nvidia has stayed at the forefront of graphics computing, and as technology continues to advance, future graphics cards will open up even more possibilities.


Another article: The History of Nvidia GPUs: NV1 to Turing

The History of Nvidia GPUs: NV1 to Turing

By Michael Justin Allen Sexton, Yannick Guerrini & Pierre Dandumont

published August 25, 2018

NV1: Nvidia Enters The Market


Nvidia formed in 1993 and immediately began work on its first product, the NV1. Taking two years to develop, the NV1 was officially launched in 1995. An innovative chipset for its time, the NV1 was capable of handling both 2D and 3D video, along with included audio processing hardware. Following Sega’s decision to use the NV1 inside of its Saturn game console, Nvidia also incorporated support for the Saturn controller, which enabled desktop graphics cards to also use the controller.

A unique aspect of the NV1’s graphics accelerator is that it used quadratic surfaces as the most basic geometric primitive. This created difficulties for game designers to add support for the NV1 or to design games for it. This became increasingly problematic when Microsoft released its first revision of the DirectX gaming API, which was designed with polygons as the most basic geometric primitive.

Desktop cards used the common PCI interface with 133 MB/s of bandwidth. Cards could use EDO memory clocked at up to 75 MHz, and the graphics accelerator was capable of a max resolution of 1600x1200 with 16-bit color. Thanks to the combination of Sega Saturn and desktop market sales, Nvidia was able to stay in business, but the NV1 was not particularly successful. Its graphics and audio performance were lackluster, and the various hardware components made it expensive compared to other graphics accelerators.
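The 133 MB/s figure falls straight out of the classic PCI bus parameters - a 32-bit data path at 33.33 MHz, one transfer per clock:

```python
bus_width_bytes = 32 // 8       # 32-bit PCI bus -> 4 bytes per transfer
clock_hz = 33.33e6              # 33.33 MHz bus clock
bandwidth_mb_s = bus_width_bytes * clock_hz / 1e6
print(round(bandwidth_mb_s))    # ~133 MB/s peak
```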

Nvidia started work on the NV2 as a successor to the NV1, but after a series of disagreements with Sega, Sega opted to use PowerVR technology inside of its Dreamcast console and the NV2 was cancelled.

NV3: Riva 128


The Riva 128, also known as the NV3, launched in 1997 and was considerably more successful. It switched from using quadrilaterals as the most basic geometric primitive to the far more common polygon. This made it easier to add support for the Riva 128 in games. The GPU also used polygon texture mapping with mixed results. This allowed the GPU to render frames more quickly, but it had reduced image quality.

The GPU was available in two main variants: the Riva 128 and the Riva 128ZX. The Riva 128ZX graphics accelerators used higher-quality binned chips that enabled Nvidia to raise the RAMDAC frequency. Both models used SDRAM memory clocked at 100 MHz accessed over a 128-bit bus, giving the GPUs 1.6 GB/s of bandwidth. The Riva 128ZX chips, however, had 8MB of VRAM compared to 4MB on the Riva 128. The Riva 128ZX also operated at a higher 250 MHz clock speed compared to the Riva 128’s 206 MHz.
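The quoted 1.6 GB/s follows directly from the numbers in that paragraph: a 128-bit bus moves 16 bytes per transfer, and the SDRAM makes one transfer per 100 MHz clock:

```python
bus_bytes = 128 // 8        # 128-bit memory bus -> 16 bytes per transfer
mem_clock_hz = 100e6        # 100 MHz SDRAM, one transfer per clock
bandwidth_gb_s = bus_bytes * mem_clock_hz / 1e9
print(bandwidth_gb_s)       # 1.6 (GB/s), matching the article's figure
```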

These GPUs were fairly popular because they were capable of both 2D and 3D graphics acceleration, though they were clocked lower than alternatives from Nvidia’s leading competitor, 3dfx.

NV4: The Plot To Drop The Bomb


In 1998, Nvidia introduced its most explosive card to date, the Riva TNT (code named “NV4”). Similar to the NV3, the NV4 was capable of rendering both 2D and 3D graphics. Nvidia improved over the NV3 by enabling support for 32-bit “True Color,” expanding the RAM to 16MB of SDR SDRAM and increasing performance. Although the AGP slot was becoming increasingly popular, a large number of systems didn’t contain one, so Nvidia sold the NV4 primarily as a PCI graphics accelerator and produced a relatively small number of AGP-compatible cards. Starting with the Riva TNT, Nvidia made a strong effort to regularly update its drivers in order to improve compatibility and performance.

At the time it was released, 3dfx’s Voodoo2 held the performance crown, but it was relatively expensive and was limited to 16-bit color. The Voodoo2 also required a separate 2D video card, which raised its cost of ownership even higher. Needing a separate 2D video card was common in the 1990s, but as the Riva TNT was capable of processing both 2D and 3D video, the card was considerably more budget friendly than the Voodoo2.

Nvidia planned to ship the Riva TNT clocked at 125 MHz in an attempt to take the performance crown from the Voodoo2, but the core simply ran too hot and wasn’t sufficiently stable. Instead, Nvidia was forced to ship at 90 MHz with RAM clocked at 110 MHz, resulting in the Riva TNT being slower than the Voodoo2. The Riva TNT still offered decent performance for its time, and after the release of Nvidia’s “Detonator” drivers, performance increased significantly, making it even more competitive.

Overall the Riva TNT was extremely successful due to its performance and features. The increased driver support from Nvidia also helped to attract customers, as anyone in the 1990s can tell you what a nightmare dealing with drivers used to be.

NV5: Another Explosion

In 1999, Nvidia made another grab for the performance crown with the Riva TNT2 (codenamed “NV5”). The Riva TNT2 was architecturally similar to the original Riva TNT, but thanks to an improved rendering engine it was able to perform about 10 to 17 percent faster than its predecessor at the same clock speed. Nvidia also added support for AGP 4X slots, which provided more bandwidth to the card, and doubled the amount of VRAM to 32MB. Probably the most significant improvement was the transition to 250 nm, which allowed Nvidia to clock the Riva TNT2 up to 175 MHz.
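Combining the quoted per-clock gain with the jump from the Riva TNT's shipping 90 MHz gives a back-of-the-envelope estimate of the total generational uplift. Note the assumption is ours, not the article's: performance is treated as scaling linearly with clock, which real workloads rarely achieve, so read this as an optimistic upper bound.

```python
ipc_gain_low, ipc_gain_high = 1.10, 1.17   # 10-17% faster at the same clock
tnt_clock_mhz, tnt2_clock_mhz = 90, 175    # shipping clocks from the article
clock_ratio = tnt2_clock_mhz / tnt_clock_mhz

low = ipc_gain_low * clock_ratio           # combined uplift, pessimistic end
high = ipc_gain_high * clock_ratio         # combined uplift, optimistic end
print(f"{low:.2f}x to {high:.2f}x")        # roughly 2.1x to 2.3x
```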

The Riva TNT2’s main competitor was the 3dfx Voodoo3. These two products traded blows with each other for years without either card being a clear victor in terms of performance or features.

NV10: Use The GeForce Luke!


In late 1999, Nvidia announced the GeForce 256 (code-named “NV10”). Prior to the GeForce 256, essentially all video cards were referred to as “graphics accelerators” or simply as “video cards,” but Nvidia opted to call the GeForce 256 a “GPU.” Nvidia packed in several new features with this card including hardware T&L (Transform and Lighting) processing, which allowed the GPU to perform calculations that were typically relegated to the CPU. Since the T&L Engine was fixed-function hardware designed specifically for this task, its throughput was roughly five times higher than a then high-end Pentium III processor clocked at 550 MHz.
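The "transform" half of T&L is, at its core, a 4x4 matrix multiply applied to every vertex - exactly the kind of fixed, repetitive arithmetic that dedicated hardware handles far faster than a general-purpose CPU. A minimal software sketch of that per-vertex operation (illustrative only, not the NV10's actual datapath):

```python
def transform_vertex(m, v):
    """Apply a 4x4 transform matrix (row-major) to a 3D point,
    using homogeneous coordinates as the fixed-function pipeline does."""
    x, y, z = v
    h = (x, y, z, 1.0)  # homogeneous coordinates, w = 1
    return tuple(sum(m[r][c] * h[c] for c in range(4)) for r in range(3))

# Example: translate the point (1, 2, 3) by +10 along the x axis.
translate_x = [[1, 0, 0, 10],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]]
print(transform_vertex(translate_x, (1, 2, 3)))  # (11.0, 2.0, 3.0)
```

Hardware T&L simply runs this multiply (plus per-vertex lighting) for every vertex in the scene, which is why offloading it from the CPU mattered so much.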

The design also differed from the Riva TNT2 in that it contained four pixel pipelines instead of just two. It was unable to match the clock speed of the Riva TNT2, but because of the additional pipelines it was still able to perform roughly 50% faster than its predecessor. The GPU was also Nvidia’s first card to use between 32 and 64MB of DDR SDRAM, which contributed to its performance increase. The GPU’s transistors shrunk to 220 nm and the core itself operated at 120 MHz, with the RAM ranging from 150 to 166 MHz.

The GeForce 256 also represents the first time Nvidia included video acceleration hardware, but it was limited to motion acceleration for MPEG-2 content.

NV11, NV15, NV16: GeForce2


Nvidia followed the NV10 GeForce 256 up with the GeForce2. The architecture of the GeForce2 was similar to that of its predecessor, but Nvidia was able to double the TMUs attached to each pixel pipeline by further shrinking the die with 180 nm transistors. Nvidia used three different cores, codenamed NV11, NV15, and NV16, inside of GeForce2-branded cards. All of these cores used the same architecture, but NV11 contained just two pixel pipelines while the NV15 and NV16 cores had four, and NV16 operated at higher clock rates than NV15.
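The reason doubling the TMUs per pipeline matters is peak texel fillrate, which is just pipelines x TMUs x clock. The 200 MHz clock below is a placeholder of ours (the article gives no GeForce2 clocks), so only the ratio between the two results is meaningful:

```python
def texel_fillrate_mtexels(pipelines, tmus_per_pipeline, clock_mhz):
    """Peak texel fillrate in megatexels/s: one texel per TMU per clock."""
    return pipelines * tmus_per_pipeline * clock_mhz

before = texel_fillrate_mtexels(4, 1, 200)  # 1 TMU per pipe, placeholder clock
after = texel_fillrate_mtexels(4, 2, 200)   # 2 TMUs per pipe, same clock
print(after / before)  # 2.0 - doubled TMUs double texel throughput
```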

The GeForce2 was also the first Nvidia line-up to support multiple monitor configurations. GeForce2 GPUs were available with both SDR and DDR memory.

NV20: The GeForce3

In 2001, the GeForce3 (codenamed “NV20”) arrived as Nvidia’s first DirectX 8-compatible card. The core contained 60 million transistors manufactured at 150 nm, which could be clocked up to 250 MHz. Nvidia introduced a new memory subsystem on the GeForce3 called “Lightspeed Memory Architecture” (LMA), which was designed to compress the Z-buffer and reduce the overall demand on the memory’s limited bandwidth. It was also designed to accelerate FSAA using a special algorithm called “Quincunx.” Overall performance was higher than the GeForce2, but due to the complexity of the GPU it was fairly expensive to produce, and thus carried a high price tag in comparison.


NV2A: Nvidia And The Xbox

Nvidia would once again find itself back in the home console market as a key component of Microsoft’s original Xbox in 2001. The Xbox used hardware nearly identical to what you would find inside of modern PCs at that time, and the GPU designed by Nvidia was essentially a tweaked GeForce3. Just like the NV20 GPU, the NV2A inside of the Xbox contained four pixel pipelines with two TMUs each. Nvidia also created the Xbox’s audio hardware known as MCPX, or “SoundStorm”.

NV17: GeForce4 (Part 1)

Nvidia started to shake things up in 2002 by introducing several GPUs based on different architectures. All of these were branded as GeForce4. At the low-end of the GeForce4 stack was the NV17, which was essentially an NV11 GeForce2 die that had been shrunk using 150 nm transistors and clocked between 250 and 300 MHz. It was a drastically simpler design compared to the NV20, which made it an affordable product that Nvidia could push to both mobile and desktop markets.

Nvidia later released two revisions of the NV17 core called NV18 and NV19. NV18 featured an upgraded bus to AGP 8X, while NV19 was essentially an NV18 chip with a PCIe bridge to support x16 links. The DDR memory on these chips was clocked anywhere between 166 and 667 MHz.
Nvidia 后来发布了 NV17 核心的两个版本,称为 NV18 和 NV19。NV18 具有升级到 AGP 8X 的总线,而 NV19 本质上是一个 NV18 芯片,带有 PCIe 桥接以支持 x16 链接。这些芯片上的 DDR 内存的时钟频率在 166 到 667 MHz 之间。

NV25: GeForce4 (Part 2) NV25:GeForce4(第 2 部分)


With the NV17 covering the lower-half of the market, Nvidia launched NV25 to cover the high-end. The NV25 was developed as an improvement upon the GeForce3’s architecture, and essentially had the same resources with four pixel pipelines, eight TMUs, and four ROPs. The NV25 did have twice as many vertex shaders (an increase from one to two), however, and it featured the updated LMA-II system. Overall, the NV25 contained 63 million transistors, just 3 million more than the GeForce3. The GeForce4 NV25 also had a clock speed advantage over the GeForce3, ranging between 225 and 300 MHz. The 128MB DDR memory was clocked between 500 to 650 MHz.
由于 NV17 覆盖了低端市场,Nvidia 推出了 NV25 来覆盖高端市场。NV25 是作为 GeForce3 架构的改进而开发的,基本上具有相同的资源,包括 4 个像素管道、8 个 TMU 和 4 个 ROP。然而,NV25 确实有两倍的顶点着色器(从一个增加到两个),并且它具有更新的 LMA-II 系统。总体而言,NV25 包含 6300 万个晶体管,仅比 GeForce3 多 300 万个。GeForce4 NV25 的时钟速度也优于 GeForce3,范围在 225 到 300 MHz 之间。128MB DDR 内存的时钟频率在 500 到 650 MHz 之间。

Benchmarks of the NV25 in DirectX 7 titles showed modest performance gains over the GeForce3 of around 10%. However, DirectX 8 games that took advantage of the vertex shaders saw the performance advantage held by the NV25 grow to 38%.
在 DirectX 7 游戏中对 NV25 的基准测试显示,它比 GeForce3 仅有约 10% 的小幅性能提升。然而,在利用顶点着色器的 DirectX 8 游戏中,NV25 的性能优势扩大到了 38%。

Nvidia later released a revised NV25 chip, called the NV28. Similar to the NV18 mentioned in the last slide, the NV28 only differed from the NV25 in that it supported AGP 8X.
Nvidia 后来发布了一款经过修改的 NV25 芯片,称为 NV28。与上一张幻灯片中提到的 NV18 类似,NV28 与 NV25 的唯一区别在于它支持 AGP 8X。

NV30: The FX 5000 (Part 1) NV30:FX 5000(第 1 部分)

In 2002, the gaming world welcomed the arrival of Microsoft’s DirectX 9 API, which was one of the most heavily used and influential gaming APIs for several years. ATI and Nvidia both scrambled to develop DX9-compliant hardware, which meant the new GPUs had to support Pixel Shader 2.0. ATI beat Nvidia to the market in August 2002 with its first DX9-capable cards, but by the end of 2002 Nvidia launched its FX 5000 series.
2002 年,游戏界迎来了 Microsoft 的 DirectX 9 API,这是几年来使用最广泛和最具影响力的游戏 API 之一。ATI 和 Nvidia 都争先恐后地开发符合 DX9 标准的硬件,这意味着新的 GPU 必须支持 Pixel Shader 2.0。ATI 于 2002 年 8 月凭借其第一款支持 DX9 的卡击败了 Nvidia 进入市场,但到 2002 年底,Nvidia 推出了 FX 5000 系列。

Although Nvidia launched its DX 9 cards later than ATI, they came with a few additional features that Nvidia used to attract game developers. The key difference was the use of Pixel Shader 2.0A, Nvidia’s own in-house revision. Pixel Shader 2.0A featured a number of improvements over Microsoft’s Pixel Shader 2.0, such as unlimited dependent textures, a sharp increase in the number of instruction slots, instruction predication hardware, and support for more advanced gradient effects. Essentially, Pixel Shader 2.0A contained several improvements that would become part of Microsoft’s Pixel Shader 3.0.
尽管 Nvidia 推出其 DX 9 卡的时间晚于 ATI,但它们还配备了一些 Nvidia 用来吸引游戏开发者的附加功能。主要区别在于使用了 Pixel Shader 2.0A,这是 Nvidia 自己的内部版本。与 Microsoft 的 Pixel Shader 2.0 相比,Pixel Shader 2.0A 具有许多改进,例如无限的依赖纹理、指令槽数量的急剧增加、指令预测硬件以及对更高级渐变效果的支持。从本质上讲,Pixel Shader 2.0A 包含几项改进,这些改进将成为 Microsoft Pixel Shader 3.0 的一部分。

Crafted using 130 nm transistors, NV30 operated between 400 and 500 MHz, and it had access to 128 or 256MB of DDR2 RAM over a 128-bit bus operating at either 800 or 1000 MHz. The NV30 itself continued to use a four-pipeline design with two vertex shaders, eight TMUs, and four ROPs. Nvidia followed it up with lower-end variants that had four pixel pipelines and just one vertex shader, four TMUs, and four ROPs and could use less expensive DDR memory.
NV30 采用 130 nm 晶体管制造,工作频率在 400 到 500 MHz 之间,并可通过运行频率为 800 或 1000 MHz 的 128 位总线访问 128MB 或 256MB 的 DDR2 内存。NV30 本身继续使用四管线设计,具有两个顶点着色器、八个 TMU 和四个 ROP。Nvidia 随后推出了低端变体,它们只有四个像素管线、一个顶点着色器、四个 TMU 和四个 ROP,并可以使用更便宜的 DDR 内存。
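The bandwidth figures implied by these specs follow directly from bus width and effective memory clock. As a quick sanity check, a minimal sketch (the formula is the standard peak-bandwidth calculation, not anything Nvidia-specific):

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer x transfers per second."""
    return (bus_width_bits / 8) * effective_clock_mhz * 1e6 / 1e9

# NV30: 128-bit bus with DDR2 at an effective 1000 MHz
print(memory_bandwidth_gbs(128, 1000))  # 16.0 GB/s
```

Doubling the bus to 256 bits at the same clock (as the later NV35 did) doubles this figure.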

NV35: The FX 5000 Series (Part 2) NV35:FX 5000 系列(第 2 部分)

Although the NV30 was the original FX 5000-series flagship, just a few months later Nvidia released a faster model that had an extra vertex shader and could use DDR3 connected via a wider 256-bit bus.
尽管 NV30 是最初的 FX 5000 系列旗舰产品,但仅仅几个月后,Nvidia 就发布了一款更快的型号,该型号具有额外的顶点着色器,并且可以使用通过更宽的 256 位总线连接的 DDR3。

The FX 5000 series was heavily criticized despite its advanced feature set because its lackluster performance lagged behind ATI’s competing GPUs. It also had poor thermal characteristics, causing the GPU to run extraordinarily hot, requiring OEMs to sell the FX 5000 series with large air coolers.
FX 5000 系列尽管具有先进的功能集,但因其乏善可陈的性能落后于 ATI 的竞争对手 GPU 而受到严厉批评。它还具有较差的热特性,导致 GPU 运行异常热,要求 OEM 销售带有大型空气冷却器的 FX 5000 系列。

NV40: Nvidia GeForce 6800 NV40:英伟达 GeForce 6800

Just one year after the launch of the FX 5000 series, Nvidia released the 6000 series. The GeForce 6800 Ultra was Nvidia’s flagship powered by the NV40. With 222 million transistors, 16 pixel superscalar pipelines (with one pixel shader, TMU, and ROP on each), six vertex shaders, Pixel Shader 3.0 support, and 32-bit floating-point precision, the NV40 had vastly more resources at its disposal than the NV30. This is also not counting native support for up to 512MB of GDDR3 over a 256-bit bus, giving the GPU more memory and better memory performance than its predecessor. These GPUs were produced with the same 130nm technology as the FX 5000 series.
在 FX 5000 系列推出仅一年后,英伟达发布了 6000 系列。GeForce 6800 Ultra 是 Nvidia 的旗舰产品,由 NV40 提供支持。NV40 拥有 2.22 亿个晶体管、16 个像素超标量管道(每个管道都有一个像素着色器、TMU 和 ROP)、6 个顶点着色器、Pixel Shader 3.0 支持和 32 位浮点精度,比 NV30 拥有更多的资源。这还不包括通过 256 位总线原生支持高达 512MB 的 GDDR3,从而为 GPU 提供比其前身更多的内存和更好的内存性能。这些 GPU 采用与 FX 5000 系列相同的 130nm 技术生产。

The 6000 series was highly successful, as it could be twice as fast as the FX 5950 Ultra in some games, and was often roughly 50% faster in most tests. At the same time, it was also more energy efficient.
6000 系列非常成功,因为它在某些游戏中的速度可能是 FX 5950 Ultra 的两倍,并且在大多数测试中通常快 50% 左右。同时,它也更加节能。

NV43: The GeForce 6600 NV43:GeForce 6600


After Nvidia secured its position at the high-end of the GPU market, it turned its attention to producing a new mid-range graphics chip known as the NV43. This GPU was used inside of the Nvidia GeForce 6600, and it had essentially half of the execution resources of the NV40. It also relied on a narrower 128-bit bus. The NV43 had one key advantage, however, as it was shrunk using 110 nm transistors. The reduced number of resources made the NV43 relatively inexpensive to produce, while the new fabrication technology helped reduce power consumption and boost clock speeds by roughly 20% compared to the NV40.
在 Nvidia 巩固了其在高端 GPU 市场的地位后,它将注意力转向生产一款名为 NV43 的新型中端图形芯片。该 GPU 用于 Nvidia GeForce 6600 中,其执行资源基本上只有 NV40 的一半,并且依赖于较窄的 128 位总线。不过,NV43 有一个关键优势:它采用 110 nm 晶体管缩小而成。资源数量的减少使 NV43 的生产成本相对低廉,而新的制造工艺有助于降低功耗,并使时钟速度比 NV40 提高约 20%。

G70: The GeForce 7800 GTX And GeForce 7800 GTX 512 G70:GeForce 7800 GTX 和 GeForce 7800 GTX 512

The GeForce 6800 was succeeded by the GeForce 7800 GTX, which used a new GPU code-named G70. Based on the same 110 nm technology as NV43, the G70 contained a total of 24 pixel pipelines with 24 TMUs, eight vertex shaders, and 16 ROPs. The GPU could access to up to 256MB of GDDR3 clocked at up to 600 MHz (1.2 GHz DDR) over a 256-bit bus. The core itself operated at 430 MHz.
GeForce 6800 被 GeForce 7800 GTX 取代,它使用了代号为 G70 的新 GPU。G70 基于与 NV43 相同的 110 nm 技术,总共包含 24 个像素管道,具有 24 个 TMU、8 个顶点着色器和 16 个 ROP。GPU 可以通过 256 位总线访问高达 256MB 的 GDDR3,主频高达 600 MHz (1.2 GHz DDR)。内核本身的工作频率为 430 MHz。
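The unit counts above translate into theoretical fillrates. Assuming one pixel per ROP and one texel per TMU per clock (a reasonable simplification for the fixed-function hardware of this era), the G70's peak rates work out as follows:

```python
def fillrate_gps(core_clock_mhz: float, units: int) -> float:
    """Peak fillrate in gigasamples/s: one sample per unit per clock cycle."""
    return core_clock_mhz * 1e6 * units / 1e9

# GeForce 7800 GTX: 430 MHz core, 16 ROPs, 24 TMUs
print(fillrate_gps(430, 16))  # 6.88 Gpixel/s
print(fillrate_gps(430, 24))  # 10.32 Gtexel/s
```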

Although the GeForce 7800 GTX was quite powerful for its time, Nvidia managed to improve upon its design shortly after its release with the GeForce 7800 GTX 512. With this card, Nvidia reworked the layout of the core and transitioned over to a new cooler design, which enabled the company to push clock speed up to 550 MHz. It also improved its memory controller by reducing latency and pushing the memory frequency up to 850 MHz (1.7 GHz effective) over the same 256-bit bus. Memory capacity doubled to 512MB, which is what the card's name refers to.
尽管 GeForce 7800 GTX 在当时相当强大,但 Nvidia 在其发布后不久便通过 GeForce 7800 GTX 512 改进了设计。在这款卡上,Nvidia 重新设计了核心布局并换用了新的散热器设计,使时钟速度得以提高到 550 MHz。它还改进了内存控制器,降低了延迟,并在同样的 256 位总线上将内存频率提高到 850 MHz(等效 1.7 GHz)。内存容量也翻倍至 512MB,这正是该卡名称的由来。

G80: GeForce 8000 Series And The Birth Of Tesla G80:GeForce 8000 系列和特斯拉的诞生

Nvidia introduced its Tesla microarchitecture with the GeForce 8000 series - the company’s first unified shader design. Tesla would become one of Nvidia’s longest-running architectures, as it was used inside the GeForce 8000, GeForce 9000, GeForce 100, GeForce 200, and GeForce 300 series of GPUs.
Nvidia 通过 GeForce 8000 系列推出了 Tesla 微架构,这是该公司的第一个统一着色器设计。特斯拉将成为 Nvidia 运行时间最长的架构之一,因为它被用于 GeForce 8000、GeForce 9000、GeForce 100、GeForce 200 和 GeForce 300 系列 GPU。

The GeForce 8000 series's flagship was the 8800 GTX, powered by Nvidia's G80 GPU manufactured at 90 nm with over 681 million transistors. Thanks to the unified shader architecture, the 8800 GTX and the rest of the 8000 series featured full support for Microsoft's new DirectX 10 API and Pixel Shader 4.0. The 8800 GTX came with 128 shaders, a 575 MHz core clock, and 768MB of GDDR3 connected over a 384-bit bus. Nvidia also increased the number of TMUs up to 64 and raised the ROP count to 24. All of these enhancements allowed the GeForce 8800 GTX to perform more than twice as fast as its predecessor in high-resolution tests.
GeForce 8000 系列的旗舰产品是 8800 GTX,由 Nvidia 以 90 纳米工艺制造、拥有超过 6.81 亿个晶体管的 G80 GPU 提供支持。得益于统一着色器架构,8800 GTX 和 8000 系列的其他产品完全支持 Microsoft 的新 DirectX 10 API 和 Pixel Shader 4.0。8800 GTX 配备 128 个着色器,核心频率为 575 MHz,并通过 384 位总线连接 768MB 的 GDDR3。Nvidia 还将 TMU 的数量增加到 64 个,并将 ROP 数量提高到 24 个。所有这些增强使 GeForce 8800 GTX 在高分辨率测试中的速度达到其前代产品的两倍以上。

As yields improved, Nvidia later replaced the 8800 GTX with the 8800 Ultra as the flagship. Although both graphics cards used the same G80 core, the 8800 Ultra was clocked at 612 MHz, giving it a slight edge over the 8800 GTX.
随着产量的提高,Nvidia 后来用 8800 Ultra 取代了 8800 GTX 作为旗舰。虽然两个显卡都使用相同的 G80 内核,但 8800 Ultra 的时钟频率为 612 MHz,比 8800 GTX 略有优势。

G92: GeForce 9000 Series And Tesla Improved G92:GeForce 9000 系列和特斯拉改进

Nvidia continued to use the Tesla architecture in its GeForce 9000 series products, but with a few revisions. Nvidia’s G92 core inside the 9000-series flagship was essentially just a die shrink of G80. By fabricating G92 at 65 nm, Nvidia was able to hit clock speeds ranging from 600 to 675 MHz all while reducing overall power consumption.
Nvidia 在其 GeForce 9000 系列产品中继续使用 Tesla 架构,但进行了一些修改。Nvidia 在 9000 系列旗舰中的 G92 内核本质上只是 G80 的芯片缩小版。通过制造 65 nm 的 G92,Nvidia 能够达到 600 到 675 MHz 的时钟速度,同时降低整体功耗。

Thanks to the improved energy efficiency and reduced heat, Nvidia launched a dual-G92 GPU called the GeForce 9800 GX2 as the flagship in the 9000 series. This was something Nvidia was unable to do with the power-hungry G80. In tests, the 9800 GX2 outperformed the 8800 Ultra on average between 29 to 41% when AA was turned off. When AA was active, however, the 9800 GX2’s performance lead shrunk to 13% due to RAM limitations. Each G92 on the 9800 GX2 had access to 512MB of GDDR3, while the 8800 Ultra came with 768MB. The card was also considerably more expensive than the 8800 Ultra, which made it a tough sale.
由于提高了能源效率和减少了热量,Nvidia 推出了一款名为 GeForce 9800 GX2 的双 G92 GPU,作为 9000 系列中的旗舰。这是 Nvidia 无法用耗电的 G80 做到的。在测试中,当 AA 关闭时,9800 GX2 的性能平均比 8800 Ultra 高 29% 到 41%。然而,当 AA 处于活动状态时,由于 RAM 限制,9800 GX2 的性能领先优势缩小到 13%。9800 GX2 上的每个 G92 都可以访问 512MB 的 GDDR3,而 8800 Ultra 则有 768MB。该卡也比 8800 Ultra 贵得多,这使得它的销售变得艰难。

G92 And G92B: GeForce 9000 Series (Continued) G92 和 G92B:GeForce 9000 系列(续)

Nvidia later released the GeForce 9800 GTX with a single G92 core clocked at 675 MHz and 512MB of GDDR3. This 9800 GTX was slightly faster than the 8800 Ultra thanks to its higher clock speed, but it also ran into issues due to its limited RAM capacity. Eventually, Nvidia created the GeForce 9800 GTX+ with a new 55 nm chip code-named G92B. This allowed Nvidia to push clock speed up to 738 MHz, but the most significant improvement that the 9800 GTX+ possessed was its 1GB of memory.
Nvidia 后来发布了 GeForce 9800 GTX,它采用单个主频为 675 MHz 的 G92 内核和 512MB 的 GDDR3。由于时钟速度更高,这款 9800 GTX 比 8800 Ultra 略快,但由于显存容量有限,它也遇到了问题。最终,Nvidia 推出了 GeForce 9800 GTX+,采用代号为 G92B 的新型 55 纳米芯片。这使 Nvidia 能够将时钟速度提高到 738 MHz,但 9800 GTX+ 最显著的改进是其 1GB 显存。

G92B: The GeForce 100 Series G92B:GeForce 100 系列

Towards the end of the 9000-series, Nvidia introduced the GeForce 100 series targeted exclusively at OEMs. Individual consumers were not able to buy any of the 100-series cards directly from retailers. All 100 series GPUs were re-branded variants of the 9000 series with minor alterations to clock speed and card design.
在 9000 系列接近尾声时,Nvidia 推出了专门针对 OEM 的 GeForce 100 系列。个人消费者无法直接从零售商处购买任何 100 系列卡。所有 100 系列 GPU 都是 9000 系列的更名变体,时钟速度和卡设计略有不同。

GT200: GeForce 200 Series And Tesla 2.0 GT200:GeForce 200 系列和特斯拉 2.0

Nvidia introduced the GT200 core based on an improved Tesla architecture in 2008. Changes made to the architecture included an improved scheduler and instruction set, a wider memory interface, and an altered core ratio. Whereas the G92 had eight Texture Processor Clusters (TPC) with 16 EUs and eight TMUs, the GT200 used ten TPCs with 24 EUs and eight TMUs each. Nvidia also doubled the number of ROPs from 16 in the G92 to 32 in the GT200. The memory bus was extended from a 256-bit interface to a 512-bit wide connection to the GDDR3 memory pool.
Nvidia 于 2008 年推出了基于改进版 Tesla 架构的 GT200 内核。架构上的改动包括改进的调度器和指令集、更宽的内存接口以及调整后的单元配比。G92 拥有 8 个纹理处理器集群(TPC),每个集群有 16 个 EU 和 8 个 TMU;而 GT200 使用 10 个 TPC,每个 TPC 有 24 个 EU 和 8 个 TMU。Nvidia 还将 ROP 数量从 G92 的 16 个翻倍至 GT200 的 32 个。内存总线也从 256 位接口扩展为连接 GDDR3 显存池的 512 位宽通道。
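The per-TPC changes are easier to see as chip-wide totals. A quick calculation multiplying out the unit counts given above for both chips:

```python
def totals(tpcs: int, eus_per_tpc: int, tmus_per_tpc: int) -> tuple:
    """Chip-wide shader (EU) and TMU counts from per-TPC resources."""
    return tpcs * eus_per_tpc, tpcs * tmus_per_tpc

print(totals(8, 16, 8))   # G92:   (128, 64)
print(totals(10, 24, 8))  # GT200: (240, 80)
```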

The GT200 launched in the Nvidia GeForce GTX 280, which was significantly faster than the GeForce 9800 GTX+ due to the increased resources. It could not cleanly outperform the GeForce 9800 GX2, but because the 9800 GX2 had considerably higher power consumption and less memory, the GTX 280 was still considered the superior graphics card. The introduction of the GeForce GTX 295 with two GT200 cores in 2009 further cemented the market position of the 200 series.
GT200 在 Nvidia GeForce GTX 280 中推出,由于资源增加,它比 GeForce 9800 GTX+ 快得多。它的性能无法完全胜过 GeForce 9800 GX2,但由于 9800 GX2 的功耗要高得多,内存也更少,GTX 280 仍然被认为是更优越的显卡。2009 年推出具有两个 GT200 内核的 GeForce GTX 295 进一步巩固了 200 系列的市场地位。

GT215: The GeForce 300 Series GT215:GeForce 300 系列

The GeForce 300 series was Nvidia’s second OEM-only line of cards. It was composed entirely of medium-range and low-end GPUs from the GeForce 200 series. All GeForce 300 series desktop GPUs use a 40 nm process and are based on the Tesla 2.0 architecture.
GeForce 300 系列是 Nvidia 的第二款 OEM 专用卡系列。它完全由 GeForce 200 系列的中端和低端 GPU 组成。所有 GeForce 300 系列台式机 GPU 均采用 40 纳米工艺,并基于 Tesla 2.0 架构。

GF100: Fermi Arrives In The GeForce 400 GF100:Fermi 随 GeForce 400 登场

Tesla and the GeForce 8000, 9000, 100, 200, and 300 series were followed by Nvidia’s Fermi architecture and the GeForce 400 series in 2010. The largest Fermi chip ever produced was the GF100, which contained four GPCs. Each GPC had four Streaming Multiprocessors, with 32 CUDA cores, four TMUs, three ROPs, and a PolyMorph Engine. A perfect GF100 core shipped with a total of 512 CUDA cores, 64 TMUs, 48 ROPs, and 16 PolyMorph Engines.
特斯拉和 GeForce 8000、9000、100、200 和 300 系列紧随其后,2010 年推出了 Nvidia 的 Fermi 架构和 GeForce 400 系列。有史以来最大的费米芯片是 GF100,它包含四个 GPC。每个 GPC 都有四个流式多处理器,具有 32 个 CUDA 内核、四个 TMU、三个 ROP 和一个 PolyMorph 引擎。完美的 GF100 核心总共附带 512 个 CUDA 核心、64 个 TMU、48 个 ROP 和 16 个 PolyMorph 引擎。
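The "perfect core" figures quoted above can be derived directly from the per-SM resources. A short sketch multiplying out the GPC → SM → unit hierarchy:

```python
# GF100 hierarchy: 4 GPCs, each with 4 Streaming Multiprocessors (SMs)
gpcs, sms_per_gpc = 4, 4
sms = gpcs * sms_per_gpc

cuda_cores = sms * 32   # 32 CUDA cores per SM
tmus = sms * 4          # 4 TMUs per SM
rops = sms * 3          # 3 ROPs per SM
polymorph = sms         # one PolyMorph Engine per SM

print(sms, cuda_cores, tmus, rops, polymorph)  # 16 512 64 48 16
```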

However, the GeForce GTX 480 (the original Fermi flagship) shipped with just 480 CUDA cores, 60 TMUs, 48 ROPs, and 15 PolyMorph Engines enabled. Due to the hardware resources packed into GF100, it was an enormous 529 square millimeters in area. This made it rather difficult to produce perfect samples, and forced Nvidia to use slightly defective cores instead. The GeForce GTX 480 also gained a reputation for running excessively hot. Nvidia and its board partners typically used beefy thermal solutions on the GTX 480 as well, which tended to be loud and earned the graphics card a reputation as one of the noisiest GPUs in recent years.
但是,GeForce GTX 480(最初的 Fermi 旗舰产品)仅启用了 480 个 CUDA 内核、60 个 TMU、48 个 ROP 和 15 个 PolyMorph 引擎。由于 GF100 集成的硬件资源众多,其芯片面积高达 529 平方毫米。这使得生产完美的样品相当困难,并迫使 Nvidia 改用略有缺陷的内核。GeForce GTX 480 还因运行温度过高而臭名昭著。Nvidia 及其板卡合作伙伴通常还在 GTX 480 上使用强力散热方案,这些方案往往噪音很大,使该显卡被认为是近年来最吵闹的 GPU 之一。

GF104, 106, 108: Fermi’s Altered Cores GF104、106、108:Fermi 的改良核心

To reduce production costs and increase yields of its smaller Fermi GPUs, Nvidia rearranged the resource count of its SMs. Each of the eight SMs inside the GF104 and four inside the GF106 contains 48 CUDA cores, four TMUs, and four ROPs. This reduced the overall die size, as fewer SMs were needed on each die. It also reduced shader performance somewhat, but these cores were nonetheless competitive. With this core configuration, an Nvidia GeForce GTX 460 powered by a GF104 was able to perform nearly identically to an Nvidia GeForce GTX 465 that contained a GF100 core with just 11 SMs enabled.
为了降低生产成本并提高其较小的 Fermi GPU 的产量,Nvidia 重新安排了其 SM 的资源数量。GF104 内部的 8 个 SM 和 GF106 内部的 4 个 SM 都包含 48 个 CUDA 内核、4 个 TMU 和 4 个 ROP。这减小了整体芯片尺寸,因为每个芯片上需要的 SM 更少。它还在一定程度上降低了着色器性能,但这些内核仍然具有竞争力。在这种核心配置下,由 GF104 提供支持的 Nvidia GeForce GTX 460 的性能与包含 GF100 核心且仅启用 11 个 SM 的 Nvidia GeForce GTX 465 的性能几乎相同。

When Nvidia created the GF108, it again altered the number of resources in each SM. The GF108 has just two SMs, which contain 48 CUDA cores, four TMUs, and two ROPs each.
当 Nvidia 创建 GF108 时,它再次更改了每个 SM 中的资源数量。GF108 只有两个 SM,每个 SM 包含 48 个 CUDA 内核、4 个 TMU 和 2 个 ROP。

GF110: Fermi Re-Worked GF110:费米再加工

Nvidia continued to use the Fermi architecture for the GeForce 500 series, but managed to improve upon its design by re-working each GPU on a transistor level. The underlying concept of this re-working process was to use slower, more efficient transistors in the parts of the GPU that are less critical to performance, and to use faster transistors in key areas that strongly affect performance. This reduced power consumption and enabled an increase in clock speed.
Nvidia 继续在 GeForce 500 系列中使用 Fermi 架构,但通过在晶体管级别重新设计每个 GPU 来改进其设计。此返工过程的基本概念是在 GPU 的某些对性能不太关键的部分使用速度较慢、效率较高的晶体管,并在对性能有很大影响的关键区域使用速度较快的晶体管。这具有降低功耗和提高时钟速度的影响。

Under the hood of the GTX 580 (the GeForce 500-series flagship) was GF110. In addition to the aforementioned transistor re-working, Nvidia also improved FP16 and Z-cull efficiency. These changes made it possible to enable all 16 SMs on the GF110, and the GTX 580 was considerably faster than the GTX 480.
GTX 580(GeForce 500 系列的旗舰)搭载的正是 GF110。除了上述晶体管层面的重新设计外,Nvidia 还提高了 FP16 和 Z-cull 的效率。这些改动使得 GF110 上的全部 16 个 SM 都能启用,GTX 580 也因此比 GTX 480 快得多。

GK104: Kepler And The 600 Series GK104:Kepler 和 600 系列

The GeForce GTX 580 was succeeded by the GTX 680, which used a GK104 based on the Kepler architecture. This marked a transition to 28 nm manufacturing, which is partially responsible for the GK104 being far more efficient than GF110. Compared to the GF110, GK104 also has twice as many TMUs and three times as many CUDA cores. The increase in resources didn’t triple performance, but it did increase performance by between 10 and 30% depending on the game. Overall efficiency increased even more.
GeForce GTX 580 被 GTX 680 取代,GTX 680 使用基于 Kepler 架构的 GK104。这标志着向 28 nm 制造的过渡,这是 GK104 比 GF110 效率高得多的部分原因。与 GF110 相比,GK104 的 TMU 数量是 GF110 的两倍,CUDA 内核的数量是 GF110 的三倍。资源的增加并没有使性能增加三倍,但它确实将性能提高了 10% 到 30%,具体取决于游戏。整体效率进一步提高。

GK110: Big Kepler GK110: 大开普勒


Nvidia’s plan for the GeForce 700 series was essentially to introduce a larger Kepler die. The GK110, which was developed for compute work inside of supercomputers, was perfect for the job. This massive GPU contained 2880 CUDA cores and 240 TMUs. It was first introduced inside the GTX Titan, which had a single SMX disabled, dropping the core count to 2688 CUDA cores, 224 TMUs, and 6GB of RAM. However, the Titan had an unusually high price of $1000, which limited sales. It was later re-introduced as the GTX 780 with just 3GB of RAM and a somewhat more affordable price tag.
Nvidia 对 GeForce 700 系列的计划本质上是引入更大的 Kepler 芯片。GK110 专为超级计算机内部的计算工作而开发,非常适合这项工作。这个巨大的 GPU 包含 2880 个 CUDA 内核和 240 个 TMU。它首先在 GTX Titan 中引入,GTX Titan 禁用了一个 SMX,将核心数量降至 2688 个 CUDA 核心、224 个 TMU 和 6GB RAM。然而,Titan 的 1000 美元价格异常高,这限制了销售。它后来作为 GTX 780 重新推出,只有 3GB 的 RAM,价格更实惠。

Later, Nvidia would ship the GTX 780 Ti, which fully utilized the GK110 with all 2880 CUDA cores and 240 TMUs.
后来,Nvidia 将推出 GTX 780 Ti,它充分利用了 GK110 的所有 2880 个 CUDA 内核和 240 个 TMU。

GM204: Maxwell GM204: 麦克斯韦

Nvidia introduced its Maxwell architecture in 2014 with a focus on efficiency. The initial flagship, the GM204, launched inside the GeForce GTX 980. A key difference between Maxwell and Kepler is the memory sub-system. GM204 has a narrower 256-bit bus, but Nvidia achieved greater utilization of the available bandwidth by implementing a powerful memory compression algorithm. The GM204 also utilizes a large 2MB L2 cache that further reduced the impact of the narrower memory interface.
Nvidia 于 2014 年推出了 Maxwell 架构,专注于效率。最初的旗舰产品 GM204 在 GeForce GTX 980 中推出。Maxwell 和 Kepler 之间的一个关键区别是内存子系统。GM204 具有更窄的 256 位总线,但 Nvidia 通过实施强大的内存压缩算法实现了对可用带宽的更大利用率。GM204 还利用了 2MB 的大 L2 缓存,进一步减少了较窄内存接口的影响。
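Nvidia has not published the details of its compression scheme, so the following is a purely illustrative toy rather than the actual algorithm: the general idea behind lossless delta color compression is to store one base pixel per tile plus small per-pixel deltas, falling back to raw storage whenever the deltas don't fit in the reduced bit budget.

```python
def compress_tile(pixels, delta_bits=4):
    """Toy delta compression: base value plus per-pixel deltas if they fit."""
    base = pixels[0]
    deltas = [p - base for p in pixels]
    limit = 1 << (delta_bits - 1)          # signed range for delta_bits bits
    if all(-limit <= d < limit for d in deltas):
        return ("delta", base, deltas)     # compressed representation
    return ("raw", pixels)                 # fall back to uncompressed

tile = [200, 201, 199, 202]       # smooth gradient: compresses
print(compress_tile(tile))        # ('delta', 200, [0, 1, -1, 2])
noisy = [0, 255, 3, 190]          # high-contrast tile: stored raw
print(compress_tile(noisy)[0])    # raw
```

Smooth regions (the common case in rendered frames) compress well, which is how a narrower bus can still deliver competitive effective bandwidth.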

The GM204 contained a total of 2048 CUDA cores, 128 TMUs, 64 ROPs, and 16 PolyMorph engines. Due to the reduced resources compared to the GTX 780 Ti, the GM204 wasn’t exceptionally faster than the GeForce GTX 780 Ti. It achieved just a 6% performance advantage over the GTX 780 Ti, but it also consumed roughly 33% less power.
GM204 总共包含 2048 个 CUDA 内核、128 个 TMU、64 个 ROP 和 16 个 PolyMorph 引擎。由于与 GTX 780 Ti 相比资源减少,GM204 并不比 GeForce GTX 780 Ti 快。与 GTX 780 Ti 相比,它仅实现了 6% 的性能优势,但它的功耗也降低了大约 33%。
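Those two figures combine into a large efficiency gain: roughly 6% more performance at roughly 33% less power (the approximate deltas quoted above) implies about a 1.6x improvement in performance per watt.

```python
rel_perf = 1.06    # ~6% faster than the GTX 780 Ti
rel_power = 0.67   # ~33% less power consumed

# performance-per-watt ratio relative to the GTX 780 Ti
print(round(rel_perf / rel_power, 2))  # 1.58
```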

Nvidia later released the GM200 inside the GeForce GTX 980 Ti. The GM200 was essentially a more resource-rich GM204 with 2816 CUDA cores. It managed to increase performance over the GM204, but it was not quite as efficient.
Nvidia 后来在 GeForce GTX 980 Ti 中发布了 GM200。GM200 本质上是资源更丰富的 GM204,具有 2816 个 CUDA 内核。它的性能高于 GM204,但能效略逊一筹。

GP104: Pascal GP104: 帕斯卡

The Pascal architecture succeeded Maxwell, and marked Nvidia’s transition to a new 16 nm FinFET process. This helped to increase the architectural efficiency and drive up clock speed. The 314 mm² GP104 used inside the GeForce GTX 1080 contains a whopping 7.2 billion transistors. With 2560 CUDA cores, 160 TMUs, 64 ROPs, and 20 PolyMorph engines, the GeForce GTX 1080 was far more powerful than the GeForce GTX 980 Ti.
Pascal 架构接替了 Maxwell,标志着 Nvidia 向全新的 16 nm FinFET 工艺过渡。这有助于提高架构效率并提升时钟速度。GeForce GTX 1080 内部使用的 GP104 芯片面积为 314 平方毫米,包含多达 72 亿个晶体管。GeForce GTX 1080 拥有 2560 个 CUDA 内核、160 个 TMU、64 个 ROP 和 20 个 PolyMorph 引擎,比 GeForce GTX 980 Ti 强大得多。

Nvidia also produced four other GPUs based on Pascal with lower core counts. Like the GTX 1080, the GTX 1070 is targeted at the high-end gaming segment, while the GTX 1060 handles the mid-range segment, and the GeForce GTX 1050 and 1050 Ti handle the low-end of the market.
Nvidia 还基于 Pascal 生产了其他四个内核数量较少的 GPU。与 GTX 1080 一样,GTX 1070 针对高端游戏市场,而 GTX 1060 针对中端市场,GeForce GTX 1050 和 1050 Ti 针对低端市场。

GP102: Titan X, 1080 Ti & Titan XP GP102:Titan X、1080 Ti 与 Titan XP

Nvidia pushed the performance of the 1000 series further with the release of its GP102 GPU. This part features 3,840 CUDA cores with a 384-bit memory interface, and it is also produced on a 16nm process. It first appeared inside of the Titan X with a partially disabled die that left 3,584 cores clocked at 1,531MHz. It was equipped with 12GB of GDDR5X memory clocked at 10Gb/s and had a max TDP of 250W.
Nvidia 通过发布 GP102 GPU 进一步推动了 1000 系列的性能。该芯片具有 3,840 个 CUDA 内核和 384 位内存接口,同样采用 16nm 工艺生产。它首次出现在 Titan X 中,其芯片被部分屏蔽,留下 3,584 个主频为 1,531MHz 的内核。它配备了 12GB 的 GDDR5X 显存,速率为 10Gb/s,最大 TDP 为 250W。

The GP102 eventually made its way to the consumer market in the form of the Nvidia GeForce GTX 1080 Ti. This graphics card again had a partially disabled die, and it featured the same number of CUDA cores as the Titan X. It is able to outperform the Titan X, however, as it has a higher boost clock speed of 1,582MHz, and its 11GB of GDDR5X RAM is clocked higher at 11Gb/s.
GP102 最终以 Nvidia GeForce GTX 1080 Ti 的形式进入消费市场。该显卡同样采用部分屏蔽的芯片,其 CUDA 内核数量与 Titan X 相同。不过,它能够超越 Titan X,因为它拥有更高的 1,582MHz 加速时钟,且其 11GB 的 GDDR5X 显存速率更高,为 11Gb/s。
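The GTX 1080 Ti's quoted memory specs imply its peak bandwidth: a 352-bit bus with GDDR5X transferring 11Gb/s per pin works out to 484 GB/s.

```python
bus_width_bits = 352   # GTX 1080 Ti memory interface
data_rate_gbps = 11    # GDDR5X per-pin data rate

# bytes per transfer cycle x per-pin gigabits = GB/s of peak bandwidth
print(bus_width_bits // 8 * data_rate_gbps)  # 484
```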

As yields improved, Nvidia was able to release a new GPU called the Titan XP that uses a fully enabled GP102 core. This brings the CUDA core count up to 3,840. The Titan XP comes equipped with 12GB of GDDR5X clocked at 11.4Gb/s. The card is clocked identical to the GTX 1080 Ti, but should perform better thanks to the increased CUDA core count.
随着产量的提高,Nvidia 能够发布一款名为 Titan XP 的新 GPU,它使用完全启用的 GP102 内核。这使 CUDA 核心数量达到 3,840 个。Titan XP 配备 12GB GDDR5X,主频为 11.4Gb/s。该卡的时钟频率与 GTX 1080 Ti 相同,但由于 CUDA 内核数量增加,性能应该会更好。

Turing and GeForce RTX Turing 和 GeForce RTX

Once again, Nvidia veered in a different direction with its Turing architecture. The addition of dedicated hardware for ray tracing (RT Cores) and A.I. (Tensor Cores) brings real-time ray tracing to the gaming world for the first time. It’s a quantum leap in terms of realistic lighting and reflection effects in games, and a rendering technique Nvidia’s CEO Jensen Huang calls the “Holy Grail” of the graphics industry.
Nvidia 的 Turing 架构再次转向了不同的方向。新增的光线追踪专用硬件(RT Core)和 AI 专用硬件(Tensor Core),首次为游戏世界带来了实时光线追踪。这是游戏中逼真光照和反射效果方面的巨大飞跃,Nvidia 首席执行官黄仁勋将这种渲染技术称为图形行业的“圣杯”。

Nvidia first announced Turing as the foundation of new professional Quadro cards, but followed that up the next week with a trio of gaming-focused GeForce RTX cards, the 2070, 2080, and 2080 Ti. While the initial focus was all about ray tracing and AI-assisted super-sampling, Nvidia also promised the RTX 2080 would deliver performance improvements of between 35 and 125 percent compared to the previous-generation GTX 1080. But there was also a fair bit of initial backlash over significant generation-over-generation price increases, which pushed the RTX 2080 Ti Founders Edition card to an MSRP of $1,199, compared to the $699 launch price of the GTX 1080 Ti.
Nvidia 最初宣布 Turing 是新一代专业 Quadro 显卡的基础,但在接下来的一周又推出了三款面向游戏的 GeForce RTX 卡,即 2070、2080 和 2080 Ti。虽然最初的宣传重点是光线追踪和 AI 辅助超采样,但 Nvidia 还承诺,与上一代 GTX 1080 相比,RTX 2080 的性能将提高 35% 到 125%。不过,代际价格的大幅上涨也引发了相当多的早期反弹:RTX 2080 Ti Founders Edition 的建议零售价高达 1,199 美元,而 GTX 1080 Ti 的发布价仅为 699 美元。

Nvidia’s Turing-based RTX cards also introduced a couple of other new technologies to the gaming space. The USB-C-based VirtualLink connector is aimed at next-gen VR headsets, while the NVLink connector replaces SLI as an interconnect for multi-card setups that avoids the bottleneck of PCI-E. But there’s also a price increase involved with NVLink: Nvidia says the required bridge connector will cost $79, compared to the $40 price tag of the high-bandwidth SLI bridge designed for 10-series cards.
Nvidia 基于 Turing 的 RTX 卡还为游戏领域引入了另外几项新技术。基于 USB-C 的 VirtualLink 接口面向下一代 VR 头显,而 NVLink 接口取代 SLI 作为多卡互连方案,避免了 PCI-E 的瓶颈。但 NVLink 也带来了成本上升:Nvidia 表示所需的桥接器将售价 79 美元,而专为 10 系列显卡设计的高带宽 SLI 桥接器售价为 40 美元。


From: https://blog.csdn.net/u013669912/article/details/141557615
