A general-purpose CPU is well suited to everyday audio, video, and office workloads, such as using a word processor, playing videos, running general programs, and working with files. But when a task calls for intense visual and geometric calculations and floating-point arithmetic, such as displaying a complex scene filled with intricate 3D objects, the CPU needs the help of a dedicated graphics processing and rendering device. This is where the GPU comes into the picture.
The intricacy and amount of graphical detail in today's video games and other graphical media is staggering, and it is still evolving.
A CPU usually has a limited number of high-performance cores. A GPU's architecture, on the other hand, focuses on parallelism: a GPU houses thousands of cores that perform graphical and geometric calculations and apply object transformations such as translation, scaling, and shearing to animate objects.
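To make the idea of object transformations concrete, here is a minimal Python sketch of 2D transforms written as 3x3 matrices in homogeneous coordinates. The function names are hypothetical and chosen only for illustration. The loop over vertices stands in for work that a GPU would spread across many cores at once.

```python
# Illustrative sketch only: 2D affine transforms as 3x3 matrices.
# All names here are hypothetical, not part of any graphics API.

def translation(tx, ty):
    # Moves a point by (tx, ty)
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

def scaling(sx, sy):
    # Stretches a point by sx along x and sy along y
    return [[sx, 0.0, 0.0],
            [0.0, sy, 0.0],
            [0.0, 0.0, 1.0]]

def shear_x(k):
    # Shifts x in proportion to y
    return [[1.0, k, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

def apply(matrix, point):
    # Multiply the matrix by the point written as (x, y, 1)
    x, y = point
    vec = (x, y, 1.0)
    out = [sum(row[i] * vec[i] for i in range(3)) for row in matrix]
    return (out[0], out[1])

def transform_all(matrix, points):
    # Every vertex is independent of the others, which is why this
    # kind of work parallelizes well. Here it is a plain loop.
    return [apply(matrix, p) for p in points]

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
moved = transform_all(translation(2.0, 3.0), square)
sheared = transform_all(shear_x(0.5), square)
```

Because each vertex is transformed independently, the same matrix can be applied to thousands of vertices at the same time.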
The GPU plugs into a PCIe slot, and the CPU talks to it over the high-speed PCIe interface. Neither the graphics application nor the OS talks to the GPU directly; instead, software running on the CPU goes through graphics APIs such as DirectX, OpenGL, and Vulkan.
A realistic scene rendered in Unreal Engine 4
These graphics APIs have the CPU issue draw calls to the GPU, telling it what to draw on the screen. The result is sent to the monitor over a display cable such as HDMI or DVI.
The GPU consists of a small graphics processing chip that sits on the card's PCB, along with multiple memory controllers that access the VRAM (video RAM) chips on the card. The memory bandwidth depends on the bus width and the memory clock speed.
The GPU stores all of the graphical parameters it needs in VRAM, including light sources and their positions, the positions of objects in the scene, pixel shader information, polygons, and vertex and color data. It uses this data to generate the final rendered image or scene.
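As a rough illustration of how bus width and memory speed combine, the sketch below computes a peak theoretical bandwidth figure. It assumes the memory speed is given as an effective per-pin data rate in Gbit/s, which is the memory clock multiplied by the number of transfers per clock for the memory type in use. The function name and the example numbers are hypothetical.

```python
def memory_bandwidth_gb_per_s(bus_width_bits, effective_rate_gbps):
    """Peak theoretical memory bandwidth in GB/s.

    bus_width_bits: total width of the memory bus in bits
    effective_rate_gbps: effective data rate per pin in Gbit/s
                         (an assumption about how the clock is quoted)
    """
    bytes_per_transfer = bus_width_bits / 8  # bits -> bytes
    return bytes_per_transfer * effective_rate_gbps

# Hypothetical card: 256-bit bus, 7 Gbit/s effective per pin
example = memory_bandwidth_gb_per_s(256, 7.0)
print(example)
```

Real-world throughput is lower than this peak figure, but the formula shows why a wider bus or faster memory raises the ceiling.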
Here is an example of the level of graphical fidelity in games today:
Batman Arkham Knight
A GPU comprises arrays of thousands of specialized cores laid out in a particular fashion.
First, there are the shader cores, or shader units, which are the most numerous and handle shading work such as pixel shading. Today's GPUs use unified shading, in which every shader core can perform all of the shader functions:
- Vertex Shading (works with vertices and points): Vertex shaders transform the 3D position of each vertex in virtual space into the 2D coordinate at which it appears on the screen. They can manipulate properties such as position, color, and texture coordinates, but cannot create new vertices.
- Pixel Shading (color and depth at the pixel level): Pixel shaders compute the color and depth value of each pixel. They can simply output the same color every time, or apply lighting, shadows, bump mapping, and so on.
- Geometry Shading (introduced in Direct3D 10 and OpenGL 3.2): Geometry shaders can generate new graphics primitives, such as points, lines, and triangles, from the primitives passed in from an earlier stage of rendering.
- Tessellation Shading (added in OpenGL 4.0 and Direct3D 11): The tessellation stage adds two new sub-stages, known as hull shading and domain shading, which together allow simple meshes to be subdivided into finer meshes at run time according to a function.
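The first two stages in the list can be sketched in plain Python. This is not real shader code: the function names, the simple pinhole projection, and the diffuse (Lambert) lighting model are all assumptions chosen to keep the example short. It assumes the vertex is already in camera space with z > 0 in front of the camera, and that the normal and light direction are unit vectors.

```python
def vertex_shader(vertex, focal_length, width, height):
    """Project a camera-space 3D vertex to 2D screen coordinates.

    Hypothetical pinhole projection with a perspective divide.
    """
    x, y, z = vertex
    ndc_x = focal_length * x / z          # perspective divide
    ndc_y = focal_length * y / z
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height   # screen y grows downward
    return (screen_x, screen_y)

def pixel_shader(base_color, normal, light_dir):
    """Return the lit color of one pixel using diffuse lighting.

    Assumes normal and light_dir are unit vectors.
    """
    dot = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, dot)             # surfaces facing away get no light
    return tuple(c * intensity for c in base_color)

center = vertex_shader((0.0, 0.0, 5.0), 1.0, 800, 600)
corner = vertex_shader((1.0, 1.0, 1.0), 1.0, 800, 600)
lit = pixel_shader((1.0, 0.5, 0.25), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
dark = pixel_shader((1.0, 0.5, 0.25), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
```

On a GPU, the vertex function would run once per vertex and the pixel function once per pixel, with many instances executing in parallel on the shader cores.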
Another important component of a GPU is the TMU (Texture Mapping Unit). TMUs map 2D bitmaps onto 3D models using scaling and other transforms.
TMUs determine a parameter known as texture fill rate, which is the rate at which textured pixels (texels) can be mapped onto models.
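Texture fill rate is commonly quoted as the number of TMUs multiplied by the core clock. The sketch below uses that relationship. The function name and the example figures are hypothetical, and the result is a theoretical peak rather than a measured value.

```python
def texture_fill_rate_gtexels(tmu_count, core_clock_mhz):
    """Peak theoretical texture fill rate in gigatexels per second.

    Assumes each TMU can process one texel per clock cycle.
    """
    return tmu_count * core_clock_mhz / 1000

# Hypothetical GPU: 64 TMUs running at 1000 MHz
rate = texture_fill_rate_gtexels(64, 1000)
print(rate)
```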
The final components are the Render Output Units (ROPs). These units carry out the last steps of the rasterization pipeline, computing the final value of each shaded and textured pixel.
A post-processing technique known as antialiasing is applied at this point. Antialiasing reduces the jaggies, or stair-step effect, along edges in an image to produce a clearer and crisper result.
The ROPs then write this final output into the corresponding positions in the frame buffer.
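There are several antialiasing techniques, and the text above does not say which one is meant. As one simple illustration, the sketch below uses supersampling: the image is rendered at twice the resolution, then each 2x2 block is averaged into one output pixel. The function name and the sample image are hypothetical.

```python
def downsample_2x(image):
    """Average each 2x2 block of a grayscale image into one pixel.

    image: list of rows of brightness values, with even width and height.
    """
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = (image[y][x] + image[y][x + 1] +
                     image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out

# A hard diagonal edge between black (0) and white (255)
hi_res = [
    [0,   0,   0,   255],
    [0,   0,   255, 255],
    [0,   255, 255, 255],
    [255, 255, 255, 255],
]
smooth = downsample_2x(hi_res)
```

Pixels that straddle the edge come out as intermediate shades instead of pure black or white, which is what softens the stair-step effect.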
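A minimal sketch of writing pixels into a frame buffer follows. It adds a depth test, which is commonly performed at this stage but is not described in the text above, so treat it as an assumption. All names are hypothetical.

```python
def make_buffers(width, height):
    # Color buffer starts black; depth buffer starts "infinitely far"
    color = [[(0, 0, 0)] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    return color, depth

def write_fragment(color, depth, x, y, z, rgb):
    """Write a pixel only if it is closer than what is already stored.

    Returns True if the frame buffer was updated.
    """
    if z < depth[y][x]:
        depth[y][x] = z
        color[y][x] = rgb
        return True
    return False

color, depth = make_buffers(4, 3)
write_fragment(color, depth, 1, 2, 5.0, (255, 0, 0))  # red, far away
write_fragment(color, depth, 1, 2, 2.0, (0, 255, 0))  # green, nearer: replaces red
write_fragment(color, depth, 1, 2, 9.0, (0, 0, 255))  # blue, behind: rejected
```

After the three writes, the pixel at (1, 2) holds the green value, because it belongs to the nearest surface.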
The rate at which all of these operations can be performed depends on the clock speed of the GPU (measured in Hz).
GPUs can also be used for large-scale floating-point computation by offloading the work onto the thousands of cores available on the GPU.
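The programming model behind this kind of offloading can be sketched in Python. The example below is only a sequential simulation, not real GPU code: `saxpy_kernel` and `launch` are hypothetical names, and the loop in `launch` stands in for thousands of GPU threads that would each handle one element at the same time.

```python
def saxpy_kernel(i, a, x, y, out):
    # The work done by a single "thread": one multiply and one add
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    # Stand-in for a GPU kernel launch. A real GPU would run the n
    # invocations in parallel; here they run one after another.
    for i in range(n):
        kernel(i, *args)

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
```

The key property is that no element depends on any other, so the work can be split across as many cores as are available.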