PS4 Pro ENHANCED with Rapid Packed Math, Closing the Gap on Xbox One X

“Rapid Packed Math” May Change the Playing Field for Everyone

RX Vega will introduce, for the first time in desktop GPUs, the Rapid Packed Math feature. This allows two half-float operations (FP16) to be executed in the time it would take for one full-float operation (FP32). From AMD’s promo statement:

“Next-Gen Compute Units (NCUs) provide super-charged pathways for doubling processing throughput when using 16-bit data types. In cases where a full 32 bits of precision is not necessary to obtain the desired result, they can pack twice as much data into each register and use it to execute two parallel operations. This is ideal for a wide range of computationally intensive applications including image/video processing, ray tracing, artificial intelligence, and game rendering.”

In other words, it makes me high-end graphics faster to load so’s I’m not pissed off at de-res’d textures. Forget all the rest of that “work, I’m an engineer so I use the GPU to do computations A.I. blah-blah-blah” stuff.

Rapid Packed Math

Just ahead of the PS4 Pro launch, System Architect Mark Cerny revealed that the console would include a couple features from AMD’s future roadmap. One of these is what AMD is now calling Rapid Packed Math; Cerny said that it has the potential to “radically increase performance”.

One of the features appearing for the first time on the PS4 Pro is the handling of 16-bit variables – it’s possible to perform two 16-bit operations at a time instead of one 32-bit operation. In other words, at full floats, we have 4.2 teraflops. With half-floats, it’s now double; 8.4 teraflops in 16-bit computation.
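The arithmetic behind those numbers can be sketched in a few lines. The 4.2 teraflops figure comes from the paragraph above; the `effective_speedup` helper and the FP16 shares it is called with are hypothetical, added only to show that the doubled figure is a peak that assumes every operation can run as a half-float.

```python
# Back-of-the-envelope peak throughput, using the figure quoted above.
FP32_PEAK_TFLOPS = 4.2  # PS4 Pro full-float peak, from the article


def packed_fp16_peak(fp32_peak: float) -> float:
    """Peak FP16 rate when two half-float ops fit in one FP32 slot."""
    return 2 * fp32_peak


def effective_speedup(fp16_fraction: float) -> float:
    """Amdahl-style estimate: only `fp16_fraction` of the work runs at 2x.

    This is an assumption for illustration, not a measured result.
    """
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - fp16_fraction) + fp16_fraction / 2.0)


if __name__ == "__main__":
    print(f"FP16 peak: {packed_fp16_peak(FP32_PEAK_TFLOPS):.1f} teraflops")
    # Hypothetical shares of the workload that tolerate 16-bit precision.
    for share in (0.0, 0.25, 0.5, 1.0):
        print(f"{share:.0%} in FP16 -> {effective_speedup(share):.2f}x overall")
```

Under this simple model the full 2x only shows up when all of the work is FP16-safe; a game that moves half of its math to half-floats would see roughly a third more throughput, not double.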

Now here’s the rub: Microsoft didn’t add Rapid Packed Math to their Xbox One X custom design. They bet on their super-roided pony with extra memory and GPU power. Then again, developers would have to put time into learning how to implement Rapid Packed Math, and it may be something they skip altogether.

In walks another spicy rub: AMD revealed last week that Wolfenstein II: The New Colossus and Far Cry 5 will support Rapid Packed Math. Looks like AMD be talkin’ to the developers and maybe offering incentives?

Right now we’re in kind of a holding pattern to see where the industry chooses to go. Will it use Rapid Packed Math and give the PS4 Pro a leg up on Xbox One X, or will it stay the course with raw, UNCUT power to brute force these 32-bit operations? Hmmm, seems like they’ll use AMD’s technology. But how will NVIDIA, the titan of gaming hardware as of late, feel about that? It’s a battle between Betamax and VHS, HD DVD and Blu-ray, and … Chili’s or Applebee’s? Virtually the same, except Chili’s has the new Chicken Crispers in Sweet Teriyaki, HOT BBQ, and Parmesan Garlic. Ha ha, Chili’s please give me money!

SOURCE

  • Kiyoshi Richards

    I think a better explanation is that FP32 allows bigger numbers, which in turn gives more accuracy when calculating graphics. What this means is that the higher the resolution you are aiming for, the higher the FP standard you usually want to use. For example, if you want to draw a 4K image you usually want to use FP32, but when drawing a 720p or 1080p image FP16 is usually good enough.
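One way to see why resolution can matter is to round a few whole-number pixel coordinates to half precision and back. This is a rough sketch using Python's standard `struct` module (format `e` is IEEE 754 half precision); the helper names and sample coordinates are made up for illustration, and real shaders often work with normalized values, so treat the resolution rule of thumb as a simplification.

```python
import struct


def to_fp16_and_back(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and return the result."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def survives_fp16(value: float) -> bool:
    """True if the value is stored exactly in a 16-bit float."""
    return to_fp16_and_back(value) == value


if __name__ == "__main__":
    # Last pixel index for 1080p height/width and for 4K height/width.
    for coordinate in (1079.0, 1919.0, 2159.0, 3839.0):
        status = "exact" if survives_fp16(coordinate) else "rounded"
        print(f"{coordinate:.0f} -> {to_fp16_and_back(coordinate):.0f} ({status})")
```

FP16 keeps 11 significant bits, so every whole number up to 2048 is exact, while above that only every second whole number can be stored. Coordinates inside a 1080p frame fit; odd coordinates in a 4K frame do not.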

    • justerthought

      Yes some functions require large data precision, so they remain unchanged. That accuracy is not lost. The bonus here is that not all data requires such high precision, and that is where the performance boost can happen.

      Don’t reject something just because it cannot be applied to everything. People constantly think in black and white with no greys allowed. Native or it’s rubbish is a perfect example. This is about optimisation and code efficiency.

      We would not be listening to mp3 music or viewing jpg images if the high precision of lossless native was the only thing allowed. A high quality jpg image can look so close to a raw tiff file that the user simply cannot see the difference. The same applies for a high quality mp3. The internet would grind to a halt if we had to download lossless tiff images on websites.

      The tech belongs to AMD so Sony and MS have the option to use it on their chip designs if they pay for it. Sony definitely chose to use it on the PS4 so they can use it on the PS4 Pro as well. Backwards compatibility is maintained.

      I’m pretty sure MS chose not to use this tech on the XB1. If the tech was implemented on the XB1X and a game utilised it, the game would not work on the XB1. MS are all about backwards compatibility. If MS have used the tech on the XB1X, but not on the XB1, the devs would need to do two versions of the game with different memory architecture allocations. That’s a lot of work, even if they utilise some of the code from the PS4 Pro port.

      The big news here is that if MS do not have this tech, the PS4 Pro could negate the XB1X power advantage by using smart tech.

      • Kevin Caldwell

        I understand the tech, and it should be implemented at a system level like hyper-threading. Unfortunately it’s developer dependent, and given how developers aren’t keen on porting games to the PS4 Pro already… I can’t see it being heavily used. However if this tech doesn’t find its way into the next gen consoles, at a system level, I’ll be very surprised indeed.

  • Kiyoshi Richards
    • justerthought

      You cannot use visual examples like that to see the benefit this could bring. A game using this tech will still be using 32bit for most of its processing, but smaller data that does not require such precision could benefit.

      The simplest example is the amount of ammo in your gun. It’s a very small number that would easily fit in a 16bit register. When you fire the gun, a calculation is made to subtract from the total number. Normally the number of bullets is stored in a huge 32bit register and a 32bit calculation is performed. Even if it’s stored in a 16bit register to save space, the processor still gives it the same time slot as a 32bit calculation.

      This tech changes all that. Storing it in a 16bit register along with another 16bit value for something else inside one big 32bit register allows both calculations to be performed in the same time slot. It doubles the throughput on certain data.
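The idea of two small values sharing one 32-bit word can be imitated in software. The sketch below is only an illustration with plain integers, not how the GPU implements packed FP16 in hardware; the function names, the ammo and grenade counts, and the bit trick itself (a well-known "SIMD within a register" subtraction) are assumptions added here.

```python
LANE_BITS = 16
LANE_MASK = 0xFFFF
WORD_MASK = 0xFFFFFFFF
HIGH_BITS = 0x80008000  # top bit of each 16-bit lane


def pack_lanes(low: int, high: int) -> int:
    """Store two unsigned 16-bit values side by side in one 32-bit word."""
    if not (0 <= low <= LANE_MASK and 0 <= high <= LANE_MASK):
        raise ValueError("each lane must fit in 16 bits")
    return (high << LANE_BITS) | low


def unpack_lanes(word: int) -> tuple:
    """Split a 32-bit word back into its (low, high) 16-bit lanes."""
    return word & LANE_MASK, (word >> LANE_BITS) & LANE_MASK


def packed_subtract(a: int, b: int) -> int:
    """Subtract both lanes with one 32-bit subtraction.

    Setting the top bit of each lane in `a` and clearing it in `b` stops a
    borrow from leaking into the neighbouring lane; the final XOR restores
    the correct top bits. Each lane wraps around modulo 65536.
    """
    difference = (a | HIGH_BITS) - (b & ~HIGH_BITS & WORD_MASK)
    fixup = (a ^ (~b & WORD_MASK)) & HIGH_BITS
    return (difference ^ fixup) & WORD_MASK


if __name__ == "__main__":
    inventory = pack_lanes(30, 3)  # hypothetical: 30 bullets, 3 grenades
    spent = pack_lanes(1, 1)       # fire one bullet, throw one grenade
    print(unpack_lanes(packed_subtract(inventory, spent)))
```

Both counters are updated by a single subtraction on the combined word, which is the software analogue of fitting two 16-bit calculations into one 32-bit time slot.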

      Now if you think of all the other small values this could be applied to, you realize there is potential for a huge gain here. The actual work is invisible but the result gives the dev headroom to perform other tasks simultaneously to create a richer result.

      You may not see it as a pixel bonus in screenshot comparisons. It may be frame rate improvement or that an NPC has slightly more complex AI performances at his disposal, adding to the immersion of the scene. A temporal improvement rather than static improvement. That would be totally invisible in the static screenshot comparison you’re pointing to.

      Do you see my point?

  • justerthought

    Yes it’s true that this tech is only a benefit if the devs actually use it. But to claim that the dev may avoid it due to it not being on the XB1X does not hold water.

    The crucial news here is that the tech is now appearing on the PC. Devs make multiplat games for PC as well and they actually make them on PC workstations.

    When you have two platforms that use the tech and one that does not, you use the tech to make a gain. MS are left the odd man out through lack of long term vision.

    • Steve Wrangall

      The crucial news is that Chili’s has chicken crispers available in FOUR new flavors!

    • greatnessIsaLIE

      Dude, devs barely do anything to support the ps4 pro. So far it looks like only a few devs will go all out on the xbx. I hope they do, but time will tell.

  • Omar

    This will not close any gap and a couple of games have already used it.

  • Kevin Caldwell

    It might just pull the PS4 Pro up a little, but that is something we will have to wait and see. In the meantime many have overlooked how the One X has DX12 instructions built directly into the CPU, which allow over 1,000 draw calls to be made in just 11. That will have a much bigger impact on frame rates than anything else.

    • Fred Kaiser

      unreal engine 4 and unity engine, the 2 most used engines in videogames today, have both reported they are adding or have already added full support for dx12 in their engines for xbox one X, making it even easier for devs to port their games between xbox one and xbox one X

    • Ivan Johnson

      Dx12 barely does anything. I have my gaming PC… many games have issues when running DX12. And I see no difference lol, frame rate jumps from 28fps to 29fps

  • Fred Kaiser

    it gives ps4 pro performance gains in CERTAIN AREAS for CERTAIN SCENARIOS for sure, but to sit there and think that it will cause the ps4 pro to be on par with xbox one X is absolute garbage. fp16 isn’t gonna make up for 42% more power, sorry, just not gonna happen. like omar said below, there are games that have used this already and those games haven’t come close to what xbox one X can do. and cerny talked about this when ps4 pro was revealed, so why are you journalists bringing it back up again like it’s some huge savior for ps4 pro to catch up to xbox one X in power? for anyone that’s not stupid, you can see this is just more BS to try and make sony look good. The only way a ps4 pro game will be on par with xbox one X is if the devs blatantly make it that way, and that in itself will be a shame.

  • Fred Kaiser

    ps4 pro can’t even run senua’s sacrifice at 1080p 60fps. it drops to 900p at 60fps, basically making the game look worse on the ps4 pro than it does on a standard ps4, so F Outta Here with that fp16 BS… if ninja theory decides to bring hellblade to xbox one and the X, the X version will run no less than 1800p native at 60fps, guaranteed

  • Ivan Johnson

    All depends on the developers. Some devs are more advanced than others. Let Naughty Dog or DICE get their hands on the extra power. Their games already look stunning on standard PS4.