The FSINCOS feature, recognized for its performance challenges, underwent minor optimizations, leading to a 75% increase in operations per second. It is advisable to utilize the x87 "reduced precision" mode to enhance performance.
The Ubisoft Connect launcher has been successfully repaired, enabling the play of various Ubisoft games. A new cross-compilation setup has been implemented, enabling developers and testers to construct FEX as a WoW64/ARM64EC emulation module. The package manager Nix has been incorporated to enhance user experience by facilitating the setup of the toolchain necessary for constructing FEX as a WoW64/ARM64EC emulation module.
The release additionally features nix-based tools designed to enhance cross-compilation, emulate files, manage external updates, and improve the FEXConfig and FEXConfig. The Nix package manager enables users to automatically configure the necessary toolchain for constructing FEX as a WoW64/ARM64EC emulation module.
FEX Release FEX-2507
This month we slowed down to take our time and we got some good fixes in store for you.
Horizon Zero Dawn slow-motion physics fix
This month it was brought to our attention that the game Horizon Zero Dawn was running in slow-motion. Even though the FPS was relatively stable, the physics were all running at about a third of the speed!
This turned out to be a pretty silly bug. WINE fills a registry key with the frequency of the cycle counter, but it first determines if the RDTSC is "reliable". FEX was failing this reliability check which causes WINE to fall back to the maximum clock speed of the CPU. HZD would then use this value for the speed of its animations! A modern CPU can run its CPU at more than 3Ghz, while cycle counters on both ARM and x86 don't go anywhere near as high! We fixed WINE's "reliability" check inside of FEX, which means the registry key is filled correctly and the game now runs its animations at the correct speed.This does mean that WINE technically still has a bug where if RDTSC is ever described as "unreliable" then you can end up with something up to 6Ghz in that registry key, which is incorrect but will reproduce this bug even without emulation playing a role.
As a side-note, ARM64ec WINE still isn't fixed with this so the game will still have weird issues there under emulation. It's getting fixed but will take some time!
Optimize x87 FSINCOS
x87 is this monster of a feature that keeps coming back to bite us in performance because the softfloat requirements. This month we found a game that hammers the FSINCOS instruction and get up to a staggering 70 million soft-float operations per second!
To help mitigate this a bit, we found that we could optimize the FSINCOS soft-float implementation slightly. This gave us a 75% uplift in operations-per-second, from 8,949 per second to 15,676 per second. While this isn't enough to make this game run full speed, it is a significant uplift and there may be further optimizations that can be found in the future.
It should be noted that with this game it is recommended use our x87 "reduced precision" mode to dramatically improve performance. It's a speed hack but it's definitely worth it in this case. At least until ARM gives us 128-bit floats with transcendental instructions to match x87.
Optimize long division, cpuid, and xgetbv
Since 2021 we had an optimization that removed long-division when it was unnecessary, turns out we had accidentally broken this in the last year alongside a similar optimization for cpuid and xgetbv. This optimize for long-division removal previously gave a significant uplift in SuperTuxKart, but we hadn't tested any of the latest AAA games with it in a while. Long-division gets used all over the place because it's the only way x86 supports 64-bit division, so we end up needing to fall back to a software implementation in worst case.
Go and test some games, this could result in some decent performance uplift again now that we fixed it again! Now with unittests so we don't break it again.
Implement enough of ptrace for Ubisoft Connect launcher
This launcher has been a thorn in our side for years. It has some anti-debugger tech built in to it for some reason which always prevented all Ubisoft games from launching. We have now fixed this in FEX and it now works as well as expected. Now various Ubisoft games can be played! Have fun!
A new cross-compilation setup
Developers and testers living on the edge will know this long-standing issue: Parts of FEX require a full x86 cross-toolchain to build, but surprisingly few Linux distributions ship a setup that works out of the box. Setting things up manually is tricky enough that we couldn't even provide the most generic instructions to do so, so we've been searching for a way to make life easier for people. We've finally found a good solution: The package manager Nix!
Once Nix is installed, it now only takes a call to one of FEX's bundled helper scripts to build our x86-native FEXLinuxTests or to enable the library forwarding feature for improved emulation performance. For usage details, head over to the pull request that added the helpers.
A sweet little bonus: This same system allows you to automatically set up the toolchain required for building FEX as a WoW64/ARM64EC emulation module!
Raw changes
FEX Release FEX-2507
Build
- Add nix-based helpers to facilitate cross-compilation ( 492b0fd)
EmulatedFiles
- Emulate
current_clocksource( c6aae9e)External
- Update Tracy submodule to version 0.12.1 ( bf83569)
FEXConfig
- Add icon ( 0072b28)
Fix
- wrong magic value in fpstate. ( 95b4618)
IR
- merge integer division & remainder ( 534b338)
InstructionCountCI
- add 32-bit division cases ( 9d2f557)
JIT
- Optimize x87 FSINCOS ( 640f024)
LibraryForwarding
Linux
- Implement enough of ptrace to allow Ubisoft launcher ( 1f15a4e)
RegisterAllocationPass
- assert an invariant in post-RA prop ( 95ca20c)
Misc
- Fix build with LLVM >= 21 ( 6f089a4)
- Fix Hades ( fa0a54d)
- Optimize some constant cpuid/xgetbv cases ( afbc7da)
- Optimize long division ( afabe7c)
- Remove code generation build step for install prefix ( e685ab8)
- Some static code analysis fixes ( 5f2a72b)
- Update Readme.md ( a5fad89)
- Improve log message formatting ( 3d0c20a)
- Merge moves that are immediately consumed ( 1c38b8b)
- Move data files to Data/ ( c346aca)
unittests
- Adds unittest for no-exec testing ( bb072c0)
