There is a new experimental async regex mode that Dave added, which helps with stuttering when shaders are converted and cached. It is on by default, but we don't have enough experience yet to say it's always better. If you want to experiment, you can set the .ini vars and try it out. Please let us know how it goes, good or bad, to help inform how we post this. It is probably best to add these under the [Device] section of d3dxdm.ini.
As noted, the default is now shader_regex_patch_mode = 4, so if you have problems with this update, add shader_regex_patch_mode = 0 under the [Device] section to go back to the previous behaviour.
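For reference, a minimal sketch of that fallback in d3dxdm.ini. This assumes your file already has a [Device] section, in which case just add the one setting line to it rather than a second section header. The comment line is only annotation:

```ini
[Device]
; Revert to the previous default: synchronous pcre2 regex applied on first draw
shader_regex_patch_mode = 0
```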
New asynchronous shader_regex_patch_mode values ([Device] variable): The async modes avoid the insane stutter/lockups when using shader regex; instead, we get delayed shader fixes the first time a shader is encountered. (This means you will now get graphics/stereo glitches instead of stutter.)
- shader_regex_patch_mode = 0; //synchronous pcre2 regex applied on first draw (this was the existing default value)
- shader_regex_patch_mode = 1; //synchronous pcre2 regex applied on shader creation (this was an existing alternative)
- shader_regex_patch_mode = 2; //pcre2 regex will be applied at the first draw on a background thread and the patched shader picked up once cached to disk at some future draw
- shader_regex_patch_mode = 3; //pcre2 regex will be applied at shader creation on a background thread and the patched shader picked up once cached to disk at some future draw
- shader_regex_patch_mode = 4; //chimera (falling back to pcre2 for certain patterns) regex will be applied at first draw on a background thread and the patched shader picked up once cached to disk at some future draw (this is the new default value)
- shader_regex_patch_mode = 5; //chimera (falling back to pcre2 for certain patterns) regex will be applied at shader creation on a background thread and the patched shader picked up once cached to disk at some future draw
I haven't properly profiled the difference, but just using a timer and measuring how long it takes the new chimera modes to resolve all shaders in a scene, it's roughly 5x faster than pcre2.
For the async modes I have created a new [Device] ini variable, shader_regex_thread_pool_num, which controls the number of threads we use in the background. shader_regex_thread_pool_num doesn't seem to make much difference in the chimera modes, but I have found that a higher number of threads will speed up shader resolution in the pcre2 modes at the cost of increased stutter.
- shader_regex_thread_pool_num = -1 (the thread_pool object is from the boost library, and setting it to -1 will let boost pick the number of threads.)
- shader_regex_thread_pool_num = 0 (geo-11 will decide the number of threads in the thread_pool based on cpu count.)
- shader_regex_thread_pool_num = n (Any other value will override the thread count with n.)
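As a rough sketch, this is how the two variables might be combined for one of the pcre2 async modes, where the thread count seems to matter most. The value 4 is purely illustrative, not a recommendation:

```ini
[Device]
; pcre2 regex applied at first draw on a background thread
shader_regex_patch_mode = 2
; Explicit thread count (illustrative value only).
; 0 lets geo-11 decide based on cpu count, -1 lets boost pick.
shader_regex_thread_pool_num = 4
```

Going by the observations above, a higher number should resolve shaders sooner but with more stutter, so it is probably worth trying a couple of values per game.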
shader_regex_patch_mode = 5 will be the best option for most games (broken shaders are resolved more quickly), but there are currently issues with games that load the entire shader set on launch (e.g. Batman Arkham Knight), where unacceptable launch delays can occur in mode 5. I've therefore made mode 4 the default.
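If your game does not load its whole shader set at launch, trying mode 5 would look roughly like this (again assuming an existing [Device] section):

```ini
[Device]
; chimera regex applied at shader creation on a background thread
shader_regex_patch_mode = 5
```

If launch times become unacceptable, set it back to 4, or remove the line so the default applies.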
I've briefly looked into the Batman Arkham Knight launch issue with mode 5, and I think it's basically saturating the external hard drive I'm loading the game from (as it's caching so much shader regex data), so the fact that we are using background threads doesn't help. I've thought of a way around this but, at this point, I think I will come back to it at a later date, as there are probably higher priorities now given that the new async modes largely sort the issue.