Hi there,
TL;DR
Something broke with the new OSG pager threads when running with --prop:/sim/rendering/database-pager/threads greater than 1.
The offending commits are probably the ones related to OSG paging and scenery instancing in combination with threading:
flightgear: ab1828365 (Enable /sim/rendering/database-pager/threads, 2024-02-24)
simgear: 6a2a33bc (Implement instancing for scenery objects, 2024-02-24)
Background
In 2020 I added code to the C182 to have a custom registration painted onto a model.
When I wrote that addition in 2020, it worked [1].
Also, when implementing the same thing in November 2023 for the C172 and DA40NG, I did not see this issue [2].
Problem description
For some time now, adjusting the registration canvas on remote planes has been broken in a very curious way (on fgfs 2020.4):
For the first instance, the registration is not shown.
For the other instances, all works fine.
Details
The interesting thing is that the canvas properties always show a "match": they report that the canvas does indeed match the selected model.
The actual content of the canvas also shows fine if I additionally draw it to a canvas window (screenshot in the github ticket linked below).
I did a thorough analysis and bisect and documented the results and tests in my github ticket:
https://github.com/HHS81/c182s/issues/584#issuecomment-2063950321
Workaround
When disconnecting and reconnecting the first instance to MP, it starts to work again, and the first instance sees the correct registration on the remote model. It does not seem to matter whether this is a local LAN test or goes via MP servers.
Setting "--prop:/sim/rendering/database-pager/threads=1" with a recent, cleanly compiled next fixes the issue.
Hi Benedikt,
Thanks for the detailed analysis you did. I think I've tracked this down, but I'm not 100% sure. Could you test the attached patch with multiple OSG threads and see if it resolves the problem for you?
The background is that the Nasal loader isn't thread-safe, and I think it can end up overwriting itself.
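To illustrate the kind of race suspected here, a minimal C++ sketch follows. It is not the actual SimGear/FlightGear Nasal loader: `ModuleLoader`, `load` and `load_concurrently` are hypothetical names, and a plain map stands in for whatever shared state the real loader keeps. The point is only that several pager threads writing to one shared structure need a lock (or a single loading thread) so that entries cannot clobber each other mid-update.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a script loader with shared state.
// This is an illustration only, not SimGear's API.
class ModuleLoader {
public:
    // Registers a module under `name`. The lock makes the map update
    // atomic with respect to other loader threads.
    void load(const std::string& name, const std::string& source) {
        std::lock_guard<std::mutex> lock(mutex_);
        modules_[name] = source;
    }

    // Number of modules registered so far.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return modules_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::string> modules_;
};

// Simulates `threads` pager threads, each loading `per_thread`
// distinctly named models into one shared loader.
std::size_t load_concurrently(int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    ModuleLoader loader;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&loader, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                loader.load("model-" + std::to_string(t) + "-" +
                                std::to_string(i),
                            "script");
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return loader.size();
}
```

With the lock in place, `load_concurrently(4, 100)` returns 400. Removing the `std::lock_guard` would turn the concurrent map writes into a data race, which is the sort of thing a ThreadSanitizer build reports.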
Hi Stuart,
thanks for investigating.
Recompiled next with the patch applied as of:
Results:
Hi Benedikt,
I recently pushed a number of fixes for multithreading that you might want to try, but I'm not optimistic that they will have fixed the problem as none of them are related to Nasal or Canvas. :(
However, I also pushed fixes to cmake to enable ThreadSanitizer (aka TSAN) which may make it easier to diagnose.
If you have the time and inclination, could you create a build with ENABLE_TSAN=ON (you'll need to build both simgear and flightgear), repro the problem, and then send me the logs? There should be lots of information about thread race conditions reported to STDERR.
You might find this page useful: https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags
In particular, you can set log_path to something other than STDERR.
Alternatively, could you send me a more detailed repro scenario - I've tried reproing this with MP loopback on the c182s but the registration looks OK to me.
Thanks,
-Stuart
Hey Stuart,
detailed reproduction steps and also a very detailed analysis are in my github ticket: https://github.com/HHS81/c182s/issues/584#issuecomment-2063950321
Basically, the problem occurs when I launch two instances connecting through real multiplayer (so really two dedicated instances, but it is OK to have them on one machine). It also appeared in a real MP environment.
I'll try the TSAN thing.
Turns out I can't. It overloads my system and the process runs out of memory (OOM).
Maybe you have more luck?
But without TSAN, it looks somewhat better now as of:
With:
--prop:/sim/rendering/database-pager/threads=<p>
BUT, even with p=2, when disconnecting and reconnecting, the rejoined plane does not show the canvas in the remaining instance. With p=1 this is not the case; there, reconnecting works.
So to summarize: the case where both instances see a remote plane for the first time is much better now.
But rejoining reliably triggers the issue, while with p=1 all is fine, always.