
Plugin timing drifting

Started by HornetMaX, November 28, 2014, 10:10:32 pm


HornetMaX

@Piboso:

I've already reported this problem in the past (and, years back, random reported it too).
dibu just raised it again in KRP forum (http://forum.kartracing-pro.com/index.php?topic=4383.msg38081#msg38081).

The timing info passed to the plugin in the RunTelemetry call (_fTime) drifts over time.

Proof:
Each time RunTelemetry is called I log the value of _fTime and the value of the windows accurate timer.
In the graph below you see the difference of two consecutive values of _fTime across 6 laps in GPB (on Victoria, offline):



As you can see, it seems that RunTelemetry is called exactly every 100 ms (as it should be).
Below is the time difference between two consecutive calls as measured by the windows timer:



There are some fluctuations, but that's OK. At first sight this seems to confirm that RunTelemetry is called every 100 ms (at least on average).

But if I plot the difference between _fTime and the value of the windows timer (during the same call of RunTelemetry) I get this:



That difference should be constant, but it is clearly not: it drifted more than 2 seconds over 9 minutes.
It's not a lot, but it's weird as:

  • The windows timer did not drift

  • The GPB lap times do agree with the lap times I recompute using the windows timer (and detecting finish line crossings)
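
A minimal sketch of the drift check described above, in Python rather than in the plugin's own language. The names `DriftLogger`, `on_telemetry` and `drift` are hypothetical, and `time.perf_counter` stands in for the windows accurate timer; it only illustrates the bookkeeping, not the actual plugin code.

```python
import time


class DriftLogger:
    """Records (simulation time, wall-clock time) pairs, one per telemetry call."""

    def __init__(self, clock=time.perf_counter):
        # `clock` is any monotonic high-resolution timer; perf_counter is an
        # assumed stand-in for the windows timer used in the post.
        self._clock = clock
        self.samples = []

    def on_telemetry(self, sim_time):
        """Call once per telemetry callback with the time value it received."""
        self.samples.append((sim_time, self._clock()))

    def drift(self):
        """Offset (sim - wall) of every sample, relative to the first sample.

        If both clocks run at the same rate, every entry stays close to zero.
        """
        if not self.samples:
            return []
        sim0, wall0 = self.samples[0]
        return [(sim - sim0) - (wall - wall0) for sim, wall in self.samples]
```

Plotting the list returned by `drift()` against the sample index gives the third kind of graph: a flat line when the two clocks agree, a slope when one of them drifts.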



If I recompute lap times detecting finish line crossing I get:

  • accurate lap times if I use the windows timer

  • wrong lap times (after a few laps) if I use the GPB timer



You can even see the problem in the default GPB telemetry file: use the beacons to compute the lap times from the .csv file and you'll see that they drift away from the true lap times (which proves the issue is not in my plugin).

MaX.

PiBoSo


RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.
"Obviously your ambition outweighs your talent".

HornetMaX

Quote from: PiBoSo on November 28, 2014, 10:51:16 pm

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why are the lap times calculated with _fTime wrong (they drift), while the lap times calculated with the windows timer are much better (no drift)?
If what you say is correct, it should be the opposite.

MaX.

PiBoSo

Quote from: HornetMaX on November 29, 2014, 01:22:06 pm
Quote from: PiBoSo on November 28, 2014, 10:51:16 pm

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why the lap times calculated with the _fTime are wrong (they drift) and the lap times calculated with the windows timer are much better (no drift) ?
If what you say is correct, it should be the opposite.

MaX.


Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.
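
A rough model of that explanation, as a Python sketch. It assumes a fixed step of 1/500 s (inferred from "500 sums each second", not confirmed) and emulates a single-precision accumulator by rounding to binary32 after every addition; the helper names are hypothetical and this is not the game's actual code.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(step, n_steps, single_precision):
    """Add `step` to a running total `n_steps` times.

    With single_precision=True the step and every partial sum are rounded
    to binary32, mimicking a C `float` accumulator.
    """
    if single_precision:
        step = to_f32(step)
    total = 0.0
    for _ in range(n_steps):
        total += step
        if single_precision:
            total = to_f32(total)
    return total


STEPS_PER_SECOND = 500            # figure quoted in the thread
STEP = 1.0 / STEPS_PER_SECOND     # assumption: constant step of 2 ms
DURATION_S = 9 * 60               # nine minutes, as in the first post
N_STEPS = STEPS_PER_SECOND * DURATION_S

drift32 = accumulate(STEP, N_STEPS, single_precision=True) - DURATION_S
drift64 = accumulate(STEP, N_STEPS, single_precision=False) - DURATION_S
print(f"float32 accumulator drift: {drift32:+.6f} s")
print(f"float64 accumulator drift: {drift64:+.3e} s")
```

Under these assumptions the single-precision total should end up off by an amount on the order of seconds, while the double-precision total stays accurate to well below a microsecond. The reason is that once the total passes 256 s, a 2 ms step is no longer a whole number of float32 units in the last place, so every addition is rounded in the same direction.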
"Obviously your ambition outweighs your talent".

HornetMaX

Quote from: PiBoSo on November 29, 2014, 01:45:55 pm
Quote from: HornetMaX on November 29, 2014, 01:22:06 pm
Quote from: PiBoSo on November 28, 2014, 10:51:16 pm

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why the lap times calculated with the _fTime are wrong (they drift) and the lap times calculated with the windows timer are much better (no drift) ?
If what you say is correct, it should be the opposite.

MaX.


Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.

OK, that makes sense.

But then why not use windows' QueryPerformanceCounter to compute the value of _fTime passed to the plugins?
It's far more accurate than the current _fTime ...

Or maybe just switching to double (instead of float) would improve the situation enough?

MaX.

HornetMaX

Quote from: PiBoSo on November 29, 2014, 01:45:55 pm
Quote from: HornetMaX on November 29, 2014, 01:22:06 pm
Quote from: PiBoSo on November 28, 2014, 10:51:16 pm

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why the lap times calculated with the _fTime are wrong (they drift) and the lap times calculated with the windows timer are much better (no drift) ?
If what you say is correct, it should be the opposite.

MaX.


Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.

Sorry to come back to this, but it really annoys me (in principle and in practice) :)

I stumbled on this: https://en.wikipedia.org/wiki/Kahan_summation_algorithm

Any chance it could help to minimize the problem we see in GPB ?
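
For illustration, a minimal Python sketch of Kahan (compensated) summation applied to the same kind of accumulator. It again assumes a constant 1/500 s step and emulates single precision by rounding every intermediate result to binary32; `to_f32` and `kahan_f32` are hypothetical helper names, not anything from GPB.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def kahan_f32(step, n_steps):
    """Sum `step` n_steps times in emulated float32 with Kahan compensation."""
    step = to_f32(step)
    total = 0.0   # running sum (float32)
    comp = 0.0    # running compensation: low-order bits lost so far (float32)
    for _ in range(n_steps):
        y = to_f32(step - comp)               # re-inject what was lost earlier
        t = to_f32(total + y)                 # low-order bits of y are lost here
        comp = to_f32(to_f32(t - total) - y)  # recover what was actually lost
        total = t
    return total


N_STEPS = 500 * 9 * 60   # assumption: 500 steps per second for nine minutes
error = kahan_f32(1.0 / 500, N_STEPS) - 9 * 60
print(f"Kahan float32 accumulator error: {error:+.3e} s")
```

The cost is one extra float and three extra additions per step. The result can still only be as precise as a float32 value near 540 s allows (a few tens of microseconds), but the rounding error no longer builds up from step to step.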

PiBoSo

Quote from: HornetMaX on May 19, 2016, 02:24:11 pm
Quote from: PiBoSo on November 29, 2014, 01:45:55 pm
Quote from: HornetMaX on November 29, 2014, 01:22:06 pm
Quote from: PiBoSo on November 28, 2014, 10:51:16 pm

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why the lap times calculated with the _fTime are wrong (they drift) and the lap times calculated with the windows timer are much better (no drift) ?
If what you say is correct, it should be the opposite.

MaX.


Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.

Sorry to come back to this, but it really annoys me (in principle and in practice) :)

I stumbled on this: https://en.wikipedia.org/wiki/Kahan_summation_algorithm

Any chance it could help to minimize the problem we see in GPB ?


Interesting algorithm.
However, for the vehicle timer it would probably be better to switch to integers. Mumble mumble...
"Obviously your ambition outweighs your talent".

HornetMaX

May 19, 2016, 10:49:23 pm #7 Last Edit: May 23, 2016, 10:45:22 am by HornetMaX
Quote from: PiBoSo on May 19, 2016, 09:37:20 pm
Interesting algorithm.
However, for the vehicle timer it would probably be better to switch to integers. Mumble mumble...

I was thinking the same (like using milliseconds), but if your ODE integration has a non-constant step (or a constant step that is not an exact multiple of 1 ms) you'll have the same issue: if you convert your step to ms before adding up, you will lose a fraction of 1 ms each time you add. So it seems to me that, whether you keep your cumulative time as a float or as an integer, you'll have to deal with the rounding properly (i.e. as in the algorithm above).

As an alternative you could use integers with a finer granularity (e.g. 100 us, 10 us or 1 us) and do the math to compute how long it takes for that to cause a 1 ms error in the cumulative time.

I'd just do a quick test keeping everything as it is today but plugging the "smart sum" algorithm into your code: at the very least it would show whether the problem we see really comes from the sum or not.
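
That math can be sketched as follows. This is only an illustration: it assumes the step is truncated to whole ticks on every addition and that no remainder is carried over, and the function name and the 1/360 s step used in the example are hypothetical.

```python
import math
from fractions import Fraction
from typing import Optional


def steps_until_error(step: Fraction, tick: Fraction, limit: Fraction) -> Optional[int]:
    """Steps needed before a truncating integer timer is off by `limit` seconds.

    step  : true integration step, in seconds
    tick  : granularity of the integer timer, in seconds
    limit : accumulated error of interest, in seconds
    Returns None when the step is an exact multiple of the tick (no error).
    """
    ticks_per_step = step / tick
    whole_ticks = ticks_per_step.numerator // ticks_per_step.denominator
    lost_per_step = (ticks_per_step - whole_ticks) * tick   # seconds dropped per add
    if lost_per_step == 0:
        return None
    return math.ceil(limit / lost_per_step)


# Hypothetical 360 Hz step, which is not a whole number of microseconds.
one_ms = Fraction(1, 1000)
for tick_us in (1000, 100, 10, 1):
    n = steps_until_error(Fraction(1, 360), Fraction(tick_us, 10**6), one_ms)
    print(f"tick = {tick_us:>4} us -> 1 ms of error after {n} steps")
```

With a 1/500 s step every granularity listed above is exact, so the function returns None. With the hypothetical 1/360 s step and a 1 us tick, the timer is already 1 ms off after 1286 steps, i.e. after about 3.6 seconds, so a finer granularity alone does not help unless the dropped remainder is carried into the next addition.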

HornetMaX

I stumbled on something very close to what has been discussed here: https://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/
Worth reading, I guess. On top of the issue I see with _fTime in the plugin interface, it could also explain some server-side issues when servers run for a long period of time (I know that some people here have adopted the good practice of restarting their server every 24h to avoid trouble).

The entire series of posts (on float stuff) is listed in this other post: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/