Plugin timing drifting

HornetMaX · November 28, 2014, 10:10:32 PM

@Piboso:

I've already reported this problem in the past (and, years back, random reported it too).
dibu just raised it again in the KRP forum (http://forum.kartracing-pro.com/index.php?topic=4383.msg38081#msg38081).

The timing info passed to the plugin in the RunTelemetry call (_fTime) drifts over time.

Proof:
Each time RunTelemetry is called I log the value of _fTime and the value of the windows accurate timer.
In the graph below you see the difference of two consecutive values of _fTime across 6 laps in GPB (on Victoria, offline):

As you can see, it seems the RunTelemetry is called exactly every 100ms (as it should be).
Below is the time difference between two subsequent calls as measured by the windows timer:

There are some fluctuations, but that's OK. At first sight it seems to confirm that RunTelemetry is called every 100ms (at least on average).

But if I plot the difference between _fTime and the value of the windows timer (during the same call of RunTelemetry) I get this:

That difference should be constant, but it is clearly not: it drifted more than 2 seconds over 9 minutes.
It's not a lot, but it's weird as:

The windows timer did not drift
The GPB lap times do agree with the lap times I recompute using the windows timer (and detecting finish line crossings)
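
A drift of 2 seconds over 9 minutes (540 s) is a rate mismatch of roughly 0.4%, i.e. about 0.4 ms per 100 ms call, which is why the per-call plots look fine while the offset plot does not. Below is a minimal sketch of the comparison described above. The function name and the sample data are hypothetical (the log is synthetic, not real GPB output); in a real plugin the second value would come from a high-resolution timer.

```python
def drift_report(samples):
    """Summarise logged (f_time, wall_time) pairs, both in seconds.

    One pair is assumed to be logged per RunTelemetry call.
    """
    pairs = list(zip(samples, samples[1:]))
    f_steps = [b[0] - a[0] for a, b in pairs]
    wall_steps = [b[1] - a[1] for a, b in pairs]
    # If both clocks ran at the same rate, this offset would stay constant.
    offsets = [wall - f for f, wall in samples]
    return {
        "mean_f_step": sum(f_steps) / len(f_steps),
        "mean_wall_step": sum(wall_steps) / len(wall_steps),
        "offset_drift": offsets[-1] - offsets[0],
    }


# Synthetic log, NOT real GPB data: 9 minutes of calls, 100 ms apart on the
# wall clock, with a simulated _fTime that runs 0.4% fast.
samples = [(0.1004 * i, 100.0 + 0.1 * i) for i in range(5401)]
report = drift_report(samples)
print(report)
```

The two mean steps differ by a fraction of a millisecond, while the offset between the clocks changes by more than 2 seconds over the run.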

If I recompute lap times detecting finish line crossing I get:

accurate lap times if I use the windows timer
wrong lap times (after a few laps) if I use the GPB timer

You can even see the problem in the default GPB telemetry file: use the beacons to compute the lap times from the .csv file and you'll see that they drift away from the true lap times (which proves the issue is not in my plugin).

MaX.

PiBoSo · November 28, 2014, 10:51:16 PM

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

HornetMaX · November 29, 2014, 01:22:06 PM

Quote from: PiBoSo on November 28, 2014, 10:51:16 PM

RunTelemetry is not called at a fixed rate.
But the data reported by it is calculated at a fixed rate.

But in that case, why are the lap times calculated with _fTime wrong (they drift), while the lap times calculated with the windows timer are much better (no drift)?
If what you say is correct, it should be the opposite.

MaX.

PiBoSo · November 29, 2014, 01:45:55 PM

Quote from: HornetMaX on November 29, 2014, 01:22:06 PM

But in that case, why are the lap times calculated with _fTime wrong (they drift), while the lap times calculated with the windows timer are much better (no drift)?
If what you say is correct, it should be the opposite.

Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.
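
To get a feeling for the size of this effect, here is a small simulation. It assumes (this is not confirmed in the thread) that _fTime is a single-precision float advanced by a fixed 2 ms step, 500 times per second. Python has no native 32-bit float, so the helper to_f32 (a name made up for this sketch) rounds through the struct module after every addition. Note what happens once the total passes 256 s: neighbouring single-precision values are then 2^-15 s (about 30.5 us) apart, so a 2 ms step is 65.536 spacings and every addition gets rounded to 66 spacings, which makes the clock run about 0.7% fast from that point on.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate_f32(dt, steps):
    """Add dt to a running total `steps` times, rounding to float32 each time."""
    dt32 = to_f32(dt)
    total = 0.0
    for _ in range(steps):
        total = to_f32(total + dt32)
    return total


STEPS = 500 * 60 * 9            # 500 sums per second for 9 minutes
naive = accumulate_f32(0.002, STEPS)
drift = naive - STEPS * 0.002   # accumulated clock minus the exact 540 s
print(f"float32 total: {naive:.4f} s, drift: {drift:+.4f} s")
```

Under these assumptions the accumulated error is far larger than a few rounding errors would suggest, because the error of each addition has the same sign for long stretches instead of averaging out.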

HornetMaX · November 29, 2014, 11:07:18 PM

Quote from: PiBoSo on November 29, 2014, 01:45:55 PM

Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.

OK, that makes sense.

But then why not use windows' QueryPerformanceCounter to compute the value of _fTime passed to the plugins?
It's way more accurate than the current _fTime ...

Or maybe just switching to double (instead of float) would improve the situation enough?
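
Both ideas can be sketched quickly. Python floats are doubles, so the first function shows what the same 500 sums per second look like in double precision (the 2 ms step is again an assumption). The second part derives elapsed time from a monotonic counter instead of summing steps; on Windows, Python's time.perf_counter_ns is backed by QueryPerformanceCounter. The class name is hypothetical, and a wall-clock reading measures real time, which is not necessarily the same thing as simulation time.

```python
import time


def accumulate_double(dt, steps):
    """Add dt to a running total `steps` times in double precision."""
    total = 0.0
    for _ in range(steps):
        total += dt
    return total


class WallClock:
    """Elapsed real time taken from a monotonic counter, never accumulated."""

    def __init__(self):
        self._start_ns = time.perf_counter_ns()

    def elapsed(self):
        # One subtraction and one division: no error builds up over time.
        return (time.perf_counter_ns() - self._start_ns) / 1e9


STEPS = 500 * 60 * 9                      # 9 minutes at 500 sums per second
total = accumulate_double(0.002, STEPS)
print(f"double total: {total!r}, error: {total - 540.0:+.3e} s")

clock = WallClock()
print(f"wall clock elapsed: {clock.elapsed():.6f} s")
```

In double precision the rounding error of each addition is so small that the total stays within a tiny fraction of a millisecond of 540 s over this run.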

MaX.

HornetMaX · May 19, 2016, 02:24:11 PM

Quote from: PiBoSo on November 29, 2014, 01:45:55 PM

Floating point rounding errors, most likely.
_fTime is the result of 500 floating point sums each second.

Sorry to come back to this, but it really annoys me (in principle and in practice).

I stumbled on this: https://en.wikipedia.org/wiki/Kahan_summation_algorithm

Any chance it could help to minimize the problem we see in GPB?
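
For reference, a minimal sketch of Kahan (compensated) summation applied to a timer. It keeps the same assumptions as before, which are not confirmed for GPB: a single-precision timer and a fixed 2 ms step. The helper to_f32 is a made-up name that emulates 32-bit floats by rounding through the struct module after every operation.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def kahan_accumulate_f32(dt, steps):
    """Kahan summation of `steps` copies of dt, every operation in float32."""
    dt32 = to_f32(dt)
    total = 0.0
    comp = 0.0                                  # running compensation term
    for _ in range(steps):
        y = to_f32(dt32 - comp)                 # step corrected by the last error
        t = to_f32(total + y)                   # low-order bits of y are lost here
        comp = to_f32(to_f32(t - total) - y)    # recover what was just lost
        total = t
    return total


STEPS = 500 * 60 * 9                            # 9 minutes at 500 sums per second
kahan = kahan_accumulate_f32(0.002, STEPS)
print(f"Kahan float32 total: {kahan:.6f} s, error: {kahan - 540.0:+.6f} s")
```

The compensation variable costs one extra float and three extra operations per step, and it keeps the total within about one float32 spacing of the exact sum instead of letting the per-step error pile up.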

PiBoSo · May 19, 2016, 09:37:20 PM

Quote from: HornetMaX on May 19, 2016, 02:24:11 PM

Sorry to come back to this, but it really annoys me (in principle and in practice).

I stumbled on this: https://en.wikipedia.org/wiki/Kahan_summation_algorithm

Any chance it could help to minimize the problem we see in GPB?

Interesting algorithm.
However, for the vehicle timer it would probably be better to switch to integers. Mumble mumble...

HornetMaX · May 19, 2016, 10:49:23 PM (Last Edit: May 23, 2016, 10:45:22 AM by HornetMaX)

Quote from: PiBoSo on May 19, 2016, 09:37:20 PM
Interesting algorithm.
However, for the vehicle timer it would probably be better to switch to integers. Mumble mumble...

I was thinking the same (e.g. using milliseconds), but if your ODE integration has a non-constant step (or a constant step that is not an exact multiple of 1 ms), you'll have the same issue: if you convert the step to ms before adding it up, you lose a fraction of 1 ms on every addition. So it seems to me that, whether you keep the cumulative time as a float or as an integer, you'll have to deal with the rounding properly (i.e. as in the algorithm above).

As an alternative you could use integers with a finer granularity (e.g. 100 us, 10 us or 1 us) and do the math to compute how long it takes for that to cause a 1 ms error in the cumulative time.
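
The "do the math" part can be sketched with exact fractions. The function name is hypothetical and the 1/360 s step is purely illustrative (the thread only mentions 500 sums per second, i.e. a 2 ms step, which is an exact number of milliseconds).

```python
import math
from fractions import Fraction


def steps_until_error(step_s, tick_s, max_error_s):
    """Steps needed before truncating every step to whole ticks loses max_error_s.

    All arguments are exact Fractions, in seconds. Returns None when the step
    is an exact multiple of the tick, so nothing is ever lost.
    """
    step_ticks = step_s / tick_s                       # may be fractional
    lost_per_step = (step_ticks - math.floor(step_ticks)) * tick_s
    if lost_per_step == 0:
        return None
    return math.ceil(max_error_s / lost_per_step)


ONE_MS = Fraction(1, 1000)
ONE_US = Fraction(1, 10**6)

# A 2 ms step counted in whole milliseconds: exact, no drift at all.
exact_case = steps_until_error(Fraction(1, 500), ONE_MS, ONE_MS)

# An illustrative 1/360 s step counted in whole microseconds.
steps = steps_until_error(Fraction(1, 360), ONE_US, ONE_MS)
seconds = float(steps * Fraction(1, 360))
print(exact_case, steps, round(seconds, 2))
```

In the second case each step is 2777.78 us but only 2777 us are counted, so 7/9 us are lost per step and a 1 ms error builds up after 1286 steps, i.e. in under 4 seconds of simulated time. A finer tick only postpones the problem; carrying the remainder from one step to the next, or using a step that is a whole number of ticks, removes it.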

I'd just do a quick test keeping everything as is today but plugging the "smart sum" algorithm in your code: at least it could verify if the problem we see indeed comes from the sum or not.

HornetMaX · July 22, 2016, 07:13:55 AM

Stumbled on something very close to what has been discussed here: https://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/
Worth reading, I guess. On top of the issue I see with _fTime in the plugin interface, it could also explain some server-side issues when servers run for a long period of time (I know that some here have adopted the good practice of restarting their server every 24h to avoid trouble).

The entire series of posts (on float stuff) is listed in this other post: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
