2015-05-11 02:35:14

by Tina Ruchandani

[permalink] [raw]
Subject: [PATCH] aoe: Use 64-bit timestamp in frame

'struct frame' uses two variables to store the sent timestamp - 'struct
timeval' and jiffies. jiffies is used to avoid discrepancies caused by
updates to system time. 'struct timeval' uses 32-bit representation for
seconds which will overflow in year 2038.
This patch does the following:
- Replace the use of 'struct timeval' and jiffies with ktime_t, which
is a 64-bit timestamp and is year 2038 safe.
- ktime_t provides both long range (like jiffies) and high resolution
(like timeval). Using ktime_get (monotonic time) instead of wall-clock
time prevents any discrepancies caused by updates to system time.

Signed-off-by: Tina Ruchandani <[email protected]>
---
drivers/block/aoe/aoe.h | 3 +--
drivers/block/aoe/aoecmd.c | 36 +++++++-----------------------------
2 files changed, 8 insertions(+), 31 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 9220f8e..4582b3c 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -112,8 +112,7 @@ enum frame_flags {
struct frame {
struct list_head head;
u32 tag;
- struct timeval sent; /* high-res time packet was sent */
- u32 sent_jiffs; /* low-res jiffies-based sent time */
+ ktime_t sent;
ulong waited;
ulong waited_total;
struct aoetgt *t; /* parent target I belong to */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 422b7d8..7f78780 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -398,8 +398,7 @@ aoecmd_ata_rw(struct aoedev *d)

skb = skb_clone(f->skb, GFP_ATOMIC);
if (skb) {
- do_gettimeofday(&f->sent);
- f->sent_jiffs = (u32) jiffies;
+ f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -489,8 +488,7 @@ resend(struct aoedev *d, struct frame *f)
skb = skb_clone(skb, GFP_ATOMIC);
if (skb == NULL)
return;
- do_gettimeofday(&f->sent);
- f->sent_jiffs = (u32) jiffies;
+ f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -499,32 +497,15 @@ resend(struct aoedev *d, struct frame *f)
static int
tsince_hr(struct frame *f)
{
- struct timeval now;
+ ktime_t now;
int n;

- do_gettimeofday(&now);
- n = now.tv_usec - f->sent.tv_usec;
- n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
+ now = ktime_get();
+ n = ktime_to_us(ktime_sub(now, f->sent));

if (n < 0)
n = -n;

- /* For relatively long periods, use jiffies to avoid
- * discrepancies caused by updates to the system time.
- *
- * On system with HZ of 1000, 32-bits is over 49 days
- * worth of jiffies, or over 71 minutes worth of usecs.
- *
- * Jiffies overflow is handled by subtraction of unsigned ints:
- * (gdb) print (unsigned) 2 - (unsigned) 0xfffffffe
- * $3 = 4
- * (gdb)
- */
- if (n > USEC_PER_SEC / 4) {
- n = ((u32) jiffies) - f->sent_jiffs;
- n *= USEC_PER_SEC / HZ;
- }
-
return n;
}

@@ -589,7 +570,6 @@ reassign_frame(struct frame *f)
nf->waited = 0;
nf->waited_total = f->waited_total;
nf->sent = f->sent;
- nf->sent_jiffs = f->sent_jiffs;
f->skb = skb;

return nf;
@@ -633,8 +613,7 @@ probe(struct aoetgt *t)

skb = skb_clone(f->skb, GFP_ATOMIC);
if (skb) {
- do_gettimeofday(&f->sent);
- f->sent_jiffs = (u32) jiffies;
+ f->sent = ktime_get();
__skb_queue_head_init(&queue);
__skb_queue_tail(&queue, skb);
aoenet_xmit(&queue);
@@ -1474,8 +1453,7 @@ aoecmd_ata_id(struct aoedev *d)

skb = skb_clone(skb, GFP_ATOMIC);
if (skb) {
- do_gettimeofday(&f->sent);
- f->sent_jiffs = (u32) jiffies;
+ f->sent = ktime_get();
}

return skb;
--
2.2.0.rc0.207.ga3a616c


2015-05-11 15:38:15

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

On Monday 11 May 2015 08:05:05 Tina Ruchandani wrote:
> 'struct frame' uses two variables to store the sent timestamp - 'struct
> timeval' and jiffies. jiffies is used to avoid discrepancies caused by
> updates to system time. 'struct timeval' uses 32-bit representation for
> seconds which will overflow in year 2038.
> This patch does the following:
> - Replace the use of 'struct timeval' and jiffies with ktime_t, which
> is a 64-bit timestamp and is year 2038 safe.
> - ktime_t provides both long range (like jiffies) and high resolution
> (like timeval). Using ktime_get (monotonic time) instead of wall-clock
> time prevents any discrepancies caused by updates to system time.
>
> Signed-off-by: Tina Ruchandani <[email protected]>

Very nice!

> @@ -499,32 +497,15 @@ resend(struct aoedev *d, struct frame *f)
> static int
> tsince_hr(struct frame *f)
> {
> - struct timeval now;
> + ktime_t now;
> int n;
>
> - do_gettimeofday(&now);
> - n = now.tv_usec - f->sent.tv_usec;
> - n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
> + now = ktime_get();
> + n = ktime_to_us(ktime_sub(now, f->sent));
>

I would cut four extra lines by writing this as

return ktime_us_delta(ktime_get(), f->sent);

but the effect is exactly the same.

With that change, please add

Reviewed-by: Arnd Bergmann <[email protected]>

Arnd

2015-05-11 16:06:01

by Ed Cashin

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

I would like to see some performance measurements for this patch on a system with fast storage and multiple 10 GbE links.

If not, at least a good analysis of the expected performance impact the patch will have on major architectures.

Tonight I will think about whether the 2038 thing even matters or whether we just need a comment explaining why it's safe.

On May 11, 2015 11:38 AM, Arnd Bergmann <[email protected]> wrote:
>
> On Monday 11 May 2015 08:05:05 Tina Ruchandani wrote:
> > 'struct frame' uses two variables to store the sent timestamp - 'struct
> > timeval' and jiffies. jiffies is used to avoid discrepancies caused by
> > updates to system time. 'struct timeval' uses 32-bit representation for
> > seconds which will overflow in year 2038.
> > This patch does the following:
> > - Replace the use of 'struct timeval' and jiffies with ktime_t, which
> > is a 64-bit timestamp and is year 2038 safe.
> > - ktime_t provides both long range (like jiffies) and high resolution
> > (like timeval). Using ktime_get (monotonic time) instead of wall-clock
> > time prevents any discrepancies caused by updates to system time.
> >
> > Signed-off-by: Tina Ruchandani <[email protected]>
>
> Very nice!
>
> > @@ -499,32 +497,15 @@ resend(struct aoedev *d, struct frame *f)
> >  static int
> >  tsince_hr(struct frame *f)
> >  {
> > - struct timeval now;
> > + ktime_t now;
> >  int n;
> > 
> > - do_gettimeofday(&now);
> > - n = now.tv_usec - f->sent.tv_usec;
> > - n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
> > + now = ktime_get();
> > + n = ktime_to_us(ktime_sub(now, f->sent));
> > 
>
> I would cut four extra lines by writing this as
>
> return ktime_us_delta(ktime_get(), f->sent);
>
> but the effect is exactly the same.
>
> With that change, please add
>
> Reviewed-by: Arnd Bergmann <[email protected]>
>
> Arnd

2015-05-12 02:12:06

by Ed Cashin

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

First, thanks for the patch. I do appreciate the attempt to simplify
this part of the driver, but I don't think that this patch is good to merge.

I'll make some comments inline below.

On 05/10/2015 10:35 PM, Tina Ruchandani wrote:
> 'struct frame' uses two variables to store the sent timestamp - 'struct
> timeval' and jiffies. jiffies is used to avoid discrepancies caused by
> updates to system time. 'struct timeval' uses 32-bit representation for
> seconds which will overflow in year 2038.

The comment in the deleted lines below mentions the fact that the
overflow does not matter for calculating rough-grained deltas in time.
So there is no problem in 2038 or on systems with the clock set to 2038
accidentally.
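
As a rough userspace illustration of the wraparound point (a sketch, not
driver code): unsigned 32-bit subtraction is modulo 2^32, so a delta
between two tick counts stays correct across one wrap. The helper names
ticks_since() and ticks_to_usecs() are made up for this example, and the
tick rate is passed as a parameter instead of using the kernel's HZ.

```c
#include <stdint.h>

/*
 * Elapsed ticks between two 32-bit tick counters. Unsigned subtraction
 * wraps modulo 2^32, so the result is correct even if the counter
 * overflowed once between 'then' and 'now'.
 */
static uint32_t ticks_since(uint32_t now, uint32_t then)
{
	return now - then;
}

/*
 * Convert ticks to microseconds for a tick rate of 'hz' ticks per second.
 * (Hypothetical helper; assumes hz divides 1000000 evenly enough for a
 * rough-grained value, as with hz = 100, 250 or 1000.)
 */
static uint32_t ticks_to_usecs(uint32_t ticks, uint32_t hz)
{
	return ticks * (1000000u / hz);
}
```

With the values from the gdb session in the deleted comment,
ticks_since(2, 0xfffffffe) gives 4, which is 4000 usecs at 1000 ticks
per second.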

> This patch does the following:
> - Replace the use of 'struct timeval' and jiffies with ktime_t, which
> is a 64-bit timestamp and is year 2038 safe.
> - ktime_t provides both long range (like jiffies) and high resolution
> (like timeval). Using ktime_get (monotonic time) instead of wall-clock
> time prevents any discrepancies caused by updates to system time.

But the patch only changes the struct frame data. The aoe driver
only has the struct frame for an incoming AoE response when that
response is "expected". If the response comes in a bit late, the frame
may have already been used for a new command.

You can see that in aoecmd_ata_rsp when getframe_deferred returns
NULL and tsince is called instead of tsince_hr.

In that case, there is still information about the timing embedded in
the AoE tag. The send time in jiffies is a rough-grained record of the
send time, and it's extracted from the tag. For these "unexpected"
responses, this timing information can improve performance significantly
without introducing extra overhead or risk.

I don't think the patch considers this aspect of the way the round trip
time is calculated, and I don't think the primary motivation is justified
(if that's 2038 safety, which we have already).

Simplifying it would be nice, but it would be difficult to thoroughly test
all of the performance implications. There are still people using 32-bit
systems, for example.

>
> Signed-off-by: Tina Ruchandani <[email protected]>
> ---
> drivers/block/aoe/aoe.h | 3 +--
> drivers/block/aoe/aoecmd.c | 36 +++++++-----------------------------
> 2 files changed, 8 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
> index 9220f8e..4582b3c 100644
> --- a/drivers/block/aoe/aoe.h
> +++ b/drivers/block/aoe/aoe.h
> @@ -112,8 +112,7 @@ enum frame_flags {
> struct frame {
> struct list_head head;
> u32 tag;
> - struct timeval sent; /* high-res time packet was sent */
> - u32 sent_jiffs; /* low-res jiffies-based sent time */
> + ktime_t sent;
> ulong waited;
> ulong waited_total;
> struct aoetgt *t; /* parent target I belong to */
> diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
> index 422b7d8..7f78780 100644
> --- a/drivers/block/aoe/aoecmd.c
> +++ b/drivers/block/aoe/aoecmd.c
> @@ -398,8 +398,7 @@ aoecmd_ata_rw(struct aoedev *d)
>
> skb = skb_clone(f->skb, GFP_ATOMIC);
> if (skb) {
> - do_gettimeofday(&f->sent);
> - f->sent_jiffs = (u32) jiffies;
> + f->sent = ktime_get();
> __skb_queue_head_init(&queue);
> __skb_queue_tail(&queue, skb);
> aoenet_xmit(&queue);
> @@ -489,8 +488,7 @@ resend(struct aoedev *d, struct frame *f)
> skb = skb_clone(skb, GFP_ATOMIC);
> if (skb == NULL)
> return;
> - do_gettimeofday(&f->sent);
> - f->sent_jiffs = (u32) jiffies;
> + f->sent = ktime_get();
> __skb_queue_head_init(&queue);
> __skb_queue_tail(&queue, skb);
> aoenet_xmit(&queue);
> @@ -499,32 +497,15 @@ resend(struct aoedev *d, struct frame *f)
> static int
> tsince_hr(struct frame *f)
> {
> - struct timeval now;
> + ktime_t now;
> int n;
>
> - do_gettimeofday(&now);
> - n = now.tv_usec - f->sent.tv_usec;
> - n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
> + now = ktime_get();
> + n = ktime_to_us(ktime_sub(now, f->sent));
>
> if (n < 0)
> n = -n;
>
> - /* For relatively long periods, use jiffies to avoid
> - * discrepancies caused by updates to the system time.
> - *
> - * On system with HZ of 1000, 32-bits is over 49 days
> - * worth of jiffies, or over 71 minutes worth of usecs.
> - *
> - * Jiffies overflow is handled by subtraction of unsigned ints:
> - * (gdb) print (unsigned) 2 - (unsigned) 0xfffffffe
> - * $3 = 4
> - * (gdb)
> - */
> - if (n > USEC_PER_SEC / 4) {
> - n = ((u32) jiffies) - f->sent_jiffs;
> - n *= USEC_PER_SEC / HZ;
> - }
> -
> return n;
> }
>
> @@ -589,7 +570,6 @@ reassign_frame(struct frame *f)
> nf->waited = 0;
> nf->waited_total = f->waited_total;
> nf->sent = f->sent;
> - nf->sent_jiffs = f->sent_jiffs;
> f->skb = skb;
>
> return nf;
> @@ -633,8 +613,7 @@ probe(struct aoetgt *t)
>
> skb = skb_clone(f->skb, GFP_ATOMIC);
> if (skb) {
> - do_gettimeofday(&f->sent);
> - f->sent_jiffs = (u32) jiffies;
> + f->sent = ktime_get();
> __skb_queue_head_init(&queue);
> __skb_queue_tail(&queue, skb);
> aoenet_xmit(&queue);
> @@ -1474,8 +1453,7 @@ aoecmd_ata_id(struct aoedev *d)
>
> skb = skb_clone(skb, GFP_ATOMIC);
> if (skb) {
> - do_gettimeofday(&f->sent);
> - f->sent_jiffs = (u32) jiffies;
> + f->sent = ktime_get();
> }
>
> return skb;

2015-05-12 09:44:33

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

On Monday 11 May 2015 21:00:25 Ed Cashin wrote:
> First, thanks for the patch. I do appreciate the attempt to simplify
> this part of the driver, but I don't think that this patch is good to merge.
>
> I'll make some comments inline below.
>
> On 05/10/2015 10:35 PM, Tina Ruchandani wrote:
> > 'struct frame' uses two variables to store the sent timestamp - 'struct
> > timeval' and jiffies. jiffies is used to avoid discrepancies caused by
> > updates to system time. 'struct timeval' uses 32-bit representation for
> > seconds which will overflow in year 2038.
>
> The comment in the deleted lines below mentions the fact that the
> overflow does not matter for calculating rough-grained deltas in time.
> So there is no problem in 2038 or on systems with the clock set to 2038
> accidentally.

To clarify: because 'struct timeval' is known to be broken in general,
we want to remove it from the kernel entirely and replace it with
something that is known to be correct in all cases, even for the drivers
that are currently correct. Out of the ~250 files in the kernel that
use 'timeval', we can't easily tell which ones are correct, so by replacing
them all, we can eliminate all the bugs.

We should be able to do that in a way that generally improves all the
drivers, because using 'timeval' tends to be suboptimal to start with.

If the currently available interfaces make things worse for the aoe
driver, we may have to add extra infrastructure, and get something that
also helps the conversion of other drivers.

> > This patch does the following:
> > - Replace the use of 'struct timeval' and jiffies with ktime_t, which
> > is a 64-bit timestamp and is year 2038 safe.
> > - ktime_t provides both long range (like jiffies) and high resolution
> > (like timeval). Using ktime_get (monotonic time) instead of wall-clock
> > time prevents any discrepancies caused by updates to system time.
>
> But the patch only changes the struct frame data. The aoe driver
> only has the struct frame for an incoming AoE response when that
> response is "expected". If the response comes in a bit late, the frame
> may have already been used for a new command.
>
> You can see that in aoecmd_ata_rsp when getframe_deferred returns
> NULL and tsince is called instead of tsince_hr.
>
> In that case, there is still information about the timing embedded in
> the AoE tag. The send time in jiffies is a rough-grained record of the
> send time, and it's extracted from the tag. For these "unexpected"
> responses, this timing information can improve performance significantly
> without introducing extra overhead or risk.

That path is not changed at all by this patch, right? It also looks
like the jiffies information from there is only used to print an
error message.

> I don't think the patch considers this aspect of the way the round trip
> time is calculated, and I don't think the primary motivation is justified
> (if that's 2038 safety, which we have already).
>
> Simplifying it would be nice, but it would be difficult to thoroughly test
> all of the performance implications. There are still people using 32-bit
> systems, for example.

Here is my analysis regarding the performance implications:

- Avoiding the access to 'jiffies' in a few places has basically zero
impact in small systems, but may help on large SMP machines because it
avoids cache line bouncing when you have multiple concurrent accesses
to jiffies.

- Replacing do_gettimeofday() with ktime_get() will improve things slightly
on all machines, because it avoids a 32-bit division that takes a couple
of cycles, up to hundreds of cycles on some CPU architectures.

- This leaves a single change that is currently making things worse in
tsince_hr():

> > - do_gettimeofday(&now);
> > - n = now.tv_usec - f->sent.tv_usec;
> > - n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
> > + now = ktime_get();
> > + n = ktime_to_us(ktime_sub(now, f->sent));

ktime_to_us() requires a constant 64-bit integer division that is
significantly more expensive than the 32-bit division it replaces. Thanks
to your analysis, I think it's fair to say that the function is indeed
timing critical and we should try hard to avoid introducing this overhead.

There are of course multiple ways to do this. One way would be to
change the code to work on 32-bit nanoseconds instead of 32-bit
microseconds. This requires proving that we cannot exceed
4.29 seconds of round-trip time in calc_rttavg().
Is that a valid assumption or not?

If not, we could replace do_gettimeofday() with ktime_get_ts64().
This will ensure we don't need a 64-bit division when converting
the ts64 to a 32-bit microsecond value, and combined with the
conversion is still no slower than do_gettimeofday(), and it
still avoids the double bookkeeping because it uses a monotonic
timebase that is robust against settimeofday.
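
A minimal userspace sketch of that second option, assuming the two
timestamps are kept as seconds plus nanoseconds. It uses the C library's
'struct timespec' as a stand-in for the kernel's timespec64, and
ts_delta_us() is a made-up name. The only division is a 32-bit divide of
the nanosecond remainder by a constant.

```c
#include <stdint.h>
#include <time.h>

/*
 * Microseconds elapsed from 'then' to 'now', both taken from the same
 * monotonic clock. The seconds part is only subtracted and multiplied;
 * the nanosecond remainder fits in 32 bits, so no 64-bit division is
 * needed. The result wraps modulo 2^32, i.e. after about 71 minutes.
 */
static uint32_t ts_delta_us(const struct timespec *now,
			    const struct timespec *then)
{
	uint32_t sec = (uint32_t)(now->tv_sec - then->tv_sec);
	int32_t nsec = (int32_t)(now->tv_nsec - then->tv_nsec);

	if (nsec < 0) {		/* borrow one second */
		sec--;
		nsec += 1000000000;
	}
	return sec * 1000000u + (uint32_t)nsec / 1000u;
}
```

In userspace the two timestamps could come from
clock_gettime(CLOCK_MONOTONIC, ...); in the driver they would come from
ktime_get_ts64().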

Arnd

2015-05-12 11:14:20

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [Y2038] [PATCH] aoe: Use 64-bit timestamp in frame

On Tuesday 12 May 2015 11:44:21 Arnd Bergmann wrote:
>
> There are of course multiple ways to do this. One way would be to
> change the code to work on 32-bit nanoseconds instead of 32-bit
> microseconds. This requires proving that we cannot exceed
> 4.29 seconds of round-trip time in calc_rttavg().
> Is that a valid assumption or not?
>
> If not, we could replace do_gettimeofday() with ktime_get_ts64().
> This will ensure we don't need a 64-bit division when converting
> the ts64 to a 32-bit microsecond value, and combined with the
> conversion is still no slower than do_gettimeofday(), and it
> still avoids the double bookkeeping because it uses a monotonic
> timebase that is robust against settimeofday.

Two other approaches that occurred to me later:

- introduce common ktime_get_ms(), ktime_get_us(), ktime_get_real_ms()
and ktime_get_real_us() interfaces, to match the other interfaces
we already provide. These could be done as efficiently or better
than what aoe does manually today.

- change the timebase that is used for the computations in aoe to use
scaled nanoseconds instead of microseconds. Using

u32 time = ktime_get_ns() >> 10;

would give you a similar range and precision as microseconds, but
completely avoid integer division. You could also use a different
shift value to either extend the range beyond 71 minutes, or to
extend the precision to something below a microsecond. This would
be the most efficient implementation, but also require significant
changes to the driver.
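
To make the range/precision trade-off concrete, here is a small
userspace sketch. SCALE_SHIFT, ns_to_scaled(), scaled_delta() and
scaled_range_secs() are names invented for this example. The 64-bit
division in scaled_range_secs() is only there to print the range and
would not be part of any fast path.

```c
#include <stdint.h>

#define SCALE_SHIFT 10	/* hypothetical: one unit is 1024 ns */

/* Truncate a 64-bit nanosecond count to 32 bits of ~1 us units. */
static uint32_t ns_to_scaled(uint64_t ns)
{
	return (uint32_t)(ns >> SCALE_SHIFT);
}

/* Wraparound-safe delta between two scaled timestamps. */
static uint32_t scaled_delta(uint32_t now, uint32_t then)
{
	return now - then;
}

/* Whole seconds a 32-bit counter covers for a given shift. */
static uint32_t scaled_range_secs(unsigned int shift)
{
	return (uint32_t)((UINT64_C(1) << (32 + shift)) /
			  UINT64_C(1000000000));
}
```

A shift of 10 covers 4398 seconds (about 73 minutes) at 1024 ns
resolution, while a shift of 0 is plain 32-bit nanoseconds and covers
only about 4.29 seconds.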

Arnd

2015-05-12 11:21:46

by Ed Cashin

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

Thanks for the expanded motivation. I'll return to your ideas tonight, but I wanted to mention that it is possible for round trips to take well over five seconds, partly because the disk on the target might be resetting.

If the rough-grained mechanism in the AoE tag still works, though, that's very helpful.

2015-05-13 01:29:56

by Ed Cashin

[permalink] [raw]
Subject: Re: [Y2038] [PATCH] aoe: Use 64-bit timestamp in frame

On 05/12/2015 07:14 AM, Arnd Bergmann wrote:
> On Tuesday 12 May 2015 11:44:21 Arnd Bergmann wrote:
>> There are of course multiple ways to do this. One way would be to
>> change the code to work on 32-bit nanoseconds instead of 32-bit
>> microseconds. This requires proving that we cannot exceed
>> 4.29 seconds of round-trip time in calc_rttavg().
>> Is that a valid assumption or not?
>>
>> If not, we could replace do_gettimeofday() with ktime_get_ts64().
>> This will ensure we don't need a 64-bit division when converting
>> the ts64 to a 32-bit microsecond value, and combined with the
>> conversion is still no slower than do_gettimeofday(), and it
>> still avoids the double bookkeeping because it uses a monotonic
>> timebase that is robust against settimeofday.
> Two other approaches that occurred to me later:
>
> - introduce common ktime_get_ms(), ktime_get_us(), ktime_get_real_ms()
> and ktime_get_real_us() interfaces, to match the other interfaces
> we already provide. These could be done as efficiently or better
> than what aoe does manually today.
>
> - change the timebase that is used for the computations in aoe to use
> scaled nanoseconds instead of microseconds. Using
>
> u32 time = ktime_get_ns() >> 10;
>
> would give you a similar range and precision as microseconds, but
> completely avoid integer division. You could also use a different
> shift value to either extend the range beyond 71 minutes, or to
> extend the precision to something below a microsecond. This would
> be the most efficient implementation, but also require significant
> changes to the driver.
>

That is an interesting idea. People do care about aoe_deadsecs being
pretty accurate, so there would need to be a way to make that remain
accurate. The driver will fail outstanding I/O to the target and mark it
as "down" after unsuccessfully retransmitting commands to the target
for a number of seconds equal to aoe_deadsecs.

As to the efficient ktime_get_us idea, that sounds appealing since you
mention that they would be efficient.

Thanks for the analysis.

--
Ed

2015-05-13 01:33:27

by Ed Cashin

[permalink] [raw]
Subject: Re: [PATCH] aoe: Use 64-bit timestamp in frame

On 05/12/2015 05:44 AM, Arnd Bergmann wrote:
> On Monday 11 May 2015 21:00:25 Ed Cashin wrote:
...
>> In that case, there is still information about the timing embedded in
>> the AoE tag. The send time in jiffies is a rough-grained record of the
>> send time, and it's extracted from the tag. For these "unexpected"
>> responses, this timing information can improve performance significantly
>> without introducing extra overhead or risk.
> That path is not changed at all by this patch, right? It also looks
> like the jiffies information from there is only used to print an
> error message.
>

That's right, thanks, the tag still has the old jiffies embedded
in it.

The information, though, is used to update the round-trip-time
average and the running estimate of the RTT variance. For the
unexpected responses that information can help the driver to
maintain high performance when there are inconsistencies in the
network performance. (calc_rttavg)
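
For readers unfamiliar with this kind of estimator, here is a generic
sketch of a smoothed RTT average and deviation kept in scaled integers,
in the style commonly used for TCP. It is not the driver's calc_rttavg();
the struct, the function name and the two shift constants are
assumptions chosen for illustration.

```c
/* Hypothetical scale factors: avg is stored times 8, dev times 4. */
#define RTT_AVG_SHIFT 3
#define RTT_DEV_SHIFT 2

struct rtt_est {
	long avg;	/* smoothed RTT, scaled by 2^RTT_AVG_SHIFT */
	long dev;	/* smoothed deviation, scaled by 2^RTT_DEV_SHIFT */
};

/* Fold one RTT sample (e.g. in microseconds) into the estimate. */
static void rtt_update(struct rtt_est *e, long rtt)
{
	long err = rtt - (e->avg >> RTT_AVG_SHIFT);

	e->avg += err;		/* unscaled: avg += err / 8 */
	if (err < 0)
		err = -err;
	err -= e->dev >> RTT_DEV_SHIFT;
	e->dev += err;		/* unscaled: dev += (|err| - dev) / 4 */
}
```

Every sample moves both values, which is why a rough-grained sample
from an unexpected response is still worth feeding in.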

--
Ed

2015-05-13 08:04:42

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [Y2038] [PATCH] aoe: Use 64-bit timestamp in frame

On Tuesday 12 May 2015 21:23:04 Ed Cashin wrote:
> On 05/12/2015 07:14 AM, Arnd Bergmann wrote:
> > On Tuesday 12 May 2015 11:44:21 Arnd Bergmann wrote:
> >> There are of course multiple ways to do this. One way would be to
> >> change the code to work on 32-bit nanoseconds instead of 32-bit
> >> microseconds. This requires proving that we cannot exceed
> >> 4.29 seconds of round-trip time in calc_rttavg().
> >> Is that a valid assumption or not?
> >>
> >> If not, we could replace do_gettimeofday() with ktime_get_ts64().
> >> This will ensure we don't need a 64-bit division when converting
> >> the ts64 to a 32-bit microsecond value, and combined with the
> >> conversion is still no slower than do_gettimeofday(), and it
> >> still avoids the double bookkeeping because it uses a monotonic
> >> timebase that is robust against settimeofday.
> > Two other approaches that occurred to me later:
> >
> > - introduce common ktime_get_ms(), ktime_get_us(), ktime_get_real_ms()
> > and ktime_get_real_us() interfaces, to match the other interfaces
> > we already provide. These could be done as efficiently or better
> > than what aoe does manually today.
> >
> > - change the timebase that is used for the computations in aoe to use
> > scaled nanoseconds instead of microseconds. Using
> >
> > u32 time = ktime_get_ns() >> 10;
> >
> > would give you a similar range and precision as microseconds, but
> > completely avoid integer division. You could also use a different
> > shift value to either extend the range beyond 71 minutes, or to
> > extend the precision to something below a microsecond. This would
> > be the most efficient implementation, but also require significant
> > changes to the driver.
> >
>
> That is an interesting idea. People do care about aoe_deadsecs being
> pretty accurate, so there would need to be a way to make that remain
> accurate. The driver will fail outstanding I/O to the target and mark it
> as "down" after unsuccessfully retransmitting commands to the target
> for a number of seconds equal to aoe_deadsecs.
>
> As to the efficient ktime_get_us idea, that sounds appealing since you
> mention that they would be efficient.
>
> Thanks for the analysis.

Shall we do the ktime_get_us() approach then? It still requires a
32-bit division like do_gettimeofday(), so it will not be as efficient
as the shifted nanoseconds.

As for the aoe_deadsecs computation, converting the aoe_deadsecs
module parameter into scaled nanoseconds can be done at module
load time, and that way you also save the integer division you
currently do for each frame in rexmit_timer() to turn the
microseconds into seconds.
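
Roughly what I have in mind, as a userspace sketch with made-up names
(SCALE_SHIFT, secs_to_scaled(), frame_expired()): the conversion from
seconds happens once, and the per-frame check becomes a plain
comparison.

```c
#include <stdint.h>

#define SCALE_SHIFT 10	/* hypothetical: one unit is 1024 ns */

/*
 * Convert a timeout in seconds to scaled nanoseconds. Meant to run once
 * (or whenever the parameter changes), not per frame. Values above
 * about 4398 seconds do not fit in 32 bits at this shift.
 */
static uint32_t secs_to_scaled(uint32_t secs)
{
	return (uint32_t)(((uint64_t)secs * UINT64_C(1000000000)) >>
			  SCALE_SHIFT);
}

/* Per-frame check: a comparison, with no division. */
static int frame_expired(uint32_t waited_scaled, uint32_t deadline_scaled)
{
	return waited_scaled >= deadline_scaled;
}
```

For a timeout of 180 seconds, secs_to_scaled(180) is 175781250 units.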

Arnd

2015-05-14 00:47:13

by Ed Cashin

[permalink] [raw]
Subject: Re: [Y2038] [PATCH] aoe: Use 64-bit timestamp in frame

On 05/13/2015 04:04 AM, Arnd Bergmann wrote:
...
> Shall we do the ktime_get_us() approach then? It still requires a
> 32-bit division like do_gettimeofday(), so it will not be as efficient
> as the shifted nanoseconds.

It's no worse, though, right? So I think it's a good transition. Further
optimization could be attempted in an experimental branch at some
point for easy testing.

> As for the aoe_deadsecs computation, converting the aoe_deadsecs module
> parameter into scaled nanoseconds can be done at module load time, and
> that way you also save the integer division you currently do for each
> frame in rexmit_timer() to turn the microseconds into seconds.
>
> Arnd

That's true, but the "secs" in the identifier stands for "seconds". It would
be misleading to have something called seconds be scaled nanoseconds.
And we could just use another variable if it weren't for the fact that this
module parameter is exposed through sysfs and can be changed through
that mechanism at any time.

--
Ed