2013-06-28 10:26:18

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

On Fri, 28 Jun 2013, Siarhei Siamashka wrote:
> On Fri, 28 Jun 2013 09:43:37 +0800
> 张猛 <[email protected]> wrote:
>
> > > The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
> > > does not seem to contain any details about what bad things may happen
> > > if we try to read CNT64_LO_REG while latching is still in progress and
> > > CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
> > > I can imagine the following possible scenarios:
> > > 1. We read either the old stale CNT64_LO_REG value or the new
> > > correct value.
> > > 2. We read either the old stale CNT64_LO_REG value or the new
> > > correct value, or some random garbage.
> > > 3. The processor may deadlock, eat your dog, or do some other
> > > nasty thing.
> > >
> > > In the case of 1, we probably can get away without using any spinlocks?
> >
> > About the 64bits counter, the latch bit is needed because of the
> > asynchronous circuit. The internal circuit of 64bits counter is
> > working under 24Mhz clock, and CNT_LO/HI is read with APB clock.
> > So the clock synchronize is needed. The function of the latch is
> > synchronous the 64bits counter from 24Mhz clock domain to APB clock
> > domain. So, if the latch is not completely, value of the CNT_LO/HI
> > maybe a random value, because some bits are latched, but others are
> > not. So, the CNT_LO/HI should be read after latch is completely.
> > The latch just takes 3 cycles of 24Mhz clock, the time is nearly
> > 0.125 micro-second.
>
> Thanks for the clarification! It is very much appreciated.
>
> So basically we get scenario 2, which still allows some optimizations
> compared to scenario 3. On single-core systems (Allwinner A10), it
> probably makes sense to avoid spinlocks overhead and just place

Spinlocks are NOPs on UP and just disable preemption, but they
provide you the same ordering guarantees as spinlocks on SMP. So no
special case optimization necessary.

Thanks,

tglx


2013-06-28 11:14:18

by Siarhei Siamashka

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

On Fri, 28 Jun 2013 12:26:10 +0200 (CEST)
Thomas Gleixner <[email protected]> wrote:

> On Fri, 28 Jun 2013, Siarhei Siamashka wrote:
> > On Fri, 28 Jun 2013 09:43:37 +0800
> > 张猛 <[email protected]> wrote:
> >
> > > > The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
> > > > does not seem to contain any details about what bad things may happen
> > > > if we try to read CNT64_LO_REG while latching is still in progress and
> > > > CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
> > > > I can imagine the following possible scenarios:
> > > > 1. We read either the old stale CNT64_LO_REG value or the new
> > > > correct value.
> > > > 2. We read either the old stale CNT64_LO_REG value or the new
> > > > correct value, or some random garbage.
> > > > 3. The processor may deadlock, eat your dog, or do some other
> > > > nasty thing.
> > > >
> > > > In the case of 1, we probably can get away without using any spinlocks?
> > >
> > > About the 64bits counter, the latch bit is needed because of the
> > > asynchronous circuit. The internal circuit of 64bits counter is
> > > working under 24Mhz clock, and CNT_LO/HI is read with APB clock.
> > > So the clock synchronize is needed. The function of the latch is
> > > synchronous the 64bits counter from 24Mhz clock domain to APB clock
> > > domain. So, if the latch is not completely, value of the CNT_LO/HI
> > > maybe a random value, because some bits are latched, but others are
> > > not. So, the CNT_LO/HI should be read after latch is completely.
> > > The latch just takes 3 cycles of 24Mhz clock, the time is nearly
> > > 0.125 micro-second.
> >
> > Thanks for the clarification! It is very much appreciated.
> >
> > So basically we get scenario 2, which still allows some optimizations
> > compared to scenario 3. On single-core systems (Allwinner A10), it
> > probably makes sense to avoid spinlocks overhead and just place
>
> Spinlocks are NOPs on UP and just disable preemption, but they
> provide you the same ordering guarantees as spinlocks on SMP. So no
> special case optimization necessary.

Still I would not want some high priority and latency sensitive junk
(such as pulseaudio for example) not to get scheduled at the right
time because of some low priority gettimeofday spammer application.
The time spent latching and reading these counters is non negligible.

Though I agree Maxime that first we need an initial implementation,
which just works safely. And then we can optimize it by testing real
use cases and providing the relevant benchmark numbers. Knowing the
precise details about the hardware helps.

But enough bikeshedding, it's time for me to shut up :)

--
Best regards,
Siarhei Siamashka

2013-06-28 14:02:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

On Fri, 28 Jun 2013, ???? wrote:
> > The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
> > does not seem to contain any details about what bad things may happen
> > if we try to read CNT64_LO_REG while latching is still in progress and
> > CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
> > I can imagine the following possible scenarios:
> > 1. We read either the old stale CNT64_LO_REG value or the new
> > correct value.
> > 2. We read either the old stale CNT64_LO_REG value or the new
> > correct value, or some random garbage.
> > 3. The processor may deadlock, eat your dog, or do some other
> > nasty thing.
> >
> > In the case of 1, we probably can get away without using any spinlocks?
>
> About the 64bits counter, the latch bit is needed because of the asynchronous circuit.
> The internal circuit of 64bits counter is working under 24Mhz clock, and CNT_LO/HI
> is read with APB clock. So the clock synchronize is needed. The function of the latch
> is synchronous the 64bits counter from 24Mhz clock domain to APB clock domain.
> So, if the latch is not completely, value of the CNT_LO/HI maybe a random value, because
> some bits are latched, but others are not. So, the CNT_LO/HI should be read after
> latch is completely.
> The latch just takes 3 cycles of 24Mhz clock, the time is nearly 0.125 micro-second.
>

I really wonder why we're trying to use that timer. AFAICT the A10 has
another six 32bit timers which do not have this restriction and the
clocksoure/sched_clock implementation works nicely with 32 bits. So
what's the point of using that 64 bit counter if it's horrible to
access?

Thanks,

tglx

2013-06-28 17:03:17

by Maxime Ripard

[permalink] [raw]
Subject: Re: Re: [linux-sunxi] [PATCH 0/8] clocksource: sunxi: Timer fixes and cleanup

Hi Thomas,

On Fri, Jun 28, 2013 at 04:02:23PM +0200, Thomas Gleixner wrote:
> On Fri, 28 Jun 2013, 张猛 wrote:
> > > The A10 manual from http://free-electrons.com/~maxime/pub/datasheet/
> > > does not seem to contain any details about what bad things may happen
> > > if we try to read CNT64_LO_REG while latching is still in progress and
> > > CNT64_RL_EN bit in CNT64_CTRL_REG has not changed to zero yet.
> > > I can imagine the following possible scenarios:
> > > 1. We read either the old stale CNT64_LO_REG value or the new
> > > correct value.
> > > 2. We read either the old stale CNT64_LO_REG value or the new
> > > correct value, or some random garbage.
> > > 3. The processor may deadlock, eat your dog, or do some other
> > > nasty thing.
> > >
> > > In the case of 1, we probably can get away without using any spinlocks?
> >
> > About the 64bits counter, the latch bit is needed because of the asynchronous circuit.
> > The internal circuit of 64bits counter is working under 24Mhz clock, and CNT_LO/HI
> > is read with APB clock. So the clock synchronize is needed. The function of the latch
> > is synchronous the 64bits counter from 24Mhz clock domain to APB clock domain.
> > So, if the latch is not completely, value of the CNT_LO/HI maybe a random value, because
> > some bits are latched, but others are not. So, the CNT_LO/HI should be read after
> > latch is completely.
> > The latch just takes 3 cycles of 24Mhz clock, the time is nearly 0.125 micro-second.
> >
>
> I really wonder why we're trying to use that timer. AFAICT the A10 has
> another six 32bit timers which do not have this restriction and the
> clocksoure/sched_clock implementation works nicely with 32 bits. So
> what's the point of using that 64 bit counter if it's horrible to
> access?

Yes, you're right. I actually wanted at first not to use a timer for
this since we had a counter to do just that.

But that's true that actually using a timer would make the code simpler
and presumably faster as well.

I'll change this in the v2.

Thanks,
Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Attachments:
(No filename) (2.12 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments