2012-02-01 23:36:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Tue, Jan 31, 2012 at 4:23 AM, Ingo Molnar <[email protected]> wrote:
>
> non-atomic sounds good to me too.

You both apparently missed the related discussion that some devices
really do care about order, even if they don't care about atomicity.

So we'd actually have two versions of the header file, one
little-endian, and one big-endian. Then the driver that knows it
doesn't need the atomic 'readq()' that is always defined, but wants a
low-bytes-first version would just do

#include <linux/io64-little-endian.h>

(or "big-endian" if it wants to read/write high bits first). Most
drivers probably don't care, but apparently NVMe does.

Linus


2012-02-02 01:06:12

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Wed, 2012-02-01 at 15:35 -0800, Linus Torvalds wrote:
> On Tue, Jan 31, 2012 at 4:23 AM, Ingo Molnar <[email protected]> wrote:
> >
> > non-atomic sounds good to me too.
>
> You both apparently missed the related discussion that some devices
> really do care about order, even if they don't care about atomicity.
>
> So we'd actually have two versions of the header file, one
> little-endian, and one big-endian. Then the driver that knows it
> doesn't need the atomic 'readq()' that is always defined, but wants a
> low-bytes-first version would just do
>
> #include <linux/io64-little-endian.h>
>
> (or "big-endian" if it wants to read/write high bits first). Most
> drivers probably don't care, but apparently NVMe does.

And this was about the point I concluded last time that it simply wasn't
worth it with the number of different possibilities for the primitives
and trying to come up with a sensible naming scheme ... it's just easier
to open code because then you get exactly what you meant.

Incidentally, the last time this came up was with mpt fusion: for a
write to a 64 bit register, it didn't care about order, but it did care
about interleaving as in if you write one half of a 64 bit register and
then write to another register, the 64 bit register effectively gets
written with zeros in the part you didn't write to, so we had to put a
spin lock in the open coded writeb/w/l/q() to make sure the card didn't
get interleaved writes.

James

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2012-02-02 01:15:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Wed, Feb 1, 2012 at 5:05 PM, James Bottomley
<[email protected]> wrote:
>
> And this was about the point I concluded last time that it simply wasn't
> worth it with the number of different possibilities for the primitives

Umm? Two?

Two really doesn't seem like a big number to me.

Linus

2012-02-02 15:05:41

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On 02/01/2012 05:05 PM, James Bottomley wrote:
>
> Incidentally, the last time this came up was with mpt fusion: for a
> write to a 64 bit register, it didn't care about order, but it did care
> about interleaving as in if you write one half of a 64 bit register and
> then write to another register, the 64 bit register effectively gets
> written with zeros in the part you didn't write to, so we had to put a
> spin lock in the open coded writeb/w/l/q() to make sure the card didn't
> get interleaved writes.
>

There are always going to be hardware which have specific needs, and for
those open-coding makes sense, but the littleendian/bigendian pair is
going to cover ~90% of users and make sense to can.

I worked myself on a driver (which sadly never shipped) which had an WC
window and a UC window... the final write in a series had a completion
bit in it and would go to the UC window after setting up a whole chunk
of operations in the WC window (writing UC memory flushes WC memory
ahead of it.)

Thus, the two-part breakdown of writeq() to the UC window had to write
the low half to the WC window instead. This is clearly not generic.

-hpa

2012-02-04 15:34:44

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Thu, Feb 2, 2012 at 10:05, James Bottomley <[email protected]> wrote:
> On Wed, 2012-02-01 at 15:35 -0800, Linus Torvalds wrote:
>> On Tue, Jan 31, 2012 at 4:23 AM, Ingo Molnar <[email protected]> wrote:
>> >
>> > non-atomic sounds good to me too.
>>
>> You both apparently missed the related discussion that some devices
>> really do care about order, even if they don't care about atomicity.
>>
>> So we'd actually have two versions of the header file, one
>> little-endian, and one big-endian. Then the driver that knows it
>> doesn't need the atomic 'readq()' that is always defined, but wants a
>> low-bytes-first version would just do
>>
>> ? ?#include <linux/io64-little-endian.h>
>>
>> (or "big-endian" if it wants to read/write high bits first). Most
>> drivers probably don't care, but apparently NVMe does.
>
> And this was about the point I concluded last time that it simply wasn't
> worth it with the number of different possibilities for the primitives
> and trying to come up with a sensible naming scheme ... it's just easier
> to open code because then you get exactly what you meant.
>
> Incidentally, the last time this came up was with mpt fusion: for a
> write to a 64 bit register, it didn't care about order, but it did care
> about interleaving as in if you write one half of a 64 bit register and
> then write to another register, the 64 bit register effectively gets
> written with zeros in the part you didn't write to, so we had to put a
> spin lock in the open coded writeb/w/l/q() to make sure the card didn't
> get interleaved writes.
>
> James
>

As you say, readq/writeq without any description about the semantics
of atomicity will cause confusion in such a case.

But new plan for non-atomic readq/writeq is defining non-atomic readq/writeq
in the header file like asm-generic/io-nonatomic-hi-lo.h, and the file name
is a good documentation for the description.

The drivers which use readq/writeq without the line like
#include <asm-generic/io-nonatomic-hi-lo.h>
will cause compile error in the 32-bit environment.

--
Hitoshi Mitake
[email protected]

2012-02-04 15:39:04

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Fri, Feb 3, 2012 at 00:05, H. Peter Anvin <[email protected]> wrote:
> On 02/01/2012 05:05 PM, James Bottomley wrote:
>>
>>
>> Incidentally, the last time this came up was with mpt fusion: for a
>> write to a 64 bit register, it didn't care about order, but it did care
>> about interleaving as in if you write one half of a 64 bit register and
>> then write to another register, the 64 bit register effectively gets
>> written with zeros in the part you didn't write to, so we had to put a
>> spin lock in the open coded writeb/w/l/q() to make sure the card didn't
>> get interleaved writes.
>>
>
> There are always going to be hardware which have specific needs, and for
> those open-coding makes sense, but the littleendian/bigendian pair is going
> to cover ~90% of users and make sense to can.
>
> I worked myself on a driver (which sadly never shipped) which had an WC
> window and a UC window... the final write in a series had a completion bit
> in it and would go to the UC window after setting up a whole chunk of
> operations in the WC window (writing UC memory flushes WC memory ahead of
> it.)
>
> Thus, the two-part breakdown of writeq() to the UC window had to write the
> low half to the WC window instead. ?This is clearly not generic.
>
> ? ? ? ?-hpa
>
>

Because of my ignorant, I don't know the words "UC window" and "WC window"
in this context. Could you teach me?

--
Hitoshi Mitake
[email protected]

2012-02-05 06:53:41

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Sat, Feb 4, 2012 at 16:39, Hitoshi Mitake <[email protected]> wrote:
>> I worked myself on a driver (which sadly never shipped) which had an WC
>> window and a UC window... the final write in a series had a completion bit
>> in it and would go to the UC window after setting up a whole chunk of
>> operations in the WC window (writing UC memory flushes WC memory ahead of
>> it.)
>>
>> Thus, the two-part breakdown of writeq() to the UC window had to write the
>> low half to the WC window instead.  This is clearly not generic.
>
> Because of my ignorant, I don't know the words "UC window" and "WC window"
> in this context. Could you teach me?

UN = uncached, WC = writecombining.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

2012-02-05 07:01:46

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Sun, Feb 5, 2012 at 15:53, Geert Uytterhoeven <[email protected]> wrote:
> On Sat, Feb 4, 2012 at 16:39, Hitoshi Mitake <[email protected]> wrote:
>>> I worked myself on a driver (which sadly never shipped) which had an WC
>>> window and a UC window... the final write in a series had a completion bit
>>> in it and would go to the UC window after setting up a whole chunk of
>>> operations in the WC window (writing UC memory flushes WC memory ahead of
>>> it.)
>>>
>>> Thus, the two-part breakdown of writeq() to the UC window had to write the
>>> low half to the WC window instead. ?This is clearly not generic.
>>
>> Because of my ignorant, I don't know the words "UC window" and "WC window"
>> in this context. Could you teach me?
>
> UN = uncached, WC = writecombining.

Thanks a lot for your teaching!

>
> Gr{oetje,eeting}s,
>
> ? ? ? ? ? ? ? ? ? ? ? ? Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> ? ? ? ? ? ? ? ? ? ? ? ? ? ?? ?? -- Linus Torvalds



--
Hitoshi Mitake
[email protected]

2012-02-07 02:48:34

by Hitoshi Mitake

[permalink] [raw]
Subject: Re: [PATCH] NVMe: Fix compilation on architecturs without readq/writeq

On Sun, Feb 5, 2012 at 00:34, Hitoshi Mitake <[email protected]> wrote:
> On Thu, Feb 2, 2012 at 10:05, James Bottomley <[email protected]> wrote:
>> On Wed, 2012-02-01 at 15:35 -0800, Linus Torvalds wrote:
>>> On Tue, Jan 31, 2012 at 4:23 AM, Ingo Molnar <[email protected]> wrote:
>>> >
>>> > non-atomic sounds good to me too.
>>>
>>> You both apparently missed the related discussion that some devices
>>> really do care about order, even if they don't care about atomicity.
>>>
>>> So we'd actually have two versions of the header file, one
>>> little-endian, and one big-endian. Then the driver that knows it
>>> doesn't need the atomic 'readq()' that is always defined, but wants a
>>> low-bytes-first version would just do
>>>
>>> ? ?#include <linux/io64-little-endian.h>
>>>
>>> (or "big-endian" if it wants to read/write high bits first). Most
>>> drivers probably don't care, but apparently NVMe does.
>>
>> And this was about the point I concluded last time that it simply wasn't
>> worth it with the number of different possibilities for the primitives
>> and trying to come up with a sensible naming scheme ... it's just easier
>> to open code because then you get exactly what you meant.
>>
>> Incidentally, the last time this came up was with mpt fusion: for a
>> write to a 64 bit register, it didn't care about order, but it did care
>> about interleaving as in if you write one half of a 64 bit register and
>> then write to another register, the 64 bit register effectively gets
>> written with zeros in the part you didn't write to, so we had to put a
>> spin lock in the open coded writeb/w/l/q() to make sure the card didn't
>> get interleaved writes.
>>
>> James
>>
>
> As you say, readq/writeq without any description about the semantics
> of atomicity will cause confusion in such a case.
>
> But new plan for non-atomic readq/writeq is defining non-atomic readq/writeq
> in the header file like asm-generic/io-nonatomic-hi-lo.h, and the file name
> is a good documentation for the description.
>
> The drivers which use readq/writeq without the line like
> #include <asm-generic/io-nonatomic-hi-lo.h>
> will cause compile error in the 32-bit environment.
>
> --
> Hitoshi Mitake
> [email protected]

I sent the patch which implements readq/writeq in the way.
If you have comments, I'd like to hear.

Thanks,

--
Hitoshi Mitake
[email protected]