Depending on ABI "long long" type of a particular 32-bit CPU
might be aligned by either word (32-bits) or double word (64-bits).
Make sure "data" is really 64-bit aligned for any 32-bit CPU.
At least for 32-bit ARC cores ABI requires "long long" types
to be aligned by normal 32-bit word. This makes "data" field aligned to
12 bytes. Which is still OK as long as we use 32-bit data only.
But once we want to use native atomic64_t type (i.e. when we use special
instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
misaligned access exception.
That's because even on CPUs capable of non-aligned data access LL/SC
instructions require strict alignment.
Signed-off-by: Alexey Brodkin <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
---
Changes v1 -> v2:
* Reworded commit message
* Inserted comment right in source [Thomas]
drivers/base/devres.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index f98a097e73f2..466fa59c866a 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -24,8 +24,12 @@ struct devres_node {
struct devres {
struct devres_node node;
- /* -- 3 pointers */
- unsigned long long data[]; /* guarantee ull alignment */
+ /*
+ * Depending on ABI "long long" type of a particular 32-bit CPU
+ * might be aligned by either word (32-bits) or double word (64-bits).
+ * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
+ */
+ unsigned long long data[] __aligned(sizeof(unsigned long long));
};
struct devres_group {
--
2.17.1
On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> Depending on ABI "long long" type of a particular 32-bit CPU
> might be aligned by either word (32-bits) or double word (64-bits).
> Make sure "data" is really 64-bit aligned for any 32-bit CPU.
>
> At least for 32-bit ARC cores ABI requires "long long" types
> to be aligned by normal 32-bit word. This makes "data" field aligned to
> 12 bytes. Which is still OK as long as we use 32-bit data only.
>
> But once we want to use native atomic64_t type (i.e. when we use special
> instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> misaligned access exception.
So is this something you hit today? If not, why is this needed for
stable kernels?
> That's because even on CPUs capable of non-aligned data access LL/SC
> instructions require strict alignment.
Are you going to hit this code with all types of structures? What
happens when you do have an unaligned access?
>
> Signed-off-by: Alexey Brodkin <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
You didn't cc: this address :(
> Cc: Thomas Gleixner <[email protected]>
> Cc: [email protected]
> ---
>
> Changes v1 -> v2:
>
> * Reworded commit message
> * Inserted comment right in source [Thomas]
>
> drivers/base/devres.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> index f98a097e73f2..466fa59c866a 100644
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -24,8 +24,12 @@ struct devres_node {
>
> struct devres {
> struct devres_node node;
> - /* -- 3 pointers */
> - unsigned long long data[]; /* guarantee ull alignment */
> + /*
> + * Depending on ABI "long long" type of a particular 32-bit CPU
> + * might be aligned by either word (32-bits) or double word (64-bits).
> + * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> + */
> + unsigned long long data[] __aligned(sizeof(unsigned long long));
> };
Does this change the padding today for any other arches? If so, does
the increased memory usage cause any noticable issues?
thanks,
greg k-h
Hi Greg,
On Mon, 2018-07-09 at 07:48 +0200, Greg KH wrote:
> On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > Depending on ABI "long long" type of a particular 32-bit CPU
> > might be aligned by either word (32-bits) or double word (64-bits).
> > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> >
> > At least for 32-bit ARC cores ABI requires "long long" types
> > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > 12 bytes. Which is still OK as long as we use 32-bit data only.
> >
> > But once we want to use native atomic64_t type (i.e. when we use special
> > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > misaligned access exception.
>
> So is this something you hit today? If not, why is this needed for
> stable kernels?
Indeed we hit that problem recently when Etnaviv driver was switched to
DRM GPU scheduler, see
commit e93b6deeb45a ("drm/etnaviv: hook up DRM GPU scheduler").
The most important part of DRM GPU scheduler is "job_id_count" member of
"drm_gpu_scheduler" structure of type "atomic64_t". This structure is put
in a buffer allocated by devm_kzalloc() and if "job_id_count" is not 64-bit
aligned atomic instruction fails with an exception.
As for stable requirements - mentioned commit was a part of 4.17 kernel
which broke GPU driver for one of our HSDK board so I guess back-porting
to 4.17 is a no-brainer.
Now given a nature of the problem at hand I may expect other breakages
whenever another driver/fs etc that use "atomic64_t" gets enabled.
> > That's because even on CPUs capable of non-aligned data access LL/SC
> > instructions require strict alignment.
>
> Are you going to hit this code with all types of structures?
If there're other cases which lead to 4-byte aligned "atomic64_t" variables
there will be a problem as well but it's quite hard to predict those cases.
That said if we manage to reproduce more similar issues there will be more
patches with fixes :)
> What happens when you do have an unaligned access?
Atomic instructions are a bit special as compared to normal loads and stores.
Even if normal loads and stores may deal with unaligned data atomic instructions
still require data to be aligned because it's hard to manage atomic value that
spans through multiple cache lines or even MMU pages. And hardware just
raises an alignment fault exception.
And that's not something special for ARC, I guess all CPUs are the same in
that regard, see here's an extract from ARM(r) Architecture Reference
Manual ARMv7-A and ARMv7-R edition: https://lkml.org/lkml/2017/12/5/440
From "Table A3-1 Alignment requirements of load/store instructions"
it's seen that LDREXD, STREXD instructions will cause alignment fault
even if SCTLR.A=0 (strict alignment fault checking disabled) for non
double-word-aligned data.
> >
> > Signed-off-by: Alexey Brodkin <[email protected]>
> > Cc: Greg Kroah-Hartman <[email protected]>
>
> You didn't cc: this address :(
Sorry, which one?
That what I have in .mbox, see https://patchwork.ozlabs.org/patch/941092/mbox/:
------------------------------>8------------------------------
Cc: [email protected], Greg KH <[email protected]>,
Alexey Brodkin <[email protected]>, [email protected],
Greg Kroah-Hartman <[email protected]>,
Thomas Gleixner <[email protected]>, [email protected]
------------------------------>8------------------------------
-Alexey
On Mon, Jul 09, 2018 at 06:46:50AM +0000, Alexey Brodkin wrote:
> Hi Greg,
>
> On Mon, 2018-07-09 at 07:48 +0200, Greg KH wrote:
> > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > might be aligned by either word (32-bits) or double word (64-bits).
> > > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> > >
> > > At least for 32-bit ARC cores ABI requires "long long" types
> > > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > > 12 bytes. Which is still OK as long as we use 32-bit data only.
> > >
> > > But once we want to use native atomic64_t type (i.e. when we use special
> > > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > > misaligned access exception.
> >
> > So is this something you hit today? If not, why is this needed for
> > stable kernels?
>
> Indeed we hit that problem recently when Etnaviv driver was switched to
> DRM GPU scheduler, see
> commit e93b6deeb45a ("drm/etnaviv: hook up DRM GPU scheduler").
> The most important part of DRM GPU scheduler is "job_id_count" member of
> "drm_gpu_scheduler" structure of type "atomic64_t". This structure is put
> in a buffer allocated by devm_kzalloc() and if "job_id_count" is not 64-bit
> aligned atomic instruction fails with an exception.
>
> As for stable requirements - mentioned commit was a part of 4.17 kernel
> which broke GPU driver for one of our HSDK board so I guess back-porting
> to 4.17 is a no-brainer.
Ok, so 4.17 is as far back as you need? Please try to be specific when
asking for stable backports.
> > > That's because even on CPUs capable of non-aligned data access LL/SC
> > > instructions require strict alignment.
> >
> > Are you going to hit this code with all types of structures?
>
> If there're other cases which lead to 4-byte aligned "atomic64_t" variables
> there will be a problem as well but it's quite hard to predict those cases.
> That said if we manage to reproduce more similar issues there will be more
> patches with fixes :)
>
> > What happens when you do have an unaligned access?
>
> Atomic instructions are a bit special as compared to normal loads and stores.
> Even if normal loads and stores may deal with unaligned data atomic instructions
> still require data to be aligned because it's hard to manage atomic value that
> spans through multiple cache lines or even MMU pages. And hardware just
> raises an alignment fault exception.
>
> And that's not something special for ARC, I guess all CPUs are the same in
> that regard, see here's an extract from ARM(r) Architecture Reference
> Manual ARMv7-A and ARMv7-R edition: https://lkml.org/lkml/2017/12/5/440
> From "Table A3-1 Alignment requirements of load/store instructions"
> it's seen that LDREXD, STREXD instructions will cause alignment fault
> even if SCTLR.A=0 (strict alignment fault checking disabled) for non
> double-word-aligned data.
Thanks for the better explaination, that helps out a lot. Can you redo
the patch with all of that information so that others do not have the
same questions as I did?
thanks,
greg k-h
On Mon, Jul 9, 2018 at 7:49 AM Greg KH <[email protected]> wrote:
> On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > Depending on ABI "long long" type of a particular 32-bit CPU
> > might be aligned by either word (32-bits) or double word (64-bits).
Or even 16-bit (on e.g. m68k).
> > --- a/drivers/base/devres.c
> > +++ b/drivers/base/devres.c
> > @@ -24,8 +24,12 @@ struct devres_node {
> >
> > struct devres {
> > struct devres_node node;
> > - /* -- 3 pointers */
> > - unsigned long long data[]; /* guarantee ull alignment */
> > + /*
> > + * Depending on ABI "long long" type of a particular 32-bit CPU
> > + * might be aligned by either word (32-bits) or double word (64-bits).
or even 16-bit (on e.g. m68k).
> > + * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> > + */
> > + unsigned long long data[] __aligned(sizeof(unsigned long long));
> > };
>
> Does this change the padding today for any other arches? If so, does
> the increased memory usage cause any noticable issues?
Yes it does. struct devres_node contains an odd number of pointers, so
there will be a hole between node and data on all 32-bit architectures.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Hi Greg,
On Mon, 2018-07-09 at 09:06 +0200, [email protected] wrote:
> On Mon, Jul 09, 2018 at 06:46:50AM +0000, Alexey Brodkin wrote:
> > Hi Greg,
> >
> > On Mon, 2018-07-09 at 07:48 +0200, Greg KH wrote:
> > > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > > might be aligned by either word (32-bits) or double word (64-bits).
> > > > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> > > >
> > > > At least for 32-bit ARC cores ABI requires "long long" types
> > > > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > > > 12 bytes. Which is still OK as long as we use 32-bit data only.
> > > >
> > > > But once we want to use native atomic64_t type (i.e. when we use special
> > > > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > > > misaligned access exception.
> > >
> > > So is this something you hit today? If not, why is this needed for
> > > stable kernels?
> >
> > Indeed we hit that problem recently when Etnaviv driver was switched to
> > DRM GPU scheduler, see
> > commit e93b6deeb45a ("drm/etnaviv: hook up DRM GPU scheduler").
> > The most important part of DRM GPU scheduler is "job_id_count" member of
> > "drm_gpu_scheduler" structure of type "atomic64_t". This structure is put
> > in a buffer allocated by devm_kzalloc() and if "job_id_count" is not 64-bit
> > aligned atomic instruction fails with an exception.
> >
> > As for stable requirements - mentioned commit was a part of 4.17 kernel
> > which broke GPU driver for one of our HSDK board so I guess back-porting
> > to 4.17 is a no-brainer.
>
> Ok, so 4.17 is as far back as you need? Please try to be specific when
> asking for stable backports.
Well in that particular case I really wanted to get this fix back-ported
starting from v4.8 so I guess that's what I need to add in Cc tag, right?
----------------------------->8---------------------
Cc: <[email protected]> # 4.8+
----------------------------->8---------------------
> > > > That's because even on CPUs capable of non-aligned data access LL/SC
> > > > instructions require strict alignment.
> > >
> > > Are you going to hit this code with all types of structures?
> >
> > If there're other cases which lead to 4-byte aligned "atomic64_t" variables
> > there will be a problem as well but it's quite hard to predict those cases.
> > That said if we manage to reproduce more similar issues there will be more
> > patches with fixes :)
> >
> > > What happens when you do have an unaligned access?
> >
> > Atomic instructions are a bit special as compared to normal loads and stores.
> > Even if normal loads and stores may deal with unaligned data atomic instructions
> > still require data to be aligned because it's hard to manage atomic value that
> > spans through multiple cache lines or even MMU pages. And hardware just
> > raises an alignment fault exception.
> >
> > And that's not something special for ARC, I guess all CPUs are the same in
> > that regard, see here's an extract from ARM(r) Architecture Reference
> > Manual ARMv7-A and ARMv7-R edition: https://urldefense.proofpoint.com/v2/url?u=https-3A__lkml.org_lkml_2017_12_5_440&d=DwIBAg&c=DPL6_X_6JkXFx7AXWq
> > B0tg&r=lqdeeSSEes0GFDDl656eViXO7breS55ytWkhpk5R81I&m=eT1OUXQEt6zlA0bABwdI7sFr7Hys3aHzoCXTDAkM6xY&s=-3n_Xurm4D2TIC-_G_GIvGj9P_3unmq839oGATLE5W0&e=
> > From "Table A3-1 Alignment requirements of load/store instructions"
> > it's seen that LDREXD, STREXD instructions will cause alignment fault
> > even if SCTLR.A=0 (strict alignment fault checking disabled) for non
> > double-word-aligned data.
>
> Thanks for the better explaination, that helps out a lot. Can you redo
> the patch with all of that information so that others do not have the
> same questions as I did?
Sure will do it soonish.
-Alexey
Hi Geert,
On Mon, 2018-07-09 at 09:07 +0200, Geert Uytterhoeven wrote:
> On Mon, Jul 9, 2018 at 7:49 AM Greg KH <[email protected]> wrote:
> > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > might be aligned by either word (32-bits) or double word (64-bits).
>
> Or even 16-bit (on e.g. m68k).
Indeed, thanks for the note!
Will add this in my v3.
For ARC I'd like this fix to be back-ported starting
from v4.8 where we added support of "native" atomic64_t, see commit
ce6365270ecd (" ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions").
What about m68k, do you have any preference of earliest kernel version
where this fix might be useful?
-Alexey
On Mon, Jul 09, 2018 at 07:17:11AM +0000, Alexey Brodkin wrote:
> Hi Greg,
>
> On Mon, 2018-07-09 at 09:06 +0200, [email protected] wrote:
> > On Mon, Jul 09, 2018 at 06:46:50AM +0000, Alexey Brodkin wrote:
> > > Hi Greg,
> > >
> > > On Mon, 2018-07-09 at 07:48 +0200, Greg KH wrote:
> > > > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > > > might be aligned by either word (32-bits) or double word (64-bits).
> > > > > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> > > > >
> > > > > At least for 32-bit ARC cores ABI requires "long long" types
> > > > > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > > > > 12 bytes. Which is still OK as long as we use 32-bit data only.
> > > > >
> > > > > But once we want to use native atomic64_t type (i.e. when we use special
> > > > > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > > > > misaligned access exception.
> > > >
> > > > So is this something you hit today? If not, why is this needed for
> > > > stable kernels?
> > >
> > > Indeed we hit that problem recently when Etnaviv driver was switched to
> > > DRM GPU scheduler, see
> > > commit e93b6deeb45a ("drm/etnaviv: hook up DRM GPU scheduler").
> > > The most important part of DRM GPU scheduler is "job_id_count" member of
> > > "drm_gpu_scheduler" structure of type "atomic64_t". This structure is put
> > > in a buffer allocated by devm_kzalloc() and if "job_id_count" is not 64-bit
> > > aligned atomic instruction fails with an exception.
> > >
> > > As for stable requirements - mentioned commit was a part of 4.17 kernel
> > > which broke GPU driver for one of our HSDK board so I guess back-porting
> > > to 4.17 is a no-brainer.
> >
> > Ok, so 4.17 is as far back as you need? Please try to be specific when
> > asking for stable backports.
>
> Well in that particular case I really wanted to get this fix back-ported
> starting from v4.8 so I guess that's what I need to add in Cc tag, right?
> ----------------------------->8---------------------
> Cc: <[email protected]> # 4.8+
> ----------------------------->8---------------------
Yes.
Hi Alexey,
On Mon, Jul 9, 2018 at 9:22 AM Alexey Brodkin
<[email protected]> wrote:
> On Mon, 2018-07-09 at 09:07 +0200, Geert Uytterhoeven wrote:
> > On Mon, Jul 9, 2018 at 7:49 AM Greg KH <[email protected]> wrote:
> > > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > > might be aligned by either word (32-bits) or double word (64-bits).
> >
> > Or even 16-bit (on e.g. m68k).
>
> Indeed, thanks for the note!
> Will add this in my v3.
Note that in this particular case, the field will probably always be
aligned to 4 bytes,
as the struct will be allocated using *alloc().
> For ARC I'd like this fix to be back-ported starting
> from v4.8 where we added support of "native" atomic64_t, see commit
> ce6365270ecd (" ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions").
>
> What about m68k, do you have any preference of earliest kernel version
> where this fix might be useful?
Given m68k is 32-bit, it will access atomic64_t variables while
holding a spinlock,
so it should still be safe without this change.
Not to mention no one will try etnaviv on m68k ;-)
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Hi Geert,
On Mon, 2018-07-09 at 09:52 +0200, Geert Uytterhoeven wrote:
> Hi Alexey,
>
> On Mon, Jul 9, 2018 at 9:22 AM Alexey Brodkin
> <[email protected]> wrote:
> > On Mon, 2018-07-09 at 09:07 +0200, Geert Uytterhoeven wrote:
> > > On Mon, Jul 9, 2018 at 7:49 AM Greg KH <[email protected]> wrote:
> > > > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > > > might be aligned by either word (32-bits) or double word (64-bits).
> > >
> > > Or even 16-bit (on e.g. m68k).
> >
> > Indeed, thanks for the note!
> > Will add this in my v3.
>
> Note that in this particular case, the field will probably always be
> aligned to 4 bytes,
> as the struct will be allocated using *alloc().
Well I think maybe it worth mentioning only 32-bit and 64-bit alignments in
this particular case because as it was mentioned initially in the comment there're
3 pointers before "data" field and for 32-bit machines I guess we may safely
assume that a pointer size is 32-bit.
This leaves us only one problematic situation 32-bit instead of 64-bit alignment.
> > For ARC I'd like this fix to be back-ported starting
> > from v4.8 where we added support of "native" atomic64_t, see commit
> > ce6365270ecd (" ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions").
> >
> > What about m68k, do you have any preference of earliest kernel version
> > where this fix might be useful?
>
> Given m68k is 32-bit, it will access atomic64_t variables while
> holding a spinlock,
> so it should still be safe without this change.
Well ARC and ARMv7 are 32-bit machines as well still both have
a special instructions to read/write 64-bit data.
> Not to mention no one will try etnaviv on m68k ;-)
See it was just a trigger case but other GPUs that use or will use
DRM GPU scheduler are good candidates to it that problem and not to
mention some other users of atomic64_t.
But from you above comment I may deduce that m68k doesn't have
instructions for 64-bit data; in that case there's no reason to worry
at least about struct devres_node :)
-Alexey
Hi Alexey,
On Mon, Jul 9, 2018 at 10:37 AM Alexey Brodkin
<[email protected]> wrote:
> On Mon, 2018-07-09 at 09:52 +0200, Geert Uytterhoeven wrote:
> > On Mon, Jul 9, 2018 at 9:22 AM Alexey Brodkin
> > <[email protected]> wrote:
> > > On Mon, 2018-07-09 at 09:07 +0200, Geert Uytterhoeven wrote:
> > > > On Mon, Jul 9, 2018 at 7:49 AM Greg KH <[email protected]> wrote:
> > > > > On Mon, Jul 09, 2018 at 07:44:44AM +0300, Alexey Brodkin wrote:
> > > > > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > > > > might be aligned by either word (32-bits) or double word (64-bits).
> > > >
> > > > Or even 16-bit (on e.g. m68k).
> > >
> > > Indeed, thanks for the note!
> > > Will add this in my v3.
> >
> > Note that in this particular case, the field will probably always be
> > aligned to 4 bytes,
> > as the struct will be allocated using *alloc().
>
> Well I think maybe it worth mentioning only 32-bit and 64-bit alignments in
> this particular case because as it was mentioned initially in the comment there're
> 3 pointers before "data" field and for 32-bit machines I guess we may safely
> assume that a pointer size is 32-bit.
>
> This leaves us only one problematic situation 32-bit instead of 64-bit alignment.
>
> > > For ARC I'd like this fix to be back-ported starting
> > > from v4.8 where we added support of "native" atomic64_t, see commit
> > > ce6365270ecd (" ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions").
> > >
> > > What about m68k, do you have any preference of earliest kernel version
> > > where this fix might be useful?
> >
> > Given m68k is 32-bit, it will access atomic64_t variables while
> > holding a spinlock,
> > so it should still be safe without this change.
>
> Well ARC and ARMv7 are 32-bit machines as well still both have
> a special instructions to read/write 64-bit data.
Sure.
> > Not to mention no one will try etnaviv on m68k ;-)
>
> See it was just a trigger case but other GPUs that use or will use
> DRM GPU scheduler are good candidates to it that problem and not to
> mention some other users of atomic64_t.
>
> But from you above comment I may deduce that m68k doesn't have
> instructions for 64-bit data; in that case there's no reason to worry
> at least about struct devres_node :)
That's correct: m68k has no instructions for 64-bit data.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
From: Alexey Brodkin
> Sent: 09 July 2018 05:45
> Depending on ABI "long long" type of a particular 32-bit CPU
> might be aligned by either word (32-bits) or double word (64-bits).
> Make sure "data" is really 64-bit aligned for any 32-bit CPU.
>
> At least for 32-bit ARC cores ABI requires "long long" types
> to be aligned by normal 32-bit word. This makes "data" field aligned to
> 12 bytes. Which is still OK as long as we use 32-bit data only.
>
> But once we want to use native atomic64_t type (i.e. when we use special
> instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> misaligned access exception.
Shouldn't there be a typedef for the actual type.
Perhaps it is even atomic64_t ?
And have the __aligned(8) applied to that typedef ??
> That's because even on CPUs capable of non-aligned data access LL/SC
> instructions require strict alignment.
...
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -24,8 +24,12 @@ struct devres_node {
>
> struct devres {
> struct devres_node node;
> - /* -- 3 pointers */
> - unsigned long long data[]; /* guarantee ull alignment */
> + /*
> + * Depending on ABI "long long" type of a particular 32-bit CPU
> + * might be aligned by either word (32-bits) or double word (64-bits).
> + * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
Just:
/* data[] must be 64 bit aligned even on 32 bit architectures
* because it might be accessed by instructions that require
* aligned memory arguments.
> + */
> + unsigned long long data[] __aligned(sizeof(unsigned long long));
One day assuming that 'unsigned long long' is exactly 64 bits will bite.
This probably ought to be u64 (or similar).
(If not atomic64_t)
David
On Mon, Jul 9, 2018 at 11:15 AM David Laight <[email protected]> wrote:
> From: Alexey Brodkin
> > Sent: 09 July 2018 05:45
> > Depending on ABI "long long" type of a particular 32-bit CPU
> > might be aligned by either word (32-bits) or double word (64-bits).
> > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> >
> > At least for 32-bit ARC cores ABI requires "long long" types
> > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > 12 bytes. Which is still OK as long as we use 32-bit data only.
> >
> > But once we want to use native atomic64_t type (i.e. when we use special
> > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > misaligned access exception.
>
> Shouldn't there be a typedef for the actual type.
> Perhaps it is even atomic64_t ?
> And have the __aligned(8) applied to that typedef ??
That indeed sounds like the best thing to do, as it will fix this issue in other
places, too.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
From: Geert Uytterhoeven
> Sent: 09 July 2018 10:23
> On Mon, Jul 9, 2018 at 11:15 AM David Laight <[email protected]> wrote:
> > From: Alexey Brodkin
> > > Sent: 09 July 2018 05:45
> > > Depending on ABI "long long" type of a particular 32-bit CPU
> > > might be aligned by either word (32-bits) or double word (64-bits).
> > > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> > >
> > > At least for 32-bit ARC cores ABI requires "long long" types
> > > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > > 12 bytes. Which is still OK as long as we use 32-bit data only.
> > >
> > > But once we want to use native atomic64_t type (i.e. when we use special
> > > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > > misaligned access exception.
> >
> > Shouldn't there be a typedef for the actual type.
> > Perhaps it is even atomic64_t ?
> > And have the __aligned(8) applied to that typedef ??
>
> That indeed sounds like the best thing to do, as it will fix this issue in other
> places, too.
Something like:
typedef struct {
u64 val __aligned(8);
} atomic64_t;
would pick up most errors.
Including all the places that fail to use atomic_read().
David
Hi David,
On Mon, 2018-07-09 at 09:16 +0000, David Laight wrote:
> From: Alexey Brodkin
> > Sent: 09 July 2018 05:45
> > Depending on ABI "long long" type of a particular 32-bit CPU
> > might be aligned by either word (32-bits) or double word (64-bits).
> > Make sure "data" is really 64-bit aligned for any 32-bit CPU.
> >
> > At least for 32-bit ARC cores ABI requires "long long" types
> > to be aligned by normal 32-bit word. This makes "data" field aligned to
> > 12 bytes. Which is still OK as long as we use 32-bit data only.
> >
> > But once we want to use native atomic64_t type (i.e. when we use special
> > instructions LLOCKD/SCONDD for accessing 64-bit data) we easily hit
> > misaligned access exception.
>
> Shouldn't there be a typedef for the actual type.
> Perhaps it is even atomic64_t ?
> And have the __aligned(8) applied to that typedef ??
That's a good idea indeed but it doesn't solve the problem with
struct devres_node. Consider the following snippet:
-------------------------------->8-------------------------------
struct mystruct {
atomic64_t myvar;
}
struct mystruct *p;
p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
-------------------------------->8-------------------------------
Here myvar address will match address of "data" member of struct devres_node.
So if "data" is has offset of 12 bytes from the beginning of a page then
myvar won't be 64-bit aligned regardless of myvar's attribute, right?
> > That's because even on CPUs capable of non-aligned data access LL/SC
> > instructions require strict alignment.
>
> ...
> > --- a/drivers/base/devres.c
> > +++ b/drivers/base/devres.c
> > @@ -24,8 +24,12 @@ struct devres_node {
> >
> > struct devres {
> > struct devres_node node;
> > - /* -- 3 pointers */
> > - unsigned long long data[]; /* guarantee ull alignment */
> > + /*
> > + * Depending on ABI "long long" type of a particular 32-bit CPU
> > + * might be aligned by either word (32-bits) or double word (64-bits).
> > + * Make sure "data" is really 64-bit aligned for any 32-bit CPU.
>
> Just:
> /* data[] must be 64 bit aligned even on 32 bit architectures
> * because it might be accessed by instructions that require
> * aligned memory arguments.
>
> > + */
> > + unsigned long long data[] __aligned(sizeof(unsigned long long));
>
> One day assuming that 'unsigned long long' is exactly 64 bits will bite.
> This probably ought to be u64 (or similar).
I agree. Initially I wanted to keep as few changes as possible but
IMHO switching to more predictable data type makes sense.
-Alexey
From: Alexey Brodkin
> Sent: 09 July 2018 11:00
...
> That's a good idea indeed but it doesn't solve the problem with
> struct devres_node. Consider the following snippet:
> -------------------------------->8-------------------------------
> struct mystruct {
> atomic64_t myvar;
> }
>
> struct mystruct *p;
> p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
> -------------------------------->8-------------------------------
>
> Here myvar address will match address of "data" member of struct devres_node.
> So if "data" is has offset of 12 bytes from the beginning of a page then
> myvar won't be 64-bit aligned regardless of myvar's attribute, right?
...
> > > - unsigned long long data[]; /* guarantee ull alignment */
Ahh, that line should be:
u8 data[] __aligned(8); /* Guarantee 64bit alignment */
David
Hi David,
On Mon, 2018-07-09 at 10:18 +0000, David Laight wrote:
> From: Alexey Brodkin
> > Sent: 09 July 2018 11:00
>
> ...
> > That's a good idea indeed but it doesn't solve the problem with
> > struct devres_node. Consider the following snippet:
> > -------------------------------->8-------------------------------
> > struct mystruct {
> > atomic64_t myvar;
> > }
> >
> > struct mystruct *p;
> > p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
> > -------------------------------->8-------------------------------
> >
> > Here myvar address will match address of "data" member of struct devres_node.
> > So if "data" is has offset of 12 bytes from the beginning of a page then
> > myvar won't be 64-bit aligned regardless of myvar's attribute, right?
>
> ...
> > > > - unsigned long long data[]; /* guarantee ull alignment */
>
> Ahh, that line should be:
> u8 data[] __aligned(8); /* Guarantee 64bit alignment */
And that pretty much what I suggested in my initial patch :)
For the record x86 has exactly the same atomic64_t as you suggested,
see https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/atomic64_32.h#L13:
---------------------->8------------------
typedef struct {
u64 __aligned(8) counter;
} atomic64_t;
---------------------->8------------------
-Alexey
On 07/09/2018 03:23 AM, Alexey Brodkin wrote:
> Hi David,
>
> On Mon, 2018-07-09 at 10:18 +0000, David Laight wrote:
>> From: Alexey Brodkin
>>> Sent: 09 July 2018 11:00
>>
>> ...
>>> That's a good idea indeed but it doesn't solve the problem with
>>> struct devres_node. Consider the following snippet:
>>> -------------------------------->8-------------------------------
>>> struct mystruct {
>>> atomic64_t myvar;
>>> }
>>>
>>> struct mystruct *p;
>>> p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
>>> -------------------------------->8-------------------------------
>>>
>>> Here myvar address will match address of "data" member of struct devres_node.
>>> So if "data" is has offset of 12 bytes from the beginning of a page then
>>> myvar won't be 64-bit aligned regardless of myvar's attribute, right?
>>
>> ...
>>>>> - unsigned long long data[]; /* guarantee ull alignment */
>>
>> Ahh, that line should be:
>> u8 data[] __aligned(8); /* Guarantee 64bit alignment */
>
> And that pretty much what I suggested in my initial patch :)
>
> For the record x86 has exactly the same atomic64_t as you suggested,
> see https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/atomic64_32.h#L13:
> ---------------------->8------------------
> typedef struct {
> u64 __aligned(8) counter;
> } atomic64_t;
> ---------------------->8------------------
And so does the ARC version since when the atomic64_t support was added by commit
ce6365270ecd
Also, you should consider using the pre-canned type aligned_u64.
typedef struct {
aligned_u64 counter;
^^^^^^^^^^^
} atomic64_t;
Hi Vineet,
On Mon, 2018-07-09 at 11:27 -0700, Vineet Gupta wrote:
> On 07/09/2018 03:23 AM, Alexey Brodkin wrote:
> > Hi David,
> >
> > On Mon, 2018-07-09 at 10:18 +0000, David Laight wrote:
> > > From: Alexey Brodkin
> > > > Sent: 09 July 2018 11:00
> > >
> > > ...
> > > > That's a good idea indeed but it doesn't solve the problem with
> > > > struct devres_node. Consider the following snippet:
> > > > -------------------------------->8-------------------------------
> > > > struct mystruct {
> > > > atomic64_t myvar;
> > > > }
> > > >
> > > > struct mystruct *p;
> > > > p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
> > > > -------------------------------->8-------------------------------
> > > >
> > > > Here myvar address will match address of "data" member of struct devres_node.
> > > > So if "data" is has offset of 12 bytes from the beginning of a page then
> > > > myvar won't be 64-bit aligned regardless of myvar's attribute, right?
> > >
> > > ...
> > > > > > - unsigned long long data[]; /* guarantee ull alignment */
> > >
> > > Ahh, that line should be:
> > > u8 data[] __aligned(8); /* Guarantee 64bit alignment */
> >
> > And that pretty much what I suggested in my initial patch :)
> >
> > For the record x86 has exactly the same atomic64_t as you suggested,
> > see https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/atomic64_32.h#L13:
> > ---------------------->8------------------
> > typedef struct {
> > u64 __aligned(8) counter;
> > } atomic64_t;
> > ---------------------->8------------------
>
> And so does the ARC version since when the atomic64_t support was added by commit
> ce6365270ecd
>
> Also, you should consider using the pre-canned type aligned_u64.
>
> typedef struct {
> aligned_u64 counter;
> ^^^^^^^^^^^
> } atomic64_t;
As Peter mentioned earlier atomic is signed, see:
http://lists.infradead.org/pipermail/linux-snps-arc/2018-July/004030.html
So looks like we need this fix for ARC instead:
--------------------------------->8----------------------------
diff --git a/arch/arc/include/asm/atomic.h b/arch/arc/include/asm/atomic.h
index 11859287c52a..ef94b7371de6 100644
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -367,7 +367,7 @@ ATOMIC_OPS(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
*/
typedef struct {
- aligned_u64 counter;
+ long long __aligned(8) counter;
} atomic64_t;
#define ATOMIC64_INIT(a) { (a) }
--------------------------------->8----------------------------
-Alexey