The commit a1ed5b0cffe4b16a93a6a3390e8cee0fbef94f86
(klist: don't iterate over deleted entries) introduces use of the
low bit in a pointer to indicate if the knode is dead or not,
assuming that this bit is always free.
This is not true for all architectures, CRIS for example may align data
on byte borders.
The result is a bunch of warnings on bootup, devices not being
added correctly etc, reported by Hinko Kocevar <[email protected]>:
------------[ cut here ]------------
WARNING: at lib/klist.c:62 ()
Modules linked in:
Stack from c1fe1cf0:
c01cc7f4 c1fe1d11 c000eb4e c000e4de 00000000 00000000 c1f4f78f c1f50c2d
c01d008c c1fdd1a0 c1fdd1a0 c1fe1d38 c0192954 c1fe0000 00000000 c1fe1dc0
00000002 7fffffff c1fe1da8 c0192d50 c1fe1dc0 00000002 7fffffff c1ff9fcc
Call Trace: [<c000eb4e>] [<c000e4de>] [<c0192954>] [<c0192d50>] [<c001d49e>] [<c000b688>] [<c0192a3c>]
[<c000b63e>] [<c000b63e>] [<c001a542>] [<c00b55b0>] [<c00411c0>] [<c00b559c>] [<c01918e6>] [<c0191988>]
[<c01919d0>] [<c00cd9c8>] [<c00cdd6a>] [<c0034178>] [<c000409a>] [<c0015576>] [<c0029130>] [<c0029078>]
[<c0029170>] [<c0012336>] [<c00b4076>] [<c00b4770>] [<c006d6e4>] [<c006d974>] [<c006dca0>] [<c0028d6c>]
[<c0028e12>] [<c0006424>] <4>---[ end trace 4eaa2a86a8e2da22 ]---
------------[ cut here ]------------
Repeat ad nauseam.
The following naive patch fixes the problem by moving the flag to a
separate field in struct klist_node, but perhaps there is a better solution?
Signed-off-by: Jesper Nilsson <[email protected]>
---
include/linux/klist.h | 1 +
lib/klist.c | 8 +++-----
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/include/linux/klist.h b/include/linux/klist.h
index d5a27af..e542629 100644
--- a/include/linux/klist.h
+++ b/include/linux/klist.h
@@ -40,6 +40,7 @@ struct klist_node {
void *n_klist; /* never access directly */
struct list_head n_node;
struct kref n_ref;
+ unsigned int flags;
};
extern void klist_add_tail(struct klist_node *n, struct klist *k);
diff --git a/lib/klist.c b/lib/klist.c
index 573d606..72ec552 100644
--- a/lib/klist.c
+++ b/lib/klist.c
@@ -43,17 +43,15 @@
* dead ones from iteration.
*/
#define KNODE_DEAD 1LU
-#define KNODE_KLIST_MASK ~KNODE_DEAD
static struct klist *knode_klist(struct klist_node *knode)
{
- return (struct klist *)
- ((unsigned long)knode->n_klist & KNODE_KLIST_MASK);
+ return (struct klist *)knode->n_klist;
}
static bool knode_dead(struct klist_node *knode)
{
- return (unsigned long)knode->n_klist & KNODE_DEAD;
+ return knode->flags & KNODE_DEAD;
}
static void knode_set_klist(struct klist_node *knode, struct klist *klist)
@@ -67,7 +65,7 @@ static void knode_kill(struct klist_node *knode)
{
/* and no knode should die twice ever either, see we're very humane */
WARN_ON(knode_dead(knode));
- *(unsigned long *)&knode->n_klist |= KNODE_DEAD;
+ knode->flags |= KNODE_DEAD;
}
/**
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
From: Jesper Nilsson <[email protected]>
Date: Tue, 13 Jan 2009 16:14:06 +0100
> The commit a1ed5b0cffe4b16a93a6a3390e8cee0fbef94f86
> (klist: don't iterate over deleted entries) introduces use of the
> low bit in a pointer to indicate if the knode is dead or not,
> assuming that this bit is always free.
>
> This is not true for all architectures, CRIS for example may align data
> on byte borders.
There are many other spots in the kernel where the low bits of a
pointer are hijacked as status bits. Lots of other things cannot
possible be working on CRIS because of this.
On Tue, Jan 13, 2009 at 10:10:30PM +0100, David Miller wrote:
> From: Jesper Nilsson <[email protected]>
> Date: Tue, 13 Jan 2009 16:14:06 +0100
>
> > The commit a1ed5b0cffe4b16a93a6a3390e8cee0fbef94f86
> > (klist: don't iterate over deleted entries) introduces use of the
> > low bit in a pointer to indicate if the knode is dead or not,
> > assuming that this bit is always free.
> >
> > This is not true for all architectures, CRIS for example may align data
> > on byte borders.
>
> There are many other spots in the kernel where the low bits of a
> pointer are hijacked as status bits. Lots of other things cannot
> possible be working on CRIS because of this.
It may be that we've worked around the other spots, although I haven't
seen anything like that, we might just have been lucky until now.
Can you recall another place where this trick is used?
It seems that rt_mutex uses the same trick (with the lowest 2bits)
but AFAICT that's something we don't use on CRIS.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
> It may be that we've worked around the other spots, although I haven't
> seen anything like that, we might just have been lucky until now.
>
> Can you recall another place where this trick is used?
rmap.
Don't CRIS use mmu?
>
> It seems that rt_mutex uses the same trick (with the lowest 2bits)
> but AFAICT that's something we don't use on CRIS.
From: Jesper Nilsson <[email protected]>
Date: Tue, 13 Jan 2009 23:12:35 +0100
> Can you recall another place where this trick is used?
>
> It seems that rt_mutex uses the same trick (with the lowest 2bits)
> but AFAICT that's something we don't use on CRIS.
Sparsemem uses it.
The TRIE based routing tables uses it.
All of the IPV4/IPV6 socket hash tables use it as of 2.6.29-rc1.
You really can't avoid supporting this. Why doesn't kmalloc
return properly aligned objects on CRIS?
From: KOSAKI Motohiro <[email protected]>
Date: Wed, 14 Jan 2009 07:40:19 +0900
> > It may be that we've worked around the other spots, although I haven't
> > seen anything like that, we might just have been lucky until now.
> >
> > Can you recall another place where this trick is used?
>
> rmap.
> Don't CRIS use mmu?
I'm beginning to suspect the issue is only with objects
in the kernel image itself. Dynamically allocated memory
is properly aligned and therefore the "low bit status bits
in pointer" trick works.
On Tue, Jan 13, 2009 at 11:45 PM, David Miller <[email protected]> wrote:
> From: KOSAKI Motohiro <[email protected]>
> Date: Wed, 14 Jan 2009 07:40:19 +0900
>
>> > It may be that we've worked around the other spots, although I haven't
>> > seen anything like that, we might just have been lucky until now.
>> >
>> > Can you recall another place where this trick is used?
>>
>> rmap.
>> Don't CRIS use mmu?
>
> I'm beginning to suspect the issue is only with objects
> in the kernel image itself. Dynamically allocated memory
> is properly aligned and therefore the "low bit status bits
> in pointer" trick works.
Perhaps using a pointerhackalign trick on this structure where
#define pointerhackalign(x) __attribute__ ((aligned (x)))
and declare
struct klist_node {
...
} pointerhackalign(2);
Because __attribute__ ((aligned (x))) could only increase alignment
it will safe to do that and serve as documentation purpose :)
Regards
Bastien
> From: KOSAKI Motohiro <[email protected]>
> Date: Wed, 14 Jan 2009 07:40:19 +0900
>
> > > It may be that we've worked around the other spots, although I haven't
> > > seen anything like that, we might just have been lucky until now.
> > >
> > > Can you recall another place where this trick is used?
> >
> > rmap.
> > Don't CRIS use mmu?
>
> I'm beginning to suspect the issue is only with objects
> in the kernel image itself. Dynamically allocated memory
> is properly aligned and therefore the "low bit status bits
> in pointer" trick works.
Ah, I see.
very thank you for helpful explain.
On Wed, Jan 14, 2009 at 08:54:20AM +0900, KOSAKI Motohiro wrote:
> > From: KOSAKI Motohiro <[email protected]>
> > Date: Wed, 14 Jan 2009 07:40:19 +0900
> >
> > > > It may be that we've worked around the other spots, although I haven't
> > > > seen anything like that, we might just have been lucky until now.
> > > >
> > > > Can you recall another place where this trick is used?
> > >
> > > rmap.
> > > Don't CRIS use mmu?
> >
> > I'm beginning to suspect the issue is only with objects
> > in the kernel image itself. Dynamically allocated memory
> > is properly aligned and therefore the "low bit status bits
> > in pointer" trick works.
>
> Ah, I see.
> very thank you for helpful explain.
So, is this change still needed for klists? I'm guessing so as they
could be in statically allocated objects, right?
Here's yet another reason to never statically allocate a kobject...
thanks,
greg k-h
On Tue, 13 Jan 2009, David Miller wrote:
> From: KOSAKI Motohiro <[email protected]>
> Date: Wed, 14 Jan 2009 07:40:19 +0900
>
> > > It may be that we've worked around the other spots, although I haven't
> > > seen anything like that, we might just have been lucky until now.
> > >
> > > Can you recall another place where this trick is used?
> >
> > rmap.
> > Don't CRIS use mmu?
>
> I'm beginning to suspect the issue is only with objects
> in the kernel image itself. Dynamically allocated memory
> is properly aligned and therefore the "low bit status bits
> in pointer" trick works.
Yes: I don't think we fully realized that when adding the
__attribute__((aligned(sizeof(long)))) to struct address_space
(so the PAGE_MAPPING_ANON bit didn't mess up on CRIS), but the
only instance which actually gave a problem was the peculiar
struct swapper_space declared in mm/swap_state.c.
Hugh
On Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote:
> On Tue, Jan 13, 2009 at 11:45 PM, David Miller <[email protected]> wrote:
> > From: KOSAKI Motohiro <[email protected]>
> > Date: Wed, 14 Jan 2009 07:40:19 +0900
> >
> >> > It may be that we've worked around the other spots, although I haven't
> >> > seen anything like that, we might just have been lucky until now.
> >> >
> >> > Can you recall another place where this trick is used?
> >>
> >> rmap.
> >> Don't CRIS use mmu?
> >
> > I'm beginning to suspect the issue is only with objects
> > in the kernel image itself. Dynamically allocated memory
> > is properly aligned and therefore the "low bit status bits
> > in pointer" trick works.
> Perhaps using a pointerhackalign trick on this structure where
> #define pointerhackalign(x) __attribute__ ((aligned (x)))
> and declare
> struct klist_node {
> ...
> } pointerhackalign(2);
>
> Because __attribute__ ((aligned (x))) could only increase alignment
> it will safe to do that and serve as documentation purpose :)
That works, but we need to do it not for the struct klist_node,
but for the struct we insert into the void * in klist_node,
which is struct klist.
The following patch works for CRIS, and is less intrusive than
my earlier patch. If this the way to go I can resubmit a proper patch.
diff --git a/include/linux/klist.h b/include/linux/klist.h
index 8ea98db..a21cd7a 100644
--- a/include/linux/klist.h
+++ b/include/linux/klist.h
@@ -23,7 +23,7 @@ struct klist {
struct list_head k_list;
void (*get)(struct klist_node *);
void (*put)(struct klist_node *);
-};
+} __attribute__ ((aligned (4)));
#define KLIST_INIT(_name, _get, _put) \
{ .k_lock = __SPIN_LOCK_UNLOCKED(_name.k_lock), \
> Regards
>
> Bastien
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
On Tue, Jan 13, 2009 at 11:45:21PM +0100, David Miller wrote:
> From: KOSAKI Motohiro <[email protected]>
> Date: Wed, 14 Jan 2009 07:40:19 +0900
>
> > > It may be that we've worked around the other spots, although I haven't
> > > seen anything like that, we might just have been lucky until now.
> > >
> > > Can you recall another place where this trick is used?
> >
> > rmap.
> > Don't CRIS use mmu?
>
> I'm beginning to suspect the issue is only with objects
> in the kernel image itself. Dynamically allocated memory
> is properly aligned and therefore the "low bit status bits
> in pointer" trick works.
Yes, that agrees with what I see here, and why we haven't seen this
problem more often.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
From: Jesper Nilsson <[email protected]>
Date: Wed, 14 Jan 2009 11:19:08 +0100
> The following patch works for CRIS, and is less intrusive than
> my earlier patch. If this the way to go I can resubmit a proper patch.
Out of curiosity, is there a way to get gcc to output code such
that data objects are aligned more naturally? Some option or
similar?
On Wed, Jan 14, 2009 at 12:59:14AM +0100, Greg KH wrote:
> So, is this change still needed for klists? I'm guessing so as they
> could be in statically allocated objects, right?
Like I said in another email, we can get away with just making
the struct klist aligned to 4 bytes (or 2 would probably suffice).
It would be a less intrusive patch.
> Here's yet another reason to never statically allocate a kobject...
>
> thanks,
>
> greg k-h
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> From: Jesper Nilsson <[email protected]>
> Date: Wed, 14 Jan 2009 11:19:08 +0100
> > The following patch works for CRIS, and is less intrusive than
> > my earlier patch. If this the way to go I can resubmit a proper patch.
>
> Out of curiosity, is there a way to get gcc to output code such
> that data objects are aligned more naturally? Some option or
> similar?
There's flags for alignment of objects, but no flags for changing
structure layout or size, which is probably what we run into here.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
From: Jesper Nilsson <[email protected]>
Date: Wed, 14 Jan 2009 11:36:24 +0100
> On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> > From: Jesper Nilsson <[email protected]>
> > Date: Wed, 14 Jan 2009 11:19:08 +0100
> > > The following patch works for CRIS, and is less intrusive than
> > > my earlier patch. If this the way to go I can resubmit a proper patch.
> >
> > Out of curiosity, is there a way to get gcc to output code such
> > that data objects are aligned more naturally? Some option or
> > similar?
>
> There's flags for alignment of objects, but no flags for changing
> structure layout or size, which is probably what we run into here.
Really?
I thought the problem is that the base object can have any odd
alignment in the kernel image. And this is why the problem isn't run
into for objects allocated using dynamic allocation.
On Wed, Jan 14, 2009 at 11:45:28AM +0100, David Miller wrote:
> From: Jesper Nilsson <[email protected]>
> Date: Wed, 14 Jan 2009 11:36:24 +0100
>
> > On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> > > From: Jesper Nilsson <[email protected]>
> > > Date: Wed, 14 Jan 2009 11:19:08 +0100
> > > > The following patch works for CRIS, and is less intrusive than
> > > > my earlier patch. If this the way to go I can resubmit a proper patch.
> > >
> > > Out of curiosity, is there a way to get gcc to output code such
> > > that data objects are aligned more naturally? Some option or
> > > similar?
> >
> > There's flags for alignment of objects, but no flags for changing
> > structure layout or size, which is probably what we run into here.
>
> Really?
A quick test using 32bit alignment didn't work, but it could be
that I've missed a place to modify cflags.
> I thought the problem is that the base object can have any odd
> alignment in the kernel image. And this is why the problem isn't run
> into for objects allocated using dynamic allocation.
I reasoned that the struct klist is actually allocated as a part of
another struct, which due to no padding got the odd alignment.
I'll research this more fully.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
On Wed, Jan 14, 2009 at 11:45:28AM +0100, David Miller wrote:
> From: Jesper Nilsson <[email protected]>
> Date: Wed, 14 Jan 2009 11:36:24 +0100
>
> > On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> > > From: Jesper Nilsson <[email protected]>
> > > Date: Wed, 14 Jan 2009 11:19:08 +0100
> > > > The following patch works for CRIS, and is less intrusive than
> > > > my earlier patch. If this the way to go I can resubmit a proper patch.
> > >
> > > Out of curiosity, is there a way to get gcc to output code such
> > > that data objects are aligned more naturally? Some option or
> > > similar?
> >
> > There's flags for alignment of objects, but no flags for changing
> > structure layout or size, which is probably what we run into here.
>
> Really?
Yes, after some more research, I found that the bug is indeed triggered by
the struct klist being aligned to odd bytes inside other structs.
Since CRIS uses packed structs, having a char * inside the kobjects
shifts the alignment for all data after any struct that contains
a kobject (and indeed inside the kobject itself)
In this case it was the class pointer of the struct device
that had a klist in the private pointer (struct class_private),
after a struct kset (which contains a kobject).
> I thought the problem is that the base object can have any odd
> alignment in the kernel image. And this is why the problem isn't run
> into for objects allocated using dynamic allocation.
Actually, it seems that the default is for objects to be aligned
at 16bits for CRIS, which may also explain why this is not seen more often.
Objects allocated dynamically may still trigger the bug, due to
the struct not being padded.
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]
On Wed, Jan 14, 2009 at 04:12:21PM +0100, Jesper Nilsson wrote:
> On Wed, Jan 14, 2009 at 11:45:28AM +0100, David Miller wrote:
> > From: Jesper Nilsson <[email protected]>
> > Date: Wed, 14 Jan 2009 11:36:24 +0100
> >
> > > On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> > > > From: Jesper Nilsson <[email protected]>
> > > > Date: Wed, 14 Jan 2009 11:19:08 +0100
> > > > > The following patch works for CRIS, and is less intrusive than
> > > > > my earlier patch. If this the way to go I can resubmit a proper patch.
> > > >
> > > > Out of curiosity, is there a way to get gcc to output code such
> > > > that data objects are aligned more naturally? Some option or
> > > > similar?
> > >
> > > There's flags for alignment of objects, but no flags for changing
> > > structure layout or size, which is probably what we run into here.
> >
> > Really?
>
> Yes, after some more research, I found that the bug is indeed triggered by
> the struct klist being aligned to odd bytes inside other structs.
>
> Since CRIS uses packed structs, having a char * inside the kobjects
> shifts the alignment for all data after any struct that contains
> a kobject (and indeed inside the kobject itself)
>
> In this case it was the class pointer of the struct device
> that had a klist in the private pointer (struct class_private),
> after a struct kset (which contains a kobject).
So, in order to make things smaller, your "use an external flag" patch
would probably be better than forcing the alignment on a 4 byte boundry,
right?
thanks,
greg k-h
On Wed, Jan 14, 2009 at 07:17:05PM +0100, Greg KH wrote:
> On Wed, Jan 14, 2009 at 04:12:21PM +0100, Jesper Nilsson wrote:
> > On Wed, Jan 14, 2009 at 11:45:28AM +0100, David Miller wrote:
> > > From: Jesper Nilsson <[email protected]>
> > > Date: Wed, 14 Jan 2009 11:36:24 +0100
> > > > On Wed, Jan 14, 2009 at 11:21:40AM +0100, David Miller wrote:
> > > > > From: Jesper Nilsson <[email protected]>
> > > > > Date: Wed, 14 Jan 2009 11:19:08 +0100
> > > > > > The following patch works for CRIS, and is less intrusive than
> > > > > > my earlier patch. If this the way to go I can resubmit a proper patch.
> > > > >
> > > > > Out of curiosity, is there a way to get gcc to output code such
> > > > > that data objects are aligned more naturally? Some option or
> > > > > similar?
> > > >
> > > > There's flags for alignment of objects, but no flags for changing
> > > > structure layout or size, which is probably what we run into here.
> > >
> > > Really?
> >
> > Yes, after some more research, I found that the bug is indeed triggered by
> > the struct klist being aligned to odd bytes inside other structs.
> >
> > Since CRIS uses packed structs, having a char * inside the kobjects
> > shifts the alignment for all data after any struct that contains
> > a kobject (and indeed inside the kobject itself)
> >
> > In this case it was the class pointer of the struct device
> > that had a klist in the private pointer (struct class_private),
> > after a struct kset (which contains a kobject).
>
> So, in order to make things smaller, your "use an external flag" patch
> would probably be better than forcing the alignment on a 4 byte boundry,
> right?
Unfortunately not, the GCC flags for CRIS only act on objects (address
of the first member in the struct), the structs themselves are
still subject to alignment on odd bytes. So the patch for aligning
all klists on 16 or 32 bits is still needed for CRIS.
It should however be a no-op for all architectures that do struct padding.
> thanks,
> greg k-h
/^JN - Jesper Nilsson
--
Jesper Nilsson -- [email protected]