LinuxLists.cc - KMSAN: uninit-value in do

2018-06-12 19:22:52

Subject: KMSAN: uninit-value in do_msgrcv

Hello,

syzbot found the following crash on:

HEAD commit: 8fc8ecd1c58a kmsan: unpoison regs in arch_uprobe_exception..
git tree: https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=1481799f800000
kernel config: https://syzkaller.appspot.com/x/.config?x=9fa436d3ae606638
dashboard link: https://syzkaller.appspot.com/bug?extid=2827ef6b3385deb07eaf
compiler: clang version 7.0.0 (trunk 332596)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=1265edb7800000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16eeee9f800000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]

==================================================================
BUG: KMSAN: uninit-value in do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
CPU: 0 PID: 4528 Comm: syz-executor852 Not tainted 4.17.0-rc5+ #103
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
__msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
ksys_msgrcv ipc/msg.c:1184 [inline]
__do_sys_msgrcv ipc/msg.c:1190 [inline]
__se_sys_msgrcv ipc/msg.c:1187 [inline]
__x64_sys_msgrcv+0x160/0x1b0 ipc/msg.c:1187
do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4459b9
RSP: 002b:00007f0d57662db8 EFLAGS: 00000297 ORIG_RAX: 0000000000000046
RAX: ffffffffffffffda RBX: 00000000006dac54 RCX: 00000000004459b9
RDX: 00000000000000d0 RSI: 0000000020000000 RDI: 0000000000260007
RBP: 00000000006dac50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000000
R13: 00007ffd5ab7e25f R14: 00007f0d576639c0 R15: 0000000000000006

Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
__kmalloc_node+0xe25/0x11f0 mm/slub.c:3865
kmalloc_node include/linux/slab.h:554 [inline]
kvmalloc_node+0x197/0x2f0 mm/util.c:421
kvmalloc include/linux/mm.h:550 [inline]
newque+0xb4/0x7d0 ipc/msg.c:139
ipcget_new ipc/util.c:315 [inline]
ipcget+0x27b/0xd90 ipc/util.c:653
ksys_msgget ipc/msg.c:289 [inline]
__do_sys_msgget ipc/msg.c:294 [inline]
__se_sys_msgget ipc/msg.c:292 [inline]
__x64_sys_msgget+0x14c/0x1d0 ipc/msg.c:292
do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
==================================================================

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

2018-06-24 02:57:53

by Davidlohr Bueso

[permalink] [raw]

Subject: ipc/msg: zalloc struct msg_queue when creating a new msq

The following splat was reported around the msg_queue structure
which can have uninitialized fields left over after newque().
Future syscalls which make use of the msq id (now valid) can thus
make KMSAN complain because not all fields are explicitly initialized
and we have the padding as well. This is internal to the kernel,
hence no bogus leaks.

==================================================================
BUG: KMSAN: uninit-value in do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
CPU: 0 PID: 4528 Comm: syz-executor852 Not tainted 4.17.0-rc5+ #103
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
__msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
ksys_msgrcv ipc/msg.c:1184 [inline]
__do_sys_msgrcv ipc/msg.c:1190 [inline]
__se_sys_msgrcv ipc/msg.c:1187 [inline]
__x64_sys_msgrcv+0x160/0x1b0 ipc/msg.c:1187
do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4459b9
RSP: 002b:00007f0d57662db8 EFLAGS: 00000297 ORIG_RAX: 0000000000000046
RAX: ffffffffffffffda RBX: 00000000006dac54 RCX: 00000000004459b9
RDX: 00000000000000d0 RSI: 0000000020000000 RDI: 0000000000260007
RBP: 00000000006dac50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000000
R13: 00007ffd5ab7e25f R14: 00007f0d576639c0 R15: 0000000000000006

Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
__kmalloc_node+0xe25/0x11f0 mm/slub.c:3865
kmalloc_node include/linux/slab.h:554 [inline]
kvmalloc_node+0x197/0x2f0 mm/util.c:421
kvmalloc include/linux/mm.h:550 [inline]
newque+0xb4/0x7d0 ipc/msg.c:139
ipcget_new ipc/util.c:315 [inline]
ipcget+0x27b/0xd90 ipc/util.c:653
ksys_msgget ipc/msg.c:289 [inline]
__do_sys_msgget ipc/msg.c:294 [inline]
__se_sys_msgget ipc/msg.c:292 [inline]
__x64_sys_msgget+0x14c/0x1d0 ipc/msg.c:292
do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
==================================================================

Fix this by simply zero init the whole structure, something that
sysvsems also do; this is safe as it's a nop, having no secondary
effect afaict.

Reported-by: syzbot <[email protected]>
Signed-off-by: Davidlohr Bueso <[email protected]>
---
ipc/msg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 62545ce19173..da81b374f9fd 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -136,7 +136,7 @@ static int newque(struct ipc_namespace *ns, struct ipc_params *params)
key_t key = params->key;
int msgflg = params->flg;

- msq = kvmalloc(sizeof(*msq), GFP_KERNEL);
+ msq = kvzalloc(sizeof(*msq), GFP_KERNEL);
if (unlikely(!msq))
return -ENOMEM;

--
2.16.4

2018-06-25 09:23:35

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: ipc/msg: zalloc struct msg_queue when creating a new msq

On Sun, Jun 24, 2018 at 4:56 AM, Davidlohr Bueso <[email protected]> wrote:
> The following splat was reported around the msg_queue structure
> which can have uninitialized fields left over after newque().
> Future syscalls which make use of the msq id (now valid) can thus
> make KMSAN complain because not all fields are explicitly initialized
> and we have the padding as well. This is internal to the kernel,
> hence no bogus leaks.

Hi Davidlohr,

As far as I understand the root problem is that (1) we publish a
not-fully initialized objects and (2) finish it's initialization in a
racy manner when other threads already have access to it. As the
result other threads can act on a wrong object. I am not sure that
zeroing the object really solves these problems. It will sure get rid
of the report at hand (but probably not of KTSAN, data race detector,
report), other threads still can see wrong 0 id and the id is still
initialized in racy way. I would expect that a proper fix would be to
publish a fully initialized object with proper, final id. Am I missing
something?

> ==================================================================
> BUG: KMSAN: uninit-value in do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
> CPU: 0 PID: 4528 Comm: syz-executor852 Not tainted 4.17.0-rc5+ #103
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x185/0x1d0 lib/dump_stack.c:113
> kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
> __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
> do_msgrcv+0x509/0x1e30 ipc/msg.c:1048
> ksys_msgrcv ipc/msg.c:1184 [inline]
> __do_sys_msgrcv ipc/msg.c:1190 [inline]
> __se_sys_msgrcv ipc/msg.c:1187 [inline]
> __x64_sys_msgrcv+0x160/0x1b0 ipc/msg.c:1187
> do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x4459b9
> RSP: 002b:00007f0d57662db8 EFLAGS: 00000297 ORIG_RAX: 0000000000000046
> RAX: ffffffffffffffda RBX: 00000000006dac54 RCX: 00000000004459b9
> RDX: 00000000000000d0 RSI: 0000000020000000 RDI: 0000000000260007
> RBP: 00000000006dac50 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000000
> R13: 00007ffd5ab7e25f R14: 00007f0d576639c0 R15: 0000000000000006
>
> Uninit was created at:
> kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
> kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
> kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
> __kmalloc_node+0xe25/0x11f0 mm/slub.c:3865
> kmalloc_node include/linux/slab.h:554 [inline]
> kvmalloc_node+0x197/0x2f0 mm/util.c:421
> kvmalloc include/linux/mm.h:550 [inline]
> newque+0xb4/0x7d0 ipc/msg.c:139
> ipcget_new ipc/util.c:315 [inline]
> ipcget+0x27b/0xd90 ipc/util.c:653
> ksys_msgget ipc/msg.c:289 [inline]
> __do_sys_msgget ipc/msg.c:294 [inline]
> __se_sys_msgget ipc/msg.c:292 [inline]
> __x64_sys_msgget+0x14c/0x1d0 ipc/msg.c:292
> do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ==================================================================
>
> Fix this by simply zero init the whole structure, something that
> sysvsems also do; this is safe as it's a nop, having no secondary
> effect afaict.
>
> Reported-by: syzbot <[email protected]>
> Signed-off-by: Davidlohr Bueso <[email protected]>
> ---
> ipc/msg.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ipc/msg.c b/ipc/msg.c
> index 62545ce19173..da81b374f9fd 100644
> --- a/ipc/msg.c
> +++ b/ipc/msg.c
> @@ -136,7 +136,7 @@ static int newque(struct ipc_namespace *ns, struct
> ipc_params *params)
> key_t key = params->key;
> int msgflg = params->flg;
>
> - msq = kvmalloc(sizeof(*msq), GFP_KERNEL);
> + msq = kvzalloc(sizeof(*msq), GFP_KERNEL);
> if (unlikely(!msq))
> return -ENOMEM;
>
> --
> 2.16.4
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/20180624025651.bvjlcfulbmycz5bf%40linux-r8p5.
> For more options, visit https://groups.google.com/d/optout.

2018-07-04 09:20:24

by Manfred Spraul

[permalink] [raw]

Subject: Re: ipc/msg: zalloc struct msg_queue when creating a new msq

Hello together,

On 06/25/2018 11:21 AM, Dmitry Vyukov wrote:
> On Sun, Jun 24, 2018 at 4:56 AM, Davidlohr Bueso <[email protected]> wrote:
>> The following splat was reported around the msg_queue structure
>> which can have uninitialized fields left over after newque().
>> Future syscalls which make use of the msq id (now valid) can thus
>> make KMSAN complain because not all fields are explicitly initialized
>> and we have the padding as well. This is internal to the kernel,
>> hence no bogus leaks.
> Hi Davidlohr,
>
> As far as I understand the root problem is that (1) we publish a
> not-fully initialized objects and (2) finish it's initialization in a
> racy manner when other threads already have access to it. As the
> result other threads can act on a wrong object. I am not sure that
> zeroing the object really solves these problems. It will sure get rid
> of the report at hand (but probably not of KTSAN, data race detector,
> report), other threads still can see wrong 0 id and the id is still
> initialized in racy way. I would expect that a proper fix would be to
> publish a fully initialized object with proper, final id. Am I missing
> something?
There are 2 relevant values: kern_ipc_perm.id and kern_ipc_perm.seq.

For kern_ipc_perm.id, it is possible to move the access to the codepath
that hold the lock.

For kern_ipc_perm.seq, there are two options:
1) set it before publication.
2) initialize to an invalid value, and correct that at the end.

I'm in favor of option 2, it avoids that we must think about reducing
the next sequence number or not:

The purpose of the sequence counter is to minimize the risk that e.g. a
semop() will write into a newly created array.
I intentially write "minimize the risk", as it is by design impossible
to guarantee that this cannot happen, e.g. if semop() sleeps at the
instruction before the syscall.

Therefore, we can set seq to ULONG_MAX, then ipc_checkid() will always
fail and the corruption is avoided.

What do you think?

And, obviously:
Just set seq to 0 is dangerous, as the first allocated sequence number
is 0, and if that object is destroyed, then the newly created object has
temporarily sequence number 0 as well.

--
Manfred

Attachments:

0001-ipc-initialize-kern_ipc_perm.id-earlier.patch (6.53 kB)

2018-07-04 10:05:33

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: ipc/msg: zalloc struct msg_queue when creating a new msq

On Wed, Jul 4, 2018 at 11:18 AM, Manfred Spraul
<[email protected]> wrote:
> Hello together,
>
> On 06/25/2018 11:21 AM, Dmitry Vyukov wrote:
>>
>> On Sun, Jun 24, 2018 at 4:56 AM, Davidlohr Bueso <[email protected]>
>> wrote:
>>>
>>> The following splat was reported around the msg_queue structure
>>> which can have uninitialized fields left over after newque().
>>> Future syscalls which make use of the msq id (now valid) can thus
>>> make KMSAN complain because not all fields are explicitly initialized
>>> and we have the padding as well. This is internal to the kernel,
>>> hence no bogus leaks.
>>
>> Hi Davidlohr,
>>
>> As far as I understand the root problem is that (1) we publish a
>> not-fully initialized objects and (2) finish it's initialization in a
>> racy manner when other threads already have access to it. As the
>> result other threads can act on a wrong object. I am not sure that
>> zeroing the object really solves these problems. It will sure get rid
>> of the report at hand (but probably not of KTSAN, data race detector,
>> report), other threads still can see wrong 0 id and the id is still
>> initialized in racy way. I would expect that a proper fix would be to
>> publish a fully initialized object with proper, final id. Am I missing
>> something?
>
> There are 2 relevant values: kern_ipc_perm.id and kern_ipc_perm.seq.
>
> For kern_ipc_perm.id, it is possible to move the access to the codepath that
> hold the lock.
>
> For kern_ipc_perm.seq, there are two options:
> 1) set it before publication.
> 2) initialize to an invalid value, and correct that at the end.
>
> I'm in favor of option 2, it avoids that we must think about reducing the
> next sequence number or not:
>
> The purpose of the sequence counter is to minimize the risk that e.g. a
> semop() will write into a newly created array.
> I intentially write "minimize the risk", as it is by design impossible to
> guarantee that this cannot happen, e.g. if semop() sleeps at the instruction
> before the syscall.
>
> Therefore, we can set seq to ULONG_MAX, then ipc_checkid() will always fail
> and the corruption is avoided.
>
> What do you think?
>
> And, obviously:
> Just set seq to 0 is dangerous, as the first allocated sequence number is 0,
> and if that object is destroyed, then the newly created object has
> temporarily sequence number 0 as well.

Hi Manfred,

It still looks fishy to me. This code published uninitialized uid's
for years (which lead not only to accidentally accessing wrong
objects, but also to privilege escalation). Now it publishes uninit
id/seq. The first proposed fix still did not make it correct. I can't
say that I see a but in your patch, but initializing id/seq in a racy
manner rings bells for me. Say, if we write/read seq ahead of id, can
reader still get access to a wrong object?
It all suggests some design flaw to me. Could ipc_idr_alloc() do full
initialization, i.e. also do what ipc_buildid() does? This would
ensure that we publish a fully constructed object in the first place.
We already have cleanup for ipc_idr_alloc(), which is idr_remove(), so
if we care about seq space conservation even in error conditions
(ENOMEM?), idr_remove() could accept an additional flag saying "this
object should not have been used by sane users yet, so retake its
seq". Did I get your concern about seq properly?

Thanks

p.s. I wonder how do you folks decipher unified patches without
context? If somebody will find it useful, side-by-side patch with
context is available here:
https://github.com/dvyukov/linux/commit/c3726a54a3c05b54192b2f5a4eaa4f7fa66ce68a

2018-07-04 14:10:21

by Manfred Spraul

[permalink] [raw]

Subject: Re: ipc/msg: zalloc struct msg_queue when creating a new msq

Hello Dmitry,
On 07/04/2018 12:03 PM, Dmitry Vyukov wrote:
> On Wed, Jul 4, 2018 at 11:18 AM, Manfred Spraul
> <[email protected]> wrote:
>>
>> There are 2 relevant values: kern_ipc_perm.id and kern_ipc_perm.seq.
>>
>> For kern_ipc_perm.id, it is possible to move the access to the codepath that
>> hold the lock.
>>
>> For kern_ipc_perm.seq, there are two options:
>> 1) set it before publication.
>> 2) initialize to an invalid value, and correct that at the end.
>>
>> I'm in favor of option 2, it avoids that we must think about reducing the
>> next sequence number or not:
>>
>> The purpose of the sequence counter is to minimize the risk that e.g. a
>> semop() will write into a newly created array.
>> I intentially write "minimize the risk", as it is by design impossible to
>> guarantee that this cannot happen, e.g. if semop() sleeps at the instruction
>> before the syscall.
>>
>> Therefore, we can set seq to ULONG_MAX, then ipc_checkid() will always fail
>> and the corruption is avoided.
>>
>> What do you think?
>>
>> And, obviously:
>> Just set seq to 0 is dangerous, as the first allocated sequence number is 0,
>> and if that object is destroyed, then the newly created object has
>> temporarily sequence number 0 as well.
> Hi Manfred,
>
> It still looks fishy to me. This code published uninitialized uid's
> for years (which lead not only to accidentally accessing wrong
> objects, but also to privilege escalation). Now it publishes uninit
> id/seq. The first proposed fix still did not make it correct. I can't
> say that I see a but in your patch, but initializing id/seq in a racy
> manner rings bells for me. Say, if we write/read seq ahead of id, can
> reader still get access to a wrong object?
> It all suggests some design flaw to me. Could ipc_idr_alloc() do full
> initialization, i.e. also do what ipc_buildid() does? This would
> ensure that we publish a fully constructed object in the first place.
> We already have cleanup for ipc_idr_alloc(), which is idr_remove(), so
> if we care about seq space conservation even in error conditions
> (ENOMEM?), idr_remove() could accept an additional flag saying "this
> object should not have been used by sane users yet, so retake its
> seq". Did I get your concern about seq properly?
You have convinced me, I'll rewrite the patch:

1) kern_ipc_perm.seq should be accessible under rcu_read_lock(), this
means replacing ipc_build_id() with two functions:
One that initializes kern_ipc_perm.seq, and one that would set
kern_ipc_perm.id.
2) the accesses to kern_ipc_perm.id must be moved to the position where
the lock is held. This is trivial.
3) we need a clear table that describes which variables can be accessed
under rcu_read_lock() and which need ipc_lock_object().
e.g.: kern_ipc_perm.id would end up under ipc_lock_object,
kern_ipc_perm.seq or the xuid fields can be read under rcu_read_lock().
Everything that can be accessed without ipc_lock_object must be
initialized before publication of a new object.

Or, as all access to kern_ipc_perm.id are in rare codepaths:
I'll remove kern_ipc_perm.id entirely, and build the id on demand.

Ok?

--
Manfred

2018-07-04 14:24:14

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: ipc/msg: zalloc struct msg_queue when creating a new msq

On Wed, Jul 4, 2018 at 4:08 PM, Manfred Spraul <[email protected]> wrote:
> Hello Dmitry,
> On 07/04/2018 12:03 PM, Dmitry Vyukov wrote:
>>
>> On Wed, Jul 4, 2018 at 11:18 AM, Manfred Spraul
>> <[email protected]> wrote:
>>>
>>>
>>> There are 2 relevant values: kern_ipc_perm.id and kern_ipc_perm.seq.
>>>
>>> For kern_ipc_perm.id, it is possible to move the access to the codepath
>>> that
>>> hold the lock.
>>>
>>> For kern_ipc_perm.seq, there are two options:
>>> 1) set it before publication.
>>> 2) initialize to an invalid value, and correct that at the end.
>>>
>>> I'm in favor of option 2, it avoids that we must think about reducing the
>>> next sequence number or not:
>>>
>>> The purpose of the sequence counter is to minimize the risk that e.g. a
>>> semop() will write into a newly created array.
>>> I intentially write "minimize the risk", as it is by design impossible to
>>> guarantee that this cannot happen, e.g. if semop() sleeps at the
>>> instruction
>>> before the syscall.
>>>
>>> Therefore, we can set seq to ULONG_MAX, then ipc_checkid() will always
>>> fail
>>> and the corruption is avoided.
>>>
>>> What do you think?
>>>
>>> And, obviously:
>>> Just set seq to 0 is dangerous, as the first allocated sequence number is
>>> 0,
>>> and if that object is destroyed, then the newly created object has
>>> temporarily sequence number 0 as well.
>>
>> Hi Manfred,
>>
>> It still looks fishy to me. This code published uninitialized uid's
>> for years (which lead not only to accidentally accessing wrong
>> objects, but also to privilege escalation). Now it publishes uninit
>> id/seq. The first proposed fix still did not make it correct. I can't
>> say that I see a but in your patch, but initializing id/seq in a racy
>> manner rings bells for me. Say, if we write/read seq ahead of id, can
>> reader still get access to a wrong object?
>> It all suggests some design flaw to me. Could ipc_idr_alloc() do full
>> initialization, i.e. also do what ipc_buildid() does? This would
>> ensure that we publish a fully constructed object in the first place.
>> We already have cleanup for ipc_idr_alloc(), which is idr_remove(), so
>> if we care about seq space conservation even in error conditions
>> (ENOMEM?), idr_remove() could accept an additional flag saying "this
>> object should not have been used by sane users yet, so retake its
>> seq". Did I get your concern about seq properly?
>
> You have convinced me, I'll rewrite the patch:
>
> 1) kern_ipc_perm.seq should be accessible under rcu_read_lock(), this means
> replacing ipc_build_id() with two functions:
> One that initializes kern_ipc_perm.seq, and one that would set
> kern_ipc_perm.id.
> 2) the accesses to kern_ipc_perm.id must be moved to the position where the
> lock is held. This is trivial.
> 3) we need a clear table that describes which variables can be accessed
> under rcu_read_lock() and which need ipc_lock_object().
> e.g.: kern_ipc_perm.id would end up under ipc_lock_object,
> kern_ipc_perm.seq or the xuid fields can be read under rcu_read_lock().
> Everything that can be accessed without ipc_lock_object must be
> initialized before publication of a new object.
>
> Or, as all access to kern_ipc_perm.id are in rare codepaths:
> I'll remove kern_ipc_perm.id entirely, and build the id on demand.
>
> Ok?

Sounds like a more solid plan. If we remove kern_ipc_perm.id, then we
also don't need to split ipc_buildid() into two functions, right? And
since seq becomes constant throughout object lifetime, it probably
does not need any special access rules.

Thanks