2013-06-25 23:29:48

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote:
> On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso
> <[email protected]> wrote:
> > On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote:
> > [...]
> >
> >> I did some more testing with Linux-Testing-Project (release:
> >> ltp-full-20130503) and next-20130624 (Monday) which has still the
> >> issue, here.
> >>
> >> If I revert the mentioned two commits from my local
> >> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything
> >> is fine.
> >>
> >> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases.
> >>
> >> root# ./runltp -f ipc
> >>
> >> root# ./runltp -f syscalls
> >
> > These are nice test cases!
> >
> > So I was able to reproduce the issue with LTP and manually running
> > msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock
> > before calling it. The following changes fixes the issue and passes all
> > 'runltp -f syscall' tests, could you give it a try?
> >
>
> Cool, that fixes the issues here.
>
> Building with fakeroot & make deb-pkg is now OK, again.
>
> The syscalls/msgctl08 test-case ran successfully!

Andrew, could you pick this one up? I've made the patch on top of
3.10.0-rc7-next-20130625

Thanks.
Davidlohr

8<---------------------------------

From: Davidlohr Bueso <[email protected]>
Subject: [PATCH] ipc,msq: fix race in msgrcv(2)

Sedat reported the following issue when building the latest linux-next:

Building via 'make deb-pkg' with fakeroot fails here like this:

make: *** [deb-pkg] Terminated
/usr/bin/fakeroot: line 181: 2386 Terminated
FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
"$@"
semop(1): encountered an error: Identifier removed
semop(2): encountered an error: Invalid argument
semop(1): encountered an error: Identifier removed
semop(1): encountered an error: Identifier removed
semop(1): encountered an error: Invalid argument
semop(1): encountered an error: Invalid argument
semop(1): encountered an error: Invalid argument

The issue was caused by a race in find_msg(), so acquire the q_perm.lock
before calling the function. This also broke some LTP test cases:

<<<test_start>>>
tag=msgctl08 stime=1372174954
cmdline="msgctl08"
contacts=""
analysis=exit
<<<test_output>>>
msgctl08 0 TWARN : Verify error in child 0, *buf = 28, val = 27, size = 8
msgctl08 1 TFAIL : in child 0 read # = 73,key = 127
msgctl08 0 TWARN : Verify error in child 3, *buf = ffffff8a, val
= ffffff89, size = 52
msgctl08 1 TFAIL : in child 3 read # = 157,key = 189
msgctl08 0 TWARN : Verify error in child 2, *buf = ffffff87, val
= ffffff86, size = 71
msgctl08 1 TFAIL : in child 2 read # = 15954,key = 3e86
msgctl08 0 TWARN : Verify error in child 12, *buf = ffffffa9,
val = ffffffa8, size = 22
msgctl08 1 TFAIL : in child 12 read # = 12904,key = 32a8
msgctl08 0 TWARN : Verify error in child 13, *buf = 36, val =
35, size = 27
...

Also update a comment referring to ipc_lock_by_ptr(), which has already been deleted
and no longer applies to this context.

Reported-and-tested-by: Sedat Dilek <[email protected]>
Signed-off-by: Davidlohr Bueso <[email protected]>
---
ipc/msg.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index a1cf70e..bd60d7e 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -895,6 +895,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
if (ipcperms(ns, &msq->q_perm, S_IRUGO))
goto out_unlock1;

+ ipc_lock_object(&msq->q_perm);
msg = find_msg(msq, &msgtyp, mode);
if (!IS_ERR(msg)) {
/*
@@ -903,7 +904,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
*/
if ((bufsz < msg->m_ts) && !(msgflg & MSG_NOERROR)) {
msg = ERR_PTR(-E2BIG);
- goto out_unlock1;
+ goto out_unlock0;
}
/*
* If we are copying, then do not unlink message and do
@@ -911,10 +912,9 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
*/
if (msgflg & MSG_COPY) {
msg = copy_msg(msg, copy);
- goto out_unlock1;
+ goto out_unlock0;
}

- ipc_lock_object(&msq->q_perm);
list_del(&msg->m_list);
msq->q_qnum--;
msq->q_rtime = get_seconds();
@@ -930,10 +930,9 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
/* No message waiting. Wait for a message */
if (msgflg & IPC_NOWAIT) {
msg = ERR_PTR(-ENOMSG);
- goto out_unlock1;
+ goto out_unlock0;
}

- ipc_lock_object(&msq->q_perm);
list_add_tail(&msr_d.r_list, &msq->q_receivers);
msr_d.r_tsk = current;
msr_d.r_msgtype = msgtyp;
@@ -957,7 +956,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
* Prior to destruction, expunge_all(-EIRDM) changes r_msg.
* Thus if r_msg is -EAGAIN, then the queue not yet destroyed.
* rcu_read_lock() prevents preemption between reading r_msg
- * and the spin_lock() inside ipc_lock_by_ptr().
+ * and acquiring the q_perm.lock in ipc_lock_object().
*/
rcu_read_lock();

--
1.7.11.7



2013-08-28 12:00:32

by Vineet Gupta

[permalink] [raw]
Subject: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

Hi David,

On 06/26/2013 04:59 AM, Davidlohr Bueso wrote:
> On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote:
>> On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso
>> <[email protected]> wrote:
>>> On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote:
>>> [...]
>>>
>>>> I did some more testing with Linux-Testing-Project (release:
>>>> ltp-full-20130503) and next-20130624 (Monday) which has still the
>>>> issue, here.
>>>>
>>>> If I revert the mentioned two commits from my local
>>>> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything
>>>> is fine.
>>>>
>>>> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases.
>>>>
>>>> root# ./runltp -f ipc
>>>>
>>>> root# ./runltp -f syscalls
>>>
>>> These are nice test cases!
>>>
>>> So I was able to reproduce the issue with LTP and manually running
>>> msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock
>>> before calling it. The following changes fixes the issue and passes all
>>> 'runltp -f syscall' tests, could you give it a try?
>>>
>>
>> Cool, that fixes the issues here.
>>
>> Building with fakeroot & make deb-pkg is now OK, again.
>>
>> The syscalls/msgctl08 test-case ran successfully!
>
> Andrew, could you pick this one up? I've made the patch on top of
> 3.10.0-rc7-next-20130625

LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
bisected it, sigh... didn't look at this thread earlier :-( and landed into this.

------------->8------------------------------------
3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
Author: Davidlohr Bueso <[email protected]>
Date: Mon Jul 8 16:01:17 2013 -0700

ipc,msg: shorten critical region in msgsnd

do_msgsnd() is another function that does too many things with the ipc
object lock acquired. Take it only when needed when actually updating
msq.
------------->8------------------------------------

If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
passes. I can confirm that linux-next also has the issue (didn't try the revert
there though).

1. arc 3.11-rc7 config attached (UP + PREEMPT)
2. dmesg prints "msgmni has been set to 479"
3. LTP output (this is slightly dated source, so prints might vary)

------------->8------------------------------------
<<<test_start>>>
tag=msgctl08 stime=1377689180
cmdline="msgctl08"
contacts=""
analysis=exit
initiation_status="ok"
<<<test_output>>>
------------->8-------- hung here ------------------


Let me know if you need more data/test help.

-Vineet


Attachments:
.config (24.55 kB)

2013-08-29 03:04:44

by Sedat Dilek

[permalink] [raw]
Subject: Re: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

On Wed, Aug 28, 2013 at 1:58 PM, Vineet Gupta
<[email protected]> wrote:
> Hi David,
>
> On 06/26/2013 04:59 AM, Davidlohr Bueso wrote:
>> On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote:
>>> On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso
>>> <[email protected]> wrote:
>>>> On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote:
>>>> [...]
>>>>
>>>>> I did some more testing with Linux-Testing-Project (release:
>>>>> ltp-full-20130503) and next-20130624 (Monday) which has still the
>>>>> issue, here.
>>>>>
>>>>> If I revert the mentioned two commits from my local
>>>>> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything
>>>>> is fine.
>>>>>
>>>>> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases.
>>>>>
>>>>> root# ./runltp -f ipc
>>>>>
>>>>> root# ./runltp -f syscalls
>>>>
>>>> These are nice test cases!
>>>>
>>>> So I was able to reproduce the issue with LTP and manually running
>>>> msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock
>>>> before calling it. The following changes fixes the issue and passes all
>>>> 'runltp -f syscall' tests, could you give it a try?
>>>>
>>>
>>> Cool, that fixes the issues here.
>>>
>>> Building with fakeroot & make deb-pkg is now OK, again.
>>>
>>> The syscalls/msgctl08 test-case ran successfully!
>>
>> Andrew, could you pick this one up? I've made the patch on top of
>> 3.10.0-rc7-next-20130625
>
> LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
> bisected it, sigh... didn't look at this thread earlier :-( and landed into this.
>
> ------------->8------------------------------------
> 3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
> commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
> Author: Davidlohr Bueso <[email protected]>
> Date: Mon Jul 8 16:01:17 2013 -0700
>
> ipc,msg: shorten critical region in msgsnd
>
> do_msgsnd() is another function that does too many things with the ipc
> object lock acquired. Take it only when needed when actually updating
> msq.
> ------------->8------------------------------------
>
> If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
> passes. I can confirm that linux-next also has the issue (didn't try the revert
> there though).
>
> 1. arc 3.11-rc7 config attached (UP + PREEMPT)
> 2. dmesg prints "msgmni has been set to 479"
> 3. LTP output (this is slightly dated source, so prints might vary)
>
> ------------->8------------------------------------
> <<<test_start>>>
> tag=msgctl08 stime=1377689180
> cmdline="msgctl08"
> contacts=""
> analysis=exit
> initiation_status="ok"
> <<<test_output>>>
> ------------->8-------- hung here ------------------
>
>
> Let me know if you need more data/test help.
>

Cannot say much to your constellation as I had the issue on x86-64 and
Linux-next.
But I have just seen a post-v3.11-rc7 IPC-fix in [1].

I have here a v3.11-rc7 kernel with drm-intel-nightly on top... did not run LTP.

Which LTP release do you use?
Might be good to attach your kernel-config for followers?

- Sedat -

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=368ae537e056acd3f751fa276f48423f06803922

2013-08-29 07:21:58

by Vineet Gupta

[permalink] [raw]
Subject: Re: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

On 08/29/2013 08:34 AM, Sedat Dilek wrote:
> On Wed, Aug 28, 2013 at 1:58 PM, Vineet Gupta
> <[email protected]> wrote:
>> Hi David,
>>
>> On 06/26/2013 04:59 AM, Davidlohr Bueso wrote:
>>> On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote:
>>>> On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso
>>>> <[email protected]> wrote:
>>>>> On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote:
>>>>> [...]
>>>>>
>>>>>> I did some more testing with Linux-Testing-Project (release:
>>>>>> ltp-full-20130503) and next-20130624 (Monday) which has still the
>>>>>> issue, here.
>>>>>>
>>>>>> If I revert the mentioned two commits from my local
>>>>>> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything
>>>>>> is fine.
>>>>>>
>>>>>> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases.
>>>>>>
>>>>>> root# ./runltp -f ipc
>>>>>>
>>>>>> root# ./runltp -f syscalls
>>>>> These are nice test cases!
>>>>>
>>>>> So I was able to reproduce the issue with LTP and manually running
>>>>> msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock
>>>>> before calling it. The following changes fixes the issue and passes all
>>>>> 'runltp -f syscall' tests, could you give it a try?
>>>>>
>>>> Cool, that fixes the issues here.
>>>>
>>>> Building with fakeroot & make deb-pkg is now OK, again.
>>>>
>>>> The syscalls/msgctl08 test-case ran successfully!
>>> Andrew, could you pick this one up? I've made the patch on top of
>>> 3.10.0-rc7-next-20130625
>> LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
>> bisected it, sigh... didn't look at this thread earlier :-( and landed into this.
>>
>> ------------->8------------------------------------
>> 3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
>> commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
>> Author: Davidlohr Bueso <[email protected]>
>> Date: Mon Jul 8 16:01:17 2013 -0700
>>
>> ipc,msg: shorten critical region in msgsnd
>>
>> do_msgsnd() is another function that does too many things with the ipc
>> object lock acquired. Take it only when needed when actually updating
>> msq.
>> ------------->8------------------------------------
>>
>> If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
>> passes. I can confirm that linux-next also has the issue (didn't try the revert
>> there though).
>>
>> 1. arc 3.11-rc7 config attached (UP + PREEMPT)
>> 2. dmesg prints "msgmni has been set to 479"
>> 3. LTP output (this is slightly dated source, so prints might vary)
>>
>> ------------->8------------------------------------
>> <<<test_start>>>
>> tag=msgctl08 stime=1377689180
>> cmdline="msgctl08"
>> contacts=""
>> analysis=exit
>> initiation_status="ok"
>> <<<test_output>>>
>> ------------->8-------- hung here ------------------
>>
>>
>> Let me know if you need more data/test help.
>>
> Cannot say much to your constellation as I had the issue on x86-64 and
> Linux-next.
> But I have just seen a post-v3.11-rc7 IPC-fix in [1].
>
> I have here a v3.11-rc7 kernel with drm-intel-nightly on top... did not run LTP.

Not sure what you mean - I'd posted that Im seeing the issue on ARC Linux (an FPGA
board) 3.11-rc7 as well as linux-next of yesterday.

> Which LTP release do you use?

The LTP build I generally use is from a 2007 based sources (lazy me). However I
knew this would come up so before posting, I'd built the latest from buildroot and
ran the msgctl08 from there standalone and it did the same thing.

> Might be good to attach your kernel-config for followers?

It was already there in my orig msg - you probably missed it.

> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=368ae537e056acd3f751fa276f48423f06803922

I tried linux-next of today, same deal - msgctl08 still hangs.

-Vineet

2013-08-29 07:52:16

by Sedat Dilek

[permalink] [raw]
Subject: Re: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

On Thu, Aug 29, 2013 at 9:21 AM, Vineet Gupta
<[email protected]> wrote:
> On 08/29/2013 08:34 AM, Sedat Dilek wrote:
>> On Wed, Aug 28, 2013 at 1:58 PM, Vineet Gupta
>> <[email protected]> wrote:
>>> Hi David,
>>>
>>> On 06/26/2013 04:59 AM, Davidlohr Bueso wrote:
>>>> On Tue, 2013-06-25 at 23:41 +0200, Sedat Dilek wrote:
>>>>> On Tue, Jun 25, 2013 at 10:33 PM, Davidlohr Bueso
>>>>> <[email protected]> wrote:
>>>>>> On Tue, 2013-06-25 at 18:10 +0200, Sedat Dilek wrote:
>>>>>> [...]
>>>>>>
>>>>>>> I did some more testing with Linux-Testing-Project (release:
>>>>>>> ltp-full-20130503) and next-20130624 (Monday) which has still the
>>>>>>> issue, here.
>>>>>>>
>>>>>>> If I revert the mentioned two commits from my local
>>>>>>> revert-ipc-next20130624-5089fd1c6a6a-ab9efc2d0db5 GIT repo, everything
>>>>>>> is fine.
>>>>>>>
>>>>>>> I have tested the LTP ***IPC*** and ***SYSCALLS*** testcases.
>>>>>>>
>>>>>>> root# ./runltp -f ipc
>>>>>>>
>>>>>>> root# ./runltp -f syscalls
>>>>>> These are nice test cases!
>>>>>>
>>>>>> So I was able to reproduce the issue with LTP and manually running
>>>>>> msgctl08. We seemed to be racing at find_msg(), so take to q_perm lock
>>>>>> before calling it. The following changes fixes the issue and passes all
>>>>>> 'runltp -f syscall' tests, could you give it a try?
>>>>>>
>>>>> Cool, that fixes the issues here.
>>>>>
>>>>> Building with fakeroot & make deb-pkg is now OK, again.
>>>>>
>>>>> The syscalls/msgctl08 test-case ran successfully!
>>>> Andrew, could you pick this one up? I've made the patch on top of
>>>> 3.10.0-rc7-next-20130625
>>> LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
>>> bisected it, sigh... didn't look at this thread earlier :-( and landed into this.
>>>
>>> ------------->8------------------------------------
>>> 3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
>>> commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
>>> Author: Davidlohr Bueso <[email protected]>
>>> Date: Mon Jul 8 16:01:17 2013 -0700
>>>
>>> ipc,msg: shorten critical region in msgsnd
>>>
>>> do_msgsnd() is another function that does too many things with the ipc
>>> object lock acquired. Take it only when needed when actually updating
>>> msq.
>>> ------------->8------------------------------------
>>>
>>> If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
>>> passes. I can confirm that linux-next also has the issue (didn't try the revert
>>> there though).
>>>
>>> 1. arc 3.11-rc7 config attached (UP + PREEMPT)
>>> 2. dmesg prints "msgmni has been set to 479"
>>> 3. LTP output (this is slightly dated source, so prints might vary)
>>>
>>> ------------->8------------------------------------
>>> <<<test_start>>>
>>> tag=msgctl08 stime=1377689180
>>> cmdline="msgctl08"
>>> contacts=""
>>> analysis=exit
>>> initiation_status="ok"
>>> <<<test_output>>>
>>> ------------->8-------- hung here ------------------
>>>
>>>
>>> Let me know if you need more data/test help.
>>>
>> Cannot say much to your constellation as I had the issue on x86-64 and
>> Linux-next.
>> But I have just seen a post-v3.11-rc7 IPC-fix in [1].
>>
>> I have here a v3.11-rc7 kernel with drm-intel-nightly on top... did not run LTP.
>
> Not sure what you mean - I'd posted that Im seeing the issue on ARC Linux (an FPGA
> board) 3.11-rc7 as well as linux-next of yesterday.
>

I am not saying there is no issue, but I have no possibility to test
for ARC arch.

>> Which LTP release do you use?
>
> The LTP build I generally use is from a 2007 based sources (lazy me). However I
> knew this would come up so before posting, I'd built the latest from buildroot and
> ran the msgctl08 from there standalone and it did the same thing.
>

Try always latest LTP-stable (03-May-2013 is what I tried). AFAICS a
new release is planned soon.

>> Might be good to attach your kernel-config for followers?
>
> It was already there in my orig msg - you probably missed it.
>

I have got that response from you only :-).

>> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=368ae537e056acd3f751fa276f48423f06803922
>
> I tried linux-next of today, same deal - msgctl08 still hangs.
>

That above fix [1] in Linus-tree is also in next-20130828.

Hope Davidlohr and fellows can help you.

- Sedat -

2013-08-30 08:19:49

by Vineet Gupta

[permalink] [raw]
Subject: Re: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

Ping ?

It seems 3.11 is pretty close to releasing but we stil have LTP msgctl08 causing a
hang (atleast on ARC) for both linux-next 20130829 as well as Linus tree.

So far, I haven't seemed to have drawn attention of people involved.

-Vineet

On 08/29/2013 01:22 PM, Sedat Dilek wrote:
> On Thu, Aug 29, 2013 at 9:21 AM, Vineet Gupta
> <[email protected]> wrote:
>> On 08/29/2013 08:34 AM, Sedat Dilek wrote:
>>> On Wed, Aug 28, 2013 at 1:58 PM, Vineet Gupta
>>> <[email protected]> wrote:
>>>> Hi David,
>>>>

[....]

>>>> LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
>>>> bisected it, sigh... didn't look at this thread earlier :-( and landed into this.
>>>>
>>>> ------------->8------------------------------------
>>>> 3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
>>>> commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
>>>> Author: Davidlohr Bueso <[email protected]>
>>>> Date: Mon Jul 8 16:01:17 2013 -0700
>>>>
>>>> ipc,msg: shorten critical region in msgsnd
>>>>
>>>> do_msgsnd() is another function that does too many things with the ipc
>>>> object lock acquired. Take it only when needed when actually updating
>>>> msq.
>>>> ------------->8------------------------------------
>>>>
>>>> If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
>>>> passes. I can confirm that linux-next also has the issue (didn't try the revert
>>>> there though).
>>>>
>>>> 1. arc 3.11-rc7 config attached (UP + PREEMPT)
>>>> 2. dmesg prints "msgmni has been set to 479"
>>>> 3. LTP output (this is slightly dated source, so prints might vary)
>>>>
>>>> ------------->8------------------------------------
>>>> <<<test_start>>>
>>>> tag=msgctl08 stime=1377689180
>>>> cmdline="msgctl08"
>>>> contacts=""
>>>> analysis=exit
>>>> initiation_status="ok"
>>>> <<<test_output>>>
>>>> ------------->8-------- hung here ------------------
>>>>
>>>>
>>>> Let me know if you need more data/test help.
>>>>
>>> Cannot say much to your constellation as I had the issue on x86-64 and
>>> Linux-next.
>>> But I have just seen a post-v3.11-rc7 IPC-fix in [1].
>>>
>>> I have here a v3.11-rc7 kernel with drm-intel-nightly on top... did not run LTP.
>>
>> Not sure what you mean - I'd posted that Im seeing the issue on ARC Linux (an FPGA
>> board) 3.11-rc7 as well as linux-next of yesterday.
>>
>
> I am not saying there is no issue, but I have no possibility to test
> for ARC arch.
>
>>> Which LTP release do you use?
>>
>> The LTP build I generally use is from a 2007 based sources (lazy me). However I
>> knew this would come up so before posting, I'd built the latest from buildroot and
>> ran the msgctl08 from there standalone and it did the same thing.
>>
>
> Try always latest LTP-stable (03-May-2013 is what I tried). AFAICS a
> new release is planned soon.
>
>>> Might be good to attach your kernel-config for followers?
>>
>> It was already there in my orig msg - you probably missed it.
>>
>
> I have got that response from you only :-).
>
>>> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=368ae537e056acd3f751fa276f48423f06803922
>>
>> I tried linux-next of today, same deal - msgctl08 still hangs.
>>
>
> That above fix [1] in Linus-tree is also in next-20130828.
>
> Hope Davidlohr and fellows can help you.
>
> - Sedat -
>

2013-08-30 08:27:48

by Sedat Dilek

[permalink] [raw]
Subject: Re: ipc-msg broken again on 3.11-rc7? (was Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ])

On Fri, Aug 30, 2013 at 10:19 AM, Vineet Gupta <[email protected]> wrote:
> Ping ?
>
> It seems 3.11 is pretty close to releasing but we stil have LTP msgctl08 causing a
> hang (atleast on ARC) for both linux-next 20130829 as well as Linus tree.
>
> So far, I haven't seemed to have drawn attention of people involved.
>

Hi Vineet,

I remember fakeroot was an another good test-case for me to test this
IPC breakage.
Attached is my build-script for Linux-next (tested with Debian/Ubuntu).
( Cannot say if you can play with it in your environment. )

Regards,
- Sedat -

> -Vineet
>
> On 08/29/2013 01:22 PM, Sedat Dilek wrote:
>> On Thu, Aug 29, 2013 at 9:21 AM, Vineet Gupta
>> <[email protected]> wrote:
>>> On 08/29/2013 08:34 AM, Sedat Dilek wrote:
>>>> On Wed, Aug 28, 2013 at 1:58 PM, Vineet Gupta
>>>> <[email protected]> wrote:
>>>>> Hi David,
>>>>>
>
> [....]
>
>>>>> LTP msgctl08 hangs on 3.11-rc7 (ARC port) with some of my local changes. I
>>>>> bisected it, sigh... didn't look at this thread earlier :-( and landed into this.
>>>>>
>>>>> ------------->8------------------------------------
>>>>> 3dd1f784ed6603d7ab1043e51e6371235edf2313 is the first bad commit
>>>>> commit 3dd1f784ed6603d7ab1043e51e6371235edf2313
>>>>> Author: Davidlohr Bueso <[email protected]>
>>>>> Date: Mon Jul 8 16:01:17 2013 -0700
>>>>>
>>>>> ipc,msg: shorten critical region in msgsnd
>>>>>
>>>>> do_msgsnd() is another function that does too many things with the ipc
>>>>> object lock acquired. Take it only when needed when actually updating
>>>>> msq.
>>>>> ------------->8------------------------------------
>>>>>
>>>>> If I revert 3dd1f784ed66 and 9ad66ae "ipc: remove unused functions" - the test
>>>>> passes. I can confirm that linux-next also has the issue (didn't try the revert
>>>>> there though).
>>>>>
>>>>> 1. arc 3.11-rc7 config attached (UP + PREEMPT)
>>>>> 2. dmesg prints "msgmni has been set to 479"
>>>>> 3. LTP output (this is slightly dated source, so prints might vary)
>>>>>
>>>>> ------------->8------------------------------------
>>>>> <<<test_start>>>
>>>>> tag=msgctl08 stime=1377689180
>>>>> cmdline="msgctl08"
>>>>> contacts=""
>>>>> analysis=exit
>>>>> initiation_status="ok"
>>>>> <<<test_output>>>
>>>>> ------------->8-------- hung here ------------------
>>>>>
>>>>>
>>>>> Let me know if you need more data/test help.
>>>>>
>>>> Cannot say much to your constellation as I had the issue on x86-64 and
>>>> Linux-next.
>>>> But I have just seen a post-v3.11-rc7 IPC-fix in [1].
>>>>
>>>> I have here a v3.11-rc7 kernel with drm-intel-nightly on top... did not run LTP.
>>>
>>> Not sure what you mean - I'd posted that Im seeing the issue on ARC Linux (an FPGA
>>> board) 3.11-rc7 as well as linux-next of yesterday.
>>>
>>
>> I am not saying there is no issue, but I have no possibility to test
>> for ARC arch.
>>
>>>> Which LTP release do you use?
>>>
>>> The LTP build I generally use is from a 2007 based sources (lazy me). However I
>>> knew this would come up so before posting, I'd built the latest from buildroot and
>>> ran the msgctl08 from there standalone and it did the same thing.
>>>
>>
>> Try always latest LTP-stable (03-May-2013 is what I tried). AFAICS a
>> new release is planned soon.
>>
>>>> Might be good to attach your kernel-config for followers?
>>>
>>> It was already there in my orig msg - you probably missed it.
>>>
>>
>> I have got that response from you only :-).
>>
>>>> [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=368ae537e056acd3f751fa276f48423f06803922
>>>
>>> I tried linux-next of today, same deal - msgctl08 still hangs.
>>>
>>
>> That above fix [1] in Linus-tree is also in next-20130828.
>>
>> Hope Davidlohr and fellows can help you.
>>
>> - Sedat -
>>
>


Attachments:
build_linux-next.sh (4.51 kB)