2013-06-21 19:34:22

by Sedat Dilek

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
> Hi all,
>
> Happy solstice!
>
> Changes since 20130620:
>
> Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
>
> The net-next tree gained a conflict against the net tree.
>
> The leds tree still had its build failure, so I used the version from
> next-20130607.
>
> The arm-soc tree gained conflicts against the tip, net-next, mfd and
> mailbox trees.
>
> The staging tree still had its build failure for which I disabled some
> code.
>
> The akpm tree lost a few patches that turned up elsewhere and gained
> conflicts against the ftrace and arm-soc trees.
>
> ----------------------------------------------------------------------------
>

[ CC IPC folks ]

Building via 'make deb-pkg' with fakeroot fails here like this:

make: *** [deb-pkg] Terminated
/usr/bin/fakeroot: line 181: 2386 Terminated
FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
"$@"
semop(1): encountered an error: Identifier removed
semop(2): encountered an error: Invalid argument
semop(1): encountered an error: Identifier removed
semop(1): encountered an error: Identifier removed
semop(1): encountered an error: Invalid argument
semop(1): encountered an error: Invalid argument
semop(1): encountered an error: Invalid argument
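
For readers decoding the output above: the two messages look like the strerror() text for EIDRM and EINVAL. Below is a minimal sketch, assuming the message strings come from the C library in the default "C" locale. The helper names semop_error_text() and semop_errno_hint() are hypothetical and are not part of fakeroot or the kernel; the hints paraphrase the semop(2) manual page.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: the C library's text for an errno value,
 * which is presumably what fakeroot printed after "encountered an error:". */
const char *semop_error_text(int err)
{
    assert(err > 0);
    return strerror(err);
}

/* Hypothetical helper: what semop(2) documents for the two errors seen
 * in the log. Any other value is reported as "other error". */
const char *semop_errno_hint(int err)
{
    assert(err > 0);
    switch (err) {
    case EIDRM:
        return "the semaphore set was removed from the system";
    case EINVAL:
        return "the semaphore set does not exist, semid is negative, "
               "or nsops is not positive";
    default:
        return "other error";
    }
}
```

Both errors point at a semaphore set that went away underneath the caller, which fits a fakeroot daemon that was terminated together with its IPC objects.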

The issue has been present since next-20130606!

LAST KNOWN GOOD: next-20130605
FIRST KNOWN BAD: next-20130606

KNOWN GOOD: next-20130604
KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621

git-bisect says CULPRIT commit is...

"ipc,msg: shorten critical region in msgrcv"

NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!

Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.

I have tested the revert-patches with next-20130606 and next-20130621
(see file-attachments).

My build-script is attached!

Can one of the IPC folks look at that?
Thanks!

- Sedat -


P.S.: Commit-IDs listed below.

[ next-20130606 ]

http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606

"ipc: remove unused functions"
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79

"ipc,msg: shorten critical region in msgrcv"
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed

[ next-20130621 ]

http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621

"ipc: remove unused functions"
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3

"ipc,msg: shorten critical region in msgrcv"
http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac


Attachments:
build_linux-next.sh (3.76 kB)
3.10.0-rc4-next20130606-3-iniza-small.patch (6.19 kB)
3.10.0-rc6-next20130621-2-iniza-small.patch (7.46 kB)

2013-06-21 22:07:23

by Davidlohr Bueso

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Fri, 2013-06-21 at 21:34 +0200, Sedat Dilek wrote:
> On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
> > Hi all,
> >
> > Happy solstice!
> >
> > Changes since 20130620:
> >
> > Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
> >
> > The net-next tree gained a conflict against the net tree.
> >
> > The leds tree still had its build failure, so I used the version from
> > next-20130607.
> >
> > The arm-soc tree gained conflicts against the tip, net-next, mfd and
> > mailbox trees.
> >
> > The staging tree still had its build failure for which I disabled some
> > code.
> >
> > The akpm tree lost a few patches that turned up elsewhere and gained
> > conflicts against the ftrace and arm-soc trees.
> >
> > ----------------------------------------------------------------------------
> >
>
> [ CC IPC folks ]
>
> Building via 'make deb-pkg' with fakeroot fails here like this:
>
> make: *** [deb-pkg] Terminated
> /usr/bin/fakeroot: line 181: 2386 Terminated
> FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
> "$@"
> semop(1): encountered an error: Identifier removed
> semop(2): encountered an error: Invalid argument
> semop(1): encountered an error: Identifier removed
> semop(1): encountered an error: Identifier removed
> semop(1): encountered an error: Invalid argument
> semop(1): encountered an error: Invalid argument
> semop(1): encountered an error: Invalid argument
>

Hmmm those really shouldn't be related to the message queue changes. Are
you sure you got the right bisect?

Manfred has a few ipc/sem.c patches in linux-next, starting at commit
c50df1b4 (ipc/sem.c: cacheline align the semaphore structures). Does
reverting any of those instead of "ipc,msg: shorten critical region in
msgrcv" help at all? Also, is anything reported in dmesg?

> The issue is present since next-20130606!
>
> LAST KNOWN GOOD: next-20130605
> FIRST KNOWN BAD: next-20130606
>
> KNOWN GOOD: next-20130604
> KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621
>
> git-bisect says CULPRIT commit is...
>
> "ipc,msg: shorten critical region in msgrcv"

This I get. I went through the code again and it looks correct and
functionally equivalent to the old msgrcv.

>
> NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!

This I don't get. Restoring msg_lock_[check] is already equivalent to
reverting "ipc,msg: shorten critical region in msgrcv" and several other
of the msq patches. What other patch needs to be reverted?

Anyway, I'll see if I can reproduce the issue, maybe I'm missing
something.

Thanks,
Davidlohr

>
> Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.
>
> I have tested the revert-patches with next-20130606 and next-20130621
> (see file-attachments).
>
> My build-script is attached!
>
> Can someone of the IPC folks look at that?
> Thanks!
>
> - Sedat -
>
>
> P.S.: Commit-IDs listed below.
>
> [ next-20130606 ]
>
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606
>
> "ipc: remove unused functions"
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79
>
> "ipc,msg: shorten critical region in msgrcv"
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed
>
> [ next-20130621 ]
>
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621
>
> "ipc: remove unused functions"
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3
>
> "ipc,msg: shorten critical region in msgrcv"
> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac

2013-06-21 22:54:14

by Sedat Dilek

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Sat, Jun 22, 2013 at 12:07 AM, Davidlohr Bueso
<[email protected]> wrote:
> On Fri, 2013-06-21 at 21:34 +0200, Sedat Dilek wrote:
>> On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
>> > Hi all,
>> >
>> > Happy solstice!
>> >
>> > Changes since 20130620:
>> >
>> > Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
>> >
>> > The net-next tree gained a conflict against the net tree.
>> >
>> > The leds tree still had its build failure, so I used the version from
>> > next-20130607.
>> >
>> > The arm-soc tree gained conflicts against the tip, net-next, mfd and
>> > mailbox trees.
>> >
>> > The staging tree still had its build failure for which I disabled some
>> > code.
>> >
>> > The akpm tree lost a few patches that turned up elsewhere and gained
>> > conflicts against the ftrace and arm-soc trees.
>> >
>> > ----------------------------------------------------------------------------
>> >
>>
>> [ CC IPC folks ]
>>
>> Building via 'make deb-pkg' with fakeroot fails here like this:
>>
>> make: *** [deb-pkg] Terminated
>> /usr/bin/fakeroot: line 181: 2386 Terminated
>> FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
>> "$@"
>> semop(1): encountered an error: Identifier removed
>> semop(2): encountered an error: Invalid argument
>> semop(1): encountered an error: Identifier removed
>> semop(1): encountered an error: Identifier removed
>> semop(1): encountered an error: Invalid argument
>> semop(1): encountered an error: Invalid argument
>> semop(1): encountered an error: Invalid argument
>>
>
> Hmmm those really shouldn't be related to the message queue changes. Are
> you sure you got the right bisect?
>
> Manfred has a few ipc/sem.c patches in linux-next, starting at commit
> c50df1b4 (ipc/sem.c: cacheline align the semaphore structures), does
> reverting any of those instead of "ipc,msg: shorten critical region in
> msgrcv" help at all? Also, anything reported in dmesg?
>

First, I reverted all IPC patches from akpm-tree within -next.
Then, I isolated the culprit by git-bisecting.
When I checked my logs, I did not see anything helpful.

>> The issue is present since next-20130606!
>>
>> LAST KNOWN GOOD: next-20130605
>> FIRST KNOWN BAD: next-20130606
>>
>> KNOWN GOOD: next-20130604
>> KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621
>>
>> git-bisect says CULPRIT commit is...
>>
>> "ipc,msg: shorten critical region in msgrcv"
>
> This I get. I went through the code again and it looks correct and
> functionally equivalent to the old msgrcv.
>

Hmm, I guess an rcu_read_unlock() is missing?

[ next-20130605 ]
...
        /* Lockless receive, part 3:
         * Acquire the queue spinlock.
         */
        ipc_lock_by_ptr(&msq->q_perm);
        rcu_read_unlock();
...
[ next-20130621 ]
...
        /* Lockless receive, part 3:
         * Acquire the queue spinlock.
         */
        ipc_lock_object(&msq->q_perm);
...

Whereas ipc_lock_by_ptr() is equivalent to:
        rcu_read_lock();
        ipc_lock_object();

>>
>> NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!
>
> This I don't get. Restoring msg_lock_[check] is already equivalent to
> reverting "ipc,msg: shorten critical region in msgrcv" and several other
> of the msq patches. What other patch needs reverted?
>

No, you have to revert both patches, because the other one removed
msg_lock_[check] afterwards.

> Anyway, I'll see if I can reproduce the issue, maybe I'm missing
> something.
>

Yup, I will try adding rcu_read_unlock() and report back.

- Sedat -

> Thanks,
> Davidlohr
>
>>
>> Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.
>>
>> I have tested the revert-patches with next-20130606 and next-20130621
>> (see file-attachments).
>>
>> My build-script is attached!
>>
>> Can someone of the IPC folks look at that?
>> Thanks!
>>
>> - Sedat -
>>
>>
>> P.S.: Commit-IDs listed below.
>>
>> [ next-20130606 ]
>>
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606
>>
>> "ipc: remove unused functions"
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79
>>
>> "ipc,msg: shorten critical region in msgrcv"
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed
>>
>> [ next-20130621 ]
>>
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621
>>
>> "ipc: remove unused functions"
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3
>>
>> "ipc,msg: shorten critical region in msgrcv"
>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac
>
>

2013-06-21 23:11:49

by Davidlohr Bueso

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Sat, 2013-06-22 at 00:54 +0200, Sedat Dilek wrote:
> On Sat, Jun 22, 2013 at 12:07 AM, Davidlohr Bueso
> <[email protected]> wrote:
> > On Fri, 2013-06-21 at 21:34 +0200, Sedat Dilek wrote:
> >> On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > Happy solstice!
> >> >
> >> > Changes since 20130620:
> >> >
> >> > Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
> >> >
> >> > The net-next tree gained a conflict against the net tree.
> >> >
> >> > The leds tree still had its build failure, so I used the version from
> >> > next-20130607.
> >> >
> >> > The arm-soc tree gained conflicts against the tip, net-next, mfd and
> >> > mailbox trees.
> >> >
> >> > The staging tree still had its build failure for which I disabled some
> >> > code.
> >> >
> >> > The akpm tree lost a few patches that turned up elsewhere and gained
> >> > conflicts against the ftrace and arm-soc trees.
> >> >
> >> > ----------------------------------------------------------------------------
> >> >
> >>
> >> [ CC IPC folks ]
> >>
> >> Building via 'make deb-pkg' with fakeroot fails here like this:
> >>
> >> make: *** [deb-pkg] Terminated
> >> /usr/bin/fakeroot: line 181: 2386 Terminated
> >> FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
> >> "$@"
> >> semop(1): encountered an error: Identifier removed
> >> semop(2): encountered an error: Invalid argument
> >> semop(1): encountered an error: Identifier removed
> >> semop(1): encountered an error: Identifier removed
> >> semop(1): encountered an error: Invalid argument
> >> semop(1): encountered an error: Invalid argument
> >> semop(1): encountered an error: Invalid argument
> >>
> >
> > Hmmm those really shouldn't be related to the message queue changes. Are
> > you sure you got the right bisect?
> >
> > Manfred has a few ipc/sem.c patches in linux-next, starting at commit
> > c50df1b4 (ipc/sem.c: cacheline align the semaphore structures), does
> > reverting any of those instead of "ipc,msg: shorten critical region in
> > msgrcv" help at all? Also, anything reported in dmesg?
> >
>
> First, I reverted all IPC patches from akpm-tree within -next.
> Then, I isolated the culprit by git-bisecting.
> As I checked my logs I did not see anything helpful.
>
> >> The issue is present since next-20130606!
> >>
> >> LAST KNOWN GOOD: next-20130605
> >> FIRST KNOWN BAD: next-20130606
> >>
> >> KNOWN GOOD: next-20130604
> >> KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621
> >>
> >> git-bisect says CULPRIT commit is...
> >>
> >> "ipc,msg: shorten critical region in msgrcv"
> >
> > This I get. I went through the code again and it looks correct and
> > functionally equivalent to the old msgrcv.
> >
>
> Hmm, I guess a rcu_read_unlock() is missing?
>
> [ next-20130605 ]
> ...
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> ipc_lock_by_ptr(&msq->q_perm);
> rcu_read_unlock();
> ...
> [ next-20130621 ]
> ...
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> ipc_lock_object(&msq->q_perm);
> ...
>
> Whereas ipc_lock_by_ptr() is equivalent to:
> rcu_read_lock();
> ipc_lock_object();

Yeah, I noticed that, but it's not an error. In the older code we have

rcu_read_lock (Lockless receive, part 1)
[...]
/* Lockless receive, part 3:
* Acquire the queue spinlock.
*/
ipc_lock_by_ptr(&msq->q_perm);
rcu_read_unlock();


Which translates to:
rcu_read_lock (Lockless receive, part 1)
[...]
/* Lockless receive, part 3:
* Acquire the queue spinlock.
*/
rcu_read_lock();
ipc_lock_object();
rcu_read_unlock();

And thus, after that last rcu_read_unlock we are left with
rcu_read_lock()
ipc_lock_object();

If you notice, that's exactly what is done in the new code, only much
more readable: we do rcu_read_lock() in part 1, then in part 3 we
acquire the spinlock via ipc_lock_object(&msq->q_perm).
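
The nesting argument above can be checked with a small userspace model. This is only a sketch and not kernel code: the model_* functions and struct lock_state are hypothetical stand-ins that count RCU read-side nesting and track one spinlock, and model_ipc_lock_by_ptr() assumes the expansion quoted earlier (rcu_read_lock() followed by ipc_lock_object()).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: tracks only nesting state, no real locking happens. */
struct lock_state {
    int  rcu_depth;  /* nesting depth of rcu_read_lock() sections */
    bool spin_held;  /* whether the queue spinlock is held */
};

static void model_rcu_read_lock(struct lock_state *s)
{
    s->rcu_depth++;
}

static void model_rcu_read_unlock(struct lock_state *s)
{
    assert(s->rcu_depth > 0);  /* an unlock needs a matching lock */
    s->rcu_depth--;
}

static void model_ipc_lock_object(struct lock_state *s)
{
    assert(!s->spin_held);     /* the spinlock is not recursive */
    s->spin_held = true;
}

/* Assumed expansion, as quoted in the thread. */
static void model_ipc_lock_by_ptr(struct lock_state *s)
{
    model_rcu_read_lock(s);
    model_ipc_lock_object(s);
}

/* Older sequence: part 1, then ipc_lock_by_ptr() + rcu_read_unlock(). */
struct lock_state old_msgrcv_path(void)
{
    struct lock_state s = { 0, false };

    model_rcu_read_lock(&s);    /* Lockless receive, part 1 */
    model_ipc_lock_by_ptr(&s);  /* Lockless receive, part 3 */
    model_rcu_read_unlock(&s);
    return s;
}

/* Newer sequence: part 1, then only ipc_lock_object(). */
struct lock_state new_msgrcv_path(void)
{
    struct lock_state s = { 0, false };

    model_rcu_read_lock(&s);    /* Lockless receive, part 1 */
    model_ipc_lock_object(&s);  /* Lockless receive, part 3 */
    return s;
}
```

In this model both paths end with one RCU read-side section open and the spinlock held, which is the equivalence claimed above. The model says nothing about what else differs between the two kernel versions.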


> >>
> >> NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!
> >
> > This I don't get. Restoring msg_lock_[check] is already equivalent to
> > reverting "ipc,msg: shorten critical region in msgrcv" and several other
> > of the msq patches. What other patch needs reverted?
> >
>
> No, you have to revert both patches as the other removed
> msg_lock_[check] afterwards.
>
> > Anyway, I'll see if I can reproduce the issue, maybe I'm missing
> > something.
> >
>
> Yupp, I try with adding rcu_read_unlock()... and report.
>
> - Sedat -
>
> > Thanks,
> > Davidlohr
> >
> >>
> >> Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.
> >>
> >> I have tested the revert-patches with next-20130606 and next-20130621
> >> (see file-attachments).
> >>
> >> My build-script is attached!
> >>
> >> Can someone of the IPC folks look at that?
> >> Thanks!
> >>
> >> - Sedat -
> >>
> >>
> >> P.S.: Commit-IDs listed below.
> >>
> >> [ next-20130606 ]
> >>
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606
> >>
> >> "ipc: remove unused functions"
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79
> >>
> >> "ipc,msg: shorten critical region in msgrcv"
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed
> >>
> >> [ next-20130621 ]
> >>
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621
> >>
> >> "ipc: remove unused functions"
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3
> >>
> >> "ipc,msg: shorten critical region in msgrcv"
> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac
> >
> >

2013-06-21 23:14:27

by Sedat Dilek

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Sat, Jun 22, 2013 at 1:11 AM, Davidlohr Bueso <[email protected]> wrote:
> On Sat, 2013-06-22 at 00:54 +0200, Sedat Dilek wrote:
>> On Sat, Jun 22, 2013 at 12:07 AM, Davidlohr Bueso
>> <[email protected]> wrote:
>> > On Fri, 2013-06-21 at 21:34 +0200, Sedat Dilek wrote:
>> >> On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
>> >> > Hi all,
>> >> >
>> >> > Happy solstice!
>> >> >
>> >> > Changes since 20130620:
>> >> >
>> >> > Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
>> >> >
>> >> > The net-next tree gained a conflict against the net tree.
>> >> >
>> >> > The leds tree still had its build failure, so I used the version from
>> >> > next-20130607.
>> >> >
>> >> > The arm-soc tree gained conflicts against the tip, net-next, mfd and
>> >> > mailbox trees.
>> >> >
>> >> > The staging tree still had its build failure for which I disabled some
>> >> > code.
>> >> >
>> >> > The akpm tree lost a few patches that turned up elsewhere and gained
>> >> > conflicts against the ftrace and arm-soc trees.
>> >> >
>> >> > ----------------------------------------------------------------------------
>> >> >
>> >>
>> >> [ CC IPC folks ]
>> >>
>> >> Building via 'make deb-pkg' with fakeroot fails here like this:
>> >>
>> >> make: *** [deb-pkg] Terminated
>> >> /usr/bin/fakeroot: line 181: 2386 Terminated
>> >> FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
>> >> "$@"
>> >> semop(1): encountered an error: Identifier removed
>> >> semop(2): encountered an error: Invalid argument
>> >> semop(1): encountered an error: Identifier removed
>> >> semop(1): encountered an error: Identifier removed
>> >> semop(1): encountered an error: Invalid argument
>> >> semop(1): encountered an error: Invalid argument
>> >> semop(1): encountered an error: Invalid argument
>> >>
>> >
>> > Hmmm those really shouldn't be related to the message queue changes. Are
>> > you sure you got the right bisect?
>> >
>> > Manfred has a few ipc/sem.c patches in linux-next, starting at commit
>> > c50df1b4 (ipc/sem.c: cacheline align the semaphore structures), does
>> > reverting any of those instead of "ipc,msg: shorten critical region in
>> > msgrcv" help at all? Also, anything reported in dmesg?
>> >
>>
>> First, I reverted all IPC patches from akpm-tree within -next.
>> Then, I isolated the culprit by git-bisecting.
>> As I checked my logs I did not see anything helpful.
>>
>> >> The issue is present since next-20130606!
>> >>
>> >> LAST KNOWN GOOD: next-20130605
>> >> FIRST KNOWN BAD: next-20130606
>> >>
>> >> KNOWN GOOD: next-20130604
>> >> KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621
>> >>
>> >> git-bisect says CULPRIT commit is...
>> >>
>> >> "ipc,msg: shorten critical region in msgrcv"
>> >
>> > This I get. I went through the code again and it looks correct and
>> > functionally equivalent to the old msgrcv.
>> >
>>
>> Hmm, I guess a rcu_read_unlock() is missing?
>>
>> [ next-20130605 ]
>> ...
>> /* Lockless receive, part 3:
>> * Acquire the queue spinlock.
>> */
>> ipc_lock_by_ptr(&msq->q_perm);
>> rcu_read_unlock();
>> ...
>> [ next-20130621 ]
>> ...
>> /* Lockless receive, part 3:
>> * Acquire the queue spinlock.
>> */
>> ipc_lock_object(&msq->q_perm);
>> ...
>>
>> Whereas ipc_lock_by_ptr() is equivalent to:
>> rcu_read_lock();
>> ipc_lock_object();
>
> Yeah, I noticed that, but it's not an error. In the older code we have
>
> rcu_read_lock (Lockless receive, part 1)
> [...]
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> ipc_lock_by_ptr(&msq->q_perm);
> rcu_read_unlock();
>
>
> Which translates to:
> rcu_read_lock (Lockless receive, part 1)
> [...]
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> rcu_read_lock();
> ipc_lock_object();
> rcu_read_unlock();
>
> And thus, after that last rcu_read_unlock we are left with
> rcu_read_lock()
> ipc_lock_object();
>
> If you notice, that's exactly what is done in the new code, only much
> more readable: We do rcu_read_lock in the part 1, then in part 3, we
> acquire the spinlock via ipc_lock_object(&msq->q_perm)
>

OK.

AFAICS some comments have to be refreshed.

/* Lockless receive, part 1:
* Disable preemption. We don't hold a reference to the queue
* and getting a reference would defeat the idea of a lockless
* operation, thus the code relies on rcu to guarantee the
* existence of msq:
* Prior to destruction, expunge_all(-EIRDM) changes r_msg.
* Thus if r_msg is -EAGAIN, then the queue not yet destroyed.
* rcu_read_lock() prevents preemption between reading r_msg
* and the spin_lock() inside ipc_lock_by_ptr().

...as there is no usage of ipc_lock_by_ptr().

NO success with that:

--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -983,6 +983,7 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
 	 * Acquire the queue spinlock.
 	 */
 	ipc_lock_object(&msq->q_perm);
+	rcu_read_unlock();
 
 	/* Lockless receive, part 4:
 	 * Repeat test after acquiring the spinlock.
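
One way to see why an extra rcu_read_unlock() is unlikely to help is to check the lock/unlock sequence for balance. The sketch below is a userspace toy, not kernel code. The enum, the function first_unbalanced_op() and both sequences are hypothetical, and the exit path (spinlock released, then one rcu_read_unlock()) is an assumption about the new code that should be verified against ipc/msg.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_op { OP_RCU_LOCK, OP_RCU_UNLOCK, OP_SPIN_LOCK, OP_SPIN_UNLOCK };

/* Returns the index of the first operation that breaks lock balance,
 * n if the sequence ends with something still held, or -1 if the
 * sequence is balanced and ends with nothing held. */
int first_unbalanced_op(const enum lock_op *ops, size_t n)
{
    int rcu_depth = 0;
    bool spin_held = false;

    assert(ops != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_RCU_LOCK:
            rcu_depth++;
            break;
        case OP_RCU_UNLOCK:
            if (rcu_depth == 0)
                return (int)i;   /* unlock without a matching lock */
            rcu_depth--;
            break;
        case OP_SPIN_LOCK:
            if (spin_held)
                return (int)i;   /* would self-deadlock */
            spin_held = true;
            break;
        case OP_SPIN_UNLOCK:
            if (!spin_held)
                return (int)i;   /* releasing a lock that is not held */
            spin_held = false;
            break;
        }
    }
    return (rcu_depth == 0 && !spin_held) ? -1 : (int)n;
}

/* New code as described in the thread, followed by the ASSUMED exit path. */
const enum lock_op new_path[] = {
    OP_RCU_LOCK, OP_SPIN_LOCK, OP_SPIN_UNLOCK, OP_RCU_UNLOCK
};

/* Same path with the extra rcu_read_unlock() from the test patch. */
const enum lock_op patched_path[] = {
    OP_RCU_LOCK, OP_SPIN_LOCK, OP_RCU_UNLOCK, OP_SPIN_UNLOCK, OP_RCU_UNLOCK
};
```

Under that assumption, first_unbalanced_op(new_path, 4) reports a balanced sequence, while the patched sequence trips on its final rcu_read_unlock(). That would be consistent with the patch not fixing anything, but it does not explain the fakeroot failure itself.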

- Sedat -

>
>> >>
>> >> NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!
>> >
>> > This I don't get. Restoring msg_lock_[check] is already equivalent to
>> > reverting "ipc,msg: shorten critical region in msgrcv" and several other
>> > of the msq patches. What other patch needs reverted?
>> >
>>
>> No, you have to revert both patches as the other removed
>> msg_lock_[check] afterwards.
>>
>> > Anyway, I'll see if I can reproduce the issue, maybe I'm missing
>> > something.
>> >
>>
>> Yupp, I try with adding rcu_read_unlock()... and report.
>>
>> - Sedat -
>>
>> > Thanks,
>> > Davidlohr
>> >
>> >>
>> >> Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.
>> >>
>> >> I have tested the revert-patches with next-20130606 and next-20130621
>> >> (see file-attachments).
>> >>
>> >> My build-script is attached!
>> >>
>> >> Can someone of the IPC folks look at that?
>> >> Thanks!
>> >>
>> >> - Sedat -
>> >>
>> >>
>> >> P.S.: Commit-IDs listed below.
>> >>
>> >> [ next-20130606 ]
>> >>
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606
>> >>
>> >> "ipc: remove unused functions"
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79
>> >>
>> >> "ipc,msg: shorten critical region in msgrcv"
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed
>> >>
>> >> [ next-20130621 ]
>> >>
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621
>> >>
>> >> "ipc: remove unused functions"
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3
>> >>
>> >> "ipc,msg: shorten critical region in msgrcv"
>> >> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac
>> >
>> >
>
>

2013-06-21 23:15:10

by Sedat Dilek

Subject: Re: linux-next: Tree for Jun 21 [ BROKEN ipc/ipc-msg ]

On Sat, Jun 22, 2013 at 12:54 AM, Sedat Dilek <[email protected]> wrote:
> On Sat, Jun 22, 2013 at 12:07 AM, Davidlohr Bueso
> <[email protected]> wrote:
>> On Fri, 2013-06-21 at 21:34 +0200, Sedat Dilek wrote:
>>> On Fri, Jun 21, 2013 at 10:17 AM, Stephen Rothwell <[email protected]> wrote:
>>> > Hi all,
>>> >
>>> > Happy solstice!
>>> >
>>> > Changes since 20130620:
>>> >
>>> > Dropped tree: mailbox (really bad merge conflicts with the arm-soc tree)
>>> >
>>> > The net-next tree gained a conflict against the net tree.
>>> >
>>> > The leds tree still had its build failure, so I used the version from
>>> > next-20130607.
>>> >
>>> > The arm-soc tree gained conflicts against the tip, net-next, mfd and
>>> > mailbox trees.
>>> >
>>> > The staging tree still had its build failure for which I disabled some
>>> > code.
>>> >
>>> > The akpm tree lost a few patches that turned up elsewhere and gained
>>> > conflicts against the ftrace and arm-soc trees.
>>> >
>>> > ----------------------------------------------------------------------------
>>> >
>>>
>>> [ CC IPC folks ]
>>>
>>> Building via 'make deb-pkg' with fakeroot fails here like this:
>>>
>>> make: *** [deb-pkg] Terminated
>>> /usr/bin/fakeroot: line 181: 2386 Terminated
>>> FAKEROOTKEY=$FAKEROOTKEY LD_LIBRARY_PATH="$PATHS" LD_PRELOAD="$LIB"
>>> "$@"
>>> semop(1): encountered an error: Identifier removed
>>> semop(2): encountered an error: Invalid argument
>>> semop(1): encountered an error: Identifier removed
>>> semop(1): encountered an error: Identifier removed
>>> semop(1): encountered an error: Invalid argument
>>> semop(1): encountered an error: Invalid argument
>>> semop(1): encountered an error: Invalid argument
>>>
>>
>> Hmmm those really shouldn't be related to the message queue changes. Are
>> you sure you got the right bisect?
>>
>> Manfred has a few ipc/sem.c patches in linux-next, starting at commit
>> c50df1b4 (ipc/sem.c: cacheline align the semaphore structures), does
>> reverting any of those instead of "ipc,msg: shorten critical region in
>> msgrcv" help at all? Also, anything reported in dmesg?
>>
>
> First, I reverted all IPC patches from akpm-tree within -next.
> Then, I isolated the culprit by git-bisecting.
> As I checked my logs I did not see anything helpful.
>
>>> The issue is present since next-20130606!
>>>
>>> LAST KNOWN GOOD: next-20130605
>>> FIRST KNOWN BAD: next-20130606
>>>
>>> KNOWN GOOD: next-20130604
>>> KNOWN BAD: next-20130607 || next-20130619 || next-20130620 || next-20130621
>>>
>>> git-bisect says CULPRIT commit is...
>>>
>>> "ipc,msg: shorten critical region in msgrcv"
>>
>> This I get. I went through the code again and it looks correct and
>> functionally equivalent to the old msgrcv.
>>
>
> Hmm, I guess a rcu_read_unlock() is missing?
>
> [ next-20130605 ]
> ...
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> ipc_lock_by_ptr(&msq->q_perm);
> rcu_read_unlock();
> ...
> [ next-20130621 ]
> ...
> /* Lockless receive, part 3:
> * Acquire the queue spinlock.
> */
> ipc_lock_object(&msq->q_perm);
> ...
>
> Whereas ipc_lock_by_ptr() is equivalent to:
> rcu_read_lock();
> ipc_lock_object();
>
>>>
>>> NOTE: msg_lock_(check_) routines have to be restored (one more revert needed)!
>>
>> This I don't get. Restoring msg_lock_[check] is already equivalent to
>> reverting "ipc,msg: shorten critical region in msgrcv" and several other
>> of the msq patches. What other patch needs reverted?
>>
>
> No, you have to revert both patches as the other removed
> msg_lock_[check] afterwards.
>
>> Anyway, I'll see if I can reproduce the issue, maybe I'm missing
>> something.
>>
>
> Yupp, I try with adding rcu_read_unlock()... and report.
>
> - Sedat -
>
>> Thanks,
>> Davidlohr
>>
>>>
>>> Reverting both commits (below) makes the fakeroot build via 'make deb-pkg' work again.
>>>
>>> I have tested the revert-patches with next-20130606 and next-20130621
>>> (see file-attachments).
>>>
>>> My build-script is attached!
>>>
>>> Can someone of the IPC folks look at that?
>>> Thanks!
>>>
>>> - Sedat -
>>>
>>>
>>> P.S.: Commit-IDs listed below.
>>>
>>> [ next-20130606 ]
>>>
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130606
>>>
>>> "ipc: remove unused functions"
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=8793fdfb0d0a6ed5916767e29a15d3eb56e04e79
>>>
>>> "ipc,msg: shorten critical region in msgrcv"
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=c0ff93322847a54f74a5450032c4df64c17fdaed
>>>
>>> [ next-20130621 ]
>>>
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/log/?id=next-20130621
>>>
>>> "ipc: remove unused functions"
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=941ce57c81dcceadf55265616ee1e8bef18b0ad3
>>>
>>> "ipc,msg: shorten critical region in msgrcv"
>>> http://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=62190df4081ee8504e3611d45edb40450cb408ac
>>
>>