Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932462Ab3ICHQx (ORCPT ); Tue, 3 Sep 2013 03:16:53 -0400
Received: from mail-wi0-f182.google.com ([209.85.212.182]:65114 "EHLO
	mail-wi0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1759669Ab3ICHQu (ORCPT ); Tue, 3 Sep 2013 03:16:50 -0400
MIME-Version: 1.0
Reply-To: sedat.dilek@gmail.com
In-Reply-To: <5224BCF6.2080401@colorfullife.com>
References: <1372192414.1888.8.camel@buesod1.americas.hpqcorp.net>
	<1372202983.1888.22.camel@buesod1.americas.hpqcorp.net>
	<521DE5D7.4040305@synopsys.com>
	<52205597.3090609@synopsys.com>
	<5224BCF6.2080401@colorfullife.com>
Date: Tue, 3 Sep 2013 09:16:48 +0200
Message-ID:
Subject: Re: ipc-msg broken again on 3.11-rc7?
From: Sedat Dilek
To: Manfred Spraul
Cc: Vineet Gupta, Linus Torvalds, Davidlohr Bueso, Davidlohr Bueso,
	linux-next, LKML, Stephen Rothwell, Andrew Morton, linux-mm,
	Andi Kleen, Rik van Riel, Jonathan Gonzalez
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3441
Lines: 96

On Mon, Sep 2, 2013 at 6:29 PM, Manfred Spraul wrote:
> Hi,
>
> [forgot to cc everyone, thus I'll summarize some mails...]
>
> On 09/02/2013 06:58 AM, Vineet Gupta wrote:
>>
>> On 08/31/2013 11:20 PM, Linus Torvalds wrote:
>>>
>>> Vineet, actual patch for what Davidlohr suggests attached. Can you try
>>> it?
>>>
>>> Linus
>>
>> Apologies for late in getting back to this - I was away from my computer
>> for a bit.
>>
>> Unfortunately, with a quick test, this patch doesn't help.
>> FWIW, this is latest mainline (.config attached).
>>
>> Let me know what diagnostics I can add to help with this.
>
>
> msgctl08 is a bulk message send/receive test. I had to look at it once
> before, then it was a broken hardware:
> https://lkml.org/lkml/2008/6/12/365
> This can be ruled out, because it works with 3.10.
>
> msgctl08 uses pairs of threads: one thread does msgsnd(), the other one
> msgrcv().
> There is no synchronization, i.e. the msgsnd() can race ahead until the
> kernel buffer is full and then a block with msgrcv() follows or it could be
> pairs of alternating msgsnd()/msgrcv() operations.
> No special features are used: each pair of threads has its own message
> queues, all messages have type=1.
>
> Vineet ran strace - and just before the signal from killing msgctl08, there
> are only msgsnd()/msgrcv() calls.
> Vineet:
> a) could you run strace tomorrow again, with '-ttt' as an additional option?
> I don't see where exactly it hangs.
> b) Could you check that it is not just a performance regression?
> Does ./msgctl08 1000 16 hang, too?
>
> In ipc/msg.c, I haven't seen any obvious reason why it should hang.
> The only race I spotted so far is this one:
>>
>>         for (;;) {
>>                 struct msg_sender s;
>>
>>                 err = -EACCES;
>>                 if (ipcperms(ns, &msq->q_perm, S_IWUGO))
>>                         goto out_unlock1;
>>
>>
>>                 err = security_msg_queue_msgsnd(msq, msg, msgflg);
>>                 if (err)
>>                         goto out_unlock1;
>>
>>                 if (msgsz + msq->q_cbytes <= msq->q_qbytes &&
>>                                 1 + msq->q_qnum <= msq->q_qbytes) {
>>                         break;
>>                 }
>>
> [snip]
>>
>>         if (!pipelined_send(msq, msg)) {
>>                 /* no one is waiting for this message, enqueue it */
>>                 list_add_tail(&msg->m_list, &msq->q_messages);
>>                 msq->q_cbytes += msgsz;
>>                 msq->q_qnum++;
>>                 atomic_add(msgsz, &ns->msg_bytes);
>
>
> The access to msq->q_cbytes is not protected. Thus two parallel msgsnd()
> calls could succeed, even if both together bring the queue length above the
> limit.
> But it can't explain why 3.11-rc7 hangs: As explained above, msgctl08 uses
> one queue for each thread pair.
>

Just FYI: The Linux Test Project (LTP) will do a new release in the first
week of September. Some IPC test suites were reworked.
Manfred, can you look at them ("...msgctl08 uses one queue for each thread
pair.")?
( It might be worth raising this on the LTP mailing list (that the test
case is not ideal, etc.)? )

- Sedat -
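
For anyone who wants to poke at this outside of LTP, here is a rough
userspace sketch of the pattern Manfred describes in the quoted text: one
sender and one receiver per private queue, no synchronization between them,
all messages of type 1. This is not the msgctl08 source. The function name
run_pair(), the 64-byte payload and the use of fork() for the pair are my
own assumptions for illustration only.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define PAYLOAD_SIZE 64	/* hypothetical size, not taken from msgctl08 */

struct pair_msg {
	long mtype;			/* always 1, as in the description */
	char mtext[PAYLOAD_SIZE];
};

/*
 * Hypothetical helper: run one sender/receiver pair on its own queue.
 * Returns the number of messages received intact, or -1 on error.
 */
int run_pair(int count)
{
	struct pair_msg m;
	int received = 0;
	int status = 0;
	int qid;
	pid_t pid;

	assert(count >= 0);

	qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
	if (qid < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		msgctl(qid, IPC_RMID, NULL);
		return -1;
	}

	if (pid == 0) {
		/* Sender: runs ahead freely; msgsnd() blocks once the queue is full. */
		m.mtype = 1;
		for (int i = 0; i < count; i++) {
			memset(m.mtext, 'a' + i % 26, sizeof(m.mtext));
			if (msgsnd(qid, &m, sizeof(m.mtext), 0) < 0)
				_exit(1);
		}
		_exit(0);
	}

	/* Receiver: blocking msgrcv() for type 1, checks payload and order. */
	for (int i = 0; i < count; i++) {
		ssize_t n = msgrcv(qid, &m, sizeof(m.mtext), 1, 0);

		if (n != (ssize_t)sizeof(m.mtext))
			break;
		if (m.mtext[0] == 'a' + i % 26 &&
		    m.mtext[PAYLOAD_SIZE - 1] == 'a' + i % 26)
			received++;
	}

	/* Remove the queue first so a still-blocked sender fails with EIDRM. */
	msgctl(qid, IPC_RMID, NULL);
	waitpid(pid, &status, 0);
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return received;
}
```

Calling run_pair(1000) from a small main() should push more data than a
default-sized queue holds, so the sender has to block at some point. A call
that never returns would look like the hang reported here, but this sketch
has not been run against 3.11-rc7.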