Received: by 10.192.165.148 with SMTP id m20csp3896084imm; Mon, 23 Apr 2018 14:33:51 -0700 (PDT) X-Google-Smtp-Source: AIpwx49VoMX77KIe+GIGPnRNSAFG+5Sn/6b1Re6+XrCXSWGkdw7HsO0eahqt6dkiVVx0p5xihSWV X-Received: by 10.101.91.7 with SMTP id y7mr18011078pgq.396.1524519231718; Mon, 23 Apr 2018 14:33:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1524519231; cv=none; d=google.com; s=arc-20160816; b=oDBpk5SbR1llkKGEMNeAi+pbWoG+Ope6cUYJJvbXYg/zJGtBbNTb5e3MqXub2mEOzL aywgINj1Bn9yyMTqUBbRHhLjDLhldlb7f71/FbERUX1IBFdJE/O2mL1Hy5pyrtzp35Df qbqjr6Dk5ij5jZ78BMgmqLfN8yGIxl4VbroTivwjhAHx9LSri8MikDf98O8nHXC/6MHv ndowwgX9MkSPv0CQnjxTNAa0dYUmNCmCjtCPqijK7yrzBuxZHBEIOPAHWhrFuwxN3cDD Y9/AFqBpsORnRP3p2fxAU7OMrvaeTt1sy8L4PEGQh00Na0wmC6OeQ7dmT1hruAJ5DxCL volQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject :arc-authentication-results; bh=DudK83TbZEcVY4BdlchUEHZlrrBTb1MjUmgkSiqplvw=; b=EmMFr4/ggJhmg73vVAS1YX93fT60X0MOhOZeEt+OT9LCJNph7hQD8FYkpvRfB7Iorj z3mCT4VH0xZfxOU+cxxQvmpOqtClL62z/J3jSg+jcahsjrIMPGUFAQocFobHJTFBOA8C kjBL8Uc9LcUtQTnlK3h5/ckHqAdjEua6bJ4qjigse0cm30I+hkEWhhMXoYuXozLzSJRM RHMe8VhckEPpgg/idBYKcJgLM9K0VvkH4Qofrrngplmmrjzr8Lm57JqqyRChqly2F7W8 /+i5opc3uuuc47wAoFanRVstiyo1iib6qWjpshfskc95s/FSnBxMA3IYKdDKVl+FMCc9 03hg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f91-v6si12542860plf.23.2018.04.23.14.33.37; Mon, 23 Apr 2018 14:33:51 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932463AbeDWVa7 convert rfc822-to-8bit (ORCPT + 99 others); Mon, 23 Apr 2018 17:30:59 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:40526 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932368AbeDWVau (ORCPT ); Mon, 23 Apr 2018 17:30:50 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4FB66401DEAC; Mon, 23 Apr 2018 21:30:50 +0000 (UTC) Received: from llong.remote.csb (ovpn-120-209.rdu2.redhat.com [10.10.120.209]) by smtp.corp.redhat.com (Postfix) with ESMTP id 02F912026DFD; Mon, 23 Apr 2018 21:30:48 +0000 (UTC) Subject: Re: [PATCH] locking/rwsem: Synchronize task state & waiter->task of readers To: Andrea Parri Cc: Peter Zijlstra , Ingo Molnar , linux-kernel@vger.kernel.org, Dave Chinner , Eric Sandeen , "Paul E. McKenney" References: <1523380938-19462-1-git-send-email-longman@redhat.com> <6c112ecb-d662-b1fc-152a-32060ec46dae@redhat.com> <20180423205514.GA5876@andrea> From: Waiman Long Organization: Red Hat Message-ID: <4661980a-ad4a-1578-0b8f-5b6788912582@redhat.com> Date: Mon, 23 Apr 2018 17:30:49 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20180423205514.GA5876@andrea> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Content-Language: en-US X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Mon, 23 Apr 2018 21:30:50 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Mon, 23 Apr 2018 21:30:50 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'longman@redhat.com' RCPT:'' Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/23/2018 04:55 PM, Andrea Parri wrote: > Hi Waiman, > > On Mon, Apr 23, 2018 at 12:46:12PM -0400, Waiman Long wrote: >> On 04/10/2018 01:22 PM, Waiman Long wrote: >>> It was observed occasionally in PowerPC systems that there was reader >>> who had not been woken up but that its waiter->task had been cleared. > Can you provide more details about these observations? (links to LKML > posts, traces, applications used/micro-benchmarks, ...) They were customers reported problems on the RHEL7 kernel. Actually, we are not able to reproduce it in-house. From the symptom, it does look like a memory synchronization issue which I believe is also there in the upstream code. >>> One probable cause of this missed wakeup may be the fact that the >>> waiter->task and the task state have not been properly synchronized as >>> the lock release-acquire pair of different locks in the wakeup code path >>> does not provide a full memory barrier guarantee. > I guess that by the "pair of different locks" you mean (sem->wait_lock, > p->pi_lock), right? BTW, __rwsem_down_write_failed_common() is calling > wake_up_q() _before_ releasing the wait_lock: did you intend to exclude > this callsite? (why?) Yes, I am talking about sem->wait_lock and p->pi_lock. Right, there is an alternative reader wakeup path in __rwsem_down_write_failed_common() iin addition to unlock wakeup. I didn't purposely exclude that, but I think it has similar issue. I will clarify that in the next version. > >> So smp_store_mb() >>> is now used to set waiter->task to NULL to provide a proper memory >>> barrier for synchronization. > Mmh; the patch is not introducing an smp_store_mb()... My guess is that > you are thinking at the sequence: > > smp_store_release(&waiter->task, NULL); > [...] > smp_mb(); /* added with your patch */ > > or what am I missing? I actually thought about doing that at the beginning. The reasons why I changed my mind were: 1) If there are multiple readers ready to be woken up, you will have to issue multiple smp_mb instead of just 1 here. 2) smp_store_mb() is implemented with a smp_mb after the store. So it may not be able to ensure that setting the reader waiter to nil is the last operation seen by other CPUs as noted in the comment above smp_store_release(). For these 2 reasons, I decide to add an extra smp_mb at the end. Cheers, Longman