From: Yongji Xie
To: peterz@infradead.org, mingo@redhat.com, will.deacon@arm.com
Cc: linux-kernel@vger.kernel.org, xieyongji@baidu.com,
    zhangyu31@baidu.com, liuqi16@baidu.com, yuanlinsi01@baidu.com,
    nixun@baidu.com, lilin24@baidu.com
Subject: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil
Date: Thu, 29 Nov 2018 20:50:30 +0800
Message-Id: <1543495830-2644-1-git-send-email-xieyongji@baidu.com>

From: Xie Yongji

Our system recently hit a problem: khungtaskd detected some processes
hung on mmap_sem. The odd thing was that one task, which was not on
mmap_sem.wait_list, was still sleeping in rwsem_down_read_failed().
Through code inspection, we found a potential bug that can lead to
this.

Imagine this:

Thread 1                                  Thread 2
                                          down_write();
rwsem_down_read_failed()
 raw_spin_lock_irq(&sem->wait_lock);
 list_add_tail(&waiter.list, &wait_list);
 raw_spin_unlock_irq(&sem->wait_lock);
                                          __up_write();
                                           rwsem_wake();
                                            __rwsem_mark_wake();
                                             wake_q_add();
                                             list_del(&waiter->list);
                                             waiter->task = NULL;
 while (true) {
  set_current_state(TASK_UNINTERRUPTIBLE);
  if (!waiter.task) // true
      break;
 }
 __set_current_state(TASK_RUNNING);

Now Thread 1 is queued in Thread 2's wake_q without having slept. Then
Thread 1 calls rwsem_down_read_failed() again because Thread 3 holds
the lock. If Thread 3 tries to queue Thread 1 before Thread 2 does its
wakeup, the queueing fails and the wakeup is missed:

Thread 1                                  Thread 2      Thread 3
                                                        down_write();
rwsem_down_read_failed()
 raw_spin_lock_irq(&sem->wait_lock);
 list_add_tail(&waiter.list, &wait_list);
 raw_spin_unlock_irq(&sem->wait_lock);
                                                        __rwsem_mark_wake();
                                                         wake_q_add();
                                          wake_up_q();
                                                         waiter->task = NULL;
 while (true) {
  set_current_state(TASK_UNINTERRUPTIBLE);
  if (!waiter.task) // false
      break;
  schedule();
 }
                                                        wake_up_q(&wake_q);

In other words, we might issue the wakeup before setting the reader
waiter to nil. If so, the wakeup may do nothing when it is issued
before the reader sets its task state to TASK_UNINTERRUPTIBLE. After
that we have no chance to wake up the reader any more, which leaves
other writers, such as the "ps" command, stuck behind it.

This patch is not verified because we still have no way to reproduce
the problem. But I'd like to ask the community for comments first.

Signed-off-by: Xie Yongji
Signed-off-by: Zhang Yu
---
 kernel/locking/rwsem-xadd.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 09b1800..50d9af6 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -198,15 +198,22 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 		woken++;
 		tsk = waiter->task;
 
-		wake_q_add(wake_q, tsk);
+		get_task_struct(tsk);
 		list_del(&waiter->list);
 		/*
-		 * Ensure that the last operation is setting the reader
+		 * Ensure calling get_task_struct() before setting the reader
 		 * waiter to nil such that rwsem_down_read_failed() cannot
 		 * race with do_exit() by always holding a reference count
 		 * to the task to wakeup.
 		 */
 		smp_store_release(&waiter->task, NULL);
+		/*
+		 * Ensure issuing the wakeup (either by us or someone else)
+		 * after setting the reader waiter to nil.
+		 */
+		wake_q_add(wake_q, tsk);
+		/* wake_q_add() already take the task ref */
+		put_task_struct(tsk);
 	}
 
 	adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment;
-- 
2.2.3
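
For illustration only, the race and the reordering can be replayed in a
small single-threaded userspace model. This is a sketch under stated
assumptions, not kernel code: struct task, model_wake_q_add(),
model_wake_up_q() and reader_stuck() are hypothetical names, the
"sleeping" flag stands in for TASK_UNINTERRUPTIBLE plus schedule(), and
each branch scripts one possible interleaving (the one in which the
reader's check of waiter.task sees a non-NULL value, as in the second
diagram). The only property borrowed from the kernel's wake_q is that
adding a task which is already on some wake_q is a no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of a wake_q node; non-NULL next means "already queued". */
struct wake_q_node { struct wake_q_node *next; };

struct task {
	struct wake_q_node wake_q;
	bool sleeping;		/* models TASK_UNINTERRUPTIBLE + schedule() */
};

struct wake_q_head {
	struct wake_q_node *first;
	struct wake_q_node **lastp;
};

/* Sentinel marking the end of a queue (hypothetical stand-in). */
static struct wake_q_node wake_q_tail;
#define WAKE_Q_TAIL (&wake_q_tail)

static void wake_q_init(struct wake_q_head *h)
{
	h->first = WAKE_Q_TAIL;
	h->lastp = &h->first;
}

/* Queue a task; returns false (no-op) if it is already on some wake_q. */
static bool model_wake_q_add(struct wake_q_head *h, struct task *t)
{
	if (t->wake_q.next != NULL)
		return false;
	t->wake_q.next = WAKE_Q_TAIL;
	*h->lastp = &t->wake_q;
	h->lastp = &t->wake_q.next;
	return true;
}

/* Wake every queued task; waking a task that is not sleeping does nothing. */
static void model_wake_up_q(struct wake_q_head *h)
{
	struct wake_q_node *node = h->first;

	while (node != WAKE_Q_TAIL) {
		struct task *t = (struct task *)
			((char *)node - offsetof(struct task, wake_q));

		node = node->next;
		t->wake_q.next = NULL;
		t->sleeping = false;
	}
	wake_q_init(h);
}

/* Returns true if the reader ends up asleep with nobody left to wake it. */
bool reader_stuck(bool patched_order)
{
	struct task reader = { { NULL }, false };
	struct wake_q_head q2, q3;
	struct task *waiter_task;

	wake_q_init(&q2);
	wake_q_init(&q3);

	/* Round 1: Thread 2 queues the reader, which returns without sleeping. */
	model_wake_q_add(&q2, &reader);

	/* Round 2: the reader blocks again behind Thread 3's write lock. */
	waiter_task = &reader;

	if (!patched_order) {
		model_wake_q_add(&q3, &reader);	/* Thread 3: no-op, still on q2 */
		model_wake_up_q(&q2);		/* Thread 2: reader not asleep yet */
		reader.sleeping = (waiter_task != NULL); /* reader: check, sleep */
		waiter_task = NULL;		/* Thread 3: store comes too late */
	} else {
		reader.sleeping = (waiter_task != NULL); /* reader: check, sleep */
		waiter_task = NULL;		/* Thread 3: store first ... */
		model_wake_q_add(&q3, &reader);	/* ... then queue: still a no-op */
		model_wake_up_q(&q2);		/* Thread 2's wakeup follows the store */
	}
	model_wake_up_q(&q3);			/* Thread 3: its own queue is empty */

	return reader.sleeping;
}
```

Calling reader_stuck(false) replays the missed wakeup, and
reader_stuck(true) replays the same situation with wake_q_add() moved
after the store to waiter->task. In the patched order a failed
wake_q_add() implies the task is still on another wake_q, so whoever
owns that queue issues its wakeup after the store.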