Received: by 2002:a25:b323:0:0:0:0:0 with SMTP id l35csp1785660ybj; Sun, 22 Sep 2019 11:53:45 -0700 (PDT) X-Google-Smtp-Source: APXvYqxF43TQ89TWyJLZixQcTvjEi644ASHPz2me9AsG4VgnUuhfKAv911PcvdkDagvF59e3qv/Q X-Received: by 2002:a05:6402:1426:: with SMTP id c6mr33976468edx.53.1569178425564; Sun, 22 Sep 2019 11:53:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569178425; cv=none; d=google.com; s=arc-20160816; b=DhpKkhWrFOOv3w/fKPQ3K3cY1t1326YlAjMTnLTK2/PGz3mOBwtya4+MS7hZpEboP1 JQkM4DXPXMsu99OYZC77/V5RWFPPOEstNFomvAHmKfwJN4sNzL+tK+QSrILMHuTCJ+Fn +PsEq/xrEi0VbvqWxp5x2iOLFaABSNy08d4h8+9J04HVjRoylE70vGHl7GoVRlGBIRAe AwJ0QFyFgYWpIEz0zmIe/an2sBbq8svJOIimWhC6BXaEKqaL6uyCR+wfXoVwZWDZ+p2z OiFqa8LFsiPCVyv6ab7Akklrn1cO27Wq8vQYjVhGkP1FrUN/3eLPxtEUEgdlxufksaof du8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from; bh=RBF07pTHDzJv8voLfGMmP4umHo0l7s/ElFpxVNGXGG4=; b=xZvwaFZI6Rw1PUzfvCSmDA1OYiaW/1bXylAVUT88sd73VC4dWc46ubQoU6UllAChXu /FU4Ciu71oParMk/2HDtwjTFJE9FYlDAXk0sX7tve812sgxdzjhAiqypzDwOkB4dnUdo Tt+TcH+586BkgIb3O5gp/zbP2wA52PH0lq1wkZKXgRANWFOszZYOtiOKngxfMeQxt38M 2VbTF2FJkpttw13VQeyJ4uj+STGt/3a0DHHpf7YZhlk8nuijMotFltGNDQRdkg1ZKiTd 7+sEsaCyLLZduI8LSSAeei+uMRGnUUXZq/r4LpFEGP0MPwCG44EMG4v9/VFNzUBs1eDI 3DIA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d23si5231298edb.55.2019.09.22.11.53.21; Sun, 22 Sep 2019 11:53:45 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404711AbfITPyY (ORCPT + 99 others); Fri, 20 Sep 2019 11:54:24 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44080 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404245AbfITPyY (ORCPT ); Fri, 20 Sep 2019 11:54:24 -0400 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F302B8A1C94; Fri, 20 Sep 2019 15:54:23 +0000 (UTC) Received: from llong.com (ovpn-122-210.rdu2.redhat.com [10.10.122.210]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4860160C18; Fri, 20 Sep 2019 15:54:20 +0000 (UTC) From: Waiman Long To: Andrew Morton Cc: linux-kernel@vger.kernel.org, Arnd Bergmann , "Gustavo A. R. Silva" , Catalin Marinas , Mathieu Malaterre , Peter Zijlstra , Davidlohr Bueso , Waiman Long Subject: [PATCH] ipc/sem: Fix race between to-be-woken task and waker Date: Fri, 20 Sep 2019 11:54:02 -0400 Message-Id: <20190920155402.28996-1-longman@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.69]); Fri, 20 Sep 2019 15:54:24 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org While looking at a customr bug report about potential missed wakeup in the system V semaphore code, I spot a potential problem. The fact that semaphore waiter stays in TASK_RUNNING state while checking queue status may lead to missed wakeup if a spurious wakeup happens in the right moment as try_to_wake_up() will do nothing if the task state isn't right. To eliminate this possibility, the task state is now reset to TASK_INTERRUPTIBLE immediately after wakeup before checking the queue status. This should eliminate the race condition on the interaction between the queue status and the task state and fix the potential missed wakeup problem. Signed-off-by: Waiman Long --- ipc/sem.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/ipc/sem.c b/ipc/sem.c index 7da4504bcc7c..1bcd424be047 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -2146,11 +2146,11 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops, sma->complex_count++; } + __set_current_state(TASK_INTERRUPTIBLE); do { WRITE_ONCE(queue.status, -EINTR); queue.sleeper = current; - __set_current_state(TASK_INTERRUPTIBLE); sem_unlock(sma, locknum); rcu_read_unlock(); @@ -2159,6 +2159,24 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops, else schedule(); + /* + * A spurious wakeup at the right moment can cause race + * between the to-be-woken task and the waker leading to + * missed wakeup. Setting state back to TASK_INTERRUPTIBLE + * before checking queue.status will ensure that the race + * won't happen. + * + * CPU0 CPU1 + * + * wake_up_sem_queue_prepare(): + * state = TASK_INTERRUPTIBLE status = error + * try_to_wake_up(): + * smp_mb() smp_mb() + * if (status == -EINTR) if (!(p->state & state)) + * schedule() goto out + */ + set_current_state(TASK_INTERRUPTIBLE); + /* * fastpath: the semop has completed, either successfully or * not, from the syscall pov, is quite irrelevant to us at this @@ -2210,6 +2228,7 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops, sem_unlock(sma, locknum); rcu_read_unlock(); out_free: + __set_current_state(TASK_RUNNING); if (sops != fast_sops) kvfree(sops); return error; -- 2.18.1