Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp1541502ybh; Thu, 23 Jul 2020 11:24:05 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxe5Sh31PUdapp9LfITDepyDE2GES/b7gMo7ljS8BX1iNxFfRvulIj4iBsOv+kqedjMFi3q X-Received: by 2002:a17:906:240d:: with SMTP id z13mr5549843eja.346.1595528645057; Thu, 23 Jul 2020 11:24:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595528645; cv=none; d=google.com; s=arc-20160816; b=WAkIss14DWyWLjYiUtlF+UGkaN0AVhEm6SqRy+mYirGJM09V01P+3QEyqrgkMLb9vS uCnvgwXF7QJs7Zqy+urLCkeJF57uTReHfJ7/dsWliqnKrGDL9JiQrigUk+nesvpY1bl+ nGSrr5lPZMFk7iC/3m5+64kRzCi3i/Wd4eOK/7KSz5R+J1tXJMTG3xTUf7q9R6GMA55F ryJsdLNwe8+c9rcr9h6pHbYesQvBafJXAmOr77uvHHbwaU6LEGwjjXZi2DAH6YK2WsWg YRIDUcQtakup7FUDe1Po3OylkJC9ncNXNhMwLVQpR+02cj1lCB+hubWkbInr4ApnNGiJ YfFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=8jbqAu4MTtKZYfsDRzg9oxuVgmZXrDgAOWgETbdmt3I=; b=AKmbOxhy/bIcS76z5diswm9jXxr/Eh0U0ugBOfi+SMUoZx4RA71KLDLastM4Pixg5l pBq0RqohC/59DrMZ7WfIE+g5ipUb0jmSL7RmITEd6c39HsMgFe9te9+d56nq0j25oBpz 0ydFSNMYbijLWoST1J/E8C9vKFghBeRbGLKbCFsk8ZAV8q6O2ESyaplJ6lBEAg5qxCX3 zvxq711CCwjziQgHOZEDnrvtvKjRrypAAyCP6ms5Uj2UDHb/UfcrzjEpsTOfG61Pppfq O42PBcMeaccHMyxYvMmixJGsHDEMpo/yfIvP1l+CVfvrinSPmHQBvjweL9W102R+GmTR Mn5A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=TpC1L9ye; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c12si2929073edx.210.2020.07.23.11.23.42; Thu, 23 Jul 2020 11:24:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linux-foundation.org header.s=google header.b=TpC1L9ye; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727774AbgGWSXG (ORCPT + 99 others); Thu, 23 Jul 2020 14:23:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726304AbgGWSXF (ORCPT ); Thu, 23 Jul 2020 14:23:05 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 949B6C0619DC for ; Thu, 23 Jul 2020 11:23:05 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id j11so7376381ljo.7 for ; Thu, 23 Jul 2020 11:23:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=8jbqAu4MTtKZYfsDRzg9oxuVgmZXrDgAOWgETbdmt3I=; b=TpC1L9yeqU8VAOIGIaabNRD2URVdeVJkJUvan/PdnVoywn5ewbqyDjsAOR2s/j/716 6qzVRkeThyuPOPLm0TzlXk/CAB2wM0XXQkHKrpkXglrPHKkEtftSwCjuVRP3t7TqqESK mwqoKZRybu74iu0fz+q3OrSQs3NydidhrfzPo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=8jbqAu4MTtKZYfsDRzg9oxuVgmZXrDgAOWgETbdmt3I=; b=OWh/NLUzlGXzyUZrB3xCzpcWTM9XCuVlM7aXa3sdHDm0LtVmCrF0QK9f9MC+nt3j0c AEn6fE5EN6lohs1B3U+eea7FUmqfDMq+yS7SjbUapMGZqMfeX8JbtqTiLc5VcgrVABu9 j8fgBFdpmQvTMC9c817O9fxJW+pTTBbi7BrOeAHXFSITI6Gt6jO30zW3QA5put/TXDtB P9boGbZDH0vsjDiUEBpCEFrNGhwZrY0oXiGgZcJt6vxfiLvOBQJwDnNrerwhdwG74KyV St83sYwa+L1MhSrAhCpVOkrtTOraDzWWUX5IC4Kg9H8YCtBwZFN5VbeYdXJzf0ewpsno XTxA== X-Gm-Message-State: AOAM530dKc8ipdm51KsN7rUbF7BpytMmRO5cypDbG3zPfFzEQA9jWqau RtOwwbPO8qJ2gJBM5UqdBPVB3eXaSWo= X-Received: by 2002:a2e:964d:: with SMTP id z13mr2521557ljh.98.1595528583240; Thu, 23 Jul 2020 11:23:03 -0700 (PDT) Received: from mail-lf1-f50.google.com (mail-lf1-f50.google.com. [209.85.167.50]) by smtp.gmail.com with ESMTPSA id 186sm3250414lfn.66.2020.07.23.11.23.01 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 23 Jul 2020 11:23:02 -0700 (PDT) Received: by mail-lf1-f50.google.com with SMTP id b11so3790216lfe.10 for ; Thu, 23 Jul 2020 11:23:01 -0700 (PDT) X-Received: by 2002:ac2:58d5:: with SMTP id u21mr2835310lfo.31.1595528581489; Thu, 23 Jul 2020 11:23:01 -0700 (PDT) MIME-Version: 1.0 References: <20200721063258.17140-1-mhocko@kernel.org> <20200723124749.GA7428@redhat.com> <20200723180100.GA21755@redhat.com> In-Reply-To: <20200723180100.GA21755@redhat.com> From: Linus Torvalds Date: Thu, 23 Jul 2020 11:22:44 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page To: Oleg Nesterov Cc: Hugh Dickins , Michal Hocko , Linux-MM , LKML , Andrew Morton , Tim Chen , Michal Hocko Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 23, 2020 at 11:01 AM Oleg Nesterov wrote: > > > + * > > + * We _really_ should have a "list_del_init_careful()" to > > + * properly pair with the unlocked "list_empty_careful()" > > + * in finish_wait(). > > + */ > > + smp_mb(); > > + list_del_init(&wait->entry); > > I think smp_wmb() would be enough, but this is minor. Well, what we _really_ want (and what that comment is about) would be got that list_del_init_careful() to use a "smp_store_release()" for the last store, and then "list_empty_careful()" would use a "smp_load_acquire()" for the corresponding first load. On x86, that's free. On most other architectures, it's the minimal ordering requirement. And no, I don't think "smp_wmb()" is technically enough. With crazy memory ordering, one of the earlier *reads* (eg loading "wait->private" when waking things up) could have been delayed until after the stores that initialize the list - and thus read stack contents from another process after it's been released and re-used. Does that happen in reality? No. There are various conditionals in there which means that the stores end up being gated on the loads and cannot actually be re-ordered, but it's the kind of subtley So we actually do want to constrain all earlier reads and writes wrt the final write. Which is exactly what "smp_store_release()" does. But with our current list_empty_careful(), the smp_mb() is I think technically sufficient. > We need a barrier between "wait->flags |= WQ_FLAG_WOKEN" and list_del_init(), See above: we need more than just that write barrier, although in _practice_ you're right, and the other barriers actually all already exist and are part of wake_up_state(). So the smp_mb() is unnecessary, and in fact your smp_wmb() would be too. But I left it there basically as "documentation". > But afaics we need another barrier, rmb(), in wait_on_page_bit_common() fo > the case when wait->private was not blocked; we need to ensure that if > finish_wait() sees list_empty_careful() == T then we can't miss WQ_FLAG_WOKEN. Again, this is what a proper list_empty_careful() with a smp_load_acquire() would have automatically gotten for us. But yes, I think that without that, and with the explicit barriers, we need an smp_rmb() after the list_empty_careful(). I really think it should be _in_ list_empty_careful(), though. Or maybe finish_wait(). Hmm. Because looking at all the other finish_wait() uses, the fact that the waitqueue _list_ is properly ordered isn't really a guarantee of the rest of the stack space is. In practice, it will be, but I think this lack of serialization is a potential real bug elsewhere too. (Obviously none of this would show on x86, where we already *get* that smp_store_release/smp_load_acquire behavior for the existing list_del_init()/list_empty_careful(), since all stores are releases, and all loads are acquires) So I think that is a separate issue, generic to our finish_wait() uses. Linus