From: Thomas Gleixner
To: Nico Pache, Peter Zijlstra
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Rafael Aquini,
    Waiman Long, Baoquan He, Christoph von Recklinghausen, Don Dutile,
    "Herton R . Krzesinski", David Rientjes, Michal Hocko,
    Andrea Arcangeli, Andrew Morton, Davidlohr Bueso, Ingo Molnar,
    Joel Savitz, Darren Hart, stable@kernel.org
Subject: Re: [PATCH v8] oom_kill.c: futex: Don't OOM reap the VMA containing the robust_list_head
References: <20220408032809.3696798-1-npache@redhat.com> <20220408081549.GM2731@worktop.programming.kicks-ass.net>
Date: Fri, 08 Apr 2022 15:54:26 +0200
Message-ID: <87k0bzk7e5.ffs@tglx>

On Fri, Apr 08 2022 at 04:41, Nico Pache wrote:
> On 4/8/22 04:15, Peter Zijlstra wrote:
>>>
>>> The following case can still fail:
>>> robust head (skipped) -> private lock (reaped) -> shared lock (skipped)
>>
>> This is still all sorts of confused.. it's a list head, the entries can
>> be in any random other VMA. You must not remove *any* user memory before
>> doing the robust thing. Not removing the VMA that contains the head is
>> pointless in the extreme.
> Not sure how its pointless if it fixes all the different reproducers we've
> written for it. As for the private lock case we stated here, we havent been able
> to reproduce it, but I could see how it can be a potential issue (which is why
> its noted).

The below reproduces the problem nicely, i.e. the lock() in the parent
times out. So why would the OOM killer fail to cause the same problem
when it reaps the private anon mapping where the private futex sits?

If you reverse the lock order in the child the robust muck works.

Thanks,

        tglx
---
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#include <sys/mman.h>
#include <sys/types.h>

static char n[4096];

int main(void)
{
	pthread_mutexattr_t mat_s, mat_p;
	pthread_mutex_t *mut_s, *mut_p;
	pthread_barrierattr_t ba;
	pthread_barrier_t *b;
	struct timespec to;
	void *pri, *shr;
	int r;

	shr = mmap(NULL, sizeof(n), PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	pthread_mutexattr_init(&mat_s);
	pthread_mutexattr_setrobust(&mat_s, PTHREAD_MUTEX_ROBUST);
	mut_s = shr;
	pthread_mutex_init(mut_s, &mat_s);

	pthread_barrierattr_init(&ba);
	pthread_barrierattr_setpshared(&ba, PTHREAD_PROCESS_SHARED);
	b = shr + 1024;
	pthread_barrier_init(b, &ba, 2);

	if (!fork()) {
		pri = mmap(NULL, 1<<20, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		pthread_mutexattr_init(&mat_p);
		pthread_mutexattr_setpshared(&mat_p, PTHREAD_PROCESS_PRIVATE);
		pthread_mutexattr_setrobust(&mat_p, PTHREAD_MUTEX_ROBUST);
		mut_p = pri;
		pthread_mutex_init(mut_p, &mat_p);

		// With lock order s, p parent gets timeout
		// With lock order p, s parent gets owner died
		pthread_mutex_lock(mut_s);
		pthread_mutex_lock(mut_p);
		// Remove unmap and lock order does not matter
		munmap(pri, sizeof(n));
		pthread_barrier_wait(b);
		printf("child gone\n");
	} else {
		pthread_barrier_wait(b);
		printf("parent lock\n");
		clock_gettime(CLOCK_REALTIME, &to);
		to.tv_sec += 1;
		r = pthread_mutex_timedlock(mut_s, &to);
		printf("parent lock returned: %s\n", strerror(r));
	}
	return 0;
}
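
To make the lock-order dependence in the reproducer easier to see, here is
a rough userspace model of the exit-time list walk. It is only a sketch and
not kernel or glibc code: struct sim_entry, sim_lock(), sim_exit_walk() and
sim_shared_recovered() are made-up names, and it assumes (consistent with
the comments in the reproducer) that the most recently locked mutex sits
first in the robust list and that the walk gives up at the first entry
whose memory is no longer readable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one robust-list entry; not struct robust_list. */
struct sim_entry {
	struct sim_entry *next;	/* next entry, NULL at the end of the list */
	bool mapped;		/* false once the backing memory is gone */
	bool owner_died;	/* set when the walk reaches this entry */
};

/*
 * Model taking a robust mutex: push the entry at the front of the list.
 * Assumption: the most recently locked mutex becomes the first entry.
 */
static void sim_lock(struct sim_entry **head, struct sim_entry *e)
{
	assert(head && e);
	e->next = *head;
	*head = e;
}

/*
 * Model the exit-time walk: mark every reachable entry as owner-died and
 * stop at the first entry whose memory is unmapped, because its next
 * pointer can no longer be read. Returns the number of entries handled.
 */
static int sim_exit_walk(struct sim_entry *head)
{
	int handled = 0;

	for (struct sim_entry *e = head; e; e = e->next) {
		if (!e->mapped)
			break;	/* everything behind this entry is lost */
		e->owner_died = true;
		handled++;
	}
	return handled;
}

/*
 * Run one scenario with a shared mutex s and a private mutex p.
 * Returns true if s ends up marked owner-died, i.e. a waiter on s
 * would see the owner's death instead of timing out.
 */
bool sim_shared_recovered(bool shared_first, bool unmap_private)
{
	struct sim_entry s = { .mapped = true };
	struct sim_entry p = { .mapped = true };
	struct sim_entry *head = NULL;

	if (shared_first) {
		sim_lock(&head, &s);	/* lock order s, p */
		sim_lock(&head, &p);
	} else {
		sim_lock(&head, &p);	/* lock order p, s */
		sim_lock(&head, &s);
	}

	if (unmap_private)
		p.mapped = false;	/* munmap() or OOM reaping of p's VMA */

	sim_exit_walk(head);
	return s.owner_died;
}
```

Under these assumptions sim_shared_recovered(true, true) returns false,
which corresponds to the timeout in the parent, while
sim_shared_recovered(false, true) returns true, and so do both calls with
unmap_private set to false. The model says nothing about which VMA holds
the list head; it only shows that a single unreadable entry in the middle
hides everything queued behind it.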