Message-ID: <1518528926.3715.173.camel@gmail.com>
Subject: Re: lost connection to test machine (4)
From: Eric Dumazet
To: Tejun Heo, Daniel Borkmann
Cc: dennisszhou@gmail.com, Dmitry Vyukov, syzbot, Alexei Starovoitov,
 netdev, LKML, syzkaller-bugs@googlegroups.com
Date: Tue, 13 Feb 2018 05:35:26 -0800
In-Reply-To: <20180212200548.GG695913@devbig577.frc2.facebook.com>
References: <001a113f8734783e94056505f8fd@google.com>
 <00c45ca8-305d-1818-e974-a9903c8494b8@iogearbox.net>
 <20180212170325.GW695913@devbig577.frc2.facebook.com>
 <20180212200548.GG695913@devbig577.frc2.facebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2018-02-12 at 12:05 -0800, Tejun Heo wrote:
> On Mon, Feb 12, 2018 at 09:03:25AM -0800, Tejun Heo wrote:
> > Hello, Daniel.
> >
> > On Mon, Feb 12, 2018 at 06:00:13PM +0100, Daniel Borkmann wrote:
> > > [ +Dennis, +Tejun ]
> > >
> > > Looks like we're stuck in percpu allocator with key/value size of 4 bytes
> > > each and large number of entries (max_entries) in the reproducer in above
> > > link.
> > >
> > > Could we have some __GFP_NORETRY semantics and let allocations fail instead
> > > of triggering OOM killer?
> >
> > For some part, maybe, but not generally.  The virt area allocation
> > goes down to page table allocation which is hard coded to use
> > GFP_KERNEL in arch mm code.
>
> So, the following should convert majority of allocations to use
> __GFP_NORETRY.  It doesn't catch everything but should significantly
> lower the probability of hitting this and put this on the same footing
> as vmalloc.  Can you see whether this is enough?
>
> Note that this patch isn't upstreamable.  We definitely want to
> restrict this to the rebalance path, but it should be good enough for
> testing.
>
> Thanks.
>
> diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
> index 9158e5a..0b4739f 100644
> --- a/mm/percpu-vm.c
> +++ b/mm/percpu-vm.c
> @@ -81,7 +81,7 @@ static void pcpu_free_pages(struct pcpu_chunk *chunk,
>  static int pcpu_alloc_pages(struct pcpu_chunk *chunk,
>  			    struct page **pages, int page_start, int page_end)
>  {
> -	const gfp_t gfp = GFP_KERNEL | __GFP_HIGHMEM;
> +	const gfp_t gfp = GFP_KERNEL | __GFP_HIGHMEM | __GFP_NORETRY;
>  	unsigned int cpu, tcpu;
>  	int i;
>  

Also, I would consider using this fix, as I had warnings of CPUs being
stuck there for more than 50 ms:

diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
index 9158e5a81391ced4e268e3d5dd9879c2bc7280ce..6309b01ceb357be01e857e5f899429403836f41f 100644
--- a/mm/percpu-vm.c
+++ b/mm/percpu-vm.c
@@ -92,6 +92,7 @@ static int pcpu_alloc_pages(struct pcpu_chunk *chunk,
 			*pagep = alloc_pages_node(cpu_to_node(cpu), gfp, 0);
 			if (!*pagep)
 				goto err;
+			cond_resched();
 		}
 	}
 	return 0;
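
For anyone who wants to play with the shape of this loop outside the
kernel, here is a rough userspace sketch. It is only an analogy and not
the kernel code: alloc_pages_sketch(), free_pages_sketch() and PAGE_SZ
are hypothetical names made up for illustration, malloc() stands in for
alloc_pages_node(), and sched_yield() stands in for cond_resched(). The
two ideas it tries to mirror are failing the whole operation on the
first unsuccessful allocation instead of retrying, and giving the
scheduler a chance to run something else after every page.

```c
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

#define PAGE_SZ 4096 /* hypothetical page size, for this sketch only */

/* Release every slot filled so far; free(NULL) is a no-op. */
static void free_pages_sketch(void **pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		free(pages[i]);
		pages[i] = NULL;
	}
}

/*
 * Fill nr_cpus * nr_pages slots, one "page" per slot.
 * Returns 0 on success, or -1 after freeing everything allocated so far.
 */
static int alloc_pages_sketch(void **pages, size_t nr_cpus, size_t nr_pages)
{
	size_t total = nr_cpus * nr_pages;

	assert(pages != NULL);
	for (size_t i = 0; i < total; i++)
		pages[i] = NULL;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		for (size_t i = 0; i < nr_pages; i++) {
			void **pagep = &pages[cpu * nr_pages + i];

			/* No retry loop: the first failure aborts the request. */
			*pagep = malloc(PAGE_SZ);
			if (!*pagep)
				goto err;
			/* Userspace stand-in for cond_resched(). */
			sched_yield();
		}
	}
	return 0;
err:
	free_pages_sketch(pages, total);
	return -1;
}
```

A caller would pass an array of nr_cpus * nr_pages pointers, check the
return value, and hand the same array to free_pages_sketch() when done.
Yielding once per allocation is the simplest placement; whether that
granularity is right for a real workload is a separate question.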