Subject: Re: lost connection to test machine (4)
From: Eric Dumazet
To: Dennis Zhou
Cc: Tejun Heo, Daniel Borkmann, Dmitry Vyukov, syzbot, Alexei Starovoitov, netdev, LKML, syzkaller-bugs@googlegroups.com
Date: Tue, 13 Feb 2018 09:49:27 -0800
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2018-02-13 at 11:34 -0600, Dennis Zhou wrote:
> Hi Eric,
>
> On Tue, Feb 13, 2018 at 05:35:26AM -0800, Eric Dumazet wrote:
> >
> > Also I would consider using this fix as I had warnings of cpus being
> > stuck there for more than 50 ms:
> >
> >
> > diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
> > index 9158e5a81391ced4e268e3d5dd9879c2bc7280ce..6309b01ceb357be01e857e5f899429403836f41f 100644
> > --- a/mm/percpu-vm.c
> > +++ b/mm/percpu-vm.c
> > @@ -92,6 +92,7 @@ static int pcpu_alloc_pages(struct pcpu_chunk *chunk,
> >  			*pagep = alloc_pages_node(cpu_to_node(cpu), gfp, 0);
> >  			if (!*pagep)
> >  				goto err;
> > +			cond_resched();
> >  		}
> >  	}
> >  	return 0;
> >
> >
>
> This function gets called from pcpu_populate_chunk while holding the
> pcpu_alloc_mutex and is called from two scenarios. First, when an
> allocation occurs to a place without backing pages, and second when the
> workqueue item is scheduled to replenish the number of empty pages. So,
> I don't think this is a good idea.
>

That _is_ a good idea, we do this already in vmalloc(), and vmalloc()
can absolutely be called while some mutex(es) are held.

> My understanding is if we're seeing warnings here, that means we're
> struggling to find backing pages. I believe adding __GFP_NORETRY on the
> workqueue path as Tejun mentioned above would help with warnings as
> well, but not if they are caused by the allocation path.
>

That is a separate concern.

My patch simply avoids latency spikes when huge percpu allocations are
happening, on systems with say 1024 cpus.
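
For illustration only (this is not part of the patch): a minimal userspace
sketch of the same pattern, a nested per-cpu/per-page allocation loop that
gives up the CPU after every allocation. It assumes sched_yield() as a rough
stand-in for cond_resched(); the two are not equivalent, since cond_resched()
only reschedules when the kernel has flagged that it is needed. The names
populate_pages(), release_pages() and PAGE_BYTES are hypothetical.

```c
#include <assert.h>
#include <sched.h>	/* sched_yield(): rough userspace stand-in for cond_resched() */
#include <stdlib.h>

/* Hypothetical buffer size for the illustration; not taken from the kernel. */
#define PAGE_BYTES 4096

/*
 * Fill pages[cpu * nr_pages + i] for every (cpu, i) pair, yielding the CPU
 * after each allocation so that a very large run (many cpus times many
 * pages) cannot hog the processor in one stretch.
 * Returns 0 on success; on failure frees what was allocated and returns -1.
 */
static int populate_pages(void **pages, int nr_cpus, int nr_pages)
{
	int cpu, i, n, done = 0;

	assert(pages != NULL && nr_cpus > 0 && nr_pages > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		for (i = 0; i < nr_pages; i++) {
			void **pagep = &pages[cpu * nr_pages + i];

			*pagep = malloc(PAGE_BYTES);
			if (!*pagep)
				goto err;
			done++;
			sched_yield();	/* the point where the patch calls cond_resched() */
		}
	}
	return 0;

err:
	/* Slots are filled in index order, so the first 'done' entries are live. */
	for (n = 0; n < done; n++) {
		free(pages[n]);
		pages[n] = NULL;
	}
	return -1;
}

/* Free every slot filled by a successful populate_pages() call. */
static void release_pages(void **pages, int nr_cpus, int nr_pages)
{
	int n;

	for (n = 0; n < nr_cpus * nr_pages; n++) {
		free(pages[n]);
		pages[n] = NULL;
	}
}
```

Holding a sleeping lock such as a mutex across the yield point is fine in
this sketch, as it is for the mutex case discussed above; the yield only
bounds how long one caller keeps the CPU.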