Reply-To: sathyanarayanan.kuppuswamy@linux.intel.com
Subject: Re: [PATCH v1 1/1] mm/vmalloc.c: Fix percpu free VM area search criteria
To: Dennis Zhou
Cc: Dave Hansen, Uladzislau Rezki, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20190729232139.91131-1-sathyanarayanan.kuppuswamy@linux.intel.com> <20190730204643.tsxgc3n4adb63rlc@pc636> <9fdd44c2-a10e-23f0-a71c-bf8f3e6fc384@linux.intel.com>
 <20190730215535.GA67664@dennisz-mbp.dhcp.thefacebook.com>
From: sathyanarayanan kuppuswamy
Organization: Intel
Date: Tue, 30 Jul 2019 15:25:42 -0700
In-Reply-To: <20190730215535.GA67664@dennisz-mbp.dhcp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/30/19 2:55 PM, Dennis Zhou wrote:
> On Tue, Jul 30, 2019 at 02:13:25PM -0700, sathyanarayanan kuppuswamy wrote:
>> On 7/30/19 1:54 PM, Dave Hansen wrote:
>>> On 7/30/19 1:46 PM, Uladzislau Rezki wrote:
>>>>> +		/*
>>>>> +		 * If required width exceeds current VA block, move
>>>>> +		 * base downwards and then recheck.
>>>>> +		 */
>>>>> +		if (base + end > va->va_end) {
>>>>> +			base = pvm_determine_end_from_reverse(&va, align) - end;
>>>>> +			term_area = area;
>>>>> +			continue;
>>>>> +		}
>>>>> +
>>>>>  		/*
>>>>>  		 * If this VA does not fit, move base downwards and recheck.
>>>>>  		 */
>>>>> -		if (base + start < va->va_start || base + end > va->va_end) {
>>>>> +		if (base + start < va->va_start) {
>>>>>  			va = node_to_va(rb_prev(&va->rb_node));
>>>>>  			base = pvm_determine_end_from_reverse(&va, align) - end;
>>>>>  			term_area = area;
>>>>> --
>>>>> 2.21.0
>>>>>
>>>> I guess it is NUMA related issue, i mean when we have several
>>>> areas/sizes/offsets. Is that correct?
>>> I don't think NUMA has anything to do with it. The vmalloc() area
>>> itself doesn't have any NUMA properties I can think of. We don't, for
>>> instance, partition it into per-node areas that I know of.
>>>
>>> I did encounter this issue on a system with ~100 logical CPUs, which is
>>> a moderate amount these days.
>> I agree with Dave. I don't think this issue is related to NUMA.
>> The problem here is about the logic we use to find appropriate vm_area that
>> satisfies the offset and size requirements of pcpu memory allocator.
>>
>> In my test case, I can reproduce this issue if we make request with offset
>> (ffff000000) and size (600000).
>>
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux kernel developer
>>
> I misspoke earlier. I don't think it's numa related either, but I think
> you could trigger this much more easily this way as it could skip more
> viable vma space because it'd have to find more holes.
>
> But it seems that pvm_determine_end_from_reverse() will return the free
> vma below the address if it is aligned so:
>
>     base + end > va->va_end
>
> will always be true and then push down the searching va instead of using
> that va first.

It won't always be true. Initially, the base address is calculated as below:

    base = pvm_determine_end_from_reverse(&va, align) - end;

So on the first iteration base + end equals the aligned end returned for
that VA, which is at most va->va_end, and the check will not trigger.

>
> Thanks,
> Dennis
>

--
Sathyanarayanan Kuppuswamy
Linux kernel developer
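
To illustrate the first-iteration argument above, here is a minimal
user-space sketch. It is not the kernel code: struct free_area,
aligned_end(), initial_base() and exceeds_area() are hypothetical stand-ins,
aligned_end() only models the single-area case of
pvm_determine_end_from_reverse(), and it assumes align is a power of two and
that the aligned end is at least as large as the requested end offset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a free vmap_area; not the kernel struct. */
struct free_area {
	unsigned long va_start;
	unsigned long va_end;
};

/*
 * Simplified single-area model of pvm_determine_end_from_reverse():
 * the highest align-aligned address that is not above va_end.
 * Assumes align is a power of two.
 */
static unsigned long aligned_end(const struct free_area *va,
				 unsigned long align)
{
	return va->va_end & ~(align - 1);
}

/* Initial base as described in the thread: aligned end minus end offset. */
static unsigned long initial_base(const struct free_area *va,
				  unsigned long align, unsigned long end)
{
	return aligned_end(va, align) - end;
}

/* The new check from the patch: does base + end run past this area? */
static bool exceeds_area(const struct free_area *va, unsigned long base,
			 unsigned long end)
{
	return base + end > va->va_end;
}
```

With the initial base, base + end is exactly aligned_end(), which is never
above va_end, so exceeds_area() returns false. The check can only trigger
once base has been set from a different area, or when a different end
offset is being checked against the same base.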