Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp685983imw; Wed, 13 Jul 2022 06:18:51 -0700 (PDT) X-Google-Smtp-Source: AGRyM1svCJK6GFFzSVG28YY4K+weu7hBuT4wFe/1btI0C8v1yPcLRXiDbhAceb9XLoKbh+exhwxJ X-Received: by 2002:a17:90b:3d82:b0:1f0:5894:7e39 with SMTP id pq2-20020a17090b3d8200b001f058947e39mr3825358pjb.187.1657718331482; Wed, 13 Jul 2022 06:18:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657718331; cv=none; d=google.com; s=arc-20160816; b=whH1QXIzNzhUFztRWDLOsgxXwioC96tDUZcT+hOvqv/G9iyQjfNRYP/O673D90R76o Fr7TIkaOys/MtExYsWPbOBsWxRymCnbHij5OtcC6Pny98qlh/R+FZtKpzymEOrF1Zrbp fypVJKc2gXt1hBQb9z/nnUNKI6CHw1o79Ag09uQrwy3VbkWxcr5IjDssepYr6W3HeVLH KWjA9kOsoKKyoiyOXMDek1+v5XBGxOM3U2QcyjztmDGr/j3k0K1DS+Y56a+HUxZfOhxQ Al7qaZ6ytxaH4GDecBUaFyJYVevbvEk+4pC2iig8at8P7JYx4biXOENTN+ctamowWdtO 9gQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=NtlaKy/Pi7fFkMlw5/xShDsKILPGpTf38KvCH4G4WHI=; b=VEiAitPJ+L+jFkLaZ8Tnom1SMbdWd0f5NkHWCd0xWc6CRde5lu31ep4gOrwEVgQHGV R+krjOPOk5S0SAgxoR7QLhw/IgC5U0aB/CLrifZ5See17eRmLdY0ELHm4wzF7Tds0DJw i6c+cYh8FMsMyF881wfhhc3oalnsL3BlRW+nta3nu2IQtz4T3n5aqkJYlbYDUjDNBnjS iuYQQOIoB98FwxB74djBHkiGioqXDDNiaml+zJJqd/Z55460OJcmFpUrBUpfvVmGkfVm 4owgRWr0bZG4NlIalwl56I2bSD+K9JKQQlvUsoddMq/c0GrVGkbE98NY5aaE8w2+RI/+ npqw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DvDozQ8o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h69-20020a638348000000b00415b439bd14si16190283pge.148.2022.07.13.06.18.38; Wed, 13 Jul 2022 06:18:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=DvDozQ8o; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235133AbiGMNKL (ORCPT + 99 others); Wed, 13 Jul 2022 09:10:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229607AbiGMNKK (ORCPT ); Wed, 13 Jul 2022 09:10:10 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2DBE5D59 for ; Wed, 13 Jul 2022 06:10:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657717808; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=NtlaKy/Pi7fFkMlw5/xShDsKILPGpTf38KvCH4G4WHI=; b=DvDozQ8o5yskTULH2U4SfAcpd0QEtJhPTSRtpS5VqLjXWY4hgD9ljlekwYP5IGrmTBQAM+ x4BbSyCMv4i3zaP0SDm0vwfgwlEwdWMLWv7fg62iJ5HNqZMjOB4Xvhos6jtbxJWV+VBftO nn/KCifcKj6T/dqEcigSTNoyZRnWMP4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-653-65NjDsgYNpWr6p8by5j2Dg-1; Wed, 13 Jul 2022 09:10:04 -0400 X-MC-Unique: 65NjDsgYNpWr6p8by5j2Dg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5707E803301; Wed, 13 Jul 2022 13:10:04 +0000 (UTC) Received: from lorien.usersys.redhat.com (unknown [10.22.32.177]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 8D7382026D64; Wed, 13 Jul 2022 13:10:03 +0000 (UTC) Date: Wed, 13 Jul 2022 09:10:02 -0400 From: Phil Auld To: Greg Kroah-Hartman Cc: linux-kernel@vger.kernel.org, "Rafael J . Wysocki" , Tian Tao , Barry Song Subject: Re: [PATCH] drivers/base/node.c: fix userspace break from using bin_attributes for cpumap and cpulist Message-ID: References: <20220712214301.809967-1-pauld@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 13, 2022 at 07:48:00AM -0400 Phil Auld wrote: > Hi Greg, > > On Wed, Jul 13, 2022 at 08:06:02AM +0200 Greg Kroah-Hartman wrote: > > On Tue, Jul 12, 2022 at 05:43:01PM -0400, Phil Auld wrote: > > > Using bin_attributes with a 0 size causes fstat and friends to return that 0 size. > > > This breaks userspace code that retrieves the size before reading the file. Rather > > > than reverting 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size > > > limitation of cpumap ABI") let's put in a size value at compile time. Use direct > > > comparison and a worst-case maximum to ensure compile time constants. For cpulist the > > > max is on the order of NR_CPUS * (ceil(log10(NR_CPUS)) + 1) which for 8192 is 40960. > > > In order to get near that you'd need a system with every other CPU on one node or > > > something similar. e.g. (0,2,4,... 1024,1026...). We set it to a min of PAGE_SIZE > > > to retain the older behavior. For cpumap, PAGE_SIZE is plenty big. > > > > Does userspace care about that size, or can we just put any value in > > there and it will be ok? How about just returning to the original > > PAGE_SIZE value to keep things looking identical, will userspace not > > read more than that size from the file then? > > > > I'll go look. But I think the point of pre-reading the size with fstat is to allocate > a buffer to read into. So that may be a problem. > From what I understand the app does use the size from fstat to allocate a buffer. It also does a lseek to the end and back. This is actaully where it breaks when it gets a 0. This is before: 8747 10:20:32.584460 fstat(6, {..., st_size=4096, ...}) = 0 <0.000006> 8747 10:20:32.584502 fstat(6, {..., st_size=4096, ...}) = 0 <0.000005> 8747 10:20:32.584537 lseek(6, 4096, SEEK_SET) = 4096 <0.000005> 8747 10:20:32.584560 lseek(6, 0, SEEK_SET) = 0 <0.000005> 8747 10:20:32.584585 read(6, "0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46\n", 4096) = 67 <0.000008> 8747 10:20:32.584617 read(6, "", 4096) = 0 <0.000005> (I'll note here their system does seem to have a worst case cpu to node list too) And with the bin_attr change: 8871 09:27:41.034279 fstat(6, {..., st_size=0, ...}) = 0 <0.000009> 8871 09:27:41.034330 fstat(6, {..., st_size=0, ...}) = 0 <0.000010> 8871 09:27:41.034377 lseek(6, 0, SEEK_SET) = 0 <0.000008> 8871 09:27:41.034410 lseek(6, 0, SEEK_SET) = 0 <0.000008> And here it spins in a loop. > That said, I believe in this case it's the cpulist file which given the use of ranges > is very unlikely to actually get that big. > > > > On an 80 cpu 4-node sytem (NR_CPUS == 8192) > > > > We have systems running Linux with many more cpus than that, and your > > company knows this :) > > The 80 cpus here don't matter and we only build with NR_CPUS = 8192 :) > > But yes, I realize now that the cpumap part I posted is broken for larger > NR_CPUS. I originally had it as NR_CPUS, but as I said in my reply to Barry, > it wants to be ~= NR_CPUS/4 + NR_CPUS/32. I'll change that. > > I think we should decide on a max for each and use that. > What values of NR_CPUS are people actually using when they build kernels? Barry mentioned cpuid 100000 for example. I'm not sure if that was real or just illustrating the need for more characters. I've modified my patch to use NR_CPUS/2 for the cpumap which should be plenty. And to use NR_CPUS*6 for the cpulist, which covers up to 99999 cpus safely. Cheers, Phil > Cheers, > Phil > > > > > thanks, > > > > greg k-h > > > > -- --