From: Thomas Gleixner
To: Max Mehl, LKML
Cc: Greg Kroah-Hartman, Christoph Hellwig, linux-spdx@vger.kernel.org
Subject: Re: [patch 0/9] scripts/spdxcheck: Better statistics and exclude handling
Date: Tue, 17 May 2022 23:43:49 +0200
Message-ID: <8735h7ltre.ffs@tglx>
In-Reply-To: <1652775347.3cr9dmk5qv.2220@fsfe.org>

On Tue, May 17 2022 at 10:25, Max Mehl wrote:
> ~ Thomas Gleixner [2022-05-16 20:59 +0200]:
>> There is also an argument to be made whether we really need to have SPDX
>> identifiers on trivial files:
>>
>> #include
>>
>>
>> Such files are not copyrightable by any means. So what's the value of
>> doubling the line count to add an SPDX identifier? Just to make nice
>> statistics?
>
> We agree that such files are not copyrightable. But where is the
> threshold? Lines of code? Creativity? Number of used functions? And how
> to embed this threshold in tooling?
> So instead of fuzzy exclusion of
> such files in tools like spdxcheck or REUSE, it makes sense to treat
> them as every other file with the cost of adding two comment lines.
>
> This clear-cut rule eases maintaining and growing the effort you and
> others did because developers would know exactly what to add to a new
> file (license + copyright) without requiring looking up the thresholds
> or a manual review by maintainers who can interpret them.

Seriously no. I'm outright refusing to add my copyright to a trivial
file with one or two includes or a silly comment like
'/* empty because */'.

There is nothing copyrightable there.

I'm not going to make a fool of myself just to make tools happy, which
can figure out on their own whether there is reasonable content in the
vast majority of cases.

Also you need some exclude rules in any case. Why?

  - How do you tell a tool that a file is generated, e.g. in the kernel
    the default configuration files?

    Yes, the file content depends on human input to the generator tool,
    but I'm looking forward to the explanation of how this is
    copyrightable, especially with multiple people updating this file
    over time where some of the updates are just done by invoking the
    generator tool itself.

  - How do you tell a tool that a file contains licensing documentation?

    Go and look what license scanners make out of all the various
    license-rules.rst files.

  - ....

Do all scanners have to grow heuristics for ignoring the content past
the topmost SPDX License identifier in certain files or for figuring out
what might be generated content?

You also might need to add information about binary blobs, which
obviously cannot be part of the binary blobs themselves.

The exclude rules I added are lazy and mostly focussed on spdxcheck, but
I'm happy to make them more useful and let them carry information about
the nature of the exclude or morph them into a general scanner info
which also contains binary blob info and other helpful information.
But that needs a larger discussion about the format and rules for such a
file.

That said, I'm all for clear-cut rules, but rules just for rules' sake
are almost as bad as no rules at all. As always you have to apply common
sense and look at the bigger picture and come up with solutions which
are practicable, enforceable and useful for the larger ecosystem.

Your goal of having SPDX ids and copyright notices in every file of a
project is honorable, but impractical for various reasons. See above.

Aside from that, you cannot replace a full-blown license scanner with
REUSE even if your project is SPDX and Copyright notice clean at the top
level of a file. You still need to verify that there is no other
information in a 'clean' file which might be contradicting or
supplemental. You cannot add all of this functionality to REUSE or
whatever.

Thanks,

        tglx