Message-ID: <97d8beb2-db33-1e50-eadb-6ac8d650f044@jilayne.com>
Date: Mon, 23 May 2022 10:11:37 -0600
Subject: Re: [patch 0/9] scripts/spdxcheck: Better statistics and exclude handling
To: Thomas Gleixner, Max Mehl, LKML
Cc: Greg Kroah-Hartman, Christoph Hellwig, linux-spdx@vger.kernel.org
References: <20220516101901.475557433@linutronix.de> <1652706350.kh41opdwg4.2220@fsfe.org> <87zgjhpawr.ffs@tglx> <87wnelpam3.ffs@tglx> <1652775347.3cr9dmk5qv.2220@fsfe.org> <8735h7ltre.ffs@tglx>
From: J Lovejoy
In-Reply-To: <8735h7ltre.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/17/22 3:43 PM, Thomas Gleixner wrote:
> On Tue, May 17 2022 at 10:25, Max Mehl wrote:
>> ~ Thomas Gleixner [2022-05-16 20:59 +0200]:
>>> There is also an argument to be made whether we really need to have SPDX
>>> identifiers on trivial files:
>>>
>>> #include
>>>
>>>
>>> Such files are not copyrightable by any means. So what's the value of
>>> doubling the line count to add an SPDX identifier? Just to make nice
>>> statistics?
>> We agree that such files are not copyrightable.
>> But where is the threshold? Lines of code? Creativity? Number of used
>> functions? And how to embed this threshold in tooling? So instead of
>> fuzzy exclusion of such files in tools like spdxcheck or REUSE, it makes
>> sense to treat them as every other file with the cost of adding two
>> comment lines.
>>
>> This clear-cut rule eases maintaining and growing the effort you and
>> others did because developers would know exactly what to add to a new
>> file (license + copyright) without requiring looking up the thresholds
>> or a manual review by maintainers who can interpret them.
> Seriously no. I'm outright refusing to add my copyright to a trivial
> file with one or two includes or a silly comment like '/* empty because */.
>
> There is nothing copyrightable there.
>
> I'm not going to make myself a fool just to make tools happy, which can
> figure out on their own whether there is reasonable content in the vast
> majority of cases.
>
> Also you need some exclude rules in any case. Why?
>
> - How do you tell a tool that a file is generated, e.g. in the kernel
>   the default configuration files?
>
>   Yes, the file content depends on human input to the generator tool,
>   but I'm looking forward for the explanation how this is
>   copyrightable especially with multiple people updating this file
>   over time where some of the updates are just done by invoking the
>   generator tool itself.
>
> - How do you tell a tool that a file contains licensing documentation?
>
>   Go and look what license scanners make out of all the various
>   license-rules.rst files.
>
> - ....
>
> Do all scanners have to grow heuristics for ignoring the content past
> the topmost SPDX License identifier in certain files or for figuring
> out what might be generated content?
>
> You also might need to add information about binary blobs, which
> obviously cannot be part of the binary blobs themself.
>
> The exclude rules I added are lazy and mostly focussed on spdxcheck, but
> I'm happy to make them more useful and let them carry information about
> the nature of the exclude or morph them into a general scanner info
> which also contains binary blob info and other helpful information. But
> that needs a larger discussion about the format and rules for such a
> file.
>
> That said, I'm all for clear cut rules, but rules just for the rules
> sake are almost as bad as no rules at all.
>
> As always you have to apply common sense and look at the bigger picture
> and come up with solutions which are practicable, enforcable and useful
> for the larger eco-system.
>
> Your goal of having SPDX ids and copyright notices in every file of a
> project is honorable, but impractical for various reasons.
>
> See above.
>
> Aside of that you cannot replace a full blown license scanner by REUSE
> even if your project is SPDX and Copyright notice clean at the top level
> of a file. You still need to verify that there is no other information
> in a 'clean' file which might be contradicting or supplemental. You
> cannot add all of this functionality to REUSE or whatever.
>

Max, Thomas,

I think the discussion here is hitting upon the "inconvenience" of the
lack of black/white rules in the law (as to what is copyrightable)
versus the convenience of downstream recipients of code who want to be
sure they have proper rights (which mixes in the guidance/rules of
REUSE, tooling, etc.).

I think some rules in terms of files that are clearly not copyrightable
can be implemented in various tooling (hopefully, with the guidance of a
lawyer steeped in copyright law), and I agree that putting a license (by
way of an SPDX identifier or any other way for that matter) on such
files is neither a good use of time nor a good idea (from the
perspective of being inaccurate as to the need for a license and thus
sending the wrong impression).
That being said, there will not be a way to make clear-cut rules for
everything without involving a judge. Sorry! That's just how the law
works (and we often don't actually want black/white lines in the law).

I can see a policy of, "when it's not clear (as to copyrightability),
then add a license", though.

Thanks,
Jilayne