Subject: Re: [PATCH 0/6] Address issues with SPDX requirements and PEP-263
To: Mauro Carvalho Chehab
Cc: Jonathan Corbet, Linux Media Mailing List, Mauro Carvalho Chehab,
 Greg Kroah-Hartman, Joe Perches, linux-kernel@vger.kernel.org,
 Arnaldo Carvalho de Melo, Sven Eckelmann, Ingo Molnar, Thomas Gleixner,
 Doug Smythies, Aurélien Cedeyn, Vincenzo Frascino,
 linux-doc@vger.kernel.org, "Rafael J.
 Wysocki", Andrew Morton, Thierry Reding, Armijn Hemel, Jiri Olsa,
 Uwe Kleine-König, Namhyung Kim, Peter Zijlstra, Federico Vaga,
 Allison Randal, Alexander Shishkin, Shuah Khan
References: <20190907073419.6a88e318@lwn.net>
 <20190907132259.3199c8a2@coco.lan> <20190907150442.583b44c2@coco.lan>
From: Markus Heiser
Message-ID: <686101df-f40c-916e-2730-353a3852cc84@darmarit.de>
Date: Sat, 7 Sep 2019 20:37:13 +0200
In-Reply-To: <20190907150442.583b44c2@coco.lan>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07.09.19 at 20:04, Mauro Carvalho Chehab wrote:
> On Sat, 7 Sep 2019 19:33:06 +0200
> Markus Heiser wrote:

>> An (uncaught) exception is thrown when writing UTF-8 to a stream which
>> does not support UTF-8 .. this is not a crash, it mostly indicates that the
>> developer made a wrong assumption about the use case.
>
> A not-handled exception is a crash in Python. I've seen python scripts
> crash countless times with non-English names.

This has nothing to do with the language; ask the developers of those scripts.

>> There exists
>> also the possibility to encode the UTF-8 to ASCII and replace unknown
>> code points in the out-stream, or to catch the exception.
>
> Yeah, but getting this right is very painful. I use patchwork since 2013.
> It took *years* for it to not crash with non-ASCII chars[1]. That's, btw,
> the primary reason why I don't usually use python: with other languages,
> an alien char doesn't cause a crash.

Python cares about encoded (text) string types, while other languages and
applications just pipe bytes to streams .. if you care about the encoding, you
need exceptions when someone wants to write UTF-8 to an ASCII output.
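A minimal sketch of the three options discussed above (let the exception
propagate, replace unknown code points, or catch the exception). An in-memory
ASCII stream stands in for a pipe or terminal that cannot take UTF-8; the
helper name write_to_ascii_stream and the sample name are made up for
illustration:

```python
import io


def write_to_ascii_stream(text, errors="strict"):
    """Write text to an ASCII-only stream and return the bytes that arrived.

    Hypothetical helper: the BytesIO buffer stands in for a real
    ASCII-only output such as a pipe or a terminal running with LANG=C.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


name = "Uwe Kleine-K\u00f6nig"  # sample non-ASCII name

# Default ('strict'): the problem is reported as an exception ...
try:
    write_to_ascii_stream(name)
    strict_result = "written"
except UnicodeEncodeError as exc:
    # ... which the application is free to catch and handle.
    strict_result = "caught %s" % type(exc).__name__

# Alternative: replace the code points the stream cannot represent.
replaced = write_to_ascii_stream(name, errors="replace")

print(strict_result)
print(replaced)
```

Which of the three is right depends on the application, not on the language:
the stream's encoding and error policy are choices the script author makes.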
Anyway, this is a bit of nitpicking / not helping here ..

>
> [1] I might be wrong, but the last patch I saw addressing an issue
> there was applied this year.

I already posted an example [1]:

   This means your application has to know the encoding of a stream/file. E.g.
   we handle the output of the external Perl script scripts/kernel-doc
   by decoding the byte stream from the proc-call's stdout as UTF-8:

      out, err = codecs.decode(out, 'utf-8'), codecs.decode(err, 'utf-8')

   see patch https://github.com/torvalds/linux/commit/86c0f046a8b0c23fca65f77333c233a06c25ef9a

Again, this is talking about application development and has nothing to do with
the encoding of the source files.

[1] https://www.mail-archive.com/linux-doc@vger.kernel.org/msg33240.html

>>
>> But this was only academic; where do we have such problems in practice?
>>
>>> At least on media, we define that some Kernel strings can be UTF-8.
>>> See, for example the model field at the media_entity struct:
>>>
>>> https://linuxtv.org/downloads/v4l-dvb-apis/kapi/mc-core.html
>>>
>>> As stated there:
>>>
>>> "media_entity.model must be filled with the device model name as
>>> a NUL-terminated UTF-8 string. The device/model revision must
>>> not be stored in this field."
>>>
>>> I've no idea if the two perf scripts that contain the encoding data are
>>> meant to print some strings that may be UTF-8 encoded (like those that
>>> we have at the media subsystem), or if it is just that whoever added
>>> them was using Emacs and wanted to make his life simpler. As it is better
>>> to be safe than sorry, on patches 2 and 3, I'm assuming the first case.
>>
>> Hm, I'm unsure if I understand you correctly: using UTF-8 in the .rst
>> files is fine .. where do we have scripts generating UTF-8 output?
>> (except the HTML output).
>
> In theory, perf scripts may be reading strings from the Kernel, which
> might be using UTF-8 encoding.
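For the case quoted just above (a script reading bytes that may or may not be
UTF-8), here is a minimal sketch of decoding at the boundary, in the spirit of
the codecs.decode line from the kernel-doc example. The child process and the
helper name run_and_decode are stand-ins, not code from the kernel tree, and
errors="replace" is only one possible policy:

```python
import codecs
import subprocess
import sys


def run_and_decode(cmd, encoding="utf-8", errors="strict"):
    """Run cmd and return (stdout, stderr) as text.

    Hypothetical helper: the child's output is collected as bytes and
    decoded in one place with an explicitly chosen encoding.
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out = codecs.decode(proc.stdout, encoding, errors)
    err = codecs.decode(proc.stderr, encoding, errors)
    return out, err


# Stand-in child process that emits UTF-8 bytes on stdout.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write('model: Caf\\u00e9 Cam'.encode('utf-8'))",
]
out, err = run_and_decode(child)
print(repr(out))

# Bytes that are not valid UTF-8 raise under 'strict';
# with 'replace' the offending byte becomes U+FFFD instead.
broken = codecs.decode(b"caf\xe9", "utf-8", "replace")
print(repr(broken))
```

Inside the script everything is then text; bytes and their encoding only
matter where data enters or leaves the program.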
>
>>
>>>
>>> In any case, we do need the encoding line at Sphinx extensions,
>>> although there, the shebang line is optional.
>>>
>>> In other words, we have those alternatives:
>>>
>>> 1) Neither shebang nor coding -> SPDX will be at first line;
>>> 2) shebang + SPDX -> SPDX will be at the second line;
>>> 3) shebang + coding + SPDX -> SPDX will be at the third line;
>>> 4) coding + SPDX
>>>
>>> This is something that only makes sense for Sphinx extensions.
>>>
>>> IMHO, I would place SPDX at the second line too, but I *guess* Python
>>> may accept it at the first line and would still properly evaluate
>>> coding (as this technically satisfies the text at PEP-263).
>>
>> Why are you so restrictive ..
>
> No idea. I would actually prefer to just remove the restriction, and let
> the SPDX header be anywhere inside the first comment block inside a
> file [2].
>
> That's basically how this thread started: other developers think
> that it is a good idea to be pedantic. So, be it, but let's then fix
> the documentation, as the way it is, it is implicitly forbidding the
> addition of encoding lines for Python scripts.
>
> [2] I *suspect* that the restriction was added in order to make
> ./scripts/spdxcheck.py run faster and to avoid false positives.
> Right now, if the maximum limit is removed (or set to a very high
> value), there will be one false positive:
>
> Documentation/dev-tools/kselftest.rst
>
> This doc has a SPDX-like tag at line 230, asking people to add SPDX
> headers on files, but the file itself doesn't have its own SPDX tag.
>
>> what we normally do:
>>
>> - write a shebang line if this file is called directly from the
>> command line .. but we do not need shebangs on py modules which
>> are imported from other modules or scripts
>>
>> - write an encoding line if it is needed or helpful / mostly it is helpful
>> to know the encoding of a text/code file.
>>
>> - add a SPDX tag
>
> Yes, but this violates the current documentation, as it doesn't allow the
> SPDX tag after line #2.

That's what I mean: the documentation was written with only a small set of
use cases in mind .. there is no real need for SPDX to be in line one or two ...
let's fix the documentation as I described before.

Side note: if I can help you with perf or your build systems, don't hesitate to
contact me directly.

-- Markus --

>> At the end we will have files with one, two or all three of these lines.
>> And the order of these lines is what I wrote:
>>
>>>>
>>>> That's what I mean [1] .. let's patch the description in the license-rules.rst::
>>>>
>>>> - first line for the OS (shebang)
>>>> - second line for environment (python-encoding, editor-mode, ...)
>>>> - third and more lines for application (SPDX use) ..
>>>>
>>>> [1] https://www.mail-archive.com/linux-doc@vger.kernel.org/msg33240.html
>>>>
>>>> -- Markus --
>>>>
>>>>> This suggests to me that we're adding a bunch of complications that we
>>>>> don't necessarily need. What am I missing here?
>>>>>
>>>>> Educate me properly and I'll not try to stand in the way of all this...
>>>>>
>>
>>
>> It seems like it is not only me who is missing something .. what are
>> the use cases where we have py-Exceptions, what are the use cases to be so
>> restrictive as you described above.
>>
>> .. or did alice get lost in the cave?
>>
>> Thanks for your patience with me
>>
>> -- Markus --
>
>
>
> Thanks,
> Mauro
>
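The ordering proposed in this thread (shebang first, then the PEP-263 coding
line, then SPDX) can be checked mechanically. A rough sketch, assuming a
hypothetical helper header_layout and a hypothetical max_lines limit; this is
not how scripts/spdxcheck.py works, only an illustration. The coding pattern is
the one given in PEP 263, which honours the declaration only on line 1 or 2,
and the GPL-2.0 tags are sample data:

```python
import re

# Pattern from PEP 263; Python only looks for it on line 1 or 2.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(\S.*)")


def header_layout(source, max_lines=3):
    """Return the 1-based line numbers of shebang, coding and SPDX lines.

    Hypothetical helper: only the first max_lines lines are inspected.
    A key maps to None when that kind of line was not found.
    """
    found = {"shebang": None, "coding": None, "spdx": None}
    for number, line in enumerate(source.splitlines()[:max_lines], start=1):
        if number == 1 and line.startswith("#!"):
            found["shebang"] = number
        elif number <= 2 and found["coding"] is None and CODING_RE.match(line):
            found["coding"] = number
        elif found["spdx"] is None and SPDX_RE.search(line):
            found["spdx"] = number
    return found


# The four alternatives listed earlier in the thread.
alternatives = {
    "spdx only": "# SPDX-License-Identifier: GPL-2.0\n",
    "shebang + spdx": "#!/usr/bin/env python3\n# SPDX-License-Identifier: GPL-2.0\n",
    "shebang + coding + spdx": (
        "#!/usr/bin/env python3\n"
        "# -*- coding: utf-8 -*-\n"
        "# SPDX-License-Identifier: GPL-2.0\n"
    ),
    "coding + spdx": "# -*- coding: utf-8 -*-\n# SPDX-License-Identifier: GPL-2.0\n",
}

layouts = {name: header_layout(text) for name, text in alternatives.items()}
for name, layout in layouts.items():
    print(name, layout)
```

With max_lines=2 the third alternative reports no SPDX tag at all, which is
exactly the conflict with the current wording of the documentation.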