2015-12-01 17:11:11

by Michal Suchanek

Subject: Non-ascii mantainers

Hello,

there are non-ascii characters in output of scripts/get_maintainer.pl

If output of said script is used as --to in git format-patch the patch
is rejected by this list.

I am not sure if weeding out non-ascii maintainers is even feasible.
scripts/get_maintainer.pl uses the MAINTAINERS file and git log. While
non-ascii in MAINTAINERS file can be replaced I am not sure how many
git commits use non-ascii committers.

Thanks

Michal


2015-12-01 17:19:13

by Joe Perches

Subject: Re: Non-ascii mantainers

On Tue, 2015-12-01 at 18:10 +0100, Michal Suchanek wrote:
> there are non-ascii characters in output of scripts/get_maintainer.pl
>
> If output of said script is used as --to in git format-patch the patch
> is rejected by this list.

I've used it. It works for me with 8-bit characters.
Maybe there's something else going on.

> I am not sure if weeding out non-ascii maintainers is even feasible.
> scripts/get_maintainer.pl uses the MAINTAINERS file and git log. While
> non-ascii in MAINTAINERS file can be replaced I am not sure how many
> git commits use non-ascii committers.

2015-12-01 17:20:06

by Al Viro

Subject: Re: Non-ascii mantainers

On Tue, Dec 01, 2015 at 06:10:29PM +0100, Michal Suchanek wrote:
> Hello,
>
> there are non-ascii characters in output of scripts/get_maintainer.pl
>
> If output of said script is used as --to in git format-patch the patch
> is rejected by this list.

Try to reproduce that in a UTF8 locale...

2015-12-01 17:28:26

by Michal Suchanek

Subject: Re: Non-ascii mantainers

On 1 December 2015 at 18:20, Al Viro <[email protected]> wrote:
> On Tue, Dec 01, 2015 at 06:10:29PM +0100, Michal Suchanek wrote:
>> Hello,
>>
>> there are non-ascii characters in output of scripts/get_maintainer.pl
>>
>> If output of said script is used as --to in git format-patch the patch
>> is rejected by this list.
>
> Try to reproduce that in a UTF8 locale...

I have been using a UTF-8 locale for ages.

The characters show correctly in my terminal. I have no problem with
that. The e-mail is then just rejected by the list server.

I don't really care if the maintainers are encoded or whatever.
However, neither get_maintainers nor git format-patch encodes them and
the listserver rejects them when not encoded.

Thanks

Michal

<[email protected]>:
209.132.180.67 failed after I sent the message.
Remote host said: 550 5.7.1 Content-Policy reject msg: Message headers
can not have 8-bit non-ASCII characters in it; Use MIME encodings if
such are needed! BF:<H 0>; S1754713AbbLARAp

<[email protected]>:
209.132.180.67 failed after I sent the message.
Remote host said: 550 5.7.1 Content-Policy reject msg: Message headers
can not have 8-bit non-ASCII characters in it; Use MIME encodings if
such are needed! BF:<H 0>; S1755924AbbLARAp
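
The rejection above is triggered by raw bytes above 0x7F in the header
section of the message. A minimal Python sketch of such a check follows. It
is not the list server's actual filter; the function name headers_with_8bit
and the example.org address are made up for illustration.

```python
from typing import List


def headers_with_8bit(raw_message: bytes) -> List[str]:
    """Return the names of header fields that contain bytes >= 0x80."""
    # The header section ends at the first empty line (CRLF or bare LF).
    head = raw_message.replace(b"\r\n", b"\n").split(b"\n\n", 1)[0]
    offenders: List[str] = []
    current = None
    for line in head.split(b"\n"):
        # A line starting with space or tab continues the previous field.
        if line[:1] not in (b" ", b"\t"):
            current = line.split(b":", 1)[0].decode("ascii", "replace")
        has_8bit = any(byte >= 0x80 for byte in line)
        if has_8bit and current is not None and current not in offenders:
            offenders.append(current)
    return offenders


# Hypothetical message: raw UTF-8 in To:, which a strict server may refuse.
raw = (
    "To: Måns Rullgård <mans@example.org>\n"
    "Subject: [PATCH] test\n"
    "\n"
    "The body may contain å without affecting this check.\n"
).encode("utf-8")
print(headers_with_8bit(raw))
```

Running this on a file written by git format-patch before mailing it would
show which header fields still need MIME encoding.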

2015-12-01 20:22:39

by Andreas Mohr

Subject: Re: Non-ascii mantainers

Hi,

> I don't really care if the maintainers are encoded or whatever.
> However, neither get_maintainers nor git format-patch encodes them and
> the listserver rejects them when not encoded.

"neither ... nor" - IMHO transcoding should always be done at exactly *one* layer
transition: just before entering the layer that either treats some character
values also found in the payload as control chars, or accepts only a "reduced"
encoding [think 7bit vs. 8bit]. It should be done only there, and of course
correctly [transcoding the full set of control chars, and properly]. ;)

So, since git-format-patch (according to git-format-patch(1)) itself
declares that it does "Prepare patches for e-mail submission",
it would seem that format-patch would definitely need to provide
readily submittable content i.e.
support submitting (i.e., generating) such content in *fully compatible* format
(maybe it would not need to generate this as MIME encoding always
- since there might be different encoding technologies to be chosen -
but at least it should offer an encoding cmdline option,
with this option then definitely defaulting to the mainstream choice, probably MIME).
IOW, I would consider this to be a git-format-patch "missing crucial i18n support" bug
(a bug for this should probably be filed).

(and, due to my reasoning above, transcoding would *not* be the job of get_maintainers)

Rather astonishing that this issue is hitting the streets in 2015 -
if we aren't missing something here, that is...

HTH,

Andreas Mohr

2015-12-03 06:56:04

by Michal Suchanek

Subject: Re: Non-ascii mantainers

On 1 December 2015 at 21:13, Andreas Mohr <[email protected]> wrote:
> Hi,
>
>> I don't really care if the maintainers are encoded or whatever.
>> However, neither get_maintainers nor git format-patch encodes them and
>> the listserver rejects them when not encoded.
>
> "neither ... nor" - IMHO transcoding should always be done at exactly *one* layer
> transition: just before entering the layer that either treats some character
> values also found in the payload as control chars, or accepts only a "reduced"
> encoding [think 7bit vs. 8bit]. It should be done only there, and of course
> correctly [transcoding the full set of control chars, and properly]. ;)
>
> So, since git-format-patch (according to git-format-patch(1)) itself
> declares that it does "Prepare patches for e-mail submission",
> it would seem that format-patch would definitely need to provide
> readily submittable content i.e.
> support submitting (i.e., generating) such content in *fully compatible* format
> (maybe it would not need to generate this as MIME encoding always
> - since there might be different encoding technologies to be chosen -
> but at least it should offer an encoding cmdline option,
> with this option then definitely defaulting to the mainstream choice, probably MIME).
> IOW, I would consider this to be a git-format-patch "missing crucial i18n support" bug
> (a bug for this should probably be filed).
>
> (and, due to my reasoning above, transcoding would *not* be the job of get_maintainers)
>

Transcoding would normally be the job of the MUA, which here is git
format-patch. If it were really required, that is.

I am not sure what exactly the position of the SMTP standard is on
8-bit headers. However, as with all RFCs, the advice is to be strict
in what you send and lenient in what you accept. Here git format-patch
can be held responsible for the lack of interoperability with 7-bit-only
SMTP servers, and the LKML mail server for enforcing 7-bit only in 2015,
when most mail servers are 8-bit transparent.

Thanks

Michal

2015-12-07 09:18:45

by Geert Uytterhoeven

Subject: Re: Non-ascii mantainers

Hi Michael,

On Tue, Dec 1, 2015 at 6:27 PM, Michal Suchanek <[email protected]> wrote:
> On 1 December 2015 at 18:20, Al Viro <[email protected]> wrote:
>> On Tue, Dec 01, 2015 at 06:10:29PM +0100, Michal Suchanek wrote:
>>> there are non-ascii characters in output of scripts/get_maintainer.pl
>>>
>>> If output of said script is used as --to in git format-patch the patch
>>> is rejected by this list.
>>
>> Try to reproduce that in a UTF8 locale...
>
> I have been using a UTF-8 locale for ages.
>
> The characters show correctly in my terminal. I have no problem with
> that. The e-mail is then just rejected by the list server.
>
> I don't really care if the maintainers are encoded or whatever.
> However, neither get_maintainers nor git format-patch encodes them and
> the listserver rejects them when not encoded.

I always pass the --to and --cc to git send-email, not to format-patch, and
that works:

git send-email \
    --to "Måns Rullgård <[email protected]>" \
    --to "David S. Miller <[email protected]>" \
    --cc "[email protected]" \
    --cc "[email protected]" \
    *00*

becomes:

From: Geert Uytterhoeven <[email protected]>
To: =?UTF-8?q?M=C3=A5ns=20Rullg=C3=A5rd?= <[email protected]>,
        "David S. Miller" <[email protected]>
Cc: [email protected],
        [email protected],
        Geert Uytterhoeven <[email protected]>
Subject: [PATCH] ethernet: aurora: AURORA_NB8800 should depend on HAS_DMA
Date: Mon, 7 Dec 2015 10:09:06 +0100
Message-Id: <[email protected]>
X-Mailer: git-send-email 1.9.1
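
The To: line above carries the name as an RFC 2047 encoded-word. As a rough
illustration only (git send-email is a Perl script and does not run this),
Python's standard email package can produce and decode an equivalent header.
The helper names and the example.org address are assumptions. The library
may pick the base64 ("b") form rather than the quoted-printable ("q") form
shown above; both are valid.

```python
from email.header import decode_header
from email.utils import formataddr, parseaddr
from typing import Tuple


def encode_recipient(name: str, address: str) -> str:
    """Build a 7-bit-safe 'Name <addr>' value for a To: or Cc: header."""
    # formataddr wraps a non-ASCII display name in an RFC 2047 encoded-word
    # and leaves a pure-ASCII name readable.
    return formataddr((name, address), charset="utf-8")


def decode_recipient(header_value: str) -> Tuple[str, str]:
    """Reverse encode_recipient: return (display name, address)."""
    raw_name, address = parseaddr(header_value)
    parts = []
    for chunk, charset in decode_header(raw_name):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts), address


encoded = encode_recipient("Måns Rullgård", "mans@example.org")
print(encoded)                    # ASCII only, safe for strict servers
print(decode_recipient(encoded))  # the original name and address again
```

A value produced this way could be passed to --to without putting 8-bit
characters into the header.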

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2015-12-07 09:41:24

by Michal Suchanek

Subject: Re: Non-ascii mantainers

On 7 December 2015 at 10:18, Geert Uytterhoeven <[email protected]> wrote:
> Hi Michael,
>
> On Tue, Dec 1, 2015 at 6:27 PM, Michal Suchanek <[email protected]> wrote:
>> On 1 December 2015 at 18:20, Al Viro <[email protected]> wrote:
>>> On Tue, Dec 01, 2015 at 06:10:29PM +0100, Michal Suchanek wrote:
>>>> there are non-ascii characters in output of scripts/get_maintainer.pl
>>>>
>>>> If output of said script is used as --to in git format-patch the patch
>>>> is rejected by this list.
>>>
>>> Try to reproduce that in a UTF8 locale...
>>
>> I have been using a UTF-8 locale for ages.
>>
>> The characters show correctly in my terminal. I have no problem with
>> that. The e-mail is then just rejected by the list server.
>>
>> I don't really care if the maintainers are encoded or whatever.
>> However, neither get_maintainers nor git format-patch encodes them and
>> the listserver rejects them when not encoded.
>
> I always pass the --to and --cc to git send-email, not to format-patch, and
> that works:
>
> git send-email \
> --to "Måns Rullgård <[email protected]>" \
> --to "David S. Miller <[email protected]>" \
> --cc "[email protected]" \
> --cc "[email protected]" \
> *00*
>
> becomes:
>
> From: Geert Uytterhoeven <[email protected]>
> To: =?UTF-8?q?M=C3=A5ns=20Rullg=C3=A5rd?= <[email protected]>,
> "David S. Miller" <[email protected]>
> Cc: [email protected],
> [email protected],
> Geert Uytterhoeven <[email protected]>
> Subject: [PATCH] ethernet: aurora: AURORA_NB8800 should depend on HAS_DMA
> Date: Mon, 7 Dec 2015 10:09:06 +0100
> Message-Id: <[email protected]>
> X-Mailer: git-send-email 1.9.1
>

I don't use git send-email because I do not have direct access to a
working SMTP server.

There is an option to use a sendmail binary instead of an SMTP server
address, so I guess I can try faking that.

Thanks

Michal
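
The option mentioned above is git send-email's --smtp-server, which
according to its documentation also accepts the full path of a sendmail-like
program that supports -i. One way to "fake" such a program is a stand-in
that stores each message instead of sending it, so it can be mailed later by
other means. The sketch below is hypothetical: the names spool_message and
main and the patch-spool directory are invented, and it assumes the
recipients arrive as command-line arguments after options such as -i and
-f <sender>.

```python
import sys
import time
from pathlib import Path
from typing import List


def spool_message(args: List[str], message: bytes, spool_dir: Path) -> Path:
    """Store one message and its envelope recipients instead of sending it."""
    recipients = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False      # this was the value of -f
        elif arg == "-f":
            skip_next = True       # envelope sender follows; not a recipient
        elif not arg.startswith("-"):
            recipients.append(arg)  # anything that is not an option
    spool_dir.mkdir(parents=True, exist_ok=True)
    path = spool_dir / f"{time.time_ns()}.eml"
    path.write_bytes(message)       # message exactly as received on stdin
    rcpt_path = path.with_suffix(".rcpt")
    rcpt_path.write_text("\n".join(recipients) + "\n", encoding="utf-8")
    return path


def main() -> None:
    # Entry point for use as the sendmail stand-in: arguments come from the
    # caller, the message comes on standard input.
    spool_dir = Path.home() / "patch-spool"
    spool_message(sys.argv[1:], sys.stdin.buffer.read(), spool_dir)
```

Saved as an executable script that ends with a call to main(), its path
could be given to --smtp-server. The stored .eml files still contain whatever
header encoding git send-email applied.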