From: Alvin Šipraga
Date: Tue, 19 Dec 2023 02:25:15 +0100
Subject: [PATCH v3 2/2] get_maintainer: remove stray punctuation when cleaning file emails
Message-Id: <20231219-get-maintainers-utf8-v3-2-f85a39e2265a@bang-olufsen.dk>
References: <20231219-get-maintainers-utf8-v3-0-f85a39e2265a@bang-olufsen.dk>
In-Reply-To: <20231219-get-maintainers-utf8-v3-0-f85a39e2265a@bang-olufsen.dk>
To: Joe Perches, Linus Torvalds, Andrew Morton
Cc: Duje Mihanović, Konstantin Ryabitsev, linux-kernel@vger.kernel.org, Alvin Šipraga

From: Alvin Šipraga

When parsing emails from .yaml files in particular, stray punctuation
such as a leading '-' can end up in the name. For example, consider a
common YAML section such as:

  maintainers:
    - devicetree@vger.kernel.org

This would previously be processed by get_maintainer.pl as:

  - <devicetree@vger.kernel.org>

Make the logic in clean_file_emails more robust by deleting any
sub-names which consist of common single punctuation marks before
proceeding to the best-effort name extraction logic.
The output is then correct:

  devicetree@vger.kernel.org

Some additional comments are added to the function to make things
clearer to future readers.

Link: https://lore.kernel.org/all/0173e76a36b3a9b4e7f324dd3a36fd4a9757f302.camel@perches.com/
Suggested-by: Joe Perches
Signed-off-by: Alvin Šipraga
---
 scripts/get_maintainer.pl | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index dac38c6e3b1c..ee1aed7e090c 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -2462,11 +2462,17 @@ sub clean_file_emails {
     foreach my $email (@file_emails) {
 	$email =~ s/[\(\<\{]{0,1}([A-Za-z0-9_\.\+-]+\@[A-Za-z0-9\.-]+)[\)\>\}]{0,1}/\<$1\>/g;
 	my ($name, $address) = parse_email($email);
-	if ($name eq '"[,\.]"') {
-	    $name = "";
-	}
 
+	# Strip quotes for easier processing, format_email will add them back
+	$name =~ s/^"(.*)"$/$1/;
+
+	# Split into name-like parts and remove stray punctuation particles
 	my @nw = split(/[^\p{L}\'\,\.\+-]/, $name);
+	@nw = grep(!/^[\'\,\.\+-]$/, @nw);
+
+	# Make a best effort to extract the name, and only the name, by taking
+	# only the last two names, or in the case of obvious initials, the last
+	# three names.
 	if (@nw > 2) {
 	    my $first = $nw[@nw - 3];
 	    my $middle = $nw[@nw - 2];
@@ -2480,18 +2486,16 @@ sub clean_file_emails {
 	    } else {
 		$name = "$middle $last";
 	    }
+	} else {
+	    $name = "@nw";
 	}
 
 	if (substr($name, -1) =~ /[,\.]/) {
 	    $name = substr($name, 0, length($name) - 1);
-	} elsif (substr($name, -2) =~ /[,\.]"/) {
-	    $name = substr($name, 0, length($name) - 2) . '"';
 	}
 
 	if (substr($name, 0, 1) =~ /[,\.]/) {
 	    $name = substr($name, 1, length($name) - 1);
-	} elsif (substr($name, 0, 2) =~ /"[,\.]/) {
-	    $name = '"' . substr($name, 2, length($name) - 2);
 	}
 
 	my $fmt_email = format_email($name, $address, $email_usename);

-- 
2.43.0
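
To try the name-cleaning step in isolation, here is a minimal standalone
sketch of the logic described in the commit message. It is not the code
from the patch: the helper name clean_name_sketch and the sample inputs
are hypothetical, and the initials handling is simplified so that only
the last two name parts are ever kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of get_maintainer.pl. It applies only the
# name-cleaning steps to a name string that has already been separated
# from its address.
sub clean_name_sketch {
    my ($name) = @_;

    # Strip surrounding double quotes for easier processing.
    $name =~ s/^"(.*)"$/$1/;

    # Split into name-like parts, then drop any part that is nothing but
    # a single stray punctuation mark (for example a YAML list '-').
    my @nw = split(/[^\p{L}\'\,\.\+-]/, $name);
    @nw = grep(!/^[\'\,\.\+-]$/, @nw);

    if (@nw > 2) {
        # Simplification (assumption): always keep the last two parts.
        # The real function also keeps a third part for obvious initials.
        $name = "$nw[-2] $nw[-1]";
    } else {
        $name = "@nw";
    }

    # Trim one trailing and one leading ',' or '.'.
    $name =~ s/[,\.]\z//;
    $name =~ s/\A[,\.]//;

    return $name;
}

# Print the cleaned form of a few made-up inputs.
for my $raw ("-", "- Jane Doe", '"Jane Doe,"') {
    printf "%-14s => '%s'\n", "'$raw'", clean_name_sketch($raw);
}
```

A lone '-' is reduced to an empty name, so in this sketch an address
taken from a YAML list entry would no longer carry the list marker as
its name.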