Subject: Re: [PATCH] checkpatch: fix false positive for REPEATED_WORD warning
From: Joe Perches
To: Aditya Srivastava
Cc: linux-kernel@vger.kernel.org, lukas.bulwahn@gmail.com, linux-kernel-mentees@lists.linuxfoundation.org, dwaipayanray1@gmail.com
Date: Wed, 21 Oct 2020 08:18:57 -0700
In-Reply-To: <20201021150120.29920-1-yashsri421@gmail.com>

On Wed, 2020-10-21 at 20:31 +0530, Aditya Srivastava wrote:
> Presence of hexadecimal address or symbol results in false warning
> message by checkpatch.pl.
>
> For example, running checkpatch on commit b8ad540dd4e4 ("mptcp: fix
> memory leak in mptcp_subflow_create_socket()") results in warning:
>
> WARNING:REPEATED_WORD: Possible repeated word: 'ff'
> 00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff ........./0.....

Right.

> To avoid all such reports, add an additional regex check for a repeating
> pattern of 4 or more 2-lettered words separated by space in a line.
> A quick evaluation on v5.6..v5.8 showed that this fix reduces
> REPEATED_WORD warnings from 2797 to 1043.

Are many of the other 1043 false positives?
Any pattern to them?

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3050,8 +3050,10 @@ sub process {
>  			}
>  		}
>
> -# check for repeated words separated by a single space
> -		if ($rawline =~ /^\+/ || $in_commit_log) {
> +# check for repeated words separated by a single space and
> +# avoid repeating hex occurrences like 'ff ff fe 09 ...'
> +		if (($rawline =~ /^\+/ || $in_commit_log) &&
> +		    $rawline !~ /(\b[0-9a-f]{2}( )+){4,}/) {

This might be better as \b$Hex to avoid FF FF and FFFFFFFF FFFFFFFF

I might add that check to the line below where
the repeated words are checked against long
---
 scripts/checkpatch.pl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index fab38b493cef..929866999f81 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3062,6 +3062,7 @@ sub process {
 				next if ($first ne $second);
 				next if ($first eq 'long');
+				next if ($first =~ /^$Hex$/);
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
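
The difference between the two approaches discussed above (rejecting a whole line that looks like a hex dump versus skipping an individual repeated word that looks like hex) can be tried out in a small sketch. It is written in Python rather than Perl so it can be run standalone, and it is only an approximation of the checkpatch logic: `WORD_PAIR`, `HEX_WORD`, and `repeated_words` are hypothetical names, the word pattern is simplified, and `HEX_WORD` is an assumed "bare hex token" pattern, not checkpatch's actual `$Hex` definition. Only `HEX_DUMP_LINE` is copied from the quoted patch.

```python
import re

# Regex taken from the quoted patch: four or more two-character lowercase
# hex tokens, each followed by one or more spaces.
HEX_DUMP_LINE = re.compile(r"(\b[0-9a-f]{2}( )+){4,}")

# Assumption for illustration: a word made only of hex digits, any case.
# This is NOT checkpatch's $Hex variable.
HEX_WORD = re.compile(r"^[0-9a-f]+$", re.IGNORECASE)

# Simplified stand-in for checkpatch's word matching: a word of two or more
# letters, a single space, and a lookahead capturing the following word.
WORD_PAIR = re.compile(r"\b([A-Za-z]{2,}) (?=([A-Za-z]{2,})\b)")


def repeated_words(line, skip_hex_lines=False, skip_hex_words=False):
    """Return the repeated words found in line.

    skip_hex_lines mimics the line-level check from the quoted patch.
    skip_hex_words mimics a per-word check placed next to the 'long' test.
    """
    if skip_hex_lines and HEX_DUMP_LINE.search(line):
        return []
    hits = []
    for match in WORD_PAIR.finditer(line):
        first, second = match.group(1), match.group(2)
        if first != second:
            continue
        if first == "long":
            continue
        if skip_hex_words and HEX_WORD.match(first):
            continue
        hits.append(first)
    return hits


dump = "00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff ........./0....."
print(repeated_words(dump))                       # no filtering
print(repeated_words(dump, skip_hex_lines=True))  # line-level check
print(repeated_words(dump, skip_hex_words=True))  # per-word check
print(repeated_words("FFFFFFFF FFFFFFFF", skip_hex_lines=True))
print(repeated_words("FFFFFFFF FFFFFFFF", skip_hex_words=True))
```

In this sketch the line-level check misses uppercase and long hex words such as FFFFFFFF FFFFFFFF, while the per-word check catches them. The per-word pattern assumed here has its own cost: it also suppresses genuine repeats of words spelled only with hex letters, such as "add add".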