Received: by 2002:a25:c593:0:0:0:0:0 with SMTP id v141csp562295ybe; Thu, 5 Sep 2019 02:16:14 -0700 (PDT) X-Google-Smtp-Source: APXvYqyyQ0xPPvfs54DoC12nyoNRLMIAJbwVlloHGmgYW0TektYTSmFMB195Q6RhhHQIslydEa5v X-Received: by 2002:a65:4509:: with SMTP id n9mr2207182pgq.133.1567674973971; Thu, 05 Sep 2019 02:16:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1567674973; cv=none; d=google.com; s=arc-20160816; b=U0qbNsGXKGHCYMTYHmL3wZtNakZQnpNbCqp0RuLUOOWUF2WT5jICjF+ZiJlHw76Pg1 qFWDikVM6GVgzm8RTmZAsYnskwP+rh/Caib7lCDTH9N7koa0jaxXSqVCWk9Bx1f/AX6R qTvvUdWOs9CITt1/0t0AC9sYURzoYd/QCxABlQLvGh2NPmYvqtc8onjg4DEV5Xsmu3dP XamUFnl/CAuSCZ3Qfr/tY5d2jWhtmUsDsaz2BAgnAT/YmoKMEQsnyoPfAm7diROEavnn nPL47yaRPDavM6xlXG+pie5dJE7Kca+nKK5C1GZdY0k6yBA9+gp5wJ0EB0oXz5oFxYBI 1qgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date; bh=CpkvlrsCu3DMTh6H56U0w1s1r/fZo1Rwdu3SLOX+7uQ=; b=VM+tfsiX1pTaL3M5tnWTmbZr82BtO0K8hc8K7TOjwMxgdyFbr8IkThYQ9ruIcfNpvg qOAd+t1LoCNlWCDf7CiUtpUuIjGjrp9FChauo0dw5TPy69nDqf2Kk8Kbv0t0oKFnWP++ CW3RrZzsIzjYH0TB81gnTOpu0a9qp++ufUMDOs6bl7Ab87hmJ/E+WMZwXxd2gMOyEkO4 P5TAYh51hs/cXvPkqvj40cnvKGhxMSFiw89xWdxS6HNDzIeVmjCbffsLHeMrBNRgfrGp O8lSRmU+ThleFxFd5JlNlFDh85lMHE8yqBpBc8gTfcBoAEwMNXJP5oJyarpg/uZ4Sfxi 8erQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p14si1229380pgh.566.2019.09.05.02.15.55; Thu, 05 Sep 2019 02:16:13 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731675AbfIEHvc (ORCPT + 99 others); Thu, 5 Sep 2019 03:51:32 -0400 Received: from mail2-relais-roc.national.inria.fr ([192.134.164.83]:1268 "EHLO mail2-relais-roc.national.inria.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731162AbfIEHvb (ORCPT ); Thu, 5 Sep 2019 03:51:31 -0400 X-IronPort-AV: E=Sophos;i="5.64,469,1559512800"; d="scan'208";a="400347690" Received: from unknown (HELO hadrien) ([132.227.124.24]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Sep 2019 09:51:28 +0200 Date: Thu, 5 Sep 2019 09:51:28 +0200 (CEST) From: Julia Lawall X-X-Sender: julia@hadrien To: Denis Efremov cc: Coccinelle , Gilles Muller , Nicolas Palix , Michal Marek , linux-kernel@vger.kernel.org, tacingiht@gmail.com Subject: Re: [RFC PATCH] coccinelle: check for integer overflow in binary search In-Reply-To: <8916c9d9-bdf5-51c2-b5cb-49898e14a00c@linux.com> Message-ID: References: <20190904221223.5281-1-efremov@linux.com> <8916c9d9-bdf5-51c2-b5cb-49898e14a00c@linux.com> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 5 Sep 2019, Denis Efremov wrote: > > > On 05.09.2019 09:20, Julia Lawall wrote: > > > > > > On Thu, 5 Sep 2019, Denis Efremov wrote: > > > >> This is an RFC. I will resend the patch after feedback. Currently > >> I'm preparing big patchset with bsearch warnings fixed. The rule will > >> be a part of this patchset if it will be considered good enough for > >> checking. > >> > >> There is a known integer overflow error [1] in the binary search > >> algorithm. Google faced it in 2006 [2]. This rule checks midpoint > >> calculation in binary search for overflow, i.e., (l + h) / 2. > >> Not every match is an actual error since the array could be small > >> enough. However, a custom implementation of binary search is > >> error-prone and it's better to use the library function (lib/bsearch.c) > >> or to apply defensive programming for midpoint calculation. > >> > >> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues > >> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html > >> > >> Signed-off-by: Denis Efremov > >> --- > >> scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++ > >> 1 file changed, 80 insertions(+) > >> create mode 100644 scripts/coccinelle/misc/bsearch.cocci > >> > >> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci > >> new file mode 100644 > >> index 000000000000..a99d9a8d3ee5 > >> --- /dev/null > >> +++ b/scripts/coccinelle/misc/bsearch.cocci > >> @@ -0,0 +1,80 @@ > >> +// SPDX-License-Identifier: GPL-2.0-only > >> +/// Check midpoint calculation in binary search algorithm for integer overflow > >> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error > >> +/// since the array can be small enough. However, a custom implementation of > >> +/// binary search is error-prone and it's better to use the library function > >> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation. > >> +/// > >> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues > >> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html > >> +// > >> +// Confidence: Medium > >> +// Copyright: (C) 2019 Denis Efremov, ISPRAS > >> +// Comments: > >> +// Options: --no-includes --include-headers > >> + > >> +virtual report > >> +virtual org > >> + > >> +@r depends on org || report@ > >> +identifier l, h, m; > >> +statement S; > >> +position p; > >> +// to match 1 in << > >> +// to match 2 in / > >> +// Can't use exact values, e.g. 2, because it fails to match 2L. > >> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc? > >> +constant c; > > > > As far as I can see, you aren't checking for 2 at all at the moment? > > Yes, there are no false positives even without pinning constants to 1, 2. > However, it's better to express this in the rule. > > > You > > should be able to say constant c = {2, 2L, etc};. Actually, we do > > consider several variants of 0, so it could be reasonable to allow eg 2 to > > match other variants as well. > > It looks like integer literals aren't fully supported. When I'm trying to write > 'constant c = {2L}; ' it fails with int_of_string error. Oops. I'll fix it, but since people may be using older versions of Coccinelle, perhaps it is not worth taking this strategy in this case. Could you make a disjunction, or check for the proper value in the python code? julia