Received: by 2002:a25:c593:0:0:0:0:0 with SMTP id v141csp495684ybe; Thu, 5 Sep 2019 00:58:01 -0700 (PDT) X-Google-Smtp-Source: APXvYqxNbxypOdcD/uL+pK5flaFLQCRMOrDIbYEZLEUfrN7ukR710K9qo4XzqWPQlRGLbbhubCFX X-Received: by 2002:a17:902:441:: with SMTP id 59mr2046980ple.62.1567670281771; Thu, 05 Sep 2019 00:58:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1567670281; cv=none; d=google.com; s=arc-20160816; b=C3HBaZ7uoSjppoerE0gdA8jCg73ozt6wfXZOK8UHPGmkzzfpnnjMxoj4EZkBppPYFw +rDma0jS/nAmnqrkr9YKvmMoU1MrvIXKZuqqMzK2x4gZN/WdcboYoXVIfZ9TzRai5P0b slXPXzLZGBpVxrx4oNCumcKqi2bDtRcXsVnNnd9h6aJ8o9VIMlqaLrUtz+zFWRTAjIcA DadtqLwn9dYF04+udyTP9jl3urP4ZVqRVDwC11Dhlj2kr1Wd5TqZ6jzieMjXN90ytFkS Xtvq8dzLQWI+T0XZWOF0a9M+/11+7ZZ6I0RmqD4C1Bx72ieSj9H/5iYK5rHccaU+T2i8 rdxA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=k2T9+4yCg/nj22t+epOd3cFsZ2Dp5yvNb3cPsYW2e68=; b=0Q5ROkbEQBBurYZsDh48FW9rBSU7sJFUJd4oObBkeHhtwxBnzM6wL/rDqqjFL6zQI3 8wsTxOQc/dZMULiAZWVqD85+kXljlMw6uDu/tQDuHeegyIL/MVQ6OHswdacF/fo+xGPy gI1lKJXR0NTL9eJKJJxHsPHCckBsjA1d4FNHYMfFLO4k+Cu9YoatZ7ptj0REqMv87q6Q wKErYRdAl03N8S1bo4Wvbho1QZPT0GJypbysMdsa5Br9e+euRvUd5Zj9Xt8IeOKWChgs V/hsAySltjKkRSkpNxGn7dqQIBaGo5O8a2jWnjC562Ocf5M4RhsCstycVrnDv2K49FkN Eq2g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n14si1405122pjj.56.2019.09.05.00.57.45; Thu, 05 Sep 2019 00:58:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731895AbfIEHgg (ORCPT + 99 others); Thu, 5 Sep 2019 03:36:36 -0400 Received: from mail-lf1-f66.google.com ([209.85.167.66]:35578 "EHLO mail-lf1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730809AbfIEHgg (ORCPT ); Thu, 5 Sep 2019 03:36:36 -0400 Received: by mail-lf1-f66.google.com with SMTP id w6so1149761lfl.2 for ; Thu, 05 Sep 2019 00:36:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=k2T9+4yCg/nj22t+epOd3cFsZ2Dp5yvNb3cPsYW2e68=; b=FFOGK8egGRRKtuB7jDoIbVRl2iqT18v+4veXXN57toPJyYrPiydNEhig8Ugmlv/ALg Qdwh7ZZGkWtOBb3kKAZg5GS9XSjNhxZrApKn4XLeKla/aVPKgHlVpJAJEzXd0t9EoSl6 4cnR6ag94KKihafYGP7jrlisFKxEkmOJE4k+Ph++QxqiZ32U5teTYgU/Y+UvqGRpPkmP NY7wUhkIdiCeFyajFh87VCxc70txo8YCGDYjbfL82iYw6uAVgf3unUPRL8TCZ8g244fg f1OmQa9nVv9eFuNqoYTPP00zfz/loA5TQH7QGdyJ++NA1omQlFb4P4KOdk7ZdWCbYaEj IE5w== X-Gm-Message-State: APjAAAV2P6qzyvm6qCsOn3/hCc0yWA3dqbLd1iut13NrR85p70nc9+pU nndrklzzYJtEbZ8mpN31QBc= X-Received: by 2002:a19:c191:: with SMTP id r139mr1343662lff.23.1567668994388; Thu, 05 Sep 2019 00:36:34 -0700 (PDT) Received: from [10.68.32.192] (broadband-188-32-48-208.ip.moscow.rt.ru. [188.32.48.208]) by smtp.gmail.com with ESMTPSA id y22sm216706ljj.97.2019.09.05.00.36.33 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 05 Sep 2019 00:36:33 -0700 (PDT) Subject: Re: [RFC PATCH] coccinelle: check for integer overflow in binary search To: Julia Lawall Cc: Coccinelle , Gilles Muller , Nicolas Palix , Michal Marek , linux-kernel@vger.kernel.org, tacingiht@gmail.com References: <20190904221223.5281-1-efremov@linux.com> From: Denis Efremov Message-ID: <8916c9d9-bdf5-51c2-b5cb-49898e14a00c@linux.com> Date: Thu, 5 Sep 2019 10:36:32 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05.09.2019 09:20, Julia Lawall wrote: > > > On Thu, 5 Sep 2019, Denis Efremov wrote: > >> This is an RFC. I will resend the patch after feedback. Currently >> I'm preparing big patchset with bsearch warnings fixed. The rule will >> be a part of this patchset if it will be considered good enough for >> checking. >> >> There is a known integer overflow error [1] in the binary search >> algorithm. Google faced it in 2006 [2]. This rule checks midpoint >> calculation in binary search for overflow, i.e., (l + h) / 2. >> Not every match is an actual error since the array could be small >> enough. However, a custom implementation of binary search is >> error-prone and it's better to use the library function (lib/bsearch.c) >> or to apply defensive programming for midpoint calculation. >> >> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues >> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html >> >> Signed-off-by: Denis Efremov >> --- >> scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++ >> 1 file changed, 80 insertions(+) >> create mode 100644 scripts/coccinelle/misc/bsearch.cocci >> >> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci >> new file mode 100644 >> index 000000000000..a99d9a8d3ee5 >> --- /dev/null >> +++ b/scripts/coccinelle/misc/bsearch.cocci >> @@ -0,0 +1,80 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/// Check midpoint calculation in binary search algorithm for integer overflow >> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error >> +/// since the array can be small enough. However, a custom implementation of >> +/// binary search is error-prone and it's better to use the library function >> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation. >> +/// >> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues >> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html >> +// >> +// Confidence: Medium >> +// Copyright: (C) 2019 Denis Efremov, ISPRAS >> +// Comments: >> +// Options: --no-includes --include-headers >> + >> +virtual report >> +virtual org >> + >> +@r depends on org || report@ >> +identifier l, h, m; >> +statement S; >> +position p; >> +// to match 1 in << >> +// to match 2 in / >> +// Can't use exact values, e.g. 2, because it fails to match 2L. >> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc? >> +constant c; > > As far as I can see, you aren't checking for 2 at all at the moment? Yes, there are no false positives even without pinning constants to 1, 2. However, it's better to express this in the rule. > You > should be able to say constant c = {2, 2L, etc};. Actually, we do > consider several variants of 0, so it could be reasonable to allow eg 2 to > match other variants as well. It looks like integer literals aren't fully supported. When I'm trying to write 'constant c = {2L}; ' it fails with int_of_string error. Denis