Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp679570yba; Mon, 1 Apr 2019 14:35:03 -0700 (PDT) X-Google-Smtp-Source: APXvYqyDAFy+AvYtv9+0pXRMPDYwPDFF+QeKXeXuqM7uvSR54INPTF0SlxKFnNjC3LYpLsPSrehM X-Received: by 2002:a62:5797:: with SMTP id i23mr64509926pfj.12.1554154503625; Mon, 01 Apr 2019 14:35:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1554154503; cv=none; d=google.com; s=arc-20160816; b=thWht2caoObbJaLuGhb3lEk1O1yojLyXfxLqimUnzJR//hth/B1myyxU4ifBj6kQ8T qC5a8CLoxCZZNTpDhRv1omGcl+eYBSc1jZfWRDxH3rmz2NuA8UJy4XONq7w8z4iVklaW HTUAuSKYNKYyLcVCeQjkFMeHszktbNa+4ZQtbF6sBaMmPRFeU00Z/jtsGsj6IT99IdRo pew4PVEHogRrjxdl7tlW2LIEgmhccrq9DWzksg2DY2z4+mikBaWebsquIJlZPZ2VdDWS mCw8IpVaBtD6t+4aA4zRWIWMRqo9qWTCmpzZK+AgdZwXoixHsOA1CuK58BAKFptTaCbR PP1w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=diW5ZvjUyJnds2xwfdTgna6hjQkNGR70KYto51gcCBs=; b=Xqepor7lQWb/k/BANZyk6u1S3ybkBs2JyxkcL7UNd7FpPxK3l1J8LC7f5Sntf2Tn9r 3XqDmE4GFXuzaQFAaskWjmdm4F9ldId97bz+7P7x89SqnHa3zW6hw+hxf+Mysr6zKKTO kU4w7UiixsgVBU87ik7yeOSZj/suKfBKN368k08++qaqsYFhIojdIweVpVVhi3Zbb5K6 SZuK9SgID4WoL6A0eRgz+1R4ZltR9pnFUA1gw4nH/CNzVCfyZR+Qad+gyfEdSHIw0LQT 0v06KVH10zEzb5IOSBVmU81OaxYujzYMgkW/g8hm+w96zWXH2F6eOIP9jQNJW7w44wUW E75A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x4si9404700plr.406.2019.04.01.14.34.48; Mon, 01 Apr 2019 14:35:03 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726882AbfDAVeK (ORCPT + 99 others); Mon, 1 Apr 2019 17:34:10 -0400 Received: from www62.your-server.de ([213.133.104.62]:42590 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726030AbfDAVeJ (ORCPT ); Mon, 1 Apr 2019 17:34:09 -0400 Received: from [78.46.172.2] (helo=sslproxy05.your-server.de) by www62.your-server.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.89_1) (envelope-from ) id 1hB4Z7-0005Ex-2e; Mon, 01 Apr 2019 23:34:01 +0200 Received: from [178.197.248.24] (helo=linux.home) by sslproxy05.your-server.de with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.89) (envelope-from ) id 1hB4Z6-0004Wj-Sw; Mon, 01 Apr 2019 23:34:00 +0200 Subject: Re: [PATCH] bpf: do not start from first bucket if elem is not found on a htab To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , "David S . Miller" Cc: =?UTF-8?Q?Maciej_=c5=bbenczykowski?= , linux-kernel@vger.kernel.org, netdev@vger.kernel.org References: <20190331064129.31702-1-brianvv.kernel@gmail.com> From: Daniel Borkmann Message-ID: <74e3024f-6fde-a04d-4000-89c6e61acc26@iogearbox.net> Date: Mon, 1 Apr 2019 23:34:00 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <20190331064129.31702-1-brianvv.kernel@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.100.3/25406/Mon Apr 1 09:55:53 2019) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/31/2019 08:41 AM, Brian Vazquez wrote: > From: Brian Vazquez > > When you want to traverse an entire map using BPF_MAP_GET_NEXT_KEY and > key provided is not present due to a deletion you will start iterating the > map from the beginning without noticing it. This patch changes the > starting bucket in those situations to the bucket where key was > suppossed to be. > > Note that you can still get stuck in the same bucket but it is less > likely than getting stuck in a loop restarting from the beginning of > the entire map. > > Signed-off-by: Brian Vazquez > Signed-off-by: Maciej Żenczykowski This breaks BPF selftest suite unfortunately: # ./test_maps test_maps: test_maps.c:114: test_hashmap: Assertion `bpf_map_get_next_key(fd, &key, &next_key) == 0 && (next_key == first_key)' failed. Aborted Some more background, situation is a bit tricky: pre 8fe45924387b ("bpf: map_get_next_key to return first key on NULL") there was no reliable way of getting to the start of a hash table, meaning it needed an 'invalid' key to return the first element for starting traversal which commit fixed by being able to pass in NULL. So some applications might still just rely on e.g. key of zero bytes (if guaranteed to not be used otherwise) to do just that. With this patch, we'd start out from anywhere in the hash table. > --- > kernel/bpf/hashtab.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c > index fed15cf94dca6..eea046d269f51 100644 > --- a/kernel/bpf/hashtab.c > +++ b/kernel/bpf/hashtab.c > @@ -611,11 +611,14 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) > > hash = htab_map_hash(key, key_size, htab->hashrnd); > > - head = select_bucket(htab, hash); > + /* keep track of current bucket */ > + i = hash & (htab->n_buckets - 1); > + head = select_bucket(htab, i); > > /* lookup the key */ > l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets); > > + /* this will start looking from bucket i */ > if (!l) > goto find_first_elem; > > @@ -630,7 +633,6 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) > } > > /* no more elements in this hash list, go to the next bucket */ > - i = hash & (htab->n_buckets - 1); > i++; > > find_first_elem: >