Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp1642497imm; Tue, 2 Oct 2018 11:28:27 -0700 (PDT) X-Google-Smtp-Source: ACcGV60xtmxh/DEGFhhtqllivDJoPU7uDFvRcDi7PWBMZNm7k5JZwVfJdcH01hv7Wbe2yDqEFShr X-Received: by 2002:a63:ed55:: with SMTP id m21-v6mr15503205pgk.147.1538504907742; Tue, 02 Oct 2018 11:28:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1538504907; cv=none; d=google.com; s=arc-20160816; b=qUfP8mMcDqQxvsY8V7U//TpTH6fJJlFv7FnOdxkDRVSS32rP7aZVVVGe/y80XgwzzW 2ahytt8mXIv34Qt8Rr4g/QywHXnJporz+XF7g/xAEfztkaN3iqIIOFRgX7SyC95k96q6 go1GuOsC4UJmydVQ2fRAaQtCq6LlnwtqFraIWLsOqDwnjlC1xB6VBHEukrCy+dCBlu5i Pfjgp6xUgNqXJ5+NQsWJJUbHUa8gDC6QhC+xUaMPaux0AtgwOW+0hcy5F7zCGUwCJHBT D+3xT2u1oSTXwXueP51RPyCaACU98Jq2duA7W99n8PoI4SqXOy75ACS3qOfr7opP3gMe ywxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject; bh=cW5Zw+7oIdky+baA/EV/V2OEblZiSAFG/r4f8vqiLSE=; b=g3qFvgbh5MFfplHsSKPzZTDdYfMJnSULameI00ApMv6RNEHRb3x15+hoc8cjsL9/70 Fd2KcSQwmTjw3fhb6amqzcFlrEK7gKXPD7/5gQlVlwkH3RABQ3PMlgRcOdj4QdA7WGeB W+HEyvVCf1YoUc1uLUei3ckWR/OI8rNuMS46gpp374QQBp2wXXD7rkRyhChg8eraQQCf e5LlCYgJM8RV7mDgoFWZn2W7BMSVT1j8v76v+zXZVZq9GKyISxwAModxech019frWsqz wHs6H21lt65OdYCti8S07GcA+Ukbk7KSiH5Sj181uLzNKxci+Vfjq7BQTMkrEmN1t92F 4Fbw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p81-v6si14125878pfd.76.2018.10.02.11.28.12; Tue, 02 Oct 2018 11:28:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726809AbeJCBMk convert rfc822-to-8bit (ORCPT + 99 others); Tue, 2 Oct 2018 21:12:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58136 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725951AbeJCBMj (ORCPT ); Tue, 2 Oct 2018 21:12:39 -0400 Received: from smtp.corp.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.27]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4A77030020A1; Tue, 2 Oct 2018 18:27:58 +0000 (UTC) Received: from llong.remote.csb (dhcp-17-55.bos.redhat.com [10.18.17.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 56A3F9D814; Tue, 2 Oct 2018 18:27:57 +0000 (UTC) Subject: Re: [RFC, PATCH] ipc/util.c: use idr_alloc_cyclic() for ipc allocations To: Manfred Spraul , LKML , Davidlohr Bueso Cc: 1vier1@web.de, Andrew Morton , Kees Cook , "Luis R . Rodriguez" , Matthew Wilcox References: <20181002161942.1037-1-manfred@colorfullife.com> From: Waiman Long Organization: Red Hat Message-ID: <1b8bcd5c-1264-08da-0b91-ebb4c02f6035@redhat.com> Date: Tue, 2 Oct 2018 14:27:56 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20181002161942.1037-1-manfred@colorfullife.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Content-Language: en-US X-Scanned-By: MIMEDefang 2.84 on 10.5.11.27 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Tue, 02 Oct 2018 18:27:58 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/02/2018 12:19 PM, Manfred Spraul wrote: > A bit related to the patch series that increases IPC_MNI: > > (User space) id reuse create the risk of data corruption: > > Process A: calls ipc function > Process A: sleeps just at the beginning of the syscall > Process B: Frees the ipc object (i.e.: calls ...ctl(IPC_RMID) > Process B: Creates a new ipc object (i.e.: calls ...get()) > > Process A: is woken up, and accesses the new object > > To reduce the probability that the new and the old object > have the same id, the current implementation adds a > sequence number to the index of the object in the idr tree. > > To further reduce the probability for a reuse, switch from > idr_alloc to idr_alloc_cyclic. > > The patch cycles over at least RADIX_TREE_MAP_SIZE, i.e. > if there is only a small number of objects, the accesses > continue to be direct. > > As an option, this could be made dependent on the extended > mode: In extended mode, cycle over e.g. at least 16k ids. > > Signed-off-by: Manfred Spraul > --- > > Open questions: > - Is there a significant performance advantage, especially > there are many ipc ids? > - Over how many ids should the code cycle always? > - Further review remarks? > > ipc/util.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > diff --git a/ipc/util.c b/ipc/util.c > index 0af05752969f..6f83841f6761 100644 > --- a/ipc/util.c > +++ b/ipc/util.c > @@ -216,10 +216,30 @@ static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new) > */ > > if (next_id < 0) { /* !CHECKPOINT_RESTORE or next_id is unset */ > + int idr_max; > + > new->seq = ids->seq++; > if (ids->seq > IPCID_SEQ_MAX) > ids->seq = 0; > - idx = idr_alloc(&ids->ipcs_idr, new, 0, 0, GFP_NOWAIT); > + > + /* > + * If a user space visible id is reused, then this creates a > + * risk for data corruption. To reduce the probability that > + * a number is reduced, two approaches are used: reduced -> reused? > + * 1) the idr index is allocated cyclically. > + * 2) the use space id is build by concatenating the > + * internal idr index with a sequence number > + * To avoid that both numbers have the same cycle time, try > + * to set the size for the cyclic alloc to an odd number. > + */ > + idr_max = ids->in_use*2+1; > + if (idr_max < RADIX_TREE_MAP_SIZE-1) > + idr_max = RADIX_TREE_MAP_SIZE-1; > + if (idr_max > IPCMNI) > + idr_max = IPCMNI; > + > + idx = idr_alloc_cyclic(&ids->ipcs_idr, new, 0, idr_max, > + GFP_NOWAIT); > } else { > new->seq = ipcid_to_seqx(next_id); > idx = idr_alloc(&ids->ipcs_idr, new, ipcid_to_idx(next_id), Each of IPC components have their own sysctl parameters limiting the max number of objects that can be allocated. With cyclic allocation, you will have to make sure that idr_max is not larger than the corresponding IPC sysctl parameters. That may require moving the limits to the corresponding ipc_ids structure so that it can be used in ipc_idr_alloc(). What is the point of comparing idr_max against RADIX_TREE_MAP_SIZE-1? Is it for performance reason. Cheers, Longman