Received: by 10.213.65.68 with SMTP id h4csp1740734imn; Mon, 19 Mar 2018 11:59:46 -0700 (PDT) X-Google-Smtp-Source: AG47ELtPhZFRsdMdfEBxzZldsPvmxCRfEXwCVVvqt365L7XkCSmwWGM2SGAnC+h/2OqzPbW+tAwJ X-Received: by 10.98.254.5 with SMTP id z5mr10772789pfh.53.1521485986076; Mon, 19 Mar 2018 11:59:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521485986; cv=none; d=google.com; s=arc-20160816; b=NzE/IALSWsQRfkZiR2M8vbGQ9h0cqugb2xcfyVRS39JnY0DILCqhJUl2p4e0YF4f1I ngddi5oav+qKnRlq1vF5dtBsjXMwd8pPQbaDLXTtaNf7RngHxpJEa0ewHXAlxrFcNeyI f89z9S3GVUuAKzF/zrvnT12EzsG2QdhrykkQDNGV+1YIRsl6H+rHVeOqSLFY/o/3HT3/ 8VEGOWX6gvwcM/K+8gB1Q+RBPAq4ViTmMyiLWp0dgfDib2p29rysND9i19JMaU49KpaS 3ypiKvRZ3U7tBc/1lIjaz9H9y8zCEatKVy/ZClbbgw4+mwkZbjxJ2po7/4VNWrw7c5C2 Dl0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:newsgroups:cc:to:subject :arc-authentication-results; bh=zXVWEQyvd1wAqZnpePjPZVwv4tZ4cZw/XnN9KPI2mOA=; b=O8dqOXHFWQrSe3T0FH56QZqRvpW/erhLWhFjYEOsdLuady4uELAnpz578rPtG6AD5n Fu8wlia53K5MNNY+nq7U5zrb4dGJ29rx1Hl+19srbZSiwuHteCXn5H6e2BKVojldzV2X GyaF/gVJXnxx9CrXORLgFoISznhpHHmBHpNEoz7FUcc37i47I3Bpk9QsURP3CdKH06Yr EqY+7VfKkVNGAvE6Ck4Lgw9oP43phxK8gpORW3j4fsOE1YT7DuQ3rg2oGlpT4aQLDiKU Igeq/PwznCsXHopiJxffkwy4Tlvxy22KpKEnh5M0Fj/aJiZO+tqZGVDwSIs2qJ8ON1re Qhhg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=synopsys.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i11si344849pgq.332.2018.03.19.11.59.31; Mon, 19 Mar 2018 11:59:46 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=synopsys.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S970654AbeCSS6A (ORCPT + 99 others); Mon, 19 Mar 2018 14:58:00 -0400 Received: from us01smtprelay-2.synopsys.com ([198.182.47.9]:47051 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935537AbeCSS3e (ORCPT ); Mon, 19 Mar 2018 14:29:34 -0400 Received: from mailhost.synopsys.com (mailhost1.synopsys.com [10.12.238.239]) by smtprelay.synopsys.com (Postfix) with ESMTP id CE95B24E073D; Mon, 19 Mar 2018 11:29:33 -0700 (PDT) Received: from mailhost.synopsys.com (localhost [127.0.0.1]) by mailhost.synopsys.com (Postfix) with ESMTP id B72D25465; Mon, 19 Mar 2018 11:29:33 -0700 (PDT) Received: from us01wehtc1.internal.synopsys.com (us01wehtc1-vip.internal.synopsys.com [10.12.239.236]) by mailhost.synopsys.com (Postfix) with ESMTP id A4BF05463; Mon, 19 Mar 2018 11:29:33 -0700 (PDT) Received: from IN01WEHTCB.internal.synopsys.com (10.144.199.106) by us01wehtc1.internal.synopsys.com (10.12.239.235) with Microsoft SMTP Server (TLS) id 14.3.361.1; Mon, 19 Mar 2018 11:29:32 -0700 Received: from IN01WEHTCA.internal.synopsys.com (10.144.199.103) by IN01WEHTCB.internal.synopsys.com (10.144.199.105) with Microsoft SMTP Server (TLS) id 14.3.361.1; Mon, 19 Mar 2018 23:59:30 +0530 Received: from [10.10.161.84] (10.10.161.84) by IN01WEHTCA.internal.synopsys.com (10.144.199.243) with Microsoft SMTP Server (TLS) id 14.3.361.1; Mon, 19 Mar 2018 23:59:30 +0530 Subject: Re: [PATCH] ARC: Improve cmpxchng syscall implementation To: Alexey Brodkin , CC: , Peter Zijlstra , Max Filippov , Newsgroups: gmane.linux.kernel,gmane.linux.kernel.arc,gmane.linux.kernel.cross-arch References: <20180319110002.27419-1-abrodkin@synopsys.com> From: Vineet Gupta Message-ID: <5bc39838-b1c5-ef65-f97d-8777ed33bda0@synopsys.com> Date: Mon, 19 Mar 2018 11:29:24 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180319110002.27419-1-abrodkin@synopsys.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.10.161.84] Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/19/2018 04:00 AM, Alexey Brodkin wrote: > arc_usr_cmpxchg syscall is supposed to be used on platforms > that lack support of Load-Locked/Store-Conditional instructions > in hardware. And in that case we mimic missing hardware features > with help of kernel's sycall that "atomically" checks current > value in memory and then if it matches caller expectation new > value is written to that same location. > ... ... > > 2. What's worse if we're dealing with data from not yet allocated > page (think of pre-copy-on-write state) we'll successfully > read data but on write we'll silently return to user-space > with correct result This is technically incorrect, even for reading, you need a page, which could be common zero page in certain cases. (which we really read just before). That leads > to very strange problems in user-space app further down the line > because new value was never written to the destination. > > 3. Regardless of what went wrong we'll return from syscall > and user-space application will continue to execute. > Even if user's pointer was completely bogus. Again we are exaggerating (from technical correctness POV) - if user pointer was bogs, the read would not have worked in first place etc. So lets tone down the rhetoric. > In case of hardware LL/SC that app would have been killed > by the kernel. > > With that change we attempt to imrove on all 3 items above: > > 1. We still disable preemption around read-and-write of > user's data but if we happen to fail with either of them > we're enabling preemption and try to force page fault so > that we have a correct mapping in the TLB. Then re-try > again in "atomic" context. > > 2. If real page fault fails or even access_ok() returns false > we send SIGSEGV to the user-space process so if something goes > seriously wrong we'll know about it much earlier. > > > /* > * This is only for old cores lacking LLOCK/SCOND, which by defintion > @@ -60,23 +62,48 @@ SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new) > /* Z indicates to userspace if operation succeded */ > regs->status32 &= ~STATUS_Z_MASK; > > - if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int))) > - return -EFAULT; > + ret = access_ok(VERIFY_WRITE, uaddr, sizeof(*uaddr)); > + if (!ret) > + goto fail; > > +again: > preempt_disable(); > > - if (__get_user(uval, uaddr)) > - goto done; > - > - if (uval == expected) { > - if (!__put_user(new, uaddr)) > + ret = __get_user(val, uaddr); > + if (ret == -EFAULT) { Lets see if this warrants adding complexity ! This implies that TLB entry with Read permissions didn't exist for reading the var and page fault handler could not wire up even a zero page due to preempt_disable, meaning it was something not touched by userspace already - sort of uninitialized variable in user code. Otherwise it is extremely unlikely to start with a TLB entry with Read permissions, followed by syscall Trap only to find the entry missing, unless a global TLB flush came from other cores, right in the middle. But this syscall is not guaranteed to work with SMP anyways, so lets ignore any SMP misdoings here. Now in case it was *an* uninitialized var, do we have to guarantee any well defined semantics for the kernel emulation of cmpxchg ? IMO it should be fine to return 0 or -EFAULT etc. Infact -EFAULT is better as it will force a retry loop on user side, given the typical cmpxchg usage pattern. > + preempt_enable(); > + ret = get_user_pages_fast((unsigned long)uaddr, 1, 1, &page); > + if (ret < 0) > + goto fail; > + > + put_page(page); > + goto again; > + } else if (ret) > + goto fail; > + > + if (val == expected) { > + ret = __put_user(new, uaddr); > + if (!ret) > regs->status32 |= STATUS_Z_MASK; > } > > -done: > preempt_enable(); > > - return uval; > + if (ret == -EFAULT) { > + ret = get_user_pages_fast((unsigned long)uaddr, 1, 1, &page); > + if (ret < 0) > + goto fail; > + > + put_page(page); > + goto again; > + } else if (ret) > + goto fail; > + > + return val; > + > +fail: > + force_sig(SIGSEGV, current); > + return ret; > } > > #ifdef CONFIG_ISA_ARCV2 >