Date: Wed, 16 Jun 2021 13:00:57 +0100
From: Ionela Voinescu
To: Viresh Kumar
Cc: Rafael Wysocki, Sudeep Holla, Greg Kroah-Hartman, "Rafael J. Wysocki",
    linux-pm@vger.kernel.org, Vincent Guittot, Qian Cai, "Paul E. McKenney",
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 2/3] arch_topology: Avoid use-after-free for scale_freq_data
Message-ID: <20210616120057.GA23282@arm.com>
References: <9dba462b4d09a1a8a9fbb75740b74bf91a09a3e1.1623825725.git.viresh.kumar@linaro.org>
    <20210616112544.GA23657@arm.com>
    <20210616113604.e4kc3jxb7ayqskev@vireshk-i7>
In-Reply-To: <20210616113604.e4kc3jxb7ayqskev@vireshk-i7>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wednesday 16 Jun 2021 at 17:06:04 (+0530), Viresh Kumar wrote:
> Hi Ionela,
>
> On 16-06-21, 12:25, Ionela Voinescu wrote:
> > Please correct me if I'm wrong, but my understanding is that this is
> > only a problem for the cppc cpufreq invariance functionality. Let's
> > consider a scenario where CPUs are either hotplugged out or the cpufreq
> > CPPC driver module is removed; topology_clear_scale_freq_source() would
> > get called and the sfd_data will be set to NULL. But if at the same
> > time topology_scale_freq_tick() got an old reference of sfd_data,
> > set_freq_scale() will be called. This is only a problem for CPPC cpufreq
> > as cppc_scale_freq_tick() will end up using driver internal data that
> > might have been freed during the hotplug callbacks or the exit path.
>
> For now, yes, CPPC is the only one affected.
>
> > If this is the case, wouldn't the synchronisation issue be better
> > resolved in the CPPC cpufreq driver, rather than here?
>
> Hmm, the way I see it is that topology_clear_scale_freq_source() is an API
> provided by topology core and the topology core needs to guarantee that it
> doesn't use the data any longer after topology_clear_scale_freq_source() is
> called.
>
> The same is true for other APIs, like:
>
> irq_work_sync();
> kthread_cancel_work_sync();
>
> It isn't the user which needs to take this into account, but the API provider.
>

I would agree if it weren't for the fact that the driver provides the
set_freq_scale() implementation, which ends up using driver internal data
that could have been freed by the driver's own .exit()/stop_cpu()
callbacks. The API and the generic implementation are responsible for
ensuring sane access to their own structures.

Even if we wanted to keep drivers from shooting themselves in the foot,
I would prefer that we postpone it until we have more users for this,
before we add any synchronisation mechanisms to functionality called on
the tick.

Let's see if there's a less invasive solution to fix CPPC for now. What
do you think?

Thanks,
Ionela.

> There may be more users of this in the future, lets say another cpufreq driver,
> and so keeping this synchronization at the API provider is the right thing to do
> IMHO.
>
> And from the user's perspective, like cppc, it doesn't have any control over who
> is using its callback and how and when. It is very very difficult to provide
> something like this at the users, redundant anyway. For example cppc won't ever
> know when topology_scale_freq_tick() has stopped calling its callback.
>
> For example this is what cppc driver needs to do now:
>
> +static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy,
> +                                  unsigned int cpu)
> +{
> +        struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, cpu);
> +
> +        topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpumask_of(cpu));
> +        irq_work_sync(&cppc_fi->irq_work);
> +        kthread_cancel_work_sync(&cppc_fi->work);
> +}
>
> The driver uses APIs provided by 3 layers, topology, irq-work, kthread and all
> must provide these guarantees.
>
> A very similar thing is implemented in kernel/sched/cpufreq.c for example.
>
> --
> viresh
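
To make the disagreement concrete, here is a minimal userspace sketch of
what "synchronisation at the API provider" could look like: clearing the
source does not return until no tick is still inside the driver's callback.
This is an illustration only, not the mechanism used by the patch under
discussion. Every name in it (scale_source, set_source, tick, clear_source,
count_call, ticks_in_flight) is hypothetical, and C11 atomics with a
busy-wait stand in for whatever primitive the kernel would actually use.

```c
/* Userspace sketch only; all names here are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct scale_source {
	void (*set_freq_scale)(void *priv);	/* callback supplied by the driver */
	void *priv;				/* driver-internal data */
};

static _Atomic(struct scale_source *) current_source;
static atomic_int ticks_in_flight;

/* Driver side: publish a source for the tick path to use. */
static void set_source(struct scale_source *src)
{
	assert(src != NULL && src->set_freq_scale != NULL);
	atomic_store(&current_source, src);
}

/*
 * Provider side, called on every "tick". The in-flight counter is raised
 * before the pointer is read, so a tick that observes a non-NULL source
 * is always visible to clear_source().
 */
static void tick(void)
{
	struct scale_source *src;

	atomic_fetch_add(&ticks_in_flight, 1);
	src = atomic_load(&current_source);
	if (src != NULL)
		src->set_freq_scale(src->priv);
	atomic_fetch_sub(&ticks_in_flight, 1);
}

/*
 * Provider side: unpublish the source, then wait until every tick that
 * may have read the old pointer has left the callback.
 */
static void clear_source(void)
{
	atomic_store(&current_source, (struct scale_source *)NULL);
	while (atomic_load(&ticks_in_flight) != 0)
		;	/* busy-wait; a real implementation would not spin like this */
}

/* Example driver callback: counts how often it was invoked. */
static void count_call(void *priv)
{
	++*(int *)priv;
}
```

With this shape, a driver would call clear_source() in its teardown path
and could free its private data right after the call returns. The cost is
the one raised above: two extra atomic operations on every tick, whether or
not any source is ever removed.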