MIME-Version: 1.0
References: <20230113175459.14825-1-james.morse@arm.com> <20230113175459.14825-10-james.morse@arm.com>
 <CALPaoCg4T52ju5XJC-BVX-EuZUtc67LruWbgyH5s8CoiEwOUPw@mail.gmail.com> <c3ca6d66-e58c-8ace-e88e-45ded5de836f@arm.com>
In-Reply-To: <c3ca6d66-e58c-8ace-e88e-45ded5de836f@arm.com>
From:   Peter Newman <peternewman@google.com>
Date:   Mon, 6 Mar 2023 14:14:33 +0100
Message-ID: <CALPaoCik0j7ATCv-He5HWVqbL+3njpqO1fhF5FQJO7qqT1zR3w@mail.gmail.com>
Subject: Re: [PATCH v2 09/18] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep
To:     James Morse <james.morse@arm.com>
Cc:     x86@kernel.org, linux-kernel@vger.kernel.org,
        Fenghua Yu <fenghua.yu@intel.com>,
        Reinette Chatre <reinette.chatre@intel.com>,
        Thomas Gleixner <tglx@linutronix.de>,
        Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
        H Peter Anvin <hpa@zytor.com>,
        Babu Moger <Babu.Moger@amd.com>,
        shameerali.kolothum.thodi@huawei.com,
        D Scott Phillips OS <scott@os.amperecomputing.com>,
        carl@os.amperecomputing.com, lcherian@marvell.com,
        bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
        xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
        Jamie Iles <quic_jiles@quicinc.com>,
        Xin Hao <xhao@linux.alibaba.com>,
        Stephane Eranian <eranian@google.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Precedence: bulk

Hi James,

On Mon, Mar 6, 2023 at 12:34 PM James Morse <james.morse@arm.com> wrote:
> On 23/01/2023 15:33, Peter Newman wrote:
> > On Fri, Jan 13, 2023 at 6:56 PM James Morse <james.morse@arm.com> wrote:
> >> MPAM's cache occupancy counters can take a little while to settle once
> >> the monitor has been configured. The maximum settling time is described
> >> to the driver via a firmware table. The value could be large enough
> >> that it makes sense to sleep.
> >
> > Would it be easier to return an error when reading the occupancy count
> > too soon after configuration? On Intel it is already normal for counter
> > reads to fail on newly-allocated RMIDs.
>
> For x86, you have as many counters as there are RMIDs, so there is no issue just accessing
> the counter.

I should have said AMD instead of Intel, because their implementations
have far fewer counters than RMIDs.

>
> With MPAM there may be as few as 1 monitor for the CSU (cache storage utilisation)
> counter, which needs to be multiplexed between different PARTID to find the cache
> occupancy (This works for CSU because its a stable count, it doesn't work for the
> bandwidth monitors)
> On such a platform the monitor needs to be allocated and programmed before it reads a
> value for a particular PARTID/CLOSID. If you had two threads trying to read the same
> counter, they could interleave perfectly to prevent either thread managing to read a value.
> The 'not ready' time is advertised in a firmware table, and the driver will wait at most
> that long before giving up and returning an error.

Likewise, on AMD, a repeating sequence of tasks which are LRU in terms
of counter -> RMID allocation could prevent RMID event reads from ever
returning a value.

The main difference I see with MPAM is that software allocates the
counters instead of hardware, but the overall behavior sounds the same.

The part I object to is adding the wait to the counter read, because
existing software already expects an immediate error when reading a
counter too soon. To produce accurate data, these counters are usually
read at intervals of multiple seconds.

Instead, when configuring a counter, could you use the firmware table
value to compute the time when the counter will next be valid and return
errors on read requests received before that?
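
Roughly what I have in mind, as a minimal userspace sketch rather than
driver code. The names (struct mon_state, mon_configure(), mon_read()),
the nanosecond timestamps passed in by the caller, and the choice of
-EBUSY are all made up for illustration; none of them come from the
patch series:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical monitor state; not the actual MPAM driver structure. */
struct mon_state {
	uint64_t valid_after_ns;	/* earliest time a read is meaningful */
	uint64_t value;			/* stand-in for the hardware counter */
};

/*
 * Record the deadline when the monitor is configured, using the
 * settling time taken from the firmware table (settle_ns here).
 */
static void mon_configure(struct mon_state *m, uint64_t now_ns,
			  uint64_t settle_ns)
{
	m->valid_after_ns = now_ns + settle_ns;
}

/*
 * Never sleeps: fails immediately until the settling time has elapsed.
 * -EBUSY is an arbitrary choice for this sketch.
 */
static int mon_read(const struct mon_state *m, uint64_t now_ns,
		    uint64_t *val)
{
	if (now_ns < m->valid_after_ns)
		return -EBUSY;

	*val = m->value;
	return 0;
}
```

A caller polling every few seconds would see an error on a read that
lands inside the settling window and a value on the next one, which is
the behavior existing software already handles.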

-Peter