In siw_orqe_start_rx, the orqe's flag in the if condition is read using
READ_ONCE, checked, and then re-read, voiding all guarantees of the
checks. Reuse the value that was read by READ_ONCE to ensure the
consistency of the flags throughout the function.
Signed-off-by: linke li <[email protected]>
---
drivers/infiniband/sw/siw/siw_qp_rx.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
index ed4fc39718b4..f5f69de56882 100644
--- a/drivers/infiniband/sw/siw/siw_qp_rx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
@@ -740,6 +740,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
{
struct siw_sqe *orqe;
struct siw_wqe *wqe = NULL;
+ u16 orqe_flags;
if (unlikely(!qp->attrs.orq_size))
return -EPROTO;
@@ -748,7 +749,8 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
smp_mb();
orqe = orq_get_current(qp);
- if (READ_ONCE(orqe->flags) & SIW_WQE_VALID) {
+ orqe_flags = READ_ONCE(orqe->flags);
+ if (orqe_flags & SIW_WQE_VALID) {
/* RRESP is a TAGGED RDMAP operation */
wqe = rx_wqe(&qp->rx_tagged);
wqe->sqe.id = orqe->id;
@@ -756,7 +758,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
wqe->sqe.sge[0].laddr = orqe->sge[0].laddr;
wqe->sqe.sge[0].lkey = orqe->sge[0].lkey;
wqe->sqe.sge[0].length = orqe->sge[0].length;
- wqe->sqe.flags = orqe->flags;
+ wqe->sqe.flags = orqe_flags;
wqe->sqe.num_sge = 1;
wqe->bytes = orqe->sge[0].length;
wqe->processed = 0;
--
2.39.3 (Apple Git-146)
On 2024/3/9 13:27, linke li wrote:
> In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> READ_ONCE, checked, and then re-read, voiding all guarantees of the
> checks. Reuse the value that was read by READ_ONCE to ensure the
> consistency of the flags throughout the function.
>
> Signed-off-by: linke li <[email protected]>
> ---
> drivers/infiniband/sw/siw/siw_qp_rx.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c
> index ed4fc39718b4..f5f69de56882 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_rx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_rx.c
> @@ -740,6 +740,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
> {
> struct siw_sqe *orqe;
> struct siw_wqe *wqe = NULL;
> + u16 orqe_flags;
>
> if (unlikely(!qp->attrs.orq_size))
> return -EPROTO;
> @@ -748,7 +749,8 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
> smp_mb();
>
> orqe = orq_get_current(qp);
> - if (READ_ONCE(orqe->flags) & SIW_WQE_VALID) {
This if test needs READ_ONCE to read orqe->flags, but this commit moves
the READ_ONCE out of the test.
In a complicated environment, for example when this function is called many
times concurrently while orqe->flags is being changed, I am not sure
whether this introduces risks.
If you need to ensure the consistency of the flags throughout the
function, I am not sure whether the following would be better:
if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {
Thanks,
Zhu Yanjun
> + orqe_flags = READ_ONCE(orqe->flags);
> + if (orqe_flags & SIW_WQE_VALID) {
> /* RRESP is a TAGGED RDMAP operation */
> wqe = rx_wqe(&qp->rx_tagged);
> wqe->sqe.id = orqe->id;
> @@ -756,7 +758,7 @@ static int siw_orqe_start_rx(struct siw_qp *qp)
> wqe->sqe.sge[0].laddr = orqe->sge[0].laddr;
> wqe->sqe.sge[0].lkey = orqe->sge[0].lkey;
> wqe->sqe.sge[0].length = orqe->sge[0].length;
> - wqe->sqe.flags = orqe->flags;
> + wqe->sqe.flags = orqe_flags;
> wqe->sqe.num_sge = 1;
> wqe->bytes = orqe->sge[0].length;
> wqe->processed = 0;
On Sat, Mar 09, 2024 at 08:27:16PM +0800, linke li wrote:
> In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> READ_ONCE, checked, and then re-read, voiding all guarantees of the
> checks. Reuse the value that was read by READ_ONCE to ensure the
> consistency of the flags throughout the function.
Please read include/asm-generic/rwonce.h comments when READ_ONCE() is used.
There is no value in caching the output of READ_ONCE().
Thanks
I want to emphasize that, given there is already a READ_ONCE, if the value
of orqe->flags has changed by the time of the second read, the value read
may not satisfy the if condition, causing an inconsistency.
> In a complicated environment, for example, this function is called many
> times at the same time and orqe->flags is changed at the same time, I am
> not sure if this will introduce risks or not.
I think one purpose of READ_ONCE is to read a valid value while the value
may change concurrently. There is also an smp_mb() above the READ_ONCE,
which means that the READ_ONCE is well ordered. I think it is reasonably
safe here.
> if you need to ensure the consistency of the flags throughout the function, not sure if the following is better or not.
> if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {
That looks like it does exactly the same thing as this patch. The only
difference, I think, is the code style.
Thanks,
Linke
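For reference, a minimal userspace sketch of the two spellings discussed
above. READ_ONCE here is a simplified stand-in (a volatile cast), not the
kernel's definition from include/asm-generic/rwonce.h, and struct sqe,
WQE_VALID, and both function names are hypothetical. In both variants the
shared field is loaded once into a local, and only the local is tested and
stored afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro: force one volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define WQE_VALID 0x1u /* hypothetical flag bit */

struct sqe {
	uint16_t flags;
};

/* Variant A: load into a local first, then test the local. */
static int snapshot_then_test(struct sqe *e, uint16_t *out)
{
	uint16_t flags = READ_ONCE(e->flags);

	if (flags & WQE_VALID) {
		*out = flags;
		return 1;
	}
	return 0;
}

/* Variant B: assignment inside the condition; still a single load. */
static int assign_in_condition(struct sqe *e, uint16_t *out)
{
	uint16_t flags;

	if ((flags = READ_ONCE(e->flags)) & WQE_VALID) {
		*out = flags;
		return 1;
	}
	return 0;
}
```

Both functions return the same result for the same input. Whether a given
compiler emits identical machine code for them is not something this sketch
demonstrates.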
On Sun, Mar 10, 2024 at 8:36 PM linke li <[email protected]> wrote:
>
> > In a complicated environment, for example, this function is called many
> > times at the same time and orqe->flags is changed at the same time, I am
> > not sure if this will introduce risks or not.
>
> I think one function of READ_ONCE is to read a valid value while the value
> may change concurrently. And there is a smp() above the READ_ONCE, which
> means that the READ_ONCE is well ordered. I think it is kind of safe here.
This is not an SMP problem. Compared with the original source, your
commit introduces a time slot.
>
> > if you need to ensure the consistency of the flags throughout the function, not sure if the following is better or not.
>
> > if (((orqe_flags=READ_ONCE(orqe->flags))) & SIW_WQE_VALID) {
>
> This patch looks like exactly do the same things. The only difference I
> think is the code style.
No.
>
> Thanks,
> Linke
>
>
On Sun, Mar 10, 2024 at 7:33 PM Leon Romanovsky <[email protected]> wrote:
>
> On Sat, Mar 09, 2024 at 08:27:16PM +0800, linke li wrote:
> > In siw_orqe_start_rx, the orqe's flag in the if condition is read using
> > READ_ONCE, checked, and then re-read, voiding all guarantees of the
> > checks. Reuse the value that was read by READ_ONCE to ensure the
> > consistency of the flags throughout the function.
>
> Please read include/asm-generic/rwonce.h comments when READ_ONCE() is used.
> There is no value in caching the output of READ_ONCE().
Agreed. Please also read
https://www.kernel.org/doc/Documentation/memory-barriers.txt.
>
> Thanks
>
On Sun, Mar 10, 2024 at 08:15:25PM +0800, linke li wrote:
> I want to emphasize that if the value of orqe->flags has changed by the
> time of the second read, the value read will not satisfy the if condition,
> causing inconsistency. Given that there is already a READ_ONCE.
If the value can change between subsequent reads, then you need to use
locks to make sure that it doesn't happen. Using READ_ONCE() doesn't solve
the concurrency issue, but makes sure that the compiler doesn't reorder
reads and writes.
Thanks
> If value can change between subsequent reads, then you need to use locks
> to make sure that it doesn't happen. Using READ_ONCE() doesn't solve the
> concurrency issue, but makes sure that compiler doesn't reorder reads
> and writes.
This code does not need to prevent other threads from writing to the flags.
This topic got quite a bit of discussion [1]; to quote from it:
(READ_ONCE and WRITE_ONCE)
That's often useful - lots of code doesn't really care if you get the
old or the new value, but the code *does* care that it gets *one*
value, and not some random mix of "I tested one value for validity,
then it got reloaded due to register pressure, and I actually used
another value".
And not some "I read one value, and it was a mix of two other values".
In the original code, the first read seems to serve the same purpose, so
READ_ONCE is probably OK here.
I just want to make sure the flags stored to wqe->sqe.flags are consistent
with the value tested in the if condition.
[1]https://lore.kernel.org/lkml/CAHk-=wgG6Dmt1JTXDbrbXh_6s2yLjL=9pHo7uv0==LHFD+aBtg@mail.gmail.com/
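To illustrate the "tested one value, used another" case from the quote, here
is a small single-threaded sketch. It is not the siw code: struct entry,
WQE_VALID, the writer hook, and the function names are all hypothetical, and
READ_ONCE is a simplified volatile-cast stand-in for the kernel macro. The
hook only simulates a writer that updates the flags between the check and the
store; it says nothing about whether such a writer can exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro: force one volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define WQE_VALID 0x1u /* hypothetical flag bit */

struct entry {
	uint16_t flags;
};

/* Hypothetical hook standing in for a concurrent writer. */
typedef void (*writer_fn)(struct entry *e);

static void clear_flags(struct entry *e)
{
	e->flags = 0;
}

/* Tests one read, then stores a second read: the two may differ. */
static int copy_reread(struct entry *e, writer_fn writer, uint16_t *dst)
{
	if (READ_ONCE(e->flags) & WQE_VALID) {
		writer(e);       /* simulated update lands here */
		*dst = e->flags; /* second load sees the new value */
		return 1;
	}
	return 0;
}

/* Tests and stores the same snapshot. */
static int copy_snapshot(struct entry *e, writer_fn writer, uint16_t *dst)
{
	uint16_t flags = READ_ONCE(e->flags);

	if (flags & WQE_VALID) {
		writer(e);       /* simulated update does not affect the local */
		*dst = flags;
		return 1;
	}
	return 0;
}
```

With the simulated writer, copy_reread() stores flags that no longer carry the
valid bit it just tested, while copy_snapshot() stores exactly the value it
tested.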
> This is not a smp problem. Compared with the original source, your
> commit introduces a time slot.
I don't know what you mean by a time slot. At the binary level, they
produce the same code.
On 2024/3/11 3:34, linke li wrote:
>> If value can change between subsequent reads, then you need to use locks
>> to make sure that it doesn't happen. Using READ_ONCE() doesn't solve the
>> concurrency issue, but makes sure that compiler doesn't reorder reads
>> and writes.
>
> This code do not need to prevent other thread from writing on the flags.
>
> This topic got quite a bit of discussion [1], quote from it:
>
> (READ_ONCE and WRITE_ONCE)
> That's often useful - lots of code doesn't really care if you get the
> old or the new value, but the code *does* care that it gets *one*
> value, and not some random mix of "I tested one value for validity,
> then it got reloaded due to register pressure, and I actually used
> another value".
>
> And not some "I read one value, and it was a mix of two other values".
>
> From the original code, the first read seems to do the same things. So
> READ_ONCE is probably ok here.
>
> I just want to make sure the flags stored to wqe->sqe.flags is consistent
> with the read used in the if condition.
Sure. Following Leon's advice, to make this ("wqe->sqe.flags is consistent
with the read used in the if condition") happen, you need a lock to
ensure it. The lock can be a spinlock or a mutex, depending on whether
the code may sleep.
From the original source code, wqe->sqe.flags should be a volatile
variable. It should be read from the original source, not from a cache.
Zhu Yanjun
>
> [1]https://lore.kernel.org/lkml/CAHk-=wgG6Dmt1JTXDbrbXh_6s2yLjL=9pHo7uv0==LHFD+aBtg@mail.gmail.com/
>
In the original source code, READ_ONCE(xxx) is inside the if test. In your
commit, you move READ_ONCE out of this if test.
So a time slot exists between fetching and using the value. In the original
source code, it does not exist. And the fetching and using are not
protected by locks, as Leon suggested.
This will introduce risks.
The binary depends on the optimization level and the architecture. It is
very complicated.
Zhu Yanjun
On 11.03.24 03:57, linke li wrote:
>> This is not a smp problem. Compared with the original source, your
>> commit introduces a time slot.
> I don't know what do you mean by a time slot. In the binary level, they
> have the same code.
>
> -----Original Message-----
> From: linke li <[email protected]>
> Sent: Saturday, March 9, 2024 1:27 PM
> Cc: [email protected]; Bernard Metzler <[email protected]>; Jason Gunthorpe
> <[email protected]>; Leon Romanovsky <[email protected]>; linux-
> [email protected]; [email protected]
> Subject: [EXTERNAL] [PATCH] RDMA/siw: Reuse value read using READ_ONCE
> instead of re-reading it
>
The outbound read queue (orq) is a ring buffer with only one
consumer (this code) and one producer (READ.request sending
code). There is no parallel reader and a single writer.
The producer (sender of the READ.request) sets the orq entry
valid and does this only once after completely writing
the entry. It does it under qp->orq_lock.
Only if we find the orq entry valid, its content gets copied
at the beginning of a new READ.response (this code).
The orq entry remains valid to stop the producer from re-using
it until the complete READ.response has been received (may be
multiple fragments). The flag gets cleared under qp->orq_lock
after the complete READ.response has been received, or the
response was invalid.
There is no possibility a valid orq entry gets invalidated
after it has been found valid, so it is safe to copy all its
members.
Thanks,
Bernard.
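The protocol described above can be sketched roughly as follows. This is a
single-threaded illustration only, not the siw implementation: struct ring,
struct entry, ENTRY_VALID, RING_SIZE, and the ring_* functions are
hypothetical names, and the qp->orq_lock locking and memory barriers used by
the driver are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_VALID 0x1u /* hypothetical flag bit */
#define RING_SIZE   4u   /* hypothetical queue depth */

struct entry {
	uint64_t id;
	uint32_t length;
	uint16_t flags;
};

struct ring {
	struct entry slot[RING_SIZE];
	uint32_t put; /* advanced only by the producer */
	uint32_t get; /* advanced only by the consumer */
};

/* Producer: fill the slot completely, set the valid flag last. */
static int ring_post(struct ring *r, uint64_t id, uint32_t length)
{
	struct entry *e = &r->slot[r->put % RING_SIZE];

	if (e->flags & ENTRY_VALID)
		return -1; /* slot not yet released by the consumer */
	e->id = id;
	e->length = length;
	e->flags = ENTRY_VALID; /* done under a lock in the description above */
	r->put++;
	return 0;
}

/* Consumer: copy the current entry only if it is valid. */
static int ring_start_rx(struct ring *r, struct entry *copy)
{
	struct entry *e = &r->slot[r->get % RING_SIZE];

	if (!(e->flags & ENTRY_VALID))
		return -1;
	*copy = *e; /* the entry stays valid while the response is in flight */
	return 0;
}

/* Consumer: release the slot once the whole response has arrived. */
static void ring_complete(struct ring *r)
{
	r->slot[r->get % RING_SIZE].flags = 0;
	r->get++;
}
```

In this model only ring_complete() ever clears the flag, and it runs on the
consumer side after ring_start_rx(), which is why an entry found valid cannot
become invalid while it is being copied.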
Thank you for your reasonable reply. That makes sense. But you may still
consider improving it, as this patch does, by reading the flag only
once. That would avoid some potential risks. However, it is the
maintainer's choice.
Linke
Thanks
On Tue, Mar 12, 2024 at 09:30:53AM +0800, linke li wrote:
> Thank you for your reasonable reply. That makes sense. But you may still
> consider improving it, as this patch does, by reading the flag only
> once. That would avoid some potential risks. However, it is the
> maintainer's choice.
The maintainer doesn't see any potential risks, and the value is read only once anyway.
Thanks
>
> Linke
> Thanks
>
>