2021-03-16 15:25:28

by Ahmed S. Darwish

[permalink] [raw]
Subject: [PATCH v1 0/2] net: xfrm: Use seqcount_spinlock_t

Hi,

This is a small series to trasform xfrm_state_hash_generation sequence
counter to seqcount_spinlock_t, instead of plain seqcount_t.

In general, seqcount_LOCKNAME_t sequence counters allows to associate
the lock used for write serialization with the seqcount. This enables
lockdep to verify that the write serialization lock is always held
before entering the seqcount write section.

If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.

The first patch is a general mainline fix, and has a Fixes tag.

Thanks,

8<----------

Ahmed S. Darwish (2):
net: xfrm: Localize sequence counter per network namespace
net: xfrm: Use sequence counter with associated spinlock

include/net/netns/xfrm.h | 4 +++-
net/xfrm/xfrm_state.c | 11 ++++++-----
2 files changed, 9 insertions(+), 6 deletions(-)

base-commit: 1e28eed17697bcf343c6743f0028cc3b5dd88bf0
--
2.30.2


2021-03-16 15:26:16

by Ahmed S. Darwish

[permalink] [raw]
Subject: [PATCH v1 1/2] net: xfrm: Localize sequence counter per network namespace

A sequence counter write section must be serialized or its internal
state can get corrupted. The "xfrm_state_hash_generation" seqcount is
global, but its write serialization lock (net->xfrm.xfrm_state_lock) is
instantiated per network namespace. The write protection is thus
insufficient.

To provide full protection, localize the sequence counter per network
namespace instead. This should be safe as both the seqcount read and
write sections access data exclusively within the network namespace. It
also lays the foundation for transforming "xfrm_state_hash_generation"
data type from seqcount_t to seqcount_LOCKNAME_t in further commits.

Fixes: b65e3d7be06f ("xfrm: state: add sequence count to detect hash resizes")
Signed-off-by: Ahmed S. Darwish <[email protected]>
---
include/net/netns/xfrm.h | 4 +++-
net/xfrm/xfrm_state.c | 10 +++++-----
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index 59f45b1e9dac..b59d73d529ba 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -72,7 +72,9 @@ struct netns_xfrm {
#if IS_ENABLED(CONFIG_IPV6)
struct dst_ops xfrm6_dst_ops;
#endif
- spinlock_t xfrm_state_lock;
+ spinlock_t xfrm_state_lock;
+ seqcount_t xfrm_state_hash_generation;
+
spinlock_t xfrm_policy_lock;
struct mutex xfrm_cfg_mutex;
};
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index d01ca1a18418..ffd315cff984 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -44,7 +44,6 @@ static void xfrm_state_gc_task(struct work_struct *work);
*/

static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
-static __read_mostly seqcount_t xfrm_state_hash_generation = SEQCNT_ZERO(xfrm_state_hash_generation);
static struct kmem_cache *xfrm_state_cache __ro_after_init;

static DECLARE_WORK(xfrm_state_gc_work, xfrm_state_gc_task);
@@ -140,7 +139,7 @@ static void xfrm_hash_resize(struct work_struct *work)
}

spin_lock_bh(&net->xfrm.xfrm_state_lock);
- write_seqcount_begin(&xfrm_state_hash_generation);
+ write_seqcount_begin(&net->xfrm.xfrm_state_hash_generation);

nhashmask = (nsize / sizeof(struct hlist_head)) - 1U;
odst = xfrm_state_deref_prot(net->xfrm.state_bydst, net);
@@ -156,7 +155,7 @@ static void xfrm_hash_resize(struct work_struct *work)
rcu_assign_pointer(net->xfrm.state_byspi, nspi);
net->xfrm.state_hmask = nhashmask;

- write_seqcount_end(&xfrm_state_hash_generation);
+ write_seqcount_end(&net->xfrm.xfrm_state_hash_generation);
spin_unlock_bh(&net->xfrm.xfrm_state_lock);

osize = (ohashmask + 1) * sizeof(struct hlist_head);
@@ -1063,7 +1062,7 @@ xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,

to_put = NULL;

- sequence = read_seqcount_begin(&xfrm_state_hash_generation);
+ sequence = read_seqcount_begin(&net->xfrm.xfrm_state_hash_generation);

rcu_read_lock();
h = xfrm_dst_hash(net, daddr, saddr, tmpl->reqid, encap_family);
@@ -1176,7 +1175,7 @@ xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,
if (to_put)
xfrm_state_put(to_put);

- if (read_seqcount_retry(&xfrm_state_hash_generation, sequence)) {
+ if (read_seqcount_retry(&net->xfrm.xfrm_state_hash_generation, sequence)) {
*err = -EAGAIN;
if (x) {
xfrm_state_put(x);
@@ -2666,6 +2665,7 @@ int __net_init xfrm_state_init(struct net *net)
net->xfrm.state_num = 0;
INIT_WORK(&net->xfrm.state_hash_work, xfrm_hash_resize);
spin_lock_init(&net->xfrm.xfrm_state_lock);
+ seqcount_init(&net->xfrm.xfrm_state_hash_generation);
return 0;

out_byspi:
--
2.30.2

2021-03-16 18:24:16

by Ahmed S. Darwish

[permalink] [raw]
Subject: [PATCH v1 2/2] net: xfrm: Use sequence counter with associated spinlock

A sequence counter write section must be serialized or its internal
state can get corrupted. A plain seqcount_t does not contain the
information of which lock must be held to guaranteee write side
serialization.

For xfrm_state_hash_generation, use seqcount_spinlock_t instead of plain
seqcount_t. This allows to associate the spinlock used for write
serialization with the sequence counter. It thus enables lockdep to
verify that the write serialization lock is indeed held before entering
the sequence counter write section.

If lockdep is disabled, this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <[email protected]>
---
include/net/netns/xfrm.h | 2 +-
net/xfrm/xfrm_state.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h
index b59d73d529ba..e816b6a3ef2b 100644
--- a/include/net/netns/xfrm.h
+++ b/include/net/netns/xfrm.h
@@ -73,7 +73,7 @@ struct netns_xfrm {
struct dst_ops xfrm6_dst_ops;
#endif
spinlock_t xfrm_state_lock;
- seqcount_t xfrm_state_hash_generation;
+ seqcount_spinlock_t xfrm_state_hash_generation;

spinlock_t xfrm_policy_lock;
struct mutex xfrm_cfg_mutex;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index ffd315cff984..4496f7efa220 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -2665,7 +2665,8 @@ int __net_init xfrm_state_init(struct net *net)
net->xfrm.state_num = 0;
INIT_WORK(&net->xfrm.state_hash_work, xfrm_hash_resize);
spin_lock_init(&net->xfrm.xfrm_state_lock);
- seqcount_init(&net->xfrm.xfrm_state_hash_generation);
+ seqcount_spinlock_init(&net->xfrm.xfrm_state_hash_generation,
+ &net->xfrm.xfrm_state_lock);
return 0;

out_byspi:
--
2.30.2

2021-03-23 08:15:35

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH v1 0/2] net: xfrm: Use seqcount_spinlock_t

On Tue, Mar 16, 2021 at 11:56:28AM +0100, Ahmed S. Darwish wrote:
> Hi,
>
> This is a small series to trasform xfrm_state_hash_generation sequence
> counter to seqcount_spinlock_t, instead of plain seqcount_t.
>
> In general, seqcount_LOCKNAME_t sequence counters allows to associate
> the lock used for write serialization with the seqcount. This enables
> lockdep to verify that the write serialization lock is always held
> before entering the seqcount write section.
>
> If lockdep is disabled, this lock association is compiled out and has
> neither storage size nor runtime overhead.
>
> The first patch is a general mainline fix, and has a Fixes tag.
>
> Thanks,
>
> 8<----------
>
> Ahmed S. Darwish (2):
> net: xfrm: Localize sequence counter per network namespace
> net: xfrm: Use sequence counter with associated spinlock

Applied to the ipsec tree, thanks a lot!