add_hwgenerator_randomness() currently blocks until more entropy
is needed. Move the blocking wait out of the function to the caller,
by letting the function return the number of jiffies needed to block.
This is done to prepare for converting the function's sole in-kernel
caller from a kthread to a self-rearming delayed_work.
Signed-off-by: Sven van Ashbrook <[email protected]>
---
Changes in v2:
- justify patch as preparation for the next patch
drivers/char/hw_random/core.c | 7 +++++--
drivers/char/random.c | 13 ++++++-------
include/linux/random.h | 2 +-
3 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 16f227b995e8..3675122c6cce 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -491,6 +491,7 @@ static int __init register_miscdev(void)
static int hwrng_fillfn(void *unused)
{
size_t entropy, entropy_credit = 0; /* in 1/1024 of a bit */
+ unsigned long delay;
long rc;
while (!kthread_should_stop()) {
@@ -526,8 +527,10 @@ static int hwrng_fillfn(void *unused)
entropy_credit = entropy;
/* Outside lock, sure, but y'know: randomness. */
- add_hwgenerator_randomness((void *)rng_fillbuf, rc,
- entropy >> 10);
+ delay = add_hwgenerator_randomness((void *)rng_fillbuf, rc,
+ entropy >> 10);
+ if (delay > 0)
+ schedule_timeout_interruptible(delay);
}
hwrng_fill = NULL;
return 0;
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 79d7d4e4e582..5dc949298f92 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -686,7 +686,7 @@ static void __cold _credit_init_bits(size_t bits)
* the above entropy accumulation routines:
*
* void add_device_randomness(const void *buf, size_t len);
- * void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
+ * unsigned long add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
* void add_bootloader_randomness(const void *buf, size_t len);
* void add_vmfork_randomness(const void *unique_vm_id, size_t len);
* void add_interrupt_randomness(int irq);
@@ -702,8 +702,8 @@ static void __cold _credit_init_bits(size_t bits)
* available to them (particularly common in the embedded world).
*
* add_hwgenerator_randomness() is for true hardware RNGs, and will credit
- * entropy as specified by the caller. If the entropy pool is full it will
- * block until more entropy is needed.
+ * entropy as specified by the caller. Returns time delay in jiffies until
+ * more entropy is needed.
*
* add_bootloader_randomness() is called by bootloader drivers, such as EFI
* and device tree, and credits its input depending on whether or not the
@@ -857,10 +857,10 @@ EXPORT_SYMBOL(add_device_randomness);
/*
* Interface for in-kernel drivers of true hardware RNGs.
- * Those devices may produce endless random bits and will be throttled
+ * Those devices may produce endless random bits and should be throttled
* when our pool is full.
*/
-void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy)
+unsigned long add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy)
{
mix_pool_bytes(buf, len);
credit_init_bits(entropy);
@@ -869,8 +869,7 @@ void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy)
* Throttle writing to once every CRNG_RESEED_INTERVAL, unless
* we're not yet initialized.
*/
- if (!kthread_should_stop() && crng_ready())
- schedule_timeout_interruptible(CRNG_RESEED_INTERVAL);
+ return crng_ready() ? CRNG_RESEED_INTERVAL : 0;
}
EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
diff --git a/include/linux/random.h b/include/linux/random.h
index 3fec206487f6..6608b0fb4402 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -17,7 +17,7 @@ void __init add_bootloader_randomness(const void *buf, size_t len);
void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value) __latent_entropy;
void add_interrupt_randomness(int irq) __latent_entropy;
-void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
+unsigned long add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
#if defined(LATENT_ENTROPY_PLUGIN) && !defined(__CHECKER__)
static inline void add_latent_entropy(void)
--
2.37.2.789.g6183377224-goog
The hwrng fill function runs as a normal kthread. This thread
doesn't get frozen by the PM, i.e. it will keep running during,
or in, system suspend. It may call the client driver's
data_present()/data_read() functions during, or in, suspend;
which may generate errors or warnings. For example, if the
client driver uses an i2c bus, the following warning may be
intermittently generated:
i2c: Transfer while suspended
Fix by converting the delay-polled kthread into an ordered workqueue
running a single, self-rearming delayed_work. Make the
workqueue WQ_FREEZABLE, so the PM will drain any work items
before going into suspend. This prevents client drivers from
being accessed during, or in, suspend.
Tested on a Chromebook containing a cr50 TPM over i2c. The test
consists of 31000 suspend/resume cycles. Without this patch,
occasional "i2c: Transfer while suspended" warnings are seen; after
applying it, these warnings disappear.
This patch also does not appear to cause any regressions on the
ChromeOS test queues.
Signed-off-by: Sven van Ashbrook <[email protected]>
---
Changes in v2:
- refreshed patch set against latest tree
drivers/char/hw_random/core.c | 95 +++++++++++++++++++----------------
1 file changed, 51 insertions(+), 44 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 3675122c6cce..ee85ca97d215 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -17,7 +17,7 @@
#include <linux/hw_random.h>
#include <linux/random.h>
#include <linux/kernel.h>
-#include <linux/kthread.h>
+#include <linux/workqueue.h>
#include <linux/sched/signal.h>
#include <linux/miscdevice.h>
#include <linux/module.h>
@@ -28,14 +28,17 @@
#define RNG_MODULE_NAME "hw_random"
-static struct hwrng *current_rng;
/* the current rng has been explicitly chosen by user via sysfs */
static int cur_rng_set_by_user;
-static struct task_struct *hwrng_fill;
+static struct workqueue_struct *hwrng_wq;
+static struct delayed_work hwrng_fill_dwork;
+static size_t entropy_credit;
+/* Protects rng_list, current_rng, is_hwrng_wq_running */
+static DEFINE_MUTEX(rng_mutex);
/* list of registered rngs */
static LIST_HEAD(rng_list);
-/* Protects rng_list and current_rng */
-static DEFINE_MUTEX(rng_mutex);
+static struct hwrng *current_rng;
+static bool is_hwrng_wq_running;
/* Protects rng read functions, data_avail, rng_buffer and rng_fillbuf */
static DEFINE_MUTEX(reading_mutex);
static int data_avail;
@@ -488,37 +491,29 @@ static int __init register_miscdev(void)
return misc_register(&rng_miscdev);
}
-static int hwrng_fillfn(void *unused)
+static void hwrng_fillfn(struct work_struct *unused)
{
- size_t entropy, entropy_credit = 0; /* in 1/1024 of a bit */
+ unsigned short quality;
unsigned long delay;
+ struct hwrng *rng;
+ size_t entropy; /* in 1/1024 of a bit */
long rc;
- while (!kthread_should_stop()) {
- unsigned short quality;
- struct hwrng *rng;
-
- rng = get_current_rng();
- if (IS_ERR(rng) || !rng)
- break;
- mutex_lock(&reading_mutex);
- rc = rng_get_data(rng, rng_fillbuf,
- rng_buffer_size(), 1);
- if (current_quality != rng->quality)
- rng->quality = current_quality; /* obsolete */
- quality = rng->quality;
- mutex_unlock(&reading_mutex);
- put_rng(rng);
-
- if (!quality)
- break;
+ rng = get_current_rng();
+ if (IS_ERR(rng) || !rng)
+ return;
+ mutex_lock(&reading_mutex);
+ rc = rng_get_data(rng, rng_fillbuf, rng_buffer_size(), 1);
+ if (current_quality != rng->quality)
+ rng->quality = current_quality; /* obsolete */
+ quality = rng->quality;
+ mutex_unlock(&reading_mutex);
+ put_rng(rng);
- if (rc <= 0) {
- pr_warn("hwrng: no data available\n");
- msleep_interruptible(10000);
- continue;
- }
+ if (!quality)
+ return;
+ if (rc > 0) {
/* If we cannot credit at least one bit of entropy,
* keep track of the remainder for the next iteration
*/
@@ -529,11 +524,11 @@ static int hwrng_fillfn(void *unused)
/* Outside lock, sure, but y'know: randomness. */
delay = add_hwgenerator_randomness((void *)rng_fillbuf, rc,
entropy >> 10);
- if (delay > 0)
- schedule_timeout_interruptible(delay);
+ } else {
+ pr_warn("hwrng: no data available\n");
+ delay = 10 * HZ;
}
- hwrng_fill = NULL;
- return 0;
+ mod_delayed_work(hwrng_wq, &hwrng_fill_dwork, delay);
}
static void hwrng_manage_rngd(struct hwrng *rng)
@@ -541,14 +536,12 @@ static void hwrng_manage_rngd(struct hwrng *rng)
if (WARN_ON(!mutex_is_locked(&rng_mutex)))
return;
- if (rng->quality == 0 && hwrng_fill)
- kthread_stop(hwrng_fill);
- if (rng->quality > 0 && !hwrng_fill) {
- hwrng_fill = kthread_run(hwrng_fillfn, NULL, "hwrng");
- if (IS_ERR(hwrng_fill)) {
- pr_err("hwrng_fill thread creation failed\n");
- hwrng_fill = NULL;
- }
+ if (rng->quality == 0 && is_hwrng_wq_running) {
+ cancel_delayed_work(&hwrng_fill_dwork);
+ is_hwrng_wq_running = false;
+ } else if (rng->quality > 0 && !is_hwrng_wq_running) {
+ mod_delayed_work(hwrng_wq, &hwrng_fill_dwork, 0);
+ is_hwrng_wq_running = true;
}
}
@@ -631,8 +624,7 @@ void hwrng_unregister(struct hwrng *rng)
new_rng = get_current_rng_nolock();
if (list_empty(&rng_list)) {
mutex_unlock(&rng_mutex);
- if (hwrng_fill)
- kthread_stop(hwrng_fill);
+ cancel_delayed_work_sync(&hwrng_fill_dwork);
} else
mutex_unlock(&rng_mutex);
@@ -703,17 +695,32 @@ static int __init hwrng_modinit(void)
return -ENOMEM;
}
+ /* ordered wq to mimic delay-polled kthread behaviour */
+ hwrng_wq = alloc_ordered_workqueue("hwrng",
+ WQ_FREEZABLE | /* prevent work from running during suspend/resume */
+ WQ_MEM_RECLAIM /* client drivers may need memory reclaim */
+ );
+ if (!hwrng_wq) {
+ kfree(rng_fillbuf);
+ kfree(rng_buffer);
+ return -ENOMEM;
+ }
+
ret = register_miscdev();
if (ret) {
+ destroy_workqueue(hwrng_wq);
kfree(rng_fillbuf);
kfree(rng_buffer);
}
+ INIT_DELAYED_WORK(&hwrng_fill_dwork, hwrng_fillfn);
+
return ret;
}
static void __exit hwrng_modexit(void)
{
+ destroy_workqueue(hwrng_wq);
mutex_lock(&rng_mutex);
BUG_ON(current_rng);
kfree(rng_buffer);
--
2.37.2.789.g6183377224-goog
Hi Sven,
On Thu, Sep 15, 2022 at 12:22:53AM +0000, Sven van Ashbrook wrote:
> add_hwgenerator_randomness() currently blocks until more entropy
> is needed. Move the blocking wait out of the function to the caller,
> by letting the function return the number of jiffies needed to block.
>
> This is done to prepare the function's sole kernel caller from a
> kthread to self-rearming delayed_work.
Isn't Dominik working on the same thing, but slightly different? I
recall he sent a patch recently, which looked pretty good, except it
just needed to be split up. I'm waiting for his v2. Does this build on
that?
Jason
Dominik, Jason,
On Thu, Sep 15, 2022 at 11:44 AM Jason A. Donenfeld <[email protected]> wrote:
>
> Isn't Dominik working on the same thing, but slightly different?
I don't believe Dominik is working on quite the same thing, but his
work will conflict with mine for sure:
https://lore.kernel.org/lkml/[email protected]/T/
What are the odds that two people are making changes to the hwrng
kthread at the same time?
I see two possible ways forward:
1. Dominik rebases his patch against mine (iff mine finds favour).
This may simplify his patch quite a bit, because the delayed_work
abstraction tends to have fewer footguns than kthread.
or
2. I rebase against Dominik's.
Both are fine with me, just let me know what you think.
-Sven
Hi Dominik,
On Fri, Sep 16, 2022 at 2:24 AM Dominik Brodowski
<[email protected]> wrote:
>
> Indeed, our patches address different issues. I'm fine with both approaches,
> i.e. my patches to be based on Sven's, or the other way round.
Sounds good! May I suggest then that you try to rebase your patch on top of
mine. With a bit of luck, it will become simpler, with fewer kthread related
footguns. I am available to help review.
If your patch becomes more complicated however, we'll work the other way
around.
How does that sound?
Sven
Hi Sven,
On Thu, Sep 15, 2022 at 1:22 AM Sven van Ashbrook <[email protected]> wrote:
> - if (!kthread_should_stop() && crng_ready())
> - schedule_timeout_interruptible(CRNG_RESEED_INTERVAL);
> + return crng_ready() ? CRNG_RESEED_INTERVAL : 0;
I was thinking the other day that under certain circumstances, it
would be nice if random.c could ask hwrng for more bytes NOW, rather
than waiting. With the code as it is currently, this could be
accomplished by having a completion event or something similar to
that. With your proposed change, now it's left up to the hwrng
interface to handle.
That's not the end of the world, but it does mean we'd have to come up
with a patch down the line that exports a hwrng function saying, "stop
the delays and schedule the worker NOW". Not impossible, just more
complex, as the state flow is now split across two places. Wondering
if you have any thoughts about this.
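If it helps the discussion: under this series' naming, such an export could be quite small. A purely illustrative sketch follows — hwrng_kick() is not an existing API, and it reuses the hwrng_wq / hwrng_fill_dwork names introduced by this series:

```c
/*
 * Illustrative sketch only -- hwrng_kick() does not exist today.
 * A "reseed NOW" hook exported to random.c could requeue the fill
 * worker with zero delay, overriding any pending throttle delay.
 */
void hwrng_kick(void)
{
	mod_delayed_work(hwrng_wq, &hwrng_fill_dwork, 0);
}
EXPORT_SYMBOL_GPL(hwrng_kick);
```

mod_delayed_work() with a zero delay either shortens the pending timer or queues the work immediately, so the state flow would still live in one place (the hwrng core) with random.c just poking it.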
The other thing that occurred to me when reading this patch in context
of the other one is that this sleep you're removing here is not the
only sleep in the call chain. Each hwrng driver can also sleep, and
many do, sometimes for a long time, blocking until there's data
available, which might happen after minutes in some cases. So maybe
that's something to think about in context of this patchset -- that
just moving this to a delayed worker might not actually fix the issue
you're having with sleeps.
Jason
On Mon, Sep 19, 2022 at 5:03 PM Sven van Ashbrook <[email protected]> wrote:
>
> On Fri, Sep 16, 2022 at 10:51 AM Jason A. Donenfeld <[email protected]> wrote:
> >
> > The other thing that occurred to me when reading this patch in context
> > of the other one is that this sleep you're removing here is not the
> > only sleep in the call chain. Each hwrng driver can also sleep, and
> > many do, sometimes for a long time, blocking until there's data
> > available, which might happen after minutes in some cases. So maybe
> > that's something to think about in context of this patchset -- that
> > just moving this to a delayed worker might not actually fix the issue
> > you're having with sleeps.
> >
>
> This is an excellent point. A look at tpm2_calc_ordinal_duration()
> reveals that tpm_transmit() may block for 300s at a time. So when
> we are using a WQ_FREEZABLE delayed_work, the PM may have to wait
> for up to 300s when draining the wq on suspend. That will introduce
> a lot of breakage in suspend/resume.
>
> Dominik: in light of this, please proceed with your patch, without
> rebasing it onto mine.
>
> + tpm maintainers Peter Huewe and Jarkko Sakkinen, a quick recap of
> the problem:
>
> - on ChromeOS we are seeing intermittent suspend/resume errors+warnings
> related to activity of the core's hwrng_fillfn. this kthread keeps
> running during suspend/resume. If this happens to kick off a bus (i2c)
> transaction while the bus driver is in suspend, this triggers
> a "Transfer while suspended" warning from the i2c core, followed by
> an error return:
>
> i2c_designware i2c_designware.1: Transfer while suspended
> tpm tpm0: i2c transfer failed (attempt 1/3): -108
> [ snip 10s of transfer failed attempts]
>
> - in 2019, Stephen Boyd made an attempt at fixing this by making the
> hwrng_fillfn kthread freezable. But a freezable thread requires
> different API calls for scheduling, waiting, and timeout. This
> generated regressions, so the solution had to be reverted.
>
> https://patchwork.kernel.org/project/linux-crypto/patch/[email protected]/
>
> - the current patch attempts to halt hwrng_fillfn during suspend by
> converting it to a self-rearming delayed_work. The PM drains all
> work before going into suspend. But, the potential minute-long
> blocking delays in tpm make this solution infeasible.
>
> Peter and Jarkko, can you think of a possible way forward to eliminate
> the warnings+errors?
>
> -Sven
By the way, there was a recent ath9k patch that kind of went to a
similar tune. The solution was to make ath9k's hwrng driver sleep
using a hwrng-specific sleep function that allowed the core framework
to cancel that sleep. Maybe that's a potential solution here, or
something similar to it. Might be worth taking a look at that patch. I
think it's in one of Herbert's trees.
Jason
On Mon, Sep 19, 2022 at 11:06 AM Jason A. Donenfeld <[email protected]> wrote:
>
> By the way, there was a recent ath9k patch that kind of went to a
> similar tune. [...] Maybe that's a potential solution here, or
> something similar to it.
Jason was kind enough to point me to the patch in question:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=36cb6494429bd64b27b7ff8b4af56f8e526da2b4
This patch limits the long sleep inside the fillfn kthread by
terminating the sleep on hwrng_unregister().
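For reference, the core of that approach is a framework-provided sleep that a completion can cut short. Roughly reconstructed from the commit above (the rng->dying completion and the exact signature are from memory and may differ from the tree):

```c
/*
 * Rough reconstruction of the hwrng_msleep() helper from the ath9k
 * commit referenced above: drivers sleep via the framework, and
 * hwrng_unregister() completes rng->dying to cut the sleep short.
 * Details are illustrative, not verified against the tree.
 */
long hwrng_msleep(struct hwrng *rng, unsigned int msecs)
{
	unsigned long timeout = msecs_to_jiffies(msecs) + 1;

	return wait_for_completion_interruptible_timeout(&rng->dying, timeout);
}
```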
This doesn't appear to be a viable approach for the suspend/resume issue:
- there is a great multitude of tpm_msleep()/msleep() calls in the tpm's
rng_get_data() path. They would all have to be made interruptible.
- even if interrupted successfully, the kthread must then be blocked until
  after resume. If so, what is the point of using a non-freezable kthread?