2007-11-17 19:30:11

by Patrick McHardy

[permalink] [raw]
Subject: [RFC HIFN 00/02]: RNG support

These two patches add support for using the HIFN rng.

The first patch improves the PLL initialization a bit by making the
reference clock configurable and its speed known to the driver, which
is needed to calculate the amount of time to wait between two RNG reads.
Since there is no way to find out the frequency reliably (especially for
the external clock), it adds some sane looking defaults and a module
parameter to override it. Suggestions how to improve this are welcome.

The second patch adds hw_random support. The ugly part is finding out
when to allow reads from the RNG. It currently translates the public
key engine clock cycles to CPU cycles based on a 4GHz CPU and uses
get_cycles(). The problems with this are obvious, it only works on CPUs
that actually have some kind of cycle counter, has problems with
unsynchronized TSCs and the 4GHz assumption is not very nice either,
but I was reluctant to use ktime for this since it seems rather
expensive to call ktime_get once per 4 bytes of random. Suggestion
for improvement of this are also welcome :)

Running rngtest on the random number generator indicates that it works
properly, with an average failure ratio of about 1:1000 at ~2.5mbit.


drivers/crypto/hifn_795x.c | 158 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 156 insertions(+), 2 deletions(-)

Patrick McHardy (2):
[HIFN]: Improve PLL initialization
[HIFN]: Add support for using the random number generator


2007-11-17 19:30:11

by Patrick McHardy

[permalink] [raw]
Subject: [RFC HIFN 01/02]: Improve PLL initialization

[HIFN]: Improve PLL initialization

The current PLL initalization has a number of deficiencies:

- uses fixed multiplier of 8, which overclocks the chip when using a
reference clock that operates at frequencies above 33MHz. According
to a comment in the BSD source, this is true for the external clock
on almost all every board.

- writes to a reserved bit

- doesn't allow to use the PCI clock

This patch adds a module parameter to specify the reference clock
(pci or external) and its frequency and uses that to calculate the
optimum multiplier to reach the maximal speed. By default it uses
the external clock and assumes a speed of 66MHz, which effectively
halfs the frequency currently used.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit 3ca22e0c464bc84bffdf63d65c1094b2fed78bff
tree 25f8eec10f986304415f7248255cfa813adf56fc
parent c6e92f8a894cc97b3176ab6111ec3d5e20ef3583
author Patrick McHardy <[email protected]> Sat, 17 Nov 2007 20:15:39 +0100
committer Patrick McHardy <[email protected]> Sat, 17 Nov 2007 20:15:39 +0100

drivers/crypto/hifn_795x.c | 93 +++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index fec32aa..81306b7 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -19,6 +19,7 @@

#include <linux/kernel.h>
#include <linux/module.h>
+#include <linux/moduleparam.h>
#include <linux/mod_devicetable.h>
#include <linux/interrupt.h>
#include <linux/pci.h>
@@ -47,6 +48,11 @@
#define dprintk(f, a...) do {} while (0)
#endif

+static char hifn_pll_ref[sizeof("extNNN")] = "ext66";
+module_param_string(pll_ref, hifn_pll_ref, sizeof(hifn_pll_ref), 0444);
+MODULE_PARM_DESC(pll_ref,
+ "PLL reference clock (pci[freq] or ext[freq], default ext66)");
+
static atomic_t hifn_dev_number;

#define ACRYPTO_OP_DECRYPT 0
@@ -286,7 +292,26 @@ static atomic_t hifn_dev_number;
#define HIFN_DMACNFG_DMARESET 0x00000002 /* DMA Reset # */
#define HIFN_DMACNFG_MSTRESET 0x00000001 /* Master Reset # */

-#define HIFN_PLL_7956 0x00001d18 /* 7956 PLL config value */
+/* PLL configuraton register */
+#define HIFN_PLL_REF_CLK_HBI 0x00000000 /* HBI reference clock */
+#define HIFN_PLL_REF_CLK_PLL 0x00000001 /* PLL reference clock */
+#define HIFN_PLL_BP 0x00000002 /* Reference clock bypass */
+#define HIFN_PLL_PK_CLK_HBI 0x00000000 /* PK engine HBI clock */
+#define HIFN_PLL_PK_CLK_PLL 0x00000008 /* PK engine PLL clock */
+#define HIFN_PLL_PE_CLK_HBI 0x00000000 /* PE engine HBI clock */
+#define HIFN_PLL_PE_CLK_PLL 0x00000010 /* PE engine PLL clock */
+#define HIFN_PLL_RESERVED_1 0x00000400 /* Reserved bit, must be 1 */
+#define HIFN_PLL_ND_SHIFT 11 /* Clock multiplier shift */
+#define HIFN_PLL_ND_MULT_2 0x00000000 /* PLL clock multiplier 2 */
+#define HIFN_PLL_ND_MULT_4 0x00000800 /* PLL clock multiplier 4 */
+#define HIFN_PLL_ND_MULT_6 0x00001000 /* PLL clock multiplier 6 */
+#define HIFN_PLL_ND_MULT_8 0x00001800 /* PLL clock multiplier 8 */
+#define HIFN_PLL_ND_MULT_10 0x00002000 /* PLL clock multiplier 10 */
+#define HIFN_PLL_ND_MULT_12 0x00002800 /* PLL clock multiplier 12 */
+#define HIFN_PLL_IS_1_8 0x00000000 /* charge pump (mult. 1-8) */
+#define HIFN_PLL_IS_9_12 0x00010000 /* charge pump (mult. 9-12) */
+
+#define HIFN_PLL_FCK_MAX 266 /* Maximum PLL frequency */

/* Public key reset register (HIFN_1_PUB_RESET) */
#define HIFN_PUBRST_RESET 0x00000001 /* reset public/rng unit */
@@ -871,6 +896,48 @@ static void hifn_init_dma(struct hifn_device *dev)
dma->cmdk = dma->srck = dma->dstk = dma->resk = 0;
}

+/*
+ * Initialize the PLL. We need to know the frequency of the reference clock
+ * to calculate the optimal multiplier. For PCI we assume 66Mhz, since that
+ * allows us to operate without the risk of overclocking the chip. If it
+ * actually uses 33MHz, the chip will operate at half the speed, this can be
+ * overriden by specifying the frequency as module parameter (pci33).
+ *
+ * Unfortunately the PCI clock is not very suitable since the HIFN needs a
+ * stable clock and the PCI clock frequency may vary, so the default is the
+ * external clock. There is no way to find out its frequency, we default to
+ * 66Mhz since according to Mike Ham of HiFn, almost every board in existence
+ * has an external crystal populated at 66Mhz.
+ */
+static void hifn_init_pll(struct hifn_device *dev)
+{
+ u32 pllcfg = HIFN_1_PLL | HIFN_PLL_RESERVED_1;
+ unsigned int freq, m;
+
+ pllcfg = HIFN_1_PLL | HIFN_PLL_RESERVED_1 |
+ HIFN_PLL_PK_CLK_PLL | HIFN_PLL_PE_CLK_PLL;
+
+ if (strncmp(hifn_pll_ref, "ext", 3) == 0)
+ pllcfg |= HIFN_PLL_REF_CLK_PLL;
+ else
+ pllcfg |= HIFN_PLL_REF_CLK_HBI;
+
+ if (hifn_pll_ref[3] != '\0')
+ freq = simple_strtoul(hifn_pll_ref + 3, NULL, 10);
+ else
+ freq = 66;
+
+ m = HIFN_PLL_FCK_MAX / freq;
+
+ pllcfg |= (m / 2 - 1) << HIFN_PLL_ND_SHIFT;
+ if (m <= 8)
+ pllcfg |= HIFN_PLL_IS_1_8;
+ else
+ pllcfg |= HIFN_PLL_IS_9_12;
+
+ hifn_write_1(dev, HIFN_1_PLL, pllcfg);
+}
+
static void hifn_init_registers(struct hifn_device *dev)
{
u32 dptr = dev->desc_dma;
@@ -938,7 +1005,7 @@ static void hifn_init_registers(struct hifn_device *dev)
#else
hifn_write_0(dev, HIFN_0_PUCNFG, 0x10342);
#endif
- hifn_write_1(dev, HIFN_1_PLL, HIFN_PLL_7956);
+ hifn_init_pll(dev);

hifn_write_0(dev, HIFN_0_PUISR, HIFN_PUISR_DSTOVER);
hifn_write_1(dev, HIFN_1_DMA_CNFG, HIFN_DMACNFG_MSTRESET |
@@ -2621,8 +2688,30 @@ static struct pci_driver hifn_pci_driver = {

static int __devinit hifn_init(void)
{
+ unsigned int freq;
int err;

+ if (strncmp(hifn_pll_ref, "ext", 3) &&
+ strncmp(hifn_pll_ref, "pci", 3)) {
+ printk(KERN_ERR "hifn795x: invalid pll_ref clock, must be"
+ "pci or ext");
+ return -EINVAL;
+ }
+
+ /*
+ * For the 7955/7956 the reference clock frequency must be in the
+ * range of 20MHz-100Mhz. For the 7954 the upper bound is 66.67Mhz,
+ * but this chip is currently not supported.
+ */
+ if (hifn_pll_ref[3] != '\0') {
+ freq = simple_strtoul(hifn_pll_ref + 3, NULL, 10);
+ if (freq < 20 || freq > 100) {
+ printk(KERN_ERR "hifn795x: invalid pll_ref frequency, "
+ "must be in the range of 20-100");
+ return -EINVAL;
+ }
+ }
+
err = pci_register_driver(&hifn_pci_driver);
if (err < 0) {
dprintk("Failed to register PCI driver for %s device.\n",

2007-11-17 19:30:13

by Patrick McHardy

[permalink] [raw]
Subject: [RFC HIFN 02/02]: Add support for using the random number generator

[HIFN]: Add support for using the random number generator

Signed-off-by: Patrick McHardy <[email protected]>

---
commit ad6c98ed1e38edf4da775780b59a0faf56ff42a7
tree 7c733d13a387e18bace04fe904bed9937c0bb628
parent 3ca22e0c464bc84bffdf63d65c1094b2fed78bff
author Patrick McHardy <[email protected]> Sat, 17 Nov 2007 20:15:40 +0100
committer Patrick McHardy <[email protected]> Sat, 17 Nov 2007 20:15:40 +0100

drivers/crypto/hifn_795x.c | 65 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 65 insertions(+), 0 deletions(-)

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index 81306b7..fe5289a 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -31,6 +31,7 @@
#include <linux/highmem.h>
#include <linux/interrupt.h>
#include <linux/crypto.h>
+#include <linux/hw_random.h>

#include <crypto/algapi.h>
#include <crypto/des.h>
@@ -458,6 +459,14 @@ struct hifn_device

struct crypto_queue queue;
struct list_head alg_list;
+
+ unsigned int pk_clk_freq;
+
+#if defined(CONFIG_HW_RANDOM) || defined(CONFIG_HW_RANDOM_MODULE)
+ cycles_t rng_wait_cycles;
+ cycles_t rngtime;
+ struct hwrng rng;
+#endif
};

#define HIFN_D_LENGTH 0x0000ffff
@@ -785,6 +794,49 @@ static struct pci2id {
}
};

+#if defined(CONFIG_HW_RANDOM) || defined(CONFIG_HW_RANDOM_MODULE)
+static int hifn_rng_data_present(struct hwrng *rng)
+{
+ struct hifn_device *dev = (struct hifn_device *)rng->priv;
+
+ return get_cycles() - dev->rngtime > dev->rng_wait_cycles;
+}
+
+static int hifn_rng_data_read(struct hwrng *rng, u32 *data)
+{
+ struct hifn_device *dev = (struct hifn_device *)rng->priv;
+
+ *data = hifn_read_1(dev, HIFN_1_RNG_DATA);
+ dev->rngtime = get_cycles();
+ return 4;
+}
+
+static int hifn_register_rng(struct hifn_device *dev)
+{
+ /*
+ * We must wait at least 256 Pk_clk cycles between two reads of
+ * the rng. Calculate corresponding amount CPU cycles based on
+ * a CPU speed of 4GHz.
+ */
+ dev->rng_wait_cycles = 256 * DIV_ROUND_UP(4000U, dev->pk_clk_freq);
+
+ dev->rng.name = dev->name;
+ dev->rng.data_present = hifn_rng_data_present,
+ dev->rng.data_read = hifn_rng_data_read,
+ dev->rng.priv = (unsigned long)dev;
+
+ return hwrng_register(&dev->rng);
+}
+
+static void hifn_unregister_rng(struct hifn_device *dev)
+{
+ hwrng_unregister(&dev->rng);
+}
+#else
+#define hifn_register_rng(dev) 0
+#define hifn_unregister_rng(dev)
+#endif
+
static int hifn_init_pubrng(struct hifn_device *dev)
{
int i;
@@ -936,6 +988,14 @@ static void hifn_init_pll(struct hifn_device *dev)
pllcfg |= HIFN_PLL_IS_9_12;

hifn_write_1(dev, HIFN_1_PLL, pllcfg);
+
+ /*
+ * The Fpk_clk runs at half the total speed. Its frequency is needed to
+ * calculate the minimum time between two reads of the rng. Since 33MHz
+ * is actually 33.333... we overestimate the frequency here, resulting
+ * in slightly larger intervals.
+ */
+ dev->pk_clk_freq = (freq + 1) * m / 2;
}

static void hifn_init_registers(struct hifn_device *dev)
@@ -2597,6 +2657,10 @@ static int hifn_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (err)
goto err_out_stop_device;

+ err = hifn_register_rng(dev);
+ if (err)
+ goto err_out_stop_device;
+
INIT_DELAYED_WORK(&dev->work, hifn_work);
schedule_delayed_work(&dev->work, HZ);

@@ -2647,6 +2711,7 @@ static void hifn_remove(struct pci_dev *pdev)
flush_scheduled_work();

hifn_unregister_alg(dev);
+ hifn_unregister_rng(dev);
hifn_reset_dma(dev, 1);
hifn_stop_device(dev);


2007-11-17 19:53:32

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

Hi Patrick.

On Sat, Nov 17, 2007 at 08:30:09PM +0100, Patrick McHardy ([email protected]) wrote:
> These two patches add support for using the HIFN rng.

Great!

> The first patch improves the PLL initialization a bit by making the
> reference clock configurable and its speed known to the driver, which
> is needed to calculate the amount of time to wait between two RNG reads.
> Since there is no way to find out the frequency reliably (especially for
> the external clock), it adds some sane looking defaults and a module
> parameter to override it. Suggestions how to improve this are welcome.

I'm not sure it is possible to get it cleaner anyway.

> The second patch adds hw_random support. The ugly part is finding out
> when to allow reads from the RNG. It currently translates the public
> key engine clock cycles to CPU cycles based on a 4GHz CPU and uses
> get_cycles(). The problems with this are obvious, it only works on CPUs
> that actually have some kind of cycle counter, has problems with
> unsynchronized TSCs and the 4GHz assumption is not very nice either,
> but I was reluctant to use ktime for this since it seems rather
> expensive to call ktime_get once per 4 bytes of random. Suggestion
> for improvement of this are also welcome :)

It will not work on arm, but I'm not sure this is relevant...
Another option is to directly access xtime without all wrappers in the
ktime_get().

> Running rngtest on the random number generator indicates that it works
> properly, with an average failure ratio of about 1:1000 at ~2.5mbit.
>
>
> drivers/crypto/hifn_795x.c | 158 +++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 156 insertions(+), 2 deletions(-)
>
> Patrick McHardy (2):
> [HIFN]: Improve PLL initialization
> [HIFN]: Add support for using the random number generator

Ack both patches.
Thanks a lot Patrick.

--
Evgeniy Polyakov

2007-11-17 20:04:48

by Patrick McHardy

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

Evgeniy Polyakov wrote:
> On Sat, Nov 17, 2007 at 08:30:09PM +0100, Patrick McHardy ([email protected]) wrote:
>
>> The second patch adds hw_random support. The ugly part is finding out
>> when to allow reads from the RNG. It currently translates the public
>> key engine clock cycles to CPU cycles based on a 4GHz CPU and uses
>> get_cycles(). The problems with this are obvious, it only works on CPUs
>> that actually have some kind of cycle counter, has problems with
>> unsynchronized TSCs and the 4GHz assumption is not very nice either,
>> but I was reluctant to use ktime for this since it seems rather
>> expensive to call ktime_get once per 4 bytes of random. Suggestion
>> for improvement of this are also welcome :)
>
> It will not work on arm, but I'm not sure this is relevant...
> Another option is to directly access xtime without all wrappers in the
> ktime_get().


I'll look into that, thanks.

>> Running rngtest on the random number generator indicates that it works
>> properly, with an average failure ratio of about 1:1000 at ~2.5mbit.
>>
>>
>> drivers/crypto/hifn_795x.c | 158 +++++++++++++++++++++++++++++++++++++++++++-
>> 1 files changed, 156 insertions(+), 2 deletions(-)
>>
>> Patrick McHardy (2):
>> [HIFN]: Improve PLL initialization
>> [HIFN]: Add support for using the random number generator
>
> Ack both patches.


Just in case, they shouldn't be applied yet, I forgot one minor bit
when splitting them (discarding the first value read from the RNG).

The PLL initialization also doesn't seem to follow the specified way
from the documentation, I want to have another look at this.

2007-11-18 03:14:15

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

On Sat, Nov 17, 2007 at 07:53:25PM +0000, Evgeniy Polyakov wrote:
>
> > The second patch adds hw_random support. The ugly part is finding out
> > when to allow reads from the RNG. It currently translates the public
> > key engine clock cycles to CPU cycles based on a 4GHz CPU and uses
> > get_cycles(). The problems with this are obvious, it only works on CPUs
> > that actually have some kind of cycle counter, has problems with
> > unsynchronized TSCs and the 4GHz assumption is not very nice either,
> > but I was reluctant to use ktime for this since it seems rather
> > expensive to call ktime_get once per 4 bytes of random. Suggestion
> > for improvement of this are also welcome :)
>
> It will not work on arm, but I'm not sure this is relevant...
> Another option is to directly access xtime without all wrappers in the
> ktime_get().

Is it really that bad? You're most likely going to be spinning
for 10us in udelay() or worse busy-looping anyway so I'm not sure
if we need to worry about that overhead too much.

Otherwise these patches look pretty nice to me. Thanks Patrick!

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-11-18 03:30:43

by Patrick McHardy

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

Herbert Xu wrote:
> On Sat, Nov 17, 2007 at 07:53:25PM +0000, Evgeniy Polyakov wrote:
>
>>> The second patch adds hw_random support. The ugly part is finding out
>>> when to allow reads from the RNG. It currently translates the public
>>> key engine clock cycles to CPU cycles based on a 4GHz CPU and uses
>>> get_cycles(). The problems with this are obvious, it only works on CPUs
>>> that actually have some kind of cycle counter, has problems with
>>> unsynchronized TSCs and the 4GHz assumption is not very nice either,
>>> but I was reluctant to use ktime for this since it seems rather
>>> expensive to call ktime_get once per 4 bytes of random. Suggestion
>>> for improvement of this are also welcome :)
>>>
>> It will not work on arm, but I'm not sure this is relevant...
>> Another option is to directly access xtime without all wrappers in the
>> ktime_get().
>>
>
> Is it really that bad? You're most likely going to be spinning
> for 10us in udelay() or worse busy-looping anyway so I'm not sure
> if we need to worry about that overhead too much.
>

Thats a good point, the busy-waiting makes any overhead pretty
much irrelevant (10us is most like much more than ktime_get())
so I'm going to try using ktime_get. xtime doesn't seem suitable
since its only updated on timer interrupts.

On a related issue, I think the rng interface is not very suitable
for chips like HIFN that have a constant random bandwidth, it would
make a lot more sense to return the time to wait to the core, instead
of waiting 10us in all cases. 256 cycles at a speed of 266MHz comes
down to 0.96us, so we're waiting about 10 times as long as necessary.
Since its busy waiting anyway, I'd think that from a performance POV
constant polling or returning the exact amount of time would be more
reasonable.
>
> Otherwise these patches look pretty nice to me. Thanks Patrick!

Fresh patches coming up tommorrow :)

2007-11-18 04:04:22

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

On Sun, Nov 18, 2007 at 04:30:40AM +0100, Patrick McHardy wrote:
>
> On a related issue, I think the rng interface is not very suitable
> for chips like HIFN that have a constant random bandwidth, it would
> make a lot more sense to return the time to wait to the core, instead
> of waiting 10us in all cases. 256 cycles at a speed of 266MHz comes
> down to 0.96us, so we're waiting about 10 times as long as necessary.
> Since its busy waiting anyway, I'd think that from a performance POV
> constant polling or returning the exact amount of time would be more
> reasonable.

I agree, a better interface would be to let the hardware do the
blocking where necessary.

Michael, what do you think about this?

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-11-18 04:04:37

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

On Sun, Nov 18, 2007 at 12:04:01PM +0800, Herbert Xu wrote:
> On Sun, Nov 18, 2007 at 04:30:40AM +0100, Patrick McHardy wrote:
> >
> > On a related issue, I think the rng interface is not very suitable
> > for chips like HIFN that have a constant random bandwidth, it would
> > make a lot more sense to return the time to wait to the core, instead
> > of waiting 10us in all cases. 256 cycles at a speed of 266MHz comes
> > down to 0.96us, so we're waiting about 10 times as long as necessary.
> > Since its busy waiting anyway, I'd think that from a performance POV
> > constant polling or returning the exact amount of time would be more
> > reasonable.
>
> I agree, a better interface would be to let the hardware do the
> blocking where necessary.

I meant the hardware driver of course.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-11-18 10:28:36

by Michael Büsch

[permalink] [raw]
Subject: Re: [RFC HIFN 00/02]: RNG support

On Sunday 18 November 2007 05:04:01 Herbert Xu wrote:
> On Sun, Nov 18, 2007 at 04:30:40AM +0100, Patrick McHardy wrote:
> >
> > On a related issue, I think the rng interface is not very suitable
> > for chips like HIFN that have a constant random bandwidth, it would
> > make a lot more sense to return the time to wait to the core, instead
> > of waiting 10us in all cases. 256 cycles at a speed of 266MHz comes
> > down to 0.96us, so we're waiting about 10 times as long as necessary.
> > Since its busy waiting anyway, I'd think that from a performance POV
> > constant polling or returning the exact amount of time would be more
> > reasonable.
>
> I agree, a better interface would be to let the hardware do the
> blocking where necessary.
>
> Michael, what do you think about this?

Patches are welcome. ;)

--
Greetings Michael.