2010-01-29 21:45:17

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 0/7] Increase maximum Infiniband HCAs per-system

Justin Chen discovered that Linux "only" supports 32 IB cards in
a single system when testing a larger system with 40 cards and
discovered that OFED only reported 32 HCAs.

This patchset doubles the number of HCAs allowed per system in a
backwards-compatible manner.

Why only double? Well, it keeps the implementation very simple while
also keeping the memory increase at a minimum. If the future
requires even more HCAs per system, we can do a more complicated
implementation then.

I tested this by inserting 4 IB cards into a system, and then
changing IB_UVERBS_MAX_DEVICES = 2.

dl585g2:~ # modprobe ib_uverbs
dl585g2:~ # ls -l /dev/uverbs*
crw-rw---- 1 root root 231, 192 Jan 28 06:59 /dev/uverbs0
crw-rw---- 1 root root 231, 193 Jan 28 06:59 /dev/uverbs1
crw-rw---- 1 root root 249, 0 Jan 28 06:59 /dev/uverbs2
crw-rw---- 1 root root 249, 1 Jan 28 06:59 /dev/uverbs3

You can see that the uverbs devices are numbered in order, as one
would expect, and that devices 2 and 3 have different major numbers
while devices 0 and 1 retain the legacy, allocated major number.

I don't have access to a huge system to really test what
happens at device 33, but it should just work, right? :)

I am also unaware of any OFED changes required (if that's even
necessary), but then again, I'm just a simple kernel guy.

The final two patches are just something I discovered while using
pahole to verify my changes in converting the *cdev pointer to an
embedded struct. They don't save all that much, but they also don't
change very much code either, so why not?

Thanks,
/ac

---

Alex Chiang (7):
IB/uverbs: convert *cdev to cdev in struct ib_uverbs_device
IB/uverbs: remove dev_table
IB/uverbs: use stack variable 'devnum' in ib_uverbs_add_one
IB/uverbs: use stack variable 'base' in ib_uverbs_add_one
IB/uverbs: increase maximum devices supported
IB/uverbs: pack struct ib_uverbs_event_file tighter
IB/core: pack struct ib_device a little tighter


drivers/infiniband/core/uverbs.h | 11 ++-
drivers/infiniband/core/uverbs_main.c | 106 +++++++++++++++++++++------------
include/rdma/ib_verbs.h | 4 +
3 files changed, 76 insertions(+), 45 deletions(-)


2010-01-29 21:45:28

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 1/7] IB/uverbs: convert *cdev to cdev in struct ib_uverbs_device

Instead of storing a pointer to a cdev, embed the entire struct cdev.

This change allows us to use the container_of() macro in
ib_uverbs_open() in a future patch.

This change increases the size of struct ib_uverbs_device to
168 bytes across 3 cachelines from 80 bytes in 2 cachelines.
However, we rearrange the members so that everything fits into
the first cacheline except for the struct cdev. Finally, we don't
touch the cdev in any fastpaths, so this change shouldn't negatively
affect performance.

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs.h | 7 ++++---
drivers/infiniband/core/uverbs_main.c | 23 ++++++++++-------------
2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index b3ea958..e695f65 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -41,6 +41,7 @@
#include <linux/idr.h>
#include <linux/mutex.h>
#include <linux/completion.h>
+#include <linux/cdev.h>

#include <rdma/ib_verbs.h>
#include <rdma/ib_umem.h>
@@ -69,12 +70,12 @@

struct ib_uverbs_device {
struct kref ref;
+ int num_comp_vectors;
struct completion comp;
- int devnum;
- struct cdev *cdev;
struct device *dev;
struct ib_device *ib_dev;
- int num_comp_vectors;
+ int devnum;
+ struct cdev cdev;
};

struct ib_uverbs_event_file {
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 5f284ff..5da9a73 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -43,7 +43,6 @@
#include <linux/sched.h>
#include <linux/file.h>
#include <linux/mount.h>
-#include <linux/cdev.h>

#include <asm/uaccess.h>

@@ -761,17 +760,15 @@ static void ib_uverbs_add_one(struct ib_device *device)
uverbs_dev->ib_dev = device;
uverbs_dev->num_comp_vectors = device->num_comp_vectors;

- uverbs_dev->cdev = cdev_alloc();
- if (!uverbs_dev->cdev)
- goto err;
- uverbs_dev->cdev->owner = THIS_MODULE;
- uverbs_dev->cdev->ops = device->mmap ? &uverbs_mmap_fops : &uverbs_fops;
- kobject_set_name(&uverbs_dev->cdev->kobj, "uverbs%d", uverbs_dev->devnum);
- if (cdev_add(uverbs_dev->cdev, IB_UVERBS_BASE_DEV + uverbs_dev->devnum, 1))
+ cdev_init(&uverbs_dev->cdev, NULL);
+ uverbs_dev->cdev.owner = THIS_MODULE;
+ uverbs_dev->cdev.ops = device->mmap ? &uverbs_mmap_fops : &uverbs_fops;
+ kobject_set_name(&uverbs_dev->cdev.kobj, "uverbs%d", uverbs_dev->devnum);
+ if (cdev_add(&uverbs_dev->cdev, IB_UVERBS_BASE_DEV + uverbs_dev->devnum, 1))
goto err_cdev;

uverbs_dev->dev = device_create(uverbs_class, device->dma_device,
- uverbs_dev->cdev->dev, uverbs_dev,
+ uverbs_dev->cdev.dev, uverbs_dev,
"uverbs%d", uverbs_dev->devnum);
if (IS_ERR(uverbs_dev->dev))
goto err_cdev;
@@ -790,10 +787,10 @@ static void ib_uverbs_add_one(struct ib_device *device)
return;

err_class:
- device_destroy(uverbs_class, uverbs_dev->cdev->dev);
+ device_destroy(uverbs_class, uverbs_dev->cdev.dev);

err_cdev:
- cdev_del(uverbs_dev->cdev);
+ cdev_del(&uverbs_dev->cdev);
clear_bit(uverbs_dev->devnum, dev_map);

err:
@@ -811,8 +808,8 @@ static void ib_uverbs_remove_one(struct ib_device *device)
return;

dev_set_drvdata(uverbs_dev->dev, NULL);
- device_destroy(uverbs_class, uverbs_dev->cdev->dev);
- cdev_del(uverbs_dev->cdev);
+ device_destroy(uverbs_class, uverbs_dev->cdev.dev);
+ cdev_del(&uverbs_dev->cdev);

spin_lock(&map_lock);
dev_table[uverbs_dev->devnum] = NULL;

2010-01-29 21:45:35

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 6/7] IB/uverbs: pack struct ib_uverbs_event_file tighter

Eliminate some padding in the structure by rearranging the members.

sizeof(struct ib_uverbs_event_file) is now 72 bytes (from 80) and
more members now fit in the first cacheline.

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index e695f65..e54d9ac 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -80,13 +80,13 @@ struct ib_uverbs_device {

struct ib_uverbs_event_file {
struct kref ref;
+ int is_async;
struct ib_uverbs_file *uverbs_file;
spinlock_t lock;
+ int is_closed;
wait_queue_head_t poll_wait;
struct fasync_struct *async_queue;
struct list_head event_list;
- int is_async;
- int is_closed;
};

struct ib_uverbs_file {

2010-01-29 21:45:48

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 7/7] IB/core: pack struct ib_device a little tighter

A small change to reduce the size of ib_device to 1112 bytes
(from 1128).

Signed-off-by: Alex Chiang <[email protected]>
---

include/rdma/ib_verbs.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 09509ed..a585e0f 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -984,9 +984,9 @@ struct ib_device {
struct list_head event_handler_list;
spinlock_t event_handler_lock;

+ spinlock_t client_data_lock;
struct list_head core_list;
struct list_head client_data_list;
- spinlock_t client_data_lock;

struct ib_cache cache;
int *pkey_tbl_len;
@@ -1144,8 +1144,8 @@ struct ib_device {
IB_DEV_UNREGISTERED
} reg_state;

- u64 uverbs_cmd_mask;
int uverbs_abi_ver;
+ u64 uverbs_cmd_mask;

char node_desc[64];
__be64 node_guid;

2010-01-29 21:45:26

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 2/7] IB/uverbs: remove dev_table

dev_table's raison d'etre was to associate an inode back to a
struct ib_uverbs_device.

However, now that we've converted ib_uverbs_device to contain
an embedded cdev (instead of a *cdev), we can use the container_of()
macro and cast back to the containing device.

There's no longer any need for dev_table, so get rid of it.

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs_main.c | 24 +++++-------------------
1 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 5da9a73..3f11292 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -74,7 +74,6 @@ DEFINE_IDR(ib_uverbs_qp_idr);
DEFINE_IDR(ib_uverbs_srq_idr);

static DEFINE_SPINLOCK(map_lock);
-static struct ib_uverbs_device *dev_table[IB_UVERBS_MAX_DEVICES];
static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);

static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
@@ -616,14 +615,12 @@ static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
/*
* ib_uverbs_open() does not need the BKL:
*
- * - dev_table[] accesses are protected by map_lock, the
- * ib_uverbs_device structures are properly reference counted, and
+ * - the ib_uverbs_device structures are properly reference counted and
* everything else is purely local to the file being created, so
* races against other open calls are not a problem;
* - there is no ioctl method to race against;
- * - the device is added to dev_table[] as the last part of module
- * initialization, the open method will either immediately run
- * -ENXIO, or all required initialization will be done.
+ * - the open method will either immediately run -ENXIO, or all
+ * required initialization will be done.
*/
static int ib_uverbs_open(struct inode *inode, struct file *filp)
{
@@ -631,13 +628,10 @@ static int ib_uverbs_open(struct inode *inode, struct file *filp)
struct ib_uverbs_file *file;
int ret;

- spin_lock(&map_lock);
- dev = dev_table[iminor(inode) - IB_UVERBS_BASE_MINOR];
+ dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev);
if (dev)
kref_get(&dev->ref);
- spin_unlock(&map_lock);
-
- if (!dev)
+ else
return -ENXIO;

if (!try_module_get(dev->ib_dev->owner)) {
@@ -778,10 +772,6 @@ static void ib_uverbs_add_one(struct ib_device *device)
if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
goto err_class;

- spin_lock(&map_lock);
- dev_table[uverbs_dev->devnum] = uverbs_dev;
- spin_unlock(&map_lock);
-
ib_set_client_data(device, &uverbs_client, uverbs_dev);

return;
@@ -811,10 +801,6 @@ static void ib_uverbs_remove_one(struct ib_device *device)
device_destroy(uverbs_class, uverbs_dev->cdev.dev);
cdev_del(&uverbs_dev->cdev);

- spin_lock(&map_lock);
- dev_table[uverbs_dev->devnum] = NULL;
- spin_unlock(&map_lock);
-
clear_bit(uverbs_dev->devnum, dev_map);

kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);

2010-01-29 21:46:08

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 4/7] IB/uverbs: use stack variable 'base' in ib_uverbs_add_one

This change is not useful by itself, but sets us up for a future change
that allows us to support more than IB_UVERBS_MAX_DEVICES in a system.

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs_main.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index acae9ed..a5d441d 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -730,7 +730,7 @@ static CLASS_ATTR(abi_version, S_IRUGO, show_abi_version, NULL);

static void ib_uverbs_add_one(struct ib_device *device)
{
- int devnum;
+ int devnum, base;
struct ib_uverbs_device *uverbs_dev;

if (!device->alloc_ucontext)
@@ -750,6 +750,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
goto err;
}
uverbs_dev->devnum = devnum;
+ base = devnum + IB_UVERBS_BASE_DEV;
set_bit(devnum, dev_map);
spin_unlock(&map_lock);

@@ -760,7 +761,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
uverbs_dev->cdev.owner = THIS_MODULE;
uverbs_dev->cdev.ops = device->mmap ? &uverbs_mmap_fops : &uverbs_fops;
kobject_set_name(&uverbs_dev->cdev.kobj, "uverbs%d", uverbs_dev->devnum);
- if (cdev_add(&uverbs_dev->cdev, IB_UVERBS_BASE_DEV + devnum, 1))
+ if (cdev_add(&uverbs_dev->cdev, base, 1))
goto err_cdev;

uverbs_dev->dev = device_create(uverbs_class, device->dma_device,

2010-01-29 21:46:36

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 3/7] IB/uverbs: use stack variable 'devnum' in ib_uverbs_add_one

This change is not useful by itself, but it sets us up for a future
change that allows us to dynamically allocate device numbers in case
we have more than IB_UVERBS_MAX_DEVICES in the system.

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs_main.c | 12 +++++++-----
1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 3f11292..acae9ed 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -730,6 +730,7 @@ static CLASS_ATTR(abi_version, S_IRUGO, show_abi_version, NULL);

static void ib_uverbs_add_one(struct ib_device *device)
{
+ int devnum;
struct ib_uverbs_device *uverbs_dev;

if (!device->alloc_ucontext)
@@ -743,12 +744,13 @@ static void ib_uverbs_add_one(struct ib_device *device)
init_completion(&uverbs_dev->comp);

spin_lock(&map_lock);
- uverbs_dev->devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES);
- if (uverbs_dev->devnum >= IB_UVERBS_MAX_DEVICES) {
+ devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES);
+ if (devnum >= IB_UVERBS_MAX_DEVICES) {
spin_unlock(&map_lock);
goto err;
}
- set_bit(uverbs_dev->devnum, dev_map);
+ uverbs_dev->devnum = devnum;
+ set_bit(devnum, dev_map);
spin_unlock(&map_lock);

uverbs_dev->ib_dev = device;
@@ -758,7 +760,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
uverbs_dev->cdev.owner = THIS_MODULE;
uverbs_dev->cdev.ops = device->mmap ? &uverbs_mmap_fops : &uverbs_fops;
kobject_set_name(&uverbs_dev->cdev.kobj, "uverbs%d", uverbs_dev->devnum);
- if (cdev_add(&uverbs_dev->cdev, IB_UVERBS_BASE_DEV + uverbs_dev->devnum, 1))
+ if (cdev_add(&uverbs_dev->cdev, IB_UVERBS_BASE_DEV + devnum, 1))
goto err_cdev;

uverbs_dev->dev = device_create(uverbs_class, device->dma_device,
@@ -781,7 +783,7 @@ err_class:

err_cdev:
cdev_del(&uverbs_dev->cdev);
- clear_bit(uverbs_dev->devnum, dev_map);
+ clear_bit(devnum, dev_map);

err:
kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);

2010-01-29 21:46:48

by Alex Chiang

[permalink] [raw]
Subject: [PATCH 5/7] IB/uverbs: increase maximum devices supported

Some large systems may support more than IB_UVERBS_MAX_DEVICES
(currently 32).

This change allows us to support more devices in a backwards-compatible
manner. The first IB_UVERBS_MAX_DEVICES keep the same major/minor
device numbers that they've always had.

If there are more than IB_UVERBS_MAX_DEVICES, we then dynamically
request a new major device number (new minors start at 0).

This change increases the maximum number of HCAs to 64 (from 32).

Signed-off-by: Alex Chiang <[email protected]>
---

drivers/infiniband/core/uverbs_main.c | 56 +++++++++++++++++++++++++++++----
1 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index a5d441d..0555460 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -728,6 +728,34 @@ static ssize_t show_abi_version(struct class *class, char *buf)
}
static CLASS_ATTR(abi_version, S_IRUGO, show_abi_version, NULL);

+static dev_t oflo_maj;
+static DECLARE_BITMAP(oflo_map, IB_UVERBS_MAX_DEVICES);
+
+/*
+ * If we have more than IB_UVERBS_MAX_DEVICES, dynamically overflow by
+ * requesting a new major number and doubling the number of max devices we
+ * support. It's stupid, but simple.
+ */
+static int find_oflo_devnum(void)
+{
+ int ret;
+
+ if (!oflo_maj) {
+ ret = alloc_chrdev_region(&oflo_maj, 0, IB_UVERBS_MAX_DEVICES,
+ "infiniband_verbs");
+ if (ret) {
+ printk(KERN_ERR "user_verbs: couldn't register dynamic device number\n");
+ return ret;
+ }
+ }
+
+ ret = find_first_zero_bit(oflo_map, IB_UVERBS_MAX_DEVICES);
+ if (ret >= IB_UVERBS_MAX_DEVICES)
+ return -1;
+
+ return ret;
+}
+
static void ib_uverbs_add_one(struct ib_device *device)
{
int devnum, base;
@@ -747,11 +775,19 @@ static void ib_uverbs_add_one(struct ib_device *device)
devnum = find_first_zero_bit(dev_map, IB_UVERBS_MAX_DEVICES);
if (devnum >= IB_UVERBS_MAX_DEVICES) {
spin_unlock(&map_lock);
- goto err;
+ devnum = find_oflo_devnum();
+ if (devnum < 0)
+ goto err;
+
+ spin_lock(&map_lock);
+ uverbs_dev->devnum = devnum + IB_UVERBS_MAX_DEVICES;
+ base = devnum + oflo_maj;
+ set_bit(devnum, oflo_map);
+ } else {
+ uverbs_dev->devnum = devnum;
+ base = devnum + IB_UVERBS_BASE_DEV;
+ set_bit(devnum, dev_map);
}
- uverbs_dev->devnum = devnum;
- base = devnum + IB_UVERBS_BASE_DEV;
- set_bit(devnum, dev_map);
spin_unlock(&map_lock);

uverbs_dev->ib_dev = device;
@@ -784,7 +820,10 @@ err_class:

err_cdev:
cdev_del(&uverbs_dev->cdev);
- clear_bit(devnum, dev_map);
+ if (oflo_maj)
+ clear_bit(devnum, oflo_map);
+ else
+ clear_bit(devnum, dev_map);

err:
kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);
@@ -804,7 +843,10 @@ static void ib_uverbs_remove_one(struct ib_device *device)
device_destroy(uverbs_class, uverbs_dev->cdev.dev);
cdev_del(&uverbs_dev->cdev);

- clear_bit(uverbs_dev->devnum, dev_map);
+ if (uverbs_dev->devnum < IB_UVERBS_MAX_DEVICES)
+ clear_bit(uverbs_dev->devnum, dev_map);
+ else
+ clear_bit(uverbs_dev->devnum - IB_UVERBS_MAX_DEVICES, oflo_map);

kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);
wait_for_completion(&uverbs_dev->comp);
@@ -894,6 +936,8 @@ static void __exit ib_uverbs_cleanup(void)
unregister_filesystem(&uverbs_event_fs);
class_destroy(uverbs_class);
unregister_chrdev_region(IB_UVERBS_BASE_DEV, IB_UVERBS_MAX_DEVICES);
+ if (oflo_maj)
+ unregister_chrdev_region(oflo_maj, IB_UVERBS_MAX_DEVICES);
idr_destroy(&ib_uverbs_pd_idr);
idr_destroy(&ib_uverbs_mr_idr);
idr_destroy(&ib_uverbs_mw_idr);

2010-01-29 22:54:42

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH 0/7] Increase maximum Infiniband HCAs per-system

> Justin Chen discovered that Linux "only" supports 32 IB cards in
> a single system when testing a larger system with 40 cards and
> discovered that OFED only reported 32 HCAs.
>
> This patchset doubles the number of HCAs allowed per system in a
> backwards-compatible manner.

Looks like a really great start, some nice cleanups as well the added
functionality. I've been meaning to use pahole for a while...

Have you considered drivers/infiniband/core/user_mad.c and ucm.c? I
think user_mad.c is somewhat more important, as that is what allows an
adapter to be used for running the SM. So I think we're still left with
some potential issues around lots of adapters in one system. (I think
use of ucm by real apps is minimal to nonexistent, but someday we should
deal with that too)

- R.

2010-01-29 23:41:48

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 0/7] Increase maximum Infiniband HCAs per-system

* Roland Dreier <[email protected]>:
>
> Have you considered drivers/infiniband/core/user_mad.c and ucm.c?

Ah, darn. I had not considered those drivers. You're gonna make
me learn a lot more about IB than I'd originally intended. ;)

> I think user_mad.c is somewhat more important, as that is what
> allows an adapter to be used for running the SM. So I think
> we're still left with some potential issues around lots of
> adapters in one system. (I think use of ucm by real apps is
> minimal to nonexistent, but someday we should deal with that
> too)

Ok, a quick glance through those drivers shows:

enum {
IB_UMAD_MAX_PORTS = 64,
IB_UMAD_MAX_AGENTS = 32,

IB_UMAD_MAJOR = 231,
IB_UMAD_MINOR_BASE = 0
};

and

enum {
IB_UCM_MAJOR = 231,
IB_UCM_BASE_MINOR = 224,
IB_UCM_MAX_DEVICES = 32
};

They're all sharing the same major number, so they'll all have to
get the same treatment as the uverbs driver wrt overflow (to
prevent minor number overlap).

What I'm a little unsure of is, does IB_UMAD_MAX_AGENTS need to
double too? We don't export the agent id in the filesystem
anywhere, but we do give it to the user via an ioctl. That's just
used for book keeping purposes but...

Currently, there are 2x as many ports as there are agents. Do we
want to keep that ratio, or would it be ok to have 4x as many
ports as there are agents?

Thanks (and sorry for the n00b questions).

/ac

2010-01-30 07:13:06

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH 0/7] Increase maximum Infiniband HCAs per-system

> What I'm a little unsure of is, does IB_UMAD_MAX_AGENTS need to
> double too? We don't export the agent id in the filesystem
> anywhere, but we do give it to the user via an ioctl. That's just
> used for book keeping purposes but...

I don't think so. That is a limit on "agents" registered per open file,
so it is orthogonal to the number of devices.

- R.

2010-02-01 12:55:57

by Hal Rosenstock

[permalink] [raw]
Subject: Re: [PATCH 0/7] Increase maximum Infiniband HCAs per-system

On Fri, Jan 29, 2010 at 6:41 PM, Alex Chiang <[email protected]> wrote:
> * Roland Dreier <[email protected]>:
>>
>> Have you considered drivers/infiniband/core/user_mad.c and ucm.c?
>
> Ah, darn. I had not considered those drivers. You're gonna make
> me learn a lot more about IB than I'd originally intended. ;)
>
>> I think user_mad.c is somewhat more important, as that is what
>> allows an adapter to be used for running the SM.  So I think
>> we're still left with some potential issues around lots of
>> adapters in one system.  (I think use of ucm by real apps is
>> minimal to nonexistent, but someday we should deal with that
>> too)
>
> Ok, a quick glance through those drivers shows:
>
>        enum {
>                IB_UMAD_MAX_PORTS  = 64,
>                IB_UMAD_MAX_AGENTS = 32,
>
>                IB_UMAD_MAJOR      = 231,
>                IB_UMAD_MINOR_BASE = 0
>        };
>
> and
>
>        enum {
>                IB_UCM_MAJOR = 231,
>                IB_UCM_BASE_MINOR = 224,
>                IB_UCM_MAX_DEVICES = 32
>        };
>
> They're all sharing the same major number, so they'll all have to
> get the same treatment as the uverbs driver wrt overflow (to
> prevent minor number overlap).
>
> What I'm a little unsure of is, does IB_UMAD_MAX_AGENTS need to
> double too? We don't export the agent id in the filesystem
> anywhere, but we do give it to the user via an ioctl. That's just
> used for book keeping purposes but...
>
> Currently, there are 2x as many ports as there are agents. Do we
> want to keep that ratio, or would it be ok to have 4x as many
> ports as there are agents?

I think it's 2x as many ports as devices (based on the common HCAs
being 2 ports max). This should be maintained as that is still the
case.

-- Hal

>
> Thanks (and sorry for the n00b questions).
>
> /ac
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>