2007-12-14 00:06:16

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.

TBD: Do we need the ioctl interface to sysfs or get the type attribute
through a different sysfs file. And then actually specify the attribute
while doing pci_mmap_page_range ;-)

And when this interface is in place, X server has to use this interface for WC
mapping.

Signed-off-by: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Suresh Siddha <[email protected]>
---

Index: linux-2.6.24-rc4/fs/sysfs/bin.c
===================================================================
--- linux-2.6.24-rc4.orig/fs/sysfs/bin.c 2007-12-11 16:23:26.000000000 -0800
+++ linux-2.6.24-rc4/fs/sysfs/bin.c 2007-12-11 16:32:01.000000000 -0800
@@ -221,6 +221,19 @@
return 0;
}

+static int ioctl(struct inode *i, struct file *f, unsigned cmd,
+ unsigned long arg)
+{
+ struct sysfs_dirent *attr_sd = f->f_path.dentry->d_fsdata;
+ struct bin_attribute *attr = attr_sd->s_bin_attr.bin_attr;
+ struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+
+ if (!attr->ioctl)
+ return -EINVAL;
+
+ return attr->ioctl(kobj, attr, cmd, arg);
+}
+
const struct file_operations bin_fops = {
.read = read,
.write = write,
@@ -228,6 +241,7 @@
.llseek = generic_file_llseek,
.open = open,
.release = release,
+ .ioctl = ioctl,
};

/**
Index: linux-2.6.24-rc4/include/linux/sysfs.h
===================================================================
--- linux-2.6.24-rc4.orig/include/linux/sysfs.h 2007-12-11 16:23:26.000000000 -0800
+++ linux-2.6.24-rc4/include/linux/sysfs.h 2007-12-11 16:29:07.000000000 -0800
@@ -69,6 +69,8 @@
char *, loff_t, size_t);
int (*mmap)(struct kobject *, struct bin_attribute *attr,
struct vm_area_struct *vma);
+ int (*ioctl)(struct kobject *, struct bin_attribute *attr,
+ unsigned cmd, unsigned long arg);
};

struct sysfs_ops {
Index: linux-2.6.24-rc4/drivers/pci/pci-sysfs.c
===================================================================
--- linux-2.6.24-rc4.orig/drivers/pci/pci-sysfs.c 2007-12-11 16:03:55.000000000 -0800
+++ linux-2.6.24-rc4/drivers/pci/pci-sysfs.c 2007-12-11 16:29:07.000000000 -0800
@@ -473,8 +473,56 @@
kfree(res_attr);
}
}
+
+#ifdef HAVE_PCI_COHERENT_MMAP
+ sysfs_remove_bin_file(&pdev->dev.kobj, pdev->coherent_attr);
+ kfree(pdev->coherent_attr);
+#endif
+}
+
+#ifdef HAVE_PCI_COHERENT_MMAP
+
+struct coh_mmap_data {
+ void *map;
+ struct device *dev;
+ dma_addr_t busadr;
+};
+
+void pci_coherent_mmap_close(struct vm_area_struct *vma)
+{
+ struct coh_mmap_data *cm = vma->vm_private_data;
+ dma_free_coherent(cm->dev, vma->vm_end - vma->vm_start, cm->map,
+ cm->busadr);
}

+static struct vm_operations_struct pci_coherent_vmops = {
+ .close = pci_coherent_mmap_close,
+};
+
+static int
+pci_mmap_coherent_mem(struct kobject *kobj, struct bin_attribute *attr,
+ struct vm_area_struct *vma)
+{
+ struct device *dev = container_of(kobj, struct device, kobj);
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct coh_mmap_data *cm = kmalloc(sizeof(struct coh_mmap_data),
+ GFP_KERNEL);
+ if (!cm)
+ return -ENOMEM;
+ cm->map = dma_alloc_coherent(dev, vma->vm_end - vma->vm_start,
+ &cm->busadr, GFP_KERNEL);
+ cm->dev = dev;
+ if (!cm->map) {
+ kfree(cm->map);
+ return -ENOMEM;
+ }
+ vma->vm_private_data = cm;
+ vma->vm_pgoff = cm->busadr >> PAGE_SHIFT;
+ vma->vm_ops = &pci_coherent_vmops;
+ return pci_mmap_page_range(pdev, vma, pci_mmap_coherent, 0);
+}
+#endif
+
/**
* pci_create_resource_files - create resource files in sysfs for @dev
* @dev: dev in question
@@ -692,6 +740,22 @@
kfree(pdev->rom_attr);
}
}
+#ifdef HAVE_PCI_COHERENT_MMAP
+ {
+ struct bin_attribute *a = kzalloc(sizeof(struct bin_attribute),
+ GFP_KERNEL);
+ if (!a)
+ return;
+ pdev->coherent_attr = a;
+ a->attr.name = "coherent_mem";
+ a->attr.mode = S_IRUSR | S_IWUSR;
+ a->attr.owner = THIS_MODULE;
+ a->size = *(pdev->dev.dma_mask);
+ a->mmap = pci_mmap_coherent_mem;
+ a->private = NULL;
+ sysfs_create_bin_file(&pdev->dev.kobj, a);
+ }
+#endif
}

static int __init pci_sysfs_init(void)
Index: linux-2.6.24-rc4/include/asm-x86/pci.h
===================================================================
--- linux-2.6.24-rc4.orig/include/asm-x86/pci.h 2007-12-11 16:03:55.000000000 -0800
+++ linux-2.6.24-rc4/include/asm-x86/pci.h 2007-12-11 16:29:07.000000000 -0800
@@ -61,6 +61,7 @@


#define HAVE_PCI_MMAP
+#define HAVE_PCI_COHERENT_MMAP
extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
enum pci_mmap_state mmap_state, int write_combine);

Index: linux-2.6.24-rc4/include/linux/pci.h
===================================================================
--- linux-2.6.24-rc4.orig/include/linux/pci.h 2007-12-11 16:03:55.000000000 -0800
+++ linux-2.6.24-rc4/include/linux/pci.h 2007-12-11 16:29:07.000000000 -0800
@@ -57,7 +57,8 @@
/* File state for mmap()s on /proc/bus/pci/X/Y */
enum pci_mmap_state {
pci_mmap_io,
- pci_mmap_mem
+ pci_mmap_mem,
+ pci_mmap_coherent
};

/* This defines the direction arg to the DMA mapping routines. */
@@ -201,6 +202,7 @@
struct bin_attribute *rom_attr; /* attribute descriptor for sysfs ROM entry */
int rom_attr_enabled; /* has display of the rom attribute been enabled? */
struct bin_attribute *res_attr[DEVICE_COUNT_RESOURCE]; /* sysfs file for resources */
+ struct bin_attribute *coherent_attr;
#ifdef CONFIG_PCI_MSI
struct list_head msi_list;
#endif

--


2007-12-14 00:20:38

by Greg KH

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Thu, Dec 13, 2007 at 03:55:51PM -0800, [email protected] wrote:
> Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
>
> TBD: Do we need the ioctl interface to sysfs or get the type attribute
> through a different sysfs file. And then actually specify the attribute
> while doing pci_mmap_page_range ;-)

Woah! No, no ioctls on sysfs files, sorry. Not going to happen, do
this on a /dev file if you want to have ioctls...

thanks,

greg k-h

2007-12-14 00:35:24

by David Miller

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

From: Greg KH <[email protected]>
Date: Thu, 13 Dec 2007 16:19:32 -0800

> On Thu, Dec 13, 2007 at 03:55:51PM -0800, [email protected] wrote:
> > Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
> >
> > TBD: Do we need the ioctl interface to sysfs or get the type attribute
> > through a different sysfs file. And then actually specify the attribute
> > while doing pci_mmap_page_range ;-)
>
> Woah! No, no ioctls on sysfs files, sorry. Not going to happen, do
> this on a /dev file if you want to have ioctls...

Well since we told people to move over to sysfs for PCI
accesses, and that's where mmap() is done via too,
it should be no surprise that we run into problems when
people want to set attributes for the mmap() as was done
for the procfs case.

So you have two choices:

1) Balk on the sysfs pci usage, and erase years of effort
of moving people over to sysfs. Tell them to go back to
procfs so we can add the attribute setting via ioctl()
which is absolutely needed.

2) Relax your restrictions a little bit and allow ioctl()'s
for limited cases, like this one.

Otherwise, propase a way to specify PCI device mmap attributes
which works within your whole-universe sysfs theory of everything
:-)

2007-12-14 00:50:32

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Thu, Dec 13, 2007 at 04:19:32PM -0800, Greg KH wrote:
> On Thu, Dec 13, 2007 at 03:55:51PM -0800, [email protected] wrote:
> > Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
> >
> > TBD: Do we need the ioctl interface to sysfs or get the type attribute
> > through a different sysfs file. And then actually specify the attribute
> > while doing pci_mmap_page_range ;-)
>
> Woah! No, no ioctls on sysfs files, sorry. Not going to happen, do
> this on a /dev file if you want to have ioctls...

That would require putting the whole PCI bus hierarchy into /dev.
We could do that, but for what would we still need /sys then ? @)

Anyways if you can suggest a better interface please do, but please think first.

-Andi

2007-12-14 00:54:44

by Jesse Barnes

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Thursday, December 13, 2007 3:55 pm [email protected] wrote:
> Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
>
> TBD: Do we need the ioctl interface to sysfs or get the type attribute
> through a different sysfs file. And then actually specify the attribute
> while doing pci_mmap_page_range ;-)
>
> And when this interface is in place, X server has to use this interface for
> WC mapping.

I remember talking with people about using madvise and/or mmap flags to set
the attributes correctly, but I don't remember whether it was a good idea or
not...

The advantage of a specific post-mmap call is that it would make setting the
attribute types a little easier, so either ioctl or madvise seems preferable
to mmapping over and over with different flags until you get the mapping.

Jesse

2007-12-14 04:02:53

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

[email protected] writes:

> Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
>
> TBD: Do we need the ioctl interface to sysfs or get the type attribute
> through a different sysfs file. And then actually specify the attribute
> while doing pci_mmap_page_range ;-)

This ioctl is not connected up. So regardless of the wisdom of ioctls on
sysfs adding the infrastructure and then not using it is broken.

> And when this interface is in place, X server has to use this interface for WC
> mapping.

Eric

2007-12-14 05:59:22

by Greg KH

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Thu, Dec 13, 2007 at 08:59:44PM -0700, Eric W. Biederman wrote:
> [email protected] writes:
>
> > Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
> >
> > TBD: Do we need the ioctl interface to sysfs or get the type attribute
> > through a different sysfs file. And then actually specify the attribute
> > while doing pci_mmap_page_range ;-)
>
> This ioctl is not connected up. So regardless of the wisdom of ioctls on
> sysfs adding the infrastructure and then not using it is broken.

Ok, I guess the "use an ioctl on a binary file in sysfs for PCI devices"
makes a bit more sense (hint, next time explain this in the changelog
instead of just saying that it is being added), but I would like to see
how this is all hooked up before passing final judgement on it.

thanks,

greg k-h

2007-12-14 06:07:32

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Greg KH <[email protected]> writes:

> On Thu, Dec 13, 2007 at 08:59:44PM -0700, Eric W. Biederman wrote:
>> [email protected] writes:
>>
>> > Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
>> >
>> > TBD: Do we need the ioctl interface to sysfs or get the type attribute
>> > through a different sysfs file. And then actually specify the attribute
>> > while doing pci_mmap_page_range ;-)
>>
>> This ioctl is not connected up. So regardless of the wisdom of ioctls on
>> sysfs adding the infrastructure and then not using it is broken.
>
> Ok, I guess the "use an ioctl on a binary file in sysfs for PCI devices"
> makes a bit more sense (hint, next time explain this in the changelog
> instead of just saying that it is being added), but I would like to see
> how this is all hooked up before passing final judgement on it.


The obvious thing to do would be to hook it up like:
drivers/pci/proc.c:proc_bus_pci_ioctl.

Eric

2007-12-14 06:31:46

by Greg KH

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Thu, Dec 13, 2007 at 04:35:05PM -0800, David Miller wrote:
> From: Greg KH <[email protected]>
> Date: Thu, 13 Dec 2007 16:19:32 -0800
>
> > On Thu, Dec 13, 2007 at 03:55:51PM -0800, [email protected] wrote:
> > > Forward port of coherent-mmap.patch and sysfs-bin-ioctl.patch to x86 tree.
> > >
> > > TBD: Do we need the ioctl interface to sysfs or get the type attribute
> > > through a different sysfs file. And then actually specify the attribute
> > > while doing pci_mmap_page_range ;-)
> >
> > Woah! No, no ioctls on sysfs files, sorry. Not going to happen, do
> > this on a /dev file if you want to have ioctls...
>
> Well since we told people to move over to sysfs for PCI
> accesses, and that's where mmap() is done via too,
> it should be no surprise that we run into problems when
> people want to set attributes for the mmap() as was done
> for the procfs case.
>
> So you have two choices:
>
> 1) Balk on the sysfs pci usage, and erase years of effort
> of moving people over to sysfs. Tell them to go back to
> procfs so we can add the attribute setting via ioctl()
> which is absolutely needed.

Ok, sorry, it wasn't blindingly obvious that this was for pci sysfs
devices that are mmaped, that makes a bit more sense.

But I'd like to see what ioctl is wanted here first.

thanks,

greg k-h

2007-12-14 10:19:32

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

> The obvious thing to do would be to hook it up like:
> drivers/pci/proc.c:proc_bus_pci_ioctl.

Yes that is what it intended to do -- i just had never finished/tested
that.

-Andi

2007-12-17 00:30:41

by Paul Mackerras

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Greg KH writes:

> Ok, sorry, it wasn't blindingly obvious that this was for pci sysfs
> devices that are mmaped, that makes a bit more sense.
>
> But I'd like to see what ioctl is wanted here first.

I believe the ioctl would be to set whether the mapping goes to I/O or
memory space, and whether write-combining is allowed.

So the alternative to the ioctl would be to have multiple files in
sysfs, one per combination of modes -- i.e., 4 files, or 3 if we
exclude the "I/O with write combining" mode, which would be
reasonable.

Paul.

2007-12-17 12:41:49

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

On Mon, Dec 17, 2007 at 08:57:50AM +1100, Paul Mackerras wrote:
> Greg KH writes:
>
> > Ok, sorry, it wasn't blindingly obvious that this was for pci sysfs
> > devices that are mmaped, that makes a bit more sense.
> >
> > But I'd like to see what ioctl is wanted here first.
>
> I believe the ioctl would be to set whether the mapping goes to I/O or
> memory space,

x86 cannot really access IO space through mmap so no that wasn't planned

The main planned use was to get the translated bus address (after IOMMU)
for a mapping and to set the caching modes.

> So the alternative to the ioctl would be to have multiple files in
> sysfs, one per combination of modes -- i.e., 4 files, or 3 if we
> exclude the "I/O with write combining" mode, which would be
> reasonable.

At least for the IOMMU translation case that wouldn't work.

-Andi

2007-12-18 04:33:26

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Andi Kleen <[email protected]> writes:

> On Mon, Dec 17, 2007 at 08:57:50AM +1100, Paul Mackerras wrote:
>> Greg KH writes:
>>
>> > Ok, sorry, it wasn't blindingly obvious that this was for pci sysfs
>> > devices that are mmaped, that makes a bit more sense.
>> >
>> > But I'd like to see what ioctl is wanted here first.
>>
>> I believe the ioctl would be to set whether the mapping goes to I/O or
>> memory space,
>
> x86 cannot really access IO space through mmap so no that wasn't planned

0000_00FD_FC00_0000h - 0000_00FD_FDFF_FFFFh On a hypertransport based
system should work. There is a 32MB window for it.

However the I/O vs mem distinction doesn't matter anyway if we start out
per bar because we already know if it is I/O or mem.

> The main planned use was to get the translated bus address (after IOMMU)
> for a mapping and to set the caching modes.
>
>> So the alternative to the ioctl would be to have multiple files in
>> sysfs, one per combination of modes -- i.e., 4 files, or 3 if we
>> exclude the "I/O with write combining" mode, which would be
>> reasonable.
>
> At least for the IOMMU translation case that wouldn't work.

Well the other alternative looks like having a second file per par
bar. Say resource0_wc to support the write-combining mode, possibly
restricted to just prefetchable bars.

If that is all we have to worry about my inclination is to suggest
a second file, because that feels a bit more generally useable. As
that attribute could be applied to ordinary reads and writes to and
from the bar.

Eric

2007-12-18 04:58:13

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Eric W. Biederman wrote:
>
> 0000_00FD_FC00_0000h - 0000_00FD_FDFF_FFFFh On a hypertransport based
> system should work. There is a 32MB window for it.
>

It doesn't. The termination on MMIO and IOIO transaction is different,
and poking this memory window with an MMIO transaction will lock the
chipset hard (yes, I've tried it.)

-hpa

2007-12-18 09:36:11

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

> Well the other alternative looks like having a second file per par
> bar. Say resource0_wc to support the write-combining mode, possibly

The intention was to support memory not in bars, but give a generic
IOMMU mapped memory interface for user space e.g. for the X server.
But that needs a way to return the bus address for the mmap mapping
and ioctl was the best I came up with for that.
Given it was never finished.

-Andi

2007-12-18 13:51:19

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC PATCH 08/12] PAT 64b: coherent mmap and sysfs bin ioctl

Andi Kleen <[email protected]> writes:

>> Well the other alternative looks like having a second file per par
>> bar. Say resource0_wc to support the write-combining mode, possibly
>
> The intention was to support memory not in bars, but give a generic
> IOMMU mapped memory interface for user space e.g. for the X server.
> But that needs a way to return the bus address for the mmap mapping
> and ioctl was the best I came up with for that.
> Given it was never finished.

Ok that part wasn't obvious. The only thing we mmap in sysfs today
are the bars.

Taking normal memory and iommu mapping it to a device and then having
a user space accessible version is a bit different. We need a special
interface to allocate it and map it through the iommu to user space.
This needs to be a driver or a support subsystem like DRM. Once
we have gone that far then we can map those address to user space.

I expect from the sysfs perspective those per device regions should look
a lot like bars showing contiguous chunks of memory RAM from the devices
perspective. At which point having two files instead of just one can
solve the problem without an ioctl.

For contiguous to device memory we also have some permission issues
so I'm not yet certain that it make sense to expose it through sysfs.

Regardless that seems to be solving a completely new aspect of the problem,
and we can solve that problem separately.

Eric