2015-11-04 20:48:43

by Gabriel L. Somlo

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

On Mon, Oct 05, 2015 at 03:18:05PM +0200, Paolo Bonzini wrote:
>
>
> On 05/10/2015 14:50, Peter Maydell wrote:
> > If you want to try to support "firmware might also be reading
> > fw_cfg at the same time as the kernel" this is a (painful)
> > problem regardless of how the kernel figures out whether a
> > fw_cfg device is present. I had assumed that one of the design
> > assumptions of this series was that firmware would only
> > read the fw_cfg before booting the guest kernel and never touch
> > it afterwards. If it might touch it later then letting the
> > guest kernel also mess with fw_cfg seems like a really bad idea.
>
> The idea of tinkering with fw_cfg from the AML code (DSDT/SSDT) has been
> proposed many times, and always dropped. One of the reasons was that
> the OS could have a driver for fw_cfg.
>
> So I think that we can define the QEMU0002 id as owned by the OSPM,
> similar to the various standard ACPI ids that are usually found in the
> x86 world (e.g. PNP0B00 is a mc146818 RTC, PNP0303 is an 8042 keyboard
> controller, PNP0501 is a 16550 or similar UART, and so on). This
> basically sanctions _CRS as the way to pass information from the
> firmware to the OSPM, also similarly to those standard PNP ids.

OK, so I don't expect to go the "pure ACPI" route in the final
version, mainly because I'm still hoping to fill the requirement
of writing a driver which can use either ACPI or DT to detect the
presence of fw_cfg (how I'm going to compile this on kernels with
no ACPI or no DT support is still TBD, and probably will have to
involve #ifdef, but I digress :)

However, Michael's idea of having ACPI supply an "accessor method" for
the guest kernel driver to call, without having to worry about the
specific details in _CRS, sounded intriguing enough to at least explore
in further detail.

So, assuming we have the following fw_cfg AML in ssdt (i386) or
dsdt (arm) (using cpp-style #ifdef to highlight the arch-specific
bits):

Scope (\_SB)
{
Device (FWCF)
{
Name (_HID, EisaId ("QMU0002")) // _HID: Hardware ID
Name (_STA, 0x0B) // _STA: Status

#ifdef ARCH_X86

Name (_CRS, ResourceTemplate ()
{
IO (Decode16,
0x0510, // Range Minimum
0x0510, // Range Maximum
0x01, // Alignment
0x02, // Length
)
})

#else /* ARCH_ARM */

NAME (_CRS, ResourceTemplate ()
{
Memory32Fixed (ReadWrite,
0x09020000, // Address Base
0x0000000a, // Address Length
)
})

#endif

}
}

I can easily patch QEMU to additionally insert the following into
"Device (FWCF)":

#ifdef ARCH_X86

OperationRegion (FWCR, SystemIO, 0x0510, 0x02)
Field (FWCR, WordAcc, NoLock, Preserve)
{
FWCC, 16
}
Field (FWCR, ByteAcc, NoLock, Preserve)
{
Offset (0x01),
FWCD, 8
}

#else /* ARCH_ARM */

OperationRegion (FWCR, SystemMemory, 0x09020000, 0x0a)
Field (FWCR, WordAcc, NoLock, Preserve)
{
Offset (0x08),
FWCC, 16 // not sure about endianness on ARM here, but
// I can still address this from the userspace
// kernel driver if needed
}
Field (FWCR, ByteAcc, NoLock, Preserve)
{
FWCD, 8
}

#endif

Method (RDBL, 3, Serialized) // (Arg0 = key, Arg1 = skip, Arg2 = count)
{
FWCC = Arg0
Local0 = Zero
While ((Local0 < Arg1))
{
Local1 = FWCD
Local0++
}
Name (BBUF, Buffer (Arg2) {})
Local0 = Zero
While ((Local0 < Arg2))
{
Index (BBUF, Local0) = FWCD /* \_SB_.FWCF.FWCD */
Local0++
}
Return (BBUF) /* \_SB_.FWCF.RDBL.BBUF */
}

With the host generating the additional RDBL method above, I could
write a "pure ACPI" driver for the guest kernel, where instead of the
current fw_cfg_read_blob() logic:


static DEFINE_MUTEX(fw_cfg_dev_lock);
static bool fw_cfg_is_mmio;

static inline u16 fw_cfg_sel_endianness(u16 key)
{
return fw_cfg_is_mmio ? cpu_to_be16(key) : cpu_to_le16(key);
}

static inline void fw_cfg_read_blob(u16 key,
void *buf, loff_t pos, size_t count)
{
mutex_lock(&fw_cfg_dev_lock);
iowrite16(fw_cfg_sel_endianness(key), fw_cfg_reg_ctrl);
while (pos-- > 0)
ioread8(fw_cfg_reg_data);
ioread8_rep(fw_cfg_reg_data, buf, count);
mutex_unlock(&fw_cfg_dev_lock);
}


I could instead write something like this:


struct acpi_device *dev; /* set during module init. */

static inline void fw_cfg_read_blob(u16 key,
void *buf, loff_t pos, size_t count)
{
union acpi_object arg_elem[3], *obj;
struct acpi_object_list arg;
struct acpi_buffer acpi_buf = { ACPI_ALLOCATE_BUFFER, NULL };
acpi_status status;

arg.count = 3;
arg.pointer = &arg_elem[0];

arg_elem[0].type =
arg_elem[1].type =
arg_elem[2].type = ACPI_TYPE_INTEGER;

arg_elem[0].integer.value = key;
arg_elem[1].integer.value = pos;
arg_elem[2].integer.value = count;

status = acpi_evaluate_object_typed(dev->handle, "RDBL", &arg,
&acpi_buf, ACPI_TYPE_BUFFER);
if (ACPI_FAILURE(status)) {
return; /* FIXME: actual error signaling to caller TBD */
}

obj = (union acpi_object *)acpi_buf.pointer;

/* FIXME: in lieu of better error signaling to caller: */
assert(count == obj->buffer.length);

memcpy(buf, obj->buffer.pointer, obj->buffer.length);
kfree(acpi_buf.pointer);
}

Now, if ACPI-less DT-enabled architectures are to be supported, this
wouldn't get me out of having to spell out the original ioread8_rep()
based fw_cfg_read_blob(), so probably not worth doing.

But I simply *had* to try and chase this down before writing it off,
since part of the reason I'm doing all this in the first place is to
teach myself new tricks... :)

Sorry for going off on a somewhat lengthy (and maybe just a *little*
bit off-topic) tangent, but I figured I'd put it out there to at least
facilitate future archaeology :)

Obviously, any comments or feedback much appreciated!

Cheers,
--Gabriel