by Peter Gonda

Subject: Re: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver

On Thu, Oct 20, 2022 at 8:02 AM Tom Lendacky <[email protected]> wrote:
>
> On 10/19/22 16:47, Peter Gonda wrote:
> > On Wed, Oct 19, 2022 at 2:58 PM Tom Lendacky <[email protected]> wrote:
> >> On 10/19/22 15:39, Peter Gonda wrote:
> >>> On Wed, Oct 19, 2022 at 1:56 PM Tom Lendacky <[email protected]> wrote:
> >>>> On 10/19/22 14:17, Dionna Amalie Glaze wrote:
> >>>>> On Wed, Oct 19, 2022 at 11:44 AM Tom Lendacky <[email protected]> wrote:
> >>>>>> On 10/19/22 12:40, Peter Gonda wrote:
> >>>>>>> On Wed, Oct 19, 2022 at 11:03 AM Tom Lendacky <[email protected]> wrote:
> >>>>>>>> On 10/19/22 10:03, Peter Gonda wrote:
> >>>>>>>>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>>>>>>>> communicate securely with each other. The IV to this scheme is a
> >>>>>>>>> sequence number that both the ASP and the guest track. Currently this
> >>>>>>>>> sequence number in a guest request must exactly match the sequence
> >>>>>>>>> number tracked by the ASP. This means that if the guest sees an error
> >>>>>>>>> from the host during a request it can only retry that exact request or
> >>>>>>>>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>>>>>>>> reuse see:
> >>>>>>>>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>
> >>> OK so the guest retries with the same request when it gets an
> >>> SNP_GUEST_REQ_INVALID_LEN error. It expands its internal buffer to
> >>
> >> It would just use the pre-allocated snp_dev->certs_data buffer with npages
> >> set to the full size of that buffer.
> >
> > Actually we allocate that buffer with size SEV_FW_BLOB_MAX_SIZE. Maybe
> > we want to just allocate this buffer which we think is sufficient and
> > never increase the allocation?
> >
> > I see the size of
> > https://developer.amd.com/wp-content/resources/ask_ark_milan.cert is
> > 3200 bytes. Assuming the VCEK cert is the same size (which it should
> > be since this .cert is 2 certificates). 16K seems to leave enough room
> > even for some vendor certificates?
>
> I think just using the 16K buffer (4 pages) as it is allocated today is
> ok. If we get a SNP_GUEST_REQ_INVALID_LEN error that is larger than 4
> pages, then we won't ever be able to pull the certs given how the driver
> is coded today. In that case, disabling the VMPCK is in order.
>
> A separate patch could be submitted later to improve this overall aspect
> of the certs buffer if needed.

If that sounds OK I'd prefer that. This keeps the driver's current limit:

static int get_ext_report(struct snp_guest_dev *snp_dev,
                          struct snp_guest_request_ioctl *arg)
...
        if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
            !IS_ALIGNED(req.certs_len, PAGE_SIZE))
                return -EINVAL;

I'd prefer not to add extra features during the bug fix, but I'm happy to
make this work with buffers greater than SEV_FW_BLOB_MAX_SIZE as a
follow-up if you want.
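
For reference, the limit quoted above amounts to the check below. This is only a minimal userspace sketch, not the driver code: the 4 KiB PAGE_SIZE and the 16K value of SEV_FW_BLOB_MAX_SIZE are assumptions taken from the "16K buffer (4 pages)" figure in this thread, IS_ALIGNED() is a stand-in for the kernel macro, and certs_len_ok() and pages_round_up() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, based on the "16K buffer (4 pages)" figure in this thread. */
#define PAGE_SIZE            4096u
#define SEV_FW_BLOB_MAX_SIZE (4u * PAGE_SIZE)

/* Userspace stand-in for the kernel's IS_ALIGNED(); valid for power-of-two a. */
#define IS_ALIGNED(x, a)     (((x) & ((a) - 1)) == 0)

/* Hypothetical helper: round a byte count up to a whole number of pages. */
static uint32_t pages_round_up(uint32_t len)
{
	return (len + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Hypothetical helper: mirrors the certs_len check quoted from get_ext_report(). */
static bool certs_len_ok(uint32_t certs_len)
{
	if (certs_len > SEV_FW_BLOB_MAX_SIZE ||
	    !IS_ALIGNED(certs_len, PAGE_SIZE))
		return false;
	return true;
}
```

With these assumed values, two 3200-byte certificate blobs (6400 bytes) round up to 8192 bytes, which passes the check, while a request for 5 pages is rejected.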

>
> Thanks,
> Tom
>
> >
> >>
> >>> hold the certificates. When it finally gets a successful request w/
> >>> certs, do we want to return the attestation bits to userspace but
> >>> leave out the certificate data? Or just error out the ioctl
> >>> completely?
> >>
> >> We need to be able to return the attestation bits that came back with the
> >> extra certs. So just error out of the ioctl with the length error and let
> >> user-space retry with the recommended number of pages.
> >
> > That sounded simpler to me. Will do.
> >
> >>
> >>>
> >>> I can do that in this series.
> >>
> >> Thanks!
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> For the rate-limiting patch series [1], the rate-limiting will have to be
> >>>>>>>> performed within the kernel, while the mutex is held, and then retry the
> >>>>>>>> exact request again. Otherwise, that error will require disabling the
> >>>>>>>> VMPCK. Either that, or the hypervisor must provide the rate limiting.
> >>>>>>>>
> >>>>>>>> Thoughts?
> >>>>>>>>
> >>>>>>>> [1] https://lore.kernel.org/lkml/[email protected]/
> >>>>>>>
> >>>>>>> Yes, I think if the host rate limits the guest, the guest kernel should
> >>>>>>> retry the exact message. Which mutex are you referring to?
> >>>>>>
> >>>>>> Or the host waits and then submits the request and the guest kernel
> >>>>>> doesn't have to do anything. The mutex I'm referring to is the
> >>>>>> snp_cmd_mutex that is taken in snp_guest_ioctl().
> >>>>>
> >>>>> I think that either the host kernel or guest kernel waiting can lead
> >>>>> to unacceptable delays.
> >>>>> I would recommend that we add a zero argument ioctl to /dev/sev-guest
> >>>>> specifically for retrying the last request.
> >>>>>
> >>>>> We can know what the last request is due to the snp_cmd_mutex serialization.
> >>>>> The driver will just keep a scratch buffer for this. Any other request
> >>>>> that comes in without resolving the retry will get an -EBUSY error
> >>>>> code.
> >>>>
> >>>> And the first caller will have received an -EAGAIN in order to
> >>>> differentiate between the two situations?
> >>>>
> >>>>>
> >>>>> Calling the retry ioctl without a pending command will result in -EINVAL.
> >>>>>
> >>>>> Let me know what you think.
> >>>>
> >>>> I think that sounds reasonable, but there are some catches. You will need
> >>>> to ensure that the caller that is supposed to retry does actually retry
> >>>> and that a caller that does retry is the same caller that was told to retry.
> >>>
> >>> What's the issue with the guest driver taking some time?
> >>>
> >>> This sounds complex because there may be many users of the driver. How
> >>> do multiple users coordinate when they need to use the retry ioctl?
> >>>
> >>>>
> >>>> Thanks,
> >>>> Tom
> >>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Tom
> >>>>>
> >>>>>
> >>>>>
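
The constraint discussed in this thread can be summarized as a small state machine. The sketch below is an illustration under stated assumptions, not the driver's code: struct guest_chan, enum req_result, chan_next_seqno() and chan_complete() are hypothetical names, and the numbering is simplified (the real driver's sequence counting may differ). The idea is that the sequence number (the IV) advances only after a successful response; after a failure the caller may resend the exact same encrypted message, and any other outcome retires the key.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of one request to the host/ASP. */
enum req_result { REQ_OK, REQ_RETRY_SAME, REQ_FAILED };

/* Hypothetical per-VMPCK channel state. */
struct guest_chan {
	uint64_t seqno;    /* last sequence number used successfully */
	bool     disabled; /* set once the key must never be used again */
};

/* Sequence number (IV) for the next message; 0 means the key is unusable. */
static uint64_t chan_next_seqno(const struct guest_chan *c)
{
	return c->disabled ? 0 : c->seqno + 1;
}

/*
 * Record the outcome of a request that was encrypted with chan_next_seqno().
 * Returns true if the caller may send again: a new message after REQ_OK, or
 * the identical ciphertext after REQ_RETRY_SAME.
 */
static bool chan_complete(struct guest_chan *c, enum req_result res)
{
	switch (res) {
	case REQ_OK:
		c->seqno++;          /* IV consumed; next message gets a fresh one */
		return true;
	case REQ_RETRY_SAME:
		return true;         /* seqno unchanged: only the same bytes may be resent */
	default:
		c->disabled = true;  /* cannot prove the IV is unused: retire the key */
		return false;
	}
}
```

Keeping the retry case separate from the failure case is what lets an error such as SNP_GUEST_REQ_INVALID_LEN be handled without either reusing an IV for different plaintext or giving up the key.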

2022-10-24 19:15:57

by kernel test robot

Subject: Re: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver

Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
patch link: https://lore.kernel.org/r/20221019150333.1047423-1-pgonda%40google.com
patch subject: [PATCH] virt: Prevent AES-GCM IV reuse in SNP guest driver
config: x86_64-buildonly-randconfig-r006-20221024
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/3bb420ef25bf00e5fa26d6146446b11e6ebfd255
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Peter-Gonda/virt-Prevent-AES-GCM-IV-reuse-in-SNP-guest-driver/20221020-100950
git checkout 3bb420ef25bf00e5fa26d6146446b11e6ebfd255
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/virt/coco/sev-guest/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/virt/coco/sev-guest/sev-guest.c:351:10: warning: format specifies type 'unsigned long' but the argument has type '__u64 *' (aka 'unsigned long long *') [-Wformat]
rc, fw_err);
^~~~~~
include/linux/dev_printk.h:142:69: note: expanded from macro 'dev_alert'
dev_printk_index_wrap(_dev_alert, KERN_ALERT, dev, dev_fmt(fmt), ##__VA_ARGS__)
~~~ ^~~~~~~~~~~
include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
_p_func(dev, fmt, ##__VA_ARGS__); \
~~~ ^~~~~~~~~~~
1 warning generated.

vim +351 drivers/virt/coco/sev-guest/sev-guest.c

   322
   323  static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
   324                                  u8 type, void *req_buf, size_t req_sz, void *resp_buf,
   325                                  u32 resp_sz, __u64 *fw_err)
   326  {
   327          unsigned long err;
   328          u64 seqno;
   329          int rc;
   330
   331          /* Get message sequence and verify that its a non-zero */
   332          seqno = snp_get_msg_seqno(snp_dev);
   333          if (!seqno)
   334                  return -EIO;
   335
   336          memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
   337
   338          /* Encrypt the userspace provided payload */
   339          rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
   340          if (rc)
   341                  return rc;
   342
   343          /* Call firmware to process the request */
   344          rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
   345          if (fw_err)
   346                  *fw_err = err;
   347
   348          if (rc) {
   349                  dev_alert(snp_dev->dev,
   350                            "Detected error from ASP request. rc: %d, fw_err: %lu\n",
 > 351                            rc, fw_err);
   352                  goto disable_vmpck;
   353          }
   354
   355          rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
   356          if (rc) {
   357                  dev_alert(snp_dev->dev,
   358                            "Detected unexpected decode failure from ASP. rc: %d\n",
   359                            rc);
   360                  goto disable_vmpck;
   361          }
   362
   363          /* Increment to new message sequence after payload decryption was successful. */
   364          snp_inc_msg_seqno(snp_dev);
   365
   366          return 0;
   367
   368  disable_vmpck:
   369          snp_disable_vmpck(snp_dev);
   370          return rc;
   371  }
   372
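
The warning at line 351 arises because fw_err is a pointer (__u64 *) while %lu expects an unsigned long value. The following is a minimal userspace sketch of one way to make the format string and the argument agree; it is not necessarily the fix the patch author chose. format_fw_err() is a hypothetical helper, uint64_t stands in for __u64, and snprintf() stands in for dev_alert().

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the same message as the dev_alert() call,
 * but passes the error value instead of the fw_err pointer.
 */
static int format_fw_err(char *buf, size_t len, int rc, unsigned long err,
			 uint64_t *fw_err)
{
	/* fw_err may be NULL, so copy out only when the caller asked for it. */
	if (fw_err)
		*fw_err = err;

	/* %llu with an explicit cast matches on both 32- and 64-bit builds. */
	return snprintf(buf, len,
			"Detected error from ASP request. rc: %d, fw_err: %llu\n",
			rc, (unsigned long long)err);
}
```

Printing the local value also avoids dereferencing fw_err, which the listing above treats as optional.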

--
0-DAY CI Kernel Test Service
https://01.org/lkp
