Mailing List Archive

Cache Allocation Technology (CAT) design for XEN
Hi all, we plan to bring Intel CAT into XEN. This is the initial
design for that. Comments/suggestions are welcome.

Background
==========
Traditionally, all Virtual Machines ("VMs") share the same set of system
cache resources. There is no hardware support to control the allocation or
availability of cache resources to individual VMs. The lack of such a
partitioning mechanism makes cache utilization inefficient across
different types of VMs, even as more and more cache resources become
available on modern server platforms.

With the introduction of Intel Cache Allocation Technology ("CAT"), the
Virtual Machine Monitor ("VMM") now has the ability to partition cache
allocation per VM, based on the priority of each VM.


CAT Introduction
================
Generally speaking, CAT introduces a mechanism for software to control
cache allocation based on application priority or Class of Service
("COS"). Cache allocation for the respective applications is then
restricted based on the COS with which they are associated. Each COS can
be configured using a capacity bitmask ("CBM") which represents cache
capacity and indicates the degree of overlap and isolation between
classes. For each logical processor there is an exposed register
(the IA32_PQR_ASSOC MSR) that allows the OS/VMM to specify a COS when an
application, thread or VM is scheduled. Cache allocation for the
indicated application/thread/VM is then controlled automatically by the
hardware based on the COS and the CBM associated with that class.
Hardware initializes the COS of each logical processor to 0 and sets the
corresponding CBM to all ones, meaning the entire system cache is
available to every application.

For more information please refer to Section 17.15 in the Intel SDM [1].
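
To make the mechanism concrete, here is a minimal C sketch of the
hardware interface, assuming the MSR numbers and bit layout documented
there (IA32_PQR_ASSOC at 0xC8F with the COS in bits 63:32, and the
per-COS L3 mask MSRs starting at 0xC90), and assuming the usual
hypervisor MSR accessors wrmsrl()/rdmsrl():

    /* Sketch only: MSR numbers and bit layout as given in SDM 17.15. */
    #define MSR_IA32_PQR_ASSOC   0x0c8f          /* COS in bits 63:32 */
    #define MSR_IA32_L3_MASK(n)  (0x0c90 + (n))  /* CBM for COS n */

    /* Program the CBM for a given COS (rare, from the control path). */
    static void cat_set_cbm_hw(unsigned int cos, uint64_t cbm)
    {
        wrmsrl(MSR_IA32_L3_MASK(cos), cbm);
    }

    /* Make the current logical processor use a given COS (done on
     * context switch); bits 9:0 hold the CMT RMID and are preserved. */
    static void cat_set_cos(unsigned int cos)
    {
        uint64_t val;

        rdmsrl(MSR_IA32_PQR_ASSOC, val);
        val = (val & 0x3ff) | ((uint64_t)cos << 32);
        wrmsrl(MSR_IA32_PQR_ASSOC, val);
    }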


Design Overview
===============
- Domain COS/CBM association
When enforcing cache allocation for VMs, the minimum granularity is
the domain. All Virtual CPUs ("VCPUs") of a domain have the
same COS and therefore correspond to the same CBM. COS is used only
inside the hypervisor and is transparent to the tool stack/user. The system
administrator can specify the initial CBM for each domain, or change it at
runtime, via the tool stack. The hypervisor then chooses a free COS to
associate with that CBM, or finds an existing COS that already has the
same CBM.
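
As an illustration only (the names are hypothetical, not the final
patch), that allocation step could look like the sketch below, returning
an error to the tool stack when every COS is taken:

    /* Sketch: per-socket table of COS -> CBM with reference counts. */
    struct cat_socket_info {
        unsigned int cos_max;
        unsigned int *cos_ref;      /* users of each COS */
        unsigned int *cos_cbm_map;  /* CBM programmed for each COS */
    };

    static int pick_cos(struct cat_socket_info *info, unsigned int cbm)
    {
        unsigned int cos, free_cos = 0;  /* COS 0 stays the default */

        for ( cos = 1; cos <= info->cos_max; cos++ )
        {
            /* Reuse an existing COS that already has this CBM. */
            if ( info->cos_ref[cos] && info->cos_cbm_map[cos] == cbm )
            {
                info->cos_ref[cos]++;
                return cos;
            }
            if ( !info->cos_ref[cos] && !free_cos )
                free_cos = cos;
        }

        if ( !free_cos )
            return -ENOSPC;  /* let the tool stack decide what to free */

        info->cos_ref[free_cos] = 1;
        info->cos_cbm_map[free_cos] = cbm;
        return free_cos;
    }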

- VCPU Schedule
When a VCPU is scheduled on a physical CPU ("PCPU"), its COS value is
written to the PCPU's IA32_PQR_ASSOC MSR, telling the hardware to use
the new COS. The cache allocation is then enforced by hardware.

- Multi-Socket
In a multi-socket environment, each VCPU may be scheduled on different
sockets. The hardware CAT capability (such as the maximum supported COS and
the CBM length) may differ among sockets. For such systems, a per-socket
COS/CBM configuration is specified for each domain. The hypervisor then uses
this per-socket CBM information when scheduling VCPUs.
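
For illustration, a sketch of the schedule-time lookup under this
scheme (helper names are hypothetical; cat_set_cos() is from the sketch
in the CAT Introduction section, and domain_socket_cat_info is the
structure introduced in the next section):

    /* Sketch: on context switch, use the COS allocated for the next
     * VCPU's domain on the socket this PCPU belongs to. */
    static void cat_ctxt_switch_to(struct vcpu *v)
    {
        unsigned int socket = cpu_to_socket(smp_processor_id());
        struct domain_socket_cat_info *info = v->domain->cat_info;

        /* Fall back to COS 0 (all cache) for unconfigured domains. */
        cat_set_cos(info ? info[socket].cos : 0);
    }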


Implementation Description
==========================
One principle in this design is to implement only the cache enforcement
mechanism in the hypervisor, leaving the cache allocation policy to the
user-space tool stack. In this way, complex governors can be implemented
in the tool stack.

In summary, hypervisor changes include:
1) A new field "cat_info" in the domain structure to hold the per-socket
CAT information. It points to an array of:
struct domain_socket_cat_info {
    unsigned int cbm; /* CBM specified by the toolstack */
    unsigned int cos; /* COS allocated by the hypervisor */
};
2) A new SYSCTL to expose the CAT information to the tool stack:
* Whether CAT is enabled;
* Maximum COS supported;
* Length of the CBM;
* Other needed information from the host CPUID;
3) A new DOMCTL to allow the tool stack to set/get the CBM of a specified
domain for each socket.
4) Context switch: write the COS of the domain to the PCPU's
IA32_PQR_ASSOC MSR.
5) XSM policy to restrict these functions' visibility to the control
domain only.

Hypervisor interfaces:
1) Boot line param: "psr=cat" to enable the feature.
2) SYSCTL: XEN_SYSCTL_psr_cat_op
- XEN_SYSCTL_PSR_CAT_INFO_GET: Get system CAT information;
3) DOMCTL: XEN_DOMCTL_psr_cat_op
- XEN_DOMCTL_PSR_CAT_OP_CBM_SET: Set CBM value for a domain.
- XEN_DOMCTL_PSR_CAT_OP_CBM_GET: Get CBM value for a domain.
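
For illustration, the SYSCTL payload might carry the items listed
earlier (the field names here are hypothetical):

    /* Hypothetical payload for XEN_SYSCTL_PSR_CAT_INFO_GET. */
    struct xen_sysctl_psr_cat_info {
        uint32_t socket;   /* IN: socket to query */
        uint32_t enabled;  /* OUT: whether CAT is enabled */
        uint32_t cos_max;  /* OUT: maximum COS supported */
        uint32_t cbm_len;  /* OUT: length of the CBM in bits */
    };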

xl interfaces:
1) psr-cat-show: Show system/runtime CAT information.
=> XEN_SYSCTL_PSR_CAT_INFO_GET/XEN_DOMCTL_PSR_CAT_OP_CBM_GET
2) psr-cat-cbm-set [dom] [cbm] [socket]: Set CBM for a domain.
=> XEN_DOMCTL_PSR_CAT_OP_CBM_SET
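
As a usage illustration (the domain name and CBM value are made up, and
the final syntax may differ), giving a domain an 8-way mask on socket 0
and then inspecting the result might look like:

    xl psr-cat-cbm-set Domain-A 0xff 0
    xl psr-cat-show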


Hardware Limitation & Performance Improvement
=============================================
The COS of a PCPU in IA32_PQR_ASSOC changes on each VCPU context
switch. If the change is frequent, the hardware may fail to strictly
enforce the cache allocation based on the specified COS. As a result,
the strict placement characteristic softens if a VCPU migrates between
PCPUs frequently.

For this reason, IA32_PQR_ASSOC will be updated lazily (a sketch follows
at the end of this section). This design also allows CAT to run in two
modes:

1) Non-affinitized mode: Each VM can be freely scheduled on any PCPU,
and its COS is assigned to that PCPU as it runs.

2) Affinitized mode: Each PCPU is assigned a fixed COS, and a VM can be
scheduled on a PCPU only when it has the same COS. This is less flexible,
but can be an option for those who must have strict COS placement, or in
cases where problems have arisen because of the less strict nature of the
non-affinitized mode.

However, no additional code is needed to support these two modes. CAT
already runs in non-affinitized mode by default. If affinitized mode
is desired, the existing "xl vcpu-pin" command can be used to pin all
the VCPUs that have the same COS to certain fixed PCPUs, so that those
PCPUs always have the same COS set.
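
A minimal sketch of the lazy IA32_PQR_ASSOC update mentioned above (the
per-CPU cached value is hypothetical; cat_set_cos() is from the earlier
sketch):

    /* Sketch: skip the MSR write when the PCPU is already running with
     * the COS that the incoming VCPU needs. */
    static DEFINE_PER_CPU(unsigned int, pqr_cos);

    static void cat_set_cos_lazy(unsigned int cos)
    {
        if ( this_cpu(pqr_cos) == cos )
            return;  /* the MSR already holds the wanted COS */

        cat_set_cos(cos);
        this_cpu(pqr_cos) = cos;
    }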

[1] http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

Chao

Re: Cache Allocation Technology (CAT) design for XEN
On 12/12/14 12:27, Chao Peng wrote:
> Hi all, we plan to bring Intel CAT into XEN. This is the initial
> design for that. Comments/suggestions are welcome.
> [...]

Fantastic - this is a very clear and well presented document. In terms
of a plan of action, it looks fine.

From my understanding, CAT is largely orthogonal to CMT, but will
share some of the base PSR infrastructure in Xen?

~Andrew


Re: Cache Allocation Technology (CAT) design for XEN
On Fri, Dec 12, 2014 at 03:02:33PM +0000, Andrew Cooper wrote:
> On 12/12/14 12:27, Chao Peng wrote:
> > Hi all, we plan to bring Intel CAT into XEN. This is the initial
> > design for that. Comments/suggestions are welcome.
> >
>
> Fantastic - this is a very clear and well presented document. In terms
> of a plan of action, it looks fine.
Thank you for the review.
>
> From my understanding, CAT is largely orthogonal to CMT, but will
> share some of the base PSR infrastructure in Xen?
>
Yes, from a functional perspective they are independent features. But
they do share some common code and also have similar designs in XEN.

Chao

Re: Cache Allocation Technology (CAT) design for XEN
Hi Jan,

Any comments from you? It would be greatly appreciated if you could look
at this when you have time. Your comments are always important to me :)

Thanks,
Chao

On Fri, Dec 12, 2014 at 08:27:57PM +0800, Chao Peng wrote:
> Hi all, we plan to bring Intel CAT into XEN. This is the initial
> design for that. Comments/suggestions are welcome.
> [...]

Re: Cache Allocation Technology (CAT) design for XEN
>>> On 16.12.14 at 09:55, <chao.p.peng@linux.intel.com> wrote:
> Any comments from you? It would be greatly appreciated if you could look
> at this when you have time. Your comments are always important to me :)

I don't think I have to say much here:

> On Fri, Dec 12, 2014 at 08:27:57PM +0800, Chao Peng wrote:
>> Implementation Description
>> ==========================
>> One principle in this design is to implement only the cache enforcement
>> mechanism in the hypervisor, leaving the cache allocation policy to the
>> user-space tool stack. In this way, complex governors can be implemented
>> in the tool stack.

With this, the changes to the hypervisor ought to be quite limited,
even if the length of the list you give seems long at first glance, and
hence I'm fine with the concept.

>> Hardware Limitation & Performance Improvement
>> =============================================
>> The COS of a PCPU in IA32_PQR_ASSOC changes on each VCPU context
>> switch. If the change is frequent, the hardware may fail to strictly
>> enforce the cache allocation based on the specified COS.

This certainly would deserve a little more explanation: What's the
value of the functionality if one can't rely on it being enforced by
hardware at all times?

Jan


Re: Cache Allocation Technology (CAT) design for XEN
On Tue, Dec 16, 2014 at 09:38:09AM +0000, Jan Beulich wrote:
> >>> On 16.12.14 at 09:55, <chao.p.peng@linux.intel.com> wrote:
> > Any comments from you? It would be greatly appreciated if you could look
> > at this when you have time. Your comments are always important to me :)
>
> I don't think I have to say much here:
>
> > On Fri, Dec 12, 2014 at 08:27:57PM +0800, Chao Peng wrote:
> >> Implementation Description
> >> ==========================
> >> One principle in this design is to implement only the cache enforcement
> >> mechanism in the hypervisor, leaving the cache allocation policy to the
> >> user-space tool stack. In this way, complex governors can be implemented
> >> in the tool stack.
>
> With this, the changes to the hypervisor ought to be quite limited,
> even if the length of the list you give seems long at first glance, and
> hence I'm fine with the concept.
Thanks for your input.
>
> >> Hardware Limitation & Performance Improvement
> >> =============================================
> >> The COS of a PCPU in IA32_PQR_ASSOC changes on each VCPU context
> >> switch. If the change is frequent, the hardware may fail to strictly
> >> enforce the cache allocation based on the specified COS.
>
> This certainly would deserve a little more explanation: What's the
> value of the functionality if one can't rely on it being enforced by
> hardware at all times?
OK. The hardware just can't enforce it strictly when the COS changes
frequently. But overall performance is still likely to be better, because
CAT limits the amount of cache thrashing caused by lower-priority VMs.

If affinitized mode can be used, then strict enforcement is guaranteed
by the hardware. This is actually useful for scenarios like OpenStack NFV
appliances, where VCPUs are affinitized to specific logical threads.

Thanks,
Chao

Re: Cache Allocation Technology (CAT) design for XEN
Hi,

Thanks for posting this - it's very useful. I have a couple of
questions about the interface design.

At 20:27 +0800 on 12 Dec (1418412477), Chao Peng wrote:
> Design Overview
> When enforcing cache allocation for VMs, the minimum granularity is
> defined as the domain. All Virtual CPUs ("VCPUs") of a domain have the
> same COS, and therefore, correspond to the same CBM. COS is used only in
> hypervisor and is transparent to tool stack/user. System administrator
> can specify the initial CBM for each domain or change it at runtime using
> tool stack. Hypervisor then choses a free COS to associate it with that
> CBM or find a existed COS which has the same CBM.

What happens if there is no existing COS that matches, and all COSes
are in use? Does Xen return an error? Or try to change COS->CBM
mappings during context switches?

> - VCPU Schedule
> When a VCPU is scheduled on a physical CPU ("PCPU"), its COS value is
> written to the PCPU's IA32_PQR_ASSOC MSR, telling the hardware to use
> the new COS. The cache allocation is then enforced by hardware.
>
> - Multi-Socket
> In a multi-socket environment, each VCPU may be scheduled on different
> sockets. The hardware CAT capability (such as the maximum supported COS and
> the CBM length) may differ among sockets. For such systems, a per-socket
> COS/CBM configuration is specified for each domain. The hypervisor then uses
> this per-socket CBM information when scheduling VCPUs.

Is it OK to assume that in the common case all CPUs have the same CAT
capabilities? Then Xen can just report the smallest set of
capabilities of any socket in the system, and the toolstack doesn't
have to mess about with per-socket settings.

I guess you can add that syntactic sugar in the tools if you want and
leave the more powerful hypercall interface in case it's useful. :)

Tim.

Re: Cache Allocation Technology (CAT) design for XEN
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
>
> [...]
>
> >> Hardware Limitation & Performance Improvement
> >> =============================================
> >> The COS of a PCPU in IA32_PQR_ASSOC changes on each VCPU context
> >> switch. If the change is frequent, the hardware may fail to strictly
> >> enforce the cache allocation based on the specified COS.
>
> This certainly would deserve a little more explanation: What's the
> value of the functionality if one can't rely on it being enforced by
> hardware at all times?
>
> Jan

The majority of the time it should work as expected without additional
configuration. Cases requiring additional configuration for higher accuracy
will have the COS exclusively affinitized to specific CPU(s), removing the
source of error.

Thanks,

Will

Re: Cache Allocation Technology (CAT) design for XEN
On Thu, Dec 18, 2014 at 05:40:28PM +0100, Tim Deegan wrote:
> Hi,
>
> Thanks for posting this - it's very useful. I have a couple of
> questions about the interface design.
Thanks Tim.
>
> At 20:27 +0800 on 12 Dec (1418412477), Chao Peng wrote:
> > Design Overview
> > When enforcing cache allocation for VMs, the minimum granularity is
> > the domain. All Virtual CPUs ("VCPUs") of a domain have the
> > same COS and therefore correspond to the same CBM. COS is used only
> > inside the hypervisor and is transparent to the tool stack/user. The system
> > administrator can specify the initial CBM for each domain, or change it at
> > runtime, via the tool stack. The hypervisor then chooses a free COS to
> > associate with that CBM, or finds an existing COS that already has the
> > same CBM.
>
> What happens if there is no existing COS that matches, and all COSes
> are in use? Does Xen return an error? Or try to change COS->CBM
> mappings during context switches?

In the initial implementation, an error is returned. It's possible for
the hypervisor to share a COS among different CBMs and not return an
error here. But the problem is that a COS shortage may still happen during
context switch, at which point we would have no idea what to do. So I'd
prefer to return an error directly here and leave the decision to user
space; e.g. if an error is returned, it can clear the CBM for some domain
to free up a COS.
>
> > - VCPU Schedule
> > When a VCPU is scheduled on a physical CPU ("PCPU"), its COS value is
> > written to the PCPU's IA32_PQR_ASSOC MSR, telling the hardware to use
> > the new COS. The cache allocation is then enforced by hardware.
> >
> > - Multi-Socket
> > In a multi-socket environment, each VCPU may be scheduled on different
> > sockets. The hardware CAT capability (such as the maximum supported COS and
> > the CBM length) may differ among sockets. For such systems, a per-socket
> > COS/CBM configuration is specified for each domain. The hypervisor then uses
> > this per-socket CBM information when scheduling VCPUs.
>
> Is it OK to assume that in the common case all CPUs have the same CAT
> capabilities? Then Xen can just report the smallest set of
> capabilities of any socket in the system, and the toolstack doesn't
> have to mess about with per-socket settings.
>
> I guess you can add that syntactic sugar in the tools if you want and
> leave the more powerful hypercall interface in case it's useful. :)

Agreed, this is what I want to do. Basically the socketId is optional
for the caller. If more than one socket exists, omitting the socketId means
the specified CBM is applied to all sockets, while we still maintain the
per-socket CBM internally in the hypervisor and provide a per-socket
hypercall interface in case it's needed. In this way the interface should
be user friendly for most cases.
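
As an illustration only (hypothetical libxc-style names), the tool
stack side could simply loop over sockets when the socketId is omitted:

    /* Sketch: apply one CBM to every socket when no socket is given. */
    for ( socket = 0; socket < nr_sockets; socket++ )
        rc = xc_psr_cat_set_domain_cbm(xch, domid, socket, cbm);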

Chao


Re: Cache Allocation Technology (CAT) design for XEN
At 14:09 +0800 on 19 Dec (1418994579), Chao Peng wrote:
> On Thu, Dec 18, 2014 at 05:40:28PM +0100, Tim Deegan wrote:
> > Hi,
> >
> > Thanks for posting this - it's very useful. I have a couple of
> > questions about the interface design.
> Thanks Tim.
> >
> > At 20:27 +0800 on 12 Dec (1418412477), Chao Peng wrote:
> > > [...]
> >
> > What happens if there is no existing COS that matches, and all COSes
> > are in use? Does Xen return an error? Or try to change COS->CBM
> > mappings during context switches?
>
> In the initial implementation, an error is returned. It's possible for
> the hypervisor to share a COS among different CBMs and not return an
> error here. But the problem is that a COS shortage may still happen during
> context switch, at which point we would have no idea what to do. So I'd
> prefer to return an error directly here and leave the decision to user
> space; e.g. if an error is returned, it can clear the CBM for some domain
> to free up a COS.

Righto, thanks.

> > Is it OK to assume that in the common case all CPUs have the same CAT
> > capabilities? Then Xen can just report the smallest set of
> > capabilities of any socket in the system, and the toolstack doesn't
> > have to mess about with per-socket settings.
> >
> > I guess you can add that syntactic sugar in the tools if you want and
> > leave the more powerful hypercall interface in case it's useful. :)
>
> Agreed, this is what I want to do. Basically the socketId is optional
> for the caller. If more than one socket exists, omitting the socketId means
> the specified CBM is applied to all sockets, while we still maintain the
> per-socket CBM internally in the hypervisor and provide a per-socket
> hypercall interface in case it's needed. In this way the interface should
> be user friendly for most cases.

Sounds good. Thanks for clarifying.

Cheers,

Tim.
