<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/cgroup/misc.c, branch v6.18.21</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>Merge branch 'kvm-tdx-initial' into HEAD</title>
<updated>2025-04-07T11:36:33+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2025-03-19T13:46:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=fd02aa45bda6d2f2fedcab70e828867332ef7e1c'/>
<id>fd02aa45bda6d2f2fedcab70e828867332ef7e1c</id>
<content type='text'>
This large commit contains the initial support for TDX in KVM.  All x86
parts enable the host-side hypercalls that KVM uses to talk to the TDX
module, a software component that runs in a special CPU mode called SEAM
(Secure Arbitration Mode).

The series is in turn split into multiple sub-series, each with a separate
merge commit:

- Initialization: basic setup for using the TDX module from KVM, plus
  ioctls to create TDX VMs and vCPUs.

- MMU: in TDX, private and shared halves of the address space are mapped by
  different EPT roots, and the private half is managed by the TDX module.
  Using the support that was added to the generic MMU code in 6.14,
  add support for TDX's secure page tables to the Intel side of KVM.
  Generic KVM code takes care of maintaining a mirror of the secure page
  tables so that they can be queried efficiently, and ensuring that changes
  are applied to both the mirror and the secure EPT.

- vCPU enter/exit: implement the callbacks that handle the entry of a TDX
  vCPU (via the SEAMCALL TDH.VP.ENTER) and the corresponding save/restore
  of host state.

- Userspace exits: introduce support for guest TDVMCALLs that KVM forwards to
  userspace.  These correspond to the usual KVM_EXIT_* "heavyweight vmexits"
  but are triggered through a different mechanism, similar to VMGEXIT for
  SEV-ES and SEV-SNP.

- Interrupt handling: support for virtual interrupt injection as well as
  handling VM-Exits that are caused by vectored events.  Exclusive to
  TDX are machine-check SMIs, which the kernel already knows how to
  handle through the kernel machine check handler (commit 7911f145de5f,
  "x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")

- Loose ends: handling of the remaining exits from the TDX module, including
  EPT violation/misconfig and several TDVMCALL leaves that are handled in
  the kernel (CPUID, HLT, RDMSR/WRMSR, GetTdVmCallInfo); plus returning
  an error or ignoring operations that are not supported by TDX guests

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This large commit contains the initial support for TDX in KVM.  All x86
parts enable the host-side hypercalls that KVM uses to talk to the TDX
module, a software component that runs in a special CPU mode called SEAM
(Secure Arbitration Mode).

The series is in turn split into multiple sub-series, each with a separate
merge commit:

- Initialization: basic setup for using the TDX module from KVM, plus
  ioctls to create TDX VMs and vCPUs.

- MMU: in TDX, private and shared halves of the address space are mapped by
  different EPT roots, and the private half is managed by the TDX module.
  Using the support that was added to the generic MMU code in 6.14,
  add support for TDX's secure page tables to the Intel side of KVM.
  Generic KVM code takes care of maintaining a mirror of the secure page
  tables so that they can be queried efficiently, and ensuring that changes
  are applied to both the mirror and the secure EPT.

- vCPU enter/exit: implement the callbacks that handle the entry of a TDX
  vCPU (via the SEAMCALL TDH.VP.ENTER) and the corresponding save/restore
  of host state.

- Userspace exits: introduce support for guest TDVMCALLs that KVM forwards to
  userspace.  These correspond to the usual KVM_EXIT_* "heavyweight vmexits"
  but are triggered through a different mechanism, similar to VMGEXIT for
  SEV-ES and SEV-SNP.

- Interrupt handling: support for virtual interrupt injection as well as
  handling VM-Exits that are caused by vectored events.  Exclusive to
  TDX are machine-check SMIs, which the kernel already knows how to
  handle through the kernel machine check handler (commit 7911f145de5f,
  "x86/mce: Implement recovery for errors in TDX/SEAM non-root mode")

- Loose ends: handling of the remaining exits from the TDX module, including
  EPT violation/misconfig and several TDVMCALL leaves that are handled in
  the kernel (CPUID, HLT, RDMSR/WRMSR, GetTdVmCallInfo); plus returning
  an error or ignoring operations that are not supported by TDX guests

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: TDX: Register TDX host key IDs to cgroup misc controller</title>
<updated>2025-03-14T18:20:51+00:00</updated>
<author>
<name>Zhiming Hu</name>
<email>zhiming.hu@intel.com</email>
</author>
<published>2025-02-19T14:02:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7c035bea94074b19ed560a4f23a76c5a6c8e594f'/>
<id>7c035bea94074b19ed560a4f23a76c5a6c8e594f</id>
<content type='text'>
TDX host key IDs (HKID) are limit resources in a machine, and the misc
cgroup lets the machine owner track their usage and limits the possibility
of abusing them outside the owner's control.

The cgroup v2 miscellaneous subsystem was introduced to control the
resource of AMD SEV &amp; SEV-ES ASIDs.  Likewise introduce HKIDs as a misc
resource.

Signed-off-by: Zhiming Hu &lt;zhiming.hu@intel.com&gt;
Signed-off-by: Isaku Yamahata &lt;isaku.yamahata@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
TDX host key IDs (HKID) are limit resources in a machine, and the misc
cgroup lets the machine owner track their usage and limits the possibility
of abusing them outside the owner's control.

The cgroup v2 miscellaneous subsystem was introduced to control the
resource of AMD SEV &amp; SEV-ES ASIDs.  Likewise introduce HKIDs as a misc
resource.

Signed-off-by: Zhiming Hu &lt;zhiming.hu@intel.com&gt;
Signed-off-by: Isaku Yamahata &lt;isaku.yamahata@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Remove unused misc_cg_res_total_usage</title>
<updated>2025-01-28T19:00:54+00:00</updated>
<author>
<name>Dr. David Alan Gilbert</name>
<email>linux@treblig.org</email>
</author>
<published>2025-01-27T23:52:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=ad6c08d8c1045843ec564a73981ece6ec31d11a0'/>
<id>ad6c08d8c1045843ec564a73981ece6ec31d11a0</id>
<content type='text'>
misc_cg_res_total_usage() was added in 2021 by
commit a72232eabdfc ("cgroup: Add misc cgroup controller")

but has remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Acked-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
misc_cg_res_total_usage() was added in 2021 by
commit a72232eabdfc ("cgroup: Add misc cgroup controller")

but has remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert &lt;linux@treblig.org&gt;
Acked-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Introduce misc.events.local</title>
<updated>2024-07-12T16:45:23+00:00</updated>
<author>
<name>Xiu Jianfeng</name>
<email>xiujianfeng@huawei.com</email>
</author>
<published>2024-07-11T10:14:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6a26f9c68901797261bc145975a02f85be0c1d8f'/>
<id>6a26f9c68901797261bc145975a02f85be0c1d8f</id>
<content type='text'>
Currently the event counting provided by misc.events is hierarchical,
it's not practical if user is only concerned with events of a
specified cgroup. Therefore, introduce misc.events.local collect events
specific to the given cgroup.

This is analogous to memory.events.local and pids.events.local.

Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently the event counting provided by misc.events is hierarchical,
it's not practical if user is only concerned with events of a
specified cgroup. Therefore, introduce misc.events.local collect events
specific to the given cgroup.

This is analogous to memory.events.local and pids.events.local.

Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Introduce misc.peak</title>
<updated>2024-07-03T18:08:43+00:00</updated>
<author>
<name>Xiu Jianfeng</name>
<email>xiujianfeng@huawei.com</email>
</author>
<published>2024-07-03T00:36:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1028f391d5f9d4248e2f49193e6de2516ad630f8'/>
<id>1028f391d5f9d4248e2f49193e6de2516ad630f8</id>
<content type='text'>
Introduce misc.peak to record the historical maximum usage of the
resource, as in some scenarios the value of misc.max could be
adjusted based on the peak usage of the resource.

Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce misc.peak to record the historical maximum usage of the
resource, as in some scenarios the value of misc.max could be
adjusted based on the peak usage of the resource.

Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Store atomic64_t reads to u64</title>
<updated>2023-07-21T18:10:06+00:00</updated>
<author>
<name>Haitao Huang</name>
<email>haitao.huang@linux.intel.com</email>
</author>
<published>2023-07-21T12:02:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=714e08cc3ec564dc1717000f5a4c1f40ca8c8878'/>
<id>714e08cc3ec564dc1717000f5a4c1f40ca8c8878</id>
<content type='text'>
Change 'new_usage' type to u64 so it can be compared with unsigned 'max'
and 'capacity' properly even if the value crosses the signed boundary.

Signed-off-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change 'new_usage' type to u64 so it can be compared with unsigned 'max'
and 'capacity' properly even if the value crosses the signed boundary.

Signed-off-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Change counters to be explicit 64bit types</title>
<updated>2023-07-18T22:10:00+00:00</updated>
<author>
<name>Haitao Huang</name>
<email>haitao.huang@linux.intel.com</email>
</author>
<published>2023-07-18T01:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=32bf85c60ca3584a7ba3bef19da2779b73b2e7d6'/>
<id>32bf85c60ca3584a7ba3bef19da2779b73b2e7d6</id>
<content type='text'>
So the variables can account for resources of huge quantities even on
32-bit machines.

Signed-off-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So the variables can account for resources of huge quantities even on
32-bit machines.

Signed-off-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/misc: Expose misc.current on cgroup v2 root</title>
<updated>2023-06-21T20:43:29+00:00</updated>
<author>
<name>LeiZhou-97</name>
<email>lei.zhou@intel.com</email>
</author>
<published>2023-06-13T10:59:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e973dfe9299441ad263efedf094897fbb31e6a87'/>
<id>e973dfe9299441ad263efedf094897fbb31e6a87</id>
<content type='text'>
Hello,

This patch is to expose misc.current on cgroup v2 root for tracking
how much of the resource has been consumed in total on the system.

Most of the cloud infrastucture use cgroup to fetch the host
information for scheduling purpose.

Currently, the misc controller can be used by Intel TDX HKIDs and
AMD SEV ASIDs, which are both used for creating encrypted VMs.
Intel TDX and AMD SEV are mostly be used by the cloud providers
for providing confidential VMs.

In actual use of a server, these confidential VMs may be launched
in different ways. For the cloud solution, there are kubvirt and
coco (tracked by kubepods.slice); on host, they can be booted
directly through qemu by end user (tracked by user.slice), etc.

In this complex environment, when wanting to know how many resource
is used in total it has to iterate through all existing slices to
get the value of each misc.current and add them up to calculate
the total number of consumed keys.

So exposing misc.current to root cgroup tends to give much easier
when calculates how much resource has been used in total, which
helps to schedule and count resources for the cloud infrastucture.

Signed-off-by: LeiZhou-97 &lt;lei.zhou@intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hello,

This patch is to expose misc.current on cgroup v2 root for tracking
how much of the resource has been consumed in total on the system.

Most of the cloud infrastucture use cgroup to fetch the host
information for scheduling purpose.

Currently, the misc controller can be used by Intel TDX HKIDs and
AMD SEV ASIDs, which are both used for creating encrypted VMs.
Intel TDX and AMD SEV are mostly be used by the cloud providers
for providing confidential VMs.

In actual use of a server, these confidential VMs may be launched
in different ways. For the cloud solution, there are kubvirt and
coco (tracked by kubepods.slice); on host, they can be booted
directly through qemu by end user (tracked by user.slice), etc.

In this complex environment, when wanting to know how many resource
is used in total it has to iterate through all existing slices to
get the value of each misc.current and add them up to calculate
the total number of consumed keys.

So exposing misc.current to root cgroup tends to give much easier
when calculates how much resource has been used in total, which
helps to schedule and count resources for the cloud infrastucture.

Signed-off-by: LeiZhou-97 &lt;lei.zhou@intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>misc_cgroup: remove error log to avoid log flood</title>
<updated>2021-09-20T17:35:38+00:00</updated>
<author>
<name>Chunguang Xu</name>
<email>brookxu@tencent.com</email>
</author>
<published>2021-09-17T12:44:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b03357528fd9516600ff358ab5d4b887b8cb80e8'/>
<id>b03357528fd9516600ff358ab5d4b887b8cb80e8</id>
<content type='text'>
In scenarios where containers are frequently created and deleted,
a large number of error logs maybe generated. The logs only show
which node is about to go over the max limit, not the node which
resource request failed. As misc.events has provided relevant
information, maybe we can remove this log.

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In scenarios where containers are frequently created and deleted,
a large number of error logs maybe generated. The logs only show
which node is about to go over the max limit, not the node which
resource request failed. As misc.events has provided relevant
information, maybe we can remove this log.

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>misc_cgroup: introduce misc.events to count failures</title>
<updated>2021-09-20T17:35:38+00:00</updated>
<author>
<name>Chunguang Xu</name>
<email>brookxu@tencent.com</email>
</author>
<published>2021-09-17T12:44:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=f279294b329363eb6ada568e494d609ef78e3e8e'/>
<id>f279294b329363eb6ada568e494d609ef78e3e8e</id>
<content type='text'>
Introduce misc.events to make it easier for us to understand
the pressure of resources. Currently only the 'max' event is
implemented, which indicates the times the resource is about
to exceeds the max limit.

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Reviewed-by: Vipin Sharma &lt;vipinsh@google.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce misc.events to make it easier for us to understand
the pressure of resources. Currently only the 'max' event is
implemented, which indicates the times the resource is about
to exceeds the max limit.

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Reviewed-by: Vipin Sharma &lt;vipinsh@google.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
