summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-05-12smb: client: show share compression info in DebugDatasmb-compression-asyncEnzo Matsumiya1-1/+5
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
2024-05-12smb: client: implement chained compression supportEnzo Matsumiya6-48/+297
Introduce 'Pattern V1' algorithm (MS-SMB2) for pattern (repeated single byte) scanning/matching. Usage of this algorithm requires chained compression support to be negotiated with the server, which is also done by this commit. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
2024-05-12smb: client: implement decompression of READ responsesEnzo Matsumiya8-70/+422
Implement LZ77 decompression for SMB2 READ responses. Since a lot of the code in smb2ops.c for receiving encrypted responses are very similar, add some helpers and/or modify some of those to support decompression as well. No behaviour change is expected. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
2024-05-12smb: client: implement compression of WRITE requestsEnzo Matsumiya13-18/+861
Implement unchained compression of write requests (with data size >= 4096) using the LZ77 "plain" compression algorithm as per MS-XCA specification, section "2.3 Plain LZ77 Compression Algorithm Details". Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
2024-03-19cifs: update internal module version number for cifs.koSteve French1-2/+2
From 2.47 to 2.48 Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-19smb: common: simplify compression headersEnzo Matsumiya1-19/+26
Unify compression headers (chained and unchained) into a single struct so we can use it for the initial compression transform header interchangeably. Also make the OriginalPayloadSize field to be always visible in the compression payload header, and have callers subtract its size when not needed. Rename the related structs to match the naming convetion used in the other SMB2 structs. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: common: fix fields sizes in compression_pattern_payload_v1Enzo Matsumiya1-2/+2
See protocol documentation in MS-SMB2 section 2.2.42.2.2 Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: negotiate compression algorithmsEnzo Matsumiya6-15/+49
Change "compress=" mount option to a boolean flag, that, if set, will enable negotiating compression algorithms with the server. Do not de/compress anything for now. Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb3: add dynamic trace point for ioctlsSteve French2-0/+37
It can be helpful in debugging to know which ioctls are called to better correlate them with smb3 fsctls (and opens). Add a dynamic trace point to trace ioctls into cifs.ko Here is sample output: TASK-PID CPU# ||||| TIMESTAMP FUNCTION | | | ||||| | | new-inotify-ioc-90418 [001] ..... 142157.397024: smb3_ioctl: xid=18 fid=0x0 ioctl cmd=0xc009cf0b new-inotify-ioc-90457 [007] ..... 142217.943569: smb3_ioctl: xid=22 fid=0x389bf5b6 ioctl cmd=0xc009cf0b Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: Fix writeback data corruptionDavid Howells1-126/+157
cifs writeback doesn't correctly handle the case where cifs_extend_writeback() hits a point where it is considering an additional folio, but this would overrun the wsize - at which point it drops out of the xarray scanning loop and calls xas_pause(). The problem is that xas_pause() advances the loop counter - thereby skipping that page. What needs to happen is for xas_reset() to be called any time we decide we don't want to process the page we're looking at, but rather send the request we are building and start a new one. Fix this by copying and adapting the netfslib writepages code as a temporary measure, with cifs writeback intending to be offloaded to netfslib in the near future. This also fixes the issue with the use of filemap_get_folios_tag() causing retry of a bunch of pages which the extender already dealt with. This can be tested by creating, say, a 64K file somewhere not on cifs (otherwise copy-offload may get underfoot), mounting a cifs share with a wsize of 64000, copying the file to it and then comparing the original file and the copy: dd if=/dev/urandom of=/tmp/64K bs=64k count=1 mount //192.168.6.1/test /mnt -o user=...,pass=...,wsize=64000 cp /tmp/64K /mnt/64K cmp /tmp/64K /mnt/64K Without the fix, the cmp fails at position 64000 (or shortly thereafter). Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: return reparse type in /proc/mountsPaulo Alcantara2-0/+14
Add support for returning reparse mount option in /proc/mounts. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402262152.YZOwDlCM-lkp@intel.com/ Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: set correct d_type for reparse DFS/DFSR and mount pointPaulo Alcantara1-7/+9
Set correct dirent->d_type for IO_REPARSE_TAG_DFS{,R} and IO_REPARSE_TAG_MOUNT_POINT reparse points. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: parse uid, gid, mode and dev from WSL reparse pointsPaulo Alcantara4-17/+97
Parse the extended attributes from WSL reparse points to correctly report uid, gid mode and dev from ther instantiated inodes. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: introduce SMB2_OP_QUERY_WSL_EAPaulo Alcantara6-25/+190
Add a new command to smb2_compound_op() for querying WSL extended attributes from reparse points. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: Fix a NULL vs IS_ERR() check in wsl_set_xattrs()Dan Carpenter1-1/+1
This was intended to be an IS_ERR() check. The ea_create_context() function doesn't return NULL. Fixes: 1eab17fe485c ("smb: client: add support for WSL reparse points") Reviewed-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: add support for WSL reparse pointsPaulo Alcantara10-20/+210
Add support for creating special files via WSL reparse points when using 'reparse=wsl' mount option. They're faster than NFS reparse points because they don't require extra roundtrips to figure out what ->d_type a specific dirent is as such information is already stored in query dir responses and then making getdents() calls faster. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: reduce number of parameters in smb2_compound_op()Paulo Alcantara2-69/+95
Replace @desired_access, @create_disposition, @create_options and @mode parameters with a single @oparms. No functional changes. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: fix potential broken compound requestPaulo Alcantara1-43/+63
Now that smb2_compound_op() can accept up to 5 commands in a single compound request, set the appropriate NextCommand and related flags to all subsequent commands as well as handling the case where a valid @cfile is passed and therefore skipping create and close requests in the compound chain. This fix a potential broken compound request that could be sent from smb2_get_reparse_inode() if the client found a valid open file (@cfile) prior to calling smb2_compound_op(). Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: move most of reparse point handling code to common filePaulo Alcantara9-364/+405
In preparation to add support for creating special files also via WSL reparse points in next commits. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: introduce reparse mount optionPaulo Alcantara4-0/+52
Allow the user to create special files and symlinks by choosing between WSL and NFS reparse points via 'reparse={nfs,wsl}' mount options. If unset or 'reparse=default', the client will default to creating them via NFS reparse points. Creating WSL reparse points isn't supported yet, so simply return error when attempting to mount with 'reparse=wsl' for now. Signed-off-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: retry compound request without reusing leaseMeetakshi Setiya1-3/+38
There is a shortcoming in the current implementation of the file lease mechanism exposed when the lease keys were attempted to be reused for unlink, rename and set_path_size operations for a client. As per MS-SMB2, lease keys are associated with the file name. Linux smb client maintains lease keys with the inode. If the file has any hardlinks, it is possible that the lease for a file be wrongly reused for an operation on the hardlink or vice versa. In these cases, the mentioned compound operations fail with STATUS_INVALID_PARAMETER. This patch adds a fallback to the old mechanism of not sending any lease with these compound operations if the request with lease key fails with STATUS_INVALID_PARAMETER. Resending the same request without lease key should not hurt any functionality, but might impact performance especially in cases where the error is not because of the usage of wrong lease key and we might end up doing an extra roundtrip. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: do not defer close open handles to deleted filesMeetakshi Setiya6-5/+74
When a file/dentry has been deleted before closing all its open handles, currently, closing them can add them to the deferred close list. This can lead to problems in creating file with the same name when the file is re-created before the deferred close completes. This issue was seen while reusing a client's already existing lease on a file for compound operations and xfstest 591 failed because of the deferred close handle that remained valid even after the file was deleted and was being reused to create a file with the same name. The server in this case returns an error on open with STATUS_DELETE_PENDING. Recreating the file would fail till the deferred handles are closed (duration specified in closetimeo). This patch fixes the issue by flagging all open handles for the deleted file (file path to be precise) by setting status_file_deleted to true in the cifsFileInfo structure. As per the information classes specified in MS-FSCC, SMB2 query info response from the server has a DeletePending field, set to true to indicate that deletion has been requested on that file. If this is the case, flag the open handles for this file too. When doing close in cifs_close for each of these handles, check the value of this boolean field and do not defer close these handles if the corresponding filepath has been deleted. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: reuse file lease key in compound operationsMeetakshi Setiya6-31/+48
Currently, when a rename, unlink or set path size compound operation is requested on a file that has a lot of dirty pages to be written to the server, we do not send the lease key for these requests. As a result, the server can assume that this request is from a new client, and send a lease break notification to the same client, on the same connection. As a response to the lease break, the client can consume several credits to write the dirty pages to the server. Depending on the server's credit grant implementation, the server can stop granting more credits to this connection, and this can cause a deadlock (which can only be resolved when the lease timer on the server expires). One of the problems here is that the client is sending no lease key, even if it has a lease for the file. This patch fixes the problem by reusing the existing lease key on the file for rename, unlink and set path size compound operations so that the client does not break its own lease. A very trivial example could be a set of commands by a client that maintains open handle (for write) to a file and then tries to copy the contents of that file to another one, eg., tail -f /dev/null > myfile & mv myfile myfile2 Presently, the network capture on the client shows that the move (or rename) would trigger a lease break on the same client, for the same file. With the lease key reused, the lease break request-response overhead is eliminated, thereby reducing the roundtrips performed for this set of operations. The patch fixes the bug described above and also provides perf benefit. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb3: update allocation size more accurately on write completionSteve French1-1/+8
Changes to allocation size are approximated for extending writes of cached files until the server returns the actual value (on SMB3 close or query info for example), but it was setting the estimated value for number of blocks to larger than the file size even if the file is likely sparse which breaks various xfstests (e.g. generic/129, 130, 221, 228). When i_size and i_blocks are updated in write completion do not increase allocation size more than what was written (rounded up to 512 bytes). Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: minor update to list of reviewersSteve French1-0/+1
Add Bharath for reviewing deferred close and leases Acked-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: remove SLAB_MEM_SPREAD flag usageChengming Zhou1-1/+1
The SLAB_MEM_SPREAD flag is already a no-op as of 6.8-rc1, remove its usage so we can delete it from slab. No functional change. Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-0-02f1753e8303@suse.cz/ Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: allow changing password during remountSteve French4-5/+30
There are cases where a session is disconnected and password has changed on the server (or expired) for this user and this currently can not be fixed without unmount and mounting again. This patch allows remount to change the password (for the non Kerberos case, Kerberos ticket refresh is handled differently) when the session is disconnected and the user can not reconnect due to still using old password. Future patches should also allow us to setup the keyring (cifscreds) to have an "alternate password" so we would be able to change the password before the session drops (without the risk of races between when the password changes and the disconnect occurs - ie cases where the old password is still needed because the new password has not fully rolled out to all servers yet). Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: prevent updating file size from server if we have a read/write leaseBharath SM4-12/+17
In cases of large directories, the readdir operation may span multiple round trips to retrieve contents. This introduces a potential race condition in case of concurrent write and readdir operations. If the readdir operation initiates before a write has been processed by the server, it may update the file size attribute to an older value. Address this issue by avoiding file size updates from readdir when we have read/write lease. Scenario: 1) process1: open dir xyz 2) process1: readdir instance 1 on xyz 3) process2: create file.txt for write 4) process2: write x bytes to file.txt 5) process2: close file.txt 6) process2: open file.txt for read 7) process1: readdir 2 - overwrites file.txt inode size to 0 8) process2: read contents of file.txt - bug, short read with 0 bytes Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10Linux 6.8v6.8Linus Torvalds1-1/+1
2024-03-10Merge tag 'trace-ring-buffer-v6.8-rc7' of ↵Linus Torvalds4-94/+120
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Do not allow large strings (> 4096) as single write to trace_marker The size of a string written into trace_marker was determined by the size of the sub-buffer in the ring buffer. That size is dependent on the PAGE_SIZE of the architecture as it can be mapped into user space. But on PowerPC, where PAGE_SIZE is 64K, that made the limit of the string of writing into trace_marker 64K. One of the selftests looks at the size of the ring buffer sub-buffers and writes that plus more into the trace_marker. The write will take what it can and report back what it consumed so that the user space application (like echo) will write the rest of the string. The string is stored in the ring buffer and can be read via the "trace" or "trace_pipe" files. The reading of the ring buffer uses vsnprintf(), which uses a precision "%.*s" to make sure it only reads what is stored in the buffer, as a bug could cause the string to be non terminated. With the combination of the precision change and the PAGE_SIZE of 64K allowing huge strings to be added into the ring buffer, plus the test that would actually stress that limit, a bug was reported that the precision used was too big for "%.*s" as the string was close to 64K in size and the max precision of vsnprintf is 32K. Linus suggested not to have that precision as it could hide a bug if the string was again stored without a nul byte. Another issue that was brought up is that the trace_seq buffer is also based on PAGE_SIZE even though it is not tied to the architecture limit like the ring buffer sub-buffer is. Having it be 64K * 2 is simply just too big and wasting memory on systems with 64K page sizes. It is now hardcoded to 8K which is what all other architectures with 4K PAGE_SIZE has. Finally, the write to trace_marker is now limited to 4K as there is no reason to write larger strings into trace_marker. - ring_buffer_wait() should not loop. The ring_buffer_wait() does not have the full context (yet) on if it should loop or not. Just exit the loop as soon as its woken up and let the callers decide to loop or not (they already do, so it's a bit redundant). - Fix shortest_full field to be the smallest amount in the ring buffer that a waiter is waiting for. The "shortest_full" field is updated when a new waiter comes in and wants to wait for a smaller amount of data in the ring buffer than other waiters. But after all waiters are woken up, it's not reset, so if another waiter comes in wanting to wait for more data, it will be woken up when the ring buffer has a smaller amount from what the previous waiters were waiting for. - The wake up all waiters on close is incorrectly called frome .release() and not from .flush() so it will never wake up any waiters as the .release() will not get called until all .read() calls are finished. And the wakeup is for the waiters in those .read() calls. * tag 'trace-ring-buffer-v6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Use .flush() call to wake up readers ring-buffer: Fix resetting of shortest_full ring-buffer: Fix waking up ring buffer readers tracing: Limit trace_marker writes to just 4K tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE tracing: Remove precision vsnprintf() check from print event
2024-03-10Merge tag 'phy-fixes3-6.8' of ↵Linus Torvalds1-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - fixes for Qualcomm qmp-combo driver for ordering of drm and type-c switch registartion due to drivers might not probe defer after having registered child devices to avoid triggering a probe deferral loop. This fixes internal display on Lenovo ThinkPad X13s * tag 'phy-fixes3-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom-qmp-combo: fix type-c switch registration phy: qcom-qmp-combo: fix drm bridge registration
2024-03-10tracing: Use .flush() call to wake up readersSteven Rostedt (Google)1-6/+15
The .release() function does not get called until all readers of a file descriptor are finished. If a thread is blocked on reading a file descriptor in ring_buffer_wait(), and another thread closes the file descriptor, it will not wake up the other thread as ring_buffer_wake_waiters() is called by .release(), and that will not get called until the .read() is finished. The issue originally showed up in trace-cmd, but the readers are actually other processes with their own file descriptors. So calling close() would wake up the other tasks because they are blocked on another descriptor then the one that was closed(). But there's other wake ups that solve that issue. When a thread is blocked on a read, it can still hang even when another thread closed its descriptor. This is what the .flush() callback is for. Have the .flush() wake up the readers. Link: https://lore.kernel.org/linux-trace-kernel/20240308202432.107909457@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: f3ddb74ad0790 ("tracing: Wake up ring buffer waiters on closing of the file") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-10ring-buffer: Fix resetting of shortest_fullSteven Rostedt (Google)1-7/+23
The "shortest_full" variable is used to keep track of the waiter that is waiting for the smallest amount on the ring buffer before being woken up. When a tasks waits on the ring buffer, it passes in a "full" value that is a percentage. 0 means wake up on any data. 1-100 means wake up from 1% to 100% full buffer. As all waiters are on the same wait queue, the wake up happens for the waiter with the smallest percentage. The problem is that the smallest_full on the cpu_buffer that stores the smallest amount doesn't get reset when all the waiters are woken up. It does get reset when the ring buffer is reset (echo > /sys/kernel/tracing/trace). This means that tasks may be woken up more often then when they want to be. Instead, have the shortest_full field get reset just before waking up all the tasks. If the tasks wait again, they will update the shortest_full before sleeping. Also add locking around setting of shortest_full in the poll logic, and change "work" to "rbwork" to match the variable name for rb_irq_work structures that are used in other places. Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.948914369@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds8-15/+120
Pull kvm fixes from Paolo Bonzini: "KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating an inconsistent ABI (KVM_MEM_GUEST_MEMFD is not writable from userspace, so there would be no way to write to a read-only guest_memfd). - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely for development and testing. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD dirty logging test that caused false passes. x86 fixes: - Fix missing marking of a guest page as dirty when emulating an atomic access. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention with preemptible kernels (including CONFIG_PREEMPT_DYNAMIC in non-preemptible mode). - Disable AMD DebugSwap by default, it breaks VMSA signing and will be re-enabled with a better VM creation API in 6.10. - Do the cache flush of converted pages in svm_register_enc_region() before dropping kvm->lock, to avoid a race with unregistering of the same region and the consequent use-after-free issue" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: SEV: disable SEV-ES DebugSwap by default KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region() KVM: selftests: Add a testcase to verify GUEST_MEMFD and READONLY are exclusive KVM: selftests: Create GUEST_MEMFD for relevant invalid flags testcases KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMU KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIP KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY KVM: x86: Mark target gfn of emulated atomic instruction as dirty
2024-03-10ring-buffer: Fix waking up ring buffer readersSteven Rostedt (Google)1-71/+68
A task can wait on a ring buffer for when it fills up to a specific watermark. The writer will check the minimum watermark that waiters are waiting for and if the ring buffer is past that, it will wake up all the waiters. The waiters are in a wait loop, and will first check if a signal is pending and then check if the ring buffer is at the desired level where it should break out of the loop. If a file that uses a ring buffer closes, and there's threads waiting on the ring buffer, it needs to wake up those threads. To do this, a "wait_index" was used. Before entering the wait loop, the waiter will read the wait_index. On wakeup, it will check if the wait_index is different than when it entered the loop, and will exit the loop if it is. The waker will only need to update the wait_index before waking up the waiters. This had a couple of bugs. One trivial one and one broken by design. The trivial bug was that the waiter checked the wait_index after the schedule() call. It had to be checked between the prepare_to_wait() and the schedule() which it was not. The main bug is that the first check to set the default wait_index will always be outside the prepare_to_wait() and the schedule(). That's because the ring_buffer_wait() doesn't have enough context to know if it should break out of the loop. The loop itself is not needed, because all the callers to the ring_buffer_wait() also has their own loop, as the callers have a better sense of what the context is to decide whether to break out of the loop or not. Just have the ring_buffer_wait() block once, and if it gets woken up, exit the function and let the callers decide what to do next. Link: https://lore.kernel.org/all/CAHk-=whs5MdtNjzFkTyaUy=vHi=qwWgPi0JgTe6OYUYMNSRZfg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.792933613@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-09Merge tag 'i2c-for-6.8-rc8' of ↵Linus Torvalds3-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two patches from Heiner for the i801 are targeting muxes discovered while working on some other features. Essentially, there is a reordering when adding optional slaves and proper cleanup upon registering a mux device. Christophe fixes the exit path in the wmt driver that was leaving the clocks hanging, and the last fix from Tommy avoids false error reports in IRQ" * tag 'i2c-for-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: aspeed: Fix the dummy irq expected print i2c: wmt: Fix an error handling path in wmt_i2c_probe() i2c: i801: Avoid potential double call to gpiod_remove_lookup_table i2c: i801: Fix using mux_pdev before it's set
2024-03-09Merge tag 'firewire-fixes-6.8-final' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A fix to suppress a warning about unreleased IRQ for 1394 OHCI hardware when disabling MSI. In Linux kernel v6.5, a PCI driver for 1394 OHCI hardware was optimized into the managed device resources. Edmund Raile points out that the change brings the warning about unreleased IRQ at the call of pci_disable_msi(), since the API expects that the relevant IRQ has already been released in advance. As long as the API is called in .remove callback of PCI device operation, it is prohibited to maintain the IRQ as the part of managed device resource. As a workaround, the IRQ is explicitly released at .remove callback, before the call of pci_disable_msi(). pci_disable_msi() is legacy API nowadays in PCI MSI implementation. I have a plan to replace it with the modern API in the development for the future version of Linux kernel. So at present I keep them as is" * tag 'firewire-fixes-6.8-final' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: ohci: prevent leak of left-over IRQ on unbind
2024-03-09SEV: disable SEV-ES DebugSwap by defaultPaolo Bonzini1-2/+5
The DebugSwap feature of SEV-ES provides a way for confidential guests to use data breakpoints. However, because the status of the DebugSwap feature is recorded in the VMSA, enabling it by default invalidates the attestation signatures. In 6.10 we will introduce a new API to create SEV VMs that will allow enabling DebugSwap based on what the user tells KVM to do. Contextually, we will change the legacy KVM_SEV_ES_INIT API to never enable DebugSwap. For compatibility with kernels that pre-date the introduction of DebugSwap, as well as with those where KVM_SEV_ES_INIT will never enable it, do not enable the feature by default. If anybody wants to use it, for now they can enable the sev_es_debug_swap_enabled module parameter, but this will result in a warning. Fixes: d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-03-09Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of ↵Paolo Bonzini5-6/+28
https://github.com/kvm-x86/linux into HEAD KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating ABI that KVM can't sanely support. - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely a development and testing vehicle, and come with zero guarantees. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD negative test that resulted in false passes when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
2024-03-09Merge tag 'kvm-x86-fixes-6.8-2' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini3-0/+78
KVM x86 fixes for 6.8, round 2: - When emulating an atomic access, mark the gfn as dirty in the memslot to fix a bug where KVM could fail to mark the slot as dirty during live migration, ultimately resulting in guest data corruption due to a dirty page not being re-copied from the source to the target. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention. Contending mmu_lock is especially problematic on preemptible kernels, as KVM may yield mmu_lock in response to the contention, which severely degrades overall performance due to vCPUs making it difficult for the task that triggered invalidation to make forward progress. Note, due to another kernel bug, this fix isn't limited to preemtible kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y will yield contended rwlocks and spinlocks. https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com
2024-03-08Merge tag 'ceph-for-6.8-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds1-0/+3
Pull ceph fix from Ilya Dryomov: "A follow-up for sparse read fixes that went into -rc4 -- msgr2 case was missed and is corrected here" * tag 'ceph-for-6.8-rc8' of https://github.com/ceph/ceph-client: libceph: init the cursor when preparing sparse read in msgr2
2024-03-08Merge tag 'char-misc-6.8-rc8' of ↵Linus Torvalds15-30/+126
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few small char/misc and other driver subsystem fixes for reported issues that have been in my tree. Included in here are fixes for: - iio driver fixes for reported problems - much reported bugfix for a lis3lv02d_i2c regression - comedi driver bugfix - mei new device ids - mei driver fixes - counter core fix All of these have been in linux-next with no reported issues, some for many weeks" * tag 'char-misc-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: gsc_proxy: match component when GSC is on different bus misc: fastrpc: Pass proper arguments to scm call comedi: comedi_test: Prevent timers rescheduling during deletion comedi: comedi_8255: Correct error in subdevice initialization misc: lis3lv02d_i2c: Fix regulators getting en-/dis-abled twice on suspend/resume iio: accel: adxl367: fix I2C FIFO data register iio: accel: adxl367: fix DEVID read after reset iio: pressure: dlhl60d: Initialize empty DLH bytes iio: imu: inv_mpu6050: fix frequency setting when chip is off iio: pressure: Fixes BMP38x and BMP390 SPI support iio: imu: inv_mpu6050: fix FIFO parsing when empty mei: Add Meteor Lake support for IVSC device mei: me: add arrow lake point H DID mei: me: add arrow lake point S DID counter: fix privdata alignment
2024-03-08Merge tag 'tty-6.8-rc8' of ↵Linus Torvalds6-15/+57
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are some small remaining tty/serial driver fixes. Included in here is fixes for: - vt unicode buffer corruption fix - imx serial driver fixes, again - port suspend fix - 8250_dw driver fix - fsl_lpuart driver fix - revert for the qcom_geni_serial driver to fix a reported regression All of these have been in linux-next with no reported issues" * tag 'tty-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "tty: serial: simplify qcom_geni_serial_send_chunk_fifo()" tty: serial: fsl_lpuart: avoid idle preamble pending if CTS is enabled vt: fix unicode buffer corruption when deleting characters serial: port: Don't suspend if the port is still busy serial: 8250_dw: Do not reclock if already at correct rate tty: serial: imx: Fix broken RS485
2024-03-08Merge tag 'usb-6.8-rc8' of ↵Linus Torvalds8-20/+47
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are some small remaining fixes for USB and Thunderbolt drivers. Included in here are fixes for: - thunderbold NULL dereference fix - typec driver fixes - xhci driver regression fix - usb-storage divide-by-0 fix - ncm gadget driver fix All of these have been in linux-next with no reported issues" * tag 'usb-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: Fix failure to detect ring expansion need. usb: port: Don't try to peer unused USB ports based on location usb: gadget: ncm: Fix handling of zero block length packets usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group usb: typec: tpcm: Fix PORT_RESET behavior for self powered devices usb: typec: ucsi: fix UCSI on SM8550 & SM8650 Qualcomm devices USB: usb-storage: Prevent divide-by-0 error in isd200_ata_command thunderbolt: Fix NULL pointer dereference in tb_port_update_credits()
2024-03-08Merge tag 'pinctrl-v6.8-3' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix the PM suspend callback in the STM32 ST32MP257 driver to properly support suspend - Drop an extraneous reference put in the debugfs code, this was confusing the reference counts and causing unsolicited calls to __free() * tag 'pinctrl-v6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: don't put the reference to GPIO device in pinctrl_pins_show() pinctrl: stm32: fix PM support for stm32mp257
2024-03-08Merge tag 'input-for-v6.8-rc7' of ↵Linus Torvalds4-29/+13
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a revert of endpoint checks in bcm5974 - the driver is being naughty and pokes at unclaimed USB interface, so the check fails. We need to fix the driver to claim both interfaces, and then re-implement the endpoints check - a fix to Synaptics RMI driver to avoid UAF on driver unload or device unbinding - a few new VID/PIDs added to xpad game controller driver - a change to gpio_keys_polled driver to quiet it when GPIO causes probe deferral. * tag 'input-for-v6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal Input: gpio_keys_polled - suppress deferred probe error for gpio Revert "Input: bcm5974 - check endpoint type before starting traffic" Input: xpad - add additional HyperX Controller Identifiers
2024-03-08Merge tag 'sound-6.8' of ↵Linus Torvalds9-14/+69
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. Half of them are HD-audio quirks while the rest are various device-specific ASoC fixes" * tag 'sound-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: wm8962: Fix up incorrect error message in wm8962_set_fll ASoC: wm8962: Enable both SPKOUTR_ENA and SPKOUTL_ENA in mono mode ASoC: wm8962: Enable oscillator if selecting WM8962_FLL_OSC ASoC: dt-bindings: nvidia: Fix 'lge' vendor prefix ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook ASoC: amd: yc: Add HP Pavilion Aero Laptop 13-be2xxx(8BD6) into DMI quirk table ASoC: rcar: adg: correct TIMSEL setting for SSI9 ALSA: hda: cs35l41: Overwrite CS35L41 configuration for ASUS UM5302LA ALSA: hda/realtek: Add quirks for Lenovo Thinkbook 16P laptops ALSA: hda: cs35l41: Support Lenovo Thinkbook 16P ALSA: hda/realtek - Add Headset Mic supported Acer NB platform ALSA: hda: optimize the probe codec process ALSA: hda/realtek - Fix headset Mic no show at resume back for Lenovo ALC897 platform ASoC: Intel: bytcr_rt5640: Add an extra entry for the Chuwi Vi8 tablet ASoC: madera: Fix typo in madera_set_fll_clks shift value
2024-03-08Merge tag 'drm-fixes-2024-03-08' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds24-74/+141
Pull drm fixes from Dave Airlie: "Regular fixes (two weeks for i915), scattered across drivers, amdgpu and i915 being the main ones, with nouveau having a couple of fixes. One patch got applied for udl, but reverted soon after as the maintainer has missed some crucial prior discussion. Seems quiet and normal enough for this stage. MAINTAINERS - update email address core: - fix polling in certain configurations buddy: - fix kunit test warning panel: - boe-tv101wum-nl6: timing tuning fixes i915: - Fix to extract HDCP information from primary connector - Check for NULL mmu_interval_notifier before removing - Fix for #10184: Kernel crash on UHD Graphics 730 (Cc stable) - Fix for #10284: Boot delay regresion with PSR - Fix DP connector DSC HW state readout - Selftest fix to convert msecs to jiffies xe: - error path fix amdgpu: - SMU14 fix - Fix possible NULL pointer - VRR fix - pwm fix nouveau: - fix deadlock in new ioctls fail path - fix missing locking around object rbtree udl: - apply and revert format change" * tag 'drm-fixes-2024-03-08' of https://gitlab.freedesktop.org/drm/kernel: (21 commits) nouveau: lock the client object tree. drm/tests/buddy: fix print format drm/xe: Return immediately on tile_