linux.git/drivers/android, branch v5.7.15

binder: Prevent context manager from incrementing ref 0

2020-08-11T13:35:33+00:00

commit 4b836a1426cb0f1ef2a6e211d7e553221594f8fc upstream.

Binder is designed such that a binder_proc never has references to
itself. If this rule is violated, memory corruption can occur when a
process sends a transaction to itself; see e.g.
.

There is a remaining edgecase through which such a transaction-to-self
can still occur from the context of a task with BINDER_SET_CONTEXT_MGR
access:

 - task A opens /dev/binder twice, creating binder_proc instances P1
   and P2
 - P1 becomes context manager
 - P2 calls ACQUIRE on the magic handle 0, allocating index 0 in its
   handle table
 - P1 dies (by closing the /dev/binder fd and waiting a bit)
 - P2 becomes context manager
 - P2 calls ACQUIRE on the magic handle 0, allocating index 1 in its
   handle table
   [this triggers a warning: "binder: 1974:1974 tried to acquire
   reference to desc 0, got 1 instead"]
 - task B opens /dev/binder once, creating binder_proc instance P3
 - P3 calls P2 (via magic handle 0) with (void*)1 as argument (two-way
   transaction)
 - P2 receives the handle and uses it to call P3 (two-way transaction)
 - P3 calls P2 (via magic handle 0) (two-way transaction)
 - P2 calls P2 (via handle 1) (two-way transaction)

And then, if P2 does *NOT* accept the incoming transaction work, but
instead closes the binder fd, we get a crash.

Solve it by preventing the context manager from using ACQUIRE on ref 0.
There shouldn't be any legitimate reason for the context manager to do
that.

Additionally, print a warning if someone manages to find another way to
trigger a transaction-to-self bug in the future.

Cc: stable@vger.kernel.org
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Acked-by: Todd Kjos 
Signed-off-by: Jann Horn 
Reviewed-by: Martijn Coenen 
Link: https://lore.kernel.org/r/20200727120424.1627555-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman

binder: Don't use mmput() from shrinker function.

2020-07-29T08:19:55+00:00

commit f867c771f98891841c217fa8459244ed0dd28921 upstream.

syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.

Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
callback") replaced mmput() with mmput_async() in order to avoid sleeping
with spinlock held. But this patch replaces mmput() with mmput_async() in
order not to start __mmput() from shrinker context.

[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

Reported-by: syzbot 
Reported-by: syzbot 
Signed-off-by: Tetsuo Handa 
Reviewed-by: Michal Hocko 
Acked-by: Todd Kjos 
Acked-by: Christian Brauner 
Cc: stable 
Link: https://lore.kernel.org/r/4ba9adb2-43f5-2de0-22de-f6075c1fab50@i-love.sakura.ne.jp
Signed-off-by: Greg Kroah-Hartman

binder: fix null deref of proc->context

2020-06-30T19:35:54+00:00

commit d35d3660e065b69fdb8bf512f3d899f350afce52 upstream.

The binder driver makes the assumption proc->context pointer is invariant after
initialization (as documented in the kerneldoc header for struct proc).
However, in commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
proc->context is set to NULL during binder_deferred_release().

Another proc was in the middle of setting up a transaction to the dying
process and crashed on a NULL pointer deref on "context" which is a local
set to &proc->context:

    new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;

Here's the stack:

[ 5237.855435] Call trace:
[ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
[ 5237.855446] binder_inc_ref_for_node+0x140/0x280
[ 5237.855451] binder_translate_binder+0x1d0/0x388
[ 5237.855456] binder_transaction+0x2228/0x3730
[ 5237.855461] binder_thread_write+0x640/0x25bc
[ 5237.855466] binder_ioctl_write_read+0xb0/0x464
[ 5237.855471] binder_ioctl+0x30c/0x96c
[ 5237.855477] do_vfs_ioctl+0x3e0/0x700
[ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
[ 5237.855488] el0_svc_common+0xb4/0x194
[ 5237.855493] el0_svc_handler+0x74/0x98
[ 5237.855497] el0_svc+0x8/0xc

The fix is to move the kfree of the binder_device to binder_free_proc()
so the binder_device is freed when we know there are no references
remaining on the binder_proc.

Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
Acked-by: Christian Brauner 
Signed-off-by: Todd Kjos 
Cc: stable 
Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman

Merge 5.6-rc7 into char-misc-next

2020-03-23T06:59:38+00:00

We need the char/misc driver fixes in here as well.

Signed-off-by: Greg Kroah-Hartman

binderfs: port to new mount api

2020-03-19T06:41:01+00:00

When I first wrote binderfs the new mount api had not yet landed. Now
that it has been around for a little while and a bunch of filesystems
have already been ported we should do so too. When Al sent his
mount-api-conversion pr he requested that binderfs (and a few others) be
ported separately. It's time we port binderfs. We can make use of the
new option parser, get nicer infrastructure and it will be easier if we
ever add any new mount options.

This survives testing with the binderfs selftests:

for i in `seq 1 1000`; do ./binderfs_test; done

including the new stress tests I sent out for review today:

 TAP version 13
 1..1
 # selftests: filesystems/binderfs: binderfs_test
 # [==========] Running 3 tests from 1 test cases.
 # [ RUN      ] global.binderfs_stress
 # [  XFAIL!  ] Tests are not run as root. Skipping privileged tests
 # [==========] Running 3 tests from 1 test cases.
 # [ RUN      ] global.binderfs_stress
 # [       OK ] global.binderfs_stress
 # [ RUN      ] global.binderfs_test_privileged
 # [       OK ] global.binderfs_test_privileged
 # [ RUN      ] global.binderfs_test_unprivileged
 # # Allocated new binder device with major 243, minor 4, and name my-binder
 # # Detected binder version: 8
 # [==========] Running 3 tests from 1 test cases.
 # [ RUN      ] global.binderfs_stress
 # [       OK ] global.binderfs_stress
 # [ RUN      ] global.binderfs_test_privileged
 # [       OK ] global.binderfs_test_privileged
 # [ RUN      ] global.binderfs_test_unprivileged
 # [       OK ] global.binderfs_test_unprivileged
 # [==========] 3 / 3 tests passed.
 # [  PASSED  ]
 ok 1 selftests: filesystems/binderfs: binderfs_test

Cc: Todd Kjos 
Signed-off-by: Christian Brauner 
Reviewed-by: Kees Cook 
Link: https://lore.kernel.org/r/20200313153427.141789-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman

binderfs: use refcount for binder control devices too

2020-03-11T18:33:52+00:00

Binderfs binder-control devices are cleaned up via binderfs_evict_inode
too() which will use refcount_dec_and_test(). However, we missed to set
the refcount for binderfs binder-control devices and so we underflowed
when the binderfs instance got unmounted. Pretty obvious oversight and
should have been part of the more general UAF fix. The good news is that
having test cases (suprisingly) helps.

Technically, we could detect that we're about to cleanup the
binder-control dentry in binderfs_evict_inode() and then simply clean it
up. But that makes the assumption that the binder driver itself will
never make use of a binderfs binder-control device after the binderfs
instance it belongs to has been unmounted and the superblock for it been
destroyed. While it is unlikely to ever come to this let's be on the
safe side. Performance-wise this also really doesn't matter since the
binder-control device is only every really when creating the binderfs
filesystem or creating additional binder devices. Both operations are
pretty rare.

Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
Link: https://lore.kernel.org/r/CA+G9fYusdfg7PMfC9Xce-xLT7NiyKSbgojpK35GOm=Pf9jXXrA@mail.gmail.com
Reported-by: Naresh Kamboju 
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner 
Acked-by: Todd Kjos 
Link: https://lore.kernel.org/r/20200311105309.1742827-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman

binder: prevent UAF for binderfs devices II

2020-03-03T18:58:37+00:00

This is a necessary follow up to the first fix I proposed and we merged
in 2669b8b0c79 ("binder: prevent UAF for binderfs devices"). I have been
overly optimistic that the simple fix I proposed would work. But alas,
ihold() + iput() won't work since the inodes won't survive the
destruction of the superblock.
So all we get with my prior fix is a different race with a tinier
race-window but it doesn't solve the issue. Fwiw, the problem lies with
generic_shutdown_super(). It even has this cozy Al-style comment:

          if (!list_empty(&sb->s_inodes)) {
                  printk("VFS: Busy inodes after unmount of %s. "
                     "Self-destruct in 5 seconds.  Have a nice day...\n",
                     sb->s_id);
          }

On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.

If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.

So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
	proc->context = &binder_dev->context;
	/* binderfs stashes devices in i_private */
	if (is_binderfs_device(nodp)) {
		binder_dev = nodp->i_private;
		info = nodp->i_sb->s_fs_info;
		binder_binderfs_dir_entry_proc = info->proc_log_dir;
	} else {
	.
	.
	.
	proc->context = &binder_dev->context;

Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:

static void binderfs_evict_inode(struct inode *inode)
{
	struct binder_device *device = inode->i_private;
	struct binderfs_info *info = BINDERFS_I(inode);

	clear_inode(inode);

	if (!S_ISCHR(inode->i_mode) || !device)
		return;

	mutex_lock(&binderfs_minors_mutex);
	--info->device_count;
	ida_free(&binderfs_minors, device->miscdev.minor);
	mutex_unlock(&binderfs_minors_mutex);

	kfree(device->context.name);
	kfree(device);
}

thereby freeing the struct binder_device including struct
binder_context.

Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.

Fix this by introducing a refounct on binder devices.

This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").

Fixes: 3ad20fe393b3 ("binder: implement binderfs")
Fixes: 2669b8b0c798 ("binder: prevent UAF for binderfs devices")
Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner 
Acked-by: Todd Kjos 
Link: https://lore.kernel.org/r/20200303164340.670054-1-christian.brauner@ubuntu.com
Signed-off-by: Greg Kroah-Hartman

binder: prevent UAF for binderfs devices

2020-03-03T07:02:57+00:00

On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
called which punts the actual cleanup operation to a workqueue. At some
point, binder_deferred_func() will be called which will end up calling
binder_deferred_release() which will retrieve and cleanup the
binder_context attach to this struct binder_proc.

If we trace back where this binder_context is attached to binder_proc we
see that it is set in binder_open() and is taken from the struct
binder_device it is associated with. This obviously assumes that the
struct binder_device that context is attached to is _never_ freed. While
that might be true for devtmpfs binder devices it is most certainly
wrong for binderfs binder devices.

So, assume binder_open() is called on a binderfs binder devices. We now
stash away the struct binder_context associated with that struct
binder_devices:
	proc->context = &binder_dev->context;
	/* binderfs stashes devices in i_private */
	if (is_binderfs_device(nodp)) {
		binder_dev = nodp->i_private;
		info = nodp->i_sb->s_fs_info;
		binder_binderfs_dir_entry_proc = info->proc_log_dir;
	} else {
	.
	.
	.
	proc->context = &binder_dev->context;

Now let's assume that the binderfs instance for that binder devices is
shutdown via umount() and/or the mount namespace associated with it goes
away. As long as there is still an fd open for that binderfs binder
device things are fine. But let's assume we now close the last fd for
that binderfs binder device. Now binder_release() is called and punts to
the workqueue. Assume that the workqueue has quite a bit of stuff to do
and doesn't get to cleaning up the struct binder_proc and the associated
struct binder_context with it for that binderfs binder device right
away. In the meantime, the VFS is killing the super block and is
ultimately calling sb->evict_inode() which means it will call
binderfs_evict_inode() which does:

static void binderfs_evict_inode(struct inode *inode)
{
	struct binder_device *device = inode->i_private;
	struct binderfs_info *info = BINDERFS_I(inode);

	clear_inode(inode);

	if (!S_ISCHR(inode->i_mode) || !device)
		return;

	mutex_lock(&binderfs_minors_mutex);
	--info->device_count;
	ida_free(&binderfs_minors, device->miscdev.minor);
	mutex_unlock(&binderfs_minors_mutex);

	kfree(device->context.name);
	kfree(device);
}

thereby freeing the struct binder_device including struct
binder_context.

Now the workqueue finally has time to get around to cleaning up struct
binder_proc and is now trying to access the associate struct
binder_context. Since it's already freed it will OOPs.

Fix this by holding an additional reference to the inode that is only
released once the workqueue is done cleaning up struct binder_proc. This
is an easy alternative to introducing separate refcounting on struct
binder_device which we can always do later if it becomes necessary.

This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
print_binder_transaction_log_entry()").

Fixes: 3ad20fe393b3 ("binder: implement binderfs")
Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner 
Acked-by: Todd Kjos 
Signed-off-by: Greg Kroah-Hartman

Merge tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block

2020-01-30T02:53:37+00:00

Pull io_uring updates from Jens Axboe:

 - Support for various new opcodes (fallocate, openat, close, statx,
   fadvise, madvise, openat2, non-vectored read/write, send/recv, and
   epoll_ctl)

 - Faster ring quiesce for fileset updates

 - Optimizations for overflow condition checking

 - Support for max-sized clamping

 - Support for probing what opcodes are supported

 - Support for io-wq backend sharing between "sibling" rings

 - Support for registering personalities

 - Lots of little fixes and improvements

* tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block: (64 commits)
  io_uring: add support for epoll_ctl(2)
  eventpoll: support non-blocking do_epoll_ctl() calls
  eventpoll: abstract out epoll_ctl() handler
  io_uring: fix linked command file table usage
  io_uring: support using a registered personality for commands
  io_uring: allow registering credentials
  io_uring: add io-wq workqueue sharing
  io-wq: allow grabbing existing io-wq
  io_uring/io-wq: don't use static creds/mm assignments
  io-wq: make the io_wq ref counted
  io_uring: fix refcounting with batched allocations at OOM
  io_uring: add comment for drain_next
  io_uring: don't attempt to copy iovec for READ/WRITE
  io_uring: honor IOSQE_ASYNC for linked reqs
  io_uring: prep req when do IOSQE_ASYNC
  io_uring: use labeled array init in io_op_defs
  io_uring: optimise sqe-to-req flags translation
  io_uring: remove REQ_F_IO_DRAINED
  io_uring: file switch work needs to get flushed on exit
  io_uring: hide uring_fd in ctx
  ...

binder: fix log spam for existing debugfs file creation.

2020-01-22T14:14:24+00:00

Since commit 43e23b6c0b01 ("debugfs: log errors when something goes wrong")
debugfs logs attempts to create existing files.

However binder attempts to create multiple debugfs files with
the same name when a single PID has multiple contexts, this leads
to log spamming during an Android boot (17 such messages during
boot on my system).

Fix this by checking if we already know the PID and only create
the debugfs entry for the first context per PID.

Do the same thing for binderfs for symmetry.

Signed-off-by: Martin Fuzzey 
Acked-by: Todd Kjos 
Fixes: 43e23b6c0b01 ("debugfs: log errors when something goes wrong")
Cc: stable 
Link: https://lore.kernel.org/r/1578671054-5982-1-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Greg Kroah-Hartman