.. SPDX-License-Identifier: GPL-2.0
=========================================
Overview of the Linux Virtual File System
=========================================
Original author: Richard Gooch <rgooch@atnf.csiro.au>
- Copyright (C) 1999 Richard Gooch
- Copyright (C) 2005 Pekka Enberg
Introduction
============
The Virtual File System (also known as the Virtual Filesystem Switch) is
the software layer in the kernel that provides the filesystem interface
to userspace programs. It also provides an abstraction within the
kernel which allows different filesystem implementations to coexist.
VFS system calls open(2), stat(2), read(2), write(2), chmod(2) and so on
are called from a process context. Filesystem locking is described in
the document Documentation/filesystems/locking.rst.
Directory Entry Cache (dcache)
------------------------------
The VFS implements the open(2), stat(2), chmod(2), and similar system
calls. The pathname argument that is passed to them is used by the VFS
to search through the directory entry cache (also known as the dentry
cache or dcache). This provides a very fast look-up mechanism to
translate a pathname (filename) into a specific dentry. Dentries live
in RAM and are never saved to disc: they exist only for performance.
The dentry cache is meant to be a view into your entire filespace. As
most computers cannot fit all dentries in RAM at the same time, some
bits of the cache are missing. In order to resolve your pathname into a
dentry, the VFS may have to resort to creating dentries along the way,
and then loading the inode. This is done by looking up the inode.
The Inode Object
----------------
An individual dentry usually has a pointer to an inode. Inodes are
filesystem objects such as regular files, directories, FIFOs and other
beasts. They live either on the disc (for block device filesystems) or
in the memory (for pseudo filesystems). Inodes that live on the disc
are copied into the memory when required and changes to the inode are
written back to disc. A single inode can be pointed to by multiple
dentries (hard links, for example, do this).
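
The relationship between names and inodes can be observed from
userspace. The following is a minimal sketch, not kernel code: it
creates a scratch file, adds a hard link to it, and compares the device
and inode numbers that stat(2) reports for the two names. The function
name ``names_share_inode()`` and the scratch paths are illustrative
assumptions, and the sketch assumes the filesystem holding the paths
supports hard links.

.. code-block:: c

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: create path_a, hard-link it as path_b and
     * check whether both names resolve to the same inode.
     * Returns 1 if they do, 0 if they do not, -1 on error.
     * Both scratch names are removed again before returning.
     */
    int names_share_inode(const char *path_a, const char *path_b)
    {
        struct stat sa, sb;
        int fd, ret = -1;

        fd = open(path_a, O_CREAT | O_WRONLY, 0600);
        if (fd < 0)
            return -1;
        close(fd);

        if (link(path_a, path_b) == 0) {
            /* Two names (two dentries), one inode. */
            if (stat(path_a, &sa) == 0 && stat(path_b, &sb) == 0)
                ret = (sa.st_dev == sb.st_dev &&
                       sa.st_ino == sb.st_ino);
            unlink(path_b);
        }
        unlink(path_a);
        return ret;
    }

Calling ``names_share_inode("/tmp/a.tmp", "/tmp/b.tmp")`` on a
filesystem with hard link support is expected to return 1.
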
Looking up an inode requires the VFS to call the lookup() method of
the parent directory inode. This method is installed by the specific
filesystem implementation that the inode lives in. Once the VFS has the
required dentry (and hence the inode), it can do all those boring things
like open(2) the file, or stat(2) it to peek at the inode data. The
stat(2) operation is fairly simple: once the VFS has the dentry, it
peeks at the inode data and passes some of it back to userspace.
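
The component-by-component walk described above can be modelled in a few
lines of userspace C. This is a toy model under stated assumptions, not
the kernel implementation: every ``toy_`` name is hypothetical, the
dentry cache is left out entirely, and a "directory" is just an inode
with a small array of children. What it does show is the switch itself:
the walker never scans a directory on its own, it calls the lookup()
method that the filesystem installed on the parent inode.

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    #define TOY_MAX_CHILDREN 4

    struct toy_inode;

    /* Per-filesystem method table; only lookup() is modelled here. */
    struct toy_inode_ops {
        struct toy_inode *(*lookup)(struct toy_inode *dir,
                                    const char *name);
    };

    struct toy_inode {
        const char *name;                 /* name within the parent */
        const struct toy_inode_ops *ops;  /* NULL for non-directories */
        struct toy_inode *children[TOY_MAX_CHILDREN];
    };

    /* A trivial "filesystem" lookup(): linear scan of the directory. */
    static struct toy_inode *toy_fs_lookup(struct toy_inode *dir,
                                           const char *name)
    {
        for (int i = 0; i < TOY_MAX_CHILDREN; i++)
            if (dir->children[i] &&
                strcmp(dir->children[i]->name, name) == 0)
                return dir->children[i];
        return NULL;
    }

    const struct toy_inode_ops toy_fs_ops = { .lookup = toy_fs_lookup };

    /*
     * Walk a path one component at a time, asking each parent
     * directory's lookup() method for the next inode. Returns NULL if
     * a component is missing or a non-directory is used as a parent.
     */
    struct toy_inode *toy_path_walk(struct toy_inode *root,
                                    const char *path)
    {
        struct toy_inode *cur = root;
        char comp[64];

        while (cur && *path) {
            size_t len;

            path += strspn(path, "/");    /* skip separators */
            len = strcspn(path, "/");     /* next component length */
            if (len == 0)
                break;                    /* trailing slash or end */
            if (len >= sizeof(comp) || !cur->ops || !cur->ops->lookup)
                return NULL;
            memcpy(comp, path, len);
            comp[len] = '\0';
            cur = cur->ops->lookup(cur, comp);
            path += len;
        }
        return cur;
    }

A different toy filesystem would only need to supply its own
``toy_inode_ops``; ``toy_path_walk()`` stays unchanged.
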
The File Object
---------------
Opening a file requires another operation: allocation of a file
structure (this is the kernel-side implementation of file descriptors).
The freshly allocated file structure is initialized with a pointer to
the dentry and a set of file operation member functions. These are
taken from the inode data. The open() file method is then called so the
specific filesystem implementation can do its work. You can see that
this is another switch performed by the VFS. The file structure is
placed into the file descriptor table for the process.
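
The difference between a file descriptor and the file structure behind
it is visible from userspace through the file offset, which is stored in
the file structure. The sketch below is an illustration, not part of the
VFS API: the helper name ``file_struct_sharing()`` and the scratch path
are assumptions. A descriptor obtained with dup(2) refers to the same
file structure and therefore sees the offset advance, while a second
open(2) of the same path allocates a new file structure with its own
offset.

.. code-block:: c

    #include <fcntl.h>
    #include <sys/types.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: write 4 bytes through one descriptor, then
     * report the offset seen through a dup(2)'ed descriptor and
     * through an independently opened one. Returns 0 on success and
     * -1 on error. The scratch file is removed before returning.
     */
    int file_struct_sharing(const char *path, off_t *dup_off,
                            off_t *reopen_off)
    {
        int fd, fd_dup, fd_again, ret = -1;

        fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0)
            return -1;
        fd_dup = dup(fd);                  /* same file structure */
        fd_again = open(path, O_RDONLY);   /* new file structure */

        if (fd_dup >= 0 && fd_again >= 0 &&
            write(fd, "data", 4) == 4) {
            *dup_off = lseek(fd_dup, 0, SEEK_CUR);
            *reopen_off = lseek(fd_again, 0, SEEK_CUR);
            ret = 0;
        }

        if (fd_again >= 0)
            close(fd_again);
        if (fd_dup >= 0)
            close(fd_dup);
        close(fd);
        unlink(path);
        return ret;
    }

After a successful call, ``*dup_off`` is 4 and ``*reopen_off`` is 0,
even though all three descriptors refer to the same inode.
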
Reading, writing and closing files (and other assorted VFS operations)
are done by using the userspace file descriptor to grab the appropriate
file structure, and then calling the required file structure method to
do whatever is required. For as long as the file is open, it keeps the
dentry in use, which in turn means that the VFS inode is still in use.
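
One visible consequence of an open file keeping its inode in use is that
the data stays readable after the last name has been removed. The
following userspace sketch illustrates this; the helper name
``read_after_unlink()`` and the scratch path are assumptions, and the
reported link count assumes a local filesystem such as ext4 or tmpfs
(network filesystems may behave differently).

.. code-block:: c

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: create a scratch file, write "hello" to it,
     * unlink its only name while it is still open, then read the data
     * back through the open descriptor. Returns the number of bytes
     * read, or -1 on error. *nlink_out receives the link count that
     * fstat(2) reports after the unlink.
     */
    long read_after_unlink(const char *path, char *buf, size_t len,
                           long *nlink_out)
    {
        struct stat st;
        ssize_t n = -1;
        int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);

        if (fd < 0)
            return -1;

        if (write(fd, "hello", 5) == 5 &&
            unlink(path) == 0 &&          /* the name is gone ... */
            fstat(fd, &st) == 0 &&        /* ... the inode is not */
            lseek(fd, 0, SEEK_SET) == 0) {
            *nlink_out = (long)st.st_nlink;
            n = read(fd, buf, len);
        }

        close(fd);                        /* last reference dropped */
        return (long)n;
    }

The inode's storage can only be reclaimed once the final close(2) drops
the last reference to the file structure.
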
Registering and Mounting a Filesystem
=====================================
To register and unregister a filesystem, use the following API
functions:

.. code-block:: c

    #include <linux/fs.h>

    extern int register_filesystem(struct file_system_type *);
    extern int unregister_filesystem(struct file_system_type *);

The passed struct file_system_type describes your filesystem. When a
request is made to mount a filesystem onto a directory in your
namespace, the VFS will call the appropriate mount() method for the
specific filesystem. New vfsmount referring to