fscache: Rewrite documentation

Rewrite the fscache documentation.

Changes
=======
ver #3:
 - The volume coherency data is now an arbitrarily-sized blob, not a u64.

ver #2:
 - Put quoting around some bits of C being referred to in the docs[1].
 - Stripped the markup off the ref to the netfs lib doc[2].

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/20211130175119.63d0e7aa@canb.auug.org.au/ [1]
Link: https://lore.kernel.org/r/20211130162311.105fcfa5@canb.auug.org.au/ [2]
Link: https://lore.kernel.org/r/163819672252.215744.15454333549935901588.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906986754.143852.17703291789683936950.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967193834.1823006.15991526817786159772.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021585970.640689.3162537597817521032.stgit@warthog.procyon.org.uk/ # v4
David Howells 2021-11-10 13:25:03 +00:00
parent 1702e79734
commit e0484344c0
8 changed files with 928 additions and 2358 deletions

File diff suppressed because it is too large

@@ -1,8 +1,8 @@
.. SPDX-License-Identifier: GPL-2.0
===============================================
CacheFiles: CACHE ON ALREADY MOUNTED FILESYSTEM
===============================================
===================================
Cache on Already Mounted Filesystem
===================================
.. Contents:

@@ -10,25 +10,25 @@ Overview
This facility is a general purpose cache for network filesystems, though it
could be used for caching other things such as ISO9660 filesystems too.
FS-Cache mediates between cache backends (such as CacheFS) and network
FS-Cache mediates between cache backends (such as CacheFiles) and network
filesystems::
+---------+
| | +--------------+
| NFS |--+ | |
| | | +-->| CacheFS |
+---------+ | +----------+ | | /dev/hda5 |
| | | | +--------------+
+---------+ +-->| | |
| | | |--+
| AFS |----->| FS-Cache |
| | | |--+
+---------+ +-->| | |
| | | | +--------------+
+---------+ | +----------+ | | |
| | | +-->| CacheFiles |
| ISOFS |--+ | /var/cache |
| | +--------------+
| | +--------------+
| NFS |--+ | |
| | | +-->| CacheFS |
+---------+ | +----------+ | | /dev/hda5 |
| | | | +--------------+
+---------+ +-------------->| | |
| | +-------+ | |--+
| AFS |----->| | | FS-Cache |
| | | netfs |-->| |--+
+---------+ +-->| lib | | | |
| | | | | | +--------------+
+---------+ | +-------+ +----------+ | | |
| | | +-->| CacheFiles |
| 9P |--+ | /var/cache |
| | +--------------+
+---------+
Or to look at it another way, FS-Cache is a module that provides a caching
@@ -84,101 +84,62 @@ then serving the pages out of that cache rather than the netfs inode because:
one-off access of a small portion of it (such as might be done with the
"file" program).
It instead serves the cache out in PAGE_SIZE chunks as and when requested by
the netfs('s) using it.
It instead serves the cache out in chunks as and when requested by the netfs
using it.
FS-Cache provides the following facilities:
(1) More than one cache can be used at once. Caches can be selected
* More than one cache can be used at once. Caches can be selected
explicitly by use of tags.
(2) Caches can be added / removed at any time.
* Caches can be added / removed at any time, even whilst being accessed.
(3) The netfs is provided with an interface that allows either party to
* The netfs is provided with an interface that allows either party to
withdraw caching facilities from a file (required for adding and removing
caches at any time).
(4) The interface to the netfs returns as few errors as possible, preferring
* The interface to the netfs returns as few errors as possible, preferring
rather to let the netfs remain oblivious.
(5) Cookies are used to represent indices, files and other objects to the
netfs. The simplest cookie is just a NULL pointer - indicating nothing
cached there.
* There are three types of cookie: cache, volume and data file cookies.
Cache cookies represent the cache as a whole and are not normally visible
to the netfs; the netfs gets a volume cookie to represent a collection of
files (typically something that a netfs would get for a superblock); and
data file cookies are used to cache data (something that would be got for
an inode).
(6) The netfs is allowed to propose - dynamically - any index hierarchy it
desires, though it must be aware that the index search function is
recursive, stack space is limited, and indices can only be children of
indices.
* Volumes are matched using a key. This is a printable string that is used
to encode all the information that might be needed to distinguish one
superblock, say, from another. This would be a compound of things like
cell name or server address, volume name or share path. It must be a
valid pathname.
(7) Data I/O is done direct to and from the netfs's pages. The netfs
indicates that page A is at index B of the data-file represented by cookie
C, and that it should be read or written. The cache backend may or may
not start I/O on that page, but if it does, a netfs callback will be
invoked to indicate completion. The I/O may be either synchronous or
asynchronous.
* Cookies are matched using a key. This is a binary blob and is used to
represent the object within a volume (so the volume key need not form
part of the blob). This might include things like an inode number and
uniquifier or a file handle.
(8) Cookies can be "retired" upon release. At this point FS-Cache will mark
them as obsolete and the index hierarchy rooted at that point will get
recycled.
* Cookie resources are set up and pinned by marking the cookie in-use.
This prevents the backing resources from being culled. Timed garbage
collection is employed to eliminate cookies that haven't been used for a
short while, thereby reducing resource overload. This is intended to be
used when a file is opened or closed.
(9) The netfs provides a "match" function for index searches. In addition to
saying whether a match was made or not, this can also specify that an
entry should be updated or deleted.
A cookie can be marked in-use multiple times simultaneously; each such
mark must later be removed by a matching unuse.
(10) As much as possible is done asynchronously.
* Begin/end access functions are provided to delay cache withdrawal for the
duration of an operation and prevent structs from being freed whilst
we're looking at them.
* Data I/O is done by asynchronous DIO to/from a buffer described by the
netfs using an iov_iter.
FS-Cache maintains a virtual indexing tree in which all indices, files, objects
and pages are kept. Bits of this tree may actually reside in one or more
caches::
* An invalidation facility is available to discard data from the cache and
to deal with I/O that's in progress that is accessing old data.
FSDEF
|
+------------------------------------+
| |
NFS AFS
| |
+--------------------------+ +-----------+
| | | |
homedir mirror afs.org redhat.com
| | |
+------------+ +---------------+ +----------+
| | | | | |
00001 00002 00007 00125 vol00001 vol00002
| | | | |
+---+---+ +-----+ +---+ +------+------+ +-----+----+
| | | | | | | | | | | | |
PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
| |
PG0 +-------+
| |
00001 00003
|
+---+---+
| | |
PG0 PG1 PG2
In the example above, you can see two netfs's being backed: NFS and AFS. These
have different index hierarchies:
* The NFS primary index contains per-server indices. Each server index is
indexed by NFS file handles to get data file objects. Each data file
object can have an array of pages, but may also have further child
objects, such as extended attributes and directory entries. Extended
attribute objects themselves have page-array contents.
* The AFS primary index contains per-cell indices. Each cell index contains
per-logical-volume indices. Each volume index contains up to three
indices for the read-write, read-only and backup mirrors of those volumes.
Each of these contains vnode data file objects, each of which contains an
array of pages.
The very top index is the FS-Cache master index in which individual netfs's
have entries.
Any index object may reside in more than one cache, provided it only has index
children. Any index with non-index object children will be assumed to only
reside in one cache.
* Cookies can be "retired" upon release, thereby causing the object to be
removed from the cache.
The netfs API to FS-Cache can be found in:
@@ -189,11 +150,6 @@ The cache backend API to FS-Cache can be found in:
Documentation/filesystems/caching/backend-api.rst
A description of the internal representations and object state machine can be
found in:
Documentation/filesystems/caching/object.rst
Statistical Information
=======================
@@ -201,333 +157,162 @@ Statistical Information
If FS-Cache is compiled with the following options enabled::
CONFIG_FSCACHE_STATS=y
CONFIG_FSCACHE_HISTOGRAM=y
then it will gather certain statistics and display them through a number of
proc files.
then it will gather certain statistics and display them through:
/proc/fs/fscache/stats
----------------------
/proc/fs/fscache/stats
This shows counts of a number of events that can happen in FS-Cache:
This shows counts of a number of events that can happen in FS-Cache:
+--------------+-------+-------------------------------------------------------+
|CLASS |EVENT |MEANING |
+==============+=======+=======================================================+
|Cookies |idx=N |Number of index cookies allocated |
|Cookies |n=N |Number of data storage cookies allocated |
+ +-------+-------------------------------------------------------+
| |dat=N |Number of data storage cookies allocated |
| |v=N |Number of volume index cookies allocated |
+ +-------+-------------------------------------------------------+
| |spc=N |Number of special cookies allocated |
+--------------+-------+-------------------------------------------------------+
|Objects |alc=N |Number of objects allocated |
| |vcol=N |Number of volume index key collisions |
+ +-------+-------------------------------------------------------+
| |nal=N |Number of object allocation failures |
+ +-------+-------------------------------------------------------+
| |avl=N |Number of objects that reached the available state |
+ +-------+-------------------------------------------------------+
| |ded=N |Number of objects that reached the dead state |
+--------------+-------+-------------------------------------------------------+
|ChkAux |non=N |Number of objects that didn't have a coherency check |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of objects that passed a coherency check |
+ +-------+-------------------------------------------------------+
| |upd=N |Number of objects that needed a coherency data update |
+ +-------+-------------------------------------------------------+
| |obs=N |Number of objects that were declared obsolete |
+--------------+-------+-------------------------------------------------------+
|Pages |mrk=N |Number of pages marked as being cached |
| |unc=N |Number of uncache page requests seen |
| |voom=N |Number of OOM events when allocating volume cookies |
+--------------+-------+-------------------------------------------------------+
|Acquire |n=N |Number of acquire cookie requests seen |
+ +-------+-------------------------------------------------------+
| |nul=N |Number of acq reqs given a NULL parent |
+ +-------+-------------------------------------------------------+
| |noc=N |Number of acq reqs rejected due to no cache available |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of acq reqs succeeded |
+ +-------+-------------------------------------------------------+
| |nbf=N |Number of acq reqs rejected due to error |
+ +-------+-------------------------------------------------------+
| |oom=N |Number of acq reqs failed on ENOMEM |
+--------------+-------+-------------------------------------------------------+
|Lookups |n=N |Number of lookup calls made on cache backends |
|LRU |n=N |Number of cookies currently on the LRU |
+ +-------+-------------------------------------------------------+
| |neg=N |Number of negative lookups made |
| |exp=N |Number of cookies expired off of the LRU |
+ +-------+-------------------------------------------------------+
| |pos=N |Number of positive lookups made |
| |rmv=N |Number of cookies removed from the LRU |
+ +-------+-------------------------------------------------------+
| |crt=N |Number of objects created by lookup |
| |drp=N |Number of LRU'd cookies relinquished/withdrawn |
+ +-------+-------------------------------------------------------+
| |tmo=N |Number of lookups timed out and requeued |
| |at=N |Time till next LRU cull (jiffies) |
+--------------+-------+-------------------------------------------------------+
|Invals |n=N |Number of invalidations |
+--------------+-------+-------------------------------------------------------+
|Updates |n=N |Number of update cookie requests seen |
+ +-------+-------------------------------------------------------+
| |nul=N |Number of upd reqs given a NULL parent |
| |rsz=N |Number of resize requests |
+ +-------+-------------------------------------------------------+
| |run=N |Number of upd reqs granted CPU time |
| |rsn=N |Number of skipped resize requests |
+--------------+-------+-------------------------------------------------------+
|Relinqs |n=N |Number of relinquish cookie requests seen |
+ +-------+-------------------------------------------------------+
| |nul=N |Number of rlq reqs given a NULL parent |
| |rtr=N |Number of rlq reqs with retire=true |
+ +-------+-------------------------------------------------------+
| |wcr=N |Number of rlq reqs waited on completion of creation |
| |drop=N |Number of cookies no longer blocking re-acquisition |
+--------------+-------+-------------------------------------------------------+
|AttrChg |n=N |Number of attribute changed requests seen |
|NoSpace |nwr=N |Number of write requests refused due to lack of space |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of attr changed requests queued |
| |ncr=N |Number of create requests refused due to lack of space |
+ +-------+-------------------------------------------------------+
| |nbf=N |Number of attr changed rejected -ENOBUFS |
+ +-------+-------------------------------------------------------+
| |oom=N |Number of attr changed failed -ENOMEM |
+ +-------+-------------------------------------------------------+
| |run=N |Number of attr changed ops given CPU time |
| |cull=N |Number of objects culled to make space |
+--------------+-------+-------------------------------------------------------+
|Allocs |n=N |Number of allocation requests seen |
|IO |rd=N |Number of read operations in the cache |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of successful alloc reqs |
+ +-------+-------------------------------------------------------+
| |wt=N |Number of alloc reqs that waited on lookup completion |
+ +-------+-------------------------------------------------------+
| |nbf=N |Number of alloc reqs rejected -ENOBUFS |
+ +-------+-------------------------------------------------------+
| |int=N |Number of alloc reqs aborted -ERESTARTSYS |
+ +-------+-------------------------------------------------------+
| |ops=N |Number of alloc reqs submitted |
+ +-------+-------------------------------------------------------+
| |owt=N |Number of alloc reqs waited for CPU time |
+ +-------+-------------------------------------------------------+
| |abt=N |Number of alloc reqs aborted due to object death |
+--------------+-------+-------------------------------------------------------+
|Retrvls |n=N |Number of retrieval (read) requests seen |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of successful retr reqs |
+ +-------+-------------------------------------------------------+
| |wt=N |Number of retr reqs that waited on lookup completion |
+ +-------+-------------------------------------------------------+
| |nod=N |Number of retr reqs returned -ENODATA |
+ +-------+-------------------------------------------------------+
| |nbf=N |Number of retr reqs rejected -ENOBUFS |
+ +-------+-------------------------------------------------------+
| |int=N |Number of retr reqs aborted -ERESTARTSYS |
+ +-------+-------------------------------------------------------+
| |oom=N |Number of retr reqs failed -ENOMEM |
+ +-------+-------------------------------------------------------+
| |ops=N |Number of retr reqs submitted |
+ +-------+-------------------------------------------------------+
| |owt=N |Number of retr reqs waited for CPU time |
+ +-------+-------------------------------------------------------+
| |abt=N |Number of retr reqs aborted due to object death |
+--------------+-------+-------------------------------------------------------+
|Stores |n=N |Number of storage (write) requests seen |
+ +-------+-------------------------------------------------------+
| |ok=N |Number of successful store reqs |
+ +-------+-------------------------------------------------------+
| |agn=N |Number of store reqs on a page already pending storage |
+ +-------+-------------------------------------------------------+
| |nbf=N |Number of store reqs rejected -ENOBUFS |
+ +-------+-------------------------------------------------------+
| |oom=N |Number of store reqs failed -ENOMEM |
+ +-------+-------------------------------------------------------+
| |ops=N |Number of store reqs submitted |
+ +-------+-------------------------------------------------------+
| |run=N |Number of store reqs granted CPU time |
+ +-------+-------------------------------------------------------+
| |pgs=N |Number of pages given store req processing time |
+ +-------+-------------------------------------------------------+
| |rxd=N |Number of store reqs deleted from tracking tree |
+ +-------+-------------------------------------------------------+
| |olm=N |Number of store reqs over store limit |
+--------------+-------+-------------------------------------------------------+
|VmScan |nos=N |Number of release reqs against pages with no |
| | |pending store |
+ +-------+-------------------------------------------------------+
| |gon=N |Number of release reqs against pages stored by |
| | |time lock granted |
+ +-------+-------------------------------------------------------+
| |bsy=N |Number of release reqs ignored due to in-progress store|
+ +-------+-------------------------------------------------------+
| |can=N |Number of page stores cancelled due to release req |
+--------------+-------+-------------------------------------------------------+
|Ops |pend=N |Number of times async ops added to pending queues |
+ +-------+-------------------------------------------------------+
| |run=N |Number of times async ops given CPU time |
+ +-------+-------------------------------------------------------+
| |enq=N |Number of times async ops queued for processing |
+ +-------+-------------------------------------------------------+
| |can=N |Number of async ops cancelled |
+ +-------+-------------------------------------------------------+
| |rej=N |Number of async ops rejected due to object |
| | |lookup/create failure |
+ +-------+-------------------------------------------------------+
| |ini=N |Number of async ops initialised |
+ +-------+-------------------------------------------------------+
| |dfr=N |Number of async ops queued for deferred release |
+ +-------+-------------------------------------------------------+
| |rel=N |Number of async ops released |
| | |(should equal ini=N when idle) |
+ +-------+-------------------------------------------------------+
| |gc=N |Number of deferred-release async ops garbage collected |
+--------------+-------+-------------------------------------------------------+
|CacheOp |alo=N |Number of in-progress alloc_object() cache ops |
+ +-------+-------------------------------------------------------+
| |luo=N |Number of in-progress lookup_object() cache ops |
+ +-------+-------------------------------------------------------+
| |luc=N |Number of in-progress lookup_complete() cache ops |
+ +-------+-------------------------------------------------------+
| |gro=N |Number of in-progress grab_object() cache ops |
+ +-------+-------------------------------------------------------+
| |upo=N |Number of in-progress update_object() cache ops |
+ +-------+-------------------------------------------------------+
| |dro=N |Number of in-progress drop_object() cache ops |
+ +-------+-------------------------------------------------------+
| |pto=N |Number of in-progress put_object() cache ops |
+ +-------+-------------------------------------------------------+
| |syn=N |Number of in-progress sync_cache() cache ops |
+ +-------+-------------------------------------------------------+
| |atc=N |Number of in-progress attr_changed() cache ops |
+ +-------+-------------------------------------------------------+
| |rap=N |Number of in-progress read_or_alloc_page() cache ops |
+ +-------+-------------------------------------------------------+
| |ras=N |Number of in-progress read_or_alloc_pages() cache ops |
+ +-------+-------------------------------------------------------+
| |alp=N |Number of in-progress allocate_page() cache ops |
+ +-------+-------------------------------------------------------+
| |als=N |Number of in-progress allocate_pages() cache ops |
+ +-------+-------------------------------------------------------+
| |wrp=N |Number of in-progress write_page() cache ops |
+ +-------+-------------------------------------------------------+
| |ucp=N |Number of in-progress uncache_page() cache ops |
+ +-------+-------------------------------------------------------+
| |dsp=N |Number of in-progress dissociate_pages() cache ops |
+--------------+-------+-------------------------------------------------------+
|CacheEv |nsp=N |Number of object lookups/creations rejected due to |
| | |lack of space |
+ +-------+-------------------------------------------------------+
| |stl=N |Number of stale objects deleted |
+ +-------+-------------------------------------------------------+
| |rtr=N |Number of objects retired when relinquished |
+ +-------+-------------------------------------------------------+
| |cul=N |Number of objects culled |
| |wr=N |Number of write operations in the cache |
+--------------+-------+-------------------------------------------------------+
Netfslib will also add some stats counters of its own.
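As a rough userspace illustration, the counters could be collected into a
dictionary as sketched below. The sketch assumes that each line of the file
has the form "CLASS: EVENT=N EVENT=N ...", which is suggested by the table
above but not stated by it. The sample text and its numbers are invented for
the example, and parse_fscache_stats() is a hypothetical helper name.

```python
import re


def parse_fscache_stats(text):
    """Parse 'CLASS: EVENT=N ...' lines into {class: {event: int}}."""
    stats = {}
    for line in text.splitlines():
        # Assumed layout: class name, a colon, then EVENT=N pairs.
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # e.g. a banner line that carries no counters
        pairs = re.findall(r"(\w+)=(-?\d+)", rest)
        if pairs:
            counters = stats.setdefault(name.strip(), {})
            counters.update({event: int(value) for event, value in pairs})
    return stats


# Invented sample; real values come from /proc/fs/fscache/stats.
sample = """\
FS-Cache statistics
Cookies: n=12 v=1 vcol=0 voom=0
LRU    : n=3 exp=7 rmv=2 drp=0 at=1500
"""

parsed = parse_fscache_stats(sample)
print(parsed["Cookies"]["n"], parsed["LRU"]["at"])
```

On a system with CONFIG_FSCACHE_STATS=y, the same function could be fed the
contents of /proc/fs/fscache/stats instead of the sample string.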
/proc/fs/fscache/histogram
--------------------------
Cache List
==========
::
FS-Cache provides a list of cache cookies:
cat /proc/fs/fscache/histogram
JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS
===== ===== ========= ========= ========= ========= =========
This shows the breakdown of the number of times each amount of time
between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The
columns are as follows:
========= =======================================================
COLUMN TIME MEASUREMENT
========= =======================================================
OBJ INST Length of time to instantiate an object
OP RUNS Length of time a call to process an operation took
OBJ RUNS Length of time a call to process an object event took
RETRV DLY Time between requesting a read and lookup completing
RETRIEVLS Time between beginning and end of a retrieval
========= =======================================================
Each row shows the number of events that took a particular range of times.
Each step is 1 jiffy in size. The JIFS column indicates the particular
jiffy range covered, and the SECS field the equivalent number of seconds.
Object List
===========
If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
list of all the objects currently allocated and allow them to be viewed
through::
/proc/fs/fscache/objects
/proc/fs/fscache/cookies
This will look something like::
[root@andromeda ~]# head /proc/fs/fscache/objects
OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
# cat /proc/fs/fscache/caches
CACHE REF VOLS OBJS ACCES S NAME
======== ===== ===== ===== ===== = ===============
00000001 2 1 2123 1 A default
where the first set of columns before the '|' describe the object:
where the columns are:
======= ===============================================================
COLUMN DESCRIPTION
======= ===============================================================
OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
PARENT Debugging ID of parent object
STAT Object state
CHLDN Number of child objects of this object
OPS Number of outstanding operations on this object
OOP Number of outstanding child object management operations
IPR
EX Number of outstanding exclusive operations
READS Number of outstanding read operations
EM Object's event mask
EV Events raised on this object
F Object flags
S Object work item busy state mask (1:pending 2:running)
CACHE Cache cookie debug ID (also appears in traces)
REF Number of references on the cache cookie
VOLS Number of volume cookies in this cache
OBJS Number of cache objects in use
ACCES Number of accesses pinning the cache
S State
NAME Name of the cache.
======= ===============================================================
and the second set of columns describe the object's cookie, if present:
The state can be (-) Inactive, (P)reparing, (A)ctive, (E)rror or (W)ithdrawing.
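The cache list can be read from userspace along the following lines. This is
only a sketch: it assumes whitespace-separated columns in the order shown
above and that the CACHE debug ID is printed in hexadecimal. parse_caches()
is a hypothetical helper, and the sample is the listing shown above.

```python
# Single-letter state codes, as listed in the text above.
CACHE_STATES = {
    "-": "Inactive",
    "P": "Preparing",
    "A": "Active",
    "E": "Error",
    "W": "Withdrawing",
}


def parse_caches(text):
    """Return one dict per data row of a /proc/fs/fscache/caches listing."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 6)  # NAME is last, so keep it in one piece
        # Skip short lines, the column header and the '=====' ruler.
        if len(fields) != 7 or fields[0] == "CACHE" or fields[0].startswith("="):
            continue
        cache_id, ref, vols, objs, acces, state, name = fields
        rows.append({
            "cache": int(cache_id, 16),  # assumption: debug ID is hexadecimal
            "ref": int(ref),
            "vols": int(vols),
            "objs": int(objs),
            "acces": int(acces),
            "state": CACHE_STATES.get(state, state),
            "name": name.strip(),
        })
    return rows


sample = """\
CACHE    REF   VOLS  OBJS  ACCES S NAME
======== ===== ===== ===== ===== = ===============
00000001     2     1  2123     1 A default
"""

rows = parse_caches(sample)
for row in rows:
    print(row["name"], row["state"], row["objs"])
```

Unknown state letters are passed through unchanged rather than rejected, so
the sketch keeps working if further states are added.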
================ ======================================================
COLUMN DESCRIPTION
================ ======================================================
NETFS_COOKIE_DEF Name of netfs cookie definition
TY Cookie type (IX - index, DT - data, hex - special)
FL Cookie flags
NETFS_DATA Netfs private data stored in the cookie
OBJECT_KEY Object key } 1 column, with separating comma
AUX_DATA Object aux data } presence may be configured
================ ======================================================
The data shown may be filtered by attaching a key to an appropriate keyring
before viewing the file. Something like::
Volume List
===========
keyctl add user fscache:objlist <restrictions> @s
FS-Cache provides a list of volume cookies:
where <restrictions> are a selection of the following letters:
/proc/fs/fscache/volumes
== =========================================================
K Show hexdump of object key (don't show if not given)
A Show hexdump of object aux data (don't show if not given)
== =========================================================
This will look something like::
and the following paired letters:
VOLUME REF nCOOK ACC FL CACHE KEY
======== ===== ===== === == =============== ================
00000001 55 54 1 00 default afs,example.com,100058
== =========================================================
C Show objects that have a cookie
c Show objects that don't have a cookie
B Show objects that are busy
b Show objects that aren't busy
W Show objects that have pending writes
w Show objects that don't have pending writes
R Show objects that have outstanding reads
r Show objects that don't have outstanding reads
S Show objects that have work queued
s Show objects that don't have work queued
== =========================================================
where the columns are:
If neither side of a letter pair is given, then both are implied. For example:
======= ===============================================================
COLUMN DESCRIPTION
======= ===============================================================
VOLUME The volume cookie debug ID (also appears in traces)
REF Number of references on the volume cookie
nCOOK Number of cookies in the volume
ACC Number of accesses pinning the cache
FL Flags on the volume cookie
CACHE Name of the cache or "-"
KEY The indexing key for the volume
======= ===============================================================
keyctl add user fscache:objlist KB @s
shows objects that are busy, and lists their object keys, but does not dump
their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
not implied.
Cookie List
===========
By default all objects and all fields will be shown.
FS-Cache provides a list of cookies:
/proc/fs/fscache/cookies
This will look something like::
# head /proc/fs/fscache/cookies
COOKIE VOLUME REF ACT ACC S FL DEF
======== ======== === === === = == ================
00000435 00000001 1 0 -1 - 08 0000000201d080070000000000000000, 0000000000000000
00000436 00000001 1 0 -1 - 00 0000005601d080080000000000000000, 0000000000000051
00000437 00000001 1 0 -1 - 08 00023b3001d0823f0000000000000000, 0000000000000000
00000438 00000001 1 0 -1 - 08 0000005801d0807b0000000000000000, 0000000000000000
00000439 00000001 1 0 -1 - 08 00023b3201d080a10000000000000000, 0000000000000000
0000043a 00000001 1 0 -1 - 08 00023b3401d080a30000000000000000, 0000000000000000
0000043b 00000001 1 0 -1 - 08 00023b3601d080b30000000000000000, 0000000000000000
0000043c 00000001 1 0 -1 - 08 00023b3801d080b40000000000000000, 0000000000000000
where the columns are:
======= ===============================================================
COLUMN DESCRIPTION
======= ===============================================================
COOKIE The cookie debug ID (also appears in traces)
VOLUME The parent volume cookie debug ID
REF Number of references on the cookie
ACT Number of times the cookie is marked in use
ACC Number of access pins in the cookie
S State of the cookie
FL Flags on the cookie
DEF Key, auxiliary data
======= ===============================================================
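A single row of the cookie list could be taken apart as sketched below. The
sketch assumes that the debug IDs and the FL column are hexadecimal and that
the DEF column holds the key and the auxiliary data as two hex strings
separated by a comma, as the listing above suggests. parse_cookie_line() is
a hypothetical helper name.

```python
def parse_cookie_line(line):
    """Split one data row of /proc/fs/fscache/cookies into named fields."""
    cookie, volume, ref, act, acc, state, flags, definition = line.split(None, 7)
    # DEF is "key, aux": two hex strings separated by a comma.
    key_hex, _, aux_hex = definition.partition(",")
    return {
        "cookie": int(cookie, 16),   # assumption: debug IDs are hexadecimal
        "volume": int(volume, 16),
        "ref": int(ref),
        "act": int(act),
        "acc": int(acc),             # may be negative, as in the listing above
        "state": state,
        "flags": int(flags, 16),     # assumption: FL is hexadecimal
        "key": bytes.fromhex(key_hex.strip()),
        "aux": bytes.fromhex(aux_hex.strip()),
    }


# Row copied from the example listing above.
row = parse_cookie_line(
    "00000436 00000001 1 0 -1 - 00 0000005601d080080000000000000000, 0000000000000051"
)
print(hex(row["cookie"]), len(row["key"]), row["aux"].hex())
```

The key and auxiliary data are kept as raw bytes because their layout is
defined by the network filesystem, not by FS-Cache.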
Debugging
@@ -549,10 +334,8 @@ This is a bitmask of debugging streams to enable:
3 8 Cookie management Function entry trace
4 16 Function exit trace
5 32 General
6 64 Page handling Function entry trace
7 128 Function exit trace
8 256 General
9 512 Operation management Function entry trace
6-8 (Not used)
9 512 I/O operation management Function entry trace
10 1024 Function exit trace
11 2048 General
======= ======= =============================== =======================
@@ -560,6 +343,6 @@ This is a bitmask of debugging streams to enable:
The appropriate set of values should be OR'd together and the result written to
the control file. For example::
echo $((1|8|64)) >/sys/module/fscache/parameters/debug
echo $((1|8|512)) >/sys/module/fscache/parameters/debug
will turn on all function entry debugging.
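The same value can be worked out from bit numbers rather than from the
decimal values. The following is a minimal sketch: bits 3 and 9 are taken
from the table above, bit 0 (value 1) is inferred from the echo example, and
debug_mask() is a hypothetical helper name.

```python
# Bit numbers of the function-entry-trace streams. Bits 3 and 9 come from
# the table above; bit 0 (value 1) is inferred from the echo example.
ENTRY_TRACE_BITS = (0, 3, 9)


def debug_mask(bits):
    """OR together 1 << bit for each requested debugging stream."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


mask = debug_mask(ENTRY_TRACE_BITS)
print(mask)  # 521, the same value as $((1|8|512))
```

The resulting number would then be written to
/sys/module/fscache/parameters/debug in the same way as the echo command
above.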

@@ -7,8 +7,6 @@ Filesystem Caching
:maxdepth: 2
fscache
object
netfs-api
backend-api
cachefiles
netfs-api
operations

File diff suppressed because it is too large

@@ -1,313 +0,0 @@
.. SPDX-License-Identifier: GPL-2.0
====================================================
In-Kernel Cache Object Representation and Management
====================================================
By: David Howells <dhowells@redhat.com>
.. Contents:
(*) Representation
(*) Object management state machine.
- Provision of cpu time.
- Locking simplification.
(*) The set of states.
(*) The set of events.
Representation
==============
FS-Cache maintains an in-kernel representation of each object that a netfs is
currently interested in. Such objects are represented by the fscache_cookie
struct and are referred to as cookies.
FS-Cache also maintains a separate in-kernel representation of the objects that
a cache backend is currently actively caching. Such objects are represented by
the fscache_object struct. The cache backends allocate these upon request, and
are expected to embed them in their own representations. These are referred to
as objects.
There is a 1:N relationship between cookies and objects. A cookie may be
represented by multiple objects - an index may exist in more than one cache -
or even by no objects (it may not be cached).
Furthermore, both cookies and objects are hierarchical. The two hierarchies
correspond, but the cookies tree is a superset of the union of the object trees
of multiple caches::
NETFS INDEX TREE : CACHE 1 : CACHE 2
: :
: +-----------+ :
+----------->| IObject | :
+-----------+ | : +-----------+ :
| ICookie |-------+ : | :
+-----------+ | : | : +-----------+
| +------------------------------>| IObject |
| : | : +-----------+
| : V : |
| : +-----------+ : |
V +----------->| IObject | : |
+-----------+ | : +-----------+ : |
| ICookie |-------+ : | : V
+-----------+ | : | : +-----------+
| +------------------------------>| IObject |
+-----+-----+ : | : +-----------+
| | : | : |
V | : V : |
+-----------+ | : +-----------+ : |
| ICookie |------------------------->| IObject | : |
+-----------+ | : +-----------+ : |
| V : | : V
| +-----------+ : | : +-----------+
| | ICookie |-------------------------------->| IObject |
| +-----------+ : | : +-----------+
V | : V : |
+-----------+ | : +-----------+ : |
| DCookie |------------------------->| DObject | : |
+-----------+ | : +-----------+ : |
| : : |
+-------+-------+ : : |
| | : : |
V V : : V
+-----------+ +-----------+ : : +-----------+
| DCookie | | DCookie |------------------------>| DObject |
+-----------+ +-----------+ : : +-----------+
: :
In the above illustration, ICookie and IObject represent indices and DCookie
and DObject represent data storage objects. Indices may have representation in
multiple caches, but currently, non-index objects may not. Objects of any type
may also be entirely unrepresented.
As far as the netfs API goes, the netfs is only actually permitted to see
pointers to the cookies. The cookies themselves and any objects attached to
those cookies are hidden from it.
Object Management State Machine
===============================
Within FS-Cache, each active object is managed by its own individual state
machine. The state for an object is kept in the fscache_object struct, in
object->state. A cookie may point to a set of objects that are in different
states.
Each state has an action associated with it that is invoked when the machine
wakes up in that state. There are four logical sets of states:
(1) Preparation: states that wait for the parent objects to become ready. The
representations are hierarchical, and it is expected that an object must
be created or accessed with respect to its parent object.
(2) Initialisation: states that perform lookups in the cache and validate
what's found and that create on disk any missing metadata.
(3) Normal running: states that allow netfs operations on objects to proceed
and that update the state of objects.
(4) Termination: states that detach objects from their netfs cookies, that
delete objects from disk, that handle disk and system errors and that free
up in-memory resources.
In most cases, transitioning between states is in response to signalled events.
When a state has finished processing, it will usually set the mask of events in
which it is interested (object->event_mask) and relinquish the worker thread.
Then when an event is raised (by calling fscache_raise_event()), if the event
is not masked, the object will be queued for processing (by calling
fscache_enqueue_object()).
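The interaction between the event mask and queuing can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel implementation: the ``model_`` names, the event numbering and the ``queued`` flag are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical event numbers, for illustration only. */
enum model_event { MODEL_EV_UPDATE, MODEL_EV_CLEARED, MODEL_EV_ERROR };

struct model_object {
	unsigned long event_mask;	/* events the current state is interested in */
	unsigned long events;		/* events raised and not yet processed */
	bool queued;			/* stands in for being handed to a worker */
};

/*
 * Record the event.  The object is queued for processing only if the
 * current state listed the event in its mask; returns true in that case.
 */
bool model_raise_event(struct model_object *obj, enum model_event ev)
{
	unsigned long bit = 1UL << ev;

	obj->events |= bit;
	if (!(obj->event_mask & bit))
		return false;
	obj->queued = true;
	return true;
}
```

An event raised while it is outside the mask stays recorded in ``events``, so in this model a later state could still notice it once it widens its mask.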
Provision of CPU Time
---------------------
The work to be done by the various states was given CPU time by the threads of
the slow work facility. This was used in preference to the workqueue facility
because:
(1) Threads may be completely occupied for very long periods of time by a
particular work item. These state actions may be doing sequences of
synchronous, journalled disk accesses (lookup, mkdir, create, setxattr,
getxattr, truncate, unlink, rmdir, rename).
(2) Threads may do little actual work, but may rather spend a lot of time
sleeping on I/O. This means that single-threaded and 1-per-CPU-threaded
workqueues don't necessarily have the right numbers of threads.
Locking Simplification
----------------------
Because only one worker thread may be operating on any particular object's
state machine at once, this simplifies the locking, particularly with respect
to disconnecting the netfs's representation of a cache object (fscache_cookie)
from the cache backend's representation (fscache_object) - which may be
requested from either end.
The Set of States
=================
The object state machine has a set of states that it can be in. There are
preparation states in which the object sets itself up and waits for its parent
object to transit to a state that allows access to its children:
(1) State FSCACHE_OBJECT_INIT.
Initialise the object and wait for the parent object to become active. In
the cache, it is expected that it will not be possible to look an object
up from the parent object, until that parent object itself has been looked
up.
There are initialisation states in which the object sets itself up and accesses
disk for the object metadata:
(2) State FSCACHE_OBJECT_LOOKING_UP.
Look up the object on disk, using the parent as a starting point.
FS-Cache expects the cache backend to probe the cache to see whether this
object is represented there, and if it is, to see if it's valid (coherency
management).
The cache should call fscache_object_lookup_negative() to indicate lookup
failure for whatever reason, and should call fscache_obtained_object() to
indicate success.
At the completion of lookup, FS-Cache will let the netfs go ahead with
read operations, no matter whether the file is yet cached. If not yet
cached, read operations will be immediately rejected with ENODATA until
the first known page is uncached - as up to that point there can be no data
to be read out of the cache for that file that isn't currently also held
in the pagecache.
(3) State FSCACHE_OBJECT_CREATING.
Create an object on disk, using the parent as a starting point. This
happens if the lookup failed to find the object, or if the object's
coherency data indicated what's on disk is out of date. In this state,
FS-Cache expects the cache to create the object on disk.
The cache should call fscache_obtained_object() if creation completes
successfully, fscache_object_lookup_negative() otherwise.
At the completion of creation, FS-Cache will start processing write
operations the netfs has queued for an object. If creation failed, the
write ops will be transparently discarded, and nothing recorded in the
cache.
There are some normal running states in which the object spends its time
servicing netfs requests:
(4) State FSCACHE_OBJECT_AVAILABLE.
A transient state in which pending operations are started, child objects
are permitted to advance from FSCACHE_OBJECT_INIT state, and temporary
lookup data is freed.
(5) State FSCACHE_OBJECT_ACTIVE.
The normal running state. In this state, requests the netfs makes will be
passed on to the cache.
(6) State FSCACHE_OBJECT_INVALIDATING.
The object is undergoing invalidation. When the state comes here, it
discards all pending read, write and attribute change operations as it is
going to clear out the cache entirely and reinitialise it. It will then
continue to the FSCACHE_OBJECT_UPDATING state.
(7) State FSCACHE_OBJECT_UPDATING.
The state machine comes here to update the object in the cache from the
netfs's records. This involves updating the auxiliary data that is used
to maintain coherency.
And there are terminal states in which an object cleans itself up, deallocates
memory and potentially deletes stuff from disk:
(8) State FSCACHE_OBJECT_LC_DYING.
The object comes here if it is dying because of a lookup or creation
error. This would be due to a disk error or system error of some sort.
Temporary data is cleaned up, and the parent is released.
(9) State FSCACHE_OBJECT_DYING.
The object comes here if it is dying due to an error, because its parent
cookie has been relinquished by the netfs or because the cache is being
withdrawn.
Any child objects waiting on this one are given CPU time so that they too
can destroy themselves. This object waits for all its children to go away
before advancing to the next state.
(10) State FSCACHE_OBJECT_ABORT_INIT.
The object comes to this state if it was waiting on its parent in
FSCACHE_OBJECT_INIT, but its parent died. The object will destroy itself
so that the parent may proceed from the FSCACHE_OBJECT_DYING state.
(11) State FSCACHE_OBJECT_RELEASING.
(12) State FSCACHE_OBJECT_RECYCLING.
The object comes to one of these two states when dying once it is rid of
all its children, if it is dying because the netfs relinquished its
cookie. In the first state, the cached data is expected to persist, and
in the second it will be deleted.
(13) State FSCACHE_OBJECT_WITHDRAWING.
The object transits to this state if the cache decides it wants to
withdraw the object from service, perhaps to make space, but also due to
error or just because the whole cache is being withdrawn.
(14) State FSCACHE_OBJECT_DEAD.
The object transits to this state when the in-memory object record is
ready to be deleted. The object processor shouldn't ever see an object in
this state.
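The grouping of the fourteen states into the four logical sets can be summarised in a small classification function. This is only a sketch: the state names are those listed above, but the enum ordering, the ``state_set`` type and ``classify_state()`` are hypothetical and do not reflect the kernel's own definitions.

```c
/* State names as listed above; the numeric ordering here is arbitrary. */
enum model_state {
	FSCACHE_OBJECT_INIT,
	FSCACHE_OBJECT_LOOKING_UP,
	FSCACHE_OBJECT_CREATING,
	FSCACHE_OBJECT_AVAILABLE,
	FSCACHE_OBJECT_ACTIVE,
	FSCACHE_OBJECT_INVALIDATING,
	FSCACHE_OBJECT_UPDATING,
	FSCACHE_OBJECT_LC_DYING,
	FSCACHE_OBJECT_DYING,
	FSCACHE_OBJECT_ABORT_INIT,
	FSCACHE_OBJECT_RELEASING,
	FSCACHE_OBJECT_RECYCLING,
	FSCACHE_OBJECT_WITHDRAWING,
	FSCACHE_OBJECT_DEAD,
};

/* The four logical sets of states (hypothetical names). */
enum state_set {
	SET_PREPARATION,
	SET_INITIALISATION,
	SET_RUNNING,
	SET_TERMINATION,
};

/* Map a state to the logical set it is described under. */
enum state_set classify_state(enum model_state s)
{
	switch (s) {
	case FSCACHE_OBJECT_INIT:
		return SET_PREPARATION;
	case FSCACHE_OBJECT_LOOKING_UP:
	case FSCACHE_OBJECT_CREATING:
		return SET_INITIALISATION;
	case FSCACHE_OBJECT_AVAILABLE:
	case FSCACHE_OBJECT_ACTIVE:
	case FSCACHE_OBJECT_INVALIDATING:
	case FSCACHE_OBJECT_UPDATING:
		return SET_RUNNING;
	default:
		/* LC_DYING through DEAD are all termination states. */
		return SET_TERMINATION;
	}
}
```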
The Set of Events
=================
There are a number of events that can be raised to an object state machine:
FSCACHE_OBJECT_EV_UPDATE
The netfs requested that an object be updated. The state machine will ask
the cache backend to update the object, and the cache backend will ask the
netfs for details of the change through its cookie definition ops.
FSCACHE_OBJECT_EV_CLEARED
This is signalled in two circumstances:
(a) when an object's last child object is dropped and
(b) when the last operation outstanding on an object is completed.
This is used to proceed from the dying state.
FSCACHE_OBJECT_EV_ERROR
This is signalled when an I/O error occurs during the processing of some
object.
FSCACHE_OBJECT_EV_RELEASE, FSCACHE_OBJECT_EV_RETIRE
These are signalled when the netfs relinquishes a cookie it was using.
The event selected depends on whether the netfs asks for the backing
object to be retired (deleted) or retained.
FSCACHE_OBJECT_EV_WITHDRAW
This is signalled when the cache backend wants to withdraw an object.
This means that the object will have to be detached from the netfs's
cookie.
Because the withdrawing and releasing/retiring events are all handled by the
object state machine, it doesn't matter if there's a collision with both ends
trying to sever the connection at the same time. The state machine can just
pick which one it wants to honour, and honouring it severs the connection for
the other end as well.


@ -1,210 +0,0 @@
.. SPDX-License-Identifier: GPL-2.0
================================
Asynchronous Operations Handling
================================
By: David Howells <dhowells@redhat.com>
.. Contents:
(*) Overview.
(*) Operation record initialisation.
(*) Parameters.
(*) Procedure.
(*) Asynchronous callback.
Overview
========
FS-Cache has an asynchronous operations handling facility that it uses for its
data storage and retrieval routines. Its operations are represented by
fscache_operation structs, though these are usually embedded into some other
structure.
This facility is available to and expected to be used by the cache backends,
and FS-Cache will create operations and pass them off to the appropriate cache
backend for completion.
To make use of this facility, <linux/fscache-cache.h> should be #included.
Operation Record Initialisation
===============================
An operation is recorded in an fscache_operation struct::
struct fscache_operation {
union {
struct work_struct fast_work;
struct slow_work slow_work;
};
unsigned long flags;
fscache_operation_processor_t processor;
...
};
Someone wanting to issue an operation should allocate something with this
struct embedded in it. They should initialise it by calling::
void fscache_operation_init(struct fscache_operation *op,
fscache_operation_release_t release);
with the operation to be initialised and the release function to use.
The op->flags field should be set to indicate the CPU time provision and
the exclusivity (see the Parameters section).
The op->fast_work, op->slow_work and op->processor fields should be set as
appropriate for the CPU time provision (see the Parameters section).
FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the
operation and waited for afterwards.
Parameters
==========
There are a number of parameters that can be set in the operation record's flag
parameter. There are three options for the provision of CPU time in these
operations:
(1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread
may decide it wants to handle an operation itself without deferring it to
another thread.
This is, for example, used in read operations for calling readpages() on
the backing filesystem in CacheFiles. Although readpages() does an
asynchronous data fetch, the determination of whether pages exist is done
synchronously - and the netfs does not proceed until this has been
determined.
If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags
before submitting the operation, and the operating thread must wait for it
to be cleared before proceeding::
wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
TASK_UNINTERRUPTIBLE);
(2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it
will be given to keventd to process. Such an operation is not permitted
to sleep on I/O.
This is, for example, used by CacheFiles to copy data from a backing fs
page to a netfs page after the backing fs has read the page in.
If this option is used, op->fast_work and op->processor must be
initialised before submitting the operation::
INIT_WORK(&op->fast_work, do_some_work);
(3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it
will be given to the slow work facility to process. Such an operation is
permitted to sleep on I/O.
This is, for example, used by FS-Cache to handle background writes of
pages that have just been fetched from a remote server.
If this option is used, op->slow_work and op->processor must be
initialised before submitting the operation::
fscache_operation_init_slow(op, processor)
Furthermore, operations may be one of two types:
(1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in
conjunction with any other operation on the object being operated upon.
An example of this is the attribute change operation, in which the file
being written to may need truncation.
(2) Shareable. Operations of this type may be running simultaneously. It's
up to the operation implementation to prevent interference between other
operations running at the same time.
Procedure
=========
Operations are used through the following procedure:
(1) The submitting thread must allocate the operation and initialise it
itself. Normally this would be part of a more specific structure with the
generic op embedded within.
(2) The submitting thread must then submit the operation for processing using
one of the following two functions::
int fscache_submit_op(struct fscache_object *object,
struct fscache_operation *op);
int fscache_submit_exclusive_op(struct fscache_object *object,
struct fscache_operation *op);
The first function should be used to submit non-exclusive ops and the
second to submit exclusive ones. The caller must still set the
FSCACHE_OP_EXCLUSIVE flag.
If successful, both functions will assign the operation to the specified
object and return 0. -ENOBUFS will be returned if the object specified is
permanently unavailable.
The operation manager will defer operations on an object that is still
undergoing lookup or creation. The operation will also be deferred if an
operation of conflicting exclusivity is in progress on the object.
If the operation is asynchronous, the manager will retain a reference to
it, so the caller should put their reference to it by passing it to::
void fscache_put_operation(struct fscache_operation *op);
(3) If the submitting thread wants to do the work itself, and has marked the
operation with FSCACHE_OP_MYTHREAD, then it should monitor
FSCACHE_OP_WAITING as described above and check the state of the object if
necessary (the object might have died while the thread was waiting).
When it has finished doing its processing, it should call
fscache_op_complete() and fscache_put_operation() on it.
(4) The operation holds an effective lock upon the object, preventing
conflicting exclusive ops from running until it is released. The operation can be
enqueued for further immediate asynchronous processing by adjusting the
CPU time provisioning option if necessary, eg::
op->flags &= ~FSCACHE_OP_TYPE;
op->flags |= FSCACHE_OP_FAST;
and calling::
void fscache_enqueue_operation(struct fscache_operation *op)
This can be used to allow other things to have use of the worker thread
pools.
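Switching the CPU time provision in step (4) amounts to clearing the type bits and then ORing in the new type, leaving unrelated flags such as exclusivity alone. The sketch below models that with made-up constants; the ``MODEL_OP_*`` values and ``model_op_set_type()`` are assumptions for illustration and are not the real ``FSCACHE_OP_*`` definitions.

```c
/* Hypothetical flag layout: the low bits select the CPU time provision. */
#define MODEL_OP_TYPE		0x000fUL	/* mask covering the type bits */
#define MODEL_OP_FAST		0x0001UL
#define MODEL_OP_SLOW		0x0002UL
#define MODEL_OP_MYTHREAD	0x0003UL
#define MODEL_OP_EXCLUSIVE	0x0010UL	/* independent of the type bits */

/* Replace the provision type while preserving every other flag. */
unsigned long model_op_set_type(unsigned long flags, unsigned long type)
{
	flags &= ~MODEL_OP_TYPE;	/* drop the old provision type */
	flags |= type & MODEL_OP_TYPE;	/* install the new one (not its complement) */
	return flags;
}
```

Note that ORing in the complement of the type constant instead would set every bit except the intended one, which is why the second statement uses the plain value.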
Asynchronous Callback
=====================
When used in asynchronous mode, the worker thread pool will invoke the
processor method with a pointer to the operation. This should then get at the
container struct by using container_of()::
static void fscache_write_op(struct fscache_operation *_op)
{
struct fscache_storage *op =
container_of(_op, struct fscache_storage, op);
...
}
The caller holds a reference on the operation, and will invoke
fscache_put_operation() when the processor function returns. The processor
function is at liberty to call fscache_enqueue_operation() or to take extra
references.
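For readers unfamiliar with the idiom, the following userspace sketch shows how a processor function can recover its containing record from the embedded generic operation. The ``model_`` structures and the ``model_container_of()`` macro are stand-ins written for this example; the kernel provides its own ``container_of()``.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic operation record (hypothetical, heavily reduced). */
struct model_operation {
	unsigned long flags;
};

/* A more specific record with the generic op embedded within it. */
struct model_storage {
	int pages_stored;
	struct model_operation op;
};

/* Processor: handed only the generic op, it steps back to the container. */
int model_write_op(struct model_operation *_op)
{
	struct model_storage *op =
		model_container_of(_op, struct model_storage, op);

	return ++op->pages_stored;
}
```

Because the offset is computed from the member name, the embedded op does not need to be the first field of the containing struct.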


@ -454,7 +454,8 @@ operation table looks like the following::
void *term_func_priv);
int (*prepare_write)(struct netfs_cache_resources *cres,
loff_t *_start, size_t *_len, loff_t i_size);
loff_t *_start, size_t *_len, loff_t i_size,
bool no_space_allocated_yet);
int (*write)(struct netfs_cache_resources *cres,
loff_t start_pos,
@ -515,11 +516,14 @@ The methods defined in the table are:
* ``prepare_write()``
[Required] Called to adjust a write to the cache and check that there is
sufficient space in the cache. The start and length values indicate the
size of the write that netfslib is proposing, and this can be adjusted by
the cache to respect DIO boundaries. The file size is passed for
information.
[Required] Called to prepare a write to the cache to take place. This
involves checking to see whether the cache has sufficient space to honour
the write. ``*_start`` and ``*_len`` indicate the region to be written; the
region can be shrunk or it can be expanded to a page boundary either way as
necessary to align for direct I/O. ``i_size`` holds the size of the object and
is provided for reference. ``no_space_allocated_yet`` is set to true if the
caller is certain that no data has been written to that region - for example
if it tried to do a read from there already.
* ``write()``