Commit graph

262828 commits

Author SHA1 Message Date
Lars Ellenberg
569083c08d drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup
Still missing: rcu_readlock() on the various call sites that
access/iterate over those idrs.

We don't need a specific write lock, as we only modify from
configuration context, which is already strictly serialized.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:07 +02:00
Lars Ellenberg
da4a75d2ef drbd: introduce a bio_set to allocate housekeeping bios from
Don't rely on availability of bios from the global fs_bio_set,
we should use our own bio_set for meta data IO.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:06 +02:00
Lars Ellenberg
9db4e77f8c drbd: use the newly introduced page pool for bitmap IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:05 +02:00
Lars Ellenberg
35abf59424 drbd: add page pool to be used for meta data IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:04 +02:00
Lars Ellenberg
3c13b680ce drbd: only wakeup if something changed in update_peer_seq
This commit got it wrong:
    drbd: Make the peer_seq updating code more obvious

    Make it more clear that update_peer_seq() is supposed to wake up the
    seq_wait queue whenever the sequence number changes.

We don't need to wake up everytime we receive a sequence number
that is _different_ from our currently stored "newest" sequence number,
but only if we receive a sequence number _newer_ than what we already
have, when we actually change mdev->peer_seq.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:04 +02:00
Lars Ellenberg
a5df0e199c drbd: default to detach on-io-error
Old default behaviour was "pass-on",
which is not useful in production at all.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:03 +02:00
Lars Ellenberg
2c4a48d097 drbd: remove unused define
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:02 +02:00
Philipp Reisner
81a5d60ecf drbd: Replaced the minor_table array by an idr
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:01 +02:00
Philipp Reisner
774b305518 drbd: Implemented new commands to create/delete connections/minors
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner
80883197da drbd: Converted drbd_nl_(net_conf|disconnect)() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner
1aba4d7fcf drbd: Preparing the connector interface to operator on connections
Up to now it only operated on minor numbers. Now it can work also
on named connections.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:59 +02:00
Philipp Reisner
2f5cdd0b2c drbd: Converted the transfer log from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:58 +02:00
Philipp Reisner
49559d87fd drbd: Improved the dec_*() macros
Now those can be used with a struct drbd_conf * that has an other
name than 'mdev'.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:57 +02:00
Philipp Reisner
3f9cbe937e drbd: Removed the mdev parameter from the ..to_tags() and ...from_tags() functions
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:56 +02:00
Philipp Reisner
0e29d163f7 drbd: Reworked the unconfiguring and thread stopping code
* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
  conn_reconfig_start() and conn_reconfig_done().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:55 +02:00
Andreas Gruenbacher
c66342d949 drbd: Remove left-over function prototypes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:55 +02:00
Andreas Gruenbacher
7201b972de drbd: Replace get_asender_cmd() with its implementation
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:54 +02:00
Andreas Gruenbacher
6e849ce88c drbd: Get rid of P_MAX_CMD
Instead of artificially enlarging the command decoding arrays to
P_MAX_CMD entries, check if an index is within the valid range using the
ARRAY_SIZE() macro.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:53 +02:00
Andreas Gruenbacher
1b3bb47d52 drbd: Remove redundant check
Opening a device only succeeds on a primary node, or when explicitly
setting the allow_oos module parameter to allow opening the device
read-only on a secondary node.  There is no other way that a request can
get into drbd_make_request(), so this code cannot trigger.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:52 +02:00
Andreas Gruenbacher
7be8da0798 drbd: Improve how conflicting writes are handled
The previous algorithm for dealing with overlapping concurrent writes
was generating unnecessary warnings for scenarios which could be
legitimate, and did not always handle partially overlapping requests
correctly.  Improve it algorithm as follows:

* While local or remote write requests are in progress, conflicting new
  local write requests will be delayed (commit 82172f7).

* When a conflict between a local and remote write request is detected,
  the node with the discard flag decides how to resolve the conflict: It
  will ask its peer to discard conflicting requests which are fully
  contained in the local request and retry requests which overlap only
  partially.  This involves a protocol change.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:51 +02:00
Andreas Gruenbacher
71b1c1eb9c drbd: Use ping-timeout when waiting for missing ack packets
When the node with the discard flag resolves write conflicts in
dual-primary mode, it may determine that its peer has sent ack packets
on the metadata socket which did not arrive, yet.  Wait for the next ack
with ping-timeout instead of a hard-coded 30 seconds.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:51 +02:00
Andreas Gruenbacher
8ccf218e9f drbd: Replace atomic_add_return with atomic_inc_return
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:50 +02:00
Andreas Gruenbacher
206d358941 drbd: Concurrent write detection fix
Commit 9b1e63e changed the concurrent write detection algorithm to only insert
peer requests into write_requests tree after determining that there is no
conflict.  With this change, new conflicting local requests could be added
while the algorithm runs, but this case was not handled correctly.  Instead of
making the algorithm deal with this case, switch back to adding peer requests
to the write_requests tree immediately: this improves fairness.

When a peer request is discarded, remove that request from the write_requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:49 +02:00
Andreas Gruenbacher
8050e6d005 drbd: Use container_of() instead of casting
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:48 +02:00
Lars Ellenberg
9676c76097 drbd: fix a wrong likely(), updated comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:47 +02:00
Lars Ellenberg
c9d963a46d drbd: silence some log messages on bitmap IO
Summary log messages meant for global bitmap IO
should not be printed for bitmap IO caused by
activity log transactions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:47 +02:00
Lars Ellenberg
7ad651b522 drbd: new on-disk activity log transaction format
Use a new on-disk transaction format for the activity log, which allows
for multiple changes to the active set per transaction.

Using 4k transaction blocks, we can now get rid of the work-around code
to deal with devices not supporting 512 byte logical block size.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:46 +02:00
Lars Ellenberg
46a15bc3ec lru_cache: allow multiple changes per transaction
Allow multiple changes to the active set of elements in lru_cache.
The only current user of lru_cache, drbd, is driving this generalisation.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:45 +02:00
Lars Ellenberg
45dfffebd0 drbd: allow to select specific bitmap pages for writeout
We are about to allow several changes to the active set in one activity
log transaction. We have to write out the corresponding bitmap pages as
well, if changed.

Introduce drbd_bm_mark_for_writeout(), then re-use the existing bitmap
writeout path to submit all marked pages in one go.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:44 +02:00
Lars Ellenberg
a9efc748d6 lru_cache: consolidate lc_get and lc_try_get
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:43 +02:00
Lars Ellenberg
0097f0405d lru_cache.h: fix comments referring to ts_ instead of lc_
For some time we contemplated calling the "struct lru_cache"
a "struct tracked_set", and some comments kept the ts_ prefix.

Fix those to match the member field names.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:43 +02:00
Lars Ellenberg
4738fa1690 drbd: use clear_bit_unlock() where appropriate
Some open-coded clear_bit(); smp_mb__after_clear_bit();
should in fact have been smp_mb__before_clear_bit(); clear_bit();

Instead, use clear_bit_unlock() to annotate the intention,
and have it do the right thing.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:42 +02:00
Lars Ellenberg
61610420f7 drbd: in drbd_suspend_al, set AL_SUSPENDED before unlocking the activity log
As using an empty activity log is the whole point of the excercise,
make sure it is still empty when setting AL_SUSPENDED.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:41 +02:00
Lars Ellenberg
867f57483b drbd: fix typo in comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:40 +02:00
Lars Ellenberg
8c387def58 drbd: simplify condition in drbd_may_do_local_read()
fold
	if (x >= (N+1))
		return 0;
	if (x < N)
		return 0;
into
	if (x != N)
		return 0;

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher
c670a39867 drbd: Use the IS_ALIGNED() macro in some more places
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher
8ca9844f10 drbd: Remove obsolete comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:38 +02:00
Andreas Gruenbacher
d0e22a260c drbd: Iterate over all overlapping intervals in a tree
Add a macro and helper function for doing that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:37 +02:00
Andreas Gruenbacher
fcefa62e4c drbd: Rename drbd_endio_{pri,sec} -> drbd_{,peer_}request_endio
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:36 +02:00
Andreas Gruenbacher
fbe29dec98 drbd: Rename drbd_submit_ee -> drbd_submit_peer_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:35 +02:00
Philipp Reisner
df24aa45f4 drbd: Implemented connection wide state changes
That is used for graceful disconnect only

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:32 +02:00
Philipp Reisner
047cd4a682 drbd: implemented receiving of P_CONN_ST_CHG_REQ
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:05 +02:00
Philipp Reisner
fc3b10a45f drbd: Implemented receiving of P_CONN_ST_CHG_REPLY
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:04 +02:00
Philipp Reisner
5aabf467e3 drbd: Global_state_lock not necessary here...
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:03 +02:00
Philipp Reisner
cf29c9d8c8 drbd: Implemented conn_send_state_req()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:02 +02:00
Philipp Reisner
8410da8f0e drbd: Introduced tconn->cstate_mutex
In compatibility mode with old DRBDs, use that as the state_mutex
as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:01 +02:00
Philipp Reisner
dad2055481 drbd: Removed drbd_state_lock() and drbd_state_unlock()
The lock they constructed is only taken when the state_mutex
was already taken. It is superficial.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:01 +02:00
Philipp Reisner
bbeb641c3e drbd: Killed volume0; last step of multi-volume-enablement
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:44:58 +02:00
Philipp Reisner
56707f9e87 drbd: Code de-duplication; new function apply_mask_val()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:22 +02:00
Philipp Reisner
4308a0a390 drbd: Removed the os parameter form sanitize_state()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:21 +02:00