linux-stable

Commit Graph

Author	SHA1	Message	Date
George Spelvin	19acc77a36	random: Fix fast_mix() function There was a bad typo in commit `43759d4f42` ("random: use an improved fast_mix() function") and I didn't notice because it "looked right", so I saw what I expected to see when I reviewed it. Only months later did I look and notice it's not the Threefish-inspired mix function that I had designed and optimized. Mea Culpa. Each input bit still has a chance to affect each output bit, and the fast pool is spilled long before it fills, so it's not a total disaster, but it's definitely not the intended great improvement. I'm still working on finding better rotation constants. These are good enough, but since it's unrolled twice, it's possible to get better mixing for free by using eight different constants rather than repeating the same four. Signed-off-by: George Spelvin <linux@horizon.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-09 12:28:42 -08:00
Linus Torvalds	14d4cc0883	This adds a memzero_explicit() call which is guaranteed not to be optimized away by GCC. This is important when we are wiping cryptographically sensitive material. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJUQTmuAAoJENNvdpvBGATwFToP/jOGL/Z5NE7Oa33jC+oRDdEC 6gDXi27emzkll5BsxRLOR26vxXZ9AsBBI+U9pmhy64pcSUSxocTIZ+Bh0bx/LQyd w6HTTTYFk9GNtQCGrxRoNBPLdH/qz83ClvlWmpjsYpIEFfSOU3YncygSbps3uSeZ tdXiI5G1zZNGrljQrL+roJCZX5TP4XxHFbdUjeyV9Z8210oYTwCfpzHjg9+D24f0 rwTOHa0Lp6IrecU4Vlq4PFP+y4/ZdYYVwnpyX5UtTHP3QP176PcrwvnAl4Ys/8Lx 9uqj+gNrUnC6KHsSKhUxwMq9Ch7nu6iLLAYuIUMvxZargsmbNQFShHZyu2mwDgko bp+oTw8byOQyv6g/hbFpTVwfwpiv/AGu8VxmG3ORGqndOldTh+oQ9xMnuBZA8sXX PxHxEUY9hr66nVFg4iuxT/2KJJA+Ol8ARkB0taCWhwavzxXJeedEVEw5nbtQxRsM AJGxjBsAgSw7SJD03yAQH5kRGYvIdv03JRbIiMPmKjlP+pl1JkzOAPhVMUD+24vI x6oFpSa5FH5utlt3nCZuxlOYBuWhWKIhUzEoY2HwCsyISQScPcwL9EP15sWceY5i 8+Wylvf+yqGVU3KopCBBV/oX3Wm/kj1A8OP/4Kk8UHw9k2btjYETYayhP1DHKnIt /4pr4+oGd5GlFOHRteXp =i29U -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random updates from Ted Ts'o: "This adds a memzero_explicit() call which is guaranteed not to be optimized away by GCC. This is important when we are wiping cryptographically sensitive material" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: crypto: memzero_explicit - make sure to clear out sensitive data random: add and use memzero_explicit() for clearing data	2014-10-24 12:33:32 -07:00
Daniel Borkmann	d4c5efdb97	random: add and use memzero_explicit() for clearing data zatimend has reported that in his environment (3.16/gcc4.8.3/corei7) memset() calls which clear out sensitive data in extract_{buf,entropy, entropy_user}() in random driver are being optimized away by gcc. Add a helper memzero_explicit() (similarly as explicit_bzero() variants) that can be used in such cases where a variable with sensitive data is being cleared out in the end. Other use cases might also be in crypto code. [ I have put this into lib/string.c though, as it's always built-in and doesn't need any dependencies then. ] Fixes kernel bugzilla: 82041 Reported-by: zatimend@hotmail.co.uk Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-10-17 11:37:29 -04:00
Christoph Lameter	1b2a1a7e8a	drivers/char/random: Replace __get_cpu_var uses A single case of using __get_cpu_var for address calculation. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-08-26 13:45:45 -04:00
Linus Torvalds	f4f142ed4e	Cleanups and bug fixes to /dev/random, add a new getrandom(2) system call, which is a superset of OpenBSD's getentropy(2) call, for use with userspace crypto libraries such as LibreSSL. Also add the ability to have a kernel thread to pull entropy from hardware rng devices into /dev/random. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJT4VkhAAoJENNvdpvBGATwGMwP/0DvcJnk8Xg2pE67GrBlkL4V ltDYZBUNI3Z9YqPFMbN02kt8jBJ4o8NVrD9XXSAmk0NbNV6pc4SdGUU7BBcms4BF DX4CasmQS1EMKOxsszlvEbj9Q25u9ODJhUKsr1ZQKe3wfjx1gKRQ1QHHcrqgbGc0 tjkBU/TW+8daza6dGYrUrO34BPeN5Y4xbBG5WmVOLGgbDH7J3ZKGzkG21R5zHraI tPJzZ3KGj+Cf1TtamBOpyF+SLqM7qi43JY/1l8LfDzJgJhB3NxOR1ig/Pk6z1qLi 2xYm1hb+EQqJGaToMXEl5fLLcYfnJmLYD/dWNq/pOVXFqC5cGxYIH1h+Nwzywvy3 hVqh4yDU5HXgu8mOMPPc23azicJflZwCNq0vTTDE+orYnb8n9Sbg0l+rUQ45BZua tVfGKT1LZuYtM0axYQ4fIfqS9bxsyRJcF6HNNaEMQJsm0V0prwlz0hXkaod1uOJd CwOn9+CpZUGCgj5paRS+zTOtcl39+X1tIhcWTHEDMpMzIqnk8KpkLGqCDisBZNBF UbjEaTA8w6tBxRX5FZ9qdmRFvsxCJH7nOxmmsaIOZ/7QXQHQNrxI2+v6yd4HWJAw yZnaVR5o6sojKc8zp9nOXQ219G1zvt4l6XyTqIP+gKWJGDKGCsMXXzEg1OchO+rI Oo8s5+ytZB9qei7QwLAf =wLqJ -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull randomness updates from Ted Ts'o: "Cleanups and bug fixes to /dev/random, add a new getrandom(2) system call, which is a superset of OpenBSD's getentropy(2) call, for use with userspace crypto libraries such as LibreSSL. Also add the ability to have a kernel thread to pull entropy from hardware rng devices into /dev/random" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: hwrng: Pass entropy to add_hwgenerator_randomness() in bits, not bytes random: limit the contribution of the hw rng to at most half random: introduce getrandom(2) system call hw_random: fix sparse warning (NULL vs 0 for pointer) random: use registers from interrupted code for CPU's w/o a cycle counter hwrng: add per-device entropy derating hwrng: create filler thread random: add_hwgenerator_randomness() for feeding entropy from devices random: use an improved fast_mix() function random: clean up interrupt entropy accounting for archs w/o cycle counters random: only update the last_pulled time if we actually transferred entropy random: remove unneeded hash of a portion of the entropy pool random: always update the entropy pool under the spinlock	2014-08-06 08:16:24 -07:00
Theodore Ts'o	48d6be955a	random: limit the contribution of the hw rng to at most half For people who don't trust a hardware RNG which can not be audited, the changes to add support for RDSEED can be troubling since 97% or more of the entropy will be contributed from the in-CPU hardware RNG. We now have a in-kernel khwrngd, so for those people who do want to implicitly trust the CPU-based system, we could create an arch-rng hw_random driver, and allow khwrng refill the entropy pool. This allows system administrator whether or not they trust the CPU (I assume the NSA will trust RDRAND/RDSEED implicitly :-), and if so, what level of entropy derating they want to use. The reason why this is a really good idea is that if different people use different levels of entropy derating, it will make it much more difficult to design a backdoor'ed hwrng that can be generally exploited in terms of the output of /dev/random when different attack targets are using differing levels of entropy derating. Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-08-05 16:41:50 -04:00
Theodore Ts'o	c6e9d6f388	random: introduce getrandom(2) system call The getrandom(2) system call was requested by the LibreSSL Portable developers. It is analoguous to the getentropy(2) system call in OpenBSD. The rationale of this system call is to provide resiliance against file descriptor exhaustion attacks, where the attacker consumes all available file descriptors, forcing the use of the fallback code where /dev/[u]random is not available. Since the fallback code is often not well-tested, it is better to eliminate this potential failure mode entirely. The other feature provided by this new system call is the ability to request randomness from the /dev/urandom entropy pool, but to block until at least 128 bits of entropy has been accumulated in the /dev/urandom entropy pool. Historically, the emphasis in the /dev/urandom development has been to ensure that urandom pool is initialized as quickly as possible after system boot, and preferably before the init scripts start execution. This is because changing /dev/urandom reads to block represents an interface change that could potentially break userspace which is not acceptable. In practice, on most x86 desktop and server systems, in general the entropy pool can be initialized before it is needed (and in modern kernels, we will printk a warning message if not). However, on an embedded system, this may not be the case. And so with this new interface, we can provide the functionality of blocking until the urandom pool has been initialized. Any userspace program which uses this new functionality must take care to assure that if it is used during the boot process, that it will not cause the init scripts or other portions of the system startup to hang indefinitely. SYNOPSIS #include <linux/random.h> int getrandom(void buf, size_t buflen, unsigned int flags); DESCRIPTION The system call getrandom() fills the buffer pointed to by buf with up to buflen random bytes which can be used to seed user space random number generators (i.e., DRBG's) or for other cryptographic uses. It should not be used for Monte Carlo simulations or other programs/algorithms which are doing probabilistic sampling. If the GRND_RANDOM flags bit is set, then draw from the /dev/random pool instead of the /dev/urandom pool. The /dev/random pool is limited based on the entropy that can be obtained from environmental noise, so if there is insufficient entropy, the requested number of bytes may not be returned. If there is no entropy available at all, getrandom(2) will either block, or return an error with errno set to EAGAIN if the GRND_NONBLOCK bit is set in flags. If the GRND_RANDOM bit is not set, then the /dev/urandom pool will be used. Unlike using read(2) to fetch data from /dev/urandom, if the urandom pool has not been sufficiently initialized, getrandom(2) will block (or return -1 with the errno set to EAGAIN if the GRND_NONBLOCK bit is set in flags). The getentropy(2) system call in OpenBSD can be emulated using the following function: int getentropy(void buf, size_t buflen) { int ret; if (buflen > 256) goto failure; ret = getrandom(buf, buflen, 0); if (ret < 0) return ret; if (ret == buflen) return 0; failure: errno = EIO; return -1; } RETURN VALUE On success, the number of bytes that was filled in the buf is returned. This may not be all the bytes requested by the caller via buflen if insufficient entropy was present in the /dev/random pool, or if the system call was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. ERRORS EINVAL An invalid flag was passed to getrandom(2) EFAULT buf is outside the accessible address space. EAGAIN The requested entropy was not available, and getentropy(2) would have blocked if the GRND_NONBLOCK flag was not set. EINTR While blocked waiting for entropy, the call was interrupted by a signal handler; see the description of how interrupted read(2) calls on "slow" devices are handled with and without the SA_RESTART flag in the signal(7) man page. NOTES For small requests (buflen <= 256) getrandom(2) will not return EINTR when reading from the urandom pool once the entropy pool has been initialized, and it will return all of the bytes that have been requested. This is the recommended way to use getrandom(2), and is designed for compatibility with OpenBSD's getentropy() system call. However, if you are using GRND_RANDOM, then getrandom(2) may block until the entropy accounting determines that sufficient environmental noise has been gathered such that getrandom(2) will be operating as a NRBG instead of a DRBG for those people who are working in the NIST SP 800-90 regime. Since it may block for a long time, these guarantees do not apply. The user may want to interrupt a hanging process using a signal, so blocking until all of the requested bytes are returned would be unfriendly. For this reason, the user of getrandom(2) MUST always check the return value, in case it returns some error, or if fewer bytes than requested was returned. In the case of !GRND_RANDOM and small request, the latter should never happen, but the careful userspace code (and all crypto code should be careful) should check for this anyway! Finally, unless you are doing long-term key generation (and perhaps not even then), you probably shouldn't be using GRND_RANDOM. The cryptographic algorithms used for /dev/urandom are quite conservative, and so should be sufficient for all purposes. The disadvantage of GRND_RANDOM is that it can block, and the increased complexity required to deal with partially fulfilled getrandom(2) requests. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Zach Brown <zab@zabbo.net>	2014-08-05 16:41:22 -04:00
Hannes Frederic Sowa	79a8468747	random: check for increase of entropy_count because of signed conversion The expression entropy_count -= ibytes << (ENTROPY_SHIFT + 3) could actually increase entropy_count if during assignment of the unsigned expression on the RHS (mind the -=) we reduce the value modulo 2^width(int) and assign it to entropy_count. Trinity found this. [ Commit modified by tytso to add an additional safety check for a negative entropy_count -- which should never happen, and to also add an additional paranoia check to prevent overly large count values to be passed into urandom_read(). ] Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2014-07-19 01:42:13 -04:00
Theodore Ts'o	ee3e00e9e7	random: use registers from interrupted code for CPU's w/o a cycle counter For CPU's that don't have a cycle counter, or something equivalent which can be used for random_get_entropy(), random_get_entropy() will always return 0. In that case, substitute with the saved interrupt registers to add a bit more unpredictability. Some folks have suggested hashing all of the registers unconditionally, but this would increase the overhead of add_interrupt_randomness() by at least an order of magnitude, and this would very likely be unacceptable. The changes in this commit have been benchmarked as mostly unaffecting the overhead of add_interrupt_randomness() if the entropy counter is present, and doubling the overhead if it is not present. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Jörn Engel <joern@logfs.org>	2014-07-15 04:49:41 -04:00
Torsten Duwe	c84dbf61a7	random: add_hwgenerator_randomness() for feeding entropy from devices This patch adds an interface to the random pool for feeding entropy in-kernel. Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: H. Peter Anvin <hpa@zytor.com>	2014-07-15 04:49:40 -04:00
Theodore Ts'o	43759d4f42	random: use an improved fast_mix() function Use more efficient fast_mix() function. Thanks to George Spelvin for doing the leg work to find a more efficient mixing function. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: George Spelvin <linux@horizon.com>	2014-07-15 04:49:40 -04:00
Theodore Ts'o	840f95077f	random: clean up interrupt entropy accounting for archs w/o cycle counters For architectures that don't have cycle counters, the algorithm for deciding when to avoid giving entropy credit due to back-to-back timer interrupts didn't make any sense, since we were checking every 64 interrupts. Change it so that we only give an entropy credit if the majority of the interrupts are not based on the timer. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: George Spelvin <linux@horizon.com>	2014-07-15 04:49:39 -04:00
Theodore Ts'o	cff850312c	random: only update the last_pulled time if we actually transferred entropy In xfer_secondary_pull(), check to make sure we need to pull from the secondary pool before checking and potentially updating the last_pulled time. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: George Spelvin <linux@horizon.com>	2014-07-15 04:49:39 -04:00
Theodore Ts'o	85608f8e16	random: remove unneeded hash of a portion of the entropy pool We previously extracted a portion of the entropy pool in mix_pool_bytes() and hashed it in to avoid racing CPU's from returning duplicate random values. Now that we are using a spinlock to prevent this from happening, this is no longer necessary. So remove it, to simplify the code a bit. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: George Spelvin <linux@horizon.com>	2014-07-15 04:49:39 -04:00
Theodore Ts'o	91fcb532ef	random: always update the entropy pool under the spinlock Instead of using lockless techniques introduced in commit `902c098a36`, use spin_trylock to try to grab entropy pool's lock. If we can't get the lock, then just try again on the next interrupt. Based on discussions with George Spelvin. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: George Spelvin <linux@horizon.com>	2014-07-15 04:49:39 -04:00
Linus Torvalds	5ee22beeb2	random: fix entropy accounting bug introduced in v3.15 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJToH5YAAoJENNvdpvBGATwtuIQAOHsQAPDHbo7iSullr/tOTRd BZhFfdiG47tS4FkVYsrqSFCloROkneSCIIN0HLeTRbt4hA4SjN+jEkM2mtQ0dA/t ++DVgzFxMUvb7yOIA4uQk1C3kxlvPdx9EeGMHnSZ9u/uNUwfgqvlQ7r+k+kldtGp J+Ouaoy7w+XeXPy3JrFnKmvvFTjC94h0T7VWPJlqXRFmu8fN6sCxgXPfsdQkxcXw q75sD11nuVhUDy8CQbFfT1IHDshiBnFMm6muIipZcY0zu/ecutBkwpA+//ommxnM xPWf1vt3hJj3IGqgz9I0pJhBTHkpmmqVlW8pDMgNVwbAu7kEVrJ0YKfQLkP1JRbF lJe5G0Iy27y1Lx+UBw8WnGe/BxAE+8Ljq1p2gE5qbVZfB7w5/zgZDbREGdZG/+8K kZrYth4gKNVJEZBu1S6g0NSYG6DkF3voMRSan5U+t6pXR7PhEDMl+m4ablUnZjCQ tNK4rPKVtbisfOHcAEd5FNmHOat3hJ6WNAa3dzv7LEH6v2PPU7q1JVDr5tbvmhZr qW63+TvIpfX2kA0DkPnMnj8f3gXrRtZdUXeQF4RTMZRe26Sg262/bx2nR9h4H77n +x75tswu0epo9x/Ip/m9sC6MOzB2s4MUrCEZjpBVzvbgueIo7A16kMKsJbHRRtos 4nMa2AnoMMkfoDn7uQtm =UeTe -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull randomness bugfix from Ted Ts'o: "random: fix entropy accounting bug introduced in v3.15" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: fix nasty entropy accounting bug	2014-06-17 14:23:14 -10:00
Theodore Ts'o	e33ba5fa7a	random: fix nasty entropy accounting bug Commit `0fb7a01af5` "random: simplify accounting code", introduced in v3.15, has a very nasty accounting problem when the entropy pool has has fewer bytes of entropy than the number of requested reserved bytes. In that case, "have_bytes - reserved" goes negative, and since size_t is unsigned, the expression: ibytes = min_t(size_t, ibytes, have_bytes - reserved); ... does not do the right thing. This is rather bad, because it defeats the catastrophic reseeding feature in the xfer_secondary_pool() path. It also can cause the "BUG: spinlock trylock failure on UP" for some kernel configurations when prandom_reseed() calls get_random_bytes() in the early init, since when the entropy count gets corrupted, credit_entropy_bits() erroneously believes that the nonblocking pool has been fully initialized (when in fact it is not), and so it calls prandom_reseed(true) recursively leading to the spinlock BUG. The logic is not the same it was originally, but in the cases where it matters, the behavior is the same, and the resulting code is hopefully easier to read and understand. Fixes: `0fb7a01af5` "random: simplify accounting code" Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Greg Price <price@mit.edu> Cc: stable@vger.kernel.org #v3.15	2014-06-15 21:04:32 -04:00
Joe Perches	5eb10d912e	random: convert use of typedef ctl_table to struct ctl_table This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:15 -07:00
Linus Torvalds	681a289548	Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next Pull block core updates from Jens Axboe: "It's a big(ish) round this time, lots of development effort has gone into blk-mq in the last 3 months. Generally we're heading to where 3.16 will be a feature complete and performant blk-mq. scsi-mq is progressing nicely and will hopefully be in 3.17. A nvme port is in progress, and the Micron pci-e flash driver, mtip32xx, is converted and will be sent in with the driver pull request for 3.16. This pull request contains: - Lots of prep and support patches for scsi-mq have been integrated. All from Christoph. - API and code cleanups for blk-mq from Christoph. - Lots of good corner case and error handling cleanup fixes for blk-mq from Ming Lei. - A flew of blk-mq updates from me: * Provide strict mappings so that the driver can rely on the CPU to queue mapping. This enables optimizations in the driver. * Provided a bitmap tagging instead of percpu_ida, which never really worked well for blk-mq. percpu_ida relies on the fact that we have a lot more tags available than we really need, it fails miserably for cases where we exhaust (or are close to exhausting) the tag space. * Provide sane support for shared tag maps, as utilized by scsi-mq * Various fixes for IO timeouts. * API cleanups, and lots of perf tweaks and optimizations. - Remove 'buffer' from struct request. This is ancient code, from when requests were always virtually mapped. Kill it, to reclaim some space in struct request. From me. - Remove 'magic' from blk_plug. Since we store these on the stack and since we've never caught any actual bugs with this, lets just get rid of it. From me. - Only call part_in_flight() once for IO completion, as includes two atomic reads. Hopefully we'll get a better implementation soon, as the part IO stats are now one of the more expensive parts of doing IO on blk-mq. From me. - File migration of block code from {mm,fs}/ to block/. This includes bio.c, bio-integrity.c, bounce.c, and ioprio.c. From me, from a discussion on lkml. That should describe the meat of the pull request. Also has various little fixes and cleanups from Dave Jones, Shaohua Li, Duan Jiong, Fengguang Wu, Fabian Frederick, Randy Dunlap, Robert Elliott, and Sam Bradshaw" * 'for-3.16/core' of git://git.kernel.dk/linux-block: (100 commits) blk-mq: push IPI or local end_io decision to __blk_mq_complete_request() blk-mq: remember to start timeout handler for direct queue block: ensure that the timer is always added blk-mq: blk_mq_unregister_hctx() can be static blk-mq: make the sysfs mq/ layout reflect current mappings blk-mq: blk_mq_tag_to_rq should handle flush request block: remove dead code in scsi_ioctl:blk_verify_command blk-mq: request initialization optimizations block: add queue flag for disabling SG merging block: remove 'magic' from struct blk_plug blk-mq: remove alloc_hctx and free_hctx methods blk-mq: add file comments and update copyright notices blk-mq: remove blk_mq_alloc_request_pinned blk-mq: do not use blk_mq_alloc_request_pinned in blk_mq_map_request blk-mq: remove blk_mq_wait_for_tags blk-mq: initialize request in __blk_mq_alloc_request blk-mq: merge blk_mq_alloc_reserved_request into blk_mq_alloc_request blk-mq: add helper to insert requests from irq context blk-mq: remove stale comment for blk_mq_complete_request() blk-mq: allow non-softirq completions ...	2014-06-02 09:29:34 -07:00
Theodore Ts'o	f9c6d4987b	random: fix BUG_ON caused by accounting simplification Commit `ee1de406ba` ("random: simplify accounting logic") simplified things too much, in that it allows the following to trigger an overflow that results in a BUG_ON crash: dd if=/dev/urandom of=/dev/zero bs=67108707 count=1 Thanks to Peter Zihlstra for discovering the crash, and Hannes Frederic for analyizing the root cause. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Greg Price <price@mit.edu>	2014-05-16 22:18:22 -04:00
Christoph Hellwig	bdcfa3e57c	random: export add_disk_randomness This will be needed for pending changes to the scsi midlayer that now calls lower level block APIs, as well as any blk-mq driver that wants to contribute to the random pool. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-04-28 09:29:55 -06:00
H. Peter Anvin	7b878d4b48	random: Add arch_has_random[_seed]() Add predicate functions for having arch_get_random[_seed]*(). The only current use is to avoid the loop in arch_random_refill() when arch_get_random_seed_long() is unavailable. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-03-19 22:24:08 -04:00
H. Peter Anvin	331c6490c7	random: If we have arch_get_random_seed(), try it before blocking If we have arch_get_random_seed(), try to use it for emergency refill of the entropy pool before giving up and blocking on /dev/random. It may or may not work in the moment, but if it does work, it will give the user better service than blocking will. Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-03-19 22:22:06 -04:00
H. Peter Anvin	83664a6928	random: Use arch_get_random_seed() at init time and once a second Use arch_get_random_seed() in two places in the Linux random driver (drivers/char/random.c): 1. During entropy pool initialization, use RDSEED in favor of RDRAND, with a fallback to the latter. Entropy exhaustion is unlikely to happen there on physical hardware as the machine is single-threaded at that point, but could happen in a virtual machine. In that case, the fallback to RDRAND will still provide more than adequate entropy pool initialization. 2. Once a second, issue RDSEED and, if successful, feed it to the entropy pool. To ensure an extra layer of security, only credit half the entropy just in case. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-03-19 22:22:06 -04:00
Theodore Ts'o	46884442fc	random: use the architectural HWRNG for the SHA's IV in extract_buf() To help assuage the fears of those who think the NSA can introduce a massive hack into the instruction decode and out of order execution engine in the CPU without hundreds of Intel engineers knowing about it (only one of which woud need to have the conscience and courage of Edward Snowden to spill the beans to the public), use the HWRNG to initialize the SHA starting value, instead of xor'ing it in afterwards. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:52 -04:00
Greg Price	2132a96f66	random: clarify bits/bytes in wakeup thresholds These are a recurring cause of confusion, so rename them to hopefully be clearer. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	7d1b08c40c	random: entropy_bytes is actually bits The variable 'entropy_bytes' is set from an expression that actually counts bits. Fortunately it's also only compared to values that also count bits. Rename it accordingly. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	0fb7a01af5	random: simplify accounting code With this we handle "reserved" in just one place. As a bonus the code becomes less nested, and the "wakeup_write" flag variable becomes unnecessary. The variable "flags" was already unused. This code behaves identically to the previous version except in two pathological cases that don't occur. If the argument "nbytes" is already less than "min", then we didn't previously enforce "min". If r->limit is false while "reserved" is nonzero, then we previously applied "reserved" in checking whether we had enough bits, even though we don't apply it to actually limit how many we take. The callers of account() never exercise either of these cases. Before the previous commit, it was possible for "nbytes" to be less than "min" if userspace chose a pathological configuration, but no longer. Cc: Jiri Kosina <jkosina@suse.cz> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	8c2aa3390e	random: tighten bound on random_read_wakeup_thresh We use this value in a few places other than its literal meaning, in particular in _xfer_secondary_pool() as a minimum number of bits to pull from the input pool at a time into either output pool. It doesn't make sense to pull more bits than the whole size of an output pool. We could and possibly should separate the quantities "how much should the input pool have to have to wake up /dev/random readers" and "how much should we transfer from the input to an output pool at a time", but nobody is likely to be sad they can't set the first quantity to more than 1024 bits, so for now just limit them both. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	a58aa4edc6	random: forget lock in lockless accounting The only mutable data accessed here is ->entropy_count, but since `10b3a32d2` ("random: fix accounting race condition") we use cmpxchg to protect our accesses to ->entropy_count here. Drop the use of the lock. Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	ee1de406ba	random: simplify accounting logic This logic is exactly equivalent to the old logic, but it should be easier to see what it's doing. The equivalence depends on one fact from outside this function: when 'r->limit' is false, 'reserved' is zero. (Well, two facts; the other is that 'reserved' is never negative.) Cc: Jiri Kosina <jkosina@suse.cz> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:51 -04:00
Greg Price	19fa5be1d9	random: fix comment on "account" This comment didn't quite keep up as extract_entropy() was split into four functions. Put each bit by the function it describes. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:50 -04:00
Greg Price	12ff3a517a	random: simplify loop in random_read The loop condition never changes until just before a break, so we might as well write it as a constant. Also since `a996996dd7` ("random: drop weird m_time/a_time manipulation") we don't do anything after the loop finishes, so the 'break's might as well return directly. Some other simplifications. There should be no change in behavior introduced by this commit. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:50 -04:00
Greg Price	18e9cea749	random: fix description of get_random_bytes After this remark was written, commit `d2e7c96af` added a use of arch_get_random_long() inside the get_random_bytes codepath. The main point stands, but it needs to be reworded. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:50 -04:00
Greg Price	f22052b202	random: fix comment on proc_do_uuid There's only one function here now, as uuid_strategy is long gone. Also make the bit about "If accesses via ..." clearer. Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:50 -04:00
Greg Price	dfd38750db	random: fix typos / spelling errors in comments Signed-off-by: Greg Price <price@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2014-03-19 22:18:50 -04:00
Linus Torvalds	0891ad829d	The /dev/random changes for 3.13 including a number of improvements in the following areas: performance, avoiding waste of entropy, better tracking of entropy estimates, support for non-x86 platforms that have a register which can't be used for fine-grained timekeeping, but which might be good enough for the random driver. Also add some printk's so that we can see how quickly /dev/urandom can get initialized, and when programs try to use /dev/urandom before it is fully initialized (since this could be a security issue). This shouldn't be an issue on x86 desktop/laptops --- a test on my Lenovo T430s laptop shows that /dev/urandom is getting fully initialized approximately two seconds before the root file system is mounted read/write --- this may be an issue with ARM and MIPS embedded/mobile systems, though. These printk's will be a useful canary before potentially adding a future change to start blocking processes which try to read from /dev/urandom before it is initialized, which is something FreeBSD does already for security reasons, and which security folks have been agitating for Linux to also adopt. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABCAAGBQJShC4MAAoJENNvdpvBGATwC0QQAMujsIxTZnsHwQrbb5eJf1kD 74TwQyEfWw5qnGQrc8JOoAbe1MG7C4QlfHxRsWxvCD8G+Mft4Q5ZgZOt0/ecAGD6 Tid58EaZGSfK9+YE6jgvJFekQADCREdPSxBASJ3cECT6dXXBX9IqR9gbAK02mM+w QZdbgWBMsPJZiHSsCNeRbZ9oIiPdcNDsMJwzJhirPUeAnKCaX3z+LWc3XcMw7wYi q5cSl0ENZd6QsBKs37A1ol5BtLEsoot2t3HKdnpOBsDQKSJ712KduwN5jUfs6h9D 0fqmVHwfKsge+D8/3NgBKz+yWLQnGkuB4Ibo+09BZXwH3rYU1/gKm0iLNi0yQ5fV 73bn4pqF6cZdDNgj0Ic+MyYAW+S/NOQ6TcF/3eSAPW6z/wHZOfZ2njCh1GEHBOKI 6iZZu+Ek7QyFJ/z5Fr1bXFJR7V99r7hRD3gwMCMZ/mjhloB2cyD0a2A9kFP85ykI I4tFEnq0FpX/K60ag4hiLnqVx/TsmbdMoz+8OpQckHgQJrZMuRRf1d+T4au47Y6K uXGLpSuvkALYW2koo2OoO2d873N/89fqFL8lI8Iy0YlgAxxxm++gl1Mql/E1wPOa 5jB0lW/jex/CquE7meTgRlM/fTU/HVbe3608ZNUYBJUHS9K/PaSnCCu2ya8/TsSW xeVS/vMnNvtGerdEIyKm =wla0 -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random changes from Ted Ts'o: "The /dev/random changes for 3.13 including a number of improvements in the following areas: performance, avoiding waste of entropy, better tracking of entropy estimates, support for non-x86 platforms that have a register which can't be used for fine-grained timekeeping, but which might be good enough for the random driver. Also add some printk's so that we can see how quickly /dev/urandom can get initialized, and when programs try to use /dev/urandom before it is fully initialized (since this could be a security issue). This shouldn't be an issue on x86 desktop/laptops --- a test on my Lenovo T430s laptop shows that /dev/urandom is getting fully initialized approximately two seconds before the root file system is mounted read/write --- this may be an issue with ARM and MIPS embedded/mobile systems, though. These printk's will be a useful canary before potentially adding a future change to start blocking processes which try to read from /dev/urandom before it is initialized, which is something FreeBSD does already for security reasons, and which security folks have been agitating for Linux to also adopt" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: add debugging code to detect early use of get_random_bytes() random: initialize the last_time field in struct timer_rand_state random: don't zap entropy count in rand_initialize() random: printk notifications for urandom pool initialization random: make add_timer_randomness() fill the nonblocking pool first random: convert DEBUG_ENT to tracepoints random: push extra entropy to the output pools random: drop trickle mode random: adjust the generator polynomials in the mixing function slightly random: speed up the fast_mix function by a factor of four random: cap the rate which the /dev/urandom pool gets reseeded random: optimize the entropy_store structure random: optimize spinlock use in add_device_randomness() random: fix the tracepoint for get_random_bytes(_arch) random: account for entropy loss due to overwrites random: allow fractional bits to be tracked random: statically compute poolbitshift, poolbytes, poolbits random: mix in architectural randomness earlier in extract_buf()	2013-11-16 10:19:15 -08:00
Hannes Frederic Sowa	4af712e8df	random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized The Tausworthe PRNG is initialized at late_initcall time. At that time the entropy pool serving get_random_bytes is not filled sufficiently. This patch adds an additional reseeding step as soon as the nonblocking pool gets marked as initialized. On some machines it might be possible that late_initcall gets called after the pool has been initialized. In this situation we won't reseed again. (A call to prandom_seed_late blocks later invocations of early reseed attempts.) Joint work with Daniel Borkmann. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-11 14:32:14 -05:00
Theodore Ts'o	392a546dc8	random: add debugging code to detect early use of get_random_bytes() Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-03 18:24:08 -05:00
Theodore Ts'o	644008df89	random: initialize the last_time field in struct timer_rand_state Since we initialize jiffies to wrap five minutes before boot (see INITIAL_JIFFIES defined in include/linux/jiffies.h) it's important to make sure the last_time field is initialized to INITIAL_JIFFIES. Otherwise, the entropy estimator will overestimate the amount of entropy resulting from the first call to add_timer_randomness(), generally by about 8 bits. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-03 18:20:05 -05:00
Theodore Ts'o	ae9ecd92dd	random: don't zap entropy count in rand_initialize() The rand_initialize() function was being run fairly late in the kernel boot sequence. This was unfortunate, since it zero'ed the entropy counters, thus throwing away credit that was accumulated earlier in the boot sequence, and it also meant that initcall functions run before rand_initialize were using a minimally initialized pool. To fix this, fix init_std_data() to no longer zap the entropy counter; it wasn't necessary, and move rand_initialize() to be an early initcall. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-03 18:18:49 -05:00
Theodore Ts'o	301f0595c0	random: printk notifications for urandom pool initialization Print a notification to the console when the nonblocking pool is initialized. Also printk a warning when a process tries reading from /dev/urandom before it is fully initialized. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-03 18:18:48 -05:00
Theodore Ts'o	40db23e533	random: make add_timer_randomness() fill the nonblocking pool first Change add_timer_randomness() so that it directs incoming entropy to the nonblocking pool first if it hasn't been fully initialized yet. This matches the strategy we use in add_interrupt_randomness(), which allows us to push the randomness where we need it the most during when the system is first booting up, so that get_random_bytes() and /dev/urandom become safe to use as soon as possible. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-11-03 18:18:47 -05:00
Linus Torvalds	f715729ee4	These patches are designed to enable improvements to /dev/random for non-x86 platforms, in particular MIPS and ARM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABCAAGBQJSVvO/AAoJENNvdpvBGATwNZ4P+wadRWY/Gdz/p9332qdVrGYs nP4DPWSg+n3RH/fOnacEwHF5vqapTe03G82NriCaVGFP8O9j7bo6ByMKKkIR7yvr 4sHUX4YMc/DwchaIHH2xp8fQoMc3Mv7mn8bodTtPXgveeldEvtuUQM0q+j4DXZUT qSLMGElgJYrpIf2Cm8JAIBkt2QuzpZPPX7Z6glZunpvfLSMmgn3Vj2ilNEx1YCFH v+Rk1ZYLjg2LzUYqaO7HOXlRJqmE10I7ZmNvPXJZ9fuPmGYD9FU6WeHhmIAFYdFw V6bAzou+LbnuNVoW6yiDvrKcOXgh2Spbk6SaKVSrcjVPfc87ocNzGWI4OTfNy1xI Kv9u4YfU3pIUWPDGx0mvT/KXAXl/PGVfu7bYXDcN2I2tqlrbBPdIWqpFB2eTn7/j //XbatoT6gGZTuseCKhYXWpG8AE5pCfbjGnd9il21fvlUDdkIq42wAs96qjc6Ruj tPCi5yYzLiHsn4eau+SJqI1KxPLf6YJw9Qo+f70FGl63wXJU9Vr07ID2rGTwXm1m Qf1joTtx900PvfzUaD0ODbQZaTbX6ebSOkriKpKWYwg+26Gdc7JAxIVI3HDOlOR+ ++r1M4ERwDic/xdVsB6Mngmop3d1BeNU2IAoiRDZwcJpS1+MLivlIbd1PjBAt0bU +oOm+wseHEzSnlgucQ0g =qnTe -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random changes from Ted Ts'o: "These patches are designed to enable improvements to /dev/random for non-x86 platforms, in particular MIPS and ARM" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: allow architectures to optionally define random_get_entropy() random: run random_int_secret_init() run after all late_initcalls	2013-10-10 12:31:43 -07:00
Theodore Ts'o	f80bbd8b92	random: convert DEBUG_ENT to tracepoints Instead of using the random driver's ad-hoc DEBUG_ENT() mechanism, use tracepoints instead. This allows for a much more fine-grained control of which debugging mechanism which a developer might need, and unifies the debugging messages with all of the existing tracepoints. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:23 -04:00
Theodore Ts'o	6265e169cd	random: push extra entropy to the output pools As the input pool gets filled, start transfering entropy to the output pools until they get filled. This allows us to use the output pools to store more system entropy. Waste not, want not.... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:22 -04:00
Theodore Ts'o	95b709b6be	random: drop trickle mode The add_timer_randomness() used to drop into trickle mode when entropy pool was estimated to be 87.5% full. This was important when add_timer_randomness() was used to sample interrupts. It's not used for this any more --- add_interrupt_randomness() now uses fast_mix() instead. By elimitating trickle mode, it allows us to fully utilize entropy provided by add_input_randomness() and add_disk_randomness() even when the input pool is above the old trickle threshold of 87.5%. This helps to answer the criticism in [1] in their hypothetical scenario where our entropy estimator was inaccurate, even though the measurements in [2] seem to indicate that our entropy estimator given real-life entropy collection is actually pretty good, albeit on the conservative side (which was as it was designed). [1] http://eprint.iacr.org/2013/338.pdf [2] http://eprint.iacr.org/2012/251.pdf Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:21 -04:00
Theodore Ts'o	6e9fa2c8a6	random: adjust the generator polynomials in the mixing function slightly Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and Videau in their paper, "The Linux Pseudorandom Number Generator Revisited" (see: http://eprint.iacr.org/2012/251.pdf). They suggested a slight change to improve our mixing functions slightly. I also adjusted the comments to better explain what is going on, and to document why the polynomials were changed. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:21 -04:00
Theodore Ts'o	655b226470	random: speed up the fast_mix function by a factor of four By mixing the entropy in chunks of 32-bit words instead of byte by byte, we can speed up the fast_mix function significantly. Since it is called on every single interrupt, on systems with a very heavy interrupt load, this can make a noticeable difference. Also fix a compilation warning in add_interrupt_randomness() and avoid xor'ing cycles and jiffies together just in case we have an architecture which tries to define random_get_entropy() by returning jiffies. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Jörn Engel <joern@logfs.org>	2013-10-10 14:32:20 -04:00
Theodore Ts'o	f5c2742c23	random: cap the rate which the /dev/urandom pool gets reseeded In order to avoid draining the input pool of its entropy at too high of a rate, enforce a minimum time interval between reseedings of the urandom pool. This is set to 60 seconds by default. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:19 -04:00
Theodore Ts'o	c59974aea4	random: optimize the entropy_store structure Use smaller types to slightly shrink the size of the entropy store structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:18 -04:00
Theodore Ts'o	3ef4cb2d65	random: optimize spinlock use in add_device_randomness() The add_device_randomness() function calls mix_pool_bytes() twice for the input pool and the non-blocking pool, for a total of four times. By using _mix_pool_byte() and taking the spinlock in add_device_randomness(), we can halve the number of times we need take each pool's spinlock. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:17 -04:00
Theodore Ts'o	5910895f0e	random: fix the tracepoint for get_random_bytes(_arch) Fix a problem where get_random_bytes_arch() was calling the tracepoint get_random_bytes(). So add a new tracepoint for get_random_bytes_arch(), and make get_random_bytes() and get_random_bytes_arch() call their correct tracepoint. Also, add a new tracepoint for add_device_randomness() Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:16 -04:00
H. Peter Anvin	30e37ec516	random: account for entropy loss due to overwrites When we write entropy into a non-empty pool, we currently don't account at all for the fact that we will probabilistically overwrite some of the entropy in that pool. This means that unless the pool is fully empty, we are currently guaranteed to overestimate the amount of entropy in the pool! Assuming Shannon entropy with zero correlations we end up with an exponentally decaying value of new entropy added: entropy <- entropy + (pool_size - entropy) * (1 - exp(-add_entropy/pool_size)) However, calculations involving fractional exponentials are not practical in the kernel, so apply a piecewise linearization: For add_entropy <= pool_size/2 then (1 - exp(-add_entropy/pool_size)) >= (add_entropy/pool_size)0.7869... ... so we can approximate the exponential with 3/4add_entropy/pool_size and still be on the safe side by adding at most pool_size/2 at a time. In order for the loop not to take arbitrary amounts of time if a bad ioctl is received, terminate if we are within one bit of full. This way the loop is guaranteed to terminate after no more than log2(poolsize) iterations, no matter what the input value is. The vast majority of the time the loop will be executed exactly once. The piecewise linearization is very conservative, approaching 3/4 of the usable input value for small inputs, however, our entropy estimation is pretty weak at best, especially for small values; we have no handle on correlation; and the Shannon entropy measure (Rényi entropy of order 1) is not the correct one to use in the first place, but rather the correct entropy measure is the min-entropy, the Rényi entropy of infinite order. As such, this conservatism seems more than justified. This does introduce fractional bit values. I have left it to have 3 bits of fraction, so that with a pool of 2^12 bits the multiply in credit_entropy_bits() can still fit into an int, as 2(3+12) < 31. It is definitely possible to allow for more fractional accounting, but that multiply then would have to be turned into a 3232 -> 64 multiply. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: DJ Johnston <dj.johnston@intel.com>	2013-10-10 14:32:15 -04:00
H. Peter Anvin	a283b5c459	random: allow fractional bits to be tracked Allow fractional bits of entropy to be tracked by scaling the entropy counter (fixed point). This will be used in a subsequent patch that accounts for entropy lost due to overwrites. [ Modified by tytso to fix up a few missing places where the entropy_count wasn't properly converted from fractional bits to bits. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2013-10-10 14:32:14 -04:00
H. Peter Anvin	9ed17b70b4	random: statically compute poolbitshift, poolbytes, poolbits Use a macro to statically compute poolbitshift (will be used in a subsequent patch), poolbytes, and poolbits. On virtually all architectures the cost of a memory load with an offset is the same as the one of a memory load. It is still possible for this to generate worse code since the C compiler doesn't know the fixed relationship between these fields, but that is somewhat unlikely. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2013-10-10 14:32:13 -04:00
Theodore Ts'o	85a1f77716	random: mix in architectural randomness earlier in extract_buf() Previously if CPU chip had a built-in random number generator (i.e., RDRAND on newer x86 chips), we mixed it in at the very end of extract_buf() using an XOR operation. We now mix it in right after the calculate a hash across the entire pool. This has the advantage that any contribution of entropy from the CPU's HWRNG will get mixed back into the pool. In addition, it means that if the HWRNG has any defects (either accidentally or maliciously introduced), this will be mitigated via the non-linear transform of the SHA-1 hash function before we hand out generated output. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-10-10 14:32:13 -04:00
Theodore Ts'o	61875f30da	random: allow architectures to optionally define random_get_entropy() Allow architectures which have a disabled get_cycles() function to provide a random_get_entropy() function which provides a fine-grained, rapidly changing counter that can be used by the /dev/random driver. For example, an architecture might have a rapidly changing register used to control random TLB cache eviction, or DRAM refresh that doesn't meet the requirements of get_cycles(), but which is good enough for the needs of the random driver. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2013-10-10 14:30:53 -04:00
Theodore Ts'o	47d06e532e	random: run random_int_secret_init() run after all late_initcalls The some platforms (e.g., ARM) initializes their clocks as late_initcalls for some unknown reason. So make sure random_int_secret_init() is run after all of the late_initcalls are run. Cc: stable@vger.kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2013-09-23 06:35:06 -04:00
Martin Schwidefsky	0244ad004a	Remove GENERIC_HARDIRQ config option After the last architecture switched to generic hard irqs the config options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code for !CONFIG_GENERIC_HARDIRQS can be removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2013-09-13 15:09:52 +02:00
Joe Perches	a151427ed0	char: Convert use of typedef ctl_table to struct ctl_table This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-17 16:43:08 -07:00
Jiri Kosina	10b3a32d29	random: fix accounting race condition with lockless irq entropy_count update Commit `902c098a36` ("random: use lockless techniques in the interrupt path") turned IRQ path from being spinlock protected into lockless cmpxchg-retry update. That commit removed r->lock serialization between crediting entropy bits from IRQ context and accounting when extracting entropy on userspace read path, but didn't turn the r->entropy_count reads/updates in account() to use cmpxchg as well. It has been observed, that under certain circumstances this leads to read() on /dev/urandom to return 0 (EOF), as r->entropy_count gets corrupted and becomes negative, which in turn results in propagating 0 all the way from account() to the actual read() call. Convert the accounting code to be the proper lockless counterpart of what has been partially done by `902c098a36`. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Greg KH <greg@kroah.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:52 -07:00
Jarod Wilson	1e7e2e05c1	drivers/char/random.c: fix priming of last_data Commit `ec8f02da9e` ("random: prime last_data value per fips requirements") added priming of last_data per fips requirements. Unfortuantely, it did so in a way that can lead to multiple threads all incrementing nbytes, but only one actually doing anything with the extra data, which leads to some fun random corruption and panics. The fix is to simply do everything needed to prime last_data in a single shot, so there's no window for multiple cpus to increment nbytes -- in fact, we won't even increment or decrement nbytes anymore, we'll just extract the needed EXTRACT_SIZE one time per pool and then carry on with the normal routine. All these changes have been tested across multiple hosts and architectures where panics were previously encoutered. The code changes are are strictly limited to areas only touched when when booted in fips mode. This change should also go into 3.8-stable, to make the myriads of fips users on 3.8.x happy. Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stodola <jstodola@redhat.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Matt Mackall <mpm@selenic.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-24 16:22:52 -07:00
Andy Shevchenko	16c7fa0582	lib/string_helpers: introduce generic string_unescape There are several places in kernel where modules unescapes input to convert C-Style Escape Sequences into byte codes. The patch provides generic implementation of such approach. Test cases are also included into the patch. [akpm@linux-foundation.org: clarify comment] [akpm@linux-foundation.org: export get_random_int() to modules] Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: William Hubbs <w.d.hubbs@gmail.com> Cc: Chris Brannon <chris@the-brannons.com> Cc: Kirk Reiser <kirk@braille.uwo.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:03 -07:00
Linus Torvalds	c77f8bf918	Fix a circular locking dependency in random's collection of cputime used by a thread when it exits. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJROTMZAAoJENNvdpvBGATw3/YQAIbsAxj8i16k/fCtKzXNjzL8 /CWdJR7P+hzIpCnbNekpXaTFstvzi4aud5DMV00B17cLk87AGcjm/XhUfHRoWDjf Q15/3Zm+xHDGN/A3vLjQEb15yjlc/Z0X83JR6+StcNYh5tDujwz/QYAAUStH10yV xY8DlErDKANeeoAaPtmbqB+4+mllXzCjp8nqtSMl6aR29YRBi50fOF9Hli9Mrm7+ hqZz61xWBGZpRuWvXEWFkRhRxhwJ03UMOPxzTeGvh4+f8/JexF0U9/a3qMWbJK6P jcuBh6J4MVKN9y77C2Py4VCiDEVQCQHWFfA9+tIG6SxTnkteKi1s7Z5oNDUcobkQ 2cmPcoM7ChDseojxcPJX3rzA0popFk6IzeRYyeKzenKqSsabcFB/BnGR2u0N5hqd 8JIRNu+Wo08OjgP9PFge3quymNOrJThQWlMNMq4rNuk6mKKxAXkLyt87dfYmIzt1 nIVZXjjqaziTR0mIe5FskeAPIUgGsxaN5hAqEfReE2pmykcJSJza1I/9g9FtGXGa bI9UUZsHWZ0lVQMz2axrGJsBmkoJS5E7ZHWjJW0fW0gO6ufLX/kd7eW4PYRvLwMm VTwh1aalcEz0LvPV01Ayc1fq1FEVm7i2OsZ5VI20TR4sbgX5MNjSeDMf2v87/wVd B3NpSTt6FN02VnbaY+Tj =09UJ -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random fixes from Ted Ts'o: "Fix a circular locking dependency in random's collection of cputime used by a thread when it exits." * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: fix locking dependency with the tasklist_lock	2013-03-08 14:42:16 -08:00
Theodore Ts'o	b980955236	random: fix locking dependency with the tasklist_lock Commit `6133705494` introduced a circular lock dependency because posix_cpu_timers_exit() is called by release_task(), which is holding a writer lock on tasklist_lock, and this can cause a deadlock since kill_fasync() gets called with nonblocking_pool.lock taken. There's no reason why kill_fasync() needs to be taken while the random pool is locked, so move it out to fix this locking dependency. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Russ Dill <Russ.Dill@gmail.com> Cc: stable@kernel.org	2013-03-04 12:05:15 -05:00
Thomas Gleixner	eece09ec21	locking: Various static lock initializer fixes The static lock initializers want to be fed the proper name of the lock and not some random string. In mainline random strings are obfuscating the readability of debug output, but for RT they prevent the spinlock substitution. Fix it up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2013-02-19 08:42:45 +01:00
Jarod Wilson	ec8f02da9e	random: prime last_data value per fips requirements The value stored in last_data must be primed for FIPS 140-2 purposes. Upon first use, either on system startup or after an RNDCLEARPOOL ioctl, we need to take an initial random sample, store it internally in last_data, then pass along the value after that to the requester, so that consistency checks aren't being run against stale and possibly known data. CC: Herbert Xu <herbert@gondor.apana.org.au> CC: "David S. Miller" <davem@davemloft.net> CC: Matt Mackall <mpm@selenic.com> CC: linux-crypto@vger.kernel.org Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-11-08 07:19:18 -05:00
Jiri Kosina	8eb2ffbf7b	random: fix debug format strings Fix the following warnings in formatting debug output: drivers/char/random.c: In function ‘xfer_secondary_pool’: drivers/char/random.c:827: warning: format ‘%d’ expects type ‘int’, but argument 7 has type ‘size_t’ drivers/char/random.c: In function ‘account’: drivers/char/random.c:859: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’ drivers/char/random.c:881: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’ drivers/char/random.c: In function ‘random_read’: drivers/char/random.c:1141: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘ssize_t’ drivers/char/random.c:1145: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘ssize_t’ drivers/char/random.c:1145: warning: format ‘%d’ expects type ‘int’, but argument 6 has type ‘long unsigned int’ by using '%zd' instead of '%d' to properly denote ssize_t/size_t conversion. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-11-08 07:12:20 -05:00
Jiri Kosina	be5b779ae9	random: make it possible to enable debugging without rebuild The module parameter that turns debugging mode (which basically means printing a few extra lines during runtime) is in '#if 0' block. Forcing everyone who would like to see how entropy is behaving on his system to rebuild seems to be a little bit too harsh. If we were concerned about speed, we could potentially turn 'debug' into a static key, but I don't think it's necessary. Drop the '#if 0' block to allow using the 'debug' parameter without rebuilding. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-10-15 23:24:39 -04:00
H. Peter Anvin	d2e7c96af1	random: mix in architectural randomness in extract_buf() Mix in any architectural randomness in extract_buf() instead of xfer_secondary_buf(). This allows us to mix in more architectural randomness, and it also makes xfer_secondary_buf() faster, moving a tiny bit of additional CPU overhead to process which is extracting the randomness. [ Commit description modified by tytso to remove an extended advertisement for the RDRAND instruction. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: DJ Johnston <dj.johnston@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-27 22:37:20 -04:00
Tony Luck	cbc96b7594	random: Add comment to random_initialize() Many platforms have per-machine instance data (serial numbers, asset tags, etc.) squirreled away in areas that are accessed during early system bringup. Mixing this data into the random pools has a very high value in providing better random data, so we should allow (and even encourage) architecture code to call add_device_randomness() from the setup_arch() paths. However, this limits our options for internal structure of the random driver since random_initialize() is not called until long after setup_arch(). Add a big fat comment to rand_initialize() spelling out this requirement. Suggested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2012-07-24 13:16:41 -04:00
Theodore Ts'o	c5857ccf29	random: remove rand_initialize_irq() With the new interrupt sampling system, we are no longer using the timer_rand_state structure in the irq descriptor, so we can stop initializing it now. [ Merged in fixes from Sedat to find some last missing references to rand_initialize_irq() ] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>	2012-07-19 10:38:32 -04:00
Theodore Ts'o	00ce1db1a6	random: add tracepoints for easier debugging and verification Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-07-14 20:17:48 -04:00
Theodore Ts'o	c2557a303a	random: add new get_random_bytes_arch() function Create a new function, get_random_bytes_arch() which will use the architecture-specific hardware random number generator if it is present. Change get_random_bytes() to not use the HW RNG, even if it is avaiable. The reason for this is that the hw random number generator is fast (if it is present), but it requires that we trust the hardware manufacturer to have not put in a back door. (For example, an increasing counter encrypted by an AES key known to the NSA.) It's unlikely that Intel (for example) was paid off by the US Government to do this, but it's impossible for them to prove otherwise --- especially since Bull Mountain is documented to use AES as a whitener. Hence, the output of an evil, trojan-horse version of RDRAND is statistically indistinguishable from an RDRAND implemented to the specifications claimed by Intel. Short of using a tunnelling electronic microscope to reverse engineer an Ivy Bridge chip and disassembling and analyzing the CPU microcode, there's no way for us to tell for sure. Since users of get_random_bytes() in the Linux kernel need to be able to support hardware systems where the HW RNG is not present, most time-sensitive users of this interface have already created their own cryptographic RNG interface which uses get_random_bytes() as a seed. So it's much better to use the HW RNG to improve the existing random number generator, by mixing in any entropy returned by the HW RNG into /dev/random's entropy pool, but to always _use_ /dev/random's entropy pool. This way we get almost of the benefits of the HW RNG without any potential liabilities. The only benefits we forgo is the speed/performance enhancements --- and generic kernel code can't depend on depend on get_random_bytes() having the speed of a HW RNG anyway. For those places that really want access to the arch-specific HW RNG, if it is available, we provide get_random_bytes_arch(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-14 20:17:47 -04:00
Theodore Ts'o	e6d4947b12	random: use the arch-specific rng in xfer_secondary_pool If the CPU supports a hardware random number generator, use it in xfer_secondary_pool(), where it will significantly improve things and where we can afford it. Also, remove the use of the arch-specific rng in add_timer_randomness(), since the call is significantly slower than get_cycles(), and we're much better off using it in xfer_secondary_pool() anyway. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-14 20:17:46 -04:00
Linus Torvalds	a2080a67ab	random: create add_device_randomness() interface Add a new interface, add_device_randomness() for adding data to the random pool that is likely to differ between two devices (or possibly even per boot). This would be things like MAC addresses or serial numbers, or the read-out of the RTC. This does not add any actual entropy to the pool, but it initializes the pool to different values for devices that might otherwise be identical and have very little entropy available to them (particularly common in the embedded world). [ Modified by tytso to mix in a timestamp, since there may be some variability caused by the time needed to detect/configure the hardware in question. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-14 20:17:44 -04:00
Theodore Ts'o	902c098a36	random: use lockless techniques in the interrupt path The real-time Linux folks don't like add_interrupt_randomness() taking a spinlock since it is called in the low-level interrupt routine. This also allows us to reduce the overhead in the fast path, for the random driver, which is the interrupt collection path. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-14 20:17:43 -04:00
Theodore Ts'o	775f4b297b	random: make 'add_interrupt_randomness()' do something sane We've been moving away from add_interrupt_randomness() for various reasons: it's too expensive to do on every interrupt, and flooding the CPU with interrupts could theoretically cause bogus floods of entropy from a somewhat externally controllable source. This solves both problems by limiting the actual randomness addition to just once a second or after 64 interrupts, whicever comes first. During that time, the interrupt cycle data is buffered up in a per-cpu pool. Also, we make sure the the nonblocking pool used by urandom is initialized before we start feeding the normal input pool. This assures that /dev/urandom is returning unpredictable data as soon as possible. (Based on an original patch by Linus, but significantly modified by tytso.) Tested-by: Eric Wustrow <ewust@umich.edu> Reported-by: Eric Wustrow <ewust@umich.edu> Reported-by: Nadia Heninger <nadiah@cs.ucsd.edu> Reported-by: Zakir Durumeric <zakir@umich.edu> Reported-by: J. Alex Halderman <jhalderm@umich.edu>. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org	2012-07-14 20:17:28 -04:00
Theodore Ts'o	74feec5dd8	random: fix up sparse warnings Add extern and static declarations to suppress sparse warnings Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-07-06 14:13:25 -04:00
Mathieu Desnoyers	44e4360fa3	drivers/char/random.c: fix boot id uniqueness race /proc/sys/kernel/random/boot_id can be read concurrently by userspace processes. If two (or more) user-space processes concurrently read boot_id when sysctl_bootid is not yet assigned, a race can occur making boot_id differ between the reads. Because the whole point of the boot id is to be unique across a kernel execution, fix this by protecting this operation with a spinlock. Given that this operation is not frequently used, hitting the spinlock on each call should not be an issue. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-12 13:12:12 -07:00
Linus Torvalds	c2bc3a316a	Merge branch 'x86/rdrand' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86/rdrand' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: random: Adjust the number of loops when initializing random: Use arch-specific RNG to initialize the entropy store	2012-01-16 18:23:09 -08:00
H. Peter Anvin	2dac8e54f9	random: Adjust the number of loops when initializing When we are initializing using arch_get_random_long() we only need to loop enough times to touch all the bytes in the buffer; using poolwords for that does twice the number of operations necessary on a 64-bit machine, since in the random number generator code "word" means 32 bits. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Link: http://lkml.kernel.org/r/1324589281-31931-1-git-send-email-tytso@mit.edu	2012-01-16 11:33:49 -08:00
Theodore Ts'o	3e88bdff1c	random: Use arch-specific RNG to initialize the entropy store If there is an architecture-specific random number generator (such as RDRAND for Intel architectures), use it to initialize /dev/random's entropy stores. Even in the worst case, if RDRAND is something like AES(NSA_KEY, counter++), it won't hurt, and it will definitely help against any other adversaries. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Link: http://lkml.kernel.org/r/1324589281-31931-1-git-send-email-tytso@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-01-16 11:18:21 -08:00
Rusty Russell	90ab5ee941	module_param: make bool parameters really bool (drivers & misc) module_param(bool) used to counter-intuitively take an int. In `fddd5201` (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-13 09:32:20 +10:30
Linus Torvalds	cf833d0b99	random: Use arch_get_random_int instead of cycle counter if avail We still don't use rdrand in /dev/random, which just seems stupid. We accept the cyclecounter* as a random input, but we don't accept rdrand? That's just broken. Sure, people can do things in user space (write to /dev/random, use rdrand in addition to /dev/random themselves etc etc), but that still seems to be a particularly stupid reason for saying "we shouldn't bother to try to do better in /dev/random". And even if somebody really doesn't trust rdrand as a source of random bytes, it seems singularly stupid to trust the cycle counter more. So I'd suggest the attached patch. I'm not going to even bother arguing that we should add more bits to the entropy estimate, because that's not the point - I don't care if /dev/random fills up slowly or not, I think it's just stupid to not use the bits we can get from rdrand and mix them into the strong randomness pool. Link: http://lkml.kernel.org/r/CA%2B55aFwn59N1=m651QAyTy-1gO1noGbK18zwKDwvwqnravA84A@mail.gmail.com Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Matt Mackall <mpm@selenic.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-12-29 16:49:45 -08:00
Luck, Tony	bd29e568a4	fix typo/thinko in get_random_bytes() If there is an architecture-specific random number generator we use it to acquire randomness one "long" at a time. We should put these random words into consecutive words in the result buffer - not just overwrite the first word again and again. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-17 11:42:54 -02:00
Linus Torvalds	8e6d539e0f	Merge branch 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, random: Verify RDRAND functionality and allow it to be disabled x86, random: Architectural inlines to get random integers with RDRAND random: Add support for architectural random hooks Fix up trivial conflicts in drivers/char/random.c: the architectural random hooks touched "get_random_int()" that was simplified to use MD5 and not do the keyptr thing any more (see commit 6e5714eaf77d: "net: Compute protocol sequence numbers and fragment IDs using MD5").	2011-10-28 05:29:07 -07:00
David S. Miller	6e5714eaf7	net: Compute protocol sequence numbers and fragment IDs using MD5. Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-06 18:33:19 -07:00
H. Peter Anvin	63d7717326	random: Add support for architectural random hooks Add support for architecture-specific hooks into the kernel-directed random number generator interfaces. This patchset does not use the architecture random number generator interfaces for the userspace-directed interfaces (/dev/random and /dev/urandom), thus eliminating the need to distinguish between them based on a pool pointer. Changes in version 3: - Moved the hooks from extract_entropy() to get_random_bytes(). - Changes the hooks to inlines. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Theodore Ts'o" <tytso@mit.edu>	2011-07-31 13:54:50 -07:00
Eric Dumazet	87c48fa3b4	ipv6: make fragment identifications less predictable IPv6 fragment identification generation is way beyond what we use for IPv4 : It uses a single generator. Its not scalable and allows DOS attacks. Now inetpeer is IPv6 aware, we can use it to provide a more secure and scalable frag ident generator (per destination, instead of system wide) This patch : 1) defines a new secure_ipv6_id() helper 2) extends inet_getid() to provide 32bit results 3) extends ipv6_select_ident() with a new dest parameter Reported-by: Fernando Gont <fernando@gont.com.ar> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-21 21:25:58 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Jarod Wilson	442a4fffff	random: update interface comments to reflect reality At present, the comment header in random.c makes no mention of add_disk_randomness, and instead, suggests that disk activity adds to the random pool by way of add_interrupt_randomness, which appears to not have been the case since sometime prior to the existence of git, and even prior to bitkeeper. Didn't look any further back. At least, as far as I can tell, there are no storage drivers setting IRQF_SAMPLE_RANDOM, which is a requirement for add_interrupt_randomness to trigger, so the only way for a disk to contribute entropy is by way of add_disk_randomness. Update comments accordingly, complete with special mention about solid state drives being a crappy source of entropy (see `e2e1a148bc` for reference). Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-02-21 22:42:42 +11:00
Christoph Lameter	b29c617af3	random: Use this_cpu_inc_return __this_cpu_inc can create a single instruction to do the same as __get_cpu_var()++. Cc: Richard Kennedy <richard@rsk.demon.co.uk> Cc: Matt Mackall <mpm@selenic.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-12-17 15:18:05 +01:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Richard Kennedy	4015d9a865	random: Reorder struct entropy_store to remove padding on 64bits Re-order structure entropy_store to remove 8 bytes of padding on 64 bit builds, so shrinking this structure from 72 to 64 bytes and allowing it to fit into one cache line. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-07-31 19:58:00 +08:00
Matt Mackall	e954bc91bd	random: simplify fips mode Rather than dynamically allocate 10 bytes, move it to static allocation. This saves space and avoids the need for error checking. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-05-20 19:55:01 +10:00
Adam Buchbinder	c41b20e721	Fix misspellings of "truly" in comments. Some comments misspell "truly"; this fixes them. No code changes. Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-04 11:55:45 +01:00
Herbert Xu	cd1510cb5f	random: Remove unused inode variable The previous changeset left behind an unused inode variable. This patch removes it. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-02-02 06:50:27 +11:00
Matt Mackall	a996996dd7	random: drop weird m_time/a_time manipulation No other driver does anything remotely like this that I know of except for the tty drivers, and I can't see any reason for random/urandom to do it. In fact, it's a (trivial, harmless) timing information leak. And obviously, it generates power- and flash-cycle wasting I/O, especially if combined with something like hwrngd. Also, it breaks ubifs's expectations. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-02-02 06:50:23 +11:00
Joe Perches	35900771c0	random.c: use %pU to print UUIDs Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-15 08:53:33 -08:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Eric W. Biederman	894d249115	sysctl drivers: Remove dead binary sysctl support Now that sys_sysctl is a wrapper around /proc/sys all of the binary sysctl support elsewhere in the tree is dead code. Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Neil Brown <neilb@suse.de> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Acked-by: Clemens Ladisch <clemens@ladisch.de> for drivers/char/hpet.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:04:58 -08:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Neil Horman	5b739ef8a4	random: Add optional continuous repetition test to entropy store based rngs FIPS-140 requires that all random number generators implement continuous self tests in which each extracted block of data is compared against the last block for repetition. The ansi_cprng implements such a test, but it would be nice if the hw rng's did the same thing. Obviously its not something thats always needed, but it seems like it would be a nice feature to have on occasion. I've written the below patch which allows individual entropy stores to be flagged as desiring a continuous test to be run on them as is extracted. By default this option is off, but is enabled in the event that fips mode is selected during bootup. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-18 19:50:21 +08:00
Linus Torvalds	26a9a41823	Avoid ICE in get_random_int() with gcc-3.4.5 Martin Knoblauch reports that trying to build 2.6.30-rc6-git3 with RHEL4.3 userspace (gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)) causes an internal compiler error (ICE): drivers/char/random.c: In function `get_random_int': drivers/char/random.c:1672: error: unrecognizable insn: (insn 202 148 150 0 /scratch/build/linux-2.6.30-rc6-git3/arch/x86/include/asm/tsc.h:23 (set (reg:SI 0 ax [91]) (subreg:SI (plus:DI (plus:DI (reg:DI 0 ax [88]) (subreg:DI (reg:SI 6 bp) 0)) (const_int -4 [0xfffffffffffffffc])) 0)) -1 (nil) (nil)) drivers/char/random.c:1672: internal compiler error: in extract_insn, at recog.c:2083 and after some debugging it turns out that it's due to the code trying to figure out the rough value of the current stack pointer by taking an address of an uninitialized variable and casting that to an integer. This is clearly a compiler bug, but it's not worth fighting - while the current stack kernel pointer might be somewhat hard to predict in user space, it's also not generally going to change for a lot of the call chains for a particular process. So just drop it, and mumble some incoherent curses at the compiler. Tested-by: Martin Knoblauch <spamtrap@knobisoft.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-19 11:25:35 -07:00
Linus Torvalds	8a0a9bd4db	random: make get_random_int() more random It's a really simple patch that basically just open-codes the current "secure_ip_id()" call, but when open-coding it we now use a _static_ hashing area, so that it gets updated every time. And to make sure somebody can't just start from the same original seed of all-zeroes, and then do the "half_md4_transform()" over and over until they get the same sequence as the kernel has, each iteration also mixes in the same old "current->pid + jiffies" we used - so we should now have a regular strong pseudo-number generator, but we also have one that doesn't have a single seed. Note: the "pid + jiffies" is just meant to be a tiny tiny bit of noise. It has no real meaning. It could be anything. I just picked the previous seed, it's just that now we keep the state in between calls and that will feed into the next result, and that should make all the difference. I made that hash be a per-cpu data just to avoid cache-line ping-pong: having multiple CPU's write to the same data would be fine for randomness, and add yet another layer of chaos to it, but since get_random_int() is supposed to be a fast interface I did it that way instead. I considered using "__raw_get_cpu_var()" to avoid any preemption overhead while still getting the hash be _mostly_ ping-pong free, but in the end good taste won out. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-07 11:59:06 -07:00
Anton Blanchard	417b43d4b7	random: align rekey_work's timer Align rekey_work. Even though it's infrequent, we may as well line it up. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:04:49 -07:00
Yinghai Lu	d178a1eb5c	sparseirq: fix build with unknown irq_desc struct Ingo Molnar wrote: > > tip/kernel/fork.c: In function 'copy_signal': > tip/kernel/fork.c:825: warning: unused variable 'ret' > tip/drivers/char/random.c: In function 'get_timer_rand_state': > tip/drivers/char/random.c:584: error: dereferencing pointer to incomplete type > tip/drivers/char/random.c: In function 'set_timer_rand_state': > tip/drivers/char/random.c:594: error: dereferencing pointer to incomplete type > make[3]: *** [drivers/char/random.o] Error 1 irq_desc is defined in linux/irq.h, so include it in the genirq case. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 16:06:03 +01:00
Yinghai Lu	d7e51e6689	sparseirq: make some func to be used with genirq Impact: clean up sparseirq fallout on random.c Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS so we could some #ifdef later if all arch support genirq Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 04:46:26 +01:00
Matt Mackall	cda796a3d5	random: don't try to look at entropy_count outside the lock As a non-atomic value, it's only safe to look at entropy_count when the pool lock is held, so we move the BUG_ON inside the lock for correctness. Also remove the spurious comment. It's ok for entropy_count to temporarily exceed POOLBITS so long as it's left in a consistent state when the lock is released. This is a more correct, simple, and idiomatic fix for the bug in `8b76f46a2d`. I've left the reorderings introduced by that patch in place as they're harmless, even though they don't properly deal with potential atomicity issues. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Yinghai Lu	2f98357001	sparseirq: move set/get_timer_rand_state back to .c those two functions only used in that C file Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-03 12:01:23 -08:00
Yinghai Lu	0b8f1efad3	sparse irq_desc[] array: core kernel and x86 changes Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:51 +01:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Linus Torvalds	9301975ec2	Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu and x86/uv. The sparseirq branch is just preliminary groundwork: no sparse IRQs are actually implemented by this tree anymore - just the new APIs are added while keeping the old way intact as well (the new APIs map 1:1 to irq_desc[]). The 'real' sparse IRQ support will then be a relatively small patch ontop of this - with a v2.6.29 merge target. * 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits) genirq: improve include files intr_remapping: fix typo io_apic: make irq_mis_count available on 64-bit too genirq: fix name space collisions of nr_irqs in arch/* genirq: fix name space collision of nr_irqs in autoprobe.c genirq: use iterators for irq_desc loops proc: fixup irq iterator genirq: add reverse iterator for irq_desc x86: move ack_bad_irq() to irq.c x86: unify show_interrupts() and proc helpers x86: cleanup show_interrupts genirq: cleanup the sparseirq modifications genirq: remove artifacts from sparseirq removal genirq: revert dynarray genirq: remove irq_to_desc_alloc genirq: remove sparse irq code genirq: use inline function for irq_to_desc genirq: consolidate nr_irqs and for_each_irq_desc() x86: remove sparse irq from Kconfig genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n ...	2008-10-20 13:23:01 -07:00
Alexey Dobriyan	f221e726bf	sysctl: simplify ->strategy name and nlen parameters passed to ->strategy hook are unused, remove them. In general ->strategy hook should know what it's doing, and don't do something tricky for which, say, pointer to original userspace array may be needed (name). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ] Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Matt Mackall <mpm@selenic.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:47 -07:00
Thomas Gleixner	d6c88a507e	genirq: revert dynarray Revert the dynarray changes. They need more thought and polishing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-16 16:53:15 +02:00
Thomas Gleixner	2cc21ef843	genirq: remove sparse irq code This code is not ready, but we need to rip it out instead of rebasing as we would lose the APIC/IO_APIC unification otherwise. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-16 16:53:15 +02:00
Yinghai Lu	3060d6fe28	x86: put timer_rand_state pointer into irq_desc irq_timer_state[] is a NR_IRQS sized array that is a side-by array to the real irq_desc[] array. Integrate that field into the (now dynamic) irq_desc dynamic array and save some RAM. v2: keep the old way to support arch not support irq_desc Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:31 +02:00
Yinghai Lu	eef1de76da	irqs: make irq_timer_state to use dyn_array Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:07 +02:00
Yinghai Lu	1f45f5621d	drivers/char: use nr_irqs convert them to nr_irqs. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:05 +02:00
Tejun Heo	f331c0296f	block: don't depend on consecutive minor space * Implement disk_devt() and part_devt() and use them to directly access devt instead of computing it from ->major and ->first_minor. Note that all references to ->major and ->first_minor outside of block layer is used to determine devt of the disk (the part0) and as ->major and ->first_minor will continue to represent devt for the disk, converting these users aren't strictly necessary. However, convert them for consistency. * Implement disk_max_parts() to avoid directly deferencing genhd->minors. * Update bdget_disk() such that it doesn't assume consecutive minor space. * Move devt computation from register_disk() to add_disk() and make it the only one (all other usages use the initially determined value). These changes clean up the code and will help disk->part dereference fix and extended block device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:05 +02:00
Andrew Morton	8b76f46a2d	drivers/char/random.c: fix a race which can lead to a bogus BUG() Fix a bug reported by and diagnosed by Aaron Straus. This is a regression intruduced into 2.6.26 by commit `adc782dae6` Author: Matt Mackall <mpm@selenic.com> Date: Tue Apr 29 01:03:07 2008 -0700 random: simplify and rename credit_entropy_store credit_entropy_bits() does: spin_lock_irqsave(&r->lock, flags); ... if (r->entropy_count > r->poolinfo->POOLBITS) r->entropy_count = r->poolinfo->POOLBITS; so there is a time window in which this BUG_ON(): static size_t account(struct entropy_store r, size_t nbytes, int min, int reserved) { unsigned long flags; BUG_ON(r->entropy_count > r->poolinfo->POOLBITS); / Hold lock while accounting */ spin_lock_irqsave(&r->lock, flags); can trigger. We could fix this by moving the assertion inside the lock, but it seems safer and saner to revert to the old behaviour wherein entropy_store.entropy_count at no time exceeds entropy_store.poolinfo->POOLBITS. Reported-by: Aaron Straus <aaron@merfinllc.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <stable@kernel.org> [2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-02 19:21:40 -07:00
Stephen Hemminger	9f59365374	nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization Use incoming network tuple as seed for NAT port randomization. This avoids concerns of leaking net_random() bits, and also gives better port distribution. Don't have NAT server, compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> [ added missing EXPORT_SYMBOL_GPL ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-18 21:32:32 -07:00
Andrea Righi	27ac792ca0	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Jeff Dike	9a6f70bbed	random: add async notification support to /dev/random Add async notification support to /dev/random. A little test case is below. Without this patch, you get: $ ./async-random Drained the pool Found more randomness With it, you get: $ ./async-random Drained the pool SIGIO Found more randomness #include <stdio.h> #include <stdlib.h> #include <signal.h> #include <errno.h> #include <fcntl.h> static void handler(int sig) { printf("SIGIO\n"); } int main(int argc, char **argv) { int fd, n, err, flags; if(signal(SIGIO, handler) < 0){ perror("setting SIGIO handler"); exit(1); } fd = open("/dev/random", O_RDONLY); if(fd < 0){ perror("open"); exit(1); } flags = fcntl(fd, F_GETFL); if (flags < 0){ perror("getting flags"); exit(1); } flags \|= O_NONBLOCK; if (fcntl(fd, F_SETFL, flags) < 0){ perror("setting flags"); exit(1); } while((err = read(fd, &n, sizeof(n))) > 0) ; if(err == 0){ printf("random returned 0\n"); exit(1); } else if(errno != EAGAIN){ perror("read"); exit(1); } flags \|= O_ASYNC; if (fcntl(fd, F_SETFL, flags) < 0){ perror("setting flags"); exit(1); } if (fcntl(fd, F_SETOWN, getpid()) < 0) { perror("Setting SIGIO"); exit(1); } printf("Drained the pool\n"); read(fd, &n, sizeof(n)); printf("Found more randomness\n"); return(0); } Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	adc782dae6	random: simplify and rename credit_entropy_store - emphasize bits in the name - make zero bits lock-free - simplify logic Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	e68e5b664e	random: make mixing interface byte-oriented Switch add_entropy_words to a byte-oriented interface, eliminating numerous casts and byte/word size rounding issues. This also reduces the overall bit/byte/word confusion in this code. We now mix a byte at a time into the word-based pool. This takes four times as many iterations, but should be negligible compared to hashing overhead. This also increases our pool churn, which adds some depth against some theoretical failure modes. The function name is changed to emphasize pool mixing and deemphasize entropy (the samples mixed in may not contain any). extract is added to the core function to make it clear that it extracts from the pool. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	993ba2114c	random: simplify add_ptr logic The add_ptr variable wasn't used in a sensible way, use only i instead. i got reused later for a different purpose, use j instead. While we're here, put tap0 first in the tap list and add a comment. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	6d38b82740	random: remove some prefetch logic The urandom output pool (ie the fast path) fits in one cacheline, so this is pretty unnecessary. Further, the output path has already fetched the entire pool to hash it before calling in here. (This was the only user of prefetch_range in the kernel, and it passed in words rather than bytes!) Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	feee76972b	random: eliminate redundant new_rotate variable - eliminate new_rotate - move input_rotate masking - simplify input_rotate update - move input_rotate update to end of inner loop for readability Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	433582093a	random: remove cacheline alignment for locks Earlier changes greatly reduce the number of times we grab the lock per output byte, so we shouldn't need this particular hack any more. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	1c0ad3d492	random: make backtracking attacks harder At each extraction, we change (poolbits / 16) + 32 bits in the pool, or 96 bits in the case of the secondary pools. Thus, a brute-force backtracking attack on the pool state is less difficult than breaking the hash. In certain cases, this difficulty may be is reduced to 2^64 iterations. Instead, hash the entire pool in one go, then feedback the whole hash (160 bits) in one go. This will make backtracking at least as hard as inverting the hash. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	ffd8d3fa58	random: improve variable naming, clear extract buffer - split the SHA variables apart into hash and workspace - rename data to extract - wipe extract and workspace after hashing Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	53c3f63e82	random: reuse rand_initialize Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	43ae4860ff	random: use unlocked_ioctl No locking actually needed. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	88c730da8c	random: consolidate wakeup logic Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	90b75ee546	random: clean up checkpatch complaints Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	91f3f1e304	drivers/char/random.c:write_pool() cond_resched() needed Reduce latency for large writes to /dev/[u]random Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Sami Farin <safari-kernel@safari.iki.fi> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:06 -08:00
Adrian Bunk	640e248e44	unexport add_disk_randomness This patch removes the no longer used EXPORT_SYMBOL(add_disk_randomness). Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-02-01 09:26:32 +01:00
Eric Dumazet	6dd10a6235	[NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_SCALAR All 32 bits machines but i386 dont have CONFIG_KTIME_SCALAR. On these machines, ktime.tv64 is more than 4 times the (correct) result given by ktime_to_ns() Again on these machines, using ktime_get_real().tv64 >> 6 give a 32bits rollover every 64 seconds, which is not wanted (less than the 120 s MSL) Using ktime_to_ns() is the portable way to get nsecs from a ktime, and have correct code. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-13 21:12:14 -08:00
Stephen Hemminger	c80544dc0b	sparse pointer use of zero as null Get rid of sparse related warnings from places that use integer as NULL pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Ian Kent <raven@themaw.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Eric Dumazet	9b42c336d0	[TCP]: secure_tcp_sequence_number() should not use a too fast clock TCP V4 sequence numbers are 32bits, and RFC 793 assumed a 250 KHz clock. In order to follow network speed increase, we can use a faster clock, but we should limit this clock so that the delay between two rollovers is greater than MSL (TCP Maximum Segment Lifetime : 2 minutes) Choosing a 64 nsec clock should be OK, since the rollovers occur every 274 seconds. Problem spotted by Denys Fedoryshchenko [ This bug was introduced by `f859581519` ] Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-01 21:01:24 -07:00
Matt Mackall	5a021e9ffd	random: fix bound check ordering (CVE-2007-3105) If root raised the default wakeup threshold over the size of the output pool, the pool transfer function could overflow the stack with RNG bytes, causing a DoS or potential privilege escalation. (Bug reported by the PaX Team <pageexec@freemail.hu>) Cc: Theodore Tso <tytso@mit.edu> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 14:21:04 -07:00
Matt Mackall	679ce0ace6	random: fix output buffer folding (As reported by linux@horizon.com) Folding is done to minimize the theoretical possibility of systematic weakness in the particular bits of the SHA1 hash output. The result of this bug is that 16 out of 80 bits are un-folded. Without a major new vulnerability being found in SHA1, this is harmless, but still worth fixing. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: <linux@horizon.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-16 13:16:16 -07:00
Matt Mackall	7f397dcdb7	random: fix seeding with zero entropy Add data from zero-entropy random_writes directly to output pools to avoid accounting difficulties on machines without entropy sources. Tested on lguest with all entropy sources disabled. Signed-off-by: Matt Mackall <mpm@selenic.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-29 20:09:34 -07:00
Matt Mackall	602b6aeefe	random: fix error in entropy extraction Fix cast error in entropy extraction. Add comments explaining the magic 16. Remove extra confusing loop variable. Signed-off-by: Matt Mackall <mpm@selenic.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-29 20:09:34 -07:00
Eric Dumazet	f859581519	[NET]: random functions can use nsec resolution instead of usec In order to get more randomness for secure_tcpv6_sequence_number(), secure_tcp_sequence_number(), secure_dccp_sequence_number() functions, we can use the high resolution time services, providing nanosec resolution. I've also done two kmalloc()/kzalloc() conversions. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:25 -07:00
Adrian Bunk	cb69cc5236	[TCP/DCCP/RANDOM]: Remove unused exports. This patch removes the following not or no longer used exports: - drivers/char/random.c: secure_tcp_sequence_number - net/dccp/options.c: sysctl_dccp_feat_sequence_window - net/netlink/af_netlink.c: netlink_set_err Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:03 -07:00
Arjan van de Ven	2b8693c061	[PATCH] mark struct file_operations const 3 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:45 -08:00
Alexey Dobriyan	1f29bcd739	[PATCH] sysctl: remove unused "context" param Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Josef Sipek	a7113a9662	[PATCH] struct path: convert char-drivers Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:44 -08:00
David Howells	4c1ac1b491	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 14:37:56 +00:00
Al Viro	b09b845ca6	[RANDOM]: Annotate random.h IP helpers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:51 -08:00
David Howells	65f27f3844	WorkStruct: Pass the work_struct pointer instead of context data Pass the work_struct pointer to the work function rather than context data. The work function can use container_of() to work out the data. For the cases where the container of the work_struct may go away the moment the pending bit is cleared, it is made possible to defer the release of the structure by deferring the clearing of the pending bit. To make this work, an extra flag is introduced into the management side of the work_struct. This governs auto-release of the structure upon execution. Ordinarily, the work queue executor would release the work_struct for further scheduling or deallocation by clearing the pending bit prior to jumping to the work function. This means that, unless the driver makes some guarantee itself that the work_struct won't go away, the work function may not access anything else in the work_struct or its container lest they be deallocated.. This is a problem if the auxiliary data is taken away (as done by the last patch). However, if the pending bit is not cleared before jumping to the work function, then the work function may access the work_struct and its container with no problems. But then the work function must itself release the work_struct by calling work_release(). In most cases, automatic release is fine, so this is the default. Special initiators exist for the non-auto-release case (ending in _NAR). Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:55:48 +00:00
David Howells	52bad64d95	WorkStruct: Separate delayable and non-delayable events. Separate delayable work items from non-delayable work items be splitting them into a separate structure (delayed_work), which incorporates a work_struct and the timer_list removed from work_struct. The work_struct struct is huge, and this limits it's usefulness. On a 64-bit architecture it's nearly 100 bytes in size. This reduces that by half for the non-delayable type of event. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:54:01 +00:00
Dmitry Torokhov	80fc9f532d	Input: add missing exports to fix modular build Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2006-10-11 01:43:58 -04:00
Serge E. Hallyn	e9ff3990f0	[PATCH] namespaces: utsname: switch to using uts namespaces Replace references to system_utsname to the per-process uts namespace where appropriate. This includes things like uname. Changes: Per Eric Biederman's comments, use the per-process uts namespace for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c [jdike@addtoit.com: UML fix] [clg@fr.ibm.com: cleanup] [akpm@osdl.org: build fix] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Andrey Savochkin <saw@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-02 07:57:21 -07:00
David Howells	9361401eb7	[PATCH] BLOCK: Make it possible to disable the block layer [try #6 ] Make it possible to disable the block layer. Not all embedded devices require it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require the block layer to be present. This patch does the following: () Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev support. () Adds dependencies on CONFIG_BLOCK to any configuration item that controls an item that uses the block layer. This includes: () Block I/O tracing. () Disk partition code. () All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS. () The SCSI layer. As far as I can tell, even SCSI chardevs use the block layer to do scheduling. Some drivers that use SCSI facilities - such as USB storage - end up disabled indirectly from this. () Various block-based device drivers, such as IDE and the old CDROM drivers. () MTD blockdev handling and FTL. () JFFS - which uses set_bdev_super(), something it could avoid doing by taking a leaf out of JFFS2's book. () Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is, however, still used in places, and so is still available. () Also made contingent are the contents of linux/mpage.h, linux/genhd.h and parts of linux/fs.h. () Makes a number of files in fs/ contingent on CONFIG_BLOCK. () Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK. () set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK is not enabled. () fs/no-block.c is created to hold out-of-line stubs and things that are required when CONFIG_BLOCK is not set: () Default blockdev file operations (to give error ENODEV on opening). () Makes some /proc changes: () /proc/devices does not list any blockdevs. () /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK. () Makes some compat ioctl handling contingent on CONFIG_BLOCK. () If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if given command other than Q_SYNC or if a special device is specified. () In init/do_mounts.c, no reference is made to the blockdev routines if CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2. () The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return error ENOSYS by way of cond_syscall if so). () The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if CONFIG_BLOCK is not set, since they can't then happen. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2006-09-30 20:52:31 +02:00
Ingo Molnar	e4d9191885	[PATCH] lockdep: locking init debugging improvement Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:02 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Adrian Bunk	30aaa154fc	[IPV6]: Unexport secure_ipv6_port_ephemeral This patch removes the unused EXPORT_SYMBOL(secure_ipv6_port_ephemeral). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:29:17 -07:00
Stephen Hemminger	d251575ab6	[PATCH] random: get rid of sparse warning Get rid of bogus extern attribute that causes sparse warning. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 18:42:11 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	a7f5e7f164	[INET]: Generalise tcp_v4_hash_connect Renaming it to inet_hash_connect, making it possible to ditch dccp_v4_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo	c4365c9235	[RANDOM]: Introduce secure_dccp_sequence_number Code contributed by Stephen Hemminger. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:40 -07:00
Christoph Lameter	6c036527a6	[PATCH] mostly_read data section Add a new section called ".data.read_mostly" for data items that are read frequently and rarely written to like cpumaps etc. If these maps are placed in the .data section then these frequenly read items may end up in cachelines with data is is frequently updated. In that case all processors in an SMP system must needlessly reload the cachelines again and again containing elements of those frequently used variables. The ability to share these cachelines will allow each cpu in an SMP system to keep local copies of those shared cachelines thereby optimizing performance. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com> Signed-off-by: Christoph Lameter <christoph@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:46 -07:00
Matt Mackall	9e95ce279f	[PATCH] update maintainer for /dev/random Ted has agreed to let me take over as maintainer of /dev/random and friends. I've gone ahead and added a line to his entry in CREDITS. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:56 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

... 2 3 4 5 6 ...

319 Commits