... by adding seemingly redudant posting reads.
This little dragon lair exploded the first time around when we've
refactored the code a bit to use the common wait_for_atomic_us in
"drm/i915: Group the GT routines together in both code and vtable",
which caused QA to file fdo bug #51738.
Chris Wilson entertained a few approaches to fixing #51738: Replacing
the udelay(1) with the previously-used udelay(10) (or any other
"sufficiently larger" delay), adding a posting read, or ditching the
delay completely and using cpu_relax. We went with the cpu_relax and
"915: Workaround hang with BSD and forcewake on SandyBridge". Which
blew up in fdo bug #52424, but adding the posting read while still
using cpu_relax seems to also fix that, it looks like the
posting read is the important ingriedient to fix these rc6 related
hangs on snb.
Popular theories as to why this is like it is include:
- A herd of pink elephants got royally angered somehow.
- The gpu has internally different functional units and judging by the
register offsets, the forcewake request register and the forcewake
ack registers are _not_ in the same functional unit (or at least
aren't reached through the same routes). Hence the posting read
syncs up with the wrong block and gets the entire gpu confused.
- ...
As a minimal ducttape fix for 3.6, let's just put these posting reads
into place again. We can try fancier approaches (like adding back the
cpu_relax instead of the udelay) in -next.
This (re-)fixes a regression introduced in
commit 990bbdadab
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jul 2 11:51:02 2012 -0300
drm/i915: Group the GT routines together in both code and vtable
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Du Yan <yanx.du@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52424
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51738u
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
************************************************************
* For the very latest on DRI development, please see: *
* http://dri.freedesktop.org/ *
************************************************************
The Direct Rendering Manager (drm) is a device-independent kernel-level
device driver that provides support for the XFree86 Direct Rendering
Infrastructure (DRI).
The DRM supports the Direct Rendering Infrastructure (DRI) in four major
ways:
1. The DRM provides synchronized access to the graphics hardware via
the use of an optimized two-tiered lock.
2. The DRM enforces the DRI security policy for access to the graphics
hardware by only allowing authenticated X11 clients access to
restricted regions of memory.
3. The DRM provides a generic DMA engine, complete with multiple
queues and the ability to detect the need for an OpenGL context
switch.
4. The DRM is extensible via the use of small device-specific modules
that rely extensively on the API exported by the DRM module.
Documentation on the DRI is available from:
http://dri.freedesktop.org/wiki/Documentation
http://sourceforge.net/project/showfiles.php?group_id=387
http://dri.sourceforge.net/doc/
For specific information about kernel-level support, see:
The Direct Rendering Manager, Kernel Support for the Direct Rendering
Infrastructure
http://dri.sourceforge.net/doc/drm_low_level.html
Hardware Locking for the Direct Rendering Infrastructure
http://dri.sourceforge.net/doc/hardware_locking_low_level.html
A Security Analysis of the Direct Rendering Infrastructure
http://dri.sourceforge.net/doc/security_low_level.html