Commit graph

401698 commits

Author SHA1 Message Date
Damien Lespiau
4b30553d89 drm/i915/bdw: Broadwell has 3 pipes
v2: Rebase (Paulo Zanoni)

v3: Rebase on top of num_pipes having moved to intel_device_info.

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:51 +01:00
Paulo Zanoni
4e8058a20a drm/i915/bdw: add IS_BROADWELL macro
For now it's just equivalent to IS_GEN8, but in the future we might
want to change that (e.g., on Gen 7 we have IS_VALLEYVIEW,
IS_IVYBRIDGE and IS_HASWELL).

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:51 +01:00
Ben Widawsky
780f18c84c drm/i915/bdw: BSD init for gen8 also
This was an oversight and should have been in a previous series
somewhere.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:50 +01:00
Ben Widawsky
77df677291 drm/i915/bdw: ppgtt info in debugfs
It's not so much that the information is terribly useful, but rather
that the gen6/7 information is completely useless.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:50 +01:00
Ville Syrjälä
b42218c19f drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails
v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky
a5f3d68e2e drm/i915/bdw: Render ring flushing
PIPE_CONTROL added the high address dword. I'm not sure how the
simulator let me get away with this. I've explicitly left out all the
workarounds from Gen7 because in the minimal digging that I did, most
don't seem necessary, and the simulator doesn't complain without them

Note that BLT and BSD ring commands had already been updated previously.
Just render/pipe_control should have been broken.

v2: Squash in a fixup from Ville to follow the recent IVB PIPE_CONTROL
updates: "BDW uses the IVB PIPE_CONTROL style for specifying GTT vs.
PPGTT for the PIPE_CONTROL QW/DW write."

v3: Rebase on top of Chris' cleanup to have an explicit ring->scratch
buffer object instead of an opaque ring->private where everyone stores
the same stuff inside.

Reported-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for the fixup)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky
28cf541543 drm/i915/bdw: unleash PPGTT
v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:48 +01:00
Ben Widawsky
94e409c144 drm/i915/bdw: Implement PPGTT enable
Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
9df15b499b drm/i915/bdw: Implement PPGTT insert
GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky
459108b8cc drm/i915/bdw: Implement PPGTT clear range
GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
b1fe667329 drm/i915/bdw: Initialize the PDEs
The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
  fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

  The PAT bit is used for page size. 2ME PDEs aren't even supported in
  BDW, so this was completely invalid. The solution is to make our
  PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
  this change won't matter for performance.

  Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky
37aca44ad5 drm/i915/bdw: PPGTT init & cleanup
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
fbe5d36e77 drm/i915/bdw: Support BDW caching
BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky
94ec8f6130 drm/i915/bdw: Add GTT functions
With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
d31eb10e6c drm/i915/bdw: Create gen8_gtt_pte_t
With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky
63340133f3 drm/i915/bdw: Make gen8_gmch_probe
Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky
d0582ed2ff drm/i915/bdw: Update relevant error state
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky
9d3203e16c drm/i915/bdw: debugfs updates
All the gen8 debugfs stuff I wasn't too lazy to update. We'll need more
later, I am certain.

v2: Fix up the register name in the debugfs output as suggested by
Paulo.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:42 +01:00
Ben Widawsky
075b3bbaae drm/i915/bdw: Update MI_FLUSH_DW
The code is more verbose than necessary for the reader's sake, hopefully
the compiler optimizes away the if.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:42 +01:00
Ben Widawsky
1c7a0623c7 drm/i915/bdw: dispatch updates (64b related)
The command to emit batch buffers has changed to address 48b addresses.
It seemed reasonable that we could still use the old instruction where
emitting 0 for length would do the right thing, but it seems to bother
the simulator when the code does that.

Now the second dword in the command has the upper 16b of the address of
the batchbuffer.

v2: Remove duplicated vfun assignment.

v3: Squash in VECS support changes from Zhao Yakui <yakui.zhao@intel.com>

v4: Make checkpatch happy.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:41 +01:00
Ben Widawsky
3c94ceeee2 drm/i915/bdw: Support 64b relocations
We don't actually return any to userspace yet, however we can pretend
like we do now so userspace will support it when it happens.

This is just to please Chris as the code itself isn't ready for > 64b
relocations.

v2: Rebase on top of the refactored relocate_entry_gtt|cpu functions.

v3: Squash in fixup from Rafal Barbalho for 64 byte relocs using cpu
relocs and those crossing a page boundary.

v4: Squash in a fixup for the fixup from Rafael.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Barbalho, Rafael <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:41 +01:00
Ben Widawsky
a123f157a3 drm/i915/bdw: Add interrupt info to debugfs
v2: Add missed ring interrupt info

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:40 +01:00
Ben Widawsky
abd58f0175 drm/i915/bdw: Implement interrupt changes
The interrupt handling implementation remains the same as previous
generations with the 4 types of registers, status, identity, mask, and
enable. However the layout of where the bits go have changed entirely.
To address these changes, all of the interrupt vfuncs needed special
gen8 code.

The way it works is there is a top level status register now which
informs the interrupt service routine which unit caused the interrupt,
and therefore which interrupt registers to read to process the
interrupt. For display the division is quite logical, a set of interrupt
registers for each pipe, and in addition to those, a set each for "misc"
and port.

For GT the things get a bit hairy, as seen by the code. Each of the GT
units has it's own bits defined. They all look *very similar* and
resides in 16 bits of a GT register. As an example, RCS and BCS share
register 0. To compact the code a bit, at a slight expense to
complexity, this is exactly how the code works as well. 2 structures are
added to the ring buffer so that our ring buffer interrupt handling code
knows which ring shares the interrupt registers, and a shift value (ie.
the top or bottom 16 bits of the register).

The above allows us to kept the interrupt register caching scheme, the
per interrupt enables, and the code to mask and unmask interrupts
relatively clean (again at the cost of some more complexity).

Most of the GT units mentioned above are command streamers, and so the
symmetry should work quite well for even the yet to be implemented rings
which Broadwell adds.

v2: Fixes up a couple of bugs, and is more verbose about errors in the
Broadwell interrupt handler.

v3: fix DE_MISC IER offset

v4: Simplify interrupts:
I totally misread the docs the first time I implemented interrupts, and
so this should greatly simplify the mess. Unlike GEN6, we never touch
the regular mask registers in irq_get/put.

v5: Rebased on to of recent pch hotplug setup changes.

v6: Fixup on top of moving num_pipes to intel_info.

v7: Rebased on top of Egbert Eich's hpd irq handling rework. Also
wired up ibx_hpd_irq_setup for gen8.

v8: Rebase on top of Jani's asle handling rework.

v9: Rebase on top of Ben's VECS enabling for Haswell, where he
unfortunately went OCD on the gt irq #defines. Not that they're still
not yet fully consistent:
- Used the GT_RENDER_ #defines + bdw shifts.
- Dropped the shift from the L3_PARITY stuff, seemed clearer.
- s/irq_refcount/irq_refcount.gt/

v10: Squash in VECS enabling patches and the gen8_gt_irq_handler
refactoring from Zhao Yakui <yakui.zhao@intel.com>

v11: Rebase on top of the interrupt cleanups in upstream.

v12: Rebase on top of Ben's DPF changes in upstream.

v13: Drop bdw from the HAS_L3_DPF feature flag for now, it's unclear what
exactly needs to be done. Requested by Ben.

v14: Fix the patch.
- Drop the mask of reserved bits and assorted logic, it doesn't match
  the spec.
- Do the posting read inconditionally instead of commenting it out.
- Add a GEN8_MASTER_IRQ_CONTROL definition and use it.
- Fix up the GEN8_PIPE interrupt defines and give the GEN8_ prefixes -
  we actually will need to use them.
- Enclose macros in do {} while (0) (checkpatch).
- Clear DE_MISC interrupt bits only after having processed them.
- Fix whitespace fail (checkpatch).
- Fix overtly long lines where appropriate (checkpatch).
- Don't use typedef'ed private_t (maintainer-scripts).
- Align the function parameter list correctly.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

bikeshed
2013-11-08 18:09:39 +01:00
Ben Widawsky
9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Ben Widawsky
4e0bbc316e drm/i915/bdw: display stuff
Just enough to make the code not barf...

Init BDW display to look like HSW. For the simulator this should be
fine, but this will probably require more work.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a FIXME comment about RCS flips being untested on bdw.
Also add a note that hblank events are reserved on bdw+ in DERRMR.]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:38 +01:00
Ben Widawsky
1020a5c2dc drm/i915/bdw: Clock gating init
Clock gating init is really a catch all function for registers we need
to write early in loading the driver.

Atm just the bare metal stuff we need, more will surely come.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:38 +01:00
Ben Widawsky
8897644a6d drm/i915/bdw: HW context support
BDW context sizes varies a bit.

v2: Squash in fixup for the hw context size from Ben.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:37 +01:00
Ben Widawsky
31a5336e1c drm/i915/bdw: Swizzling support
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:37 +01:00
Ben Widawsky
5ab31333ac drm/i915/bdw: Fences on gen8 look just like gen7
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:36 +01:00
Ben Widawsky
4d4dead67a drm/i915/bdw: Add device IDs
v2: Squash in "drm/i915/bdw: Add BDW to the HAS_DDI check" as
suggested by Damien.

v3: Squash in VEBOX enabling from  Zhao Yakui <yakui.zhao@intel.com>

v4: Rebase on top of Jesse's patch to extract all pci ids to
include/drm/i915_pciids.h.

v4: Replace Halo by its marketing moniker Iris. Requested by Ben.

v5: Switch from info->has*ring to info->ring_mask.

v6: Add 0x16X2 variant (which is newer than this patch)
Rename to use new naming scheme (Chris)
Remove Simulator PCI ids. These snuck in during rebase (Chris)

v7: Fix poor sed job from v6
Make the desktop variants use the desktop macro (Rebase error). Notice
that this makes no functional difference - it's just confusing.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:36 +01:00
Daniel Vetter
8fe6bd239a drm/i915/bdw: Disable PPGTT for now
This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:35 +01:00
Ben Widawsky
43d1b64728 drm/i915/bdw: Initialize BDW forcewake vfuncs
Somehow this got missed or dropped during development. The simulator
does not use forcewake, so it's entirely possible it never worked
correctly. After the mmio rework, this will end up in an OOPs, and the
system will not boot.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Use IS_GEN8 instead of IS_BROADWELL.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:08:46 +01:00
Ben Widawsky
ab2aa47e4b drm/i915/bdw: Handle forcewake for writes on gen8
GEN8 removes the GT FIFO which we've all come to know and love. Instead
it offers a wider range of optimized registers which always keep a
shadowed copy, and are fed to the GPU when it wakes.

How this is implemented in hardware is still somewhat of a mystery. As
far as I can tell, the basic design is as follows:

If the register is not optimized, you must use the old forcewake
mechanism to bring the GT out of sleep. [1]

If register is in the optimized list the write will signal that the
GT should begin to come out of whatever sleep state it is in.

While the GT is coming out of sleep, the requested write will be stored
in an intermediate shadow register.

Do to the fact that the implementation details are not clear, I see
several risks:
1. Order is not preserved as it is with GT FIFO. If we issue multiple
writes to optimized registers, where order matters, we may need to
serialize it with forcewake.
2. The optimized registers have only 1 shadowed slot, meaning if we
issue multiple writes to the same register, and those values need to
reach the GPU in order, forcewake will be required.

[1] We could implement a SW queue the way the GT FIFO used to work if
desired.

NOTE: Compile tested only until we get real silicon.

v2:
- Use a default case to make future platforms also work.
- Get rid of IS_BROADWELL since that's not yet defined, but we want to
  MMIO as soon as possible.

v3: Apply suggestions from Mika's review:
- s/optimized/shadowed/
- invert the logic of the helper so that it does what it says (the
  code itself was correct, just confusing to read).

v4:
- Squash in lost break.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-07 22:16:04 +01:00
Ben Widawsky
d2980845b7 drm/i915/bdw: IS_GEN8 definition
No PCI ids yet, so nothing should happen.

Rebase-Note: This one needs replacement ;-)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-05 10:57:59 +01:00
Daniel Vetter
7f16e5c141 Merge tag 'v3.12' into drm-intel-next
I want to merge in the new Broadwell support as a late hw enabling
pull request. But since the internal branch was based upon our
drm-intel-nightly integration branch I need to resolve all the
oustanding conflicts in drm/i915 with a backmerge to make the 60+
patches apply properly.

We'll propably have some fun because Linus will come up with a
slightly different merge solution.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_crt.c
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_drv.h

All rather simple adjacent lines changed or partial backports from
-next to -fixes, with the exception of the thaw code in i915_dma.c.
That one needed a bit of shuffling to restore the intent.

Oh and the massive header file reordering in intel_drv.h is a bit
trouble. But not much.

v2: Also don't forget the fixup for the silent conflict that results
in compile fail ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-04 16:28:52 +01:00
Linus Torvalds
5e01dc7b26 Linux 3.12 2013-11-03 15:41:51 -08:00
Linus Torvalds
17f6ee43c3 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "Three fixes across arch/mips with the most complex one being the GIC
  interrupt fix - at nine lines still not monster.  I'm confident this
  are the final MIPS patches even if there should go for an rc8"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: ralink: fix return value check in rt_timer_probe()
  MIPS: malta: Fix GIC interrupt offsets
  MIPS: Perf: Fix 74K cache map
2013-11-03 11:36:41 -08:00
Mathias Krause
9bf76ca325 ipc, msg: forbid negative values for "msg{max,mnb,mni}"
Negative message lengths make no sense -- so don't do negative queue
lenghts or identifier counts. Prevent them from getting negative.

Also change the underlying data types to be unsigned to avoid hairy
surprises with sign extensions in cases where those variables get
evaluated in unsigned expressions with bigger data types, e.g size_t.

In case a user still wants to have "unlimited" sizes she could just use
INT_MAX instead.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-03 10:53:11 -08:00
Linus Torvalds
9dc8c89dfb Last minute perf unbreakage for ARM modules; spent a day in linux-next.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSdC8JAAoJENkgDmzRrbjx6TEP/jLxlenB07cjc8SnPD1m6htT
 DLFr/t1Jz9qr10EzT9wbNlUtrVhw5HRztd3SDGc8I0OivvAzIxMTVv67CYXHXKMH
 ycrjUOIVnu0Dg/+R1QoP9KFG9aR5UIK0kctTWBhn7ayR0TJfoCN7x7fdWYZP2k6g
 QEXYUvAAaOp9Kg/0oZQ8bXmBVsdoRd3s3PGZUxcOqjD+kK3t+hczJtu6sYgLQvBK
 85Pi+oW1/RZ0ewL89BbU80zwn2vfObtS09sbkxlLKuXFdpp+Veay9Ps6imTzI6PB
 A953j3FwbxpCpwiFOSmF8Ba/7JU4cBW6asM7ZnDTRXkJhlIWxtr0DJxP++vGMo1u
 IZsZRJcwVIQJg1+WvLgw9/aVx1+Yz560jaFJYfr/Lvq6wgNimWtCmyQ8SlFsLxYm
 FO0QuImTGo1iB13hLZ4guMm73WxHGhIXn/29FHq5UmGG/FrCSR1YUq9thP221DrP
 MD3ufvTHvZv9Vb6kzMm2A6oM0Z/HsTATIyirryAPQC1sL3ExwMQibNmGBnV2lPL+
 veaVG4QuPiyHhUDigIxqaltGsCXC/MLv8GbwlqcKWYQENHrQ5qtQxMd2sBmALP7f
 mu4qwcbDBVPqJuCj/+H5QbRLiDfTWmKttUWGRwSIwFJDD/uCvl7yfqzFoqckAJql
 xCtWjFzu3NpM/T3xF39j
 =O8nP
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull ARM kallsyms fix from Rusty Russell:
 "Last minute perf unbreakage for ARM modules; spent a day in
  linux-next"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  scripts/kallsyms: filter symbols not in kernel address space
2013-11-02 10:27:29 -07:00
Vineet Gupta
9c41f4eeb9 ARC: Incorrect mm reference used in vmalloc fault handler
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current
task's "active_mm".  ARC vmalloc fault handler however was using mm.

A vmalloc fault for non user task context (actually pre-userland, from
init thread's open for /dev/console) caused the handler to deref NULL mm
(for mm->pgd)

The reasons it worked so far is amazing:

1. By default (!SMP), vmalloc fault handler uses a cached value of PGD.
   In SMP that MMU register is repurposed hence need for mm pointer deref.

2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in
   pre-userland code path - it was introduced with commit 20bafb3d23
   "n_tty: Move buffers into n_tty_data"

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: stable@vger.kernel.org    #3.10 and 3.11
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-02 10:27:04 -07:00
Ming Lei
f6537f2f0e scripts/kallsyms: filter symbols not in kernel address space
This patch uses CONFIG_PAGE_OFFSET to filter symbols which
are not in kernel address space because these symbols are
generally for generating code purpose and can't be run at
kernel mode, so we needn't keep them in /proc/kallsyms.

For example, on ARM there are some symbols which may be
linked in relocatable code section, then perf can't parse
symbols any more from /proc/kallsyms, this patch fixes the
problem (introduced b9b32bf70f)

Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org
2013-11-02 09:13:02 +10:30
Linus Torvalds
9581b7d268 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two fixes:

   - Fix 'NMI handler took too long to run' false positives

     [ Genuine NMI overhead speedups will come for v3.13, this commit
       only fixes a measurement bug ]

   - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data
     corruption on ppc64"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix NMI measurements
  perf: Fix perf ring buffer memory ordering
2013-11-01 12:54:51 -07:00
Linus Torvalds
9119e33e50 USB fixes for 3.12-final
Here is a set of patches that revert all of the changes done to the
 pl2303 USB serial driver in the 3.12-rc timeframe, as it turns out they
 break some devices that work just fine on 3.11.  As it's not a good idea
 to break working systems, drop them all and they will be reworked for
 future kernel versions such that there is no breakage.
 
 I've also included a MAINTAINERS update for the USB serial subsystem and
 a new device id for the ftdi_sio driver as well.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlJz3VcACgkQMUfUDdst+ylxEQCgwiApcUC8e01hmRsuQxrbTC+v
 2P8AnA6PIkyK5vO6+8BCy4YDx93dIwyQ
 =Tn7c
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here is a set of patches that revert all of the changes done to the
  pl2303 USB serial driver in the 3.12-rc timeframe, as it turns out
  they break some devices that work just fine on 3.11.  As it's not a
  good idea to break working systems, drop them all and they will be
  reworked for future kernel versions such that there is no breakage.

  I've also included a MAINTAINERS update for the USB serial subsystem
  and a new device id for the ftdi_sio driver as well"

* tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: ftdi_sio: add id for Z3X Box device
  USB: Maintainers change for usb serial drivers
  Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type"
  Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method"
  Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method"
  Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates"
  Revert "usb: pl2303: move the two baud rate encoding methods to separate functions"
  Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method"
  Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips"
  Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips"
  Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()"
  Revert "pl2303: improve the chip type information output on startup"
  Revert "pl2303: improve the chip type detection/distinction"
  Revert "USB: pl2303: distinguish between original and cloned HX chips"
2013-11-01 12:23:56 -07:00
Linus Torvalds
f9adfbfbf3 sound fixes #2 for 3.12-final
The fixes for random bugs that have been reported lately in the game:
 a few fixes in ASoC dpam and wm_hubs bugs spotted by Coverity, a
 one-liner HD-audio fixup, and a fix for Oops with DPCM.
 
 They are not so critically urgent bugs, but all small and safe.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJSc3xXAAoJEGwxgFQ9KSmkeRIP/R7+E7qukg+WOfB5DWr1QOkW
 7Sly5OsqwT6sEy7uXtYZRA1Y+kelLIew2uoSHkGibSmvlBCLQZkG4PWKBpL0jQRZ
 /y64puytmZTo8RVrxsVIkjJJlFhZ5QCzyahxJO2W0BAnDAxLJ3u6abb1M7qeF+zi
 kn8Ad+PRTUmp1ilJQuyl4A+67DJllk61whPkQFCTeZUVPTkWtiIHyicAkmJiptFz
 +zQSVa4hvoZzM8Pw50N+OhmMlWrvgOfEYg8qhv6zE7owhEeKQCuR5HW8QQdKtN4F
 WQeUgRNZYWYHHYGx+Cn1+tjZV8CmH51JKrsbj+xedmWJ4xyeoX2dURHEzZOltpV3
 FGnmttt/2o6bvZt0kwbYdsrGB+aW8bNV20SC036VUYJxXNdJTx3kLa7O9aAbANcs
 HSsTqvH2fqFqLWtjP7yXi+PdrIpap0WDLkQh1S3FNZXquWq9nc9rT0ZUTnoh+1ib
 hx1BCglTtQzprPW/iQ/Kt0Y3jRMuVcPFzuYurz7NVmefxBtSiNFjnX84QTbO6sjk
 Pq+8efxjlQYL4XVDGhCygaiTZmcEQSwH57M6y5Fwl163v0CBdWlr9oicmSyOh67c
 xiSQPrLXTPAsa5okzV09d47Gv4gS5QPuDyDcpR5oY07wCd+XDRsSTnMJzj8c8wLB
 8jf67Z2SDjC3b2Ai4HOd
 =6MXa
 -----END PGP SIGNATURE-----

Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull more sound fixes from Takashi Iwai:
 "The fixes for random bugs that have been reported lately in the game:
  a few fixes in ASoC dpam and wm_hubs bugs spotted by Coverity, a
  one-liner HD-audio fixup, and a fix for Oops with DPCM.

  They are not so critically urgent bugs, but all small and safe"

* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM
  ASoC: wm_hubs: Add missing break in hp_supply_event()
  ALSA: hda - Add a fixup for ASUS N76VZ
  ASoC: dapm: Return -ENOMEM in snd_soc_dapm_new_dai_widgets()
  ASoC: dapm: Fix source list debugfs outputs
2013-11-01 12:23:22 -07:00
Linus Torvalds
68e952d5f9 Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull clock subsystem fixes from Mike Turquette.

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: fixup argument order when setting VCO parameters
  clk: socfpga: Fix incorrect sdmmc clock name
  clk: armada-370: fix tclk frequencies
  clk: nomadik: set all timers to use 2.4 MHz TIMCLK
2013-11-01 12:22:47 -07:00
Greg Thelen
6920a1bd03 memcg: remove incorrect underflow check
When a memcg is deleted mem_cgroup_reparent_charges() moves charged
memory to the parent memcg.  As of v3.11-9444-g3ea67d0 "memcg: add per
cgroup writeback pages accounting" there's bad pointer read.  The goal
was to check for counter underflow.  The counter is a per cpu counter
and there are two problems with the code:

 (1) per cpu access function isn't used, instead a naked pointer is used
     which easily causes oops.
 (2) the check doesn't sum all cpus

Test:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ echo 3 > /proc/sys/vm/drop_caches
  $ (echo $BASHPID >> x/tasks && exec cat) &
  [1] 7154
  $ grep ^mapped x/memory.stat
  mapped_file 53248
  $ echo 7154 > tasks
  $ rmdir x
  <OOPS>

The fix is to remove the check.  It's currently dangerous and isn't
worth fixing it to use something expensive, such as
percpu_counter_sum(), for each reparented page.  __this_cpu_read() isn't
enough to fix this because there's no guarantees of the current cpus
count.  The only guarantees is that the sum of all per-cpu counter is >=
nr_pages.

Fixes: 3ea67d06e4 ("memcg: add per cgroup writeback pages accounting")
Reported-and-tested-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-01 12:22:28 -07:00
Paulo Zanoni
9d1cb9147d drm/i915: avoid unclaimed registers when capturing the error state
Even though we only check for unclaimed registers while we're writing
registers, if we read a bad register we'll still trigger a CPU error
interrupt, and we'll print an "Unclaimed register" DRM_ERROR due to
that. To avoid this error, just avoid touching power domains that are
not enabled.

Use kzalloc so we're sure all the disabled domains will be zeroed on
the error state file. We already print the information that is enough
to discover if the power well is enabled on the error state file, so
this should not be a problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69747
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-01 18:54:51 +01:00
Daniel Vetter
2675680958 drm/i915: Enable DP port CRC for the "auto" source on g4x/vlv
Now that DP port CRCs are stable, we can use it for generic CRC tests.
Yay, the auto CRC source should now work everywhere!

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-01 18:27:56 +01:00
Daniel Vetter
8d2f24ca1f drm/i915: scramble reset support for DP port CRC on vlv
They've moved the DC balance reset bit around. Again I don't think we
need it, but better safe than sorry and maybe HDMI port CRC will prove
useful for checking infoframes or hdmi audio.

v2: Apply the suggestions from Damien's review.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-01 18:24:53 +01:00
Daniel Vetter
8409360381 drm/i915: scramble reset support for DP port CRC on g4x
We need to reset the DP scrambler on every vsync to get stable CRCs.
And since we can't use the normal pipe CRC on DP ports on g4x we
really need them to be able to test modesetting issues on (e)DP
outputs.

Note that the DC balance reset is for SDVO port CRCs so we don't
strictly need it. But better safe than sorry (and it's a nice template
in case we ever want to grab port CRCs for e.g. audio checking).

v2: Apply the suggestions from Damien's review.

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-01 18:23:41 +01:00