Success

Changes

Summary

  1. Revert "[NFC][PowerPC] Add a new case to test phi-node-elimination pass"
  2. [ScheduleDAG] Avoid unnecessary recomputation of topological order.
  3. [X86][AVX] Pad small shuffle inputs in combineX86ShufflesRecursively
  4. [X86][AVX] getFauxShuffleMask - don't widen shuffle inputs from INSERT_SUBVECTOR(X,SHUFFLE(Y,Z))
  5. [PhaseOrdering] add scalarization test for PR42174; NFC
  6. [X86][AVX] Add test case described in D79987
  7. [X86] getFauxShuffleMask/getTargetShuffleInputs - make SelectionDAG const (PR45974).
  8. [VectorCombine] add tests for scalarizing binop-with-constant; NFC
  9. [X86][AVX] Add SimplifyMultipleUseDemandedBits VBROADCAST handling to SimplifyDemandedVectorElts.
Commit bfdf9ef009ab335981747f09a2c6b9a41c0462b4 by shkzhang
Revert "[NFC][PowerPC] Add a new case to test phi-node-elimination pass"
This case will fail on some machines that enable expensive-checks.

This reverts commit af3abbf7bd2213003a133c361c212ac6efb1bd2b.
The file was removed llvm/test/CodeGen/PowerPC/phi-eliminate.mir
Commit ec25a71eb7fc72440149784951d62453301cc960 by flo
[ScheduleDAG] Avoid unnecessary recomputation of topological order.

In some cases ScheduleDAGRRList has to add new nodes to resolve problems
with interfering physical registers. When new nodes are added, it
completely re-computes the topological order, which can take a long
time, but is unnecessary. We only add nodes one by one, and initially
they do not have any predecessors. So we can just insert them at the end
of the vector. Later we add predecessors, but the helper function
properly updates the topological order much more efficiently. With this
change, the compile time for the program below drops from 300s to 30s on
my machine.

    define i11129 @test1() {
      %L1 = load i11129, i11129* undef
      %B30 = ashr i11129 %L1, %L1
      store i11129 %B30, i11129* undef
      ret i11129 %L1
    }

This should be generally beneficial, as we can skip a large amount of
work. Theoretically there are some scenarios where we might not save
much, e.g. when we add a dependency between the first and last node.
Then we would have to shift all nodes. But we still do not have to spend
the time re-computing the initial order.
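
The idea described above can be illustrated with a minimal sketch. This is not the code from ScheduleDAG.cpp; the class TopoOrder and its members are hypothetical names chosen for illustration, and the repair step is a simplified localized reorder. It assumes a new node starts with no edges, so appending it to the end of the order is always valid, and that adding an edge only needs to reshuffle the positions between the two endpoints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a DAG with an incrementally
// maintained topological order. Not LLVM's ScheduleDAGTopologicalSort.
class TopoOrder {
public:
  // A new node has no edges yet, so placing it last keeps the order valid.
  // No recomputation of the existing order is needed.
  int addNode() {
    int id = static_cast<int>(succs_.size());
    succs_.emplace_back();
    pos_.push_back(static_cast<int>(order_.size()));
    order_.push_back(id);
    return id;
  }

  // Add edge pred -> succ (pred must be ordered before succ). Returns false
  // and leaves everything unchanged if the edge would create a cycle.
  bool addEdge(int pred, int succ) {
    if (pred == succ) return false;
    int lo = pos_[succ], hi = pos_[pred];
    if (lo < hi) {
      // Order is violated; only positions [lo, hi] need repair.
      std::vector<bool> reach(succs_.size(), false);
      if (!markReachable(succ, pred, hi, reach)) return false;
      // Nodes not reachable from succ go first, reachable ones after,
      // each group keeping its original relative order.
      std::vector<int> front, back;
      for (int i = lo; i <= hi; ++i)
        (reach[order_[i]] ? back : front).push_back(order_[i]);
      front.insert(front.end(), back.begin(), back.end());
      for (int i = lo; i <= hi; ++i) {
        order_[i] = front[i - lo];
        pos_[order_[i]] = i;
      }
    }
    succs_[pred].push_back(succ);
    return true;
  }

  const std::vector<int>& order() const { return order_; }
  int position(int node) const { return pos_[node]; }

private:
  // DFS from n over nodes at positions <= hi. Returns false if target is
  // reached, which means the new edge would close a cycle.
  bool markReachable(int n, int target, int hi, std::vector<bool>& reach) {
    if (n == target) return false;
    reach[n] = true;
    for (int s : succs_[n])
      if (pos_[s] <= hi && !reach[s] && !markReachable(s, target, hi, reach))
        return false;
    return true;
  }

  std::vector<std::vector<int>> succs_;  // adjacency list: node -> successors
  std::vector<int> order_;               // position -> node
  std::vector<int> pos_;                 // node -> position
};
```

In this sketch, adding an edge between the first and last node still touches every position in between, which matches the worst case mentioned above, but the initial order is never rebuilt from scratch.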

Reviewers: MatzeB, atrick, efriedma, niravd, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D59722
The file was modified llvm/lib/CodeGen/ScheduleDAG.cpp
The file was modified llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
The file was modified llvm/include/llvm/CodeGen/ScheduleDAG.h
Commit 45ebe38ffc40bb7221fc587bfb4481cf7f53ebbc by llvm-dev
[X86][AVX] Pad small shuffle inputs in combineX86ShufflesRecursively

As detailed on PR45974 and D79987, getFauxShuffleMask is creating nodes on the fly to create shuffles with inputs the same size as the result, causing problems for hasOneUse() checks in later simplification stages.

Currently only combineX86ShufflesRecursively benefits from these widened inputs so I've begun moving the functionality there, and out of getFauxShuffleMask. This allows us to remove the widening from VBROADCAST and *EXTEND* faux shuffle cases.

This just leaves the INSERT_SUBVECTOR case in getFauxShuffleMask still creating nodes, which will require more extensive refactoring.
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit d33ba1aa0b505e3f4c55b382f171e8cbef6a1843 by llvm-dev
[X86][AVX] getFauxShuffleMask - don't widen shuffle inputs from INSERT_SUBVECTOR(X,SHUFFLE(Y,Z))

Don't create nodes on the fly when decoding INSERT_SUBVECTOR as faux shuffles.
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit 129c501aa9199c2c5a69c7a6de8ec9873e3d41a4 by spatel
[PhaseOrdering] add scalarization test for PR42174; NFC

Motivating test for vector-combine enhancement in D80885.
Make sure that vectorization and canonicalization are
working together as expected.
The file was added llvm/test/Transforms/PhaseOrdering/X86/scalarization.ll
Commit 15b281d7805dde85af532b954e27e3fc8bf2611d by llvm-dev
[X86][AVX] Add test case described in D79987
The file was modified llvm/test/CodeGen/X86/oddshuffles.ll
Commit f046326847076b50017b3d32db62c3511c478888 by llvm-dev
[X86] getFauxShuffleMask/getTargetShuffleInputs - make SelectionDAG const (PR45974).

Try to prevent future node creation issues (as detailed in PR45974) by making the SelectionDAG reference const, so it can still be used for analysis, but not node creation.
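
The effect of the const reference can be shown with a small sketch. MiniDAG, createNode, opcodeOf and isOpcode are hypothetical names for illustration and are not the SelectionDAG API; the sketch only assumes that node creation is a non-const member function while analysis queries are const-qualified. It uses std::void_t, so it needs C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical miniature stand-in for a DAG container (not llvm::SelectionDAG).
class MiniDAG {
public:
  // Mutating: creates a node and returns its id.
  int createNode(int opcode) {
    nodes_.push_back(opcode);
    return static_cast<int>(nodes_.size()) - 1;
  }
  // Analysis only: const-qualified, so callable through a const reference.
  int opcodeOf(int id) const { return nodes_[static_cast<std::size_t>(id)]; }
  std::size_t size() const { return nodes_.size(); }

private:
  std::vector<int> nodes_;
};

// An analysis helper that takes the DAG by const reference. It can query
// nodes but cannot create them.
bool isOpcode(const MiniDAG& dag, int id, int opcode) {
  // dag.createNode(opcode);  // would not compile: createNode is non-const
  return dag.opcodeOf(id) == opcode;
}

// Compile-time check that createNode is callable on T.
template <typename T, typename = void>
struct CanCreateNode : std::false_type {};
template <typename T>
struct CanCreateNode<
    T, std::void_t<decltype(std::declval<T>().createNode(0))>>
    : std::true_type {};

static_assert(CanCreateNode<MiniDAG&>::value,
              "a mutable reference may create nodes");
static_assert(!CanCreateNode<const MiniDAG&>::value,
              "a const reference may not create nodes");
```

With this shape, an accidental node creation inside an analysis helper becomes a compile error rather than a latent change in use counts.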
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit e31f2a894a7bec0a64553d615ef40fa36134844e by spatel
[VectorCombine] add tests for scalarizing binop-with-constant; NFC

Goes with proposal in D80885.

This is adapted from the InstCombine tests that were added for
D50992.

But these should be adjusted further to provide more interesting
scenarios for x86-specific codegen. E.g., vector types/sizes will
have different costs depending on ISA attributes.

We also need to add tests that include a load of the scalar
variable and add tests that include extra uses of the insert
to further exercise the cost model.
The file was added llvm/test/Transforms/VectorCombine/X86/insert-binop-with-constant.ll
Commit 4a2673d79fdbae57a800ec578ee3d58a6890a4f9 by llvm-dev
[X86][AVX] Add SimplifyMultipleUseDemandedBits VBROADCAST handling to SimplifyDemandedVectorElts.

As suggested on D79987.
The file was modified llvm/test/CodeGen/X86/oddshuffles.ll
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp