SuccessChanges

Summary

  1. [llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] (details)
  2. GlobalISel: Reduce G_SHL width if source is extension (details)
  3. Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]" (details)
  4. AMDGPU/GlobalISel: Start implementing computeKnownBitsForTargetInstr (details)
  5. [llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI] (details)
  6. [OPENMP]Fix PR47158, case 3: allow devic_typein nested declare target region. (details)
  7. AMDGPU/GlobalISel: Add baseline, failing unmerge tests (details)
  8. AMDGPU/GlobalISel: Use different technique for sample v3s16 values (details)
Commit c8d2b065b98fa91139cc7bb1fd1407f032ef252e by francesco.petrogalli
[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]

Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Add `<` operator to `ElementCount` to be able to use
`llvm::SmallSetVector<ElementCount, ...>`.
* Bits and pieces around printing the ElementCount to string streams.
* Added a static method to `ElementCount` to represent a scalar.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Differential Revision: https://reviews.llvm.org/D85794
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.cpp (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h (diff)
The file was modifiedllvm/include/llvm/Analysis/TargetTransformInfo.h (diff)
The file was modifiedllvm/include/llvm/Support/TypeSize.h (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp (diff)
The file was modifiedllvm/include/llvm/IR/DiagnosticInfo.h (diff)
The file was modifiedllvm/lib/IR/DiagnosticInfo.cpp (diff)
The file was modifiedllvm/include/llvm/Analysis/VectorUtils.h (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.h (diff)
Commit e1644a377996565e119aa178f40c567b986a6203 by Matthew.Arsenault
GlobalISel: Reduce G_SHL width if source is extension

shl ([sza]ext x, y) => zext (shl x, y).

Turns expensive 64 bit shifts into 32 bit if it does not overflow the
source type:

This is a port of an AMDGPU DAG combine added in
5fa289f0d8ff85b9e14d2f814a90761378ab54ae. InstCombine does this
already, but we need to do it again here to apply it to shifts
introduced for lowered getelementptrs. This will help matching
addressing modes that use 32-bit offsets in a future patch.

TableGen annoyingly assumes only a single match data operand, so
introduce a reusable struct. However, this still requires defining a
separate GIMatchData for every combine which is still annoying.

Adds a morally equivalent function to the existing
getShiftAmountTy. Without this, we would have to do try to repeatedly
query the legalizer info and guess at what type to use for the shift.
The file was addedllvm/test/CodeGen/AMDGPU/GlobalISel/combine-shl-from-extend-narrow.postlegal.mir
The file was addedllvm/test/CodeGen/AMDGPU/GlobalISel/combine-shl-from-extend-narrow.prelegal.mir
The file was modifiedllvm/include/llvm/Target/GlobalISel/Combine.td (diff)
The file was modifiedllvm/lib/Target/AMDGPU/SIISelLowering.cpp (diff)
The file was modifiedllvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (diff)
The file was modifiedllvm/lib/Target/AMDGPU/SIISelLowering.h (diff)
The file was modifiedllvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h (diff)
The file was modifiedllvm/include/llvm/CodeGen/TargetLowering.h (diff)
The file was addedllvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
Commit bad7d6b3735d1d855ffb07f32a272049cff085e6 by francesco.petrogalli
Revert "[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]"

Reverting because the commit message doesn't reflect the one agreed on
phabricator at https://reviews.llvm.org/D85794.

This reverts commit c8d2b065b98fa91139cc7bb1fd1407f032ef252e.
The file was modifiedllvm/include/llvm/Support/TypeSize.h (diff)
The file was modifiedllvm/include/llvm/IR/DiagnosticInfo.h (diff)
The file was modifiedllvm/include/llvm/Analysis/TargetTransformInfo.h (diff)
The file was modifiedllvm/include/llvm/Analysis/VectorUtils.h (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.h (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.cpp (diff)
The file was modifiedllvm/lib/IR/DiagnosticInfo.cpp (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h (diff)
Commit 70cd9f5b779c04d1b32c790cb289c9f00f548b57 by Matthew.Arsenault
AMDGPU/GlobalISel: Start implementing computeKnownBitsForTargetInstr

Handle workitem intrinsics. There isn't really away to adequately test
this right now, since none of the known bits users are fine grained
enough to test the edge conditions. This triggers a number of
instances of the new 64-bit to 32-bit shift combine in the existing
tests.
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.large.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.is.private.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.fmas.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.update.dpp.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/cvt_f32_ubyte.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.dec.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.scale.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.atomic.inc.ll (diff)
The file was modifiedllvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.is.shared.ll (diff)
The file was modifiedllvm/lib/Target/AMDGPU/SIISelLowering.cpp (diff)
The file was modifiedllvm/lib/Target/AMDGPU/AMDGPUSubtarget.h (diff)
The file was modifiedllvm/lib/Target/AMDGPU/SIISelLowering.h (diff)
Commit 5a34b3ab95b5659ef395ffe7b6bd79f2a0423514 by francesco.petrogalli
[llvm][LV] Replace `unsigned VF` with `ElementCount VF` [NFCI]

Changes:

* Change `ToVectorTy` to deal directly with `ElementCount` instances.
* `VF == 1` replaced with `VF.isScalar()`.
* `VF > 1` and `VF >=2` replaced with `VF.isVector()`.
* `VF <=1` is replaced with `VF.isZero() || VF.isScalar()`.
* Replaced the uses of `llvm::SmallSet<ElementCount, ...>` with
   `llvm::SmallSetVector<ElementCount, ...>`. This avoids the need of an
   ordering function for the `ElementCount` class.
* Bits and pieces around printing the `ElementCount` to string streams.

To guarantee that this change is a NFC, `VF.Min` and asserts are used
in the following places:

1. When it doesn't make sense to deal with the scalable property, for
example:
   a. When computing unrolling factors.
   b. When shuffle masks are built for fixed width vector types
In this cases, an
assert(!VF.Scalable && "<mgs>") has been added to make sure we don't
enter coepaths that don't make sense for scalable vectors.
2. When there is a conscious decision to use `FixedVectorType`. These
uses of `FixedVectorType` will likely be removed in favour of
`VectorType` once the vectorizer is generic enough to deal with both
fixed vector types and scalable vector types.
3. When dealing with building constants out of the value of VF, for
example when computing the vectorization `step`, or building vectors
of indices. These operation _make sense_ for scalable vectors too,
but changing the code in these places to be generic and make it work
for scalable vectors is to be submitted in a separate patch, as it is
a functional change.
4. When building the potential VFs in VPlan. Making the VPlan generic
enough to handle scalable vectorization factors is a functional change
that needs a separate patch. See for example `void
LoopVectorizationPlanner::buildVPlans(unsigned MinVF, unsigned
MaxVF)`.
5. The class `IntrinsicCostAttribute`: this class still uses `unsigned
VF` as updating the field to use `ElementCount` woudl require changes
that could result in changing the behavior of the compiler. Will be done
in a separate patch.
7. When dealing with user input for forcing the vectorization
factor. In this case, adding support for scalable vectorization is a
functional change that migh require changes at command line.

Note that in some places the idiom

```
unsigned VF = ...
auto VTy = FixedVectorType::get(ScalarTy, VF)
```

has been replaced with

```
ElementCount VF = ...
assert(!VF.Scalable && ...);
auto VTy = VectorType::get(ScalarTy, VF)
```

The assertion guarantees that the new code is (at least in debug mode)
functionally equivalent to the old version. Notice that this change had been
possible because none of the methods that are specific to `FixedVectorType`
were used after the instantiation of `VTy`.

Reviewed By: rengolin, ctetreau

Differential Revision: https://reviews.llvm.org/D85794
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.cpp (diff)
The file was modifiedllvm/include/llvm/Analysis/TargetTransformInfo.h (diff)
The file was modifiedllvm/lib/IR/DiagnosticInfo.cpp (diff)
The file was modifiedllvm/lib/Transforms/Vectorize/VPlan.h (diff)
The file was modifiedllvm/include/llvm/Support/TypeSize.h (diff)
The file was modifiedllvm/include/llvm/Analysis/VectorUtils.h (diff)
The file was modifiedllvm/include/llvm/IR/DiagnosticInfo.h (diff)
Commit bedc841a5098bc0a90bbc66328d7aab4b2c23c4a by a.bataev
[OPENMP]Fix PR47158, case 3: allow devic_typein nested declare target region.

OpenMP 5.0 supports nested declare target regions. So, in general,it is
allow to mark a declarationas declare target with different device_type
or link type. Patch adds support for such kind of nesting.

Differential Revision: https://reviews.llvm.org/D86239
The file was modifiedclang/include/clang/Sema/Sema.h (diff)
The file was modifiedclang/lib/AST/AttrImpl.cpp (diff)
The file was modifiedclang/lib/Serialization/ASTReaderDecl.cpp (diff)
The file was modifiedclang/lib/Sema/SemaOpenMP.cpp (diff)
The file was modifiedclang/test/OpenMP/declare_target_ast_print.cpp (diff)
The file was modifiedclang/include/clang/Basic/Attr.td (diff)
The file was modifiedclang/test/AST/dump.cpp (diff)
Commit 9b3222d56067502dc5f6139ca7e4c78171439bd4 by Matthew.Arsenault
AMDGPU/GlobalISel: Add baseline, failing unmerge tests
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/artifact-combiner-unmerge-values.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-unmerge-values.mir (diff)
Commit bdb25b3ce5479a820a4c6539d22aeedad7e06875 by Matthew.Arsenault
AMDGPU/GlobalISel: Use different technique for sample v3s16 values

Avoid relying on implicit_def values, and odd sized G_INSERT/G_EXTRACT
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ashr.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fmul.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-saddo.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sext-inreg.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-uaddo.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-usubo.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-or.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-xor.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fpext.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fadd.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-shl.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-and.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fma.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-intrinsic-round.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-lshr.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-select.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir (diff)
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ssubo.mir (diff)