Changes

Summary

  1. [X86] Simplify patterns for avx512 vpcmp. NFC
  2. [GCOV] Drop unnecessary const from return types (NFC)
  3. [TableGen] Use ListSeparator (NFC)
  4. [AsmPrinter] Use range-based for loops (NFC)
  5. [Polly] Hide Simplify implementation from header. NFC.
  6. [AMDGPU] Refactor MIMG tables to better handle hardware variants
  7. [clang][cli] Fix gcc warning (NFC)
  8. [Test] Add negative tests where usub optimization should not apply
  9. [Codegenprepare][X86] Use usub with overflow opt for IV increment
  10. NFC comment-only cleanups
  11. NFC; fix typo in comment
  12. [NFC] Don't pass redundant arguments
Commit 5189c5b940a1dbce699e407214767f9e5bf77ebf by craig.topper
[X86] Simplify patterns for avx512 vpcmp. NFC

This removes the commuted PatFrags that only existed to carry
an SDNodeXForm in their OperandTransform field. We know all the places
that need to use the commuted SDNodeXForm and there is one transform
shared by signed and unsigned compares. So just hardcode
the SDNodeXForm where it is needed and use the non-commuted PatFrag
in the pattern.

I think when I wrote this I thought the SDNodeXForm name had to
match what is in the PatFrag that is being used. But that's not
true. The OperandTransform is only used when the PatFrag is used
in an instruction pattern and not a separate Pat pattern. All
the commuted cases are Pat patterns.
The file was modified llvm/lib/Target/X86/X86InstrAVX512.td
Commit d12a0f4fc0b518267ecbdfac37481795957f33be by kazu
[GCOV] Drop unnecessary const from return types (NFC)

Identified with readability-const-return-type.
The file was modified llvm/lib/Transforms/Instrumentation/GCOVProfiling.cpp
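
To illustrate why the readability-const-return-type check flags a top-level `const` on a by-value return type, here is a minimal sketch. `Tracker`, `makeConst`, `makePlain` and `example` are hypothetical names invented for this illustration and are not taken from GCOVProfiling.cpp. The sketch shows that a `const` class-type return value cannot bind to a move assignment operator, so the caller silently falls back to a copy.

```cpp
#include <cassert>

// Hypothetical type that records how it was assigned to.
struct Tracker {
  int CopyAssigns = 0;
  int MoveAssigns = 0;
  Tracker() = default;
  Tracker(const Tracker &) = default;
  Tracker(Tracker &&) = default;
  Tracker &operator=(const Tracker &) { ++CopyAssigns; return *this; }
  Tracker &operator=(Tracker &&) noexcept { ++MoveAssigns; return *this; }
};

// Returns a const prvalue: it can only bind to `const Tracker &`.
const Tracker makeConst() { return Tracker(); }

// Returns a non-const prvalue: it can bind to `Tracker &&`.
Tracker makePlain() { return Tracker(); }

void example() {
  Tracker T;
  T = makeConst(); // selects the copy assignment operator
  T = makePlain(); // selects the move assignment operator
  assert(T.CopyAssigns == 1 && T.MoveAssigns == 1);
}
```

Dropping the `const` from the return type does not change what callers may do with the value, which is consistent with the commit being marked NFC.
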
Commit b16c6b2a83d9ba94cde7cc03dfea932077442859 by kazu
[TableGen] Use ListSeparator (NFC)
The file was modified llvm/utils/TableGen/CodeGenSchedule.cpp
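
For readers unfamiliar with the idiom, the sketch below imitates the idea behind `llvm::ListSeparator` using only the standard library. It is a simplified stand-in written for this note, not the LLVM class itself (which works with `StringRef`), and `joinNames` and `example` are hypothetical helpers. The object converts to an empty string the first time it is used and to the separator on every later use, which removes the usual "is this the first element?" flag from printing loops.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the separator idiom (assumption: std::string based).
class ListSeparator {
  bool First = true;
  std::string Separator;

public:
  explicit ListSeparator(std::string Sep = ", ") : Separator(std::move(Sep)) {}

  // Yields "" on the first conversion and the separator afterwards.
  operator std::string() {
    if (First) {
      First = false;
      return std::string();
    }
    return Separator;
  }
};

// Hypothetical helper: joins names without a manual "first element" flag.
std::string joinNames(const std::vector<std::string> &Names) {
  std::string Out;
  ListSeparator LS;
  for (const std::string &Name : Names) {
    std::string Sep = LS; // empty before the first element only
    Out += Sep;
    Out += Name;
  }
  return Out;
}

void example() {
  assert(joinNames({"a", "b", "c"}) == "a, b, c");
}
```
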
Commit c5e90a8857549e4032b9a972cf74452ae12c6b25 by kazu
[AsmPrinter] Use range-based for loops (NFC)
The file was modified llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/ErlangGCPrinter.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp
The file was modified llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
Commit 23753c6088873f01fd32c6f3e3bd03ec7c2f8588 by llvm-project
[Polly] Hide Simplify implementation from header. NFC.

Move SimplifyVisitor from Simplify.h to Simplify.cpp. It is not
relevant for applying the pass in either the NewPM or the legacy PM.
Rename it to SimplifyImpl to account for that.

This is possible because its state does not need to be preserved
between runs, and therefore SimplifyImpl does not need to be held in the
pass object. Instead, SimplifyImpl is only instantiated for the
current Scop: in the NewPM as a function-local variable, and in the
legacy PM inside an llvm::Optional object, because the state must be
preserved between the printScop (invoked by opt -analyze) and the most
recent runOnScop calls.
The file was modified polly/include/polly/Simplify.h
The file was modified polly/lib/Transform/Simplify.cpp
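
The arrangement described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not Polly's code: `Scop` is reduced to a hypothetical list of statement names, the "simplification" just drops empty names, `std::optional` (C++17) stands in for `llvm::Optional`, and `runSimplifyNewPM`, `SimplifyLegacyPass` and `example` are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a Scop: just a list of statement names.
struct Scop {
  std::vector<std::string> Stmts;
};

namespace {
// Implementation class kept out of the header; holds state for one Scop only.
class SimplifyImpl {
  int RemovedCount = 0;

public:
  // Toy "simplification": erase statements with an empty name.
  void run(Scop &S) {
    auto NewEnd = std::remove_if(S.Stmts.begin(), S.Stmts.end(),
                                 [](const std::string &N) { return N.empty(); });
    RemovedCount = static_cast<int>(S.Stmts.end() - NewEnd);
    S.Stmts.erase(NewEnd, S.Stmts.end());
  }
  int removed() const { return RemovedCount; }
};
} // namespace

// New-PM style: the implementation is a function-local variable.
int runSimplifyNewPM(Scop &S) {
  SimplifyImpl Impl;
  Impl.run(S);
  return Impl.removed();
}

// Legacy-PM style: the state must survive until printScop is called.
class SimplifyLegacyPass {
  std::optional<SimplifyImpl> Impl;

public:
  void runOnScop(Scop &S) {
    Impl.emplace(); // fresh state for the current Scop
    Impl->run(S);
  }
  // Returns -1 when no run has happened yet.
  int printScop() const { return Impl ? Impl->removed() : -1; }
  void releaseMemory() { Impl.reset(); }
};

void example() {
  Scop S{{"s0", "", "s1"}};
  SimplifyLegacyPass P;
  P.runOnScop(S);
  assert(P.printScop() == 1);
}
```

The optional wrapper lets the legacy pass start every `runOnScop` from a clean state while still answering a later `printScop` query about the most recent run.
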
Commit e5b0b434f60aa825509df542402e771fd56826eb by carl.ritson
[AMDGPU] Refactor MIMG tables to better handle hardware variants

Add a mimgopc object to represent the opcode, allowing different
opcodes for different hardware variants.
This enables image_atomic_fcmpswap, image_atomic_fmin, and
image_atomic_fmax on GFX10.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D96309
The file was modified llvm/test/MC/AMDGPU/gfx10_asm_mimg.s
The file was modified llvm/lib/Target/AMDGPU/MIMGInstructions.td
The file was modified llvm/test/MC/AMDGPU/gfx7_asm_mimg.s
The file was modified llvm/test/MC/Disassembler/AMDGPU/gfx10_mimg.txt
Commit 984cfdc6ee8b4550238dccf212d786c4ded49cf7 by nullptr.cpp
[clang][cli] Fix gcc warning (NFC)

GCC warning:
```
/llvm-project/clang/lib/Frontend/TestModuleFileExtension.cpp:131:20: warning: ‘llvm::raw_ostream& clang::operator<<(llvm::raw_ostream&, const clang::TestModuleFileExtension&)’ has not been declared within ‘clang’
  131 | llvm::raw_ostream &clang::operator<<(llvm::raw_ostream &OS,
      |                    ^~~~~
In file included from /llvm-project/clang/lib/Frontend/TestModuleFileExtension.cpp:8:
/llvm-project/clang/lib/Frontend/TestModuleFileExtension.h:75:3: note: only here as a ‘friend’
   75 |   operator<<(llvm::raw_ostream &OS, const TestModuleFileExtension &Extension);
      |   ^~~~~~~~
```
The file was modified clang/lib/Frontend/TestModuleFileExtension.cpp
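
The warning above fires when a function that was only declared as a `friend` inside a class is later defined with a namespace-qualified name. One common way to avoid it, sketched below, is to define the operator inside the namespace itself. This is an illustration only: the actual change in the commit may differ, `std::ostream` stands in for `llvm::raw_ostream`, and `demo`, `Extension`, `describe` and `example` are hypothetical names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

namespace demo {
class Extension {
  std::string Name;

  // A friend declaration alone does not make the name visible at
  // namespace scope for a later qualified definition.
  friend std::ostream &operator<<(std::ostream &OS, const Extension &E);

public:
  explicit Extension(std::string N) : Name(std::move(N)) {}
};

// Defining the operator inside the namespace both declares and defines it
// there, instead of writing `std::ostream &demo::operator<<(...)` outside.
std::ostream &operator<<(std::ostream &OS, const Extension &E) {
  return OS << "Extension(" << E.Name << ")";
}
} // namespace demo

// Hypothetical helper that formats an Extension through the operator.
std::string describe(const demo::Extension &E) {
  std::ostringstream SS;
  SS << E;
  return SS.str();
}

void example() {
  assert(describe(demo::Extension("test")) == "Extension(test)");
}
```
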
Commit 6efcc2fd3f138160a710f3c152ee1c54c2e50420 by mkazantsev
[Test] Add negative tests where usub optimization should not apply
The file was modified llvm/test/CodeGen/X86/usub_inc_iv.ll
Commit 3d15b7e7dfc3e2cefc47791d1e8d95909e937842 by mkazantsev
[Codegenprepare][X86] Use usub with overflow opt for IV increment

Function `replaceMathCmpWithIntrinsic` artificially limits the scope
of the optimization by requiring the two instructions to be in
the same block, for two reasons:
- usage of DT for more general check is costly in terms of compile time;
- risk of creating a new value that lives through multiple blocks.

Because of this, two semantically equivalent tests may or may not be
subject to this optimization depending on where the binary operation is located.
See `test/CodeGen/X86/usub_inc_iv.ll` for motivation.

There is one important special case where this limitation is too strict:
when the binary operation is the increment of the induction variable.
As a result, the application of this optimization becomes fragile and highly reliant on
where other passes decide to place the IV increment. In most cases, they place
it at the end of the latch block, killing the optimization opportunity (when in fact it
does not matter where the actual instruction is inserted).

This patch handles this particular case separately.
- The detector does not use the dominator tree and has constant cost;
- The value of IV or IV.next lives through the whole loop in any case, so this should not
  create a new unexpected long-lived value.

As a result, the transform becomes more robust. It also seems to lead to
better code generation in some cases (see `test/CodeGen/X86/lsr-loop-exit-cond.ll`).

Differential Revision: https://reviews.llvm.org/D96119
Reviewed By: spatel, reames
The file was modified llvm/lib/CodeGen/CodeGenPrepare.cpp
The file was modified llvm/test/CodeGen/X86/2020_12_02_decrementing_loop.ll
The file was modified llvm/test/CodeGen/X86/usub_inc_iv.ll
The file was modified llvm/test/CodeGen/X86/lsr-loop-exit-cond.ll
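
As a source-level analogy for the math-plus-compare pattern discussed in this commit, the sketch below contrasts a separate unsigned subtraction and underflow check with a single combined operation. It is not the pass itself, which works on LLVM IR. `SubResult`, `subSeparate`, `subCombined`, `countIterations` and `example` are hypothetical names, and `__builtin_sub_overflow` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cassert>
#include <cstdint>

struct SubResult {
  uint32_t Value; // difference modulo 2^32
  bool Overflow;  // true when the subtraction wrapped around
};

// Two separate operations: the subtraction and the compare that detects wrap.
SubResult subSeparate(uint32_t A, uint32_t B) {
  uint32_t Diff = A - B; // unsigned arithmetic wraps, which is well defined
  bool Borrow = A < B;
  return {Diff, Borrow};
}

// One combined operation, analogous to a usub-with-overflow intrinsic.
SubResult subCombined(uint32_t A, uint32_t B) {
  uint32_t Diff;
  bool Borrow = __builtin_sub_overflow(A, B, &Diff);
  return {Diff, Borrow};
}

// Count-down loop whose induction variable step doubles as the exit test.
unsigned countIterations(uint32_t Start) {
  unsigned N = 0;
  uint32_t IV = Start;
  for (;;) {
    SubResult Next = subCombined(IV, 1);
    if (Next.Overflow) // IV was already 0
      break;
    IV = Next.Value;
    ++N;
  }
  return N;
}

void example() {
  SubResult A = subSeparate(5, 7);
  SubResult B = subCombined(5, 7);
  assert(A.Value == B.Value && A.Overflow && B.Overflow);
  assert(countIterations(3) == 3);
}
```

In the loop, the decrement of the induction variable and the exit condition come from the same operation, which is the situation the commit message singles out.
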
Commit a76761cf0deeb223ca1c0b0e5ee68cfcd436e0c4 by sanjoy
NFC comment-only cleanups

- Remove leftover comment from de2568aab819f
- Fix a typo in a comment
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
Commit bac1f12727835bd8b80ad3db256457ef91eed63b by sanjoy
NFC; fix typo in comment

This should have gone in with a76761cf0deeb223ca1c0b0e5ee68cfcd436e0c4.
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
Commit 8334cdde2e830787029ca819a26b745c47432a64 by aeubanks
[NFC] Don't pass redundant arguments

Some parameters were already part of the Config passed in.
The file was modified llvm/lib/LTO/LTOBackend.cpp