Changes

Changes from Git (git http://labmaster3.local/git/llvm-project.git)

Summary

  1. [AsmParser][ARM] Make .thumb_func imply .thumb
  2. [llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions
  3. [llvm][NFC] Remove remaining deprecated alignment functions from CodeGen
  4. [llvm-dwarfdump] Help option output should be consistent with the command guide
  5. [DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST
  6. [NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves
  7. [X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle
  8. [NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves
  9. AMDGPU: Correct const_index_stride for wave 32 for PAL ABI
  10. [NFC] (test commit) Changed example invocation of C++ for OpenCL
  11. [X86] Ensure we pass DebugLoc by const reference where possible. NFCI.
  12. [SLP] Regenerate tests to reduce diff in D98714. NFCI.
  13. Revert "AMDGPU: Correct const_index_stride for wave 32 for PAL ABI"
  14. [DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts
  15. [DebugInfo] Fix crash when emitting an invalidated SDDbgValue
  16. [NFC] Correctly assert the indents for printEnumValHelpStr.
  17. [OpenCL] Fix optional image types.
  18. [ARM] Transforming memset to Tail predicated Loop
  19. Fix: [DebugInfo] Fix crash when emitting an invalidated SDDbgValue
  20. AMDGPU: Correct const_index_stride for wave 32 for PAL ABI
  21. [AMDGPU] Restrict immediate scratch offsets
  22. Retire TargetRegisterInfo::getSpillAlignment
  23. [DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI.
  24. [CodeGen] Ensure UserValue::getDebugLoc() and UserLabel::getDebugLoc() consistently return a const reference NFCI.
  25. Reapply "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"
  26. [libc++] [test] Test that list::swap/move/move-assign does not invalidate iterators.
  27. [libc++] [test] Simplify arithmetic in list.special/swap.pass.cpp. NFCI.
  28. [libc++] [test] Test that unordered_*::swap/move/assign does not invalidate iterators.
  29. [NFC][X86][MCA] Increase iteration count in reg move elimination tests
  30. [NFC][X86] AMD Zen 3: move sched classes for renameables moves togeter
  31. [X86] AMD Zen 3: throughput for renameable GPR moves is 6
  32. [NFC][X86][MCA] AMD Zen 3: Add tests for renameable SSE XMM moves
  33. [NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX XMM moves
  34. [NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX YMM moves
  35. [X86] AMD Zen 3: SSE XMM moves are zero-cycle
  36. [X86] AMD Zen 3: AVX XMM moves are zero-cycle
  37. [X86] AMD Zen 3: AVX YMM moves are zero-cycle
  38. [X86] AMD Zen 3: throughput for renameable XMM/YMM moves is 6
  39. [NFC][X86][MCA] AMD Zen3 Decrease iteration count in reg-move-elimination tests
  40. [PowerPC] Provide MMA builtins for compatibility
  41. [mlir] Rename BufferAliasAnalysis to BufferViewFlowAnalysis
  42. [mlir][linalg] Remove redundant indexOp builder.
  43. [libomptarget] Add support for target memory allocators to cuda RTL
  44. [AArch64] add test for missed vectorization; NFC
  45. BasicAA: Recognize inttoptr as isEscapeSource
  46. [mlir][spirv] add support lowering of extract_slice to scalar type
  47. [mlir][vector] add pattern to cast away leading unit dim for elementwise op
  48. [libFuzzer] Fix stack overflow detection
  49. [NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move
  50. [X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cycles (PR50261)
  51. [X86] combineXor - limit fold to non-opaque constants (PR50254)
  52. [LoopNest] Consider loop nest with inner loop guard using outer loop
  53. [libFuzzer] Fix stack-overflow-with-asan.test.
  54. [AArch64][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local
  55. [X86] AMD Zen 3: MOVSX32rr32 is a zero-cycle move
  56. [X86] AMD Zen 3: mark XMM/YMM (but not MMX!) reg moves as eliminatible in RegisterFile
  57. lit: revert 134b103fc0f3a995d76398bf4b029d72bebe8162
  58. [libc++][ci] Run longer CI jobs first
  59. Internalize some cl::opt global variables or move them under namespace llvm
  60. Allow empty value list in propagateMetadata(Inst, ArrayOf...)
  61. [unittest] Fix -Wunused-variable after D94717
  62. [WebAssembly] Use functions instead of macros for const SIMD intrinsics
  63. [SCEV] By more careful when traversing phis in isImpliedViaMerge.
  64. Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"
  65. [mlir][docs] remove stale statement about index type in vectors
  66. [mlir] Add a pattern to bufferize linalg.tensor_reshape.
  67. [mlir] Add a pattern to bufferize std.index_cast.
  68. An attempt to abandon omptarget out-of-tree builds.
  69. [RISCV] Consider scalar types for required extensions.
  70. [BareMetal] Ensure that sysroot always comes after library paths
  71. [flang] Implement NORM2 in the runtime
  72. [LV] Rename Region to TargetRegion, similar to SinkRegion (NFC).
  73. [LV] Assert if trying to sink replicate region into another region (NFC)
  74. [SEH] Fix regression with SEH in noexpect functions
  75. [MCA][RegisterFile] Fix register class check for move elimination (PR50265)
  76. [LV] Remove reference of PHI from comment, they are not recorded (NFC).
  77. Revert "[BareMetal] Ensure that sysroot always comes after library paths"
  78. [mlir][vector] Extend pattern to trim lead unit dimension to Splat Op
  79. [mlir] Missed clang-format
  80. [lld/mac] Write every weak symbol only once in the output
Commit f87638338464e7ff9396e92e04e3f5702d479d39 by thatlemon
[AsmParser][ARM] Make .thumb_func imply .thumb

GNU as documentation states that a `.thumb_func` directive implies `.thumb`, so teach the asm parser to switch mode whenever it is encountered. The labeled form, which is exclusive to Apple's toolchain, does not switch mode at all.

Reviewed By: nickdesaulniers, peter.smith

Differential Revision: https://reviews.llvm.org/D101975
The file was modified llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
The file was added llvm/test/MC/ARM/thumb_func-implies-thumb.s
The file was modified lld/test/ELF/arm-ldrlit-err.s
Commit eb1b26ec1d1ac60b2207354fcd003cad40e12b76 by gchatelet
[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions

Differential Revision: https://reviews.llvm.org/D102056
The file was modified llvm/include/llvm/IR/InstrTypes.h
The file was modified llvm/include/llvm/CodeGen/TargetFrameLowering.h
Commit e805b7c2d63c1f8b74f228718a55536f54ddd1c0 by gchatelet
[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen

Differential Revision: https://reviews.llvm.org/D102058
The file was modified llvm/include/llvm/CodeGen/MachineMemOperand.h
The file was modified llvm/include/llvm/CodeGen/SelectionDAGNodes.h
The file was modified llvm/include/llvm/CodeGen/MachineFrameInfo.h
The file was modified llvm/lib/CodeGen/MachineOperand.cpp
Commit f0762fc42f0f4ecf849bef42eed2bb4c0785ea67 by gbreynoo
[llvm-dwarfdump] Help option output should be consistent with the command guide

The dwarfdump command guide shows the short options used as aliases, but
these are not found in the help text unless --show-hidden is used.
Among other tools, some follow this pattern, while others, like
llvm-objdump, show aliases with --help. This change fixes the help output
to be consistent with the command guide. This includes updating alias
descriptions in the help output to use "--".

As part of this change I updated cmdline.test, including some options
that were missing testing.

Differential Revision: https://reviews.llvm.org/D101646
The file was modified llvm/test/tools/llvm-dwarfdump/cmdline.test
The file was modified llvm/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
Commit 0791f968fee259e5c34523167bd58179b8b081c2 by stephen.tozer
[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST

This patch modifies updateDbgUsersToReg to properly handle
DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices
(i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and
updating the register for all matching operands.

Differential Revision: https://reviews.llvm.org/D101523
The file was modified llvm/include/llvm/CodeGen/MachineRegisterInfo.h
The file was modified llvm/lib/CodeGen/MachineCopyPropagation.cpp
The file was added llvm/test/DebugInfo/ARM/machine-cp-updates-dbg-reg.mir
Commit 227678089cf6d8b15d51e58abfefd4f346e9c7f0 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add tests with eliminatible GPR moves
The file was added llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s
Commit 7059b28d5d276cab89815b762d10431329a7da2a by lebedev.ri
[X86] AMD Zen 3: 32/64 -bit GPR register moves are zero-cycle

I've verified this with llvm-exegesis.
This is not limited to zero registers.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
Commit bda9ca3e44c1b67d1c4ed145bb7071c340fe8961 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: add tests with non-eliminatible MMX moves

In Zen3, MMX moves are *not* eliminated;
I've verified this with llvm-exegesis.
The file was added llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s
Commit 442de0c1adf36bfddb5fb66b442bba8999fa733b by david.stuttard
AMDGPU: Correct const_index_stride for wave 32 for PAL ABI

Since there is a single scratch resource descriptor for all shaders, if there is
a wave32 and a wave64 shader (for instance for VsFs pairs)
then the const_index_stride will be incorrect for wave32 shaders.

Differential Revision: https://reviews.llvm.org/D101830

Change-Id: Id8de5566b0d1a07a814e2e7db016df9d20bf6d2c
The file was modified llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll
The file was modified llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
Commit f372ff17f74f99f5e1c021a9c919b33c4caf38d9 by olemarius.strohm
[NFC] (test commit) Changed example invocation of C++ for OpenCL
The file was modified clang/docs/OpenCLSupport.rst
Commit 8e42024f79997827cefe00d31cd3bc55d1551fec by llvm-dev
[X86] Ensure we pass DebugLoc by const reference where possible. NFCI.

Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
Commit 2a3f60b5f5304f61cab3654a6afb67b79ca7df86 by llvm-dev
[SLP] Regenerate tests to reduce diff in D98714. NFCI.
The file was modified llvm/test/Transforms/SLPVectorizer/X86/pr44067.ll
The file was modified llvm/test/Transforms/SLPVectorizer/vectorizable-functions-inseltpoison.ll
The file was modified llvm/test/Transforms/SLPVectorizer/vectorizable-functions.ll
Commit 793b4b26039e461dc3142a3f667ba7c97b0ed920 by david.stuttard
Revert "AMDGPU: Correct const_index_stride for wave 32 for PAL ABI"

This reverts commit 442de0c1adf36bfddb5fb66b442bba8999fa733b.
The file was modified llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll
Commit 280aa3415e408cacc520274fdb948ec9fc63865a by llvm-dev
[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts

Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.

Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.

I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).
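
As an illustration of the idea (not the DAG nodes the patch actually emits), the sketch below models a double-word shift on a value split into `lo` and `hi` parts using funnel shifts. The function names `fshl`, `fshr`, `shl_parts`, and `srl_parts` and the 32-bit default width are assumptions made for this example, and the shift amount is assumed to lie in the range 0 to 2*w - 1.

```python
def fshl(a: int, b: int, c: int, w: int = 32) -> int:
    """Funnel shift left: top w bits of the 2w-bit value a:b shifted left by c mod w."""
    mask = (1 << w) - 1
    c %= w
    return ((((a & mask) << w) | (b & mask)) << c >> w) & mask


def fshr(a: int, b: int, c: int, w: int = 32) -> int:
    """Funnel shift right: low w bits of the 2w-bit value a:b shifted right by c mod w."""
    mask = (1 << w) - 1
    c %= w
    return ((((a & mask) << w) | (b & mask)) >> c) & mask


def shl_parts(lo: int, hi: int, amt: int, w: int = 32) -> tuple:
    """Shift the 2w-bit value hi:lo left by amt; returns (new_lo, new_hi)."""
    mask = (1 << w) - 1
    if amt < w:
        # Bits leaving lo are funnelled into hi.
        return (lo << amt) & mask, fshl(hi, lo, amt, w)
    # Shifting by a full part or more: lo becomes zero, hi comes only from lo.
    return 0, (lo << (amt - w)) & mask


def srl_parts(lo: int, hi: int, amt: int, w: int = 32) -> tuple:
    """Logical right shift of the 2w-bit value hi:lo by amt; returns (new_lo, new_hi)."""
    if amt < w:
        # Bits leaving hi are funnelled into lo.
        return fshr(hi, lo, amt, w), hi >> amt
    return hi >> (amt - w), 0
```

The two branches correspond to the "small" and "large" shift-amount cases; a real lowering would typically pick between them with a select rather than a branch.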

NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.

Differential Revision: https://reviews.llvm.org/D101987
The file was modified llvm/test/CodeGen/AMDGPU/srl.ll
The file was modified llvm/test/CodeGen/AArch64/arm64-long-shift.ll
The file was modified llvm/test/CodeGen/AMDGPU/sra.ll
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.h
The file was modified llvm/lib/Target/AMDGPU/R600ISelLowering.h
The file was modified llvm/include/llvm/CodeGen/TargetLowering.h
The file was modified llvm/test/CodeGen/AMDGPU/fp_to_sint.ll
The file was modified llvm/lib/Target/AMDGPU/R600ISelLowering.cpp
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/shl.ll
The file was modified llvm/test/CodeGen/AMDGPU/fp_to_uint.ll
The file was modified llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
Commit ce0c1f3ced9bccb29c34b87de82c5cdffcbcd457 by stephen.tozer
[DebugInfo] Fix crash when emitting an invalidated SDDbgValue

This patch fixes a crash in the compiler that occurs when certain
invalidated SDDbgValues are emitted. The cause of this was that we would
attempt to check the liveness of the debug value's operands, which
triggers an assert if any of those operands are invalid. This patch
changes this check such that it only occurs if the SDDbgValue is valid;
if not, the check is irrelevant anyway, so can be safely ignored.

Differential Revision: https://reviews.llvm.org/D101540
The file was modified llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
The file was added llvm/test/DebugInfo/Generic/invalidated-dbg-value-is-undef.ll
Commit d9f2960c932c9803e662098e33d899efa3c67f44 by joachim
[NFC] Correctly assert the indents for printEnumValHelpStr.

Only verify that there's no negative indent.
Noted by @chapuni in https://reviews.llvm.org/D93494.

Reviewed By: chapuni

Differential Revision: https://reviews.llvm.org/D102021
The file was modified llvm/lib/Support/CommandLine.cpp
Commit 76f1de10f43ec4d1eb6146c45ccd6f93df5aa3e1 by anastasia.stulova
[OpenCL] Fix optional image types.

This change allows the use of identifiers for image types
from `cl_khr_gl_msaa_sharing` freely in the kernel code if
the extension is not supported since they are not in the
list of the reserved identifiers.

This change also removes the need for a pragma for the types
in the extensions, since the spec does not require the pragma
to be used.

Differential Revision: https://reviews.llvm.org/D100983
The file was modified clang/test/SemaOpenCL/invalid-image.cl
The file was modified clang/test/SemaOpenCL/access-qualifier.cl
The file was modified clang/lib/Sema/SemaType.cpp
The file was modified clang/include/clang/Basic/OpenCLImageTypes.def
The file was modified clang/lib/Parse/ParseDecl.cpp
The file was modified clang/lib/Sema/Sema.cpp
Commit dfe3ffaa4a47ea93cc289b4496c093fbaf73adbc by malhar.jajoo
[ARM] Transforming memset to Tail predicated Loop

This patch converts llvm.memset intrinsic into Tail Predicated
Hardware loops for a target that supports the Arm M-profile
Vector Extension (MVE).

The llvm.memset is converted to a TP loop for both
constant and non-constant input sizes (of llvm.memset).

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100435
The file was modified llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll
The file was modified llvm/lib/Target/ARM/ARMInstrMVE.td
The file was modified llvm/test/CodeGen/Thumb2/mve-phireg.ll
The file was modified llvm/test/CodeGen/Thumb2/mve-tp-loop.ll
The file was modified llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp
The file was modified llvm/lib/Target/ARM/ARMISelLowering.cpp
The file was modified llvm/lib/Target/ARM/ARMSubtarget.h
The file was modified llvm/lib/Target/ARM/ARMISelLowering.h
The file was modified llvm/test/CodeGen/Thumb2/LowOverheadLoops/memcall.ll
The file was modified llvm/test/CodeGen/Thumb2/mve-tp-loop.mir
Commit 14818a86d044909d8eeb1f39f689e2785a09823b by stephen.tozer
Fix: [DebugInfo] Fix crash when emitting an invalidated SDDbgValue

This patch is a fix for revision ce0c1f3c, which caused test failures on
bots without x86 as a registered target. This patch moves the test added
in the prior patch to the x86 folder, so that it only runs on bots with
the correct target available.
The file was removed llvm/test/DebugInfo/Generic/invalidated-dbg-value-is-undef.ll
The file was added llvm/test/DebugInfo/X86/invalidated-dbg-value-is-undef.ll
Commit 606d4e806192013ff7da33351f671d08b4524438 by david.stuttard
AMDGPU: Correct const_index_stride for wave 32 for PAL ABI

Retrying after revert and fix (removed implicit def flag from operand). Now
passes with expensive_checks enabled.

Since there is a single scratch resource descriptor for all shaders, if there is
a wave32 and a wave64 shader (for instance for VsFs pairs)
then the const_index_stride will be incorrect for wave32 shaders.

Differential Revision: https://reviews.llvm.org/D101830

Change-Id: Ie3b8b2921237968caca91527dd0c97b1b0cc0360
The file was modified llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/pal-simple-indirect-call.ll
Commit 13c0316239dc31a34262f2270d0952aa152a9a76 by sebastian.neubauer
[AMDGPU] Restrict immediate scratch offsets

gfx9 does not work with negative offsets; gfx10 works only with
aligned negative offsets, but not with unaligned negative offsets.

This is slightly more conservative than needed: gfx9 does support
negative offsets when a VGPR address is used, and gfx10 supports
negative, unaligned offsets when an SGPR address is used, but we
do not make use of that with this patch.
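
The conservative rule described above can be written as a small predicate. This is a sketch only: the function name, the string generation tags, and the 4-byte alignment are assumptions for illustration (the message says "aligned" without giving a size), and the encodable range of the immediate field is not modelled.

```python
def immediate_scratch_offset_allowed(gen: str, offset: int, alignment: int = 4) -> bool:
    """Model the restriction on negative immediate scratch offsets.

    gen       -- "gfx9" or "gfx10" (hypothetical tags for this sketch)
    offset    -- immediate byte offset
    alignment -- assumed alignment requirement for negative offsets on gfx10
    """
    if gen not in ("gfx9", "gfx10"):
        raise ValueError("only gfx9 and gfx10 are described in the commit message")
    if offset >= 0:
        # The restriction only concerns negative offsets.
        return True
    if gen == "gfx9":
        # Conservatively reject every negative offset.
        return False
    # gfx10: negative offsets are accepted only when aligned.
    return offset % alignment == 0
```

For example, `immediate_scratch_offset_allowed("gfx10", -8)` is accepted while `immediate_scratch_offset_allowed("gfx10", -3)` is rejected.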

Differential Revision: https://reviews.llvm.org/D101292
The file was modified llvm/lib/Target/AMDGPU/AMDGPU.td
The file was modified llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
The file was modified llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
The file was modified llvm/test/CodeGen/AMDGPU/flat-scratch.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
The file was modified llvm/lib/Target/AMDGPU/GCNSubtarget.h
Commit 6248d1119040d5031b248633005998b94b8024d4 by benny.kra
Retire TargetRegisterInfo::getSpillAlignment

getSpillAlign does the same thing.
The file was modified llvm/lib/CodeGen/PrologEpilogInserter.cpp
The file was modified llvm/include/llvm/CodeGen/TargetRegisterInfo.h
The file was modified llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
The file was modified llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp
Commit dd21c6b843b25d2d65daab561fe47b4157c32952 by llvm-dev
[DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI.

Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
The file was modified llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h
Commit c9d4b4173b56c5a56d32d07be660f872b9746f87 by llvm-dev
[CodeGen] Ensure UserValue::getDebugLoc() and UserLabel::getDebugLoc() consistently return a const reference NFCI.

Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.
The file was modified llvm/lib/CodeGen/LiveDebugVariables.cpp
Commit 7bc1dd1191aba77da83f04415ee646cc3381729e by stephen.tozer
Reapply "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"

Reapply b623df3c, which was reverted while reverting a different patch
with a breaking change. There are no underlying issues with this patch,
so no changes have been made to the original patch.

This reverts commit b11e4c990771541e440861f017afea7b4ba162f4.
The file was added llvm/test/DebugInfo/X86/live-debug-vars-loc-limit.ll
The file was modified llvm/lib/CodeGen/LiveDebugVariables.cpp
Commit 8935c8449b7b17049990d29443ed29dde315f281 by arthur.j.odwyer
[libc++] [test] Test that list::swap/move/move-assign does not invalidate iterators.

And remove the dedicated debug-iterator test; we want to test this in all modes.
We have a CI step for testing the whole test suite with `--debug_level=1` now.

Part of https://reviews.llvm.org/D102003
The file was modified libcxx/test/std/containers/sequences/list/list.special/swap.pass.cpp
The file was modified libcxx/test/std/containers/sequences/list/list.cons/assign_move.pass.cpp
The file was modified libcxx/test/std/containers/sequences/list/list.cons/move.pass.cpp
The file was removed libcxx/test/libcxx/containers/sequences/list/list.cons/db_move.pass.cpp
Commit a1f75bf091a20132dc44828a2a9a68d559f922f3 by arthur.j.odwyer
[libc++] [test] Simplify arithmetic in list.special/swap.pass.cpp. NFCI.

Part of https://reviews.llvm.org/D102003
The file was modified libcxx/test/std/containers/sequences/list/list.special/swap.pass.cpp
Commit f42355e17c3f3d1d099d028a388796a64724ffdb by arthur.j.odwyer
[libc++] [test] Test that unordered_*::swap/move/assign does not invalidate iterators.

And remove the dedicated debug-iterator tests; we want to test this in all modes.
We have a CI step for testing the whole test suite with `--debug_level=1` now.

Part of https://reviews.llvm.org/D102003
The file was modified libcxx/test/std/containers/unord/unord.multimap/unord.multimap.cnstr/assign_move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.set/unord.set.cnstr/assign_move.pass.cpp
The file was removed libcxx/test/libcxx/containers/unord/unord.multimap/db_move.pass.cpp
The file was removed libcxx/test/libcxx/containers/unord/unord.multiset/db_move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multimap/unord.multimap.swap/swap_non_member.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multiset/unord.multiset.swap/swap_non_member.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.map/unord.map.cnstr/assign_move.pass.cpp
The file was removed libcxx/test/libcxx/containers/unord/unord.map/db_move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.map/unord.map.swap/swap_non_member.pass.cpp
The file was removed libcxx/test/libcxx/containers/unord/unord.set/db_move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multimap/unord.multimap.cnstr/move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.set/unord.set.cnstr/move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.set/unord.set.swap/swap_non_member.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multiset/unord.multiset.cnstr/assign_move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.map/unord.map.cnstr/move.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multiset/unord.multiset.cnstr/move.pass.cpp
Commit e6d688ec96706c1bbcb27419333828ec61752fab by lebedev.ri
[NFC][X86][MCA] Increase iteration count in reg move elimination tests

So the IPC actually stabilizes at 6.
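
A toy calculation shows why a longer run is needed before the reported IPC settles at 6. It assumes six eliminated moves retire per cycle (the rename width quoted from Agner in this log) plus a fixed number of start-up cycles; the `overhead_cycles` value and the function name are hypothetical and are not taken from llvm-mca.

```python
import math

RENAMES_PER_CYCLE = 6  # six register renamings per clock cycle, per the quoted reference


def modeled_ipc(num_moves: int, overhead_cycles: int = 3) -> float:
    """Instructions per cycle for num_moves eliminated moves.

    overhead_cycles is an assumed constant pipeline fill/drain cost;
    it is what keeps short runs below the steady-state IPC of 6.
    """
    cycles = math.ceil(num_moves / RENAMES_PER_CYCLE) + overhead_cycles
    return num_moves / cycles
```

Under these assumptions a run of 6 moves yields an IPC of 1.5, while a run of 60000 moves yields an IPC within 0.01 of 6.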
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s
Commit c3cd8ed0097b07e5454255ffe5899ded21ca0bff by lebedev.ri
[NFC][X86] AMD Zen 3: move sched classes for renameables moves togeter
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
Commit d8c6202576771f0e1478b3abdd246600caf7d704 by lebedev.ri
[X86] AMD Zen 3: throughput for renameable GPR moves is 6

They are resolved at the register rename stage without
using any execution units.
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-gpr.s
Commit cbabe4f4d62a6bcee206e0673de559805a092420 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable SSE XMM moves
The file was added llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s
Commit bcbfc22ff9b2f16d77489b0ce34e8d96e4f9ae5b by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX XMM moves
The file was added llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s
Commit 0d961fbd525cb7df3e981d6469b81cbf8f5e5883 by lebedev.ri
[NFC][X86][MCA] AMD Zen 3: Add tests for renameable AVX YMM moves
The file was added llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s
Commit 9db4203883f57f34e7e88fd6deb761ef8a9f7d5a by lebedev.ri
[X86] AMD Zen 3: SSE XMM moves are zero-cycle

I've verified this with llvm-exegesis.
This is not limited to zero registers.

Refs:
AMD SOG 19h, 2.9.4 Zero Cycle Move
The processor is able to execute certain register to register
mov operations with zero cycle delay.

Agner,
22.13 Instructions with no latency
Register-to-register move instructions are resolved at
the register rename stage without using any execution units.
These instructions have zero latency. It is possible to do six such
register renamings per clock cycle, and it is even possible to
rename the same register multiple times in one clock cycle.
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
Commit ee020b930d1299acf42b759dd15a44d2020ef963 by lebedev.ri
[X86] AMD Zen 3: AVX XMM moves are zero-cycle

I've verified this with llvm-exegesis.
This is not limited to zero registers.
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
Commit 715c0d0bd412141e0404d5bfcad4dddac3bfc0d0 by lebedev.ri
[X86] AMD Zen 3: AVX YMM moves are zero-cycle

I've verified this with llvm-exegesis.
This is not limited to zero registers.
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s
Commit 758c173309edbd6ac3958eb08dc01b6524badff8 by lebedev.ri
[X86] AMD Zen 3: throughput for renameable XMM/YMM moves is 6

They are resolved at the register rename stage without
using any execution units.
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/resources-sse1.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/resources-avx1.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s
The file was modified llvm/lib/Target/X86/X86ScheduleZnver3.td
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/resources-sse2.s
Commit 34de155f7e335e9e69276356565dcc31ed7d8535 by lebedev.ri
[NFC][X86][MCA] AMD Zen3 Decrease iteration count in reg-move-elimination tests

Drop it just enough so it still produces the right IPC.
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-sse-xmm.s
The file was modified llvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-mmx.s
Commit 25bbff632d018d178272a61c0732203d53d3a2e3 by saghir
[PowerPC] Provide MMA builtins for compatibility

Vector pair intrinsics and builtins were renamed in
https://reviews.llvm.org/D91974 to replace the _mma_ prefix by _vsx_.
However, some projects used the _mma_ version, so this patch adds
these intrinsics to provide compatibility.

Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=50159

Reviewed By: nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D100482
The file was modified clang/include/clang/Basic/BuiltinsPPC.def
The file was modified clang/test/CodeGen/builtins-ppc-pair-mma.c
The file was modified clang/lib/Sema/SemaChecking.cpp
The file was modified clang/lib/CodeGen/CGBuiltin.cpp
Commit faab8c140ab2480d978ccc3ea11cbc3b279736b6 by tpopp
[mlir] Rename BufferAliasAnalysis to BufferViewFlowAnalysis

This is to make the difference between this and
an AliasAnalysis clearer.

For example, given a sequence of subviews that create values
A -> B -> C -> D:
BufferViewFlowAnalysis::resolve(B) => {B, C, D}
AliasAnalysis::resolve(B) => {A, B, C, D}
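
The distinction in this example can be reproduced with a toy graph model. This is not the MLIR API: `view_flow_resolve` and `alias_resolve` are hypothetical helpers, and the alias side is simplified to "connected through any chain of subviews in either direction", which is enough to reproduce the two result sets above.

```python
from collections import defaultdict


def _reachable(adjacency, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def view_flow_resolve(edges, start):
    """Follow source -> view edges only: values derived from start."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    return _reachable(adjacency, start)


def alias_resolve(edges, start):
    """Follow edges in both directions: everything sharing storage with start."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
        adjacency[dst].append(src)
    return _reachable(adjacency, start)


# The subview chain from the commit message: A -> B -> C -> D.
chain = [("A", "B"), ("B", "C"), ("C", "D")]
```

With this chain, `view_flow_resolve(chain, "B")` gives the set of B, C, and D, while `alias_resolve(chain, "B")` also includes A.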

Differential Revision: https://reviews.llvm.org/D100838
The file was modified mlir/lib/Analysis/CMakeLists.txt
The file was modified mlir/include/mlir/Transforms/Bufferize.h
The file was added mlir/include/mlir/Analysis/BufferViewFlowAnalysis.h
The file was modified mlir/lib/Transforms/BufferOptimizations.cpp
The file was removed mlir/include/mlir/Analysis/BufferAliasAnalysis.h
The file was modified mlir/include/mlir/Transforms/BufferUtils.h
The file was modified mlir/lib/Transforms/BufferDeallocation.cpp
The file was removed mlir/lib/Analysis/BufferAliasAnalysis.cpp
The file was added mlir/lib/Analysis/BufferViewFlowAnalysis.cpp
Commit f31531a30b124042d8523b7d50053ade82659c5b by gysit
[mlir][linalg] Remove redundant indexOp builder.

Remove the builder signature taking a signed dimension identifier.

Reviewed By: ergawy

Differential Revision: https://reviews.llvm.org/D102055
The file was modified mlir/include/mlir/Dialect/Linalg/IR/LinalgOps.td
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
The file was modified mlir/lib/Dialect/Linalg/Transforms/Interchange.cpp
Commit a15f8589f4e81973b096a5ccc7b5b687c3284ebe by huberjn
[libomptarget] Add support for target memory allocators to cuda RTL

Summary:
The allocator interface added in D97883 allows the RTL to allocate shared and
host-pinned memory from the cuda plugin. This patch adds support for these to
the runtime.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D102000
The file was added openmp/libomptarget/test/api/omp_host_pinned_memory.c
The file was modified openmp/libomptarget/plugins/cuda/src/rtl.cpp
The file was modified openmp/libomptarget/plugins/common/MemoryManager/MemoryManager.h
The file was added openmp/libomptarget/test/api/omp_device_managed_memory.c
Commit 0a6f11aabdd3f116b603694a0d4f9abbba62ade4 by spatel
[AArch64] add test for missed vectorization; NFC

This is a reduction of the example in:
https://llvm.org/PR50256
The file was added llvm/test/Transforms/SLPVectorizer/AArch64/widen.ll
Commit bc302bfbef84bd778a9e5e0a1b5851c6a55c1d9c by jotrem
BasicAA: Recognize inttoptr as isEscapeSource

Pointers escape when converted to integers, so a pointer produced by
converting an integer to a pointer must not be a local non-escaping
object.

Reviewed By: nikic, nlopes, aqjune

Differential Revision: https://reviews.llvm.org/D101541
The file was added llvm/test/Analysis/BasicAA/noalias-inttoptr.ll
The file was modified llvm/lib/Analysis/BasicAliasAnalysis.cpp
Commit 565ee6afc707d5744d0ec90936f0c0564c1acf69 by thomasraoux
[mlir][spirv] add support lowering of extract_slice to scalar type

Differential Revision: https://reviews.llvm.org/D102041
The file was modifiedmlir/test/Conversion/VectorToSPIRV/simple.mlir
The file was modifiedmlir/lib/Conversion/VectorToSPIRV/VectorToSPIRV.cpp
Commit a970e69d6b62d60c4c222e2a4be0a73999c97651 by thomasraoux
[mlir][vector] add pattern to cast away leading unit dim for elementwise op

Differential Revision: https://reviews.llvm.org/D102034
The file was modifiedmlir/lib/Dialect/Vector/VectorTransforms.cpp
The file was modifiedmlir/test/Dialect/Vector/vector-transforms.mlir
Commit 70cbc6dbef7048d3b1aa89a676d96c6ba075b41b by mascasa
[libFuzzer] Fix stack overflow detection

Address sanitizer can detect stack exhaustion via its SEGV handler, which is
executed on a separate stack using the sigaltstack mechanism. When libFuzzer is
used with address sanitizer, it installs its own signal handlers which defer to
those put in place by the sanitizer before performing additional actions. In the
particular case of a stack overflow, the current setup fails because libFuzzer
doesn't preserve the flag for executing the signal handler on a separate stack:
when we run out of stack space, the operating system can't run the SEGV handler,
so address sanitizer never reports the issue. See the included test for an
example.

This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag
when installing its signal handlers; the dedicated signal-handler stack set up
by the sanitizer runtime appears to be large enough to support the additional
frames from the fuzzer.
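
A minimal POSIX sketch of the idea follows. It is not libFuzzer's actual code; the helper names `installPreservingOnStack` and `usesAlternateStack` are hypothetical. It queries the action already installed for a signal and carries its SA_ONSTACK bit over to the replacement action.

```cpp
#include <cassert>
#include <signal.h>

// Signature of an SA_SIGINFO-style handler.
using SigInfoHandler = void (*)(int, siginfo_t *, void *);

// Hypothetical helper: install `handler` for `signum`, carrying over
// SA_ONSTACK if the previously installed action (for example one set up
// by a sanitizer runtime) requested the alternate signal stack.
bool installPreservingOnStack(int signum, SigInfoHandler handler) {
  assert(handler != nullptr && "handler must be a valid function");

  struct sigaction previous = {};
  if (sigaction(signum, nullptr, &previous) != 0)
    return false;

  struct sigaction replacement = {};
  sigemptyset(&replacement.sa_mask);
  replacement.sa_sigaction = handler;
  replacement.sa_flags = SA_SIGINFO | (previous.sa_flags & SA_ONSTACK);
  return sigaction(signum, &replacement, nullptr) == 0;
}

// Report whether the action currently installed for `signum` asks for the
// handler to run on the alternate signal stack.
bool usesAlternateStack(int signum) {
  struct sigaction current = {};
  if (sigaction(signum, nullptr, &current) != 0)
    return false;
  return (current.sa_flags & SA_ONSTACK) != 0;
}
```

The flag only has an effect if an alternate stack has been registered with sigaltstack; the sketch assumes the earlier handler's owner has done that.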

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101824
The file was modifiedcompiler-rt/lib/fuzzer/FuzzerUtilPosix.cpp
The file was addedcompiler-rt/test/fuzzer/stack-overflow-with-asan.test
The file was addedcompiler-rt/test/fuzzer/StackOverflowTest.cpp
Commit a8e30e63aca0e9c61f956e61303ae3694cf00f2c by lebedev.ri
[NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move
The file was addedllvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-x87.s
Commit 2819009b5aa9725aebba63e8722e31943a7fb36f by lebedev.ri
[X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cycles (PR50261)

Sometimes the disassembler picks _REV variants of instructions
over the plain ones, which in this case exposed an issue
that the _REV variants aren't being modelled as optimizable moves.
The file was modifiedllvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-xmm.s
The file was modifiedllvm/test/tools/llvm-mca/X86/Znver3/reg-move-elimination-avx-ymm.s
The file was modifiedllvm/lib/Target/X86/X86ScheduleZnver3.td
Commit f744723f7538934e0beb5d8a2267afeb86345986 by llvm-dev
[X86] combineXor - limit fold to non-opaque constants (PR50254)

Ensure we don't try to fold when one might be an opaque constant - the constant fold will fail and then the reverse fold will happen in DAGCombine.....
The file was modifiedllvm/lib/Target/X86/X86ISelLowering.cpp
The file was addedllvm/test/CodeGen/X86/pr50254.ll
Commit 1006ac3963eaf39153d6637b631662e87ebf3b4d by whitneyt
[LoopNest] Consider loop nest with inner loop guard using outer loop
induction variable to be perfect

This patch allows more conditional branches to be considered loop
guards, so more loop nests can be considered perfect.
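
As a source-level illustration only, the following sketch shows a loop nest whose inner loop is protected by a condition on the outer induction variable. The function name is hypothetical, and whether LoopNestAnalysis classifies the resulting IR as perfect depends on what the front end and earlier passes produce; that is an assumption, not something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum the entries strictly below the diagonal of an n x n matrix stored in
// row-major order. The inner loop runs only when the guard on the outer
// induction variable `i` holds.
long sumBelowDiagonal(const std::vector<long> &data, std::size_t n) {
  assert(data.size() == n * n && "data must hold an n x n matrix");
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (i > 0) {  // Guard that depends on the outer induction variable.
      for (std::size_t j = 0; j < i; ++j)
        sum += data[i * n + j];
    }
  }
  return sum;
}
```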

Reviewed By: bmahjour, sidbav

Differential Revision: https://reviews.llvm.org/D94717
The file was modifiedllvm/test/Analysis/LoopNestAnalysis/perfectnest.ll
The file was modifiedllvm/unittests/Analysis/LoopInfoTest.cpp
The file was modifiedllvm/include/llvm/Analysis/LoopNestAnalysis.h
The file was modifiedllvm/lib/Analysis/LoopNestAnalysis.cpp
The file was modifiedllvm/lib/Analysis/LoopInfo.cpp
The file was modifiedllvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll
Commit f09414499c4717b66baa9581c641e8a636e5dcc1 by mascasa
[libFuzzer] Fix stack-overflow-with-asan.test.

Fix function return type and remove check for SUMMARY, since it doesn't
seem to be output on Windows.
The file was modifiedcompiler-rt/test/fuzzer/stack-overflow-with-asan.test
The file was modifiedcompiler-rt/test/fuzzer/StackOverflowTest.cpp
Commit 6a2850f3fc24cc53da6543ee98bd837007c65725 by i
[AArch64][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local

Similar to X86 D73230 & 46788a21f9152be3950e57dc526454655682bdd4

With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode,
for default visibility external linkage non-ifunc-non-COMDAT definitions.

For such dso_local definitions, variable access/taking the address of a
function/calling a function will go through a local alias to avoid GOT/PLT.

Note: the 'S' inline assembly constraint refers to an absolute symbolic address
or a label reference (D46745).

Differential Revision: https://reviews.llvm.org/D101872
The file was addedllvm/test/CodeGen/AArch64/semantic-interposition-asm.ll
The file was addedllvm/test/CodeGen/AArch64/elf-preemption.ll
The file was modifiedllvm/test/CodeGen/AArch64/basic-pic.ll
The file was modifiedllvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
The file was modifiedllvm/lib/Target/AArch64/AArch64MCInstLower.cpp
The file was modifiedllvm/test/CodeGen/AArch64/elf-globals-static.ll
Commit 5b1610a25054b308d02be8882dd34bed3dc29ef4 by lebedev.ri
[X86] AMD Zen 3: MOVSX32rr32 is a zero-cycle move

It measures as such, and the reference docs agree.

I can't easily add an MCA test, because there's no mnemonic for it,
it can only be disassembled or created as a MCInst.
The file was modifiedllvm/lib/Target/X86/X86ScheduleZnver3.td
Commit b8701dc1749e228b886e53bdb32eeebba00e30da by lebedev.ri
[X86] AMD Zen 3: mark XMM/YMM (but not MMX!) reg moves as eliminatible in RegisterFile
The file was modifiedllvm/lib/Target/X86/X86ScheduleZnver3.td
Commit d319005a3746a7661c8c9a3302266b6ff7cf61be by Saleem Abdulrasool
lit: revert 134b103fc0f3a995d76398bf4b029d72bebe8162

Revert the 32-process cap on Windows.  When testing with Swift, we found
that there was a time reduction for testing with the higher load.  This
should hopefully not matter much in practice.  In the case that the
original problem with python remains with a high subprocess count, we
can easily revert this change.
The file was modifiedllvm/utils/lit/lit/util.py
Commit 8002c5d65fdc979fc2f4fa33509f6c32caca3dce by Louis Dionne
[libc++][ci] Run longer CI jobs first

Jobs that test with a more recent standard version run more tests, so
they take longer. We'll decrease the average latency by running them
first instead of last.
The file was modifiedlibcxx/utils/ci/buildkite-pipeline.yml
Commit d8aba75a768033c326613d85e8789703cb4565d2 by i
Internalize some cl::opt global variables or move them under namespace llvm
The file was modifiedllvm/lib/MC/MCAsmInfoXCOFF.cpp
The file was modifiedllvm/lib/Analysis/BlockFrequencyInfoImpl.cpp
The file was modifiedllvm/lib/Analysis/CallGraphSCCPass.cpp
The file was modifiedllvm/lib/CodeGen/MachineBlockPlacement.cpp
The file was modifiedllvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
The file was modifiedllvm/lib/LTO/SummaryBasedOptimizations.cpp
The file was modifiedllvm/include/llvm/Transforms/Utils/SizeOpts.h
The file was modifiedllvm/lib/Passes/PassBuilder.cpp
The file was modifiedllvm/lib/Transforms/IPO/SyntheticCountsPropagation.cpp
The file was modifiedllvm/lib/Analysis/BlockFrequencyInfo.cpp
The file was modifiedllvm/lib/MC/MCAsmInfo.cpp
The file was modifiedllvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
The file was modifiedllvm/lib/Analysis/AliasAnalysis.cpp
The file was modifiedpolly/lib/Analysis/ScopDetectionDiagnostic.cpp
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was modifiedllvm/lib/Transforms/IPO/BlockExtractor.cpp
The file was modifiedllvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
The file was modifiedllvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
The file was modifiedllvm/lib/Transforms/Utils/SizeOpts.cpp
The file was modifiedllvm/tools/opt/NewPMDriver.cpp
The file was modifiedllvm/lib/CodeGen/MachineBranchProbabilityInfo.cpp
The file was modifiedllvm/tools/opt/opt.cpp
The file was modifiedllvm/unittests/Analysis/AssumeBundleQueriesTest.cpp
The file was modifiedllvm/lib/CodeGen/MachineBlockFrequencyInfo.cpp
The file was modifiedllvm/lib/Transforms/IPO/PassManagerBuilder.cpp
Commit 50cf0a1d1ae48bd0397b41a400e01c62975b6706 by kparzysz
Allow empty value list in propagateMetadata(Inst, ArrayOf...)

This will allow writing
  propagateMetadata(Inst, collectInterestingValues(...))
without concern about empty lists. In case of an empty list,
Inst is returned without any changes.
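
The early return can be sketched with a simplified stand-in. `FakeInst` and `propagateCommonMetadata` are hypothetical names, and modelling propagation as a plain intersection of string tags is an assumption made for illustration; this is not the LLVM implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction that carries metadata kinds.
struct FakeInst {
  std::vector<std::string> metadata;
};

// Keep on `inst` only the kinds shared by every entry of `values`.
// With an empty list, `inst` is returned without any changes.
FakeInst *propagateCommonMetadata(FakeInst *inst,
                                  const std::vector<const FakeInst *> &values) {
  assert(inst != nullptr && "inst must not be null");
  if (values.empty())
    return inst;

  std::vector<std::string> common = values.front()->metadata;
  for (const FakeInst *value : values) {
    const std::vector<std::string> &kinds = value->metadata;
    // Drop every kind that this value does not carry.
    common.erase(std::remove_if(common.begin(), common.end(),
                                [&kinds](const std::string &kind) {
                                  return std::find(kinds.begin(), kinds.end(),
                                                   kind) == kinds.end();
                                }),
                 common.end());
  }
  inst->metadata = common;
  return inst;
}
```

With the guard in place, a caller can pass whatever a collection helper returns without first checking whether it is empty.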
The file was modifiedllvm/lib/Analysis/VectorUtils.cpp
Commit 724604901a104d8ba9e48ca0330e164a66c1c7ac by i
[unittest] Fix -Wunused-variable after D94717
The file was modifiedllvm/unittests/Analysis/LoopInfoTest.cpp
Commit 1e9c39a3f982fe2f50cd19c74be8b64dfba4baad by tlively
[WebAssembly] Use functions instead of macros for const SIMD intrinsics

To improve hygiene, consistency, and usability, it would be good to replace all
the macro intrinsics in wasm_simd128.h with functions. The reason for using
macros in the first place was to enforce the use of constants for some arguments
using `_Static_assert` with `__builtin_constant_p`. This commit switches to
using functions and uses the `__diagnose_if__` attribute rather than
`_Static_assert` to enforce constantness.

The remaining macro intrinsics cannot be made into functions until the builtin
functions they are implemented with can be replaced with normal code patterns
because the builtin functions themselves require that their arguments are
constants.

This commit also fixes a bug with the const_splat intrinsics in which the f32x4
and f64x2 variants were incorrectly producing integer vectors.

Differential Revision: https://reviews.llvm.org/D102018
The file was modifiedclang/lib/Headers/wasm_simd128.h
The file was modifiedclang/test/Headers/wasm.c
Commit 6c99e631201aaea0a75708749cbaf2ba08a493f9 by flo
[SCEV] Be more careful when traversing phis in isImpliedViaMerge.

I think currently isImpliedViaMerge can incorrectly return true for phis
in a loop/cycle, if the found condition involves the previous value of

Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications)
isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).

The existing code tries to prove IncV <= -1 for all incoming values
IncV using the found condition (%call <= -1). At the moment this succeeds,
but only because it does not compare the same runtime value. The found
condition checks the value of the last iteration, but the incoming value
is from the *previous* iteration.

Hence we incorrectly determine that the *previous* value was <= -1,
which may not be true.

I think we need to be more careful when looking at the incoming values
here. In particular, we need to rule out that a found condition refers to
any value that may refer to one of the previous iterations. I'm not sure
there's a reliable way to do so (that also works for irreducible control
flow).

So for now this patch adds an additional requirement that the incoming
value must properly dominate the phi block. This should ensure the
values do not change in a cycle. I am not entirely sure if it will catch
all cases and I appreciate a thorough second look in that regard.

Alternatively we could also unconditionally bail out in this case,
instead of checking the incoming values.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101829
The file was modifiedllvm/test/Transforms/IndVarSimplify/eliminate-exit.ll
The file was modifiedllvm/test/Transforms/IRCE/decrementing-loop.ll
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit 7ca26c5fa2df253878cab22e1e2f0d6f1b481218 by aeubanks
Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"

This reverts commit 0791f968fee259e5c34523167bd58179b8b081c2.

Causing crashes: https://crbug.com/1206764
The file was removedllvm/test/DebugInfo/ARM/machine-cp-updates-dbg-reg.mir
The file was modifiedllvm/lib/CodeGen/MachineCopyPropagation.cpp
The file was modifiedllvm/include/llvm/CodeGen/MachineRegisterInfo.h
Commit 21db1e3b01402678994a291930eadf82187750c4 by gysit
[mlir][docs] remove stale statement about index type in vectors

b614ada0e8 ("[mlir] add support for index type in vectors.") removed
this limitation.

Differential Revision: https://reviews.llvm.org/D102081
The file was modifiedmlir/include/mlir/IR/BuiltinTypes.td
Commit a3f22d020b2709b2b4897ae3450c33834e646329 by pifon
[mlir] Add a pattern to bufferize linalg.tensor_reshape.

Differential Revision: https://reviews.llvm.org/D102089
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
The file was modifiedmlir/test/Dialect/Linalg/bufferize.mlir
Commit 3444996b4c45f6efdd731100e8ca6c6105407045 by pifon
[mlir] Add a pattern to bufferize std.index_cast.

Differential Revision: https://reviews.llvm.org/D102088
The file was modifiedmlir/test/Dialect/Standard/bufferize.mlir
The file was modifiedmlir/lib/Dialect/StandardOps/Transforms/Bufferize.cpp
Commit f2f88f3e7a110b2d4d9da446e45f0dba040e62b2 by vyacheslav.p.zakharin
An attempt to abandon omptarget out-of-tree builds.

I want to start using LLVM component libraries in libomptarget
to stop duplicating implementations already available in LLVM
(e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM
in all libomptarget builds one has to provide fallback implementation
for each used LLVM feature.

This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget.

I understand that I may need to revert this,
if this affects downstream projects in a bad way.

Differential Revision: https://reviews.llvm.org/D101509
The file was modifiedopenmp/libomptarget/src/CMakeLists.txt
The file was modifiedopenmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
The file was modifiedopenmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake
The file was modifiedopenmp/README.rst
The file was modifiedopenmp/CMakeLists.txt
Commit c04c66d705b4f6e95a6325ef6d6c647ebc622165 by kai.wang
[RISCV] Consider scalar types for required extensions.

We have vector operations on double vector and float scalar. For
example, vfwadd.wf is such an instruction.

vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);

We should specify F and D extensions for it.

Differential Revision: https://reviews.llvm.org/D102051
The file was modifiedclang/utils/TableGen/RISCVVEmitter.cpp
Commit 6b00b34b8a05896f79b18a1963811299b83d5b21 by phosek
[BareMetal] Ensure that sysroot always comes after library paths

This addresses an issue introduced in D91559. We would invoke the
compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both
locations contain libraries with the same name, but we expect the linker
to pick up the library in path/to/lib since that version is more
specialized. This was the case before D91559 where the sysroot path
would be ignored, but after that change the linker would pick up the
library from the sysroot, which resulted in unexpected behavior.

The sysroot path should always come after any user provided library
paths, followed by compiler runtime paths. We want libraries in user
provided library paths to always take precedence over sysroot libraries.
This matches the behavior of other toolchains used with other targets.

Differential Revision: https://reviews.llvm.org/D102049
The file was modifiedclang/test/Driver/baremetal-sysroot.cpp
The file was modifiedclang/lib/Driver/ToolChains/BareMetal.cpp
Commit 01c78a0b0764e5c254c745a21c35f7950b6c8816 by pklausler
[flang] Implement NORM2 in the runtime

Implement the reduction transformational intrinsic function NORM2 in
the runtime, using infrastructure already in place for MAXVAL & al.

Differential Revision: https://reviews.llvm.org/D102024
The file was modifiedflang/runtime/extrema.cpp
The file was modifiedflang/runtime/reduction.cpp
The file was modifiedflang/runtime/reduction.h
The file was modifiedflang/unittests/RuntimeGTest/Reduction.cpp
Commit 01c26d4e048cf9812e7675cb704c2a4461b68e4c by flo
[LV] Rename Region to TargetRegion, similar to SinkRegion (NFC).

Adjust the name to make it clearer this is the region containing the
target recipe, similar to SinkRegion below.

Suggested post-commit for ccebf7a1096a.
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Commit 337d7652823f59f4613552cebdf81292bf8f393d by flo
[LV] Assert if trying to sink replicate region into another region (NFC)

Currently sinking a replicate region into another replicate region is
not supported. Add an assert, to make the problem more obvious, should
it occur.

Discussed post-commit for ccebf7a1096a.
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Commit c4adc49a1c988e6ea8a340b6245525ef5599812c by rnk
[SEH] Fix regression with SEH in noexcept functions

Commit 5baea0560160a693b19022c5d0ba637b6b46b2d8 set the CurCodeDecl
because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField.
But this was not the right thing to do, as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec,
which causes corruption of the EHStack.

Revert the part of the commit that changes the CurCodeDecl, and instead
adjust the assert to check for a null CurCodeDecl.

Differential Revision: https://reviews.llvm.org/D102027
The file was modifiedclang/lib/CodeGen/CGExpr.cpp
The file was modifiedclang/lib/CodeGen/CGException.cpp
The file was modifiedclang/test/CodeGenCXX/exceptions-seh.cpp
Commit 3822ac909ead8f41ebc81e382bb01908bf04f407 by andrea.dibiagio
[MCA][RegisterFile] Fix register class check for move elimination (PR50265)

The register file should always check if the destination register is from a
register class that allows move elimination.

Before this change, the check on the register class was only performed in a few
very specific cases. However, it should have always been performed.
This patch fixes the issue.

Note that none of the upstream scheduling models is currently affected by this
bug, so there is no test for it. The issue was found by Roman while working on
the znver3 model. I was able to reproduce the issue locally by tweaking the
btver2 model. I then verified that this patch fixes the issue.
The file was modifiedllvm/lib/MCA/HardwareUnits/RegisterFile.cpp
Commit 75b9997760c69968863740ded6c89d4faf29ca7f by flo
[LV] Remove reference of PHI from comment, they are not recorded (NFC).

The comment incorrectly states that the PHI is recorded. That's not
accurate, only the recipe for the incoming value is recorded.

Suggested post-commit for 4ba8720f8844.
The file was modifiedllvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Commit f97ada27aaf64207a2ffad937ce3ccf009e81bd8 by phosek
Revert "[BareMetal] Ensure that sysroot always comes after library paths"

This reverts commit 6b00b34b8a05896f79b18a1963811299b83d5b21.
The file was modifiedclang/test/Driver/baremetal-sysroot.cpp
The file was modifiedclang/lib/Driver/ToolChains/BareMetal.cpp
Commit d0453a8933a14c9441b2d89e6f934bd1bc243200 by thomasraoux
[mlir][vector] Extend pattern to trim lead unit dimension to Splat Op

Differential Revision: https://reviews.llvm.org/D102091
The file was modifiedmlir/test/Dialect/Vector/vector-transforms.mlir
The file was modifiedmlir/lib/Dialect/Vector/VectorTransforms.cpp
Commit b90b66bcbe3ec909d386d3d546cd116099619641 by thomasraoux
[mlir] Missed clang-format
The file was modifiedmlir/lib/Dialect/Vector/VectorTransforms.cpp
Commit d5a70db1938c06380bdab033b7d47a7437914f4c by thakis
[lld/mac] Write every weak symbol only once in the output

Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function to the output. With this
patch, it only writes one copy.

Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:

        N           Min           Max        Median           Avg        Stddev
    x  10     3.9957051     4.3496981     4.1411121      4.156837    0.10092097
    +  10      3.908154      4.169318     3.9712729     3.9846753   0.075773012
    Difference at 95.0% confidence
            -0.172162 +/- 0.083847
            -4.14165% +/- 2.01709%
            (Student's t, pooled s = 0.0892373)
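
The point estimates in the table can be rechecked from the two reported averages. The helper below is a hypothetical sketch that reproduces only the absolute and relative difference; it does not compute the confidence interval, which would need the pooled standard deviation and a Student's t quantile.

```cpp
#include <cassert>

// Absolute and relative change between two average timings.
struct Change {
  double absolute;  // patched minus baseline, in the table's time unit
  double percent;   // absolute change relative to the baseline, in percent
};

Change relativeChange(double baselineAvg, double patchedAvg) {
  assert(baselineAvg != 0.0 && "baseline must be non-zero");
  const double absolute = patchedAvg - baselineAvg;
  return {absolute, 100.0 * absolute / baselineAvg};
}
```

Feeding in the averages from the table, `relativeChange(4.156837, 3.9846753)` gives roughly -0.172162 and -4.14165%, matching the reported difference.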

Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)
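
The mechanism described above can be sketched with simplified stand-ins. `FakeSection`, `FakeSymbolTable`, and `sectionsToWrite` are hypothetical names rather than lld's classes, the sketch assumes the first definition seen wins, and it leaves out the check for sections that are still referenced from other symbols.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an input section.
struct FakeSection {
  std::string contents;
  bool canOmitFromOutput = false;
};

// Hypothetical symbol table that keeps one section per weak symbol name.
class FakeSymbolTable {
public:
  // The first definition of `name` is kept. Any later duplicate loses the
  // merge, so its section is flagged as omittable.
  void addWeakDefinition(const std::string &name, FakeSection *section) {
    assert(section != nullptr && "section must not be null");
    auto result = definitions.emplace(name, section);
    if (!result.second)
      section->canOmitFromOutput = true;
  }

private:
  std::unordered_map<std::string, FakeSection *> definitions;
};

// The writer skips every section that was flagged during merging.
std::vector<const FakeSection *>
sectionsToWrite(const std::vector<FakeSection *> &inputs) {
  std::vector<const FakeSection *> kept;
  for (const FakeSection *section : inputs)
    if (!section->canOmitFromOutput)
      kept.push_back(section);
  return kept;
}
```

Keeping a single flag on the section, rather than a pointer to the surviving copy, is enough for the writer to decide what to skip.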

Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
  Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
  (that is, catch block unwind information) and Personality Routines
  associated with weak functions are still not stripped. This is wasteful,
  but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
  correctness and not just for size)
- This nopes out on InputSections that are referenced from more than
  one symbol (eg from .alt_entry) for now

Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
  found it a bit more explicit)
- exports

Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).

This patch is useful in itself, but it's likely also a useful foundation
for dead_strip.

I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.

Differential Revision: https://reviews.llvm.org/D102076
The file was modifiedlld/MachO/SymbolTable.cpp
The file was modifiedlld/MachO/InputFiles.cpp
The file was modifiedlld/MachO/InputSection.cpp
The file was modifiedlld/MachO/Writer.cpp
The file was addedlld/test/MachO/weak-definition-gc.s
The file was modifiedlld/MachO/InputSection.h
The file was modifiedlld/MachO/Symbols.h
The file was modifiedlld/MachO/UnwindInfoSection.cpp
The file was modifiedlld/MachO/MapFile.cpp