Changes

Summary

  1. Remove deprecated methods from OpState.
  2. Scalar: Don't visit constants in findInnerReductionPhi in LoopInterchange
  3. [SLP] rename reduction variable to avoid shadowing; NFC
  4. [LV][ARM] Inloop reduction cost modelling
  5. [lldb-vscode] improve modules request
  6. [libc++abi] Add an option to avoid demangling in terminate.
  7. Revert [mlir] Link mlir_runner_utils statically into cuda/rocm-runtime-wrappers (cf50f4f76456)
  8. [WebAssembly] Test that invalid symbol/relocation types generate errors
  9. Fix crash when emitting NullReturn guards for functions returning BOOL
  10. Add Python bindings for the builtin dialect
  11. [llvm-mca] Initial implementation of serialization using JSON. The views
  12. [libc++abi] Simplify scan_eh_tab
  13. [gn build] Port d38be2ba0e4e
Commit 8827e07aaf2114b7f09e229e22481cd58137ea6a by csigg
Remove deprecated methods from OpState.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D95123
The file was modified mlir/include/mlir/IR/OpDefinition.h
The file was modified mlir/lib/IR/Operation.cpp
Commit bfec9148a042c8fd6093ae0d54c784211a295c6c by Duncan P. N. Exon Smith
Scalar: Don't visit constants in findInnerReductionPhi in LoopInterchange

In LoopInterchange, `findInnerReductionPhi()` looks for reduction
variables, which cannot be constants. Update it to return early in that
case.

This also addresses a blocker for removing use-lists from ConstantData,
whose users could be spread across arbitrary modules in the same
LLVMContext.

Differential Revision: https://reviews.llvm.org/D94712
The file was modified llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
The file was modified llvm/lib/Transforms/Scalar/LoopInterchange.cpp
Commit 2f03528f5e7fd9df0a12091392e000c697497262 by spatel
[SLP] rename reduction variable to avoid shadowing; NFC

The code structure can likely be improved now that
'OperationData' is gone.
The file was modified llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
Commit 39db5753f993abcc4289dd165e8297a4e28f4b0a by david.green
[LV][ARM] Inloop reduction cost modelling

This adds cost modelling for the inloop vectorization added in
745bf6cf4471. Up until now they have been modelled as the original
underlying instruction, usually an add. This happens to work OK for MVE
with instructions that are reducing into the same type as they are
working on. But MVE's instructions can perform the equivalent of an
extended MLA as a single instruction:

  %sa = sext <16 x i8> A to <16 x i32>
  %sb = sext <16 x i8> B to <16 x i32>
  %m = mul <16 x i32> %sa, %sb
  %r = vecreduce.add(%m)
  ->
  R = VMLADAV A, B
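
A scalar model of the pattern above may make the cost-model change easier to follow. The sketch below is illustrative Python, not code from the patch; the helper names `sext8`, `wrap_i32` and `extended_mla_reduce` are hypothetical. It computes what the four IR operations compute together, which is the work the message says a single VMLADAV can do.

```python
def sext8(x):
    """Interpret the low 8 bits of x as a signed i8 and widen it (models sext)."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def wrap_i32(x):
    """Wrap an integer into the signed 32-bit range, like i32 arithmetic."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def extended_mla_reduce(a, b):
    """Sign-extend each i8 lane to i32, multiply lane-wise, add-reduce to one i32."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same number of lanes")
    total = 0
    for x, y in zip(a, b):
        total = wrap_i32(total + sext8(x) * sext8(y))
    return total
```

With 16 lanes of 1 and 16 lanes of 2, `extended_mla_reduce` returns 32. Costing the sext, mul and reduction separately would charge for widened vectors that, per the message, MVE never has to materialize.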

There are other instructions for performing add reductions of
v4i32/v8i16/v16i8 into i32 (VADDV), for doing the same with v4i32->i64
(VADDLV) and for performing a v4i32/v8i16 MLA into an i64 (VMLALDAV).
The i64 are particularly interesting as there are no native i64 add/mul
instructions, leading to the i64 add and mul naturally getting very
high costs.

Also worth mentioning, under NEON there is the concept of a sdot/udot
instruction which performs a partial reduction from a v16i8 to a v4i32.
They extend and mul/sum the first four elements from the inputs into the
first element of the output, repeating for each of the four output
lanes. They could possibly be represented in the same way as above in
LLVM, so long as a vecreduce.add could perform a partial reduction. The
vectorizer would then produce a combination of in-loop and outer-loop
reductions to efficiently use the sdot and udot instructions. Although
this patch does not do that yet, it does suggest that separating the
input reduction type from the produced result type is a useful concept
to model. It also shows that a MLA reduction as a single instruction is
fairly common.
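
The partial reduction described for sdot can be modelled in the same scalar style. This is a minimal sketch based only on the description above (four adjacent input elements per output lane). The function name `sdot_partial_reduce` is hypothetical, and the accumulate-into-destination behaviour of the real instruction is deliberately left out.

```python
def sext8(x):
    """Interpret the low 8 bits of x as a signed i8 and widen it (models sext)."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def sdot_partial_reduce(a, b):
    """Reduce two 16-lane i8 vectors into a 4-lane result.

    Output lane k holds the sum of the products of input elements
    4*k .. 4*k+3, each sign-extended before the multiply.
    """
    if len(a) != 16 or len(b) != 16:
        raise ValueError("expected two 16-lane inputs")
    out = []
    for lane in range(4):
        group = range(4 * lane, 4 * lane + 4)
        out.append(sum(sext8(a[i]) * sext8(b[i]) for i in group))
    return out
```

Summing the four output lanes gives the same value as a full add reduction of the products. That is why an in-loop partial reduction followed by a final reduction outside the loop would be a plausible way to use these instructions.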

This patch attempts to improve the cost modelling of in-loop reductions
by:
- Adding some pattern matching in the loop vectorizer cost model to
   match extended reduction patterns that are optionally extended and/or
   MLA patterns. This marks the cost of the reduction instruction correctly
   and the sext/zext/mul leading up to it as free, which is otherwise
   difficult to tell and may get a very high cost. (In the long run this
   can hopefully be replaced by vplan producing a single node and costing
   it correctly, but that is not yet something that vplan can do).
- getExtendedAddReductionCost is added to query the cost of these
   extended reduction patterns.
- Expanded the ARM costs to account for these expanded sizes, which is a
   fairly simple change in itself.
- Some minor alterations to allow inloop reduction larger than the highest
   vector width and i64 MVE reductions.
- An extra InLoopReductionImmediateChains map was added to the vectorizer
   for it to efficiently detect which instructions are reductions in the
   cost model.
- The tests have some updates to show what I believe is optimal
   vectorization and where we are now.

Put together, this can greatly improve performance for reduction loops
under MVE.

Differential Revision: https://reviews.llvm.org/D93476
The file was modified llvm/include/llvm/CodeGen/BasicTTIImpl.h
The file was modified llvm/lib/Analysis/TargetTransformInfo.cpp
The file was modified llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.h
The file was modified llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll
The file was modified llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
The file was modified llvm/include/llvm/Analysis/TargetTransformInfo.h
The file was modified llvm/test/Transforms/LoopVectorize/ARM/mve-reduction-types.ll
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Commit 39239f9b5666bebb059fa562badeffb9f1c3afab by a20012251
[lldb-vscode] improve modules request

lldb-vscode was communicating the list of modules to the IDE with events, which in practice had some drawbacks:
- When debugging large targets, the number of these events easily reached 10k, which polluted the message stream. This made the messages harder to debug and caused a lag after terminating the process while these messages were being processed (this could easily take several seconds). The latter was especially bad, as users were complaining about it even when they didn't check the modules view.
- These events were rarely used, as users only check the modules view when something is wrong and they are trying to debug it.

After getting some feedback from users, we realized that it's better not to use events and instead make this a request that users trigger whenever they need it.

This diff achieves that and does some small clean up in the existing code.
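
To make the request-based model concrete, here is a hedged sketch of how a client could frame such a request on the wire. The `Content-Length` framing and the `seq`/`type`/`command` fields follow the Debug Adapter Protocol base protocol. That the request is named `modules` is an assumption taken from that protocol, not from this commit message, and the helper names `encode_dap_request` and `decode_dap_message` are hypothetical.

```python
import json


def encode_dap_request(seq, command, arguments=None):
    """Serialize one request as a Content-Length header plus a JSON body."""
    body = {"seq": seq, "type": "request", "command": command}
    if arguments is not None:
        body["arguments"] = arguments
    payload = json.dumps(body, separators=(",", ":")).encode("utf-8")
    header = "Content-Length: {}\r\n\r\n".format(len(payload)).encode("ascii")
    return header + payload


def decode_dap_message(data):
    """Parse a single framed message that carries only a Content-Length header."""
    header, _, payload = data.partition(b"\r\n\r\n")
    length = int(header.split(b":", 1)[1].strip())
    return json.loads(payload[:length].decode("utf-8"))


# The client sends one request only when the user opens the modules view,
# instead of receiving one event per loaded module.
wire = encode_dap_request(1, "modules")
```

The design point is the direction of the traffic. A single request and response replaces a stream of unsolicited events, so large targets no longer flood the message log.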

Differential Revision: https://reviews.llvm.org/D94033
The file was modified lldb/test/API/tools/lldb-vscode/module/TestVSCode_module.py
The file was modified lldb/tools/lldb-vscode/VSCode.cpp
The file was modified lldb/tools/lldb-vscode/lldb-vscode.cpp
The file was modified lldb/packages/Python/lldbsuite/test/tools/lldb-vscode/vscode.py
Commit 866d480fe0549d616bfdd69986dd07a7b2dc5b52 by danalbert
[libc++abi] Add an option to avoid demangling in terminate.

We've been using this patch in Android so we can avoid including the
demangler in libc++.so. The demangler comes with a rather large cost in
RSS and isn't commonly needed.

Reviewed By: #libc_abi, compnerd

Differential Revision: https://reviews.llvm.org/D88189
The file was modified libcxxabi/src/cxa_default_handlers.cpp
The file was modified libcxxabi/CMakeLists.txt
Commit bd3a387ee76f58caa0d7901f3f84e9bb3d006f27 by csigg
Revert [mlir] Link mlir_runner_utils statically into cuda/rocm-runtime-wrappers (cf50f4f76456)

There are cmake failures that I do not know how to fix.

Differential Revision: https://reviews.llvm.org/D95162
The file was modified mlir/test/mlir-cuda-runner/all-reduce-xor.mlir
The file was modified mlir/test/mlir-rocm-runner/vecadd.mlir
The file was modified mlir/test/mlir-cuda-runner/all-reduce-max.mlir
The file was modified mlir/test/mlir-cuda-runner/all-reduce-min.mlir
The file was modified mlir/test/mlir-rocm-runner/vector-transferops.mlir
The file was modified mlir/test/mlir-rocm-runner/gpu-to-hsaco.mlir
The file was modified mlir/test/mlir-cuda-runner/multiple-all-reduce.mlir
The file was modified mlir/test/mlir-cuda-runner/two-modules.mlir
The file was modified mlir/tools/mlir-cuda-runner/CMakeLists.txt
The file was modified mlir/tools/mlir-rocm-runner/CMakeLists.txt
The file was modified mlir/test/mlir-cuda-runner/all-reduce-region.mlir
The file was modified mlir/lib/ExecutionEngine/CMakeLists.txt
The file was modified mlir/test/mlir-cuda-runner/gpu-to-cubin.mlir
The file was modified mlir/test/mlir-cuda-runner/all-reduce-or.mlir
The file was modified mlir/test/mlir-rocm-runner/two-modules.mlir
The file was modified mlir/test/mlir-cuda-runner/shuffle.mlir
The file was modified mlir/test/mlir-cuda-runner/all-reduce-and.mlir
The file was modified mlir/test/mlir-cuda-runner/all-reduce-op.mlir
Commit d75b3719828f3e0c9736476e50a08e5083f90c0b by sbc
[WebAssembly] Test that invalid symbol/relocation types generate errors

See https://bugs.llvm.org/show_bug.cgi?id=48827

Differential Revision: https://reviews.llvm.org/D95163
The file was modified llvm/lib/Object/WasmObjectFile.cpp
The file was added llvm/test/Object/wasm-bad-symbol-type.test
The file was added llvm/test/Object/Inputs/WASM/bad-reloc-type.wasm
The file was added llvm/test/Object/wasm-bad-reloc-type.test
The file was added llvm/test/Object/Inputs/WASM/bad-symbol-type.wasm
Commit 1deee5cacbb76578367186d7ff2937b6fa79b827 by jonathan_roelofs
Fix crash when emitting NullReturn guards for functions returning BOOL

CodeGenModule::EmitNullConstant() creates constants with their "in memory"
type, not their "in vregs" type. The one place where this difference matters is
when the type is _Bool, as that is an i1 when in vregs and an i8 in memory.

Fixes: rdar://73361264
The file was modified clang/lib/CodeGen/CGObjCMac.cpp
The file was added clang/test/CodeGenObjC/null-check-bool-ret.m
Commit 922b26cde4d1c89a5fa90e6a1d6d97d0f8eace6d by joker.eph
Add Python bindings for the builtin dialect

This includes some minor customization for FuncOp and ModuleOp.

Differential Revision: https://reviews.llvm.org/D95022
The file was modified mlir/lib/Bindings/Python/CMakeLists.txt
The file was added mlir/test/Bindings/Python/.style.yapf
The file was added mlir/test/Bindings/Python/dialects/builtin.py
The file was modified mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
The file was added mlir/lib/Bindings/Python/mlir/dialects/_builtin.py
The file was added mlir/lib/Bindings/Python/BuiltinOps.td
The file was modified mlir/lib/Bindings/Python/mlir/dialects/__init__.py
Commit d38be2ba0e4ebfed4c13ab79f3a8631011d185eb by wolfgang_pieb
[llvm-mca] Initial implementation of serialization using JSON. The views
implemented at this time are Summary, Timeline, ResourcePressure and InstructionInfo.
Use --json on the command line to obtain JSON output.
The file was added llvm/test/tools/llvm-mca/JSON/X86/views.s
The file was modified llvm/tools/llvm-mca/Views/SummaryView.cpp
The file was modified llvm/tools/llvm-mca/Views/BottleneckAnalysis.h
The file was modified llvm/tools/llvm-mca/Views/InstructionInfoView.cpp
The file was modified llvm/tools/llvm-mca/Views/TimelineView.h
The file was modified llvm/tools/llvm-mca/Views/TimelineView.cpp
The file was modified llvm/tools/llvm-mca/Views/InstructionInfoView.h
The file was modified llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp
The file was added llvm/tools/llvm-mca/Views/InstructionView.h
The file was modified llvm/tools/llvm-mca/Views/SchedulerStatistics.h
The file was added llvm/tools/llvm-mca/Views/InstructionView.cpp
The file was modified llvm/tools/llvm-mca/Views/ResourcePressureView.h
The file was modified llvm/tools/llvm-mca/Views/RetireControlUnitStatistics.h
The file was modified llvm/tools/llvm-mca/llvm-mca.cpp
The file was modified llvm/tools/llvm-mca/PipelinePrinter.h
The file was modified llvm/tools/llvm-mca/Views/View.h
The file was modified llvm/tools/llvm-mca/CMakeLists.txt
The file was modified llvm/tools/llvm-mca/Views/SummaryView.h
The file was modified llvm/tools/llvm-mca/Views/DispatchStatistics.h
The file was modified llvm/tools/llvm-mca/Views/ResourcePressureView.cpp
The file was modified llvm/tools/llvm-mca/PipelinePrinter.cpp
The file was modified llvm/docs/CommandGuide/llvm-mca.rst
The file was modified llvm/docs/ReleaseNotes.rst
The file was modified llvm/tools/llvm-mca/Views/RegisterFileStatistics.h
The file was modified llvm/tools/llvm-mca/Views/View.cpp
Commit cfe9ccbddd98b55e49e46bb40877ece6a47a7625 by i
[libc++abi] Simplify scan_eh_tab

1.
All `_URC_HANDLER_FOUND` return values need to set `landingPad`
and its value does not matter for `_URC_CONTINUE_UNWIND`. So we
can always set `landingPad` to unify code.

2.
For an exception specification (`ttypeIndex < 0`), we can check `_UA_FORCE_UNWIND` first.

3.
The so-called type 3 search (`actions & _UA_CLEANUP_PHASE && !(actions & _UA_HANDLER_FRAME)`)
is actually conceptually wrong.  For a catch handler or an unmatched dynamic
exception specification, `_URC_HANDLER_FOUND` should be returned immediately.  It
still appeared to work because the `ttypeIndex==0` case would return
`_URC_HANDLER_FOUND` at a later time.

This patch fixes the conceptual error and simplifies the code by handling type 3
the same way as type 2 (which is also what libsupc++ does).
The only difference between phase 1 and phase 2 is what to do with a cleanup
(`actionEntry==0`, or a `ttypeIndex==0` is found in the action record chain):
phase 1 returns `_URC_CONTINUE_UNWIND` while phase 2 returns `_URC_HANDLER_FOUND`.
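
The decision logic described in the three points can be sketched as a small model. This is a simplified illustration in Python, not the code in `cxa_personality.cpp`. The record representation, the `matches` flag (standing in for "this catch clause matches" or "this exception specification is violated") and the function name `scan_actions` are all hypothetical. Foreign exceptions are not modelled, and force unwinding is modelled only for exception specifications.

```python
CONTINUE_UNWIND = "_URC_CONTINUE_UNWIND"
HANDLER_FOUND = "_URC_HANDLER_FOUND"


def scan_actions(records, search_phase, force_unwind=False):
    """Walk a modelled action record chain and pick a result code.

    records: list of (ttype_index, matches) tuples in chain order.
             An empty list models actionEntry == 0, i.e. a pure cleanup.
    search_phase: True for phase 1 (search), False for phase 2 (cleanup).
    """
    has_cleanup = not records
    for ttype_index, matches in records:
        if ttype_index > 0:
            # Catch handler: a match is reported immediately in either phase.
            if matches:
                return HANDLER_FOUND
        elif ttype_index < 0:
            # Exception specification: check force unwinding first.
            if not force_unwind and matches:
                return HANDLER_FOUND
        else:
            # ttype_index == 0 marks a cleanup in the chain.
            has_cleanup = True
    # The only phase-dependent decision: what to do with a cleanup.
    if has_cleanup and not search_phase:
        return HANDLER_FOUND
    return CONTINUE_UNWIND
```

For example, `scan_actions([(1, False), (0, False)], search_phase=True)` yields `_URC_CONTINUE_UNWIND`, while the same chain with `search_phase=False` yields `_URC_HANDLER_FOUND`. This is the single phase 1 versus phase 2 difference that the message describes.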

Reviewed By: #libc_abi, compnerd

Differential Revision: https://reviews.llvm.org/D93190
The file was modified libcxxabi/src/cxa_personality.cpp
Commit 0cd1e47327e68bba4b92338bf58bbab922c5d85b by llvmgnsyncbot
[gn build] Port d38be2ba0e4e
The file was modified llvm/utils/gn/secondary/llvm/tools/llvm-mca/BUILD.gn