FailedChanges

Summary

  1. [DAGCombiner] try repeated fdiv divisor transform before building estimate (2nd try) The original patch was committed at rL359398 and reverted at rL359695 because of infinite looping. This includes a fix to check for a vector splat of "1.0" to avoid the infinite loop. Original commit message: This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149
  2. [SelectionDAG] remove constant folding limitations based on FP exceptions We don't have FP exception limits in the IR constant folder for the binops (apart from strict ops), so it does not make sense to have them here in the DAG either. Nothing else in the backend tries to preserve exceptions (again outside of strict ops), so I don't see how this could have ever worked for real code that cares about FP exceptions. There are still cases (examples: unary opcodes in SDAG, FMA in IR) where we are trying (at least partially) to preserve exceptions without even asking if the target supports FP exceptions. Those should be corrected in subsequent patches. Real support for FP exceptions requires several changes to handle the constrained/strict FP ops. Differential Revision: https://reviews.llvm.org/D61331
  3. [X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold Limiting scalar hadd/hsub generation to the lowest xmm looks to be unnecessary - we will be extracting one upper xmm whatever, and we can remove a shuffle by using the hop which is inline with what shouldUseHorizontalOp expects to happen anyway. Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result: https://godbolt.org/z/0R-U-K Differential Revision: https://reviews.llvm.org/D61426
  4. [X86][SSE] Move shouldUseHorizontalOp inside isHorizontalBinOp. NFCI. Matches what we do for lowerAddSubToHorizontalOp and will make it easier to peek through subvectors to help fix PR39921
  5. [llvm-strip]Add --no-strip-all to disable --strip-all behaviour (including default stripping) If certain switches are not specified, llvm-strip behaves as if --strip-all were specified. This means that for testing, when we don't want the stripping behaviour, we have to specify one of these switches, which can be confusing. This change adds --no-strip-all to allow an alternative way of suppressing the default stripping, in a less confusing manner. Reviewed by: jakehehrlich, MaskRay Differential Revision: https://reviews.llvm.org/D61377
Revision 359793 by spatel:
[DAGCombiner] try repeated fdiv divisor transform before building estimate (2nd try)

The original patch was committed at rL359398 and reverted at rL359695 because of
infinite looping.

This includes a fix to check for a vector splat of "1.0" to avoid the infinite loop.

Original commit message:

This was originally part of D61028, but it's an independent diff.

If we try the repeated divisor reciprocal transform before producing an estimate sequence,
then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5
vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the
full-precision division is only 3 cycle throughput, so that's probably the better perf
default option and avoids problems from x86's inaccurate estimates.

The last 2 tests show that users still have the option to override the defaults by using
the function attributes for reciprocal estimates, but those patterns are potentially made
faster by converting the vector ops (including ymm ops) to scalar math.

Differential Revision: https://reviews.llvm.org/D61149
Change TypePath in RepositoryPath in Workspace
The file was modified/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpptrunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
The file was modified/llvm/trunk/test/CodeGen/X86/fdiv-combine-vec.lltrunk/test/CodeGen/X86/fdiv-combine-vec.ll
Revision 359791 by spatel:
[SelectionDAG] remove constant folding limitations based on FP exceptions

We don't have FP exception limits in the IR constant folder for the binops (apart from strict ops),
so it does not make sense to have them here in the DAG either. Nothing else in the backend tries
to preserve exceptions (again outside of strict ops), so I don't see how this could have ever
worked for real code that cares about FP exceptions.

There are still cases (examples: unary opcodes in SDAG, FMA in IR) where we are trying (at least
partially) to preserve exceptions without even asking if the target supports FP exceptions. Those
should be corrected in subsequent patches.

Real support for FP exceptions requires several changes to handle the constrained/strict FP ops.

Differential Revision: https://reviews.llvm.org/D61331
Change TypePath in RepositoryPath in Workspace
The file was modified/llvm/trunk/include/llvm/CodeGen/TargetLowering.htrunk/include/llvm/CodeGen/TargetLowering.h
The file was modified/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpptrunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
The file was modified/llvm/trunk/lib/CodeGen/TargetLoweringBase.cpptrunk/lib/CodeGen/TargetLoweringBase.cpp
The file was modified/llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpptrunk/lib/Target/AMDGPU/SIISelLowering.cpp
The file was modified/llvm/trunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpptrunk/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
The file was modified/llvm/trunk/test/CodeGen/AArch64/fp-const-fold.lltrunk/test/CodeGen/AArch64/fp-const-fold.ll
Revision 359786 by rksimon:
[X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold

Limiting scalar hadd/hsub generation to the lowest xmm looks to be unnecessary - we will be extracting one upper xmm whatever, and we can remove a shuffle by using the hop which is inline with what shouldUseHorizontalOp expects to happen anyway.

Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result:
https://godbolt.org/z/0R-U-K

Differential Revision: https://reviews.llvm.org/D61426
Change TypePath in RepositoryPath in Workspace
The file was modified/llvm/trunk/lib/Target/X86/X86ISelLowering.cpptrunk/lib/Target/X86/X86ISelLowering.cpp
The file was modified/llvm/trunk/test/CodeGen/X86/haddsub.lltrunk/test/CodeGen/X86/haddsub.ll
The file was modified/llvm/trunk/test/CodeGen/X86/phaddsub-extract.lltrunk/test/CodeGen/X86/phaddsub-extract.ll
Revision 359782 by rksimon:
[X86][SSE] Move shouldUseHorizontalOp inside isHorizontalBinOp. NFCI.

Matches what we do for lowerAddSubToHorizontalOp and will make it easier to peek through subvectors to help fix PR39921
Change TypePath in RepositoryPath in Workspace
The file was modified/llvm/trunk/lib/Target/X86/X86ISelLowering.cpptrunk/lib/Target/X86/X86ISelLowering.cpp
Revision 359781 by jhenderson:
[llvm-strip]Add --no-strip-all to disable --strip-all behaviour (including default stripping)

If certain switches are not specified, llvm-strip behaves as if
--strip-all were specified. This means that for testing, when we don't
want the stripping behaviour, we have to specify one of these switches,
which can be confusing. This change adds --no-strip-all to allow an
alternative way of suppressing the default stripping, in a less
confusing manner.

Reviewed by: jakehehrlich, MaskRay

Differential Revision: https://reviews.llvm.org/D61377
Change TypePath in RepositoryPath in Workspace
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/basic-only-keep-debug.testtrunk/test/tools/llvm-objcopy/ELF/basic-only-keep-debug.test
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/dynsym-error-remove-strtab.testtrunk/test/tools/llvm-objcopy/ELF/dynsym-error-remove-strtab.test
The file was added/llvm/trunk/test/tools/llvm-objcopy/ELF/no-strip-all.testtrunk/test/tools/llvm-objcopy/ELF/no-strip-all.test
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/reloc-error-remove-symtab.testtrunk/test/tools/llvm-objcopy/ELF/reloc-error-remove-symtab.test
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/remove-linked-section.testtrunk/test/tools/llvm-objcopy/ELF/remove-linked-section.test
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/symtab-error-on-remove-strtab.testtrunk/test/tools/llvm-objcopy/ELF/symtab-error-on-remove-strtab.test
The file was modified/llvm/trunk/test/tools/llvm-objcopy/ELF/symtab-link.testtrunk/test/tools/llvm-objcopy/ELF/symtab-link.test
The file was modified/llvm/trunk/tools/llvm-objcopy/CopyConfig.cpptrunk/tools/llvm-objcopy/CopyConfig.cpp
The file was modified/llvm/trunk/tools/llvm-objcopy/StripOpts.tdtrunk/tools/llvm-objcopy/StripOpts.td

Summary

  1. [OpenCL] Deduce static data members to __global addr space. Similarly to static variables in OpenCL, static class data members should be deduced to __global addr space. Differential Revision: https://reviews.llvm.org/D61304
Revision 359789 by stulova:
[OpenCL] Deduce static data members to __global addr space.

Similarly to static variables in OpenCL, static class data
members should be deduced to __global addr space.

Differential Revision: https://reviews.llvm.org/D61304
Change TypePath in RepositoryPath in Workspace
The file was modified/cfe/trunk/lib/Sema/SemaType.cpptrunk/lib/Sema/SemaType.cpp
The file was added/cfe/trunk/test/SemaOpenCLCXX/address-space-deduction.cltrunk/test/SemaOpenCLCXX/address-space-deduction.cl

Summary

  1. [clangd] Fix code completion of macros defined in the preamble region of the main file. Summary: This is a tricky case (we baked the assumption that symbols come from the preamble xor mainfile pretty deeply) and the fix is a bit of a hack: We look at the code to guess the macro names, and deserialize them from the preamble "by hand". Reviewers: ilya-biryukov Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60937
Revision 359778 by sammccall:
[clangd] Fix code completion of macros defined in the preamble region of the main file.

Summary:
This is a tricky case (we baked the assumption that symbols come from
the preamble xor mainfile pretty deeply) and the fix is a bit of a hack:
We look at the code to guess the macro names, and deserialize them from
the preamble "by hand".

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60937
Change TypePath in RepositoryPath in Workspace
The file was modified/clang-tools-extra/trunk/clangd/ClangdUnit.cpptrunk/clangd/ClangdUnit.cpp
The file was modified/clang-tools-extra/trunk/clangd/ClangdUnit.htrunk/clangd/ClangdUnit.h
The file was modified/clang-tools-extra/trunk/clangd/CodeComplete.cpptrunk/clangd/CodeComplete.cpp
The file was modified/clang-tools-extra/trunk/clangd/index/SymbolCollector.cpptrunk/clangd/index/SymbolCollector.cpp
The file was modified/clang-tools-extra/trunk/clangd/unittests/CodeCompleteTests.cpptrunk/clangd/unittests/CodeCompleteTests.cpp