Releases: intel/llvm
DPC++ daily 2021-11-24
LLVM and SPIRV-LLVM-Translator pulldown (WW46-47) LLVM: llvm/llvm-project@0f652d8f527f SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@05e183d
DPC++ daily 2021-11-23
[SYCL] Fix vec class alignment on windows platform (#4953) Currently the sycl::vec type can be copied in a way that doesn't preserve the default alignment on Windows. This can cause crashes since the sycl::vec code expects the vector to be aligned and uses vector instructions. We used default alignment because we cannot set correct alignment in all cases. The patch adds alignment of vector types; if the required alignment is larger than 64, it is limited to 64.
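The capping rule in the last sentence of this entry can be illustrated with a small Python sketch. It is not the code from the patch. The function name is hypothetical, and the sketch assumes that the required alignment of a vector equals its storage size in bytes and that a 3-element vector is stored as 4 elements.

```python
MAX_VEC_ALIGNMENT = 64  # cap described in the release note, in bytes


def vec_alignment(element_size: int, num_elements: int) -> int:
    """Return the alignment in bytes for a vector type under the capped rule.

    Assumptions (not stated in the note): the required alignment equals the
    vector's storage size, and a 3-element vector is stored as 4 elements.
    """
    if element_size <= 0 or num_elements <= 0:
        raise ValueError("element_size and num_elements must be positive")
    stored_elements = 4 if num_elements == 3 else num_elements
    required = element_size * stored_elements
    # Anything above the cap is limited to the cap.
    return min(required, MAX_VEC_ALIGNMENT)
```

Under these assumptions a 16-element vector of 8-byte values would require 128 bytes and is limited to 64, while a 4-element vector of 4-byte values keeps its 16-byte alignment.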
DPC++ daily 2021-11-22
[SYCL][ESIMD] Add ESIMD-specific IR verification pass (#4965) Signed-off-by: Sergey Dmitriev <[email protected]>
DPC++ daily 2021-11-20
[SYCL] Fix memory leak in online compiler (#4963) The experimental online compiler may leak memory in compileToSPIRV. These changes address this leak by storing the SPIR-V binary information directly in the vector that will later be returned. Signed-off-by: Steffen Larsen <[email protected]>
oneAPI DPC++ Compiler 2021-09
New features
SYCL Compiler
SYCL Library
- Added sRGBA support [e488327][191efdd]
- Added a preview feature implementation for the DPC++ experimental matrix extension [7f21853] [a95f46d]
- Added support for SYCL 2020 exceptions [5c0f748][eef0760][5af8c43]
- Added support for SYCL_EXT_INTEL_BF16_CONVERSION extension [8075463]
- Added support for fallback implementation of assert feature [56c9ec4]
- Added support for SYCL 2020 sycl::logical_and and sycl::logical_or operators [6c077a0]
Documentation
- Added design document for optional kernel features [88cfe16]
- Added SYCL_INTEL_bf16_conversion extension document [9f8cc3a]
- Align SYCL_EXT_ONEAPI_GROUP_MASK extension with SYCL 2020 specification [a06bd1f]
- Added documentation of XPTI related tracing in SYCL [1308fe7]
- Align SYCL_EXT_ONEAPI_LOCAL_MEMORY extension document with SYCL 2020 specification [6ed6565]
Improvements
SYCL Compiler
- Added default device triple spir64 when the compiler encounters any incoming object/libraries that have been built with the spir64 target. -fno-sycl-link-spirv can be used for disabling this behaviour [1342360]
- Added support for non-uniform IMul and FMul operation for ptx-nvidiacl [98a339d]
- Added splitting modules capability when compiling for NVPTX and AMDGCN [c1324e6]
- Added -fsycl-footer-path=<path> command-line option to set path where to store integration footer [155acd1]
- Improved read only accessor handling - added readonly attribute to the associated pointer to memory [3661685]
- Improved the output project name generation. If the output value has one of .a .o .out .lib .obj .exe extension, the output project directory name will omit extension. For any other extension the output project directory will keep this extension [d8237a6]
- Improved handling of default device with AOCX archive [e3a579f]
- Added support for NVPTX device printf [4af2eb5]
- Added support for non-const private statics in ESIMD kernels [bc51fe0]
- Improved diagnostic generation when incorrect accessor format is used [a292214]
- Allowed passing -Xsycl-target-backend and -Xsycl-target-link when default target is used [d37b832]
- Disabled kernel function propagation up the call tree to callee when in SYCL 2020 mode [2667e3e]
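The output project naming rule from the list above (the [d8237a6] entry) can be sketched in Python as follows. This only illustrates the rule as worded in the note and is not the driver's implementation. The function name is hypothetical, and the sketch assumes the extension match is case-sensitive.

```python
import os

# Extensions that are dropped from the output project directory name,
# as listed in the release note.
STRIPPED_EXTENSIONS = {".a", ".o", ".out", ".lib", ".obj", ".exe"}


def project_dir_name(output: str) -> str:
    """Derive the project directory name from the output value.

    Assumption: matching is case-sensitive and only the final extension
    is considered.
    """
    root, ext = os.path.splitext(output)
    if ext in STRIPPED_EXTENSIONS:
        return root   # listed extension: omit it
    return output     # any other extension (or none): keep the name as is
```

For example, an output value of kernel.out would give a directory named kernel, while design.aocx would keep its extension.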
SYCL Library
- Improved information passed to XPTI subscribers [2af0599] [66770f0]
- Added event interoperability to Level Zero plugin [ef33c57]
- Enabled blitter engine for in-order queues in Level Zero plugin [904967e]
- Removed deprecation warning for SYCL 1.2.1 barriers [18c80fa]
- Moved free function queries to experimental namespace sycl::ext::oneapi::experimental [63ba1ce]
- Added info query for device::info::atomic_memory_order_capabilities and context::info::atomic_memory_order_capabilities [9b04f41]
- Improved performance of generic shuffles [fb08adf]
- Renamed ONEAPI/INTEL namespace to ext::oneapi/intel [d703f57] [ea4b8a9] [e9d308e]
- Added Level Zero interoperability which allows specifying ownership of the queue [4614ee4] [6cf48fa]
- Added support for reqd_work_group_size attribute in CUDA plugin [a8fe4a5]
- Introduced SYCL_CACHE_DIR environment variable which allows specifying a directory for persistent cache [4011775]
- Added version of parallel_for accepting range and a reduction variable [d1556e4]
- Added verbosity to some error handling [84ee39a]
- Added SYCL 2020 sycl::errc_for API [02756e3]
- Added SYCL 2020 byte_size method for sycl::buffer and sycl::vec classes. get_size was deprecated [282d1de]
- Added support for USM pointers for sycl::joint_exclusive_scan and sycl::joint_inclusive_scan [2de0f92]
- Added copy and move constructors for sycl::ext::intel::experimental::esimd::simd_view [daae147]
- Optimized memory allocation when sub-devices are used in Level Zero plugin [6504ba0]
- Added constexpr constructors for vec and marray classes [e7cd86b][449721b]
- Optimized kernel cache [c16705a]
- Added caching of device properties in Level Zero plugin [a50f45b]
- Optimized CUDA plugin work with small kernels [07189af]
- Optimized submission of kernels [441dc3b][33432df]
- Aligned implementation of SYCL_EXT_ONEAPI_LOCAL_MEMORY extension with updated document [b3db5e5]
- Improved sycl::accessor initialization performance on device [a10199d]
- Added support for sycl::get_kernel_ids and cache for sycl::kernel_id objects [491ec6d]
- Deprecated ::half since it should not be available in global namespace, sycl::half can be used instead [6ff9cf7]
- Deprecated sycl::interop_handler, sycl::handler::interop_task, sycl::handler::run_on_host_intel, sycl::kernel::get_work_group_info and sycl::spec_constant APIs [5120763]
- Marked sycl::marray device copyable [6e02880]
- Made Level Zero interoperability API SYCL 2020 compliant for sycl::platform, sycl::device and sycl::context [c696415]
- Deprecated unstable keys of SYCL_DEVICE_ALLOWLIST [b27c57c]
- Added predefined vendor macros SYCL_IMPLEMENTATION_ONEAPI and SYCL_IMPLEMENTATION_INTEL [6d34ebf]
- Deprecated sycl::ext::intel::online_compiler, sycl::ext::intel::experimental::online_compiler can be used instead [7fb56cf]
- Deprecated global_device_space and global_host_space values of sycl::access::address_space enumeration, ext_intel_global_device_space and ext_intel_host_device_space can be used instead [7fb56cf]
- Deprecated sycl::handler::barrier and sycl::queue::submit_barrier, sycl::handler::ext_oneapi_barrier and sycl::queue::ext_oneapi_submit_barrier can be used instead [7fb56cf]
- Removed sycl::handler::codeplay_host_task API [9a0ea9a]
Tools
- Added support for ROCm devices in get_device_count_by_type [03155e7]
Documentation
- Extended group sort algorithms extension with interfaces to scratchpad memory [f57091d]
- Updated several extension documents to follow SYCL 2020 extension rules [7fb56cf]
Bug fixes
SYCL Compiler
- Fixed emission of integration header with type aliases [e3cfa19]
- Fixed compilation for AMD GPU with -fsycl-dead-args-optimization [5ed48b4]
- Removed faulty implementations for atomic loads and stores for acquire, release and seq_cst memory orders in libclc for NVPTX [4876443]
- Fixed having two specializations for the specialization_id, one of which was invalid [f71a1d5]
- Fixed context destruction in HIP plugin [6042d3a]
- Changed queue::mem_advise and handler::mem_advise to take int instead of pi_mem_advice [af2bf96]
- Prevented passing of -fopenmp-simd to device compilation when used along with -fsycl [226ed8b]
- Fixed generation of the integration header when non-base-ascii chars are used in the kernel name [91f5047]
- Fixed a problem which could lead to picking up incorrect kernel at runtime in some situations when unnamed lambda feature is used [27c632e]
- Fixed suggested target triple in the warning message [7cc89fa]
- Fixed identity for multiplication on CUDA backend [a6447ca]
- Fixed a problem with dependency file generation [fd6d948] [1d5b2cb]
- Fixed builtins address space type for CUDA backend [1e3136e]
- Fixed a problem which could lead to incorrect user header to be picked up [c23fe4b]
SYCL Library
- Added assign operator to specializations of sycl::ext::oneapi::atomic_ref [c6bc5a6]
- Fixed the way managed memory is freed in CUDA plugin [e825916]
- Changed names of some SYCL internal enumerations to avoid possible conflicts with user macro [1419415]
- Fixed status which was returned for host events by event::get_info<info::event::command_execution_status>() call [09715f6]
- Fixed memory ordering used for barriers [73455a1]
- Fixed several places in CUDA and HIP plugins where bool was used instead of uint32_t [764b6ff]
- Fixed event pool memory leak in Level Zero plugin [0e95e5a]
- Removed redundant memcpy call for copying struct using fpga_reg [a5d290d]
- Fixed an issue where the native memory object passed to interoperability memory object constructor was ignored on devices without host unified memory [da19678]
- Fixed a bug in simd::replicate_w API [d36480d]
- Fixed group operations for (u)int8/16 types [6a055ec]
- Fixed a problem with non-native specialization constants being undefined if they are not explicitly updated to non-default values [3d96e1d]
- Fixed a crash which could happen when a default constructed event is passed
...
DPC++ daily 2021-11-19
sycl-nightly/20211119 [libclc] Delete the wrong file name in the SOURCE file, and add a new…
DPC++ daily 2021-11-18
[SYCL] Fix sub-group mask for smaller SG sizes (#4916) Fix accessing sub-group mask when sub-group size is less than 32. Make sure that false is returned for positions that are more than sub-group size. Update the test to check this case.
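The bounds rule described in this entry can be illustrated with a short Python sketch. It is not the library's code. The function name is hypothetical, and the sketch assumes the mask is held in a single 32-bit word, which the entry only implies by mentioning sizes below 32.

```python
MASK_BITS = 32  # assumption: the mask is stored in one 32-bit word


def mask_test(bits: int, position: int, sub_group_size: int) -> bool:
    """Report whether `position` is set in the mask.

    Positions at or beyond the sub-group size are always reported as
    unset, even if stray high bits are present in `bits`.
    """
    if not 0 < sub_group_size <= MASK_BITS:
        raise ValueError("sub_group_size must be in 1..32")
    if position < 0 or position >= sub_group_size:
        return False
    return bool((bits >> position) & 1)
```

With a sub-group size of 16, a query for position 20 yields False in this sketch even when all 32 bits of the word are set.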
DPC++ daily 2021-11-17
[SYCL] Generate and install stripped PDBs for SYCL libraries (#4915) Adds stripped PDB files for the SYCL library and the PI plugins when building with MSVC. Full PDB files will also be generated, but only the stripped variants will be installed. The stripped PDB files will only be generated and installed if the linker in use supports the /PDBSTRIPPED option. LLD does not currently support this option. If the stripped PDB is not generated, no PDB files are installed for the SYCL libraries and PI plugins. Signed-off-by: Steffen Larsen <[email protected]>
DPC++ daily 2021-11-16
[SYCL] group algorithm routines with broadened supported types (#4910) Signed-off-by: Chris Perkins <[email protected]>
DPC++ daily 2021-11-15
sycl-nightly/20211115 [Driver][SYCL] Update default SPIR device arch to correlate with host…