Releases: GridTools/gt4py
GT4Py v1.1.9
Summary of changes since v1.1.8
Cartesian
- Fix loop re-ordering in schedule tree.
- Fix scalarization of temporaries
- Redundant region syntax removed.
All changes
- fix[next]: Fix bug in literal values when using Enums by @egparedes in #2551
- ci: Fix daily CI job with explicit uv lock upgrades by @egparedes in #2552
- feat[cartesian]: Remove redundant region syntax by @twicki in #2554
- ci: Set Kubernetes memory request and limit to 64Gi on beverin by @havogt in #2556
- fix[cartesian]: loop reordering in schedule tree by @romanc in #2546
- fix[next-dace]: Suppress dace progress both in transformations and code generation by @edopao in #2557
- build: update frozen dependencies by @havogt in #2548
- fix[next]: Don't format non-visible Python code by @havogt in #2560
- fix[cartesian]: Scalarization of temporaries by @romanc in #2558
- build: add extra cuda13, rocm7; remove cuda11, rocm4,5 by @havogt in #2561
- Releasing v1.1.9 by @havogt in #2562
Full Changelog: v1.1.8...v1.1.9
GT4Py v1.1.8
Summary of changes since v1.1.7
Cartesian
- Fix default compile flags for various compilers.
- Generate numpy >= 2.0 compatible code in the debug backend.
- Support for data dims of size one in dace backends.
All changes
- fix|build[cartesian]: Fix NVCC default flags for -O0 and refactor configuration out of config.py by @FlorianDeconinck in #2524
- feat[next]: support Enums as constant namespaces for value inlining by @egparedes in #2515
- ci: update github actions dependencies by @romanc in #2525
- build: fix versioningit complaints in shallow git clones by @egparedes in #2518
- fix[cartesian]: GCC 12/13 cxx default compile flag fix by @FlorianDeconinck in #2528
- ci[cartesian]: restore OpenMP for macos on daily ci by @romanc in #2530
- fix[next]: fix metrics source key context handlers by @egparedes in #2533
- feat[cartesian]: Update debug backend to generate numpy >= 2.0 compatible code by @twicki in #2526
- fix[next]: Reduce type ignores in client code by @DropD in #2484
- docs[cartesian]: Update ADRs with recent development by @romanc in #2512
- refactor[cartesian]: Separate horizontal and vertical interval parsers by @twicki in #2510
- feat[cartesian] Add icpx to default compilers + distutils better imports by @FlorianDeconinck in #2542
- fix[cartesian]: In the debug backend: Itemize lower dimensional fields also on the LHS for numpy >= 2.0 by @twicki in #2543
- feat[next-dace]: Disable dace trace for SDFG transformation progress by @edopao in #2540
- feat[next]: extend and refactor node fingerprinting utils by @egparedes in #2535
- fix[next]: properly ignore ndarray embedded caches by @egparedes in #2536
- feat[next-dace]: Introduce backend option to enable horizontal unit stride by @edopao in #2539
- fix[eve]: lru_cache should not call eq when a key is provided by @tehrengruber in #2529
- feat[next]: Embedded domain construction from dimension comparison by @havogt in #2532
- feat[next-dace]: Less Verbose Warnings by @philip-paul-mueller in #2544
- fix[next-dace]: Add entry-point synchronization by @edopao in #2527
- fix[cartesian]: Support for data dims of size one in dace backends by @romanc in #2547
- feat[next]: warn in case python is run without -O for non-embedded by @havogt in #2538
- feat[next]: Expose lru_cache cache_clear functionality by @tehrengruber in #2549
- Releasing v1.1.8 by @havogt in #2550
Full Changelog: v1.1.7...v1.1.8
GT4Py v1.1.7
Summary of changes since v1.1.6
Cartesian
- Leverage unrolling of integer power calls, in dace backends, for exponents 1, 2, and 3.
- Faster SDFG construction in dace backends.
- Introduce default compiler flags.
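As a rough illustration of what unrolling integer powers for small exponents means, here is a minimal Python sketch. The helper `unrolled_pow` is hypothetical and is not part of the GT4Py or DaCe API; the real change operates on generated backend code, not on a Python function.

```python
def unrolled_pow(base: float, exponent: int) -> float:
    """Hypothetical helper: expand small integer powers into multiplications."""
    if exponent == 1:
        return base
    if exponent == 2:
        return base * base
    if exponent == 3:
        return base * base * base
    # Any other exponent falls back to the generic power operator.
    return base ** exponent


squared = unrolled_pow(3.0, 2)
cubed = unrolled_pow(2.0, 3)
generic = unrolled_pow(2.0, 5)
```

Replacing a generic power call with explicit multiplications avoids the cost of a general-purpose power routine for the most common exponents.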
All changes
- feat[cartesian]: leverage integer power in dace backend by @romanc in #2502
- test[cartesian]: attempt to stabilize test_ij_field_reset by @romanc in #2505
- ci: don't format uv.lock file by @romanc in #2513
- fix[next]: fix metrics collection when compiling the same program multiple times by @egparedes in #2504
- fix[next]: Fix where & concat_where with named collections by @tehrengruber in #2511
- perf[next-dace]: Enhance MoveDataflowIntoIfBody transformation by @iomaganaris in #2514
- perf[cartesian]: reduce SDFG construction time in dace backends by @romanc in #2519
- feat[cartesian]: Introduce default compiler flags by @FlorianDeconinck in #2520
- perf[next-dace]: Allow more fusion of ConditionalBlocks by @iomaganaris in #2517
- Releasing v1.1.7 by @edopao in #2523
Full Changelog: v1.1.6...v1.1.7
GT4Py v1.1.6
Summary of changes since v1.1.5
Cartesian
- Added a GT4PY_CARTESIAN_ENABLE_OPENMP environment variable to disable OpenMP, which makes it possible to support compilers (e.g. apple-clang) that do not ship with OpenMP by default.
- Fixed an issue in the numpy backend related to K-query expressions, where an internal variable was shadowing any user-provided k_mask variable.
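A minimal sketch of how such a switch might be set from Python before GT4Py is imported. Only the variable name GT4PY_CARTESIAN_ENABLE_OPENMP comes from the release notes; the accepted values and the `env_flag` parser below are assumptions for illustration, so check the GT4Py configuration documentation for the real semantics.

```python
import os


def env_flag(name: str, default: bool = True) -> bool:
    """Hypothetical parser: treat '0', 'false', 'off' and 'no' as disabled."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() not in {"0", "false", "off", "no"}


# Assumption: a falsy value turns OpenMP off. Set it before importing gt4py.
os.environ["GT4PY_CARTESIAN_ENABLE_OPENMP"] = "0"
openmp_enabled = env_flag("GT4PY_CARTESIAN_ENABLE_OPENMP")
```

Setting the variable in the shell before launching Python would have the same effect as the `os.environ` assignment shown here.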
All changes
- fix[cartesian]: Disable OpenMP multithreading for DaCe backends by @FlorianDeconinck in #2491
- perf[next-dace]: Only write to global on scan last level by @edopao in #2497
- fix[next-dace]: Add debug information to dace build depending on config by @edopao in #2499
- fix[cartesian]: Protect k_mask user variable by renaming internal mask by @FlorianDeconinck in #2501
- Releasing v1.1.6 by @edopao in #2500
Full Changelog: v1.1.5...v1.1.6
GT4Py v1.1.5
Summary of changes since v1.1.4
Cartesian
- Switch to DaCe main development branch, which is going to be released as DaCe v2.x.
- Changes in dace:X backends to reduce the size of SDFGs saved to disk.
- Fix casting of arguments of power function.
All changes
- feat[next]: Compile time domains by @SF-N in #2173
- build[cartesian]: update DaCe to get ScheduleTree based on mainline DaCe by @romanc in #2458
- refactor[next]: rename and simplify the programming language concept in the toolchain by @egparedes in #2478
- feat[next]: add atexit handler to dump the performance metrics at exit by @egparedes in #2481
- feat[cartesian]: save compressed SDFGs by @romanc in #2485
- fix[cartesian]: minified SDFG without duplication by @romanc in #2486
- fix[cartesian]: don't upcast arguments of power function by @romanc in #2489
- feat[next-dace]: Added gt_replace_concat_where_node() by @philip-paul-mueller in #2482
- feat[next]: add support for array_namespace allocation by @havogt in #2442
- feat[next-dace]: Support scan with single level output by @edopao in #2490
- fix[next]: Fix default value for use_max_domain_range_on_unstructured_shift by @tehrengruber in #2493
- fix[next-dace]: Emit warning in concat_where_mapper only in debug mode by @edopao in #2495
- feat[next]: Integrate jax.numpy in testing by @havogt in #2488
- Releasing v1.1.5 by @edopao in #2494
Full Changelog: v1.1.4...v1.1.5
GT4Py v1.1.4
Summary of changes since v1.1.3
Cartesian
- Fixed issue in program bindings by locking pybind11 to 2.x
All changes
- refactor[next]: Refactor metrics by @egparedes in #2450
- refactor[next-dace]: Use import alias for dace.nodes by @edopao in #2461
- feat[next]: Compiled variant for field operators by @tehrengruber in #2368
- feat[next-dace]: Updated MoveDataflowIntoIfBody by @philip-paul-mueller in #2460
- refactor[next-dace]: New Optimization Scheme in Intra-Map Optimization by @philip-paul-mueller in #2457
- build[cartesian]: Keep pybind11 to 2.x by @FlorianDeconinck in #2468
- build[next-dace]: Updated DaCe Dependency by @philip-paul-mueller in #2471
- fix[next-dace]: Fix Memory Layout for CPU by @philip-paul-mueller in #2459
- fix[next-dace]: Avoid adding instrumentation to programs without GPU schedule by @iomaganaris in #2473
- docs: Update Slack join link by @havogt in #2470
- feat[next-dace]: Enable setting gpu_maxnreg attribute in maps by @iomaganaris in #2464
- fix[next]: Fix different static args after with_backend by @tehrengruber in #2475
- fix[next]: remove indeterminism in closure_var extraction by @havogt in #2476
- refactor[next]: Rename and refactor toolchain type definitions by @egparedes in #2474
- ci: Update Ubuntu version to 24.04 for beverin, disable MPS on santis by @havogt in #2455
- fix[next-dace]: Added a Check for Symbol Conflicts Upon relocation by @philip-paul-mueller in #2472
- fix[next]: Fix segfault for nanobind >=2.10 by @tehrengruber in #2431
- feat[next]: Add instrumentation package and user-definable hooks by @egparedes in #2437
- perf[next-dace]: RemoveScalarCopies and FuseHorizontalConditionBlocks transformations by @iomaganaris in #2469
- build[next]: Update dace to version 2026.02.12 by @edopao in #2479
- Releasing v1.1.4 by @edopao in #2480
Full Changelog: v1.1.3...v1.1.4
GT4Py v1.1.3
Summary of changes since v1.1.2
Cartesian
- New features:
- Support for K iterator access in numpy backend
- New dace_KJI backend that operates on fields with Fortran memory layout
- Automatic match for dace:X backends between layout and schedule, cache-optimal by default
All changes
- ci: fix daily CI task for python 3.14 by @egparedes in #2415
- fix[next][dace]: Support lowering of let-lambdas inside an iterator expression by @edopao in #2420
- fix[next][dace]: Remove isolated access nodes to unused lambda args by @edopao in #2418
- build[next]: Update dace version by @edopao in #2419
- fix[next][dace]: Remove isolated access nodes for unused args in let-lambda by @edopao in #2422
- fix[next]: Reuse parameters in direct fo calls by @SF-N in #2375
- fix[cartesian] Remove the vloop_sections for a more unique id(node) by @FlorianDeconinck in #2427
- ci: try node sharing by @havogt in #2135
- fix[next]: don't cse literal expressions (e.g. scan's init) by @havogt in #2421
- feat[cartesian]: Layout & Schedule pairing for dace:X by @FlorianDeconinck in #2426
- fix[next]: type checking with named collections in scans by @havogt in #2416
- fix[next]: Support named collections with multiple output domains by @havogt in #2428
- fix[next][dace]: Better usage of SubgraphContext during SDFG lowering by @edopao in #2413
- build[next]: Update dace version to 2026_01_12 by @edopao in #2432
- build[dace][next]: Added Custom Python Package for DaCe in GT4Py.Next by @philip-paul-mueller in #2423
- Add NVTX marker instrumentation by @iomaganaris in #2345
- feat[next][dace]: Enable async memory alloc on DaCe-HIP backend by @edopao in #2433
- feat[cartesian]: numpy backend support for K iterator access by @FlorianDeconinck in #2430
- fix[next][dace]: Make all node labels unique by @edopao in #2436
- feat[next]: Enable unrolling scan loops by @iomaganaris in #2434
- feat[dace][next] Updated gt_inline_nested_sdfg() by @philip-paul-mueller in #2385
- feat[dace][next]: Deterministic gt_split_access_nodes() by @philip-paul-mueller in #2383
- fix[next-dace]: Remove isolated node generated from splitting by @edopao in #2444
- ci[cartesian]: Disable macOS test in daily CI by @edopao in #2438
- refactor[next]: Add config variable GT4PY_ADD_GPU_TRACE_MARKERS by @edopao in #2440
- fix[next-dace]: Addressed Memlet Caching Issue by @philip-paul-mueller in #2445
- fix[next]: Test cleanups by @havogt in #2441
- refactor[cartesian]: remove dead code from backend/base.py by @romanc in #2446
- fix[next-dace]: Update connectivities in fastcall by @edopao in #2449
- build[next-dace]: Update DaCe Version to 2026_01_21 by @edopao in #2451
- build[next-dace]: Use dace setting for compiler setting by @edopao in #2453
- Releasing v1.1.3 by @edopao in #2448
Full Changelog: v1.1.2...v1.1.3
GT4Py v1.1.2
Summary of changes since v1.1.1
General
- Added support for Python 3.14.
Cartesian
- New feature: Runtime Interval Bounds
All changes
- refactor[next][dace]: Simplify visitor of concat_where expressions by @edopao in #2394
- fix[next][dace]: Use explicit dataflow in let-lambdas by @edopao in #2396
- ci[next]: skip test_compile_variants_args_and_kwargs with dace backend and ROCM device by @edopao in #2400
- build[dace][next]: Updated DaCe Dependency by @philip-paul-mueller in #2401
- ci[next]: skip test_compile_variants_tuple with dace backend and ROCM device by @edopao in #2402
- feat[cartesian]: Runtime Interval Bounds by @twicki in #2395
- style[cartesian]: Cleanup in error messaging by @katrinafandrich in #2387
- build[dace][next]: Updated DaCe Dependency by @philip-paul-mueller in #2405
- bug[next]: Source location agnostic compilation hash by @tehrengruber in #2397
- fix[next][dace]: Use symbol mapping for stride propagation across nested SDFGs by @edopao in #2404
- ci/build/refactor[all]: python 3.14 support by @DropD in #2399
- ci[cartesian,next]: Upgrade scipy min version for py3.14 by @edopao in #2408
- refactor[eve,next]: UIDGenerator -> SequentialIDGenerator by @havogt in #2407
- fix[next]: apply cse in fuse_as_fieldop by @havogt in #2257
- refactor[next]: unroll_reduce deduce shift via typesystem by @havogt in #2267
- build: remove support for python releases breaking networkx by @egparedes in #2410
- fix[next][dace]: Unconditional execution of else-branch in if-statements by @edopao in #2412
- refactor[next][dace]: Move SDFG-lowering modules into subpackage by @edopao in #2409
- Releasing v1.1.2 by @havogt in #2414
New Contributors
- @katrinafandrich made their first contribution in #2387
Full Changelog: v1.1.1...v1.1.2
GT4Py v1.1.1
Summary of changes since v1.1.0
Cartesian
- Allow self-assignment with offset in K dimension in sequential vertical loops
- Bug fixes:
- Skip implicit upcasting in (explicit) cast operations
- Respect precision of Constants
All changes
- fix[cartesian]: upcast arguments of cast operations by @romanc in #2369
- docs[next][dace]: Updated CompiledDaceProgram by @philip-paul-mueller in #2374
- Fix warning in thread block size setting by @iomaganaris in #2371
- fix[next][dace]: Use correct dtype for connectivity tables by @edopao in #2373
- feat[dace][next]: Added DoubleWriteRemover Transformation by @philip-paul-mueller in #2370
- fix[cartesian]: respect precision of constants by @romanc in #2377
- refactor[next][dace]: Rename array symbols for shape and strides by @edopao in #2378
- fix[next][dace]: Generate temporary array ids in deterministic way by @edopao in #2380
- fix[next][dace]: Avoid name conflict between tasklet connector and SDFG data by @edopao in #2382
- refactor[next][dace]: Introduce SDFGBuilder.add_nested_sdfg by @edopao in #2372
- perf[next]: Refactor of Program call arguments canonicalization by @egparedes in #2379
- fix[next][dace] Set properly strides of data in CPU by @iomaganaris in #2384
- feat[next]: adding custom named collections of scalars and fields by @egparedes in #2232
- fix[cartesian]: Allow self-assignment with offset in K dimension in sequential vertical loops by @romanc in #2388
- bug[next]: disallow implicit bool conversion of fields by @havogt in #2393
- fix[gt4py]: Transform enums in tuple static args by @SF-N in #2389
- Releasing v1.1.1 by @havogt in #2391
Full Changelog: v1.1.0...v1.1.1
GT4Py v1.1.0
Summary of changes since v1.0.10
Cartesian
- New experimental feature: 2D temporaries.
- Removed deprecated cuda backend.
- Bug fixes:
- Absolute field-access in while-condition
- Upcasting of expressions on the lhs of assignments
Versioning
- Added a fallback version of the form 1.0.10+unknown.version.details, when no version is available from git.
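The fallback string follows the shape of a PEP 440 version with a local version label after the `+`. A minimal sketch of splitting such a string is shown below; `split_local_version` is a hypothetical helper written for illustration, not a GT4Py function, and it does not validate the version.

```python
from typing import Optional, Tuple


def split_local_version(version: str) -> Tuple[str, Optional[str]]:
    """Split 'public+local' into its two parts; local is None if absent."""
    public, separator, local = version.partition("+")
    return public, (local if separator else None)


public, local = split_local_version("1.0.10+unknown.version.details")
is_fallback = local is not None and local.startswith("unknown")
```

A check like `is_fallback` could be used in a build script to warn when the package was built without git metadata, assuming the fallback label keeps the `unknown` prefix shown in the release notes.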
Next
See commit history.
All changes
- fix[dace][next]: Fixed DeadMapElimination by @philip-paul-mueller in #2340
- feat[dace][next]: Added MapToCopy by @philip-paul-mueller in #2311
- ci[cartesian]: increase time limit for cartesian / dace tests on gh200 by @romanc in #2335
- feat[cartesian]: 2D temporaries [Experimental] by @FlorianDeconinck in #2314
- refactor[cartesian]: Remove cuda backend by @romanc in #2337
- cartesian[fix]: Remove hard check on ADR existing for experimental feature by @FlorianDeconinck in #2343
- cartesian[fix]: Temporary annotation parsing by @FlorianDeconinck in #2347
- Improve setting of thread block size for 1D maps by @iomaganaris in #2344
- refactor[next][dace]: Refactoring tests of GTIR to SDFG lowering by @edopao in #2346
- feat[dace][next]: Map Splitter by @philip-paul-mueller in #2304
- build[dace][next]: Updated DaCe Dependency by @philip-paul-mueller in #2348
- feat[dace][next]: Modified How Degenerated 1D Maps Are Handled by @philip-paul-mueller in #2349
- perf[dace][next]: Updated Top Level Dataflow Optimization Stage by @philip-paul-mueller in #2284
- build: add script to set the default fallback version to latest release by @egparedes in #2341
- perf[next]: cache low level buffer information in fields by @egparedes in #2342
- refactor[next][dace]: Update lowering of domain by @edopao in #2350
- feat[dace][next]: Included TrivialTaskletelimination in gt_simplify() by @philip-paul-mueller in #2316
- feat[dace][next]: Block Inlining of NestedSDFG Containing Scans by @philip-paul-mueller in #2315
- build[dace][next]: Updated DaCe Dependency by @philip-paul-mueller in #2352
- refactor[next][dace]: Cleanup SDFG lowering of tuples by @edopao in #2354
- feat[dace][next]: Strides of Views by @philip-paul-mueller in #2334
- fix[cartesian]: Fix absolute field-access in while-conditionals by @twicki in #2357
- fix[next][dace]: More strict check on node sequencing in gt_create_local_double_buffering by @edopao in #2359
- fix[dace][next]: Fixed Setting Strides of Views by @philip-paul-mueller in #2358
- feat[next]: Multiple output domains by @SF-N in #2225
- feat[dace][next]: Using the canonicalize_memlet_trees() function by @philip-paul-mueller in #2351
- fix[cartesian]: do upcasting on both sides of assignments by @romanc in #2361
- test[cartesian]: absolute k access with index field and computation by @romanc in #2362
- perf[next][dace]: Remove temporary inside scan map scope by @edopao in #2355
- feat[dace][next] More Robust Calling by @philip-paul-mueller in #2353
- bug[next]: fix timer precision loss by @havogt in #2364
- refactor[next]: Add check for GPU errors in debug mode by @edopao in #2365
- build[dace][next]: Updated DaCe Dependency by @philip-paul-mueller in #2367
- feat[next][dace]: Targeted transformation to remove copies in ICON4Pys vertically implicit solvers by @iomaganaris in #2329
- Releasing v1.1.0 by @havogt in #2366
Full Changelog: v1.0.10...v1.1.0