OpenXLA AI Compiler Infrastructure

www.openxla.org

About this website

XLA (Accelerated Linear Algebra) is an open-source compiler infrastructure for machine learning that optimizes and compiles ML models for execution on hardware accelerators (GPUs, TPUs, CPUs, and custom AI chips). Originally developed by Google for TensorFlow and the TPU (first released in 2016), XLA has since been adopted by PyTorch (via torch-xla), JAX, Julia, and other ML frameworks. The OpenXLA project, under the Linux Foundation, governs the ecosystem with contributions from Google, NVIDIA, AMD, Intel, AWS, Meta, and Apple.

Key features:

Ahead-of-time (AOT) and just-in-time (JIT) compilation: XLA compiles ML model graphs into optimized machine code for the target hardware.

HLO IR (High-Level Optimizer Intermediate Representation): XLA's internal representation is HLO IR, a tensor-based dataflow graph. Models from TensorFlow, PyTorch, and JAX are lowered to HLO IR. HLO operations include elementwise ops, reduction, broadcast, reshape, transpose, dot product (GEMM), convolution, and gather/scatter.

Optimization passes: over 60 passes, including algebraic simplification, constant folding, loop fusion (producer-consumer fusion), layout assignment (choosing the optimal memory layout for the target hardware), buffer allocation (memory planning), and operation scheduling.

Code generation: HLO IR is lowered to LLO and then to LLVM IR for LLVM-based code generation, or dispatched to vendor-specific backends (NVIDIA cuDNN/cuBLAS, AMD ROCm/MIOpen, Intel oneDNN). A custom-call mechanism allows calling vendor-optimized kernels.

Sharding and SPMD: automatic model parallelism and data parallelism via GSPMD.

XLA is used by Google Cloud TPU, Google Colab, and other major ML platforms. License: Apache-2.0.
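As a small illustration of the JIT compilation path and the lowering to a tensor-level IR described above, the following sketch uses JAX, one of the frameworks that targets XLA. The function scaled_sum and its inputs are hypothetical names chosen for this example. The exact IR text printed depends on the installed JAX version and backend, so no output is shown.

```python
import jax
import jax.numpy as jnp


def scaled_sum(x, w):
    # Elementwise multiply, elementwise tanh, then a reduction.
    # Chains like this are typical candidates for producer-consumer fusion.
    return jnp.sum(jnp.tanh(x * w))


# Hypothetical example inputs.
x = jnp.arange(8, dtype=jnp.float32)
w = jnp.full((8,), 0.5, dtype=jnp.float32)

# Stage 1: trace the Python function and lower it to the compiler's input IR.
lowered = jax.jit(scaled_sum).lower(x, w)
ir_text = lowered.as_text()  # textual IR before XLA's optimization passes run

# Stage 2: run XLA's optimization passes and code generation
# for the default backend (CPU, GPU, or TPU, whichever JAX finds).
compiled = lowered.compile()

# The compiled executable should agree with the un-jitted function.
jit_result = compiled(x, w)
eager_result = scaled_sum(x, w)

if __name__ == "__main__":
    print(ir_text)
    print(jit_result, eager_result)
```

Splitting the work into lower() and compile() is only for inspection. In everyday use, calling jax.jit(scaled_sum)(x, w) performs both stages on the first call and caches the result.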
