OpenXLA AI Compiler Infrastructure
www.openxla.org
About this website
XLA (Accelerated Linear Algebra) is an open-source compiler infrastructure for machine learning that optimizes and compiles ML models for execution on a range of hardware: GPUs, TPUs, CPUs, and custom AI chips. Originally developed by Google for TensorFlow and the TPU (first released in 2016), XLA has since been adopted by PyTorch (via torch-xla), JAX, Julia, and other ML frameworks. The community-driven OpenXLA project governs the ecosystem, with contributions from Google, NVIDIA, AMD, Intel, AWS, Meta, and Apple.

Key features:

Ahead-of-time (AOT) and just-in-time (JIT) compilation: XLA compiles ML model graphs into optimized machine code for the target hardware.

HLO IR (High-Level Optimizer Intermediate Representation): XLA's internal representation is a tensor-based dataflow graph. Models from TensorFlow, PyTorch, and JAX are lowered to HLO IR. HLO operations include elementwise ops, reduction, broadcast, reshape, transpose, dot product (GEMM), convolution, and gather/scatter.

Optimization passes: over 60 passes, including algebraic simplification, constant folding, loop fusion (producer-consumer fusion), layout assignment (choosing the memory layout best suited to the target hardware), buffer allocation (memory planning), and operation scheduling.

Code generation: for the CPU and GPU backends, HLO IR is lowered to LLVM IR for LLVM-based code generation, while the TPU backend lowers through its own low-level IR (LLO). Operations can also be dispatched to vendor-optimized kernels (NVIDIA cuDNN/cuBLAS, AMD ROCm/MIOpen, Intel oneDNN) through the custom-call mechanism.

Sharding and SPMD: automatic model parallelism and data parallelism via GSPMD.

Used by Google Cloud TPU, Google Colab, and major ML platforms. License: Apache-2.0.
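To make two of the optimization passes named above more concrete, here is a minimal sketch of constant folding and algebraic simplification on a toy dataflow graph. It is written in plain Python and does not use XLA: the names `Const`, `Param`, `Op`, `fold_constants`, `simplify`, and `run_passes` are hypothetical, and scalars stand in for tensors. Real HLO passes operate on tensor-shaped instructions and are far more involved.

```python
import operator
from dataclasses import dataclass
from typing import Union

# Toy stand-ins for HLO-style instructions. These names are hypothetical
# and are NOT part of XLA's API; scalars stand in for tensors.
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Param:
    name: str

@dataclass(frozen=True)
class Op:
    opcode: str  # "add" or "multiply"
    lhs: "Node"
    rhs: "Node"

Node = Union[Const, Param, Op]

EVAL = {"add": operator.add, "multiply": operator.mul}
IDENTITY = {"add": Const(0.0), "multiply": Const(1.0)}

def fold_constants(node: Node) -> Node:
    """Constant folding: evaluate any op whose operands are both constants."""
    if not isinstance(node, Op):
        return node
    lhs, rhs = fold_constants(node.lhs), fold_constants(node.rhs)
    if isinstance(lhs, Const) and isinstance(rhs, Const):
        return Const(EVAL[node.opcode](lhs.value, rhs.value))
    return Op(node.opcode, lhs, rhs)

def simplify(node: Node) -> Node:
    """Algebraic simplification: rewrite x + 0 -> x and x * 1 -> x."""
    if not isinstance(node, Op):
        return node
    lhs, rhs = simplify(node.lhs), simplify(node.rhs)
    if rhs == IDENTITY[node.opcode]:
        return lhs
    if lhs == IDENTITY[node.opcode]:
        return rhs
    return Op(node.opcode, lhs, rhs)

def run_passes(node: Node) -> Node:
    """Run the two passes in sequence, as a tiny stand-in for a pass pipeline."""
    return simplify(fold_constants(node))

# Graph for: (x * (0.5 + 0.5)) + (2 * 0)
x = Param("x")
graph = Op(
    "add",
    Op("multiply", x, Op("add", Const(0.5), Const(0.5))),
    Op("multiply", Const(2.0), Const(0.0)),
)
optimized = run_passes(graph)
print(optimized)
```

Folding first turns `0.5 + 0.5` into the constant 1 and `2 * 0` into the constant 0; simplification then removes the multiply-by-one and add-zero, leaving only the parameter `x`. Running folding before simplification is a design choice of this sketch: it exposes identity constants that the second pass can then eliminate.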