A high-throughput and memory-efficient inference and serving engine for LLMs
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support with a 1.5-billion-parameter Chinese pretrained model.
Everything you want to know about Google Cloud TPU
Differentiable Fluid Dynamics Package
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.
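A minimal usage sketch for the workflow this description outlines, assuming the installed DECIMER package exposes the `predict_SMILES` helper shown in its README; the image path is a placeholder.

```python
# Sketch: convert a chemical structure image to a SMILES string with DECIMER.
# Assumes `predict_SMILES` is available as described in the project README;
# the input file name is a hypothetical placeholder.
from DECIMER import predict_SMILES

image_path = "caffeine_structure.png"  # hypothetical scanned structure image
smiles = predict_SMILES(image_path)    # predicted SMILES string for the depicted molecule
print(smiles)
```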
Benchmarking suite to evaluate 🤖 robotics computing performance. Vendor-neutral. ⚪Grey-box and ⚫Black-box approaches.
torchax is a PyTorch frontend for JAX. It lets you author JAX programs using familiar PyTorch syntax and provides JAX-PyTorch interoperability, meaning you can mix JAX and PyTorch code in the same ML program and run it on any hardware JAX supports.
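A minimal sketch of the mixed PyTorch/JAX workflow described above, assuming torchax exposes an `enable_globally()` switch and a `'jax'` torch device as shown in its documentation; the small model is a hypothetical example.

```python
# Sketch: author a model in PyTorch syntax and execute it through JAX via torchax.
# Assumes torchax.enable_globally() and the 'jax' device string exist as documented;
# the two-layer model and shapes are hypothetical.
import torch
import torch.nn as nn
import torchax

torchax.enable_globally()  # route torch ops on 'jax' tensors through JAX

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).to('jax')
x = torch.randn(8, 16, device='jax')  # torch tensor backed by a JAX array
out = model(x)                        # executed by JAX, so it runs wherever JAX runs (TPU/GPU/CPU)
print(out.shape)
```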
Simple and efficient RevNet library for PyTorch with XLA and DeepSpeed support and parameter offload
🖼 Training StyleGAN2 at scale on TPUs
EfficientNet, MobileNetV3, MobileNetV2, MixNet, and more in JAX with Flax Linen and Objax
EvoPose2D is a two-stage human pose estimation model that was designed using neuroevolution. It achieves state-of-the-art accuracy on COCO.
Edge TPU Accelerator / Multi-TPU + MobileNet-SSD v2 + Python + Async + LattePandaAlpha/RaspberryPi3/LaptopPC
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Repository for Google Summer of Code 2019 https://summerofcode.withgoogle.com/projects/#4662790671826944
🪐 The Sebulba architecture to scale reinforcement learning on Cloud TPUs in JAX