Evaluating Trained Policies

Our evaluation pipeline launches separate environment and policy servers; the setup is described in detail in the Running the Inference Server documentation.

OpenPI models

Our openpi fork, which supports MESA inference, is available here. To run inference with our trained models, clone the repository and install it as described in its README. Then download the checkpoints from Hugging Face. Finally, from the openpi-mesa repository, run:

uv run scripts/serve_policy.py \
  --port <port> \
  policy:checkpoint \
  --policy.config=<config_name, one of {pi0_mesa, pi05_mesa}> \
  --policy.dir=<path_to_checkpoint>
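If you prefer to launch the server from a script, the command above can be assembled programmatically. This is a minimal sketch: the port, config name, and checkpoint path below are placeholder values, not shipped defaults.

```python
import subprocess

# Placeholder values -- substitute your own port and checkpoint location.
port = 8000
config_name = "pi0_mesa"  # or "pi05_mesa"
checkpoint_dir = "/path/to/checkpoint"

# Assemble the serve_policy invocation shown above.
cmd = [
    "uv", "run", "scripts/serve_policy.py",
    "--port", str(port),
    "policy:checkpoint",
    f"--policy.config={config_name}",
    f"--policy.dir={checkpoint_dir}",
]

# Launch the policy server as a background process; run this from the
# openpi-mesa repository root so the script path resolves.
# server = subprocess.Popen(cmd)
```

Keeping the invocation as an argument list (rather than a shell string) avoids quoting issues if the checkpoint path contains spaces.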

We open-source the following models:

GR00T models

Coming soon.