If you process the raw evaluation data (optional; see “Evaluation data” below), use the environment recommended in that data's documentation (some scripts assume Python 3.11). UI-TARS-1 ...