Horizon Adaptive Offline Policy Learning via Value Stitching

This repository contains the official implementation of VAST (Horizon Adaptive Offline Policy Learning via VAlue STitching), a method designed for long-horizon, complex offline RL tasks.

Overview

Traditional TD-based value learning relies on fixed-step backups, which often fail to capture the complex temporal structure of long-horizon, multi-stage tasks. VAST overcomes this limitation by coupling value optimization with:

a future-conditioned auxiliary value function, and
a stitching policy that optimally selects the reward-maximizing future.

VAST enables direct estimation and compositional "stitching" of variable-length returns grounded in actionable sub-goal states, providing an accurate and greedily exploitable value-supervision signal for offline policy optimization.
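
To make the contrast between fixed-step and variable-length backups concrete, here is a toy sketch in plain Python. It is not the VAST implementation (the code has not been released yet), and every name in it (`k_step_return`, `stitched_target`, `max_k`) is hypothetical. It assumes a single logged trajectory with per-step rewards and some bootstrap value estimates, and it stands in for the learned stitching policy with a simple maximum over candidate horizons.

```python
from typing import List, Tuple


def k_step_return(rewards: List[float], values: List[float],
                  t: int, k: int, gamma: float) -> float:
    """Discounted sum of k rewards from step t, bootstrapped with values[t + k]."""
    partial = sum(gamma ** i * rewards[t + i] for i in range(k))
    return partial + gamma ** k * values[t + k]


def stitched_target(rewards: List[float], values: List[float],
                    t: int, max_k: int, gamma: float = 0.99) -> Tuple[int, float]:
    """Pick the horizon in 1..max_k whose k-step return is largest.

    Assumption: a plain max over horizons replaces the learned stitching
    policy described above. `values` must have len(rewards) + 1 entries.
    """
    limit = min(max_k, len(rewards) - t)  # horizons that fit in the trajectory
    candidates = [(k_step_return(rewards, values, t, k, gamma), k)
                  for k in range(1, limit + 1)]
    best_target, best_k = max(candidates)
    return best_k, best_target


# Sparse reward at the last step; bootstrap values are all zero.
rewards = [0, 0, 1]
values = [0, 0, 0, 0]
one_step = k_step_return(rewards, values, t=0, k=1, gamma=0.5)
best_k, target = stitched_target(rewards, values, t=0, max_k=3, gamma=0.5)
```

In this example the fixed one-step backup sees no signal at `t=0`, while the horizon-adaptive target reaches the rewarding state three steps ahead. The actual method grounds the selected future in sub-goal states and learns the selection, which this sketch does not attempt.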

To Do List

The code will be released gradually.

arXiv paper released
initial repo
