OSPO: One Step Policy Optimization

Article: "One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms" (under review)

1. Workflow

2. Simulator

The dataset used in this study is derived from the yellow taxi data in Manhattan.

For route planning, we use Project-OSRM/osrm-backend (Open Source Routing Machine, C++ backend). Our experiments use the US Northeast region. The OpenStreetMap extract for this region can be downloaded from the Geofabrik Download Server and must be preprocessed with the OSRM tools to produce the .osrm files used below. To avoid conflicts with other programs on our machine, we serve OSRM on port 6000 instead of the default port 5000, so the Docker command is:

docker run -t -i -p 6000:6000 -v "${PWD}:/data" ghcr.io/project-osrm/osrm-backend osrm-routed --algorithm mld /data/us-northeast-latest.osrm -p 6000
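Once the container is running, a quick way to confirm that routing works is to query the OSRM HTTP route service on port 6000. The snippet below is a minimal sketch, not code from this repository: the helper names (`build_route_url`, `parse_route`, `query_route`) and the `localhost` host are assumptions made for illustration. It relies only on the standard OSRM `/route/v1/driving/` endpoint, which takes coordinates in longitude,latitude order.

```python
import json
import urllib.request

# Assumption: the OSRM container started above is reachable on this host and port.
OSRM_BASE_URL = "http://localhost:6000"


def build_route_url(origin, destination, base_url=OSRM_BASE_URL):
    """Build an OSRM route-service URL. Points are (longitude, latitude) pairs."""
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lon, lat in (origin, destination))
    # overview=false skips the route geometry; only the summary values are needed.
    return f"{base_url}/route/v1/driving/{coords}?overview=false"


def parse_route(payload):
    """Return (distance in meters, duration in seconds) of the first route."""
    if payload.get("code") != "Ok":
        raise RuntimeError(f"OSRM returned an error: {payload.get('code')}")
    route = payload["routes"][0]
    return route["distance"], route["duration"]


def query_route(origin, destination, timeout=5.0):
    """Query the running OSRM server for one origin-destination pair."""
    url = build_route_url(origin, destination)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_route(json.load(response))
```

With the server running, `query_route((-73.985130, 40.758896), (-73.968285, 40.785091))` returns the driving distance and duration between two illustrative points in Manhattan. A connection error usually means the container is not running or the port mapping differs from the command above.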

For copyright reasons, the processed data has been removed from the ./data directory. The data processing code is still available there. Please download the dataset from the link provided above and use our code to process it.

3. How to Run

python train.py

To replicate the ablation study presented in our paper, change the parameters in the process function in Worker.py of GRPO.

4. Parameters

The model parameters and training log files are located in the ./GRPO/parameters and ./OSPO/parameters directories.

5. Citation