DOUBLE-PDF: DOUBLE Ppo and double Dqn for on-demand Food delivery

Article:

Journal Version: Zijian Zhao, Sen Li*, "Discriminatory order assignment and payment-setting of on-demand food-delivery platforms: A multi-action and multi-agent reinforcement learning framework", Transportation Research Part E: Logistics and Transportation Review (TR_E), 2025

Conference Version: Zijian Zhao, Sen Li*, "Multi-Agent Reinforcement Learning for Order Assignment and Payment Setting on Food-Delivery Platforms: The Implicit Algorithmic Biases", ISTDM 2025

Notice: We have updated the code in line with our recent work, "The Impacts of Data Privacy Regulations on Food-Delivery Platforms" (Transportation Research Part C, 2025). The new version of the code, together with instructions on how to run it, is available at GDPR-Food-Delivery. The latest version will be provided at AV-Food-Delivery.

Acknowledgement: Some parts of the code (the simulator) are based on the work of Yulong Hu.

1. Workflow

2. Dataset

Due to copyright restrictions, we cannot provide the data used in this paper. However, we offer a brief introduction to the data format so you can utilize our code with your own dataset.

Our dataset consists of one hour of food delivery data in Hong Kong, China, containing approximately 10,000 records. It is saved in a CSV file, where each column represents an attribute and each row corresponds to an order. The relevant attributes include:

dlat: Latitude of the destination
dlon: Longitude of the destination
plat: Latitude of the origin
plon: Longitude of the origin
minute: The minute at which the order is placed
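To illustrate the format above, here is a minimal sketch of how a CSV with these five columns could be read into Python. The function name `load_orders`, the output dictionary keys, and the sample row are hypothetical and are not taken from the repository code. The sketch also assumes that `minute` is stored as a number.

```python
import csv
import io

# Column names as listed in the data format above.
REQUIRED_COLUMNS = ("dlat", "dlon", "plat", "plon", "minute")


def load_orders(csv_file):
    """Parse order rows from an open CSV file object into a list of dicts.

    Hypothetical helper for illustration; the repository may load data differently.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    orders = []
    for row in reader:
        orders.append({
            # (latitude, longitude) of the pickup point
            "origin": (float(row["plat"]), float(row["plon"])),
            # (latitude, longitude) of the drop-off point
            "destination": (float(row["dlat"]), float(row["dlon"])),
            # Assumption: the minute is numeric and can be truncated to an int
            "minute": int(float(row["minute"])),
        })
    return orders


# A single made-up row, used only to show the expected layout.
sample = io.StringIO("dlat,dlon,plat,plon,minute\n22.30,114.17,22.28,114.16,5\n")
orders = load_orders(sample)
```

With your own dataset, pass an opened file instead of the in-memory sample, for example `load_orders(open("your_orders.csv", newline=""))`, where the file name is a placeholder.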

Note that the dataset contains no ground-truth salary information. Therefore, we simply set the reservation value to range from 0.85 to 1.15, without a specific unit.
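One way to generate values in this range is sketched below. It assumes one reservation value per worker, drawn uniformly at random. Both the uniform distribution and the function name `sample_reservation_values` are assumptions made for illustration; only the bounds 0.85 and 1.15 come from the text above.

```python
import random


def sample_reservation_values(num_workers, low=0.85, high=1.15, seed=0):
    """Draw one unitless reservation value per worker.

    Assumption: values are uniform on [low, high]; the actual code may differ.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return [rng.uniform(low, high) for _ in range(num_workers)]


reservation_values = sample_reservation_values(num_workers=100)
```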

Notice: We have provided the synthetic food delivery data generated by a GAN, thanks to Yitong Shang.

3. Simulator

For route planning, we use the Open Source Routing Machine (Project-OSRM/osrm-backend), a routing engine with a C++ backend.
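As a rough sketch of how a simulator can obtain travel distance and time from OSRM, the example below calls the `route` service of the OSRM HTTP API. It assumes an `osrm-routed` server is running locally at `http://localhost:5000` with the `driving` profile. The helper names (`build_route_url`, `parse_route`, `query_route`) are hypothetical, and the repository's own `osrm.py` may be organized differently.

```python
import json
import urllib.request


def build_route_url(origin, destination,
                    base_url="http://localhost:5000", profile="driving"):
    """Build an OSRM route request URL from two (lat, lon) pairs."""
    (plat, plon), (dlat, dlon) = origin, destination
    # OSRM expects coordinates in longitude,latitude order.
    coords = f"{plon},{plat};{dlon},{dlat}"
    # overview=false skips the route geometry, which is not needed here.
    return f"{base_url}/route/v1/{profile}/{coords}?overview=false"


def parse_route(payload):
    """Extract (distance in meters, duration in seconds) from an OSRM response."""
    if payload.get("code") != "Ok" or not payload.get("routes"):
        raise RuntimeError(f"OSRM request failed: {payload.get('code')}")
    route = payload["routes"][0]
    return route["distance"], route["duration"]


def query_route(origin, destination, **kwargs):
    """Query a running OSRM server; requires network access to that server."""
    url = build_route_url(origin, destination, **kwargs)
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_route(json.load(response))
```

For example, `query_route((22.28, 114.16), (22.30, 114.17))` would return the distance and duration between a made-up origin and destination, provided the server has map data covering those coordinates.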

4. Citation

@article{ZHAO2026104653,
title = {Discriminatory order assignment and payment-setting of on-demand food-delivery platforms: A multi-action and multi-agent reinforcement learning framework},
journal = {Transportation Research Part E: Logistics and Transportation Review},
volume = {208},
pages = {104653},
year = {2026},
issn = {1366-5545},
doi = {https://doi.org/10.1016/j.tre.2025.104653},
url = {https://www.sciencedirect.com/science/article/pii/S1366554525006751},
author = {Zijian Zhao and Sen Li}
}

