HumanClassifier

Hi,
I modified the reward system to use the HumanClassifierWrapper, and it works as expected. However, I noticed that the wrapper only lets me assign the reward manually when the episode ends (i.e., when the maximum number of steps is reached). I would like to assign rewards in real time during the rollout instead.
To do this, I added a real-time keyboard listener and made no other changes to the code. With this change, however, learning no longer works: even after training for a long time, there was no progress.
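To make the setup concrete, here is a minimal sketch of the kind of per-step reward assignment I mean. It is an illustration under assumptions, not the actual code: `KeyRewardLatch`, `RealtimeHumanReward`, the `s`/`f` key mapping, the 4-tuple `step` return, and the `terminate_on_success` flag are all hypothetical names and choices, and the keyboard library that would call `on_key` from its own thread is left out.

```python
import threading


class KeyRewardLatch:
    """Thread-safe holder for the most recent human label (hypothetical helper)."""

    # Assumed key mapping: "s" marks success, "f" marks failure.
    KEY_TO_LABEL = {"s": 1.0, "f": 0.0}

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = None

    def on_key(self, key):
        # Meant to be called from the keyboard listener thread.
        label = self.KEY_TO_LABEL.get(key)
        if label is not None:
            with self._lock:
                self._pending = label

    def consume(self):
        # Return the pending label once, then clear it so that a single
        # key press is not counted on several consecutive steps.
        with self._lock:
            label, self._pending = self._pending, None
        return label


class RealtimeHumanReward:
    """Wraps an env whose step() returns (obs, reward, done, info)."""

    def __init__(self, env, latch, terminate_on_success=True):
        self.env = env
        self.latch = latch
        self.terminate_on_success = terminate_on_success

    def reset(self):
        self.latch.consume()  # drop any stale key press from the last episode
        return self.env.reset()

    def step(self, action):
        obs, _, done, info = self.env.step(action)
        label = self.latch.consume()
        # No key pressed during this step: reward defaults to 0.
        reward = 0.0 if label is None else label
        if label == 1.0 and self.terminate_on_success:
            done = True
        info = dict(info)
        info["human_label"] = label
        return obs, reward, done, info
```

In this sketch a key press is attributed to whichever step polls the latch next, so the reward can land one or more steps after the event the human actually reacted to.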
Have you implemented real-time reward assignment yourselves, and if so, did you run into similar issues? And why did you decide not to implement it, given that the classifier network already operates in real time?
Is this due to uncontrollable system issues, or are there other reasons? I would greatly appreciate any advice or experience from anyone who has worked on or implemented this functionality.
Thank you!


HumanClassifier #44
