HumanClassifier #44

@Awilekong

Hi,
I modified the reward system to use the HumanClassifierWrapper, and it works as expected. However, this wrapper only lets me assign the reward manually at the end of an episode (i.e., once the maximum number of steps is reached). I would like to assign rewards in real time, during the rollout.
To do this, I added a real-time keyboard listener and changed nothing else in the code. With that change, learning no longer works: even after a long training run, the policy shows no progress.
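A minimal sketch of the kind of real-time reward assignment described above, for discussion only. It is not the code in this repository or the modification described in this issue: `LatchedKey`, `DummyEnv`, and `RealtimeHumanReward` are hypothetical names, and a Gymnasium-style five-value `step` return is assumed. The sketch latches the key press so it is not lost between steps, attaches the reward to the step on which the press is consumed, and ends the episode on that same step.

```python
import threading


class LatchedKey:
    """Hypothetical one-shot flag that a keyboard callback sets from another thread."""

    def __init__(self):
        self._event = threading.Event()

    def press(self):
        # A real listener callback (assumed, e.g. from a keyboard library) would call this.
        self._event.set()

    def consume(self):
        """Return True once per latched press, then clear the latch."""
        if self._event.is_set():
            self._event.clear()
            return True
        return False


class DummyEnv:
    """Stand-in environment with a Gymnasium-style step API (assumption)."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.max_steps
        return self.t, 0.0, False, truncated, {}


class RealtimeHumanReward:
    """Hypothetical wrapper: a key press yields reward 1 and ends the episode."""

    def __init__(self, env, success_key):
        self.env = env
        self.success_key = success_key

    def reset(self):
        # Drop any stale press left over from the previous episode.
        self.success_key.consume()
        return self.env.reset()

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        success = self.success_key.consume()
        reward = 1.0 if success else 0.0
        info = dict(info, human_success=success)
        # Terminate on success so the rewarded transition closes the episode.
        return obs, reward, terminated or success, truncated, info
```

In a sketch like this, the wrapper polls the latch once per step rather than reading the keyboard directly, so the reward always lands on a well-defined transition. Whether the episode should terminate on the rewarded step is a design choice that presumably has to match how the training loop handles end-of-episode rewards.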
So I wanted to ask: have you implemented real-time reward assignment, and have you run into similar issues? Why did you decide not to implement it, given that the classifier network itself operates in real time?
Is it because of system issues that are hard to control, or are there other reasons? I would greatly appreciate any advice or experience from anyone who has worked on or implemented this functionality.
Thank you!
