Hi,
I modified the reward system to use the HumanClassifierWrapper, and it works as expected. However, this wrapper only lets me assign the reward manually when the episode ends (i.e., when the maximum number of steps is reached). I would like to assign rewards in real time during the rollout instead.
To do this, I added a real-time keyboard listener and made no other changes to the code. With this change, learning no longer works: even after training for a long time, there is no progress.
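
For concreteness, the kind of change described above can be sketched roughly as follows. This is not the actual HumanClassifierWrapper code: `ToyEnv`, `RealTimeHumanReward`, and `signal_success` are hypothetical names used only for illustration, and ending the episode as soon as a positive label arrives is an assumption of the sketch. A real keyboard listener would call `signal_success()` from its own thread.

```python
import threading


class ToyEnv:
    """Stand-in environment (hypothetical): fixed-length episodes, zero reward."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return self.t, 0.0, done, {}


class RealTimeHumanReward:
    """Hypothetical wrapper: a human key press sets the reward mid-rollout."""

    def __init__(self, env):
        self.env = env
        # Thread-safe flag, set from the keyboard listener thread.
        self._success = threading.Event()

    def signal_success(self):
        # Call this from the keyboard listener callback.
        self._success.set()

    def reset(self):
        # Drop any stale key press left over from the previous episode.
        self._success.clear()
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if self._success.is_set():
            self._success.clear()
            reward = 1.0
            done = True  # assumption: a positive label also ends the episode
        return obs, reward, done, info


env = RealTimeHumanReward(ToyEnv())
env.reset()
env.step(0)              # no key pressed yet: reward stays 0.0
env.signal_success()     # simulated key press
obs, reward, done, info = env.step(0)
```

In this sketch the key press is consumed on the next `step` call, so the label lands on that transition and the flag is cleared on `reset`. Whether the positive label should also terminate the episode, and which transition it gets attached to, are the details I am least sure about.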
Have you implemented real-time reward assignment, and if so, did you run into similar issues? If not, why did you decide against it, given that the classifier network already operates in real time?
Is this due to uncontrollable system issues, or are there other reasons? I would greatly appreciate any advice or experience from those who have worked on or implemented this functionality.
Thank you!