Question about initial success rate in Figure 4 (0-step performance before training) #43

@Alicia320

Hi, thank you very much for releasing this great work and the codebase.

I have a question regarding Figure 4 in the paper. In the plot, the success rate at step 0 (before any RL training) is already around 0.2–0.6, depending on the task. However, the franka_walkthrough.md documentation suggests that the model is randomly initialized, and it does not mention any explicit pretraining or warm-start stage.

I would like to ask:

  • How is the initial policy (at step 0) able to achieve a non-zero success rate?

  • Did you use any kind of offline BC pretraining, imitation initialization, or previously collected policy checkpoints when generating the results in Figure 4?

  • Or did you observe a non-zero success rate even with a purely randomly initialized policy?
    (In my experiments, a randomly initialized model moves the arm unpredictably and achieves a success rate of ~0.)
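
For context, a step-0 number of this kind can be measured by rolling out the untrained policy for a fixed number of episodes before any update and counting successes. Below is a minimal sketch of such a check. The names `estimate_success_rate` and `toy_random_policy_episode` are hypothetical, and the 1-D reach task is only a stand-in: it is not the repository's Franka environment or its evaluation code.

```python
import random
from typing import Callable


def estimate_success_rate(
    run_episode: Callable[[random.Random], bool],
    num_episodes: int,
    seed: int = 0,
) -> float:
    """Return the fraction of successful episodes for a fixed, untrained policy."""
    if num_episodes <= 0:
        raise ValueError("num_episodes must be positive")
    rng = random.Random(seed)  # fixed seed so the step-0 number is reproducible
    successes = sum(bool(run_episode(rng)) for _ in range(num_episodes))
    return successes / num_episodes


def toy_random_policy_episode(rng: random.Random) -> bool:
    """Hypothetical 1-D reach task (assumption, not the Franka setup).

    The "policy" emits uniform random actions. The episode counts as a
    success if the final position ends within 0.05 of the target at 0.5.
    """
    position = 0.0
    for _ in range(20):
        position += 0.05 * rng.uniform(-1.0, 1.0)
    return abs(position - 0.5) < 0.05


if __name__ == "__main__":
    rate = estimate_success_rate(toy_random_policy_episode, num_episodes=200)
    print(f"step-0 success rate: {rate:.3f}")
```

Swapping `toy_random_policy_episode` for a function that resets the real environment, runs the freshly initialized policy, and returns the environment's success flag would give the step-0 point to compare against Figure 4.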

I am probably missing something, so any clarification would be greatly appreciated.
Thanks again for your excellent work!
