What about a comparison with ASFT Loss? #5

Great work! Are you considering a comparison with ASFT Loss (https://github.com/zhuchichi56/ASFT)?

I noticed that your paper compares SFT, SFT_kl, and DFT. In the ASFT experiments, DFT_kl performs quite well. Since both works explore mitigating forgetting in SFT, a comparison with ASFT might provide additional insights. This is only a suggestion prompted by curiosity; I would love to hear your thoughts on whether such a comparison is relevant to your work.

Looking forward to your insights!


