🐛 Issue: BERT encoder appears unused despite text_encoder_type='bert' #255

Hi, and thanks for the great work on this project!

I'm currently working with the training code and noticed a possible inconsistency. The documentation and flags suggest support for --text_encoder_type bert, but the dataset still appears to load GloVe embeddings via this line:

<pre>
self.w_vectorizer = WordVectorizer(pjoin(opt.cache_dir, 'glove'), 'our_vab')
</pre>

This occurs in:
<pre>
data_loaders/humanml/data/dataset.py
</pre>

This raises a few questions:

1. Is BERT actually used anywhere in the dataset loading or preprocessing pipeline?
2. If BERT is supported, where is it being applied?
3. Is there a separate dataset class or flow for BERT-based encoding?

I’d love clarification so I can ensure the correct embeddings are used during training.
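
In case it helps with triage, here is a minimal sketch of how I would search a local checkout for every place that references the encoder flag or the GloVe vectorizer. The function name find_usages and the default search strings are my own choices for illustration and are not part of the project: text_encoder_type is assumed from the flag name, and w_vectorizer is taken from the line quoted above.

```python
from pathlib import Path


def find_usages(repo_root, needles=("text_encoder_type", "w_vectorizer")):
    """Return (relative_path, line_number, stripped_line) for every line in a
    .py file under repo_root that contains any of the search strings.

    The default needles are assumptions based on the flag name and the
    dataset line quoted in this issue.
    """
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            # Skip files that cannot be read instead of aborting the scan.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_usages.py /path/to/checkout
    checkout = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel_path, lineno, line in find_usages(checkout):
        print(f"{rel_path}:{lineno}: {line}")
```

If the flag only shows up in argument parsing and model code, and never in the dataset module, that would suggest the dataset always builds the GloVe vectorizer and any BERT encoding happens elsewhere. I have not confirmed this, which is why I'm asking.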
