Preparing dialog data in /var/lib/tf_seq2seq_chatbot/data
Creating vocabulary /var/lib/tf_seq2seq_chatbot/data/vocab20000.in from data /var/lib/tf_seq2seq_chatbot/data/chat.in
Traceback (most recent call last):
File "train.py", line 15, in <module>
tf.app.run()
File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "train.py", line 12, in main
train()
File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/train.py", line 22, in train
train_data, dev_data, _ = data_utils.prepare_dialog_data(FLAGS.data_dir, FLAGS.vocab_size)
File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 200, in prepare_dialog_data
create_vocabulary(vocab_path, train_path + ".in", vocabulary_size)
File "/home/minh/source/machine-learing/chatbot/tf_seq2seq_chatbot/tf_seq2seq_chatbot/lib/data_utils.py", line 70, in create_vocabulary
for line in f:
File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/site-packages/tensorflow/python/platform/gfile.py", line 176, in next
return next(self._fp)
File "/home/minh/opt/anaconda3/envs/ml/lib/python3.5/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 2329: invalid start byte
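
The traceback shows that chat.in contains at least one byte (0xad) that is not valid UTF-8 at that point in the stream. Below is a minimal sketch for locating such bytes, assuming the file is small enough to read into memory in binary mode. The helper name find_invalid_utf8 is hypothetical and not part of tf_seq2seq_chatbot. The position in the error message may be relative to the chunk being decoded rather than to the start of the file, so scanning the raw bytes is the safer way to find the offending offset.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte_value) for each spot where UTF-8 decoding fails.

    Hypothetical helper for diagnosis only; it is not part of the chatbot code.
    """
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means no further invalid bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice, so shift by pos.
            offset = pos + err.start
            bad.append((offset, data[offset]))
            pos = pos + err.end
    return bad


if __name__ == "__main__":
    # Stand-in for the real file contents; with the real data you would use
    # open(path, "rb").read() on your own copy of chat.in.
    sample = b"hello \xad world"
    for offset, value in find_invalid_utf8(sample):
        print("invalid byte 0x%02x at offset %d" % (value, offset))

    # Lossy fallback: undecodable bytes become U+FFFD instead of raising.
    print(sample.decode("utf-8", errors="replace"))
```

In ISO-8859-1 (Latin-1) the byte 0xad is a soft hyphen, so one possibility is that the data was saved in a single-byte encoding rather than UTF-8. That is only a guess; inspect the bytes around the reported offsets before re-encoding the file.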