Assertion Failed during training #45

zealot52099 · 2024-09-24T01:57:39Z

Hi there! When I try to train the model, the following assertion fails:

assert (sum([(cur == AUDIO_TOKEN_INDEX).sum() for cur in input_ids]) + sum([(AUDIO_TOKEN_INDEX not in cur) for cur in input_ids]) == audio_features["inputs_embeds"].shape[0])

I checked the values: the two sums are 8 and 0, respectively, when an image is included in my data, and 0 and 8 otherwise, while audio_features["inputs_embeds"].shape[0] is 40.
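To make the arithmetic concrete, here is a minimal sketch of the left-hand side of the assertion, using plain Python lists instead of tensors. The helper name `expected_audio_rows` and the numeric value given to `AUDIO_TOKEN_INDEX` are assumptions for illustration only; the real constant lives in the training code.

```python
# Arbitrary stand-in value (assumption); the real AUDIO_TOKEN_INDEX is defined in the training code.
AUDIO_TOKEN_INDEX = -500


def expected_audio_rows(input_ids, audio_token_index=AUDIO_TOKEN_INDEX):
    """Return the two sums from the assertion for a batch of token-id lists.

    First value: total number of audio placeholder tokens in the batch.
    Second value: number of samples that contain no audio placeholder at all.
    """
    placeholders = sum(cur.count(audio_token_index) for cur in input_ids)
    samples_without = sum(audio_token_index not in cur for cur in input_ids)
    return placeholders, samples_without


# A batch of 8 samples, each containing one audio placeholder token.
with_placeholder = [[1, AUDIO_TOKEN_INDEX, 2] for _ in range(8)]
# A batch of 8 samples containing no audio placeholder token.
without_placeholder = [[1, 2, 3] for _ in range(8)]

print(expected_audio_rows(with_placeholder))     # (8, 0)
print(expected_audio_rows(without_placeholder))  # (0, 8)
```

In both cases the two sums add up to 8, so the assertion can only pass when audio_features["inputs_embeds"].shape[0] is also 8; the reported value of 40 does not match either case.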

My dataset JSON file looks like this:

{
  "set": "sharegpt4",
  "conversations": [
    {
      "from": "human",
      "value": "\n\n 请尽量准确地转录所有内容，并在不确定发音时提供可能的替代选项。开始转录："
    },
    {
      "from": "gpt",
      "value": "也成为地方政府的眼中钉"
    }
  ],
  "image": "/workspace/frame_1.jpg",
  "audio": [
    "/dataset/audio_1.wav"
  ]
}

Is there something I did wrong? Thanks!

linhaojia13 · 2024-09-26T08:08:37Z

Hi @zealot52099, you should add <audio> to the conversation.
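As a rough way to catch this before training, the sketch below counts <audio> tags in the human turns of a dataset entry and compares that with the number of listed audio files. The helper `count_audio_tags` is hypothetical, the prompt text is a placeholder, and the rule that each audio file needs exactly one <audio> tag is an assumption rather than something confirmed in this thread.

```python
AUDIO_TAG = "<audio>"


def count_audio_tags(entry):
    """Return (tags_in_human_turns, number_of_audio_files) for one entry."""
    tags = sum(
        turn["value"].count(AUDIO_TAG)
        for turn in entry["conversations"]
        if turn["from"] == "human"
    )
    files = entry.get("audio", [])
    if isinstance(files, str):  # tolerate a single path given as a string
        files = [files]
    return tags, len(files)


# Entry shaped like the one in the question: an audio file but no <audio> tag.
entry = {
    "conversations": [
        {"from": "human", "value": "\n\n Transcribe the audio."},
        {"from": "gpt", "value": "..."},
    ],
    "audio": ["/dataset/audio_1.wav"],
}

# The same entry with an <audio> tag added to the human turn (assumed placement).
fixed = {
    **entry,
    "conversations": [
        {"from": "human", "value": "<audio>\n Transcribe the audio."},
        {"from": "gpt", "value": "..."},
    ],
}

print(count_audio_tags(entry))  # (0, 1): mismatch, no tag for the audio file
print(count_audio_tags(fixed))  # (1, 1): one tag per audio file
```

Running a check like this over the whole JSON file would flag entries whose tag count and audio file count differ.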

MonolithFoundation · 2024-09-27T09:06:55Z

@zealot52099 Hi, why do you have both image and audio data? Does the audio have any relationship to the image?

I thought VITA didn't use audio and image data at the same time.
