A way to change segment length? #223

Answered by jongwook
neongreen asked this question in Q&A

The current decoding strategy samples a timestamp token whenever the total probability summed over all timestamp tokens exceeds the probability of the single most likely text token:

whisper/whisper/decoding.py

Lines 431 to 437 in 0b1ba3d

# if sum of probability over timestamps is above any other token, sample timestamp
logprobs = F.log_softmax(logits.float(), dim=-1)
for k in range(tokens.shape[0]):
    timestamp_logprob = logprobs[k, self.tokenizer.timestamp_begin :].logsumexp(dim=-1)
    max_text_token_logprob = logprobs[k, : self.tokenizer.timestamp_begin].max()
    if timestamp_logprob > max_text_token_logprob:
        logits[k, : self.tokenizer.timestamp_begin] = -np.inf
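
To make the rule concrete, here is a minimal standalone sketch of the same comparison on a toy vocabulary, using only the Python standard library. The function name `apply_timestamp_rule`, the 7-token vocabulary, and the probabilities are all made up for illustration and are not part of Whisper; only the comparison itself mirrors the excerpt above.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over a plain list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def logsumexp(values):
    # log(sum(exp(v))) computed stably.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def apply_timestamp_rule(logits, timestamp_begin):
    """Hypothetical helper: mask text tokens when timestamps win in aggregate.

    Indices below `timestamp_begin` are text tokens; the rest are timestamps.
    """
    logprobs = log_softmax(logits)
    timestamp_logprob = logsumexp(logprobs[timestamp_begin:])
    max_text_token_logprob = max(logprobs[:timestamp_begin])
    if timestamp_logprob > max_text_token_logprob:
        # Force a timestamp by making every text token impossible.
        return [-math.inf] * timestamp_begin + list(logits[timestamp_begin:])
    return list(logits)


# Toy vocabulary (assumed): 3 text tokens followed by 4 timestamp tokens.
TIMESTAMP_BEGIN = 3

# Case 1: each timestamp (0.125) is less likely than the best text token (0.30),
# but together the timestamps hold 0.50 of the mass, so text tokens are masked.
logits_a = [math.log(p) for p in [0.30, 0.15, 0.05, 0.125, 0.125, 0.125, 0.125]]
masked_a = apply_timestamp_rule(logits_a, TIMESTAMP_BEGIN)

# Case 2: timestamps hold only 0.20 in total versus 0.60 for the best text
# token, so the logits are left untouched.
logits_b = [math.log(p) for p in [0.60, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05]]
masked_b = apply_timestamp_rule(logits_b, TIMESTAMP_BEGIN)
```

The first case shows why the rule uses the summed timestamp probability: a timestamp can be forced even when no individual timestamp token is the most likely next token.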

Y…
