

@soukouki

With lightweight models, such as those typically run as local LLMs, inference can continue indefinitely without stopping, endlessly emitting meaningless tokens. To address this, this PR adds options for a stop token and a maximum token limit.
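To illustrate the idea, here is a minimal sketch of a generation loop that enforces both criteria. This is not the code from this PR; the function and parameter names (`next_token`, `stop_token`, `max_tokens`) are assumptions made for the example.

```python
from typing import Callable, Optional


def generate(
    next_token: Callable[[list[str]], str],
    stop_token: Optional[str] = None,
    max_tokens: Optional[int] = None,
) -> str:
    """Token-by-token generation loop with optional stopping criteria (sketch)."""
    tokens: list[str] = []
    while True:
        # Hard cap: stop once the maximum number of tokens has been produced.
        if max_tokens is not None and len(tokens) >= max_tokens:
            break
        token = next_token(tokens)
        # Stop as soon as the configured stop token is emitted.
        if stop_token is not None and token == stop_token:
            break
        tokens.append(token)
    return "".join(tokens)


if __name__ == "__main__":
    # Dummy "model" that never stops on its own: without a limit it would loop forever.
    endless = lambda toks: "x"
    print(generate(endless, max_tokens=5))  # prints "xxxxx" instead of running indefinitely
```

Either criterion alone is enough to terminate the loop, so a model that never emits the stop token is still bounded by `max_tokens`.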

@soukouki soukouki marked this pull request as draft November 27, 2025 08:41
@soukouki soukouki force-pushed the sou7/added-stop-tokens-and-max-tokens branch from cbd1727 to 651126f on November 27, 2025 11:48
@soukouki soukouki marked this pull request as ready for review November 27, 2025 11:49
