Experiment with different parameter values
Each time you send a prompt to a model, it uses certain parameters that influence how the response is generated. Adjusting these values can help you fine-tune the output to better fit your task. Different models may offer different parameter options, but here are the most commonly used ones:

Max Output Tokens
This limits the maximum length of the response. One token is roughly 4 characters or ¾ of a word.
- Use a lower value for short answers
- Use a higher value for longer, more detailed responses
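To get a feel for what a token budget means in practice, here is a toy sketch of the "1 token ≈ 4 characters" rule of thumb above. The function name is made up for illustration; real tokenizers count differently, so treat this only as a ballpark estimate.

```python
def estimate_tokens(text):
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Real tokenizers split text differently, so this is only a ballpark figure
    for sizing a max-output-tokens budget.
    """
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello, world!"))  # 13 characters -> about 3 tokens
```

For example, if you want answers of roughly 100 words, a budget of about 130–140 output tokens (100 ÷ ¾) is a reasonable starting point.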
Temperature
Controls how creative or random the response is.
- Lower values (e.g., 0.2) make outputs more focused and consistent
- Higher values (e.g., 0.7+) lead to more varied, creative responses
- A temperature of 0 always picks the most likely next word (deterministic)
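The effect of temperature can be sketched with a small softmax example. This is a toy illustration of the math, not any provider's API: the logit values are made up, and temperature 0 is handled as greedy (most-likely-token) selection, matching the deterministic behavior described above.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities; lower temperature sharpens the distribution."""
    if temperature == 0:
        # Deterministic: all probability mass goes to the most likely token
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.2))  # sharply peaked on the first token
print(softmax_with_temperature(logits, 1.0))  # probability spread more evenly
print(softmax_with_temperature(logits, 0))    # greedy: [1.0, 0.0, 0.0]
```

Notice that the same logits yield very different distributions: at 0.2 the top token dominates, while at 1.0 the alternatives keep a meaningful share of the probability.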
Top-K
Limits the model’s choices for the next word to the top K most likely tokens.
- Top-K = 1 gives the most predictable result
- Top-K = 3 or more allows more randomness and variation

Top-K is often used together with temperature and top-P.
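A minimal sketch of Top-K filtering, assuming we already have a probability distribution over candidate tokens (the probabilities here are invented for illustration): keep only the K most likely options, zero out the rest, and renormalize.

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize the distribution."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])  # indices of the k most likely tokens
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, 1))  # [1.0, 0.0, 0.0, 0.0] — most predictable
print(top_k_filter(probs, 3))  # mass spread across the top three tokens
```

With K = 1 the model always picks the single most likely token; with K = 3 the final choice is sampled from the three survivors, which is where temperature then comes into play.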
Top-P
Instead of picking from a fixed number of top options like Top-K, Top-P chooses from the smallest set of words whose total probability exceeds the specified threshold.
- Lower top-P (e.g., 0.5) = safer, more focused responses
- Higher top-P (e.g., 0.95) = more variety and creativity
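Top-P (nucleus) filtering can be sketched the same way as Top-K, again with made-up probabilities: walk down the tokens in order of likelihood, stop once the cumulative probability reaches the threshold, and renormalize over the survivors.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in ranked:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # the "nucleus" is complete
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.5))   # only the top token survives
print(top_p_filter(probs, 0.95))  # the top three tokens survive
```

Unlike Top-K, the number of surviving tokens adapts to the distribution: when the model is confident, the nucleus is small; when many tokens are plausible, more of them stay in play.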