How many characters of the transcript are grouped into each chunk when searching for relevant context. Increase for longer, more complete passages; decrease if answers pull in too much irrelevant text.
How many characters adjacent chunks share. Higher overlap reduces the chance of missing context that falls at a chunk boundary, at the cost of some redundancy.
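The interaction between these two settings can be sketched with a minimal character-based splitter. This is an illustrative assumption, not the tool's actual implementation; the function name and defaults are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Adjacent chunks share `overlap` characters, so content that
    straddles a boundary still appears whole in at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each chunk advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With `chunk_size=1000` and `overlap=200`, each chunk starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.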
Controls how creative or random responses are. Lower values (e.g. 0.1) give focused, factual answers closely tied to the transcript; higher values (e.g. 0.8) produce more varied, exploratory responses.
Extra instructions sent to the model with every question. Edit or clear to customise how the model responds.
The maximum length of each response. A token is roughly ¾ of a word. Use 256–512 for concise answers; 1024–2048 if you need detailed summaries or the model is cutting off mid-sentence.
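The ¾-of-a-word rule of thumb above gives a quick way to size this limit. A rough back-of-the-envelope helper (the heuristic varies by model and tokenizer, so treat this as an estimate only):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token is about 3/4 of a word,
    so token count is approximately word_count / 0.75."""
    words = len(text.split())
    return round(words / 0.75)
```

For example, a 300-word answer is roughly 400 tokens, so a 512-token limit leaves comfortable headroom for concise answers.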