temperature parameter in shared/llm.py is silently ignored at generation time

The `temperature` parameter on `LocalLLM.__init__` is passed to `Llama(...)` at
construction, but current versions of `llama-cpp-python` apply temperature at
sampling time (per call), not at model load. As a result, changing the value
in `__init__` has no effect on output: the library's own per-call default is
used for every generation.

Separately, no `seed` is passed to `Llama(...)`, so even when temperature is
applied correctly, separate runs can still produce identical responses whenever
the library falls back to a fixed default seed.

This affects Lesson 01's Exercise 2 ("Change the temperature in shared/llm.py
and observe the response"). The change is silently ignored.

Repro:
1. Set `temperature: float = 1.5` in `LocalLLM.__init__`.
2. Run `lesson_01_basic_chat()` twice with the same prompt.
3. Observe byte-identical responses.

Proposed fix:
- Store `self.temperature = temperature` in `__init__`.
- In `generate()`, fall back to `self.temperature` when no override is passed, and forward the resulting value to the per-call sampling API.
- Add `seed=-1` to the `Llama(...)` call so each load gets fresh randomness.
- Remove the now-unused `temperature=temperature` from the `Llama(...)` call.
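
A minimal sketch of what the steps above could look like. I have not copied this from `shared/llm.py`, so treat the details as assumptions: the `generate(prompt, temperature=None)` signature, the `0.7` default, and the `backend_cls` argument are hypothetical. `FakeBackend` is a stand-in for `llama_cpp.Llama` so the snippet runs without a model file; it only mimics the shape of `create_completion(...)`. The real `Llama(...)` call would still need `model_path` and whatever other arguments the file already passes.

```python
from typing import Any, Optional


class FakeBackend:
    """Hypothetical stand-in for llama_cpp.Llama; records the kwargs it receives."""

    def __init__(self, **load_kwargs: Any) -> None:
        self.load_kwargs = load_kwargs          # arguments given at model load
        self.last_call_kwargs: dict = {}        # arguments given at the latest sampling call

    def create_completion(self, prompt: str, **call_kwargs: Any) -> dict:
        self.last_call_kwargs = call_kwargs
        # Same response shape as a completion dict: choices[0]["text"].
        return {"choices": [{"text": f"echo: {prompt}"}]}


class LocalLLM:
    def __init__(self, temperature: float = 0.7, backend_cls: type = FakeBackend) -> None:
        # Keep the default on the instance instead of handing it to the constructor.
        self.temperature = temperature
        # seed=-1 asks for a random seed; temperature is deliberately not passed here.
        self.llm = backend_cls(seed=-1)

    def generate(self, prompt: str, temperature: Optional[float] = None) -> str:
        # Compare against None so that an explicit override of 0.0 is honored.
        effective = self.temperature if temperature is None else temperature
        # Temperature is supplied per call, which is where sampling reads it.
        result = self.llm.create_completion(prompt, temperature=effective)
        return result["choices"][0]["text"]
```

Checking `temperature is None` rather than using `temperature or self.temperature` matters here: the latter would treat a requested temperature of `0.0` as "no override" and quietly substitute the instance default.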

Happy to submit a PR if this is a confirmed bug.
