TLDR: Developers can now specify the `seed` parameter in the Chat Completions request to receive (mostly) consistent outputs. To help you keep track of backend changes, we expose the `system_fingerprint` field. If this value differs, you may see different outputs due to changes we've made on our systems. Please note that this feature is in beta and currently only supported for `gpt-4-1106-preview` and `gpt-3.5-turbo-1106`.
## Context
Reproducibility has always been a big request from user communities using our APIs. For instance, reproducible numerical results unlock many use cases that are sensitive to numerical changes.
## Model level features for consistent outputs
The Chat Completions and Completions APIs are non-deterministic by default (meaning model outputs may differ from request to request), but they now offer some control towards deterministic outputs using a few model-level parameters.

This unlocks consistent completions, giving you fuller control over model behavior for anything built on top of the APIs. It is quite useful for reproducing results and for testing, since you get peace of mind from knowing exactly what the model will return.
## Implementing consistent outputs
To receive mostly deterministic outputs across API calls:

- Set the `seed` parameter to any integer of your choice, and use the same value across requests (for example, `12345`).
- Set all other parameters (prompt, temperature, top_p, etc.) to the same values across requests.
- In the response, check the `system_fingerprint` field. The system fingerprint is an identifier for the current combination of model weights, infrastructure, and other configuration options used by OpenAI servers to generate the completion. It changes whenever you change request parameters, or when OpenAI updates the numerical configuration of the infrastructure serving our models (which may happen a few times a year).
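The steps above amount to pinning every request parameter and reusing it verbatim. A minimal sketch in Python (the model name, seed value, and prompt below are illustrative placeholders, not required values):

```python
# Pin the seed and all sampling parameters, then reuse the exact same
# dict for every request so nothing varies between calls by accident.
FIXED_REQUEST = {
    "model": "gpt-3.5-turbo-1106",
    "seed": 12345,          # any integer, reused verbatim on each call
    "temperature": 0.7,     # keep all sampling parameters identical too
    "top_p": 1.0,
    "max_tokens": 200,
}

messages = [{"role": "user", "content": "Tell me a short story."}]

# With the openai v1 Python client, each request would then be sent as:
#   client.chat.completions.create(messages=messages, **FIXED_REQUEST)
# and response.system_fingerprint recorded for later comparison.
```

Keeping the parameters in a single shared dict makes it harder to accidentally drift between requests, which is the most common cause of non-reproducible outputs.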
If the `seed`, request parameters, and `system_fingerprint` all match across your requests, then model outputs will mostly be identical. There is a small chance that responses differ even when request parameters and `system_fingerprint` match, due to the inherent non-determinism of our models.
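For example, before treating two differing responses as unexpected, you might first confirm that their fingerprints match. A small sketch using hypothetical response records (the fingerprint values are made up) in place of real API responses:

```python
def same_backend(resp_a: dict, resp_b: dict) -> bool:
    """Outputs are only expected to match when both calls were served
    by the same backend configuration (identical system_fingerprint)."""
    return resp_a["system_fingerprint"] == resp_b["system_fingerprint"]

# Hypothetical response records standing in for real API responses.
a = {"system_fingerprint": "fp_aaaa1111", "output": "Hello!"}
b = {"system_fingerprint": "fp_aaaa1111", "output": "Hello!"}
c = {"system_fingerprint": "fp_bbbb2222", "output": "Hi there."}

print(same_backend(a, b))  # True: differing outputs here would be surprising
print(same_backend(a, c))  # False: backend changed, differences are expected
```

Only when `same_backend` returns `True` and all request parameters were identical should a differing output be attributed to the models' residual non-determinism.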