What is a token?
You can think of tokens as pieces of words used for natural language processing. For English text, 1 token is approximately 4 characters or 0.75 words. As a point of reference, the collected works of Shakespeare are about 900,000 words or 1.2 million tokens.
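The 4-characters-per-token rule of thumb above can be sketched as a quick estimator. This is only a heuristic for English text, not the actual tokenizer; exact counts come from the model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text using the
    ~4 characters-per-token rule of thumb from the FAQ."""
    return max(1, round(len(text) / 4))

# A 40-character English string is roughly 10 tokens by this heuristic.
print(estimate_tokens("a" * 40))  # 10
```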
Which model should I use?
While Davinci (text-davinci-003) is generally the most capable model, the other models can perform certain tasks extremely well, and in some cases significantly faster. They also have cost advantages. For example, Curie can perform many of the same tasks as Davinci, but faster and at 1/10 the cost. We encourage developers to experiment to find the most efficient model for their application. Visit our documentation for a more detailed model comparison.
How will I know how many tokens I've used each month?
Log in to your account to view your usage dashboard. This page shows how many tokens you have used during the current and past billing cycles.
How can I manage my expenses?
You can set a hard usage limit in your billing settings, after which we will stop fulfilling your requests. You can also set up a soft threshold to receive an email alert once you exceed a certain usage threshold. There may be a delay in applying the limit and you are responsible for any overage incurred. We recommend checking your usage dashboard regularly to monitor your spending.
Is the ChatGPT API included in the ChatGPT Plus subscription?
No, the ChatGPT API and the ChatGPT Plus subscription are billed separately. The API has its own pricing. The ChatGPT Plus subscription covers chat usage on chat.openai.com only and costs $20/month.
Does using the Playground count against my quota?
Yes, we treat Playground usage the same as normal API usage.
How is the price calculated for completions?
Completion requests are charged based on the number of tokens sent in your prompt plus the number of tokens in the completion(s) returned by the API.
The best_of and n parameters can also affect costs. Since these parameters generate multiple completions per prompt, they act as multipliers on the number of tokens returned.
Your request can use up to num_tokens(prompt) + max_tokens * max(n, best_of) tokens, which will be charged at the per-engine rates described at the top of this page.
In the simplest case, if your prompt contains 10 tokens and you request a single 90-token completion from the davinci engine, your request will use 100 tokens and cost $0.002.
You can limit costs by reducing prompt length or maximum response length, limiting the use of best_of/n, adding appropriate stop sequences, or using engines with lower per-token costs.
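The billing formula above can be sketched in code. This is a minimal illustration of the stated upper bound, using the $0.02-per-1K-token davinci rate implied by the $0.002 example; the actual per-engine rates are listed at the top of the pricing page.

```python
def max_billable_tokens(prompt_tokens: int, max_tokens: int,
                        n: int = 1, best_of: int = 1) -> int:
    # Upper bound from the formula above: prompt tokens are billed once,
    # and up to max_tokens may be generated for each of max(n, best_of)
    # completions.
    return prompt_tokens + max_tokens * max(n, best_of)

def max_cost_usd(prompt_tokens: int, max_tokens: int, price_per_1k: float,
                 n: int = 1, best_of: int = 1) -> float:
    # price_per_1k is the per-engine rate in USD per 1,000 tokens.
    return max_billable_tokens(prompt_tokens, max_tokens, n, best_of) * price_per_1k / 1000

# The example from the text: 10-token prompt, one 90-token completion,
# at an assumed $0.02 per 1K tokens -> 100 tokens, $0.002.
print(max_cost_usd(10, 90, 0.02))  # 0.002
```

Note how best_of=5 would multiply the completion portion fivefold even though only one completion is returned.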
How is the price calculated for fine-tuning?
There are two components to fine-tuning pricing: training and usage.
When training a fine-tuned model, the total tokens used will be billed at our training rates. Note that the number of training tokens depends on the number of tokens in your training dataset and your chosen number of training epochs. The default number of epochs is 4.
(Tokens in your training file * Number of training epochs) = Total training tokens
Once you have fine-tuned a model, you will be billed only for the tokens you use. Requests sent to fine-tuned models are billed at our usage rates.
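The training-token formula above can be sketched as follows. The per-1K training rate passed in is a placeholder; the actual rates depend on the base model and are listed on the pricing page.

```python
def training_tokens(tokens_in_file: int, epochs: int = 4) -> int:
    # Total training tokens = tokens in your training file * number of
    # training epochs (default 4), per the formula above.
    return tokens_in_file * epochs

def training_cost_usd(tokens_in_file: int, price_per_1k: float,
                      epochs: int = 4) -> float:
    # price_per_1k is the training rate in USD per 1,000 tokens
    # for the chosen base model (a placeholder here).
    return training_tokens(tokens_in_file, epochs) * price_per_1k / 1000

# e.g. a 100,000-token training file at the default 4 epochs
# is billed as 400,000 training tokens.
print(training_tokens(100_000))  # 400000
```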
Is there an SLA on the various models?
We will publish an SLA shortly. In the meantime, you can visit our status page to monitor service availability and view historical uptime. If your company or application has specific requirements, please contact our sales team.
Is the API available on Microsoft Azure?
Yes. Azure customers can access the OpenAI API on Azure with the compliance, regional support, and enterprise-grade security that Azure provides.