qianfan.resources.llm package

Submodules

qianfan.resources.llm.base module

class qianfan.resources.llm.base.BaseResource(model: Optional[str] = None, endpoint: Optional[str] = None, **kwargs: Any)[source]

Bases: object

base class of Qianfan object

access_token() → str[source]: get access token

classmethod models() → Set[str][source]: get all supported model names

class qianfan.resources.llm.base.BatchRequestFuture(tasks: Sequence[Callable[[], Union[QfResponse, Iterator[QfResponse]]]], worker_num: Optional[int] = None)[source]

Bases: object

Future object for batch request

finished_count() → int[source]: Return the number of tasks that have been finished

results() → List[Union[QfResponse, Iterator[QfResponse], Exception]][source]: Wait for all tasks to be finished, and return the results. The order of the elements in the output is the same as the order of the elements in the input.

task_count() → int[source]: Return the total count of tasks

wait() → None[source]: Wait for all tasks to be finished

qianfan.resources.llm.chat_completion module

class qianfan.resources.llm.chat_completion.ChatCompletion(model: Optional[str] = None, endpoint: Optional[str] = None, **kwargs: Any)[source]

Bases: BaseResource

QianFan ChatCompletion is an agent for calling QianFan ChatCompletion API.

async abatch_do(messages_list: List[Union[List[Dict], QfMessages]], worker_num: Optional[int] = None, **kwargs: Any) → List[Union[QfResponse, AsyncIterator[QfResponse]]][source]

Async batch perform chat-based language generation using user-supplied messages.

Parameters:

messages_list: List[Union[List[Dict], QfMessages]]:: List of the messages list in the conversation. Please refer to ChatCompletion.do for more information of each messages.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to ChatCompletion.do for other parameters such as model, endpoint, retry_count, etc.

``` response_list = await ChatCompletion().abatch_do([…], worker_num = 10) for response in response_list:

# response is QfResponse if succeed, or response will be exception print(response)

```

async ado(messages: Union[List[Dict], QfMessages], model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, auto_concat_truncate: bool = False, truncated_continue_prompt: str = '继续', **kwargs: Any) → Union[QfResponse, AsyncIterator[QfResponse]][source]

Async perform chat-based language generation using user-supplied messages.

Parameters:

messages (Union[List[Dict], QfMessages]):: A list of messages in the conversation including the one from system. Each message should be a dictionary containing ‘role’ and ‘content’ keys, representing the role (either ‘user’, or ‘assistant’) and content of the message, respectively. Alternatively, you can provide a QfMessages object for convenience.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
auto_concat_truncate (bool):: [Experimental] If set to True, continuously requesting will be run until is_truncated is False. As a result, the entire reply will be returned. Cause this feature highly relies on the understanding ability of LLM, Use it carefully.
truncated_continue_prompt (str):: [Experimental] The prompt to use when requesting more content for auto truncated reply.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` ChatCompletion().ado(messages = ..., temperature = 0.2, top_p = 0.5) `

batch_do(messages_list: Union[List[List[Dict]], List[QfMessages]], worker_num: Optional[int] = None, **kwargs: Any) → BatchRequestFuture[source]

Batch perform chat-based language generation using user-supplied messages.

Parameters:

messages_list: List[Union[List[Dict], QfMessages]]:: List of the messages list in the conversation. Please refer to ChatCompletion.do for more information of each messages.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to ChatCompletion.do for other parameters such as model, endpoint, retry_count, etc.

``` response_list = ChatCompletion().batch_do([…], worker_num = 10) for response in response_list:

# return QfResponse if succeed, or exception will be raised print(response.result())

# or while response_list.finished_count() != response_list.task_count():

time.sleep(1)

print(response_list.results()) ```

do(messages: Union[List[Dict], QfMessages], model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, auto_concat_truncate: bool = False, truncated_continue_prompt: str = '继续', **kwargs: Any) → Union[QfResponse, Iterator[QfResponse]][source]

Perform chat-based language generation using user-supplied messages.

Parameters:

messages (Union[List[Dict], QfMessages]):: A list of messages in the conversation including the one from system. Each message should be a dictionary containing ‘role’ and ‘content’ keys, representing the role (either ‘user’, or ‘assistant’) and content of the message, respectively. Alternatively, you can provide a QfMessages object for convenience.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
auto_concat_truncate (bool):: [Experimental] If set to True, continuously requesting will be run until is_truncated is False. As a result, the entire reply will be returned. Cause this feature highly relies on the understanding ability of LLM, Use it carefully.
truncated_continue_prompt (str):: [Experimental] The prompt to use when requesting more content for auto truncated reply.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` ChatCompletion().do(messages = ..., temperature = 0.2, top_p = 0.5) `

qianfan.resources.llm.completion module

class qianfan.resources.llm.completion.Completion(model: Optional[str] = None, endpoint: Optional[str] = None, **kwargs: Any)[source]

Bases: BaseResource

QianFan Completion is an agent for calling QianFan completion API.

async abatch_do(prompt_list: List[str], worker_num: Optional[int] = None, **kwargs: Any) → List[Union[QfResponse, AsyncIterator[QfResponse]]][source]

Async batch generate a completion based on the user-provided prompt.

Parameters:

prompt_list (List[str]):: The input prompt list to generate the continuation from.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Completion.ado for other parameters such as model, endpoint, retry_count, etc.

``` response_list = await Completion().abatch_do([…], worker_num = 10) for response in response_list:

# response is QfResponse if succeed, or response will be exception print(response)

```

async ado(prompt: str, model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, AsyncIterator[QfResponse]][source]

Async generate a completion based on the user-provided prompt.

Parameters:

prompt (str):: The input prompt to generate the continuation from.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Completion().do(prompt = ..., temperature = 0.2, top_p = 0.5) `

batch_do(prompt_list: List[str], worker_num: Optional[int] = None, **kwargs: Any) → BatchRequestFuture[source]

Batch generate a completion based on the user-provided prompt.

Parameters:

prompt_list (List[str]):: The input prompt list to generate the continuation from.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Completion.do for other parameters such as model, endpoint, retry_count, etc.

``` response_list = Completion().batch_do([”…”, “…”], worker_num = 10) for response in response_list:

# return QfResponse if succeed, or exception will be raised print(response.result())

# or while response_list.finished_count() != response_list.task_count():

time.sleep(1)

print(response_list.results()) ```

do(prompt: str, model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, Iterator[QfResponse]][source]

Generate a completion based on the user-provided prompt.

Parameters:

prompt (str):: The input prompt to generate the continuation from.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Completion().do(prompt = ..., temperature = 0.2, top_p = 0.5) `

qianfan.resources.llm.embedding module

class qianfan.resources.llm.embedding.Embedding(model: Optional[str] = None, endpoint: Optional[str] = None, **kwargs: Any)[source]

Bases: BaseResource

QianFan Embedding is an agent for calling QianFan embedding API.

async abatch_do(texts_list: List[List[str]], worker_num: Optional[int] = None, **kwargs: Any) → List[Union[QfResponse, AsyncIterator[QfResponse]]][source]

Async batch generate embeddings for a list of input texts using a specified model.

Parameters:

texts_list (List[List[str]]):: List of the input text list to generate the embeddings.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Embedding.ado for other parameters such as model, endpoint, retry_count, etc.

``` response_list = await Embedding().abatch_do([…], worker_num = 10) for response in response_list:

# response is QfResponse if succeed, or response will be exception print(response)

```

async ado(texts: List[str], model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, AsyncIterator[QfResponse]][source]

Async generate embeddings for a list of input texts using a specified model.

Parameters:

texts (List[str]):: A list of input texts for which embeddings need to be generated.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Embedding().do(texts = ..., temperature = 0.2, top_p = 0.5) `

batch_do(texts_list: List[List[str]], worker_num: Optional[int] = None, **kwargs: Any) → BatchRequestFuture[source]

Batch generate embeddings for a list of input texts using a specified model.

Parameters:

texts_list (List[List[str]]):: List of the input text list to generate the embeddings.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Completion.do for other parameters such as model, endpoint, retry_count, etc.

``` response_list = Completion().batch_do([”…”, “…”], worker_num = 10) for response in response_list:

# return QfResponse if succeed, or exception will be raised print(response.result())

# or while response_list.finished_count() != response_list.task_count():

time.sleep(1)

print(response_list.results()) ```

do(texts: List[str], model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, Iterator[QfResponse]][source]

Generate embeddings for a list of input texts using a specified model.

Parameters:

texts (List[str]):: A list of input texts for which embeddings need to be generated.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Embedding().do(texts = ..., temperature = 0.2, top_p = 0.5) `

qianfan.resources.llm.plugin module

class qianfan.resources.llm.plugin.Plugin(model: str = 'EBPlugin', endpoint: Optional[str] = None, **kwargs: Any)[source]

Bases: BaseResource

QianFan Plugin API Resource

async abatch_do(query_list: List[Union[str, QfMessages, List[Dict]]], worker_num: Optional[int] = None, **kwargs: Any) → List[Union[QfResponse, AsyncIterator[QfResponse]]][source]

Async batch execute a plugin action on the provided input prompt and generate responses.

Parameters:

query_list List[Union[str, QfMessages, List[Dict]]]:: The list user input messages or prompt for which a response is generated.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Plugin.ado for other parameters such as model, endpoint, retry_count, etc.

``` response_list = await Plugin().abatch_do([…], worker_num = 10) for response in response_list:

# response is QfResponse if succeed, or response will be exception print(response)

```

async ado(query: Union[str, QfMessages, List[Dict]], plugins: Optional[List[str]] = None, model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, AsyncIterator[QfResponse]][source]

Async execute a plugin action on the provided input prompt and generate responses.

Parameters:

query Union[str, QfMessages, List[Dict]]:: The user input for which a response is generated. Concretely, the following types are supported:

query should be str for qianfan plugin, while query should be either QfMessages or list for EBPlugin
plugins (Optional[List[str]]):: A list of plugins to be used.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Plugin().do(prompt = ..., temperature = 0.2, top_p = 0.5) `

batch_do(query_list: List[Union[str, QfMessages, List[Dict]]], worker_num: Optional[int] = None, **kwargs: Any) → BatchRequestFuture[source]

Batch generate execute a plugin action on the provided input prompt and generate responses.

Parameters:

query_list List[Union[str, QfMessages, List[Dict]]]:: The list user input messages or prompt for which a response is generated.
worker_num (Optional[int]):: The number of prompts to process at the same time, default to None, which means this number will be decided dynamically.
kwargs (Any):: Please refer to Plugin.do for other parameters such as model, endpoint, retry_count, etc.

``` response_list = Plugin().batch_do([”…”, “…”], worker_num = 10) for response in response_list:

# return QfResponse if succeed, or exception will be raised print(response.result())

# or while response_list.finished_count() != response_list.task_count():

time.sleep(1)

print(response_list.results()) ```

do(query: Union[str, QfMessages, List[Dict]], plugins: Optional[List[str]] = None, model: Optional[str] = None, endpoint: Optional[str] = None, stream: bool = False, retry_count: int = 1, request_timeout: float = 60, request_id: Optional[str] = None, backoff_factor: float = 1, **kwargs: Any) → Union[QfResponse, Iterator[QfResponse]][source]

Execute a plugin action on the provided input prompt and generate responses.

Parameters:

query Union[str, QfMessages, List[Dict]]:: The user input for which a response is generated. Concretely, the following types are supported:

query should be str for qianfan plugin, while query should be either QfMessages or list for EBPlugin
plugins (Optional[List[str]]):: A list of plugins to be used.
model (Optional[str]):: The name or identifier of the language model to use. If not specified, the default model is used(ERNIE-Bot-turbo).
endpoint (Optional[str]):: The endpoint for making API requests. If not provided, the default endpoint is used.
stream (bool):: If set to True, the responses are streamed back as an iterator. If False, a single response is returned.
retry_count (int):: The number of times to retry the request in case of failure.
request_timeout (float):: The maximum time (in seconds) to wait for a response from the model.
backoff_factor (float):: A factor to increase the waiting time between retry attempts.
kwargs (Any):: Additional keyword arguments that can be passed to customize the request.

Additional parameters like temperature will vary depending on the model, please refer to the API documentation. The additional parameters can be passed as follows:

` Plugin().do(prompt = ..., temperature = 0.2, top_p = 0.5) `