phasellm.llms#

Abstract classes and wrappers for LLMs, chatbots, and prompts.

Module Contents#

Classes#

Message

A TypedDict representing a single chat message, with 'role' and 'content' keys.

EnhancedMessage

A Message extended with response metadata: a UTC timestamp and the response time in seconds.

LanguageModelWrapper

Abstract class for interacting with large language models.

StreamingLanguageModelWrapper

Abstract class for streaming language models. Extends the regular LanguageModelWrapper.

ChatPrompt

This is used to generate messages for a ChatBot. Like the Prompt class, it enables you to have variables that get replaced.

Prompt

Prompts are used to generate text completions. Prompts can be simple Strings. They can also include variables surrounded by curly braces.

HuggingFaceInferenceWrapper

Wrapper for Hugging Face's Inference API. Requires a Hugging Face API key.

BloomWrapper

Wrapper for Hugging Face's BLOOM model. Requires access to Hugging Face's inference API.

StreamingOpenAIGPTWrapper

Streaming-compliant wrapper for the OpenAI API. Supports all major text and chat completion models by OpenAI.

OpenAIGPTWrapper

Wrapper for the OpenAI API. Supports all major text and chat completion models by OpenAI.

StreamingVertexAIWrapper

Streaming wrapper for Vertex AI LLMs. Supports all major text and chat completion models, including Gemini.

VertexAIWrapper

Wrapper for Vertex AI LLMs. Supports all major text and chat completion models, including Gemini.

StreamingClaudeWrapper

Streaming wrapper for Anthropic's Claude large language model.

ClaudeWrapper

Wrapper for Anthropic's Claude large language model.

GPT2Wrapper

Wrapper for GPT-2 implementation (via Hugging Face).

DollyWrapper

Wrapper for Dolly 2.0 (via Hugging Face).

ReplicateLlama2Wrapper

Wrapper for Llama 2, provided via Replicate. See https://replicate.com/ for more information.

CohereWrapper

Wrapper for Cohere's API.

ChatBot

Allows you to have a chat conversation with an LLM wrapper.

Functions#

_fill_variables(→ str)

Fills variables in a string with the values provided in kwargs.

_clean_messages_to_prompt(→ str)

Converts an array of messages in the form {"role": <str>, "content":<str>} into a String.

_truncate_completion(→ str)

Truncates a completion to the first newline character.

_remove_prompt_from_completion(→ str)

Removes the prompt from the completion.

_get_stop_sequences_from_messages(→ List[str])

Generates a list of strings of stop sequences from an array of messages in the form {"role": <str>, "content": <str>}.

_format_sse(→ str)

Formats the content for Server Sent Events (SSE). Additionally, handles newline characters gracefully.

_conditional_format_sse_response(→ str)

Conditionally formats the response as an SSE.

swap_roles(→ List[Message])

Creates a new messages stack with the new_prompt as the system prompt and the 'user' and 'assistant' roles swapped.

Attributes#

phasellm.llms.variable_pattern = '\\{\\s*[a-zA-Z0-9_]+\\s*\\}'#
phasellm.llms.variable_regex#
phasellm.llms.STOP_TOKEN = '<|END|>'#
class phasellm.llms.Message#

Bases: typing_extensions.TypedDict

A TypedDict representing a single chat message, with 'role' and 'content' keys.

role: str#
content: str#
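
Example

Since Message is a TypedDict, instances are constructed as plain dictionaries (an illustrative sketch):

>>> msg: Message = {"role": "user", "content": "Hello!"}
>>> msg["content"]
'Hello!'
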
class phasellm.llms.EnhancedMessage#

Bases: Message

A Message extended with response metadata: a UTC timestamp and the response time in seconds.

timestamp_utc: datetime.datetime#
log_time_seconds: float#
phasellm.llms._fill_variables(source: str, **kwargs: Any) str#

Fills variables in a string with the values provided in kwargs.

Parameters:
  • source – The string to fill.

  • **kwargs – The values to fill the string with.

Returns:

The filled string.
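
Example

An illustrative sketch; variables use the {name} syntax matched by variable_pattern:

>>> _fill_variables("Hello {name}!", name="World")
'Hello World!'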

phasellm.llms._clean_messages_to_prompt(messages: List[Message]) str#

Converts an array of messages in the form {“role”: <str>, “content”:<str>} into a String.

This is influenced by the OpenAI chat completion API.

Parameters:

messages – The messages to convert.

Returns:

The messages as a String.
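
Example

An illustrative sketch; the exact role/content separator is an implementation detail:

>>> _clean_messages_to_prompt([
...     {"role": "user", "content": "Hello"},
...     {"role": "assistant", "content": "Hi there"}
... ])
'user: Hello\nassistant: Hi there'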

phasellm.llms._truncate_completion(completion: str) str#

Truncates a completion to the first newline character.

Parameters:

completion – The completion to truncate.

Returns:

The truncated completion.
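
Example

An illustrative sketch, assuming truncation at the first newline as described above:

>>> _truncate_completion("First line\nSecond line")
'First line'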

phasellm.llms._remove_prompt_from_completion(prompt: str, completion: str) str#

Remove the prompt from the completion.

Parameters:
  • prompt – The prompt to remove.

  • completion – The completion to remove the prompt from.

Returns:

The completion without the prompt.
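
Example

An illustrative sketch; whether leading whitespace survives is an implementation detail:

>>> _remove_prompt_from_completion("Hello, my name is", "Hello, my name is Sam.")
' Sam.'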

phasellm.llms._get_stop_sequences_from_messages(messages: List[Message]) List[str]#

Generates a list of strings of stop sequences from an array of messages in the form {“role”: <str>, “content”:<str>}.

Parameters:

messages – The messages to generate stop sequences from.

Returns:

A list of stop sequences.
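
Example

An illustrative sketch; the exact shape of the stop sequences (e.g. whether a leading newline is included) is an implementation detail:

>>> stops = _get_stop_sequences_from_messages([
...     {"role": "user", "content": "Hello"},
...     {"role": "assistant", "content": "Hi"}
... ])  # e.g. sequences derived from the roles, such as 'user:' and 'assistant:'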

phasellm.llms._format_sse(content: str) str#

Formats the content for Server Sent Events (SSE). Additionally, handles newline characters gracefully.

Parameters:

content – The content to format.

Returns:

The formatted content.
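
Example

An illustrative sketch, assuming the conventional SSE framing of 'data: <content>' terminated by a blank line:

>>> _format_sse("Hello")
'data: Hello\n\n'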

phasellm.llms._conditional_format_sse_response(content: str, format_sse: bool) str#

Conditionally formats the response as an SSE.

Parameters:
  • content – The content to format.

  • format_sse – Whether to format the response as an SSE.

Returns:

The formatted content.

phasellm.llms.swap_roles(messages: List[Message], new_prompt: str) List[Message]#

Creates a new messages stack with the new_prompt as the system prompt and the ‘user’ and ‘assistant’ roles swapped. All other messages are ignored.

Parameters:
  • messages – the current messages.

  • new_prompt – the new system prompt.

Returns:

A new list of messages with the new_prompt as the system prompt and user/assistant prompts swapped out.
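
Example

An illustrative sketch:

>>> messages = [
...     {"role": "system", "content": "Old system prompt."},
...     {"role": "user", "content": "A question?"},
...     {"role": "assistant", "content": "An answer."}
... ]
>>> swap_roles(messages, "New system prompt.")
[{'role': 'system', 'content': 'New system prompt.'}, {'role': 'assistant', 'content': 'A question?'}, {'role': 'user', 'content': 'An answer.'}]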

class phasellm.llms.LanguageModelWrapper(temperature: float | None = None, **kwargs: Any)#

Bases: abc.ABC

Abstract class for interacting with large language models.

Parameters:
  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the underlying language model API.

property last_response_header: dict | None#

Returns the last response header from the LLM API.

Returns:

A dictionary containing the last response header.

chat_completion_preamble: str = "You are a friendly chat assistant. You are speaking to the 'user' below and will respond at the..."#
_last_response_header: dict | None#
__repr__()#

Return repr(self).

abstract complete_chat(messages: List[Message], append_role: str | None = None, prepend_role: str | None = None) str | Generator#

Takes an array of messages in the form {“role”: <str>, “content”:<str>} and generates a response.

This is influenced by the OpenAI chat completion API.

Parameters:
  • messages – The messages to generate a response from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion string or generator, depending on whether the class is implemented as a streaming language model wrapper.

abstract text_completion(prompt: str) str | Generator#

Standardizes text completion for large language models.

Parameters:

prompt – The prompt to generate a response from.

Returns:

The text completion string or generator, depending on whether the class is implemented as a streaming language model wrapper.

prep_prompt_from_messages(messages: List[Message] = None, prepend_role: str | None = None, append_role: str | None = None, include_preamble: bool | None = False) str#

Prepares the prompt for an LLM API call.

Parameters:
  • messages – The messages to prepare the prompt from.

  • prepend_role – The role to prepend to the beginning of the prompt.

  • append_role – The role to append to the end of the prompt.

  • include_preamble – Whether to include the chat completion preamble.

Returns:

The prepared prompt.

static prep_prompt(prompt: str, prepend_role: str | None = None, append_role: str | None = None) str#

Prepares the prompt for an LLM API call.

Parameters:
  • prompt – The prompt to prepare.

  • prepend_role – The role to prepend to the beginning of the prompt.

  • append_role – The role to append to the end of the prompt.

Returns:

The prepared prompt.

_prep_common_kwargs(api_config: phasellm.types.OPENAI_API_CONFIG | None = None)#

This method prepares the common kwargs for the OpenAI APIs.

Returns:

The kwargs to pass to the API.

class phasellm.llms.StreamingLanguageModelWrapper(temperature: float, format_sse: bool, append_stop_token: bool = True, stop_token: str = STOP_TOKEN, **kwargs: Any)#

Bases: LanguageModelWrapper

Abstract class for streaming language models. Extends the regular LanguageModelWrapper.

Parameters:
  • temperature – The temperature to use for the language model.

  • format_sse – Whether to format the response as an SSE.

  • append_stop_token – Whether to append a stop token to the end of the prompt.

  • stop_token – The stop token to append to the end of the prompt.

  • **kwargs – Keyword arguments to pass to the underlying language model APIs.

class phasellm.llms.ChatPrompt(messages: List[Message] = None)#

This is used to generate messages for a ChatBot. Like the Prompt class, it enables you to have variables that get replaced. This can be done for roles and messages.
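
Example

An illustrative sketch of filling message variables:

>>> from phasellm.llms import ChatPrompt
>>> cp = ChatPrompt([
...     {"role": "system", "content": "You are a {adjective} assistant."},
...     {"role": "user", "content": "Hello, {name}!"}
... ])
>>> cp.fill(adjective="helpful", name="PhaseLLM")
[{'role': 'system', 'content': 'You are a helpful assistant.'}, {'role': 'user', 'content': 'Hello, PhaseLLM!'}]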

Parameters:

messages – The messages to generate a chat prompt from.

__repr__()#

Return repr(self).

chat_repr() str#

Returns a string representation of the chat prompt.

Returns:

The string representation of the chat prompt.

fill(**kwargs) List[Message]#

Fills the variables in the chat prompt.

Parameters:

**kwargs – The variables to fill.

Returns:

The filled chat prompt.

class phasellm.llms.Prompt(prompt: str)#

Prompts are used to generate text completions. Prompts can be simple Strings. They can also include variables surrounded by curly braces.

Example

>>> Prompt("Hello {name}!")
In this case, 'name' can be filled using the fill() function. This makes it easier to loop through prompts
that follow a specific pattern or structure.
Parameters:

prompt – The prompt to generate a text completion from.

__repr__()#

Return repr(self).

get_prompt() str#

Return the raw prompt command (i.e., does not fill in variables.)

Returns:

The raw prompt command.

fill(**kwargs: Any) str#

Return a prompt with variables filled in.

Parameters:

**kwargs – The variables to fill.

Returns:

The filled prompt.

class phasellm.llms.HuggingFaceInferenceWrapper(apikey: str, model_url: str = 'https://api-inference.huggingface.co/models/bigscience/bloom', temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Hugging Face’s Inference API. Requires a Hugging Face API key.
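
Example

A minimal usage sketch (the API key and model URL are placeholders):

>>> from phasellm.llms import HuggingFaceInferenceWrapper
>>> llm = HuggingFaceInferenceWrapper(
...     apikey="my-api-key",
...     model_url="https://api-inference.huggingface.co/models/bigscience/bloom"
... )
>>> llm.text_completion(prompt="Hello, my name is")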

Parameters:
  • apikey – The API key to access the Hugging Face Inference API.

  • model_url – The model URL to use for the Hugging Face Inference API.

  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the Hugging Face Inference API.

__repr__()#

Return repr(self).

_call_model(prompt: str) str#

This method is used to call the Hugging Face Inference API. It is used by the complete_chat() and text_completion() methods.

Parameters:

prompt – The prompt to call the model with.

Returns:

The response from the Hugging Face Inference API.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Mimics a chat scenario via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion.

text_completion(prompt: str) str#

Generates a text completion from a prompt.

Parameters:

prompt – The prompt to generate a text completion from.

Returns:

The text completion.

class phasellm.llms.BloomWrapper(apikey: str, temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Hugging Face’s BLOOM model. Requires access to Hugging Face’s inference API.

Parameters:
  • apikey – The API key to access the Hugging Face Inference API.

  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the underlying language model API.

API_URL = 'https://api-inference.huggingface.co/models/bigscience/bloom'#
__repr__()#

Return repr(self).

_call_model(prompt: str) str#

This method is used to call the Hugging Face Inference API. It is used by the complete_chat() and text_completion() methods.

Parameters:

prompt – The prompt to call the model with.

Returns:

The response from the Hugging Face Inference API.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Mimics a chat scenario with BLOOM, via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion.

text_completion(prompt: str) str#

Completes text via BLOOM (Hugging Face).

Parameters:

prompt – The prompt to generate a text completion from.

Returns:

The text completion.

class phasellm.llms.StreamingOpenAIGPTWrapper(apikey: str | None = None, model: str = 'gpt-3.5-turbo', format_sse: bool = False, append_stop_token: bool = True, stop_token: str = STOP_TOKEN, temperature: float = None, api_config: phasellm.types.OPENAI_API_CONFIG | None = None, **kwargs: Any)#

Bases: StreamingLanguageModelWrapper

Streaming-compliant wrapper for the OpenAI API. Supports all major text and chat completion models by OpenAI.

This wrapper can be configured to use OpenAI’s API or Microsoft Azure’s API. To use Azure, pass in the appropriate api_config. To use OpenAI’s API, pass in an apikey and model. If both api_config and apikey are passed in, api_config takes precedence.

Examples

>>> from phasellm.llms import StreamingOpenAIGPTWrapper
Use OpenAI’s API:
>>> llm = StreamingOpenAIGPTWrapper(apikey="my-api-key", model="gpt-3.5-turbo")
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Use OpenAI’s API with api_config:
>>> from phasellm.configurations import OpenAIConfiguration
>>> llm = StreamingOpenAIGPTWrapper(api_config=OpenAIConfiguration(
...     api_key="my-api-key",
...     organization="my-org",
...     model="gpt-3.5-turbo"
... ))
Use Azure’s API:
>>> from phasellm.configurations import AzureAPIConfiguration
>>> llm = StreamingOpenAIGPTWrapper(api_config=AzureAPIConfiguration(
...     api_key="azure_api_key",
...     api_base='https://{your-resource-name}.openai.azure.com/openai/deployments/{your-deployment-id}',
...     api_version='2023-05-15',
...     deployment_id='your-deployment-id'
... ))
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Use Azure’s API with Active Directory authentication:
>>> from phasellm.configurations import AzureActiveDirectoryConfiguration
>>> llm = StreamingOpenAIGPTWrapper(api_config=AzureActiveDirectoryConfiguration(
...     api_base='https://{your-resource-name}.openai.azure.com/openai/deployments/{your-deployment-id}',
...     api_version='2023-05-15',
...     deployment_id='your-deployment-id'
... ))
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Parameters:
  • apikey – The API key to access the OpenAI API.

  • model – The model to use. Defaults to “gpt-3.5-turbo”.

  • format_sse – Whether to format the SSE response from OpenAI. Defaults to False.

  • append_stop_token – Whether to append the stop token to the end of the prompt. Defaults to True.

  • stop_token – The stop token to use. Defaults to <|END|>.

  • temperature – The temperature to use for the language model.

  • api_config – The API configuration to use. Defaults to None. Takes precedence over apikey and model.

  • **kwargs – Keyword arguments to pass to the OpenAI API.

__repr__()#

Return repr(self).

_yield_response(response) Generator#

Yields the response content. Can handle multiple API versions.

Parameters:

response – The response to yield text from.

Returns:

Text generator

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) Generator#

Completes chat with OpenAI. If using GPT 3.5 or 4, it will simply send the list of {“role”: <str>, “content”:<str>} objects to the API.

If using an older model, it will structure the messages list into a prompt first.

Yields the text as it is generated, rather than waiting for the entire completion.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion generator.

text_completion(prompt: str, stop_sequences: List[str] = None) Generator#

Completes text via OpenAI. Note that this doesn’t support GPT 3.5 or later, as they are chat models.

Yields the text as it is generated, rather than waiting for the entire completion.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion generator.

_set_last_response_header(response: httpx.Response) None#

Sets the last response header.

Parameters:

response – The response to set the last response header from.

Returns:

None

class phasellm.llms.OpenAIGPTWrapper(apikey: str | None = None, model: str = 'gpt-3.5-turbo', temperature: float = None, api_config: phasellm.types.OPENAI_API_CONFIG | None = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for the OpenAI API. Supports all major text and chat completion models by OpenAI.

This wrapper can be configured to use OpenAI’s API or Microsoft Azure’s API. To use Azure, pass in the appropriate api_config. To use OpenAI’s API, pass in an apikey and model. If both api_config and apikey are passed in, api_config takes precedence.

Examples

>>> from phasellm.llms import OpenAIGPTWrapper
Use OpenAI’s API:
>>> llm = OpenAIGPTWrapper(apikey="my-api-key", model="gpt-3.5-turbo")
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Use OpenAI’s API with api_config:
>>> from phasellm.configurations import OpenAIConfiguration
>>> llm = OpenAIGPTWrapper(api_config=OpenAIConfiguration(
...     api_key="my-api-key",
...     organization="my-org",
...     model="gpt-3.5-turbo"
... ))
Use Azure’s API:
>>> from phasellm.configurations import AzureAPIConfiguration
>>> llm = OpenAIGPTWrapper(api_config=AzureAPIConfiguration(
...     api_key="azure_api_key",
...     api_base='https://{your-resource-name}.openai.azure.com/openai/deployments/{your-deployment-id}',
...     api_version='2023-08-01-preview',
...     deployment_id='your-deployment-id'
... ))
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Use Azure’s API with Active Directory authentication:
>>> from phasellm.configurations import AzureActiveDirectoryConfiguration
>>> llm = OpenAIGPTWrapper(api_config=AzureActiveDirectoryConfiguration(
...     api_base='https://{your-resource-name}.openai.azure.com/openai/deployments/{your-deployment-id}',
...     api_version='2023-08-01-preview',
...     deployment_id='your-deployment-id'
... ))
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is ChatGPT."
Parameters:
  • apikey – The API key to access the OpenAI API.

  • model – The model to use. Defaults to “gpt-3.5-turbo”.

  • temperature – The temperature to use for the language model.

  • api_config – The API configuration to use. Defaults to None. Takes precedence over apikey and model.

  • **kwargs – Keyword arguments to pass to the OpenAI API.

__repr__()#

Return repr(self).

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Completes chat with OpenAI. If using GPT 3.5 or 4, it will simply send the list of {“role”: <str>, “content”:<str>} objects to the API.

If using an older model, it will structure the messages list into a prompt first.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion.

text_completion(prompt: str, stop_sequences: List[str] = None) str#

Completes text via OpenAI. Note that this doesn’t support GPT 3.5 or later, as they are chat models.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

_set_last_response_header(response: httpx.Response) None#

Sets the last response header.

Parameters:

response – The response to set the last response header from.

Returns:

None

class phasellm.llms.StreamingVertexAIWrapper(model: str = None, format_sse: bool = False, append_stop_token: bool = True, stop_token: str = STOP_TOKEN, temperature: float = None, api_config: phasellm.types.VERTEXAI_API_CONFIG | None = None, **kwargs: Any)#

Bases: StreamingLanguageModelWrapper

Streaming wrapper for Vertex AI LLMs. Supports all major text and chat completion models, including Gemini.

This wrapper depends on Google’s Application Default Credentials (ADC) to authenticate.

Setting up ADC:

  1. Install the Google Cloud SDK: https://cloud.google.com/sdk/docs/install

  2. Authenticate with gcloud: https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login

>>> gcloud auth application-default login

Example

>>> from phasellm.llms import StreamingVertexAIWrapper
Use Vertex AI’s API:
>>> llm = StreamingVertexAIWrapper(model="gemini-1.0-pro-001")
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemeni."

Note that if no model is passed, the default model “gemini-1.0-pro-001” is used.

Use Vertex AI’s API with api_config:
>>> from phasellm.configurations import VertexAIConfiguration
>>> llm = StreamingVertexAIWrapper(api_config=VertexAIConfiguration(
...     model="gemini-1.0-pro-001"
... ))
Use temperature parameter:
>>> llm = StreamingVertexAIWrapper(model="gemini-1.0-pro-001", temperature=0.5)
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemini."
Use max_output_tokens parameter:
>>> llm = StreamingVertexAIWrapper(model="gemini-1.0-pro-001", max_output_tokens=50)
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemini."
Potential parameters (model dependent):
  • max_output_tokens

  • candidate_count

  • top_p

  • top_k

  • logprobs

  • presence_penalty

  • frequency_penalty

  • logit_bias

Parameters:
  • model – The model to use. Defaults to “gemini-1.0-pro-001”.

  • format_sse – Whether to format the response as an SSE.

  • append_stop_token – Whether to append a stop token to the end of the prompt.

  • stop_token – The stop token to append to the end of the prompt.

  • temperature – The temperature to use for the language model.

  • api_config – The API configuration to use. Defaults to None. Takes precedence over model.

  • **kwargs – Keyword arguments to pass to the Vertex AI API.

__repr__()#

Return repr(self).

_call_model(prompt: str, stop_sequences: List[str]) Generator#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion generator.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) Generator#

Completes chat with Vertex AI.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion generator.

text_completion(prompt: str, stop_sequences: List[str] = None) Generator#

Completes text based on provided prompt.

Yields the text as it is generated, rather than waiting for the entire completion.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion generator.

class phasellm.llms.VertexAIWrapper(model: str = None, temperature: float = None, api_config: phasellm.types.VERTEXAI_API_CONFIG | None = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Vertex AI LLMs. Supports all major text and chat completion models, including Gemini.

This wrapper depends on Google’s Application Default Credentials (ADC) to authenticate.

Setting up ADC:

  1. Install the Google Cloud SDK: https://cloud.google.com/sdk/docs/install

  2. Authenticate with gcloud: https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login

>>> gcloud auth application-default login

Example

>>> from phasellm.llms import VertexAIWrapper
Text completion:
>>> llm = VertexAIWrapper(model="gemini-1.0-pro-001")
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemeni."

Note that if no model is passed, the default model “gemini-1.0-pro-001” is used.

Configure with api_config:
>>> from phasellm.configurations import VertexAIConfiguration
>>> llm = VertexAIWrapper(api_config=VertexAIConfiguration(
...     model="gemini-1.0-pro-001"
... ))
Use temperature parameter:
>>> llm = VertexAIWrapper(model="gemini-1.0-pro-001", temperature=0.5)
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemeni."
Use max_output_tokens parameter:
>>> llm = VertexAIWrapper(model="gemini-1.0-pro-001", max_output_tokens=50)
>>> llm.text_completion(prompt="Hello, my name is")
"Hello, my name is Gemeni."
Potential parameters (model dependent):
  • max_output_tokens

  • candidate_count

  • top_p

  • top_k

  • logprobs

  • presence_penalty

  • frequency_penalty

  • logit_bias

Parameters:
  • model – The model to use. Defaults to “gemini-1.0-pro-001”.

  • temperature – The temperature to use for the language model.

  • api_config – The API configuration to use. Defaults to None. Takes precedence over model.

  • **kwargs – Keyword arguments to pass to the Vertex AI API.

__repr__()#

Return repr(self).

_call_model(prompt: str, stop_sequences: List[str]) str#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Completes chat.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt.

  • prepend_role – The role to prepend to the beginning of the prompt.

Returns:

The chat completion.

text_completion(prompt: str, stop_sequences: List[str] = None) str#

Completes text based on provided prompt.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

class phasellm.llms.StreamingClaudeWrapper(apikey: str, model: phasellm.types.CLAUDE_MODEL = 'claude-2', format_sse: bool = False, append_stop_token: bool = True, stop_token: str = STOP_TOKEN, temperature: float = None, anthropic_version: str = '2023-06-01', **kwargs: Any)#

Bases: StreamingLanguageModelWrapper

Streaming wrapper for Anthropic’s Claude large language model.

We’ve opted to call Anthropic’s API directly rather than using their Python offering.

Yields the text as it is generated, rather than waiting for the entire completion.
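
Example

A minimal usage sketch showing how the streaming generator is consumed (the API key is a placeholder):

>>> from phasellm.llms import StreamingClaudeWrapper
>>> llm = StreamingClaudeWrapper(apikey="my-api-key", model="claude-2")
>>> for chunk in llm.text_completion(prompt="Hello, my name is"):
...     print(chunk, end="")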

Parameters:
  • apikey – The API key to access the Anthropic API.

  • model – The model to use. Defaults to “claude-2”.

  • format_sse – Whether to format the SSE response. Defaults to False.

  • append_stop_token – Whether to append the stop token to the end of the prompt. Defaults to True.

  • stop_token – The stop token to use. Defaults to <|END|>.

  • temperature – The temperature to use for the language model.

  • anthropic_version – The version of the Anthropic API to use. See https://docs.anthropic.com/claude/reference/versioning

  • **kwargs – Keyword arguments to pass to the Anthropic API.

API_URL = 'https://api.anthropic.com/v1/complete'#
__repr__()#

Return repr(self).

_call_model(prompt: str, stop_sequences: List[str]) Generator#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion generator.

complete_chat(messages: List[Message], append_role: str = 'Assistant', prepend_role: str = 'Human') Generator#

Completes chat with Claude. Since Claude doesn’t support a chat interface via API, we mimic the chat via a prompt.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to “Assistant:”.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to “Human:”.

Returns:

The chat completion generator.

text_completion(prompt: str, stop_sequences: List[str] = None) Generator#

Completes text based on provided prompt.

Yields the text as it is generated, rather than waiting for the entire completion.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion generator.

class phasellm.llms.ClaudeWrapper(apikey: str, model: phasellm.types.CLAUDE_MODEL = 'claude-2', temperature: float = None, anthropic_version: str = '2023-06-01', **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Anthropic’s Claude large language model.

We’ve opted to call Anthropic’s API directly rather than using their Python offering.

See here for model options: https://docs.anthropic.com/claude/reference/selecting-a-model
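
Example

A minimal usage sketch (the API key is a placeholder):

>>> from phasellm.llms import ClaudeWrapper
>>> llm = ClaudeWrapper(apikey="my-api-key", model="claude-2")
>>> llm.complete_chat([{"role": "user", "content": "Hello!"}])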

Parameters:
  • apikey – The API key to access the Anthropic API.

  • model – The model to use. Defaults to “claude-2”.

  • temperature – The temperature to use for the language model.

  • anthropic_version – The version of the Anthropic API to use. See https://docs.anthropic.com/claude/reference/versioning

  • **kwargs – Keyword arguments to pass to the Anthropic API.

API_URL = 'https://api.anthropic.com/v1/complete'#
__repr__()#

Return repr(self).

_call_model(prompt: str, messages: List[Message]) str#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to call the model with.

  • messages – The messages to generate stop sequences from.

Returns:

The completion.

complete_chat(messages: List[Message], append_role: str = 'Assistant', prepend_role: str = 'Human') str#

Completes chat with Claude. Since Claude doesn’t support a chat interface via API, we mimic the chat via a prompt.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to “Assistant:”.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to “Human:”.

Returns:

The chat completion.

text_completion(prompt: str, stop_sequences: List[str] = None) str#

Completes text based on provided prompt.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

class phasellm.llms.GPT2Wrapper(temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for GPT-2 implementation (via Hugging Face).

Note that you must have the phasellm[complete] extra installed to use this wrapper.

Parameters:
  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the GPT-2 model.

__repr__()#

Return repr(self).

_call_model(prompt: str, max_length: int = 300) str#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to call the model with.

  • max_length – The maximum length of the completion. Defaults to 300.

Returns:

The completion.

complete_chat(messages: List[Message], append_role: str = None, max_length: int = 300, prepend_role: str = None) str#

Mimics a chat scenario via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to None.

  • max_length – The maximum length of the completion. Defaults to 300.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to None.

Returns:

The chat completion.

text_completion(prompt: str, max_length: int = 200) str#

Completes text via GPT-2.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • max_length – The maximum length of the completion. Defaults to 200.

Returns:

The text completion.

class phasellm.llms.DollyWrapper(temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Dolly 2.0 (via Hugging Face).

Note that you must have the phasellm[complete] extra installed to use this wrapper.

Parameters:
  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the Dolly model.

__repr__()#

Return repr(self).

_call_model(prompt: str) str#

Calls the model with the given prompt.

Parameters:

prompt – The prompt to call the model with.

Returns:

The completion.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Mimics a chat scenario via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to None.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to None.

Returns:

The chat completion.

text_completion(prompt: str) str#

Completes text via Dolly.

Parameters:

prompt – The prompt to generate a text completion from.

Returns:

The text completion.

class phasellm.llms.ReplicateLlama2Wrapper(apikey: str, model: str = 'meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3', temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Llama 2, provided via Replicate. See https://replicate.com/ for more information.
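
Example

A minimal usage sketch (the API key is a placeholder; omitting model uses the default Llama 2 70B chat model):

>>> from phasellm.llms import ReplicateLlama2Wrapper
>>> llm = ReplicateLlama2Wrapper(apikey="my-api-key")
>>> llm.text_completion(prompt="Hello, my name is")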

Parameters:
  • apikey – The Replicate API key to use.

  • model – The Llama 2 model to use.

  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the API.

base_system_chat_prompt = 'You are a friendly chatbot.'#

Used in defining the system prompt for Llama 2 calls. Only used if a system prompt doesn’t exist in the message stack provided.

Type:

str

first_user_message = 'Hi.'#

Used as the first ‘user’ message in chat completions when the chat’s first non-system message is from ‘assistant’.

Type:

str

__repr__()#

Return repr(self).

build_chat_completion_prompt(messages: List[Message])#

Converts a list of messages into a properly structured Llama 2 prompt for chats, as outlined here: https://huggingface.co/blog/llama2#how-to-prompt-llama-2

Note that we make a few changes to the process above: (1) if a system prompt does not exist, we use a very basic default prompt. Otherwise, we convert the system prompt to the relevant instructions in the Llama 2 chat prompt.

Next, we iterate through the rest of the message stack. We assume that the message stack contains alternating messages from the ‘user’ and ‘assistant’. If the first message outside of the system prompt is from the ‘assistant’, then we include a ‘Hi.’ message from the user.

Please note that this is a work in progress. If you have feedback on the process above, email w –at– phaseai –dot– com

Parameters:

messages – The messages to build the chat completion prompt from.

Returns:

The structured Llama 2 chat prompt.
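
For reference, the Llama 2 chat format described in the linked article looks roughly like the following (an illustrative sketch, not necessarily this method’s exact output):

<s>[INST] <<SYS>>
You are a friendly chatbot.
<</SYS>>

Hi. [/INST] Hello! How can I help? </s><s>[INST] How are you? [/INST]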

_clean_response(assistant_message: str) str#

Cleans up the chat response, mainly by stripping whitespace and removing “Assistant:” prepends.

Parameters:

assistant_message – The message received from the API.

Returns:

The chat completion, cleaned up.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Mimics a chat scenario via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to None.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to None.

Returns:

The chat completion.

text_completion(prompt: str, stop_sequences: List[str] = None) str#

Completes text via Replicate’s Llama 2 service.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

class phasellm.llms.CohereWrapper(apikey: str, model: str = 'xlarge', temperature: float = None, **kwargs: Any)#

Bases: LanguageModelWrapper

Wrapper for Cohere’s API.
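
Example

A minimal usage sketch (the API key is a placeholder):

>>> from phasellm.llms import CohereWrapper
>>> llm = CohereWrapper(apikey="my-api-key", model="xlarge")
>>> llm.text_completion(prompt="Hello, my name is")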

Parameters:
  • apikey – The API key to use.

  • model – The model to use. Defaults to “xlarge”.

  • temperature – The temperature to use for the language model.

  • **kwargs – Keyword arguments to pass to the Cohere API.

__repr__()#

Return repr(self).

_call_model(prompt, stop_sequences: List[str])#

Calls the model with the given prompt.

Parameters:
  • prompt – The prompt to call the model with.

  • stop_sequences – The stop sequences to use.

Returns:

The completion.

complete_chat(messages: List[Message], append_role: str = None, prepend_role: str = None) str#

Mimics a chat scenario via a list of {“role”: <str>, “content”:<str>} objects.

Parameters:
  • messages – The messages to generate a chat completion from.

  • append_role – The role to append to the end of the prompt. Defaults to None.

  • prepend_role – The role to prepend to the beginning of the prompt. Defaults to None.

Returns:

The chat completion.

text_completion(prompt: str, stop_sequences: List[str] = None) str#

Completes text via Cohere.

Parameters:
  • prompt – The prompt to generate a text completion from.

  • stop_sequences – The stop sequences to use. Defaults to None.

Returns:

The text completion.

class phasellm.llms.ChatBot(llm: LanguageModelWrapper, initial_system_prompt: str = 'You are a friendly chatbot assistant.')#

Allows you to have a chat conversation with an LLM wrapper.

In short, it manages the list of {“role”: <str>, “content”:<str>} objects for you, so you don’t have to manage the message stack yourself. It also interacts directly with the model.

Warning: not all LLMs are trained to use instructions provided in a system prompt.
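
Example

A minimal usage sketch, pairing the ChatBot with an OpenAI wrapper (any LanguageModelWrapper works; the API key is a placeholder):

>>> from phasellm.llms import ChatBot, OpenAIGPTWrapper
>>> llm = OpenAIGPTWrapper(apikey="my-api-key", model="gpt-3.5-turbo")
>>> cb = ChatBot(llm, initial_system_prompt="You are a helpful assistant.")
>>> cb.chat("What is an LLM?")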

Parameters:
  • llm – The LLM wrapper to use for the ChatBot.

  • initial_system_prompt – The initial system prompt to use. Defaults to “You are a friendly chatbot assistant.”. Use this to change the behavior of the chatbot.

_response(response: str, start_time: float) str#

Handles a response from the LLM by appending it to the message stack.

Parameters:
  • response – The response from the LLM.

  • start_time – The start time of the request.

Returns:

The response.

_streaming_response(response: Generator, start_time: float) Generator#

Handles a streaming response from the LLM by appending it to the message stack.

Since the response is a generator, we’ll need to intercept it so that we can append it to the message stack. (Generators only yield their results once.)

Parameters:
  • response – The response from the LLM.

  • start_time – The start time of the request.

Returns:

The response.

append_message(role: str, message: str, log_time_seconds: float = None) None#

Saves a message to the ChatBot message stack.

Parameters:
  • role – The role of the message.

  • message – The message.

  • log_time_seconds – The time it took to generate the message. Defaults to None.

resend() str | Generator | None#

If the last message in the messages stack (i.e. array of role and content pairs) is from the user, it will resend the message and return the response. It’s similar to erasing the last message in the stack and resending the last user message to the chat model.

This is useful if a model raises an error or if you are building a broader messages stack outside of the actual chatbot.

Returns:

The response from the chatbot if the last message in the stack was from the user. Otherwise, None.

chat(message: str) str | Generator#

Chats with the chatbot.

Parameters:

message – The message to send to the chatbot.

Returns:

The response from the chatbot. Either a string or a generator, depending on whether a streaming LLM wrapper is used.