module medagentgym.agent
The MedAgent class serves as an agent designed to interact with Electronic Health Record (EHR) environments using Large Language Models (LLMs). It dynamically selects appropriate model configurations, manages conversation history, and processes environment observations to generate actions drawn from the set of permitted actions.
dataclass LLMConfig
A dataclass defining basic configuration parameters for the LLM:
Args:
- model_path (str): Path to the LLM model.
- tokenizer_path (str): Path to the tokenizer.
- max_length (int): Maximum token length for generated sequences.
- device (str): Device specification (e.g., "cpu", "cuda").
- port (int): Network port for model deployment (used primarily for vLLM). The default value is 8000.
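Expressed as code, this configuration maps onto a small dataclass. A minimal sketch, assuming the documented field names; only the port default of 8000 comes from the description above, the device default is illustrative:

```python
from dataclasses import dataclass


@dataclass
class LLMConfig:
    """Basic configuration parameters for the LLM backend."""
    model_path: str       # path to the LLM model
    tokenizer_path: str   # path to the tokenizer
    max_length: int       # maximum token length for generated sequences
    device: str = "cuda"  # device specification, e.g. "cpu" or "cuda" (default is an assumption)
    port: int = 8000      # network port for model deployment (used primarily for vLLM)
```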
class MedAgent
Initialization: Initializes the agent with the provided configuration and permitted actions:
- agent_config (dict): Configuration dictionary containing model specifications, retry logic, and other operational parameters.
- permitted_actions (list): List of permitted actions the agent can perform.
The initialization dynamically sets up an LLM model based on the specified type (OpenAI, Azure, or vLLM). It also initializes:
- Conversation history management.
- Prompt handling with DynamicPrompt.
- LLM response parsing.
- Cost tracking.
Supported LLM Types:
- OpenAI: Standard OpenAI models.
- Azure: Azure-hosted models with configurable deployment parameters.
- vLLM: Locally hosted vLLM instances on specific ports.
Raises an error if an unsupported model type is provided.
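The constructor itself is not reproduced here; the sketch below only illustrates the type-based dispatch and bookkeeping described above. Configuration keys such as model_type, deployment_name, and port are taken from the usage example and the LLMConfig description; everything else is an assumption:

```python
class MedAgent:
    """Sketch of the type-based dispatch performed during initialization."""

    SUPPORTED_TYPES = ("OpenAI", "Azure", "vLLM")

    def __init__(self, agent_config: dict, permitted_actions: list):
        self.config = agent_config
        self.permitted_actions = permitted_actions

        llm_cfg = agent_config["llm"]
        model_type = llm_cfg["model_type"]
        if model_type not in self.SUPPORTED_TYPES:
            raise ValueError(f"Unsupported model type: {model_type}")

        # Backend-specific setup (details differ per provider).
        if model_type == "Azure":
            self.deployment_name = llm_cfg["deployment_name"]
        elif model_type == "vLLM":
            self.port = llm_cfg.get("port", 8000)

        # Conversation history and cost tracking
        # (prompt handling via DynamicPrompt is omitted in this sketch).
        self.history: list = []
        self.total_cost: float = 0.0
```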
method act
act(obs: Any) -> Tuple[str, Dict[str, Any]]
Generates actions based on the observation (obs) from the MedAgentGym environment.
Input:
- obs (dict): A dictionary containing keys such as info and env_message. Typically includes the current instruction, task goals, or environment feedback.
Output: Returns a tuple:
- action (str): Action identified by the LLM based on observations and conversation history.
- params (dict): Additional parameters or details associated with the action.
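A minimal sketch of how a caller might consume the returned (action, params) tuple; the handler branches are hypothetical, and only the return contract comes from this documentation:

```python
from typing import Any, Dict, Tuple

def handle_step(agent, obs: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Ask the agent for its next action and branch on the result."""
    action, params = agent.act(obs)
    if action == "retrieve":
        print("retrieve with", params)   # e.g. look up records
    elif action == "update":
        print("update with", params)     # e.g. modify a record
    elif action == "submit":
        print("submit", params)          # e.g. finalize the answer
    return action, params
```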
Process Flow
Initialization of Conversation: On first interaction:
- Extracts and formats permitted actions.
- Constructs a prompt from instruction, action_definitions, and action_formats.
- Initializes the conversation history with system and user messages based on the LLM type.
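A minimal sketch of this first-turn setup, assuming a chat-style message list; the exact template wording is an assumption:

```python
def init_conversation(instruction: str, action_definitions: str, action_formats: str) -> list:
    """Build the initial system/user messages for the first interaction."""
    system_prompt = (
        "You are an agent operating in an EHR environment.\n"
        f"Permitted actions:\n{action_definitions}\n"
        f"Action formats:\n{action_formats}"
    )
    user_prompt = f"Instruction: {instruction}"
    # Chat-style history; the exact roles/layout depend on the LLM type.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
```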
LLM Interaction:
- Attempts LLM invocation, retrying up to n_retry times upon failure.
- Logs errors and appends error-handling messages to the conversation history.
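A minimal sketch of the retry behavior described above, assuming a generic query_llm callable; the error-message wording and logging format are illustrative:

```python
import logging
import time

def invoke_with_retry(query_llm, history: list, n_retry: int, retry_delay: float) -> str:
    """Call the LLM, retrying on failure and recording errors in the history."""
    for attempt in range(n_retry):
        try:
            return query_llm(history)
        except Exception as exc:
            logging.error("LLM call failed (attempt %d/%d): %s", attempt + 1, n_retry, exc)
            # Surface the failure to the model so it can adjust on the next attempt.
            history.append({"role": "user", "content": f"The previous call failed: {exc}"})
            time.sleep(retry_delay)
    raise RuntimeError(f"LLM invocation failed after {n_retry} attempts")
```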
Response Parsing:
- Parses the response using parse_llm_response.
- Adds the parsed actions and response to the conversation history.
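parse_llm_response is not documented here; the sketch below assumes the model is asked to reply with an action name followed by a JSON parameter block, which is an assumption about the output format:

```python
import json
import re
from typing import Any, Dict, Tuple

def parse_llm_response(response: str) -> Tuple[str, Dict[str, Any]]:
    """Extract an (action, params) pair from the raw LLM response."""
    # Expected shape (assumed): 'Action: retrieve\nParams: {"patient_id": "1234"}'
    action_match = re.search(r"Action:\s*(\w+)", response)
    params_match = re.search(r"Params:\s*(\{.*\})", response, re.DOTALL)
    if not action_match:
        raise ValueError(f"Could not parse an action from: {response!r}")
    params = json.loads(params_match.group(1)) if params_match else {}
    return action_match.group(1), params
```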
Error Handling:
- Handles exceptions in parsing and LLM response generation gracefully by retrying and appending error messages.
Error Handling
- Implements retries upon encountering exceptions.
- Delays retries based on retry_delay specified in the agent configuration.
- Logs detailed error messages to aid debugging.
Usage Example
```python
agent_config = {
    'llm': {
        'model_type': 'Azure',
        'model_name': 'gpt-4.1-mini',
        'temperature': 0.0,
        'max_new_tokens': 8192,
        'deployment_name': 'gpt-4.1-mini',
        'log_probs': False
    },
    'n_retry': 3,
    'retry_delay': 10
}
permitted_actions = ['retrieve', 'update', 'submit']
agent = EHRAgent(agent_config, permitted_actions)

obs = {
    'info': {'instruction': 'Retrieve patient data', 'task_goal': 'Get recent medical records'},
    'env_message': 'Patient ID 1234'
}
action, params = agent.act(obs)
```
This documentation covers the essential methods, configuration options, and detailed error-handling processes within the EHRAgent class for effective integration and utilization.