negotiation_platform.core package

Core Platform Components

Central coordination layer for the negotiation platform architecture.

This package provides the core components that orchestrate negotiation sessions, manage AI models, compute performance metrics, and handle configuration. Together they form a framework for running and analyzing multi-agent negotiations.

Core Components:
  • LLMManager: AI model loading, management, and interaction coordination

  • GameEngine: Game instance creation, registration, and lifecycle management

  • MetricsCalculator: Performance metric computation and analysis

  • SessionManager: End-to-end negotiation session orchestration

  • ConfigManager: Configuration loading, validation, and management

Architecture:

The core components follow a modular design in which each component has a clear responsibility and a well-defined interface. They can be used independently or composed to run complete negotiation sessions.

Example

>>> from negotiation_platform.core import (
...     LLMManager, GameEngine, MetricsCalculator,
...     SessionManager, ConfigManager
... )
>>>
>>> # Initialize core components
>>> config_manager = ConfigManager("config.yaml")
>>> llm_manager = LLMManager(config_manager.get_model_configs())
>>> game_engine = GameEngine()
>>> metrics_calculator = MetricsCalculator()
>>>
>>> # Create session manager with all components
>>> session_manager = SessionManager(
...     llm_manager=llm_manager,
...     game_engine=game_engine,
...     metrics_calculator=metrics_calculator
... )
>>>
>>> # Run complete negotiation session
>>> result = session_manager.run_negotiation(
...     game_type="company_car",
...     players=["model_a", "model_b"]
... )
class negotiation_platform.core.LLMManager(model_configs: dict)[source]

Bases: object

Advanced Large Language Model management system with memory optimization and threading support.

The LLMManager orchestrates the complete lifecycle of AI model instances used in negotiations. It provides intelligent loading, caching, and memory management to efficiently handle multiple models while minimizing GPU memory usage and maximizing performance.

Core Capabilities:
  • Dynamic model registration from configuration files

  • Lazy loading with on-demand model instantiation

  • Intelligent memory management with LRU eviction

  • Thread-safe operations for concurrent negotiation sessions

  • Model aliasing for flexible configuration management

  • Shared model instances to reduce memory footprint

  • Plug-and-play architecture supporting multiple model types

Memory Strategy:

Uses a sophisticated multi-level caching system:
  1. Model Registry: Configurations for all available models

  2. Shared Instances: Actual loaded models (limited by max_loaded_models)

  3. Model Aliases: Mapping from model IDs to shared instances

  4. LRU Eviction: Automatic unloading of least recently used models
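The four levels above can be pictured with a small, self-contained sketch. TinyModelCache, FakeModel, and the model names are hypothetical stand-ins, not the platform's actual classes. The sketch only assumes that aliases map model IDs to a shared model name and that the least recently used instance is evicted once max_loaded_models would be exceeded.

```python
from typing import Any, Dict, List


class FakeModel:
    """Hypothetical stand-in for a loaded model wrapper."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name


class TinyModelCache:
    """Illustrative cache: registry, shared instances, aliases, LRU order."""

    def __init__(self, model_configs: Dict[str, Dict[str, Any]],
                 max_loaded_models: int = 2) -> None:
        self.model_configs = dict(model_configs)        # 1. registry of configurations
        self.shared_models: Dict[str, FakeModel] = {}   # 2. loaded instances by model name
        self.model_aliases = {                          # 3. model ID -> shared model name
            model_id: cfg["model_name"] for model_id, cfg in model_configs.items()
        }
        self.loaded_order: List[str] = []               # 4. LRU order, oldest first
        self.max_loaded_models = max_loaded_models

    def load(self, model_id: str) -> FakeModel:
        if model_id not in self.model_aliases:
            raise ValueError(f"Model {model_id!r} is not registered")
        shared_name = self.model_aliases[model_id]
        if shared_name in self.shared_models:
            # Cache hit: drop the old position so it can be re-appended as newest.
            self.loaded_order.remove(shared_name)
        else:
            # Cache miss: evict least recently used instances until there is room.
            while len(self.shared_models) >= self.max_loaded_models:
                evicted = self.loaded_order.pop(0)
                del self.shared_models[evicted]
            self.shared_models[shared_name] = FakeModel(shared_name)
        self.loaded_order.append(shared_name)
        return self.shared_models[shared_name]


cache = TinyModelCache(
    {
        "buyer": {"model_name": "model-x"},
        "seller": {"model_name": "model-x"},  # second alias of the same instance
        "judge": {"model_name": "model-y"},
        "coach": {"model_name": "model-z"},
    },
    max_loaded_models=2,
)
same_instance = cache.load("buyer") is cache.load("seller")
cache.load("judge")
cache.load("coach")  # evicts "model-x", the least recently used entry
loaded_now = sorted(cache.shared_models)
```

Keying the shared cache by model name rather than by model ID is what lets two negotiating agents backed by the same weights occupy a single slot.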

models

Registry of model ID to wrapper instances.

Type:

Dict[str, BaseLLMModel]

shared_models

Cache of actual loaded model instances.

Type:

Dict[str, BaseLLMModel]

model_aliases

Mapping from model IDs to shared model names.

Type:

Dict[str, str]

model_configs

Configuration for all registered models.

Type:

Dict[str, Dict[str, Any]]

max_loaded_models

Maximum number of models to keep loaded simultaneously.

Type:

int

loaded_order

LRU tracking for loaded models.

Type:

List[str]

manager_lock

Thread synchronization for safe concurrent access.

Type:

threading.Lock

Example

>>> model_configs = {
...     "model_a": {"model_name": "meta-llama/Llama-2-7b-chat-hf", "type": "huggingface"},
...     "model_b": {"model_name": "mistralai/Mistral-7B-Instruct-v0.1", "type": "huggingface"}
... }
>>> llm_manager = LLMManager(model_configs)
>>> response = llm_manager.get_response("Hello, how are you?", model_id="model_a")
>>> print(response)
I'm doing well, thank you for asking!
Thread Safety:

All public methods are thread-safe and can be called concurrently from multiple negotiation sessions without risk of race conditions or memory corruption.
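One common way to obtain this guarantee is to perform the "is it loaded?" check and the load itself under a single lock. The sketch below illustrates that idea only. LazyRegistry and its loader argument are hypothetical, and the actual locking strategy behind manager_lock may differ.

```python
import threading
from typing import Callable, Dict, List


class LazyRegistry:
    """Illustrative lock-guarded lazy loader (not the platform's class)."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._loaded: Dict[str, object] = {}
        self._lock = threading.Lock()
        self.load_calls = 0

    def get(self, model_id: str) -> object:
        # The check and the load share one critical section, so two threads
        # can never initialise the same model at the same time.
        with self._lock:
            if model_id not in self._loaded:
                self.load_calls += 1
                self._loaded[model_id] = self._loader(model_id)
            return self._loaded[model_id]


registry = LazyRegistry(loader=lambda model_id: object())
results: List[object] = []


def worker() -> None:
    results.append(registry.get("model_a"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_same = all(r is results[0] for r in results)
```

Holding the lock for the whole load serialises loads of different models too. That is the simplest correct choice, at the cost of some parallelism.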

__init__(model_configs: dict)[source]

Initialize the LLMManager with model configurations and setup internal data structures.

Creates a new LLMManager instance with the provided model configurations and initializes all internal data structures for model management, caching, and thread safety. Models are registered but not loaded until first requested.

Parameters:

model_configs (dict) –

Dictionary mapping model IDs to their configurations. Each configuration should contain:

  • model_name (str): HuggingFace model identifier

  • type (str): Model wrapper type (e.g., ‘huggingface’)

  • device (str): Target device (‘cuda:0’, ‘cpu’, etc.)

  • generation_config (dict): Model generation parameters

  • api_key (str): Authentication token (environment variable)

Architecture Setup:
  • models: Registry mapping model IDs to wrapper instances

  • shared_models: Cache of actual loaded model instances (limited size)

  • model_aliases: Mapping from model IDs to shared model names

  • loaded_order: LRU tracking list for memory management

  • manager_lock: Thread synchronization primitive

Memory Management:

The manager is configured to keep a maximum of 2 models loaded simultaneously (configurable via max_loaded_models). This prevents GPU memory exhaustion while maintaining reasonable performance for active negotiations.

Example

>>> configs = {
...     "model_a": {
...         "model_name": "meta-llama/Llama-2-7b-chat-hf",
...         "type": "huggingface",
...         "device": "cuda:0"
...     }
... }
>>> manager = LLMManager(configs)
>>> print(len(manager.model_configs))
1

Note

Models are registered during initialization but not loaded into memory. Loading occurs lazily when the first generation request is made.

register_model(model_id: str, model_config: Dict[str, Any])[source]

Register a new model configuration without loading it into memory.

Adds a model configuration to the registry, making it available for future loading and generation requests. The model is not loaded into memory during registration - this occurs lazily when first requested.

Parameters:
  • model_id (str) – Unique identifier for this model instance. Used to reference the model in generation requests.

  • model_config (Dict[str, Any]) – Configuration dictionary containing:

      • model_name (str): HuggingFace model identifier

      • type (str): Model wrapper type (currently supports ‘huggingface’)

      • device (str): Target device specification

      • generation_config (dict): Model-specific generation parameters

      • api_key (str): Authentication token reference

Supported Model Types:
  • huggingface: HuggingFace Transformers models with GPU/CPU support

Registration Process:
  1. Validates model type is supported

  2. Creates model alias mapping for shared instance management

  3. Stores configuration for lazy loading

  4. Sets up wrapper instance placeholder

Example

>>> manager = LLMManager({})
>>> config = {
...     "model_name": "meta-llama/Llama-2-7b-chat-hf",
...     "type": "huggingface",
...     "device": "cuda:0"
... }
>>> manager.register_model("my_model", config)
>>> "my_model" in manager.models
True
Raises:

ValueError – If the type given in model_config is not supported (currently only ‘huggingface’).

load_model(model_id: str)[source]

Load a specific model into memory with intelligent resource management.

Loads the specified model into GPU/CPU memory using a caching system that prevents memory overflow by automatically evicting least recently used models when the max_loaded_models limit is reached. Thread-safe loading ensures no concurrent model initialization conflicts.

Parameters:

model_id (str) – Registered identifier of the model to load into memory.

Returns:

The loaded model wrapper ready for text generation.

Return type:

BaseLLMModel

Memory Management:
  • Maintains LRU cache of loaded models

  • Automatically evicts the least recently used models when max_loaded_models is exceeded

  • Shares model instances across multiple aliases to save memory

  • Thread-safe loading prevents concurrent initialization conflicts

Example

>>> manager = LLMManager(model_configs)
>>> model = manager.load_model("llama-7b-chat")
>>> response = model.generate("Hello, how are you?")
Raises:

ValueError – If model_id is not registered in the manager.

Note

Loading is performed lazily - models are only loaded when explicitly requested or when generating responses.

switch_model(model_id: str)[source]

Switch the active model for subsequent generation requests.

Changes the default model used for generation requests that don’t specify a model_id. Enables dynamic model switching during runtime without requiring manager reconfiguration or restart.

Parameters:

model_id (str) – Registered identifier of the model to make active.

Lazy Loading:

If the target model is not currently loaded in memory, it will be loaded automatically using the intelligent memory management system.

Example

>>> manager.switch_model("llama-7b-chat")
>>> response = manager.get_response("Hello")  # Uses llama-7b-chat
>>> manager.switch_model("mistral-7b")
>>> response = manager.get_response("Hello")  # Uses mistral-7b
Raises:

ValueError – If model_id is not registered in the manager.

Note

Switching models does not immediately unload the previous active model - it remains in memory subject to LRU eviction policies.

get_response(prompt: str, model_id: str | None = None, **kwargs) → str[source]

Generate text response using specified model or currently active model.

Primary interface for text generation that handles model selection, lazy loading, and response generation. Automatically loads models on-demand if not already in memory and manages the generation pipeline.

Parameters:
  • prompt (str) – Input text prompt for the language model to process.

  • model_id (str, optional) – Specific model identifier to use for generation. If None, uses the currently active model.

  • **kwargs – Additional generation parameters passed to the model:

      • temperature (float): Randomness control (0.0-1.0)

      • max_length (int): Maximum response length

      • top_p (float): Nucleus sampling parameter

      • do_sample (bool): Whether to use sampling

Returns:

Generated text response from the language model.

Return type:

str

Lazy Loading:

Models are loaded automatically if not already in memory, making this method fully self-contained for generation requests.

Example

>>> manager = LLMManager(model_configs)
>>> response = manager.get_response("What is AI?", temperature=0.7)
>>> print(response)
Artificial Intelligence is...
Raises:
  • RuntimeError – If no model is specified and no active model is set.

  • ValueError – If the specified model_id is not registered.

Note

This method automatically updates the active model if a specific model_id is provided, making it the new default for future calls.
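The selection rules described for this method (explicit ID wins and becomes the new default, otherwise fall back to the active model, otherwise fail) can be sketched as follows. ActiveModelSelector is a hypothetical helper written for illustration. It covers only the model-selection step, not loading or generation.

```python
from typing import Optional, Set


class ActiveModelSelector:
    """Hypothetical helper mirroring the documented model-selection rules."""

    def __init__(self, registered: Set[str]) -> None:
        self.registered = set(registered)
        self.active_model: Optional[str] = None

    def resolve(self, model_id: Optional[str] = None) -> str:
        if model_id is None:
            if self.active_model is None:
                raise RuntimeError("No model specified and no active model set")
            return self.active_model
        if model_id not in self.registered:
            raise ValueError(f"Model {model_id!r} is not registered")
        # An explicit model_id becomes the default for later calls.
        self.active_model = model_id
        return model_id


selector = ActiveModelSelector({"model_a", "model_b"})

try:
    selector.resolve()
    first_error = None
except RuntimeError as exc:
    first_error = type(exc).__name__

chosen = selector.resolve("model_b")  # explicit ID, becomes the active model
fallback = selector.resolve()         # no ID, falls back to the active model
```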

unload_model(model_id: str)[source]

Unload a specific model from GPU/CPU memory to free resources.

Removes the specified model from memory while respecting shared model instances. If multiple model aliases reference the same underlying model, the unload operation will only proceed if no other aliases are using it.

Parameters:

model_id (str) – Registered identifier of the model to unload from memory.

Memory Management:
  • Respects shared model instances across multiple aliases

  • Updates LRU tracking to reflect unloaded state

  • Frees GPU/CPU memory allocated to the model

Example

>>> manager.unload_model("llama-7b-chat")
>>> # Model is removed from memory, alias remains registered
>>> manager.load_model("llama-7b-chat")  # Can be reloaded later
Raises:

ValueError – If model_id is not registered in the manager.

Note

Unloading does not remove the model from the registry - it can be reloaded later via load_model(). This is primarily useful for manual memory management in resource-constrained environments.

list_models() → Dict[str, Dict[str, Any]][source]

Retrieve comprehensive information about all registered models.

Provides detailed metadata about all models in the registry including configuration details, loading status, and memory usage information. Useful for debugging, monitoring, and system introspection.

Returns:

Dictionary mapping model IDs to their detailed information, including:
  • model_name (str): HuggingFace model identifier

  • is_loaded (bool): Current memory loading status

  • device (str): Target device specification

  • configuration details from the model wrapper

Return type:

Dict[str, Dict[str, Any]]

Example

>>> models = manager.list_models()
>>> for model_id, info in models.items():
...     print(f"{model_id}: loaded={info.get('is_loaded', False)}")
llama-7b-chat: loaded=True
mistral-7b: loaded=False

Note

This method is primarily useful for monitoring and debugging rather than normal operation. The information reflects the current state and may change as models are loaded and unloaded.

cleanup()[source]

Unload all models and free all allocated resources.

Performs comprehensive cleanup by unloading all models from memory, clearing caches, and resetting internal state. Should be called when the manager is no longer needed to ensure proper resource cleanup.

Cleanup Operations:
  • Unloads all models from GPU/CPU memory

  • Clears LRU tracking and shared model caches

  • Resets active model state

  • Frees all allocated GPU/CPU resources

Example

>>> manager = LLMManager(model_configs)
>>> # ... use models ...
>>> manager.cleanup()  # Free all resources

Note

After cleanup, the manager instance should not be used further. Create a new manager instance if continued operation is needed. This method is particularly important for long-running applications to prevent memory leaks.

class negotiation_platform.core.GameEngine[source]

Bases: object

Factory and registry for negotiation game types with dynamic instantiation capabilities.

The GameEngine manages the complete lifecycle of game type registration and instance creation. It maintains a registry of available game types and provides type-safe instantiation with configuration validation.

Design Pattern:

Implements the Factory pattern combined with a Registry pattern to enable clean separation between game logic and session orchestration. Game types are registered once and can be instantiated multiple times with different configurations.
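A minimal sketch of the Factory-plus-Registry combination is shown below. The BaseGame stand-in, its is_game_over method, CoinFlipGame, and MiniGameEngine are all assumptions made for illustration. Only the method names register_game_type, create_game, and get_available_games, and the registered_games attribute, are taken from this reference.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class BaseGame(ABC):
    """Minimal stand-in for the platform's BaseGame (assumed interface)."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config

    @abstractmethod
    def is_game_over(self) -> bool:
        ...


class CoinFlipGame(BaseGame):
    """Hypothetical custom game used only for this illustration."""

    def is_game_over(self) -> bool:
        return True


class MiniGameEngine:
    """Registry of game classes plus a factory method that instantiates them."""

    def __init__(self) -> None:
        self.registered_games: Dict[str, Type[BaseGame]] = {}

    def register_game_type(self, game_name: str, game_class: Any) -> None:
        # Reject anything that is not a class derived from BaseGame.
        if not (isinstance(game_class, type) and issubclass(game_class, BaseGame)):
            raise ValueError(f"{game_class!r} must inherit from BaseGame")
        self.registered_games[game_name] = game_class

    def create_game(self, game_type: str, config: Dict[str, Any]) -> BaseGame:
        if game_type not in self.registered_games:
            raise ValueError(f"Unknown game type: {game_type!r}")
        # The configuration is handed straight to the game's constructor.
        return self.registered_games[game_type](config)

    def get_available_games(self) -> List[str]:
        return list(self.registered_games)


engine = MiniGameEngine()
engine.register_game_type("coin_flip", CoinFlipGame)
short_game = engine.create_game("coin_flip", {"max_rounds": 3})
long_game = engine.create_game("coin_flip", {"max_rounds": 10})
```

Registering the class once and instantiating it per session keeps game logic out of the orchestration code, which only ever needs a game type name and a configuration dictionary.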

Key Features:
  • Type-safe game registration with BaseGame inheritance validation

  • Configuration-driven game instantiation

  • Built-in registration of default game types

  • Runtime game type discovery and metadata retrieval

  • Extensible architecture for custom game implementations

  • Comprehensive error handling for invalid game types

registered_games

Registry mapping game type names to their corresponding game class implementations.

Type:

Dict[str, Type[BaseGame]]

logger

Logger for game engine events and debugging.

Type:

logging.Logger

Example

>>> engine = GameEngine()
>>> print(engine.get_available_games())
['company_car', 'resource_allocation', 'integrative_negotiations']
>>> game = engine.create_game('company_car', {'max_rounds': 5})
>>> isinstance(game, BaseGame)
True
Raises:

ValueError – If attempting to register an invalid game class or to create an unknown game type.

__init__()[source]

Initialize a new GameEngine instance with default game types.

Creates an empty game registry and automatically registers all built-in negotiation game types. The engine is immediately ready for use after initialization.

Default Game Types Registered:
  • company_car: Bilateral vehicle price negotiation

  • resource_allocation: Multi-resource team distribution

  • integrative_negotiations: Multi-issue collaborative negotiation

Example

>>> engine = GameEngine()
>>> 'company_car' in engine.get_available_games()
True
register_game_type(game_name: str, game_class: Type[BaseGame])[source]

Register a new game type with the engine for future instantiation.

Adds a new game class to the registry, making it available for creation via create_game(). Validates that the provided class properly inherits from BaseGame to ensure interface compliance.

Parameters:
  • game_name (str) – The unique identifier for this game type. Used in create_game() calls and must be unique within the registry.

  • game_class (Type[BaseGame]) – The game class to register. Must inherit from BaseGame and implement all required abstract methods.

Raises:

ValueError – If game_class does not inherit from BaseGame.

Example

>>> from my_games import CustomNegotiationGame
>>> engine = GameEngine()
>>> engine.register_game_type("custom_game", CustomNegotiationGame)
>>> "custom_game" in engine.get_available_games()
True
create_game(game_type: str, config: Dict[str, Any]) → BaseGame[source]

Create a new game instance of the specified type with given configuration.

Instantiates a registered game class with the provided configuration dictionary. The configuration is passed directly to the game’s constructor and should contain all necessary parameters for that specific game type.

Parameters:
  • game_type (str) – The registered name of the game type to create. Must be a key that exists in the registered_games registry.

  • config (Dict[str, Any]) – Configuration dictionary containing game-specific parameters. The exact structure depends on the target game type:

      • company_car: max_rounds, batna_decay_rate, etc.

      • resource_allocation: resource_limits, team_preferences, etc.

      • integrative_negotiations: issues, priorities, etc.

Returns:

A fully initialized game instance ready for play.

Return type:

BaseGame

Raises:
  • ValueError – If game_type is not registered in the engine.

  • TypeError – If config is missing required parameters for the game type.

Example

>>> engine = GameEngine()
>>> config = {"max_rounds": 5, "batna_decay_rate": 0.1}
>>> game = engine.create_game("company_car", config)
>>> game.max_rounds
5
get_available_games() → list[source]

Retrieve list of all registered game type identifiers.

Returns the identifiers of all game types currently registered with the engine, including both built-in games and any custom games that have been added via register_game_type().

Returns:

List of string identifiers for all registered game types.

These identifiers can be used with create_game() to instantiate specific game instances.

Return type:

list

Example

>>> engine = GameEngine()
>>> games = engine.get_available_games()
>>> print(games)
['company_car', 'resource_allocation', 'integrative_negotiations']

Note

The returned list reflects the current state of the registry and will include any custom games registered after initialization.

get_game_info(game_type: str) → Dict[str, Any][source]

Retrieve detailed metadata information about a registered game type.

Provides comprehensive introspection capabilities for registered games, returning class information, documentation, and metadata that can be used for debugging, documentation generation, or dynamic game analysis.

Parameters:

game_type (str) – Identifier of the registered game type to inspect.

Returns:

Dictionary containing comprehensive game metadata:
  • name (str): The registered identifier for the game type

  • class (str): Name of the game class implementation

  • description (str): Class docstring or fallback description

Return type:

Dict[str, Any]

Example

>>> engine = GameEngine()
>>> info = engine.get_game_info("company_car")
>>> print(info["name"])
company_car
>>> print(info["class"])
CompanyCarGame
Raises:

ValueError – If the specified game_type is not registered.

Note

This method is primarily useful for debugging, testing, and documentation generation rather than normal game operation.
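As a rough illustration of how such metadata can be derived from a registered class, the sketch below builds the three documented keys from the class object itself. describe_game, the two demo classes, and the fallback text "No description available" are assumptions. The real fallback wording is not stated in this reference.

```python
from typing import Any, Dict


def describe_game(game_type: str, game_class: type) -> Dict[str, Any]:
    """Build name/class/description metadata for a registered game class."""
    docstring = (game_class.__doc__ or "").strip()
    return {
        "name": game_type,
        "class": game_class.__name__,
        # Use the class docstring when present, otherwise a fallback text.
        "description": docstring or "No description available",
    }


class DocumentedGame:
    """Bilateral price negotiation (demo class)."""


class UndocumentedGame:
    pass


documented = describe_game("documented", DocumentedGame)
undocumented = describe_game("undocumented", UndocumentedGame)
```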

class negotiation_platform.core.MetricsCalculator(config: Dict[str, Any] | None = None)[source]

Bases: object

Centralized calculator for negotiation performance metrics with plug-and-play architecture.

This class serves as the main coordinator for metric computation, providing a standardized interface for calculating comprehensive performance metrics across various dimensions of negotiation analysis. It implements dynamic metric registration and supports both built-in and custom metric implementations.

The calculator automatically converts session manager data formats (game_state and actions_history) into metric-compatible formats (GameResult and PlayerAction objects), enabling seamless integration with the broader negotiation platform architecture.

Key Features:
  • Dynamic metric registration and discovery system

  • Automatic data format conversion for metric compatibility

  • Comprehensive error handling with graceful degradation

  • Extensible architecture supporting custom metric implementations

  • Standardized result formatting and aggregation

  • Performance report generation with summary statistics

Architecture:

The calculator maintains a registry of BaseMetric implementations and dynamically instantiates them during initialization. Each metric implements standardized calculation methods, ensuring consistent computation and result formats.

Data Flow:
  1. Session data (game_state, actions_history) received from SessionManager

  2. Data converted to GameResult and PlayerAction objects for metric compatibility

  3. Each registered metric calculates its specific performance measures

  4. Individual results aggregated into comprehensive metrics dictionary

  5. Optional report generation with summary statistics and analysis
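Step 2 of the data flow, the conversion from session dictionaries to typed objects, might look roughly like the sketch below. The dataclasses here are simplified stand-ins. Only game_id, players, player_id, and action_type appear in this reference, and every other field name (round_number, action_data, agreement_reached, final_utilities, total_rounds) is an assumption, as are the helper names and the default game_id.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PlayerAction:
    player_id: str
    action_type: str
    round_number: int                                          # assumed field
    action_data: Dict[str, Any] = field(default_factory=dict)  # assumed field


@dataclass
class GameResult:
    game_id: str
    players: List[str]
    agreement_reached: bool             # assumed field
    final_utilities: Dict[str, float]   # assumed field
    total_rounds: int                   # assumed field


def to_player_actions(actions_history: List[Dict[str, Any]]) -> List[PlayerAction]:
    """Flatten per-round action dictionaries into one chronological list."""
    flat: List[PlayerAction] = []
    for entry in actions_history:
        for player_id, action in entry["actions"].items():
            payload = {k: v for k, v in action.items() if k != "type"}
            flat.append(PlayerAction(player_id, action["type"], entry["round"], payload))
    return flat


def to_game_result(game_state: Dict[str, Any], game_id: str = "session") -> GameResult:
    """Map the final game state onto a typed result object."""
    return GameResult(
        game_id=game_id,
        players=list(game_state["players"]),
        agreement_reached=bool(game_state.get("agreement_reached", False)),
        final_utilities=dict(game_state.get("final_utilities", {})),
        total_rounds=int(game_state.get("current_round", 0)),
    )


game_state = {
    "players": ["Player1", "Player2"],
    "agreement_reached": True,
    "final_utilities": {"Player1": 85, "Player2": 75},
    "current_round": 3,
}
actions_history = [
    {"round": 1, "actions": {"Player1": {"type": "offer", "value": 100}}},
    {"round": 2, "actions": {"Player2": {"type": "counteroffer", "value": 90}}},
    {"round": 3, "actions": {"Player1": {"type": "accept"}}},
]
result = to_game_result(game_state)
actions = to_player_actions(actions_history)
```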

Supported Operations:
  • Registration: Add/remove metrics dynamically during runtime

  • Calculation: Compute all or specific subsets of registered metrics

  • Reporting: Generate comprehensive performance reports with statistics

  • Introspection: List available metrics and retrieve descriptions

config

Configuration parameters for metric calculations.

Type:

Dict[str, Any]

metrics

Registry of available metric implementations.

Type:

Dict[str, BaseMetric]

Example

>>> # Initialize with default metrics
>>> calculator = MetricsCalculator()
>>>
>>> # Register custom metric
>>> custom_metric = MyCustomMetric()
>>> calculator.register_metric("custom_analysis", custom_metric)
>>>
>>> # Calculate all metrics for a session
>>> results = calculator.calculate_all(game_state, actions_history)
>>>
>>> # Generate comprehensive report
>>> report = calculator.generate_report(game_result, player_actions)
>>> print(f"Success: {report['success']}")
>>> print(f"Average utility: {report['summary_stats']['utility_surplus']['avg']}")
Raises:
  • ValueError – If invalid game state or actions history data is provided.

  • TypeError – If metric registration receives non-BaseMetric implementations.

  • RuntimeError – If critical metric computation failures occur across all metrics.

__init__(config: Dict[str, Any] | None = None)[source]

Initialize MetricsCalculator with configuration and default metrics.

Creates a new metrics calculator instance, sets up the metric registry, and automatically registers default performance metrics for immediate use.

Parameters:

config (Dict[str, Any], optional) – Configuration parameters for metric calculations. Can include thresholds, weights, or calculation preferences. If None, uses empty configuration with default settings.

Example

>>> # Initialize with default configuration
>>> calculator = MetricsCalculator()
>>>
>>> # Initialize with custom configuration
>>> config = {
...     'utility_threshold': 0.8,
...     'risk_tolerance': 0.2,
...     'enable_detailed_logging': True
... }
>>> calculator = MetricsCalculator(config)
register_metric(metric_id: str, metric: BaseMetric)[source]

Register a new metric for dynamic calculation (plug-and-play functionality).

Adds a metric implementation to the calculator’s registry, making it available for computation in all subsequent metric calculations. Supports runtime addition of custom metrics without requiring system restart or reconfiguration.

Parameters:
  • metric_id (str) – Unique identifier for the metric. Used as key in results dictionary and for metric-specific operations. Must be unique across all registered metrics.

  • metric (BaseMetric) – Metric implementation instance that inherits from BaseMetric and implements required calculation methods.

Example

>>> calculator = MetricsCalculator()
>>> custom_metric = MyCustomAnalysisMetric()
>>> calculator.register_metric("custom_analysis", custom_metric)
>>>
>>> # Metric now available in calculations
>>> results = calculator.calculate_all(game_state, actions_history)
>>> custom_result = results["custom_analysis"]
Raises:
  • TypeError – If metric is not an instance of BaseMetric.

  • ValueError – If metric_id is already registered (call unregister_metric() first).

unregister_metric(metric_id: str)[source]

Remove a metric from the calculation registry.

Removes a previously registered metric from the calculator’s registry, preventing it from being included in future metric calculations. Useful for dynamically adjusting analysis scope or removing problematic metrics.

Parameters:

metric_id (str) – Identifier of the metric to remove. Must match the identifier used during registration.

Example

>>> calculator = MetricsCalculator()
>>> calculator.unregister_metric("risk_minimization")
>>>
>>> # risk_minimization no longer included in calculations
>>> results = calculator.calculate_all(game_state, actions_history)
>>> # 'risk_minimization' key will not be present in results

Note

Silently ignores attempts to unregister non-existent metrics. No error is raised if the metric_id is not found in the registry.
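The registration and removal semantics described for register_metric and unregister_metric can be summarised in a short sketch. BaseMetric here is an empty stand-in, and AgreementRateMetric and MiniMetricRegistry are hypothetical. Only the error behaviour follows this reference.

```python
from typing import Dict


class BaseMetric:
    """Minimal stand-in for the platform's BaseMetric (assumed, no methods)."""


class AgreementRateMetric(BaseMetric):
    """Hypothetical custom metric used only for this illustration."""


class MiniMetricRegistry:
    def __init__(self) -> None:
        self.metrics: Dict[str, BaseMetric] = {}

    def register_metric(self, metric_id: str, metric: BaseMetric) -> None:
        if not isinstance(metric, BaseMetric):
            raise TypeError("metric must be a BaseMetric instance")
        if metric_id in self.metrics:
            raise ValueError(f"{metric_id!r} is already registered")
        self.metrics[metric_id] = metric

    def unregister_metric(self, metric_id: str) -> None:
        # pop() with a default never raises, so unknown IDs are ignored silently.
        self.metrics.pop(metric_id, None)


registry = MiniMetricRegistry()
registry.register_metric("agreement_rate", AgreementRateMetric())
registry.unregister_metric("does_not_exist")  # no error
ids_before = list(registry.metrics)
registry.unregister_metric("agreement_rate")
ids_after = list(registry.metrics)
```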

calculate_all(game_state: Dict[str, Any], actions_history: List[Dict[str, Any]]) → Dict[str, Dict[str, float]][source]

Calculate all registered metrics using session manager data formats.

Primary interface for metric calculation that accepts data directly from SessionManager. Automatically converts session data formats into metric-compatible objects and computes all registered metrics in a single operation.

This method serves as the main entry point for comprehensive performance analysis, handling data transformation, metric computation, and result aggregation in a unified workflow.

Parameters:
  • game_state (Dict[str, Any]) – Final game state from completed negotiation session. Expected to contain keys like ‘game_type’, ‘players’, ‘agreement_reached’, ‘final_utilities’, ‘current_round’, and game-specific data.

  • actions_history (List[Dict[str, Any]]) – Chronological log of all actions taken during the negotiation. Each entry should contain ‘round’ and ‘actions’ keys with player actions for that round.

Returns:

Nested dictionary where top-level keys are metric identifiers and values are dictionaries mapping player IDs to their computed metric values.

Return type:

Dict[str, Dict[str, float]]

Example

>>> game_state = {
...     'game_type': 'price_bargaining',
...     'players': ['Player1', 'Player2'],
...     'agreement_reached': True,
...     'final_utilities': {'Player1': 85, 'Player2': 75},
...     'current_round': 3
... }
>>> actions_history = [
...     {'round': 1, 'actions': {'Player1': {'type': 'offer', 'value': 100}}},
...     {'round': 2, 'actions': {'Player2': {'type': 'counteroffer', 'value': 90}}},
...     {'round': 3, 'actions': {'Player1': {'type': 'accept'}}}
... ]
>>> results = calculator.calculate_all(game_state, actions_history)
>>> print(results)
{
    'utility_surplus': {'Player1': 25.0, 'Player2': 15.0},
    'risk_minimization': {'Player1': 0.8, 'Player2': 0.6},
    'deadline_sensitivity': {'Player1': 0.9, 'Player2': 0.7},
    'feasibility': {'Player1': 1.0, 'Player2': 1.0}
}
Raises:
  • ValueError – If game_state or actions_history contain invalid or missing data.

  • RuntimeError – If data conversion or metric computation fails critically.

calculate_all_metrics(game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Dict[str, float]][source]

Calculate all registered metrics using GameResult and PlayerAction objects.

Core computation method that executes all registered metrics against standardized data objects. Provides comprehensive error handling to ensure partial results are returned even if individual metrics fail.

Parameters:
  • game_result (GameResult) – Standardized game outcome data containing final scores, winner information, and game metadata.

  • actions_history (List[PlayerAction]) – Chronological sequence of player actions taken during the negotiation session.

Returns:

Nested dictionary where top-level keys are metric identifiers and values are player-to-score mappings.

Return type:

Dict[str, Dict[str, float]]

Example

>>> game_result = GameResult(game_id="test", players=["P1", "P2"], ...)
>>> actions = [PlayerAction(player_id="P1", action_type="offer", ...)]
>>> results = calculator.calculate_all_metrics(game_result, actions)
>>> print(results["utility_surplus"]["P1"])
25.0

Note

Failed metric calculations are logged and replaced with default values (0.0 for each player) to ensure consistent result structure.
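The fallback behaviour in the note above amounts to wrapping each metric in its own try/except. The sketch below shows the pattern with plain callables standing in for metric objects. calculate_all_safely and both demo metrics are hypothetical, and the real method operates on GameResult and PlayerAction objects rather than a player list.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("metrics_sketch")

MetricFn = Callable[[List[str]], Dict[str, float]]


def calculate_all_safely(metrics: Dict[str, MetricFn],
                         players: List[str]) -> Dict[str, Dict[str, float]]:
    """Run every metric; a failing metric yields 0.0 for each player."""
    results: Dict[str, Dict[str, float]] = {}
    for metric_id, compute in metrics.items():
        try:
            results[metric_id] = compute(players)
        except Exception:
            # Log the failure and keep the result structure consistent.
            logger.exception("Metric %s failed; using default values", metric_id)
            results[metric_id] = {player: 0.0 for player in players}
    return results


def constant_metric(players: List[str]) -> Dict[str, float]:
    return {player: 1.0 for player in players}


def broken_metric(players: List[str]) -> Dict[str, float]:
    raise ZeroDivisionError("simulated failure")


results = calculate_all_safely(
    {"feasibility": constant_metric, "broken": broken_metric},
    ["P1", "P2"],
)
```

Because every metric key is always present, downstream code can index the results without checking which metrics succeeded. The trade-off is that a default of 0.0 cannot be told apart from a genuine zero score without consulting the log.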

calculate_specific_metrics(metric_ids: List[str], game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Dict[str, float]][source]

Calculate only specified metrics for targeted performance analysis.

Computes a subset of registered metrics based on provided identifiers. Useful for focused analysis or when computational resources are limited and only specific metrics are required.

Parameters:
  • metric_ids (List[str]) – List of metric identifiers to calculate. Identifiers that do not match a registered metric are skipped with a warning.

  • game_result (GameResult) – Standardized game outcome data.

  • actions_history (List[PlayerAction]) – Player action sequence.

Returns:

Results dictionary containing only requested metrics. Structure matches calculate_all_metrics output.

Return type:

Dict[str, Dict[str, float]]

Example

>>> # Calculate only utility and risk metrics
>>> specific_results = calculator.calculate_specific_metrics(
...     ["utility_surplus", "risk_minimization"],
...     game_result,
...     actions_history
... )
>>> print(list(specific_results.keys()))
['utility_surplus', 'risk_minimization']

Note

Unregistered metric IDs are logged as warnings and skipped. Failed calculations are replaced with default values.

get_metric_descriptions() → Dict[str, str][source]

Retrieve descriptions of all registered metrics for documentation and analysis.

Provides human-readable descriptions of each registered metric, useful for generating documentation, user interfaces, and analytical reports that explain what each metric measures.

Returns:

Mapping of metric identifiers to their descriptive text. Each description explains what the metric measures and its significance.

Return type:

Dict[str, str]

Example

>>> descriptions = calculator.get_metric_descriptions()
>>> print(descriptions["utility_surplus"])
Measures the utility gained above BATNA baseline for each player
list_metrics() → List[str][source]

List all currently registered metric identifiers.

Returns the identifiers of all metrics currently available for calculation. Useful for introspection, validation, and dynamic metric selection.

Returns:

List of registered metric identifiers that can be used with calculate_specific_metrics or other metric operations.

Return type:

List[str]

Example

>>> available_metrics = calculator.list_metrics()
>>> print(available_metrics)
['utility_surplus', 'risk_minimization', 'deadline_sensitivity', 'feasibility']
generate_report(game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Any][source]

Generate comprehensive performance report with metrics and summary statistics.

Creates a detailed analytical report that includes all metric calculations plus derived summary statistics such as averages, ranges, and totals. Ideal for comprehensive performance analysis and comparative studies.

Parameters:
  • game_result (GameResult) – Final game outcome data.

  • actions_history (List[PlayerAction]) – Complete sequence of player actions.

Returns:

Comprehensive report containing:
  • game_id: Game identifier

  • players: List of participating players

  • success: Whether negotiation was successful

  • total_rounds: Number of negotiation rounds

  • metrics: Full metric calculation results

  • summary_stats: Derived statistics (avg, min, max, total) per metric

Return type:

Dict[str, Any]

Example

>>> report = calculator.generate_report(game_result, actions_history)
>>> print(report["summary_stats"]["utility_surplus"]["avg"])
20.0
>>> print(report["success"])
True
>>> print(f"Game completed in {report['total_rounds']} rounds")
Game completed in 3 rounds

Note

Summary statistics are calculated only for metrics that return numeric values. Non-numeric metrics are included in the metrics section but excluded from summary statistics.
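One way the summary statistics could be derived from the metric → player → value mapping is sketched below. The `summarize` function is hypothetical, and skipping a metric entirely when any of its values is non-numeric is an assumption; the input values are made up for illustration.

```python
from numbers import Real
from typing import Any, Dict


def summarize(metrics: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Compute avg/min/max/total per metric, skipping non-numeric metrics."""
    summary: Dict[str, Dict[str, float]] = {}
    for metric_id, per_player in metrics.items():
        values = list(per_player.values())
        # bool is a subclass of int, so it is excluded explicitly.
        numeric = all(
            isinstance(v, Real) and not isinstance(v, bool) for v in values
        )
        if not values or not numeric:
            continue
        total = float(sum(values))
        summary[metric_id] = {
            "avg": total / len(values),
            "min": float(min(values)),
            "max": float(max(values)),
            "total": total,
        }
    return summary


metrics = {
    "utility_surplus": {"model_a": 25.0, "model_b": 15.0},
    "feasibility": {"model_a": "feasible", "model_b": "feasible"},
}
stats = summarize(metrics)
```

With these inputs, `stats` has an entry for `utility_surplus` only; `feasibility` would still appear in the report's metrics section but not in its summary statistics.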

class negotiation_platform.core.SessionManager(llm_manager: LLMManager, game_engine: GameEngine, metrics_calculator: MetricsCalculator, *, max_turn_retries: int = 3, logger: Logger | None = None)[source]

Bases: object

High-level driver that orchestrates complete negotiation sessions between AI agents.

This class coordinates all aspects of a negotiation session, from game initialization through completion and metrics calculation. It serves as the central orchestrator that manages turn-based interactions, validates actions, handles errors, and computes final performance metrics.

Key Responsibilities:
  • Game bootstrap and initial state creation

  • Turn-based scheduling and player action coordination

  • Action validation using game-specific rules

  • State transition management and action history logging

  • Game termination detection based on stopping criteria

  • Comprehensive metrics calculation via MetricsCalculator

  • Fault tolerance for malformed LLM outputs with retry logic

  • Winner determination and performance analysis

Workflow:
  1. Initialize game instance with specified configuration

  2. Establish initial game state and player assignments

  3. Coordinate alternating turns between players

  4. Validate each action against game rules

  5. Process valid actions and update game state

  6. Log all actions and state transitions

  7. Check termination conditions after each round

  8. Calculate comprehensive metrics upon completion

  9. Return enriched results with performance analysis
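The turn loop in steps 3 to 7 can be sketched roughly as below. This is a simplified stand-in, not the SessionManager's code: `run_turns`, the callback signatures, and the scripted players are hypothetical, and treating `max_turn_retries` as retries on top of one initial attempt is an assumption.

```python
from typing import Callable, Dict, List, Optional


def run_turns(
    players: List[str],
    propose: Callable[[str, int, int], Optional[dict]],
    is_valid: Callable[[dict], bool],
    is_over: Callable[[List[dict]], bool],
    max_rounds: int = 5,
    max_turn_retries: int = 3,
) -> List[dict]:
    history: List[dict] = []
    for round_number in range(1, max_rounds + 1):
        actions: Dict[str, dict] = {}
        for player in players:
            # One initial attempt plus up to max_turn_retries retries (assumed).
            for attempt in range(max_turn_retries + 1):
                action = propose(player, round_number, attempt)
                if action is not None and is_valid(action):
                    actions[player] = action
                    break
            # If every attempt was invalid, the player has no action this round.
        history.append({"round": round_number, "actions": actions})
        # Termination is checked once per round.
        if is_over(history):
            break
    return history


def scripted(player: str, round_number: int, attempt: int) -> Optional[dict]:
    if player == "model_a":
        # The first attempt is malformed to exercise the retry path.
        if attempt == 0:
            return None
        return {"type": "offer", "price": 40000 + round_number}
    if round_number < 2:
        return {"type": "offer", "price": 39000}
    return {"type": "accept"}


history = run_turns(
    ["model_a", "model_b"],
    propose=scripted,
    is_valid=lambda a: a.get("type") in {"offer", "accept"},
    is_over=lambda h: any(a["type"] == "accept" for a in h[-1]["actions"].values()),
)
```

The resulting `history` uses the same `{round, actions}` shape that run_negotiation documents for its action log.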

llm_manager

Manages AI model loading and interaction.

Type:

LLMManager

game_engine

Creates and manages game instances.

Type:

GameEngine

metrics_calculator

Computes performance metrics.

Type:

MetricsCalculator

max_turn_retries

Maximum retry attempts for invalid actions.

Type:

int

logger

Logger for session events and debugging.

Type:

logging.Logger

Example

>>> llm_manager = LLMManager(model_configs)
>>> game_engine = GameEngine()
>>> metrics_calc = MetricsCalculator()
>>> session = SessionManager(llm_manager, game_engine, metrics_calc)
>>> result = session.run_negotiation(
...     game_type="price_bargaining",
...     players=["model_a", "model_b"],
...     game_config={"max_rounds": 5}
... )
>>> print(result['agreement_reached'])
True
Raises:
  • ValueError – If invalid game type or player configuration provided.

  • RuntimeError – If session execution fails due to unrecoverable errors.

__init__(llm_manager: LLMManager, game_engine: GameEngine, metrics_calculator: MetricsCalculator, *, max_turn_retries: int = 3, logger: Logger | None = None) None[source]

Initialize a new SessionManager instance with required components.

Creates a session manager that orchestrates negotiations between AI agents using the provided LLM manager, game engine, and metrics calculator.

Parameters:
  • llm_manager (LLMManager) – Manager for loading and interacting with AI models. Must be configured with the models that will participate in negotiations.

  • game_engine (GameEngine) – Engine for creating and managing game instances. Should have all required game types registered.

  • metrics_calculator (MetricsCalculator) – Calculator for computing performance metrics. Will be used to analyze negotiation outcomes and player performance.

  • max_turn_retries (int, optional) – Maximum number of retry attempts when a player provides an invalid action. Defaults to 3. Higher values increase robustness but may slow down sessions with consistently invalid players.

  • logger (logging.Logger, optional) – Logger for session events and debugging. If None, creates a new logger using the class name.

Example

>>> llm_manager = LLMManager({"model_a": model_config})
>>> game_engine = GameEngine()
>>> metrics_calc = MetricsCalculator()
>>> session = SessionManager(
...     llm_manager=llm_manager,
...     game_engine=game_engine,
...     metrics_calculator=metrics_calc,
...     max_turn_retries=5  # Allow more retries for unstable models
... )
run_negotiation(*, game_type: str, players: List[str], game_config: Dict[str, Any] | None = None, session_id: str | None = None, seed_messages: Dict[str, str] | None = None) Dict[str, Any][source]

Execute a full negotiation session and return the enriched game result that also contains computed metrics and a complete action log.

Parameters:
  • game_type (str) – Registered key of the game inside GameEngine (e.g. "company_car").

  • players (List[str]) – Ordered list of model names (length 2 for bilateral games).

  • game_config (Dict[str, Any], optional) – Per-game configuration dictionary. If None, defaults are used.

  • session_id (str, optional) – External session identifier. Autogenerated if omitted.

  • seed_messages (Dict[str, str], optional) – Mapping of {player_name: system_prompt} used to prime player behaviour.

Returns:

Result dictionary containing:
  • The raw game_state at termination

  • "actions_history": Chronological list of {round, actions} dicts

  • "metrics": Dict[str, Dict[str, float]] (metric → player → value)

  • "session_metadata": Miscellaneous run information (IDs, timestamps, etc.)

Return type:

Dict[str, Any]
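Because the returned metrics are keyed metric → player → value, per-player comparisons need a pivot. The sketch below shows one way to do that; `metrics_by_player` is a hypothetical helper, and the `result` dictionary is made-up data that only mimics the documented keys.

```python
from typing import Dict


def metrics_by_player(
    metrics: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Pivot metric -> player -> value into player -> metric -> value."""
    pivoted: Dict[str, Dict[str, float]] = {}
    for metric_id, per_player in metrics.items():
        for player, value in per_player.items():
            pivoted.setdefault(player, {})[metric_id] = value
    return pivoted


# Illustrative result only; real sessions produce these values.
result = {
    "actions_history": [{"round": 1, "actions": {}}],
    "metrics": {
        "utility_surplus": {"model_a": 25.0, "model_b": 15.0},
        "risk_minimization": {"model_a": 0.4, "model_b": 0.6},
    },
    "session_metadata": {"session_id": "demo"},
}
per_player = metrics_by_player(result["metrics"])
```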

class negotiation_platform.core.ConfigManager(config_dir: str = 'configs')[source]

Bases: object

Centralized configuration management system for the negotiation platform.

The ConfigManager handles all aspects of configuration loading, parsing, and access for the negotiation platform. It provides a unified interface for accessing model configurations, game settings, and platform parameters from YAML files.

Architecture:

Uses a hierarchical configuration system with separate files for different concerns (models, games, platform). All configurations are loaded at initialization and cached for efficient access.

Key Features:
  • Automatic discovery and loading of YAML configuration files

  • Type-safe configuration access with proper error handling

  • Centralized configuration validation and defaults

  • Efficient caching of parsed configurations

  • Graceful handling of missing or invalid configuration files

  • Hierarchical organization supporting different configuration scopes

config_dir

Directory path containing configuration files.

Type:

Path

_configs

Cache of loaded configurations organized by configuration type (model_configs, game_configs, etc.).

Type:

Dict[str, Dict[str, Any]]

Configuration Files:
  • model_configs.yaml: AI model definitions, API keys, and parameters

  • game_configs.yaml: Game-specific rules, defaults, and constraints

  • platform_config.yaml: Global platform settings and behaviors

Example

>>> config_manager = ConfigManager()
>>> model_config = config_manager.get_model_config("model_a")
>>> print(model_config['model_name'])
meta-llama/Llama-2-7b-chat-hf
>>> game_config = config_manager.get_game_config("company_car")
>>> print(game_config['max_rounds'])
5
Raises:
  • FileNotFoundError – If configuration directory doesn’t exist.

  • yaml.YAMLError – If configuration files contain invalid YAML syntax.

__init__(config_dir: str = 'configs')[source]

Initialize the ConfigManager with automatic configuration loading.

Creates a new ConfigManager instance and immediately loads all available configuration files from the specified directory. The directory path is resolved relative to the negotiation_platform package root.

Parameters:

config_dir (str, optional) – Name of the configuration directory relative to the negotiation_platform root. Defaults to “configs”.

Configuration Loading Order:
  1. model_configs.yaml - AI model configurations

  2. game_configs.yaml - Game-specific settings

  3. platform_config.yaml - Global platform settings

Example

>>> # Use default config directory
>>> config_manager = ConfigManager()
>>> # Use custom config directory
>>> config_manager = ConfigManager("custom_configs")

Note

Missing configuration files are handled gracefully - the manager will continue loading other files and provide empty dictionaries for missing configurations rather than failing completely.

load_all_configs()[source]

Load all platform configuration files into memory for efficient access.

Discovers and loads all recognized configuration files from the config directory. Each file is parsed as YAML and stored in the internal configuration cache with the filename (minus extension) as the key.

Configuration Files Loaded:
  • model_configs.yaml: AI model definitions and API configurations

  • game_configs.yaml: Game-specific rules and default parameters

  • platform_config.yaml: Global platform settings and behaviors

Error Handling:

Missing files are skipped without raising exceptions. Invalid YAML files are logged as errors but do not prevent other files from loading.

Example

>>> config_manager = ConfigManager()
>>> config_manager.load_all_configs()  # Called automatically in __init__
>>> print(list(config_manager._configs.keys()))
['model_configs', 'game_configs', 'platform_config']

Note

This method is called automatically during initialization and typically doesn’t need to be called directly unless reloading configurations.
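The error-tolerant loading described above, with the filename stem as the cache key, might look like the sketch below. To stay free of third-party dependencies it parses JSON instead of YAML, so the file names end in `.json`; that substitution, the helper names, and the choice to store an empty dictionary for a missing file are assumptions for illustration only.

```python
import json
import logging
import tempfile
from pathlib import Path
from typing import Any, Dict

logger = logging.getLogger("config_sketch")

# Mirrors the documented file names, with .json standing in for .yaml.
CONFIG_FILES = ("model_configs.json", "game_configs.json", "platform_config.json")


def load_config(config_path: Path) -> Dict[str, Any]:
    try:
        text = config_path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return {}  # missing file: empty dictionary, no error
    except OSError as exc:
        logger.error("Cannot read %s: %s", config_path, exc)
        return {}
    if not text.strip():
        return {}  # empty file
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        logger.error("Cannot parse %s: %s", config_path, exc)
        return {}
    return data if isinstance(data, dict) else {}


def load_all_configs(config_dir: Path) -> Dict[str, Dict[str, Any]]:
    # The cache key is the filename without its extension.
    return {Path(name).stem: load_config(config_dir / name) for name in CONFIG_FILES}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "model_configs.json").write_text('{"model_a": {"device": "cuda:0"}}')
    (root / "game_configs.json").write_text("{not valid")
    # platform_config.json is deliberately absent.
    configs = load_all_configs(root)
```

The invalid and the missing file both end up as empty dictionaries, while the valid file loads normally.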

load_config(config_path: Path) Dict[str, Any][source]

Load and parse a YAML configuration file from the specified path.

Reads a YAML configuration file and parses it into a Python dictionary. Handles file reading errors gracefully by returning empty dictionaries rather than crashing the configuration system.

Parameters:

config_path (Path) – Path object pointing to the YAML configuration file to be loaded and parsed.

Returns:

Parsed configuration dictionary containing all key-value pairs from the YAML file. Returns empty dictionary if file is missing, empty, or contains parsing errors.

Return type:

Dict[str, Any]

Error Handling:
  • Missing files: Returns empty dictionary

  • Invalid YAML syntax: Logs error and returns empty dictionary

  • Permission errors: Logs error and returns empty dictionary

Example

>>> from pathlib import Path
>>> manager = ConfigManager()
>>> config_path = Path("configs/model_configs.yaml")
>>> config = manager.load_config(config_path)
>>> print(config.keys())
dict_keys(['model_a', 'model_b', 'model_c'])

Note

This method is used internally by load_all_configs() and typically doesn’t need to be called directly by users.

get_config(config_type: str) Dict[str, Any][source]

Retrieve complete configuration dictionary for a specific configuration type.

Provides access to loaded configuration data by configuration type name. Returns the complete configuration dictionary for the specified type, enabling direct access to all settings within that configuration category.

Parameters:

config_type (str) – Type identifier for the configuration to retrieve. Common types include ‘model_configs’, ‘game_configs’, and ‘platform_config’ corresponding to their respective YAML files.

Returns:

Complete configuration dictionary for the specified type. Returns empty dictionary if the configuration type is not found or failed to load.

Return type:

Dict[str, Any]

Example

>>> manager = ConfigManager()
>>> models_config = manager.get_config('model_configs')
>>> print(list(models_config.keys()))
['model_a', 'model_b', 'model_c']
>>> platform_config = manager.get_config('platform_config')
>>> print(platform_config.get('timeout_seconds', 30))
30

Note

For model-specific or game-specific configurations, use the more specialized get_model_config() or get_game_config() methods instead.

get_model_config(model_name: str) Dict[str, Any] | None[source]

Retrieve configuration for a specific AI model by name.

Looks up the configuration for the specified model from the loaded model_configs.yaml file. Returns the complete configuration dictionary for that model including API keys, model parameters, and loading settings.

Parameters:

model_name (str) – The registered name/identifier of the model to retrieve. Must match a key in the model_configs.yaml file.

Returns:

The model’s configuration dictionary if found, or None if the model name doesn’t exist in the configuration.

Return type:

Optional[Dict[str, Any]]

Configuration Contents:
Typical model configuration includes:
  • model_name: HuggingFace model identifier

  • type: Model wrapper type (usually ‘huggingface’)

  • device: GPU/CPU device specification

  • generation_config: Model generation parameters

  • api_key: Authentication token (using environment variables)
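How the api_key entry picks up its value from the environment is not specified here. As one possible approach, the sketch below assumes the configuration stores a `${VAR_NAME}` placeholder and substitutes it at load time; the placeholder syntax, the `resolve_env` helper, and the `DEMO_TOKEN` variable are all hypothetical.

```python
import os
import re
from typing import Any, Dict, Mapping, Optional

# Assumed placeholder syntax: ${VAR_NAME}
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(
    config: Dict[str, Any], env: Optional[Mapping[str, str]] = None
) -> Dict[str, Any]:
    """Return a copy of config with ${VAR} placeholders replaced from env."""
    source = os.environ if env is None else env

    def _sub(value: Any) -> Any:
        if isinstance(value, str):
            # Unset variables resolve to an empty string in this sketch.
            return _PLACEHOLDER.sub(lambda m: source.get(m.group(1), ""), value)
        if isinstance(value, dict):
            return {k: _sub(v) for k, v in value.items()}
        return value

    return _sub(config)


raw = {
    "model_name": "meta-llama/Llama-2-7b-chat-hf",
    "api_key": "${DEMO_TOKEN}",
    "generation_config": {"temperature": 0.7},
}
resolved = resolve_env(raw, env={"DEMO_TOKEN": "secret"})
```

Passing `env` explicitly keeps the example deterministic; omitting it reads from `os.environ`.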

Example

>>> config_manager = ConfigManager()
>>> config = config_manager.get_model_config("model_a")
>>> print(config['model_name'])
meta-llama/Llama-2-7b-chat-hf
>>> print(config['device'])
cuda:0

get_game_config(game_name: str) Dict[str, Any] | None[source]

Retrieve configuration for a specific negotiation game by name.

Looks up the configuration for the specified game from the loaded game_configs.yaml file. Returns the complete configuration dictionary containing game-specific rules, parameters, and default settings.

Parameters:

game_name (str) – The registered name/identifier of the game to retrieve. Must match a key in the game_configs.yaml file. Common values include “company_car”, “resource_allocation”, “integrative_negotiations”.

Returns:

The game’s configuration dictionary if found, or None if the game name doesn’t exist in the configuration.

Return type:

Optional[Dict[str, Any]]

Configuration Contents:
Typical game configuration includes:
  • max_rounds: Maximum number of negotiation rounds

  • batna_values: Best Alternative to Negotiated Agreement values

  • resource_limits: Available resources for allocation games

  • scoring_weights: Point values for different outcomes

  • time_limits: Round or session time constraints

Example

>>> config_manager = ConfigManager()
>>> config = config_manager.get_game_config("company_car")
>>> print(config['max_rounds'])
5
>>> print(config['buyer_batna'])
41000

Submodules

Base Metric Module

Base Metric Interface

Defines the abstract interface for all negotiation performance metrics within the platform. This interface enables plug-and-play metric integration with consistent API and type safety across all metric implementations.

Key Features:
  • Abstract base class ensuring consistent metric API

  • Type-safe metric calculation with standardized inputs

  • Configurable metrics with parameter support

  • Self-documenting metrics with description methods

  • Plug-and-play architecture for custom metric development

Architecture:

All metrics inherit from BaseMetric and implement the abstract methods. This ensures consistent behavior and enables dynamic metric registration and calculation within the MetricsCalculator system.

Metric Types:

The platform supports various metric categories:
  • Utility Metrics: Measure player value extraction and satisfaction

  • Efficiency Metrics: Analyze negotiation process effectiveness

  • Strategic Metrics: Evaluate tactical decision-making quality

  • Behavioral Metrics: Assess communication and interaction patterns

class negotiation_platform.core.base_metric.BaseMetric(metric_name: str, config: Dict[str, Any] | None = None)[source]

Bases: ABC

Abstract base class defining the interface for all negotiation performance metrics.

The BaseMetric class provides the foundation for all metric implementations within the negotiation platform. It defines a consistent API that enables plug-and-play metric integration and ensures type safety across the metrics system.

Design Pattern:

Implements the Template Method pattern where concrete metrics provide specific calculation logic while inheriting common infrastructure and interface methods. This enables consistent metric behavior and easy integration with the MetricsCalculator system.

Key Responsibilities:
  • Define abstract interface for metric calculation

  • Provide common metric infrastructure (naming, configuration)

  • Ensure type safety with standardized input/output types

  • Enable self-documentation through description methods

  • Support configurable metrics with parameter dictionaries

metric_name

Human-readable name for this metric.

Type:

str

config

Configuration parameters for metric behavior.

Type:

Dict[str, Any]

Implementation Requirements:

Subclasses must implement:
  • calculate(): Core metric computation logic

  • get_description(): Human-readable metric explanation

Example

>>> class CustomMetric(BaseMetric):
...     def calculate(self, game_result, actions_history):
...         return {player: 1.0 for player in game_result.players}
...     def get_description(self):
...         return "Example custom metric"
>>> metric = CustomMetric("custom", {"param": "value"})
>>> result = metric.calculate(game_result, actions_history)

Note

This is an abstract class and cannot be instantiated directly. Use concrete implementations like UtilitySurplusMetric, FeasibilityMetric, etc.
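The instantiation rule follows from Python's `abc` machinery. The sketch below uses a stand-in `BaseMetric` that copies only the documented constructor and method names (its body is an assumption) to show that both the base class and an incomplete subclass raise `TypeError` when instantiated.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class BaseMetric(ABC):
    """Stand-in with the documented constructor and method names."""

    def __init__(self, metric_name: str, config: Optional[Dict[str, Any]] = None):
        self.metric_name = metric_name
        self.config = config or {}

    @abstractmethod
    def calculate(self, game_result: Any, actions_history: List[Any]) -> Dict[str, float]:
        ...

    @abstractmethod
    def get_description(self) -> str:
        ...

    def get_name(self) -> str:
        return self.metric_name


class IncompleteMetric(BaseMetric):
    # get_description() is missing, so this class is still abstract.
    def calculate(self, game_result, actions_history):
        return {}


errors = []
for cls in (BaseMetric, IncompleteMetric):
    try:
        cls("demo")
    except TypeError as exc:
        errors.append(type(exc).__name__)
```

A subclass becomes instantiable only once it overrides every abstract method.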

__init__(metric_name: str, config: Dict[str, Any] | None = None)[source]

Initialize a new metric instance with name and configuration.

Sets up the basic infrastructure for a metric implementation including the human-readable name and any configuration parameters needed for metric calculation.

Parameters:
  • metric_name (str) – Human-readable name for this metric. Used in reports, logging, and metric identification. Should be descriptive and unique.

  • config (Dict[str, Any], optional) –

    Configuration dictionary containing metric-specific parameters. Common parameters include:

    • weights: Importance weights for different components

    • thresholds: Cutoff values for categorical metrics

    • normalization: Scaling parameters for metric values

    • enabled_features: Feature flags for metric behavior

    Defaults to empty dictionary if not provided.

Example

>>> config = {"weight": 0.7, "threshold": 0.5}
>>> metric = CustomMetric("utility_surplus", config)
>>> print(metric.metric_name)
utility_surplus
>>> print(metric.config['weight'])
0.7

Note

Configuration parameters are metric-specific and should be documented in each concrete metric implementation.

abstract calculate(game_result: GameResult, actions_history: List[PlayerAction]) Dict[str, float][source]

Calculate the metric value for each player based on game outcome and actions.

This is the core method that concrete metrics must implement. It analyzes the final game state and complete action history to compute metric values for all players who participated in the negotiation.

Parameters:
  • game_result (GameResult) –

    Complete game outcome containing:

    • game_id: Unique identifier for the game session

    • players: List of all player identifiers

    • winner: Winning player(s) if applicable

    • final_scores: Player scores from game-specific scoring

    • total_rounds: Number of rounds played

    • game_data: Complete final game state

    • success: Whether the game concluded successfully

  • actions_history (List[PlayerAction]) –

    Chronological list of all actions taken during the negotiation. Each PlayerAction contains:

    • player_id: Which player took the action

    • action_type: Type of action (proposal, acceptance, etc.)

    • action_data: Action-specific data and parameters

    • timestamp: When the action was taken

    • round_number: Which round the action occurred in

Returns:

Dictionary mapping each player ID to their metric value. All players from game_result.players must be included in the result. Values should be normalized to a consistent scale when possible.

Return type:

Dict[str, float]

Implementation Guidelines:
  • Handle edge cases gracefully (no actions, early termination, etc.)

  • Return consistent value ranges for comparison across games

  • Use configuration parameters to customize calculation behavior

  • Provide meaningful values even for unsuccessful negotiations

  • Consider both final outcomes and process quality in calculations

Example

>>> def calculate(self, game_result, actions_history):
...     values = {}
...     for player in game_result.players:
...         # Calculate player-specific metric value
...         values[player] = self._compute_player_value(player, game_result)
...     return values
Raises:

NotImplementedError – If called on the abstract base class.
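To make the implementation guidelines concrete, the sketch below computes a made-up "proposal share" metric. The `GameResult` and `PlayerAction` dataclasses are simplified stand-ins that reuse a few of the documented field names, and `proposal_share` is a hypothetical metric, not one shipped with the platform.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GameResult:
    # Simplified stand-in; only some documented fields are modelled.
    game_id: str
    players: List[str]
    success: bool = True
    total_rounds: int = 0


@dataclass
class PlayerAction:
    # Simplified stand-in; timestamp is omitted for brevity.
    player_id: str
    action_type: str
    round_number: int
    action_data: Dict[str, Any] = field(default_factory=dict)


def proposal_share(
    game_result: GameResult, actions_history: List[PlayerAction]
) -> Dict[str, float]:
    """Fraction of all proposals made by each player, in the range 0.0 to 1.0."""
    counts = {player: 0 for player in game_result.players}
    for action in actions_history:
        if action.action_type == "proposal" and action.player_id in counts:
            counts[action.player_id] += 1
    total = sum(counts.values())
    if total == 0:
        # Edge case: no proposals at all; every player still gets a value.
        return {player: 0.0 for player in counts}
    return {player: n / total for player, n in counts.items()}


game = GameResult(game_id="demo", players=["model_a", "model_b"], total_rounds=2)
actions = [
    PlayerAction("model_a", "proposal", 1),
    PlayerAction("model_b", "proposal", 1),
    PlayerAction("model_a", "proposal", 2),
    PlayerAction("model_b", "acceptance", 2),
]
shares = proposal_share(game, actions)
empty = proposal_share(game, [])
```

Every player in `game.players` appears in the result, including when the action history is empty.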

abstract get_description() str[source]

Provide a human-readable description of what this metric measures.

Returns a clear, concise explanation of the metric’s purpose, calculation method, and interpretation. This description is used in reports, documentation, and user interfaces to help users understand metric meanings.

Returns:

Human-readable description explaining:
  • What aspect of negotiation performance the metric measures

  • How the metric is calculated (high-level approach)

  • How to interpret the metric values (higher/lower is better)

  • Any important limitations or assumptions

Return type:

str

Description Guidelines:
  • Use clear, non-technical language when possible

  • Explain the business/strategic meaning, not just the calculation

  • Mention the scale and interpretation of values

  • Include any important caveats or limitations

  • Keep descriptions concise but informative (1-3 sentences)

Example

>>> def get_description(self):
...     return ("Measures the utility surplus each player achieved "
...             "compared to their BATNA (Best Alternative to Negotiated Agreement). "
...             "Higher values indicate better negotiation outcomes for the player.")
Common Metric Categories:
  • Utility Metrics: “Measures player value extraction and satisfaction”

  • Efficiency Metrics: “Evaluates negotiation process effectiveness”

  • Strategic Metrics: “Assesses tactical decision-making quality”

  • Behavioral Metrics: “Analyzes communication and interaction patterns”

Raises:

NotImplementedError – If called on the abstract base class.

get_name() str[source]

Return the human-readable name of this metric.

Provides access to the metric’s display name as specified during initialization. This name is used for identification in reports, logging, metric registration, and user interfaces.

Returns:

The human-readable name of this metric as provided during initialization.

Return type:

str

Example

>>> metric = CustomMetric("utility_surplus", {})
>>> print(metric.get_name())
utility_surplus

Note

This is a simple accessor method that returns the metric_name attribute. The name is set during initialization and remains constant throughout the metric’s lifecycle.

Configuration Manager Module

Configuration Manager

The ConfigManager provides centralized configuration management for the entire negotiation platform. It handles loading, parsing, and accessing YAML configuration files for models, games, and platform settings.

Key Features:
  • Centralized YAML configuration loading and parsing

  • Type-safe configuration access with fallback defaults

  • Hierarchical configuration organization (models, games, platform)

  • Automatic configuration file discovery and loading

  • Error-tolerant loading with graceful fallbacks

Configuration Structure:
  • model_configs.yaml: AI model definitions and parameters

  • game_configs.yaml: Game-specific settings and rules

  • platform_config.yaml: Global platform settings and defaults

Usage Pattern:

The ConfigManager follows an eager-loading pattern: all configuration files are loaded during initialization, then cached for efficient access throughout the application lifecycle.

class negotiation_platform.core.config_manager.ConfigManager(config_dir: str = 'configs')[source]

Bases: object

Centralized configuration management system for the negotiation platform.

The ConfigManager handles all aspects of configuration loading, parsing, and access for the negotiation platform. It provides a unified interface for accessing model configurations, game settings, and platform parameters from YAML files.

Architecture:

Uses a hierarchical configuration system with separate files for different concerns (models, games, platform). All configurations are loaded at initialization and cached for efficient access.

Key Features:
  • Automatic discovery and loading of YAML configuration files

  • Type-safe configuration access with proper error handling

  • Centralized configuration validation and defaults

  • Efficient caching of parsed configurations

  • Graceful handling of missing or invalid configuration files

  • Hierarchical organization supporting different configuration scopes

config_dir

Directory path containing configuration files.

Type:

Path

_configs

Cache of loaded configurations organized by configuration type (model_configs, game_configs, etc.).

Type:

Dict[str, Dict[str, Any]]

Configuration Files:
  • model_configs.yaml: AI model definitions, API keys, and parameters

  • game_configs.yaml: Game-specific rules, defaults, and constraints

  • platform_config.yaml: Global platform settings and behaviors

Example

>>> config_manager = ConfigManager()
>>> model_config = config_manager.get_model_config("model_a")
>>> print(model_config['model_name'])
meta-llama/Llama-2-7b-chat-hf
>>> game_config = config_manager.get_game_config("company_car")
>>> print(game_config['max_rounds'])
5
Raises:
  • FileNotFoundError – If configuration directory doesn’t exist.

  • yaml.YAMLError – If configuration files contain invalid YAML syntax.

__init__(config_dir: str = 'configs')[source]

Initialize the ConfigManager with automatic configuration loading.

Creates a new ConfigManager instance and immediately loads all available configuration files from the specified directory. The directory path is resolved relative to the negotiation_platform package root.

Parameters:

config_dir (str, optional) – Name of the configuration directory relative to the negotiation_platform root. Defaults to “configs”.

Configuration Loading Order:
  1. model_configs.yaml - AI model configurations

  2. game_configs.yaml - Game-specific settings

  3. platform_config.yaml - Global platform settings

Example

>>> # Use default config directory
>>> config_manager = ConfigManager()
>>> # Use custom config directory
>>> config_manager = ConfigManager("custom_configs")

Note

Missing configuration files are handled gracefully - the manager will continue loading other files and provide empty dictionaries for missing configurations rather than failing completely.

load_all_configs()[source]

Load all platform configuration files into memory for efficient access.

Discovers and loads all recognized configuration files from the config directory. Each file is parsed as YAML and stored in the internal configuration cache with the filename (minus extension) as the key.

Configuration Files Loaded:
  • model_configs.yaml: AI model definitions and API configurations

  • game_configs.yaml: Game-specific rules and default parameters

  • platform_config.yaml: Global platform settings and behaviors

Error Handling:

Missing files are skipped without raising exceptions. Invalid YAML files are logged as errors but do not prevent other files from loading.

Example

>>> config_manager = ConfigManager()
>>> config_manager.load_all_configs()  # Called automatically in __init__
>>> print(list(config_manager._configs.keys()))
['model_configs', 'game_configs', 'platform_config']

Note

This method is called automatically during initialization and typically doesn’t need to be called directly unless reloading configurations.

load_config(config_path: Path) Dict[str, Any][source]

Load and parse a YAML configuration file from the specified path.

Reads a YAML configuration file and parses it into a Python dictionary. Handles file reading errors gracefully by returning empty dictionaries rather than crashing the configuration system.

Parameters:

config_path (Path) – Path object pointing to the YAML configuration file to be loaded and parsed.

Returns:

Parsed configuration dictionary containing all key-value pairs from the YAML file. Returns empty dictionary if file is missing, empty, or contains parsing errors.

Return type:

Dict[str, Any]

Error Handling:
  • Missing files: Returns empty dictionary

  • Invalid YAML syntax: Logs error and returns empty dictionary

  • Permission errors: Logs error and returns empty dictionary

Example

>>> from pathlib import Path
>>> manager = ConfigManager()
>>> config_path = Path("configs/model_configs.yaml")
>>> config = manager.load_config(config_path)
>>> print(config.keys())
dict_keys(['model_a', 'model_b', 'model_c'])

Note

This method is used internally by load_all_configs() and typically doesn’t need to be called directly by users.

get_config(config_type: str) Dict[str, Any][source]

Retrieve complete configuration dictionary for a specific configuration type.

Provides access to loaded configuration data by configuration type name. Returns the complete configuration dictionary for the specified type, enabling direct access to all settings within that configuration category.

Parameters:

config_type (str) – Type identifier for the configuration to retrieve. Common types include ‘model_configs’, ‘game_configs’, and ‘platform_config’ corresponding to their respective YAML files.

Returns:

Complete configuration dictionary for the specified type. Returns empty dictionary if the configuration type is not found or failed to load.

Return type:

Dict[str, Any]

Example

>>> manager = ConfigManager()
>>> models_config = manager.get_config('model_configs')
>>> print(list(models_config.keys()))
['model_a', 'model_b', 'model_c']
>>> platform_config = manager.get_config('platform_config')
>>> print(platform_config.get('timeout_seconds', 30))
30

Note

For model-specific or game-specific configurations, use the more specialized get_model_config() or get_game_config() methods instead.

get_model_config(model_name: str) Dict[str, Any] | None[source]

Retrieve configuration for a specific AI model by name.

Looks up the configuration for the specified model from the loaded model_configs.yaml file. Returns the complete configuration dictionary for that model including API keys, model parameters, and loading settings.

Parameters:

model_name (str) – The registered name/identifier of the model to retrieve. Must match a key in the model_configs.yaml file.

Returns:

The model’s configuration dictionary if found, or None if the model name doesn’t exist in the configuration.

Return type:

Optional[Dict[str, Any]]

Configuration Contents:
Typical model configuration includes:
  • model_name: HuggingFace model identifier

  • type: Model wrapper type (usually ‘huggingface’)

  • device: GPU/CPU device specification

  • generation_config: Model generation parameters

  • api_key: Authentication token (using environment variables)

Example

>>> config_manager = ConfigManager()
>>> config = config_manager.get_model_config("model_a")
>>> print(config['model_name'])
meta-llama/Llama-2-7b-chat-hf
>>> print(config['device'])
cuda:0

get_game_config(game_name: str) Dict[str, Any] | None[source]

Retrieve configuration for a specific negotiation game by name.

Looks up the configuration for the specified game from the loaded game_configs.yaml file. Returns the complete configuration dictionary containing game-specific rules, parameters, and default settings.

Parameters:

game_name (str) – The registered name/identifier of the game to retrieve. Must match a key in the game_configs.yaml file. Common values include “company_car”, “resource_allocation”, “integrative_negotiations”.

Returns:

The game’s configuration dictionary if found; None if the game name doesn’t exist in the configuration.

Return type:

Optional[Dict[str, Any]]

Configuration Contents:
Typical game configuration includes:
  • max_rounds: Maximum number of negotiation rounds

  • batna_values: Best Alternative to Negotiated Agreement values

  • resource_limits: Available resources for allocation games

  • scoring_weights: Point values for different outcomes

  • time_limits: Round or session time constraints

Example

>>> config_manager = ConfigManager()
>>> config = config_manager.get_game_config("company_car")
>>> print(config['max_rounds'])
5
>>> print(config['buyer_batna'])
41000
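Because get_model_config() and get_game_config() return None for unknown names, callers may prefer to fail early rather than pass None along. The following is a minimal sketch of that idea, not part of the platform: require_config is a hypothetical helper, and StubConfigManager is a stand-in used only so the snippet runs without YAML files.

```python
from typing import Any, Callable, Dict, Optional


def require_config(
    lookup: Callable[[str], Optional[Dict[str, Any]]], name: str
) -> Dict[str, Any]:
    """Return lookup(name), raising KeyError instead of passing None along.

    Hypothetical helper; it is not provided by ConfigManager.
    """
    config = lookup(name)
    if config is None:
        raise KeyError(f"no configuration registered for {name!r}")
    return config


class StubConfigManager:
    """Stand-in for ConfigManager so the sketch runs without YAML files."""

    def __init__(self, games: Dict[str, Dict[str, Any]]) -> None:
        self._games = games

    def get_game_config(self, game_name: str) -> Optional[Dict[str, Any]]:
        # Mirrors the documented contract: None for unknown names.
        return self._games.get(game_name)


stub = StubConfigManager({"company_car": {"max_rounds": 5}})
car_config = require_config(stub.get_game_config, "company_car")
print(car_config["max_rounds"])
```

The same helper works for get_model_config(), since both methods share the Optional[Dict[str, Any]] return type.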

Game Engine Module

Game Engine

The GameEngine serves as a factory and registry for all negotiation game types. It manages game registration, instantiation, and provides discovery capabilities for available game types within the platform.

Key Features:
  • Dynamic game type registration and discovery

  • Type-safe game instantiation with configuration validation

  • Built-in registry of default negotiation games

  • Extensible architecture for custom game implementations

  • Comprehensive game metadata and information retrieval

Supported Game Types:
  • company_car: Bilateral price negotiation for vehicle purchases

  • resource_allocation: Multi-resource distribution between teams

  • integrative_negotiations: Multi-issue collaborative negotiations

Architecture:

The GameEngine follows the Factory pattern, maintaining a registry of game classes mapped to string identifiers. This allows for clean separation between game logic and session management while enabling runtime game type selection and dynamic extension.

class negotiation_platform.core.game_engine.GameEngine

Bases: object

Factory and registry for negotiation game types with dynamic instantiation capabilities.

The GameEngine manages the complete lifecycle of game type registration and instance creation. It maintains a registry of available game types and provides type-safe instantiation with configuration validation.

Design Pattern:

Implements the Factory pattern combined with a Registry pattern to enable clean separation between game logic and session orchestration. Game types are registered once and can be instantiated multiple times with different configurations.
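The combination of a registry and a factory can be illustrated with a small, self-contained sketch. It is not the GameEngine source: MiniEngine, CoinFlipGame, and the abstract method is_finished are hypothetical names, and the BaseGame shown here is a stand-in whose real interface may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class BaseGame(ABC):
    """Stand-in for the platform's BaseGame; the real interface may differ."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config

    @abstractmethod
    def is_finished(self) -> bool:  # hypothetical abstract method
        ...


class MiniEngine:
    """Illustrative registry/factory, not the actual GameEngine code."""

    def __init__(self) -> None:
        self.registered_games: Dict[str, Type[BaseGame]] = {}

    def register_game_type(self, game_name: str, game_class: Type[BaseGame]) -> None:
        # Reject anything that is not a BaseGame subclass.
        if not (isinstance(game_class, type) and issubclass(game_class, BaseGame)):
            raise ValueError(f"{game_class!r} must inherit from BaseGame")
        self.registered_games[game_name] = game_class

    def create_game(self, game_type: str, config: Dict[str, Any]) -> BaseGame:
        if game_type not in self.registered_games:
            raise ValueError(f"Unknown game type: {game_type}")
        # The class is registered once; each call builds a fresh instance.
        return self.registered_games[game_type](config)

    def get_available_games(self) -> List[str]:
        return list(self.registered_games)


class CoinFlipGame(BaseGame):  # hypothetical custom game
    def is_finished(self) -> bool:
        return True


engine = MiniEngine()
engine.register_game_type("coin_flip", CoinFlipGame)
first = engine.create_game("coin_flip", {"max_rounds": 1})
second = engine.create_game("coin_flip", {"max_rounds": 3})
```

Keeping classes rather than instances in the registry is what lets one registration serve many differently configured games.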

Key Features:
  • Type-safe game registration with BaseGame inheritance validation

  • Configuration-driven game instantiation

  • Built-in registration of default game types

  • Runtime game type discovery and metadata retrieval

  • Extensible architecture for custom game implementations

  • Comprehensive error handling for invalid game types

registered_games

Registry mapping game type names to their corresponding game class implementations.

Type:

Dict[str, Type[BaseGame]]

logger

Logger for game engine events and debugging.

Type:

logging.Logger

Example

>>> engine = GameEngine()
>>> print(engine.get_available_games())
['company_car', 'resource_allocation', 'integrative_negotiations']
>>> game = engine.create_game('company_car', {'max_rounds': 5})
>>> isinstance(game, BaseGame)
True

Raises:

ValueError – If attempting to register invalid game class or create unknown game type.

__init__()

Initialize a new GameEngine instance with default game types.

Creates an empty game registry and automatically registers all built-in negotiation game types. The engine is immediately ready for use after initialization.

Default Game Types Registered:
  • company_car: Bilateral vehicle price negotiation

  • resource_allocation: Multi-resource team distribution

  • integrative_negotiations: Multi-issue collaborative negotiation

Example

>>> engine = GameEngine()
>>> 'company_car' in engine.get_available_games()
True

register_game_type(game_name: str, game_class: Type[BaseGame])

Register a new game type with the engine for future instantiation.

Adds a new game class to the registry, making it available for creation via create_game(). Validates that the provided class properly inherits from BaseGame to ensure interface compliance.

Parameters:
  • game_name (str) – The unique identifier for this game type. Used in create_game() calls and must be unique within the registry.

  • game_class (Type[BaseGame]) – The game class to register. Must inherit from BaseGame and implement all required abstract methods.

Raises:

ValueError – If game_class does not inherit from BaseGame.

Example

>>> from my_games import CustomNegotiationGame
>>> engine = GameEngine()
>>> engine.register_game_type("custom_game", CustomNegotiationGame)
>>> "custom_game" in engine.get_available_games()
True

create_game(game_type: str, config: Dict[str, Any]) → BaseGame

Create a new game instance of the specified type with given configuration.

Instantiates a registered game class with the provided configuration dictionary. The configuration is passed directly to the game’s constructor and should contain all necessary parameters for that specific game type.

Parameters:
  • game_type (str) – The registered name of the game type to create. Must be a key that exists in the registered_games registry.

  • config (Dict[str, Any]) – Configuration dictionary containing game-specific parameters. The exact structure depends on the target game type:

      • company_car: max_rounds, batna_decay_rate, etc.

      • resource_allocation: resource_limits, team_preferences, etc.

      • integrative_negotiations: issues, priorities, etc.

Returns:

A fully initialized game instance ready for play.

Return type:

BaseGame

Raises:
  • ValueError – If game_type is not registered in the engine.

  • TypeError – If config is missing required parameters for the game type.

Example

>>> engine = GameEngine()
>>> config = {"max_rounds": 5, "batna_decay_rate": 0.1}
>>> game = engine.create_game("company_car", config)
>>> game.max_rounds
5

get_available_games() → list

Retrieve list of all registered game type identifiers.

Returns the identifiers of all game types currently registered with the engine, including both built-in games and any custom games that have been added via register_game_type().

Returns:

List of string identifiers for all registered game types.

These identifiers can be used with create_game() to instantiate specific game instances.

Return type:

list

Example

>>> engine = GameEngine()
>>> games = engine.get_available_games()
>>> print(games)
['company_car', 'resource_allocation', 'integrative_negotiations']

Note

The returned list reflects the current state of the registry and will include any custom games registered after initialization.

get_game_info(game_type: str) → Dict[str, Any]

Retrieve detailed metadata information about a registered game type.

Provides comprehensive introspection capabilities for registered games, returning class information, documentation, and metadata that can be used for debugging, documentation generation, or dynamic game analysis.

Parameters:

game_type (str) – Identifier of the registered game type to inspect.

Returns:

Dictionary containing comprehensive game metadata:
  • name (str): The registered identifier for the game type

  • class (str): Name of the game class implementation

  • description (str): Class docstring or fallback description

Return type:

Dict[str, Any]

Example

>>> engine = GameEngine()
>>> info = engine.get_game_info("company_car")
>>> print(info["name"])
company_car
>>> print(info["class"])
CompanyCarGame

Raises:

ValueError – If the specified game_type is not registered.

Note

This method is primarily useful for debugging, testing, and documentation generation rather than normal game operation.
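One way the three documented keys could be assembled from a class object is sketched below. This is an assumption about the approach, not the actual implementation: describe_game, UndocumentedGame, and the fallback text are hypothetical, and CompanyCarGame is redefined here as an empty stand-in.

```python
from typing import Any, Dict


class CompanyCarGame:
    """Bilateral price negotiation for a company car."""


class UndocumentedGame:
    pass


def describe_game(game_type: str, game_class: type) -> Dict[str, Any]:
    """Build a metadata dict with the name, class, and description keys."""
    # A class without its own docstring has __doc__ set to None.
    description = (game_class.__doc__ or "").strip() or "No description available"
    return {
        "name": game_type,
        "class": game_class.__name__,
        "description": description,
    }


info = describe_game("company_car", CompanyCarGame)
print(info["class"])
```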

LLM Manager Module

LLM Manager

The LLMManager provides comprehensive management of Large Language Model instances with advanced features including lazy loading, memory management, and plug-and-play model switching for negotiation sessions.

Key Features:
  • Dynamic model loading and unloading with memory management

  • Thread-safe model operations with concurrent session support

  • Lazy loading strategy to minimize GPU memory usage

  • LRU (Least Recently Used) eviction for memory optimization

  • Model aliasing and shared instance management

  • Plug-and-play architecture for different model types

  • Automatic model registration from configuration files

Architecture:

The LLMManager uses a sophisticated caching strategy where models are loaded on-demand and shared across sessions when possible. This minimizes GPU memory usage while maintaining performance for active negotiations.

Memory Management:
  • Lazy loading: Models loaded only when first requested

  • Shared instances: Same model shared across multiple model IDs

  • LRU eviction: Least recently used models unloaded when max_loaded_models is reached

  • Thread safety: All operations protected by locks for concurrent access

class negotiation_platform.core.llm_manager.LLMManager(model_configs: dict)

Bases: object

Advanced Large Language Model management system with memory optimization and threading support.

The LLMManager orchestrates the complete lifecycle of AI model instances used in negotiations. It provides intelligent loading, caching, and memory management to efficiently handle multiple models while minimizing GPU memory usage and maximizing performance.

Core Capabilities:
  • Dynamic model registration from configuration files

  • Lazy loading with on-demand model instantiation

  • Intelligent memory management with LRU eviction

  • Thread-safe operations for concurrent negotiation sessions

  • Model aliasing for flexible configuration management

  • Shared model instances to reduce memory footprint

  • Plug-and-play architecture supporting multiple model types

Memory Strategy:

Uses a sophisticated multi-level caching system:
  1. Model Registry: Configurations for all available models

  2. Shared Instances: Actual loaded models (limited by max_loaded_models)

  3. Model Aliases: Mapping from model IDs to shared instances

  4. LRU Eviction: Automatic unloading of least recently used models
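How aliases, a bounded set of loaded instances, and LRU eviction can fit together is sketched below. This is an illustration under stated assumptions rather than the LLMManager source: LazyLRUCache, its attribute names, and the loader callback are hypothetical, and strings stand in for real model objects.

```python
import threading
from typing import Callable, Dict, List


class LazyLRUCache:
    """Illustrative only; the names here are not taken from LLMManager."""

    def __init__(self, loader: Callable[[str], object], max_loaded: int = 2) -> None:
        self._loader = loader                   # builds a model from its shared name
        self.max_loaded = max_loaded
        self.aliases: Dict[str, str] = {}       # model ID -> shared name
        self.loaded: Dict[str, object] = {}     # shared name -> loaded instance
        self.order: List[str] = []              # least recently used first
        self._lock = threading.Lock()

    def register(self, model_id: str, shared_name: str) -> None:
        with self._lock:
            self.aliases[model_id] = shared_name  # nothing is loaded yet

    def get(self, model_id: str) -> object:
        with self._lock:
            if model_id not in self.aliases:
                raise ValueError(f"Model {model_id} is not registered")
            name = self.aliases[model_id]
            if name in self.loaded:
                self.order.remove(name)           # refresh recency
            else:
                if len(self.loaded) >= self.max_loaded:
                    evicted = self.order.pop(0)   # drop the least recently used
                    del self.loaded[evicted]
                self.loaded[name] = self._loader(name)
            self.order.append(name)
            return self.loaded[name]


cache = LazyLRUCache(loader=lambda name: f"<model {name}>", max_loaded=2)
cache.register("model_a", "llama")
cache.register("model_b", "llama")    # second ID sharing one instance
cache.register("model_c", "mistral")
cache.register("model_d", "qwen")
cache.get("model_a")
cache.get("model_c")
cache.get("model_b")                  # touches "llama" again
cache.get("model_d")                  # evicts "mistral"
print(sorted(cache.loaded))
```

A single lock around both lookup and loading keeps the sketch simple; a real implementation might use finer-grained locking so that one slow load does not block requests for models that are already in memory.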

models

Registry of model ID to wrapper instances.

Type:

Dict[str, BaseLLMModel]

shared_models

Cache of actual loaded model instances.

Type:

Dict[str, BaseLLMModel]

model_aliases

Mapping from model IDs to shared model names.

Type:

Dict[str, str]

model_configs

Configuration for all registered models.

Type:

Dict[str, Dict[str, Any]]

max_loaded_models

Maximum number of models to keep loaded simultaneously.

Type:

int

loaded_order

LRU tracking for loaded models.

Type:

List[str]

manager_lock

Thread synchronization for safe concurrent access.

Type:

threading.Lock

Example

>>> model_configs = {
...     "model_a": {"model_name": "meta-llama/Llama-2-7b-chat-hf", "type": "huggingface"},
...     "model_b": {"model_name": "mistralai/Mistral-7B-Instruct-v0.1", "type": "huggingface"}
... }
>>> llm_manager = LLMManager(model_configs)
>>> response = llm_manager.get_response("Hello, how are you?", model_id="model_a")
>>> print(response)
I'm doing well, thank you for asking!

Thread Safety:

All public methods are thread-safe and can be called concurrently from multiple negotiation sessions without risk of race conditions or memory corruption.

__init__(model_configs: dict)

Initialize the LLMManager with model configurations and setup internal data structures.

Creates a new LLMManager instance with the provided model configurations and initializes all internal data structures for model management, caching, and thread safety. Models are registered but not loaded until first requested.

Parameters:

model_configs (dict) –

Dictionary mapping model IDs to their configurations. Each configuration should contain:

  • model_name (str): HuggingFace model identifier

  • type (str): Model wrapper type (e.g., ‘huggingface’)

  • device (str): Target device (‘cuda:0’, ‘cpu’, etc.)

  • generation_config (dict): Model generation parameters

  • api_key (str): Authentication token (environment variable)

Architecture Setup:
  • models: Registry mapping model IDs to wrapper instances

  • shared_models: Cache of actual loaded model instances (limited size)

  • model_aliases: Mapping from model IDs to shared model names

  • loaded_order: LRU tracking list for memory management

  • manager_lock: Thread synchronization primitive

Memory Management:

The manager is configured to keep a maximum of 2 models loaded simultaneously (configurable via max_loaded_models). This prevents GPU memory exhaustion while maintaining reasonable performance for active negotiations.

Example

>>> configs = {
...     "model_a": {
...         "model_name": "meta-llama/Llama-2-7b-chat-hf",
...         "type": "huggingface",
...         "device": "cuda:0"
...     }
... }
>>> manager = LLMManager(configs)
>>> print(len(manager.model_configs))
1

Note

Models are registered during initialization but not loaded into memory. Loading occurs lazily when the first generation request is made.

register_model(model_id: str, model_config: Dict[str, Any])

Register a new model configuration without loading it into memory.

Adds a model configuration to the registry, making it available for future loading and generation requests. The model is not loaded into memory during registration - this occurs lazily when first requested.

Parameters:
  • model_id (str) – Unique identifier for this model instance. Used to reference the model in generation requests.

  • model_config (Dict[str, Any]) – Configuration dictionary containing:

      • model_name (str): HuggingFace model identifier

      • type (str): Model wrapper type (currently supports ‘huggingface’)

      • device (str): Target device specification

      • generation_config (dict): Model-specific generation parameters

      • api_key (str): Authentication token reference

Supported Model Types:
  • huggingface: HuggingFace Transformers models with GPU/CPU support

Registration Process:
  1. Validates model type is supported

  2. Creates model alias mapping for shared instance management

  3. Stores configuration for lazy loading

  4. Sets up wrapper instance placeholder

Example

>>> manager = LLMManager({})
>>> config = {
...     "model_name": "meta-llama/Llama-2-7b-chat-hf",
...     "type": "huggingface",
...     "device": "cuda:0"
... }
>>> manager.register_model("my_model", config)
>>> "my_model" in manager.models
True

Raises:

ValueError – If the type given in model_config is not supported (currently only ‘huggingface’).

load_model(model_id: str)

Load a specific model into memory with intelligent resource management.

Loads the specified model into GPU/CPU memory. To keep memory use bounded, the least recently used models are evicted automatically when loading would exceed max_loaded_models. Loading is thread-safe, so concurrent requests do not conflict during model initialization.

Parameters:

model_id (str) – Registered identifier of the model to load into memory.

Returns:

The loaded model wrapper ready for text generation.

Return type:

BaseLLMModel

Memory Management:
  • Maintains LRU cache of loaded models

  • Automatically evicts the least recently used models when max_loaded_models is exceeded

  • Shares model instances across multiple aliases to save memory

  • Thread-safe loading prevents concurrent initialization conflicts

Example

>>> manager = LLMManager(model_configs)
>>> model = manager.load_model("llama-7b-chat")
>>> response = model.generate("Hello, how are you?")

Raises:

ValueError – If model_id is not registered in the manager.

Note

Loading is performed lazily - models are only loaded when explicitly requested or when generating responses.

switch_model(model_id: str)

Switch the active model for subsequent generation requests.

Changes the default model used for generation requests that don’t specify a model_id. Enables dynamic model switching during runtime without requiring manager reconfiguration or restart.

Parameters:

model_id (str) – Registered identifier of the model to make active.

Lazy Loading:

If the target model is not currently loaded in memory, it is loaded automatically, subject to the same LRU eviction rules as load_model().

Example

>>> manager.switch_model("llama-7b-chat")
>>> response = manager.get_response("Hello")  # Uses llama-7b-chat
>>> manager.switch_model("mistral-7b")
>>> response = manager.get_response("Hello")  # Uses mistral-7b

Raises:

ValueError – If model_id is not registered in the manager.

Note

Switching models does not immediately unload the previous active model - it remains in memory subject to LRU eviction policies.

get_response(prompt: str, model_id: str | None = None, **kwargs) → str

Generate text response using specified model or currently active model.

Primary interface for text generation that handles model selection, lazy loading, and response generation. Automatically loads models on-demand if not already in memory and manages the generation pipeline.

Parameters:
  • prompt (str) – Input text prompt for the language model to process.

  • model_id (str, optional) – Specific model identifier to use for generation. If None, uses the currently active model.

  • **kwargs – Additional generation parameters passed to the model:

      • temperature (float): Randomness control (0.0-1.0)

      • max_length (int): Maximum response length

      • top_p (float): Nucleus sampling parameter

      • do_sample (bool): Whether to use sampling

Returns:

Generated text response from the language model.

Return type:

str

Lazy Loading:

Models are loaded automatically if not already in memory, making this method fully self-contained for generation requests.

Example

>>> manager = LLMManager(model_configs)
>>> response = manager.get_response("What is AI?", model_id="llama-7b-chat", temperature=0.7)
>>> print(response)
Artificial Intelligence is...

Raises:
  • RuntimeError – If no model is specified and no active model is set.

  • ValueError – If the specified model_id is not registered.

Note

This method automatically updates the active model if a specific model_id is provided, making it the new default for future calls.
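The selection rules described for get_response() (explicit model_id first, then the active model, with the two documented error cases) can be sketched in isolation. ModelSelector, resolve, and active_model_id are hypothetical names chosen for illustration; the real manager also loads the model and generates text, which this sketch leaves out.

```python
from typing import Dict, Optional


class ModelSelector:
    """Illustrative only; mirrors the selection rules described above."""

    def __init__(self, registered: Dict[str, str]) -> None:
        self.registered = registered              # model ID -> description
        self.active_model_id: Optional[str] = None

    def resolve(self, model_id: Optional[str] = None) -> str:
        chosen = model_id if model_id is not None else self.active_model_id
        if chosen is None:
            raise RuntimeError("No model specified and no active model set")
        if chosen not in self.registered:
            raise ValueError(f"Model {chosen} is not registered")
        self.active_model_id = chosen             # becomes the new default
        return chosen


selector = ModelSelector({"model_a": "llama", "model_b": "mistral"})
first = selector.resolve("model_a")    # explicit ID, also becomes active
second = selector.resolve()            # falls back to the active model
```

One consequence of this rule is that passing model_id once changes which model later calls without model_id will use, so callers that interleave models may prefer to pass model_id every time.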

unload_model(model_id: str)

Unload a specific model from GPU/CPU memory to free resources.

Removes the specified model from memory while respecting shared model instances. If multiple model aliases reference the same underlying model, the unload operation will only proceed if no other aliases are using it.

Parameters:

model_id (str) – Registered identifier of the model to unload from memory.

Memory Management:
  • Respects shared model instances across multiple aliases

  • Updates LRU tracking to reflect unloaded state

  • Frees GPU/CPU memory allocated to the model

Example

>>> manager.unload_model("llama-7b-chat")
>>> # Model is removed from memory, alias remains registered
>>> manager.load_model("llama-7b-chat")  # Can be reloaded later

Raises:

ValueError – If model_id is not registered in the manager.

Note

Unloading does not remove the model from the registry - it can be reloaded later via load_model(). This is primarily useful for manual memory management in resource-constrained environments.

list_models() → Dict[str, Dict[str, Any]]

Retrieve comprehensive information about all registered models.

Provides detailed metadata about all models in the registry including configuration details, loading status, and memory usage information. Useful for debugging, monitoring, and system introspection.

Returns:

Dictionary mapping model IDs to their detailed information, including:
  • model_name (str): HuggingFace model identifier

  • is_loaded (bool): Current memory loading status

  • device (str): Target device specification

  • configuration details from the model wrapper

Return type:

Dict[str, Dict[str, Any]]

Example

>>> models = manager.list_models()
>>> for model_id, info in models.items():
...     print(f"{model_id}: loaded={info.get('is_loaded', False)}")
llama-7b-chat: loaded=True
mistral-7b: loaded=False

Note

This method is primarily useful for monitoring and debugging rather than normal operation. The information reflects the current state and may change as models are loaded and unloaded.

cleanup()

Unload all models and free all allocated resources.

Performs comprehensive cleanup by unloading all models from memory, clearing caches, and resetting internal state. Should be called when the manager is no longer needed to ensure proper resource cleanup.

Cleanup Operations:
  • Unloads all models from GPU/CPU memory

  • Clears LRU tracking and shared model caches

  • Resets active model state

  • Frees all allocated GPU/CPU resources

Example

>>> manager = LLMManager(model_configs)
>>> # ... use models ...
>>> manager.cleanup()  # Free all resources

Note

After cleanup, the manager instance should not be used further. Create a new manager instance if continued operation is needed. This method is particularly important for long-running applications to prevent memory leaks.
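To make sure cleanup() runs even when a session fails, the call can be wrapped in a context manager. The sketch below assumes nothing beyond the existence of a cleanup() method: managed is a hypothetical helper, and StubManager is a stand-in so the snippet runs without model weights.

```python
from contextlib import contextmanager
from typing import Iterator, List


class StubManager:
    """Stand-in for LLMManager so the sketch runs without model weights."""

    def __init__(self) -> None:
        self.loaded: List[str] = ["model_a"]

    def cleanup(self) -> None:
        self.loaded.clear()


@contextmanager
def managed(manager: StubManager) -> Iterator[StubManager]:
    """Hypothetical helper: yield the manager, then always call cleanup()."""
    try:
        yield manager
    finally:
        manager.cleanup()


stub = StubManager()
try:
    with managed(stub) as active:
        raise RuntimeError("simulated failure during a session")
except RuntimeError:
    pass
print(stub.loaded)
```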

Metrics Calculator Module

Metrics Calculator

Main component for calculating comprehensive negotiation performance metrics.

This module provides a centralized, extensible system for computing performance metrics across negotiation sessions. It implements a plug-and-play architecture that allows dynamic registration and calculation of various metric types.

Key Features:
  • Plug-and-play metric registration system

  • Comprehensive performance analysis across multiple dimensions

  • Extensible architecture for custom metric implementations

  • Robust error handling with individual metric failure isolation

  • Standardized result aggregation and reporting

  • Compatible interface with session manager data formats

Core Responsibilities:
  1. Metric Registration – Dynamic loading and registration of metric implementations

  2. Data Conversion – Transform session data into metric-compatible formats

  3. Computation – Execute metric calculations across registered metrics

  4. Error Handling – Manage failures in individual metric computations

  5. Result Aggregation – Combine individual metric results into structured format

  6. Report Generation – Create comprehensive performance reports with statistics

Architecture:

The calculator follows a plugin-based architecture where individual metrics inherit from BaseMetric and implement standardized calculation interfaces. This allows for easy addition of new metrics without modifying core logic.

Supported Default Metrics:
  • Utility Surplus: Measures utility gained above BATNA baseline

  • Risk Minimization: Analyzes risk-taking behavior and outcomes

  • Deadline Sensitivity: Evaluates time pressure impact on decisions

  • Feasibility: Assesses solution viability and constraint satisfaction

Example

>>> calculator = MetricsCalculator()
>>> game_state = {
...     'agreement_reached': True,
...     'final_utilities': {'Player1': 85, 'Player2': 75},
...     'current_round': 4
... }
>>> actions_history = [
...     {'round': 1, 'actions': {'Player1': {'type': 'offer', 'value': 100}}},
...     {'round': 2, 'actions': {'Player2': {'type': 'counteroffer', 'value': 90}}}
... ]
>>> results = calculator.calculate_all(game_state, actions_history)
>>> print(results['utility_surplus'])
{'Player1': 25.0, 'Player2': 15.0}

class negotiation_platform.core.metrics_calculator.MetricsCalculator(config: Dict[str, Any] | None = None)

Bases: object

Centralized calculator for negotiation performance metrics with plug-and-play architecture.

This class serves as the main coordinator for metric computation, providing a standardized interface for calculating comprehensive performance metrics across various dimensions of negotiation analysis. It implements dynamic metric registration and supports both built-in and custom metric implementations.

The calculator automatically converts session manager data formats (game_state and actions_history) into metric-compatible formats (GameResult and PlayerAction objects), enabling seamless integration with the broader negotiation platform architecture.

Key Features:
  • Dynamic metric registration and discovery system

  • Automatic data format conversion for metric compatibility

  • Comprehensive error handling with graceful degradation

  • Extensible architecture supporting custom metric implementations

  • Standardized result formatting and aggregation

  • Performance report generation with summary statistics

Architecture:

The calculator maintains a registry of BaseMetric implementations and dynamically instantiates them during initialization. Each metric implements standardized calculation methods, ensuring consistent computation and result formats.

Data Flow:
  1. Session data (game_state, actions_history) received from SessionManager

  2. Data converted to GameResult and PlayerAction objects for metric compatibility

  3. Each registered metric calculates its specific performance measures

  4. Individual results aggregated into comprehensive metrics dictionary

  5. Optional report generation with summary statistics and analysis
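The conversion step in the data flow can be pictured as flattening the per-round dictionaries into a chronological list of action objects. The sketch below is an assumption about that step, not the calculator's code: flatten_actions is a hypothetical function, the PlayerAction dataclass is a stand-in, and only player_id and action_type are field names that appear elsewhere in this documentation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PlayerAction:
    """Stand-in; round_number and data are hypothetical fields."""

    player_id: str
    action_type: str
    round_number: int
    data: Dict[str, Any] = field(default_factory=dict)


def flatten_actions(actions_history: List[Dict[str, Any]]) -> List[PlayerAction]:
    """Turn per-round action dicts into a flat, chronological list."""
    flat: List[PlayerAction] = []
    for entry in actions_history:
        for player_id, action in entry.get("actions", {}).items():
            # Everything except "type" is kept as free-form payload.
            payload = {k: v for k, v in action.items() if k != "type"}
            flat.append(
                PlayerAction(
                    player_id=player_id,
                    action_type=action.get("type", "unknown"),
                    round_number=entry["round"],
                    data=payload,
                )
            )
    return flat


history = [
    {"round": 1, "actions": {"Player1": {"type": "offer", "value": 100}}},
    {"round": 2, "actions": {"Player2": {"type": "counteroffer", "value": 90}}},
    {"round": 3, "actions": {"Player1": {"type": "accept"}}},
]
actions = flatten_actions(history)
```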

Supported Operations:
  • Registration: Add/remove metrics dynamically during runtime

  • Calculation: Compute all or specific subsets of registered metrics

  • Reporting: Generate comprehensive performance reports with statistics

  • Introspection: List available metrics and retrieve descriptions

config

Configuration parameters for metric calculations.

Type:

Dict[str, Any]

metrics

Registry of available metric implementations.

Type:

Dict[str, BaseMetric]

Example

>>> # Initialize with default metrics
>>> calculator = MetricsCalculator()
>>>
>>> # Register custom metric
>>> custom_metric = MyCustomMetric()
>>> calculator.register_metric("custom_analysis", custom_metric)
>>>
>>> # Calculate all metrics for a session
>>> results = calculator.calculate_all(game_state, actions_history)
>>>
>>> # Generate comprehensive report
>>> report = calculator.generate_report(game_result, player_actions)
>>> print(f"Success rate: {report['success']}")
>>> print(f"Average utility: {report['summary_stats']['utility_surplus']['avg']}")

Raises:
  • ValueError – If invalid game state or actions history data provided.

  • TypeError – If metric registration receives non-BaseMetric implementations.

  • RuntimeError – If critical metric computation failures occur across all metrics.

__init__(config: Dict[str, Any] | None = None)

Initialize MetricsCalculator with configuration and default metrics.

Creates a new metrics calculator instance, sets up the metric registry, and automatically registers default performance metrics for immediate use.

Parameters:

config (Dict[str, Any], optional) – Configuration parameters for metric calculations. Can include thresholds, weights, or calculation preferences. If None, uses empty configuration with default settings.

Example

>>> # Initialize with default configuration
>>> calculator = MetricsCalculator()
>>>
>>> # Initialize with custom configuration
>>> config = {
...     'utility_threshold': 0.8,
...     'risk_tolerance': 0.2,
...     'enable_detailed_logging': True
... }
>>> calculator = MetricsCalculator(config)

register_metric(metric_id: str, metric: BaseMetric)

Register a new metric for dynamic calculation (plug-and-play functionality).

Adds a metric implementation to the calculator’s registry, making it available for computation in all subsequent metric calculations. Supports runtime addition of custom metrics without requiring system restart or reconfiguration.

Parameters:
  • metric_id (str) – Unique identifier for the metric. Used as key in results dictionary and for metric-specific operations. Must be unique across all registered metrics.

  • metric (BaseMetric) – Metric implementation instance that inherits from BaseMetric and implements required calculation methods.

Example

>>> calculator = MetricsCalculator()
>>> custom_metric = MyCustomAnalysisMetric()
>>> calculator.register_metric("custom_analysis", custom_metric)
>>>
>>> # Metric now available in calculations
>>> results = calculator.calculate_all(game_state, actions_history)
>>> custom_result = results["custom_analysis"]

Raises:
  • TypeError – If metric is not an instance of BaseMetric.

  • ValueError – If metric_id is already registered (call unregister_metric() first).

unregister_metric(metric_id: str)

Remove a metric from the calculation registry.

Removes a previously registered metric from the calculator’s registry, preventing it from being included in future metric calculations. Useful for dynamically adjusting analysis scope or removing problematic metrics.

Parameters:

metric_id (str) – Identifier of the metric to remove. Must match the identifier used during registration.

Example

>>> calculator = MetricsCalculator()
>>> calculator.unregister_metric("risk_minimization")
>>>
>>> # risk_minimization no longer included in calculations
>>> results = calculator.calculate_all(game_state, actions_history)
>>> # 'risk_minimization' key will not be present in results

Note

Silently ignores attempts to unregister non-existent metrics. No error is raised if the metric_id is not found in the registry.

calculate_all(game_state: Dict[str, Any], actions_history: List[Dict[str, Any]]) → Dict[str, Dict[str, float]]

Calculate all registered metrics using session manager data formats.

Primary interface for metric calculation that accepts data directly from SessionManager. Automatically converts session data formats into metric-compatible objects and computes all registered metrics in a single operation.

This method serves as the main entry point for comprehensive performance analysis, handling data transformation, metric computation, and result aggregation in a unified workflow.

Parameters:
  • game_state (Dict[str, Any]) – Final game state from completed negotiation session. Expected to contain keys like ‘game_type’, ‘players’, ‘agreement_reached’, ‘final_utilities’, ‘current_round’, and game-specific data.

  • actions_history (List[Dict[str, Any]]) – Chronological log of all actions taken during the negotiation. Each entry should contain ‘round’ and ‘actions’ keys with player actions for that round.

Returns:

Nested dictionary where top-level keys are metric identifiers and values are dictionaries mapping player IDs to their computed metric values.

Return type:

Dict[str, Dict[str, float]]

Example

>>> game_state = {
...     'game_type': 'price_bargaining',
...     'players': ['Player1', 'Player2'],
...     'agreement_reached': True,
...     'final_utilities': {'Player1': 85, 'Player2': 75},
...     'current_round': 3
... }
>>> actions_history = [
...     {'round': 1, 'actions': {'Player1': {'type': 'offer', 'value': 100}}},
...     {'round': 2, 'actions': {'Player2': {'type': 'counteroffer', 'value': 90}}},
...     {'round': 3, 'actions': {'Player1': {'type': 'accept'}}}
... ]
>>> results = calculator.calculate_all(game_state, actions_history)
>>> print(results)
{
    'utility_surplus': {'Player1': 25.0, 'Player2': 15.0},
    'risk_minimization': {'Player1': 0.8, 'Player2': 0.6},
    'deadline_sensitivity': {'Player1': 0.9, 'Player2': 0.7},
    'feasibility': {'Player1': 1.0, 'Player2': 1.0}
}

Raises:
  • ValueError – If game_state or actions_history contain invalid or missing data.

  • RuntimeError – If data conversion or metric computation fails critically.
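The utility_surplus values in the example above are consistent with a simple reading of "utility gained above BATNA baseline": final utility minus BATNA. The sketch below works through that arithmetic. The formula itself and the BATNA value of 60 for each player are assumptions made here for illustration; the actual metric may compute surplus differently.

```python
from typing import Dict


def utility_surplus(
    final_utilities: Dict[str, float], batna: Dict[str, float]
) -> Dict[str, float]:
    """Surplus per player as final utility minus that player's BATNA."""
    return {
        player: float(final_utilities[player] - batna[player])
        for player in final_utilities
    }


# BATNA values are assumed; they do not appear in the documented example.
surplus = utility_surplus(
    {"Player1": 85, "Player2": 75},
    {"Player1": 60, "Player2": 60},
)
print(surplus)
```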

calculate_all_metrics(game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Dict[str, float]]

Calculate all registered metrics using GameResult and PlayerAction objects.

Core computation method that executes all registered metrics against standardized data objects. Provides comprehensive error handling to ensure partial results are returned even if individual metrics fail.

Parameters:
  • game_result (GameResult) – Standardized game outcome data containing final scores, winner information, and game metadata.

  • actions_history (List[PlayerAction]) – Chronological sequence of player actions taken during the negotiation session.

Returns:

Nested dictionary where top-level keys are metric identifiers and values are player-to-score mappings.

Return type:

Dict[str, Dict[str, float]]

Example

>>> game_result = GameResult(game_id="test", players=["P1", "P2"], ...)
>>> actions = [PlayerAction(player_id="P1", action_type="offer", ...)]
>>> results = calculator.calculate_all_metrics(game_result, actions)
>>> print(results["utility_surplus"]["P1"])
25.0

Note

Failed metric calculations are logged and replaced with default values (0.0 for each player) to ensure consistent result structure.
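
The fallback behaviour described in this note can be sketched as a loop that isolates each metric in its own try/except. This is an illustrative sketch only: run_metrics_safely and the callable-based metric registry are assumptions made for the example, not the calculator's actual internals.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("metrics_sketch")

# Assumption: a metric is modelled here as a callable taking the player list.
MetricFn = Callable[[List[str]], Dict[str, float]]


def run_metrics_safely(metrics: Dict[str, MetricFn],
                       players: List[str]) -> Dict[str, Dict[str, float]]:
    """Run every metric; on failure, log it and fall back to 0.0 per player."""
    results: Dict[str, Dict[str, float]] = {}
    for metric_id, metric_fn in metrics.items():
        try:
            results[metric_id] = metric_fn(players)
        except Exception:
            logger.exception("Metric %s failed; using default values", metric_id)
            results[metric_id] = {player: 0.0 for player in players}
    return results


def broken_metric(players: List[str]) -> Dict[str, float]:
    raise ZeroDivisionError("simulated failure")


demo_results = run_metrics_safely(
    {
        "utility_surplus": lambda players: {p: 1.0 for p in players},
        "feasibility": broken_metric,
    },
    ["P1", "P2"],
)
```

Because every metric ID ends up with an entry for every player, downstream code can index the result without checking which calculations succeeded.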

calculate_specific_metrics(metric_ids: List[str], game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Dict[str, float]][source]

Calculate only specified metrics for targeted performance analysis.

Computes a subset of registered metrics based on provided identifiers. Useful for focused analysis or when computational resources are limited and only specific metrics are required.

Parameters:
  • metric_ids (List[str]) – List of metric identifiers to calculate. Must match registered metric IDs. Unknown metrics are skipped.

  • game_result (GameResult) – Standardized game outcome data.

  • actions_history (List[PlayerAction]) – Player action sequence.

Returns:

Results dictionary containing only requested metrics. Structure matches calculate_all_metrics output.

Return type:

Dict[str, Dict[str, float]]

Example

>>> # Calculate only utility and risk metrics
>>> specific_results = calculator.calculate_specific_metrics(
...     ["utility_surplus", "risk_minimization"],
...     game_result,
...     actions_history
... )
>>> print(list(specific_results.keys()))
['utility_surplus', 'risk_minimization']

Note

Unregistered metric IDs are logged as warnings and skipped. Failed calculations are replaced with default values.
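
The skip-and-warn rule for unregistered IDs can be illustrated with a small helper that separates the requested IDs into those that are registered and those that are not. The helper name split_metric_ids is hypothetical, and dropping duplicate requests is an assumption of this sketch rather than documented behaviour.

```python
import logging
from typing import Iterable, List, Tuple

logger = logging.getLogger("metric_selection_sketch")


def split_metric_ids(requested: Iterable[str],
                     registered: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (selected, skipped), preserving the order of the request."""
    known = set(registered)
    selected: List[str] = []
    skipped: List[str] = []
    for metric_id in requested:
        if metric_id in known:
            if metric_id not in selected:  # assumption: duplicates are ignored
                selected.append(metric_id)
        else:
            logger.warning("Metric %r is not registered; skipping", metric_id)
            skipped.append(metric_id)
    return selected, skipped


selected_ids, skipped_ids = split_metric_ids(
    ["utility_surplus", "fairness", "risk_minimization"],
    ["utility_surplus", "risk_minimization", "deadline_sensitivity", "feasibility"],
)
```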

get_metric_descriptions() → Dict[str, str][source]

Retrieve descriptions of all registered metrics for documentation and analysis.

Provides human-readable descriptions of each registered metric, useful for generating documentation, user interfaces, and analytical reports that explain what each metric measures.

Returns:

Mapping of metric identifiers to their descriptive text. Each description explains what the metric measures and its significance.

Return type:

Dict[str, str]

Example

>>> descriptions = calculator.get_metric_descriptions()
>>> print(descriptions["utility_surplus"])
Measures the utility gained above BATNA baseline for each player
list_metrics() → List[str][source]

List all currently registered metric identifiers.

Returns the identifiers of all metrics currently available for calculation. Useful for introspection, validation, and dynamic metric selection.

Returns:

List of registered metric identifiers that can be used with calculate_specific_metrics or other metric operations.

Return type:

List[str]

Example

>>> available_metrics = calculator.list_metrics()
>>> print(available_metrics)
['utility_surplus', 'risk_minimization', 'deadline_sensitivity', 'feasibility']
generate_report(game_result: GameResult, actions_history: List[PlayerAction]) → Dict[str, Any][source]

Generate comprehensive performance report with metrics and summary statistics.

Creates a detailed analytical report that includes all metric calculations plus derived summary statistics such as averages, ranges, and totals. Ideal for comprehensive performance analysis and comparative studies.

Parameters:
  • game_result (GameResult) – Final game outcome data.

  • actions_history (List[PlayerAction]) – Complete sequence of player actions.

Returns:

Comprehensive report containing:
  • game_id: Game identifier

  • players: List of participating players

  • success: Whether negotiation was successful

  • total_rounds: Number of negotiation rounds

  • metrics: Full metric calculation results

  • summary_stats: Derived statistics (avg, min, max, total) per metric

Return type:

Dict[str, Any]

Example

>>> report = calculator.generate_report(game_result, actions_history)
>>> print(report["summary_stats"]["utility_surplus"]["avg"])
20.0
>>> print(report["success"])
True
>>> print(f"Game completed in {report['total_rounds']} rounds")
Game completed in 3 rounds

Note

Summary statistics are calculated only for metrics that return numeric values. Non-numeric metrics are included in the metrics section but excluded from summary statistics.
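
As a rough illustration of how the summary_stats section could be derived, the sketch below computes avg, min, max, and total per metric and leaves out any metric whose values are not all numeric. The function name summarize_metrics is hypothetical, and the all-or-nothing treatment of mixed metrics is an assumption.

```python
from numbers import Real
from typing import Any, Dict


def summarize_metrics(metrics: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Compute avg/min/max/total for each metric with purely numeric values."""
    summary: Dict[str, Dict[str, float]] = {}
    for metric_id, per_player in metrics.items():
        values = list(per_player.values())
        # bool is a subclass of int, so it is excluded explicitly.
        numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
        if not values or not numeric:
            continue
        total = float(sum(values))
        summary[metric_id] = {
            "avg": total / len(values),
            "min": float(min(values)),
            "max": float(max(values)),
            "total": total,
        }
    return summary


demo_summary = summarize_metrics({
    "utility_surplus": {"Player1": 25.0, "Player2": 15.0},
    "outcome_label": {"Player1": "win", "Player2": "loss"},
})
print(demo_summary["utility_surplus"]["avg"])  # 20.0
```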

Session Manager Module

Session Manager

Coordinates a single negotiation “session” between two (or more) LLM agents playing a specific game. Responsibilities:

  1. Game bootstrap – Creates the game instance and initial state.

  2. Turn scheduling – Alternates prompts/actions between players.

  3. Action validation – Uses each game’s is_valid_action method.

  4. State transition & logging – Calls process_actions and keeps history.

  5. Stopping criteria – Ends when the game says it is over.

  6. Metric computation – Delegates to MetricsCalculator once done.

  7. Fault tolerance – Catches malformed LLM outputs & retries.
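
The fault-tolerance responsibility can be sketched as "parse, validate, retry". The example below assumes that models answer with a JSON object embedded in free text and that max_turn_retries counts total attempts; both are assumptions made for illustration. parse_action and request_action are hypothetical helpers, not platform functions.

```python
import json
from typing import Any, Callable, Dict, Optional


def parse_action(raw_output: str) -> Optional[Dict[str, Any]]:
    """Parse the text between the first '{' and the last '}' as a JSON object."""
    start = raw_output.find("{")
    end = raw_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(raw_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def request_action(generate: Callable[[int], str],
                   is_valid: Callable[[Dict[str, Any]], bool],
                   max_turn_retries: int = 3) -> Optional[Dict[str, Any]]:
    """Ask for an action until one parses and validates, or attempts run out."""
    for attempt in range(1, max_turn_retries + 1):
        action = parse_action(generate(attempt))
        if action is not None and is_valid(action):
            return action
    return None


# Scripted replies stand in for a model: the first one is malformed.
scripted = {
    1: "Sure! I offer 100.",
    2: 'Here you go: {"type": "offer", "value": 100}',
}
demo_action = request_action(
    generate=lambda attempt: scripted[attempt],
    is_valid=lambda action: action.get("type") in {"offer", "accept"},
)
```

Returning None after the final attempt leaves it to the caller to decide whether to end the session or substitute a default action.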

class negotiation_platform.core.session_manager.SessionManager(llm_manager: LLMManager, game_engine: GameEngine, metrics_calculator: MetricsCalculator, *, max_turn_retries: int = 3, logger: Logger | None = None)[source]

Bases: object

High-level driver that orchestrates complete negotiation sessions between AI agents.

This class coordinates all aspects of a negotiation session, from game initialization through completion and metrics calculation. It serves as the central orchestrator that manages turn-based interactions, validates actions, handles errors, and computes final performance metrics.

Key Responsibilities:
  • Game bootstrap and initial state creation

  • Turn-based scheduling and player action coordination

  • Action validation using game-specific rules

  • State transition management and action history logging

  • Game termination detection based on stopping criteria

  • Comprehensive metrics calculation via MetricsCalculator

  • Fault tolerance for malformed LLM outputs with retry logic

  • Winner determination and performance analysis

Workflow:
  1. Initialize game instance with specified configuration

  2. Establish initial game state and player assignments

  3. Coordinate alternating turns between players

  4. Validate each action against game rules

  5. Process valid actions and update game state

  6. Log all actions and state transitions

  7. Check termination conditions after each round

  8. Calculate comprehensive metrics upon completion

  9. Return enriched results with performance analysis
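
Steps 1 to 7 of this workflow can be sketched as a short loop. The toy game and the run_session function below are hypothetical: the method names is_valid_action and process_actions follow the module description, but their signatures, along with initial_state and is_game_over, are assumptions. Scripted callables stand in for LLM players, and retry handling and metric computation are left out to keep the loop short.

```python
from typing import Any, Callable, Dict, List


class ToyBargainingGame:
    """Hypothetical stand-in for a registered game, for illustration only."""

    def __init__(self, max_rounds: int = 5) -> None:
        self.max_rounds = max_rounds

    def initial_state(self, players: List[str]) -> Dict[str, Any]:
        return {"players": list(players), "current_round": 0, "agreement_reached": False}

    def is_valid_action(self, player: str, action: Any, state: Dict[str, Any]) -> bool:
        return isinstance(action, dict) and action.get("type") in {"offer", "accept"}

    def process_actions(self, actions: Dict[str, Dict[str, Any]],
                        state: Dict[str, Any]) -> Dict[str, Any]:
        new_state = dict(state)
        new_state["current_round"] = state["current_round"] + 1
        if any(action["type"] == "accept" for action in actions.values()):
            new_state["agreement_reached"] = True
        return new_state

    def is_game_over(self, state: Dict[str, Any]) -> bool:
        return state["agreement_reached"] or state["current_round"] >= self.max_rounds


Agent = Callable[[Dict[str, Any]], Any]


def run_session(game: ToyBargainingGame, agents: Dict[str, Agent]) -> Dict[str, Any]:
    players = list(agents)
    state = game.initial_state(players)           # steps 1-2: bootstrap
    history: List[Dict[str, Any]] = []
    turn = 0
    while not game.is_game_over(state):           # step 7: termination check
        player = players[turn % len(players)]     # step 3: alternate turns
        action = agents[player](state)
        if not game.is_valid_action(player, action, state):   # step 4: validate
            raise ValueError(f"{player} produced an invalid action: {action!r}")
        state = game.process_actions({player: action}, state)  # step 5: transition
        history.append({"round": state["current_round"], "actions": {player: action}})  # step 6
        turn += 1
    return {**state, "actions_history": history}


demo_result = run_session(
    ToyBargainingGame(max_rounds=5),
    {
        "model_a": lambda state: {"type": "offer", "value": 100},
        "model_b": lambda state: {"type": "accept"},
    },
)
```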

llm_manager

Manages AI model loading and interaction.

Type:

LLMManager

game_engine

Creates and manages game instances.

Type:

GameEngine

metrics_calculator

Computes performance metrics.

Type:

MetricsCalculator

max_turn_retries

Maximum retry attempts for invalid actions.

Type:

int

logger

Logger for session events and debugging.

Type:

logging.Logger

Example

>>> llm_manager = LLMManager(model_configs)
>>> game_engine = GameEngine()
>>> metrics_calc = MetricsCalculator()
>>> session = SessionManager(llm_manager, game_engine, metrics_calc)
>>> result = session.run_negotiation(
...     game_type="price_bargaining",
...     players=["model_a", "model_b"],
...     game_config={"max_rounds": 5}
... )
>>> print(result['agreement_reached'])
True
Raises:
  • ValueError – If invalid game type or player configuration provided.

  • RuntimeError – If session execution fails due to unrecoverable errors.

__init__(llm_manager: LLMManager, game_engine: GameEngine, metrics_calculator: MetricsCalculator, *, max_turn_retries: int = 3, logger: Logger | None = None) → None[source]

Initialize a new SessionManager instance with required components.

Creates a session manager that orchestrates negotiations between AI agents using the provided LLM manager, game engine, and metrics calculator.

Parameters:
  • llm_manager (LLMManager) – Manager for loading and interacting with AI models. Must be configured with the models that will participate in negotiations.

  • game_engine (GameEngine) – Engine for creating and managing game instances. Should have all required game types registered.

  • metrics_calculator (MetricsCalculator) – Calculator for computing performance metrics. Will be used to analyze negotiation outcomes and player performance.

  • max_turn_retries (int, optional) – Maximum number of retry attempts when a player provides an invalid action. Defaults to 3. Higher values increase robustness but may slow down sessions in which a player repeatedly returns invalid actions.

  • logger (logging.Logger, optional) – Logger for session events and debugging. If None, creates a new logger using the class name.

Example

>>> llm_manager = LLMManager({"model_a": model_config})
>>> game_engine = GameEngine()
>>> metrics_calc = MetricsCalculator()
>>> session = SessionManager(
...     llm_manager=llm_manager,
...     game_engine=game_engine,
...     metrics_calculator=metrics_calc,
...     max_turn_retries=5  # Allow more retries for unstable models
... )
run_negotiation(*, game_type: str, players: List[str], game_config: Dict[str, Any] | None = None, session_id: str | None = None, seed_messages: Dict[str, str] | None = None) → Dict[str, Any][source]

Execute a full negotiation session and return the enriched game result that also contains computed metrics and a complete action log.

Parameters:
  • game_type (str) – Registered key of the game inside GameEngine (e.g. "company_car").

  • players (List[str]) – Ordered list of model names (length 2 for bilateral games).

  • game_config (Dict[str, Any], optional) – Per-game configuration dictionary. If None, defaults are used.

  • session_id (str, optional) – External identifier for the session. Autogenerated if omitted.

  • seed_messages (Dict[str, str], optional) – Mapping of {player_name: system_prompt} used to prime each player’s behaviour.

Returns:

Result dictionary containing:
  • the raw game_state at termination

  • "actions_history": chronological list of {round, actions} dicts

  • "metrics": Dict[str, Dict[str, float]] (metric → player → value)

  • "session_metadata": miscellaneous run information (IDs, timestamps, etc.)

Return type:

Dict[str, Any]
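
Because the "metrics" entry is nested as metric → player → value, a caller can walk it with ordinary dictionary operations. The sketch below picks the top-scoring player for each metric. The helper name best_player_per_metric is hypothetical, the sample values are illustrative only, and treating a higher value as better is an assumption that may not hold for every metric.

```python
from typing import Dict


def best_player_per_metric(metrics: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each metric ID to the player holding the highest value."""
    return {
        metric_id: max(per_player, key=per_player.get)
        for metric_id, per_player in metrics.items()
        if per_player  # skip metrics with no player entries
    }


# Illustrative result fragment shaped like the structure described above.
sample_result = {
    "agreement_reached": True,
    "metrics": {
        "utility_surplus": {"model_a": 25.0, "model_b": 15.0},
        "risk_minimization": {"model_a": 0.6, "model_b": 0.8},
    },
}
leaders = best_player_per_metric(sample_result["metrics"])
print(leaders)  # {'utility_surplus': 'model_a', 'risk_minimization': 'model_b'}
```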