Results

Statistical analysis and research tools for processing negotiation data and evaluating model performance.

This package supports researchers studying negotiation behavior, role and first-mover biases, and AI model performance in structured negotiation environments.

Key Features

  • Bias Detection: Chi-square tests with Cohen’s h effect sizes for role bias and first-mover advantage

  • Model Comparison: Pairwise model comparison adjusted for role and first-mover bias

  • Performance Metrics: Per-agent metric analysis and tracking across sessions

  • Statistical Reporting: Test statistics, p-values, and effect sizes

  • Data Processing: Log parsing into one row per game or one row per model per game

Submodules

metrics_statistics.py

Comprehensive statistical analysis tools for negotiation metrics and performance data.

Negotiation Metrics Analysis and Statistics

This module processes negotiation logs and computes statistical measures of agent performance for research and evaluation purposes.

Key Features:
  • Log parsing and data extraction from negotiation sessions

  • Agent-based metrics analysis with model performance tracking

  • Significance testing (p-values) for model comparisons

  • Performance comparison across different negotiation scenarios

  • Console reporting and CSV export of the corrected data

Analysis Capabilities:
  • Agent Performance Analysis: Tracks individual model metrics over time

  • Statistical Distribution Testing: Validates metric distributions

  • Comparative Analysis: Compares performance across models and games

  • Correlation Analysis: Identifies relationships between metrics

  • Longitudinal Analysis: Tracks performance changes over sessions

Usage:

This module is intended for researchers analyzing output logs from the negotiation platform. It can be run from the command line or imported and called programmatically to analyze agent performance metrics.

Example

python metrics_statistics.py negotiation_log.out

results.metrics_statistics.parse_negotiation_log_agent_metrics(file_path)

Parses a negotiation log file to extract agent-based metrics, ensuring one row per model per game.

Parameters:

file_path (str) – Path to the negotiation log file.

Returns:

A tuple containing:
  • pd.DataFrame: DataFrame with parsed agent-based metrics.

  • str: The detected game type (e.g., 'integrative_negotiation').

Return type:

tuple

Raises:

Example

>>> df, game_type = parse_negotiation_log_agent_metrics("log_file.out")
>>> print(game_type)
'integrative_negotiation'
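
To illustrate the one-row-per-model-per-game shape, here is a minimal sketch that flattens already-parsed game records into agent rows. The record layout (`game_id`, `agents`, `model`, `metrics`) and the function name `to_agent_rows` are hypothetical and are not taken from the package; the actual log format is not shown in this documentation.

```python
def to_agent_rows(games):
    """Flatten game records into one row per (game_id, model) pair.

    Hypothetical input layout, assumed for illustration only:
    {"game_id": ..., "agents": [{"model": ..., "metrics": {...}}, ...]}
    """
    rows = {}
    for game in games:
        for agent in game["agents"]:
            key = (game["game_id"], agent["model"])
            # A repeated (game, model) pair overwrites the earlier entry,
            # which keeps the result at one row per model per game.
            rows[key] = {
                "game_id": game["game_id"],
                "model": agent["model"],
                **agent["metrics"],
            }
    return list(rows.values())


games = [
    {"game_id": 1, "agents": [
        {"model": "model_x", "metrics": {"Risk_Minimization": 0.8}},
        {"model": "model_y", "metrics": {"Risk_Minimization": 0.5}},
    ]},
    {"game_id": 2, "agents": [
        {"model": "model_x", "metrics": {"Risk_Minimization": 0.6}},
        {"model": "model_y", "metrics": {"Risk_Minimization": 0.3}},
        {"model": "model_y", "metrics": {"Risk_Minimization": 0.4}},  # duplicate entry
    ]},
]
agent_rows = to_agent_rows(games)
print(len(agent_rows))  # two games x two models
```

A list of row dictionaries like this can be passed to `pd.DataFrame(...)` to obtain a table in the same general shape as the one described above.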

results.metrics_statistics.calculate_average_agreement_round(df)

Calculates the average round in which agreements were reached.

Parameters:

df (pd.DataFrame) – DataFrame containing parsed agent-based metrics.

Returns:

A dictionary containing:
  • 'total_games' (int): Total number of games analyzed.

  • 'agreements_reached' (int): Number of games where agreement was reached.

  • 'average_agreement_round' (float or None): Average round of agreement, or None if no agreements.

  • 'agreement_rate' (float): Percentage of games that reached agreement.

Return type:

dict

Example

>>> stats = calculate_average_agreement_round(df)
>>> print(f"Average agreement round: {stats['average_agreement_round']:.2f}")
Average agreement round: 3.25
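
The returned dictionary can be approximated with a few lines of standard-library Python. The sketch below is not the package's implementation: the function name `summarize_agreement_rounds` and its list-based input are assumptions, and the agreement rate is assumed to be on a 0 to 100 scale. It also shows how to guard against `average_agreement_round` being None before formatting it.

```python
from statistics import mean


def summarize_agreement_rounds(agreement_rounds):
    """Summarize agreement rounds, one entry per game.

    Each entry is the round in which agreement was reached,
    or None if the game ended without agreement.
    """
    reached = [r for r in agreement_rounds if r is not None]
    total = len(agreement_rounds)
    return {
        "total_games": total,
        "agreements_reached": len(reached),
        "average_agreement_round": mean(reached) if reached else None,
        # Assumed to be a percentage on a 0-100 scale.
        "agreement_rate": 100.0 * len(reached) / total if total else 0.0,
    }


stats = summarize_agreement_rounds([3, None, 4, 2, 4])
avg = stats["average_agreement_round"]
# Formatting None with ':.2f' raises TypeError, so check first.
print("no agreements" if avg is None else f"Average agreement round: {avg:.2f}")
```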

results.metrics_statistics.bias_corrected_metric_analysis(df, metric_name, metric_column)

Analyzes and reports bias-corrected model comparison for a given metric.

Parameters:
  • df (pd.DataFrame) – DataFrame containing parsed agent-based metrics.

  • metric_name (str) – Name of the metric being analyzed (e.g., ‘Risk Minimization’).

  • metric_column (str) – Column name in the DataFrame corresponding to the metric.

Returns:

A dictionary with bias-corrected analysis results if analysis is successful, otherwise None. The dictionary includes:

  • 'model_a' (str): Name of the first model.

  • 'model_b' (str): Name of the second model.

  • 'adjusted_difference' (float): Bias-adjusted difference between models.

  • 'predicted_mean_a' (float): Predicted mean for model_a.

  • 'predicted_mean_b' (float): Predicted mean for model_b.

  • 'p_value' (float): P-value of the test.

  • 'significant' (bool): Whether the result is statistically significant.

Return type:

dict or None

Example

>>> results = bias_corrected_metric_analysis(df, 'Risk Minimization', 'Risk_Minimization')
>>> print(results['adjusted_difference'])
0.15
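
One simple way to think about a bias-adjusted difference is to compare the two models within each role and then average across roles, so that an uneven role assignment does not distort the comparison. The sketch below uses that stratified approach as an assumption; the package may use a different method (for example a regression model), and the function name, the tuple layout, and the role labels are hypothetical. No p-value is computed here.

```python
from collections import defaultdict
from statistics import mean


def role_adjusted_difference(rows, model_a, model_b):
    """rows: iterable of (model, role, value) tuples, one per model per game."""
    by_cell = defaultdict(list)
    for model, role, value in rows:
        by_cell[(model, role)].append(value)

    # Only roles in which both models have observations can be compared.
    roles = sorted({role for (_, role) in by_cell})
    shared = [r for r in roles if (model_a, r) in by_cell and (model_b, r) in by_cell]
    if not shared:
        return None

    # Average the per-role means so every role carries equal weight.
    mean_a = mean(mean(by_cell[(model_a, r)]) for r in shared)
    mean_b = mean(mean(by_cell[(model_b, r)]) for r in shared)
    return {
        "model_a": model_a,
        "model_b": model_b,
        "predicted_mean_a": mean_a,
        "predicted_mean_b": mean_b,
        "adjusted_difference": mean_a - mean_b,
    }


rows = [
    ("model_x", "buyer", 0.8), ("model_x", "buyer", 0.6), ("model_x", "seller", 0.4),
    ("model_y", "buyer", 0.5), ("model_y", "seller", 0.3), ("model_y", "seller", 0.1),
]
adjusted = role_adjusted_difference(rows, "model_x", "model_y")
print(f"{adjusted['adjusted_difference']:.2f}")
```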

results.metrics_statistics.main()

Main function to perform bias-corrected analysis on agent-based metrics from a negotiation log file.

Usage:

python metrics_statistics.py <log_file.out>

Parameters:

None. The log file path is read from the command-line arguments.

Returns:

Nothing. Results are printed to the console and the corrected data is exported to a CSV file.

Return type:

None

Example

$ python metrics_statistics.py integrative_negotiation_1975553.out

win_statistics.py

Statistical analysis of negotiation outcomes, with role and first-mover bias detection and bias-adjusted model comparison.

results.win_statistics.parse_negotiation_log_corrected(file_path)

Parses a negotiation log file to extract game data, structured as one row per game so that each game is counted once in the analysis.

Parameters:

file_path (str) – Path to the negotiation log file.

Returns:

A tuple containing:
  • pd.DataFrame: DataFrame with parsed game data.

  • str: The detected game type (e.g., 'integrative_negotiation').

Return type:

tuple

Raises:

Example

>>> df, game_type = parse_negotiation_log_corrected("log_file.out")
>>> print(game_type)
'integrative_negotiation'

results.win_statistics.analyze_role_bias_corrected(df, game_type)

Analyzes role bias in negotiation games to determine if certain roles have inherent advantages.

Parameters:
  • df (pd.DataFrame) – DataFrame containing parsed game data.

  • game_type (str) – The type of game (e.g., 'integrative_negotiation').

Returns:

A dictionary with statistical test results if analysis is possible, otherwise None. The dictionary includes:

  • 'chi2' (float): Chi-square test statistic.

  • 'p_value' (float): P-value of the test.

  • 'cohens_h' (float): Effect size (Cohen’s h).

  • 'effect_size' (str): Interpretation of effect size.

  • 'significant' (bool): Whether the result is statistically significant.

Return type:

dict or None

Example

>>> results = analyze_role_bias_corrected(df, "integrative_negotiation")
>>> print(results['significant'])
True
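
Cohen's h, reported under 'cohens_h', measures the distance between two proportions after an arcsine transformation: h = 2·asin(√p1) − 2·asin(√p2). The sketch below computes it and maps it to a label using Cohen's conventional thresholds (0.2, 0.5, 0.8). The function names and the exact label strings are assumptions; the labels the package returns in 'effect_size' may differ.

```python
import math


def cohens_h(p1, p2):
    """Cohen's h effect size for the difference between two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def interpret_h(h):
    """Label |h| using Cohen's conventional cut-offs (label strings are assumed)."""
    size = abs(h)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical example: one role wins 65% of games against a 50% baseline.
h = cohens_h(0.65, 0.50)
print(f"h = {h:.3f} ({interpret_h(h)})")
```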

results.win_statistics.analyze_first_mover_bias_corrected(df)

Analyzes first-mover bias in negotiation games to determine if going first provides an advantage.

Parameters:

df (pd.DataFrame) – DataFrame containing parsed game data.

Returns:

A dictionary with statistical test results, including:
  • 'chi2' (float): Chi-square test statistic.

  • 'p_value' (float): P-value of the test.

  • 'cohens_h' (float): Effect size (Cohen’s h).

  • 'first_mover_win_rate' (float): Win rate of the first mover.

  • 'significant' (bool): Whether the result is statistically significant.

Return type:

dict

Example

>>> results = analyze_first_mover_bias_corrected(df)
>>> print(results['first_mover_win_rate'])
0.65
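
A first-mover test of this kind can be sketched as a chi-square goodness-of-fit test of first-mover wins against a 50/50 split. The version below is an illustration under stated assumptions, not the package's code: the function name, the count-based inputs, and the 0.05 significance level are all assumed. With one degree of freedom the p-value has the closed form erfc(√(χ²/2)), so no external library is needed.

```python
import math


def first_mover_chi_square(first_mover_wins, decided_games, alpha=0.05):
    """Chi-square goodness-of-fit test against a 50/50 split (1 degree of freedom)."""
    if decided_games <= 0:
        raise ValueError("need at least one decided game")
    losses = decided_games - first_mover_wins
    expected = decided_games / 2  # expected wins and losses if moving first gives no edge
    chi2 = ((first_mover_wins - expected) ** 2 + (losses - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return {
        "chi2": chi2,
        "p_value": p_value,
        "first_mover_win_rate": first_mover_wins / decided_games,
        "significant": p_value < alpha,  # alpha = 0.05 is an assumed threshold
    }


result = first_mover_chi_square(first_mover_wins=65, decided_games=100)
print(f"chi2={result['chi2']:.2f}, p={result['p_value']:.4f}")
```

With 65 first-mover wins in 100 decided games, the statistic is 9.0 and the p-value is about 0.0027, so the sketch flags the result as significant at the assumed 0.05 level.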

results.win_statistics.bias_adjusted_model_comparison(df, role_bias_significant=False, first_mover_bias_significant=False)

Compares model performances while controlling for role and first-mover biases.

Parameters:
  • df (pd.DataFrame) – DataFrame containing parsed game data.

  • role_bias_significant (bool) – Whether role bias was detected as significant.

  • first_mover_bias_significant (bool) – Whether first-mover bias was detected as significant.

Returns:

A dictionary with bias-adjusted model comparison results if analysis is successful, otherwise None. The dictionary includes:

  • 'model_a' (str): Name of the first model.

  • 'model_b' (str): Name of the second model.

  • 'adjusted_prob_a' (float): Bias-adjusted win probability for model_a.

  • 'raw_prob_a' (float): Raw win probability for model_a.

  • 'adjustment' (float): Difference between adjusted and raw probabilities.

Return type:

dict or None

Example

>>> results = bias_adjusted_model_comparison(df, True, False)
>>> print(results['adjusted_prob_a'])
0.72
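
The two boolean flags can be read as selecting which biases to control for. The sketch below illustrates that idea with simple stratification: games are grouped by the flagged variables, a win rate is computed per group, and the groups are averaged with equal weight. This is an assumed technique chosen for illustration; the package may adjust differently (for example with a regression model). The function name, the per-game dictionary layout, and the role labels are hypothetical.

```python
from collections import defaultdict


def adjusted_win_probability(games, role_bias_significant=False,
                             first_mover_bias_significant=False):
    """games: dicts describing model A's side of each game:
    {"role": str, "moved_first": bool, "won": bool}
    """
    if not games:
        return None
    raw = sum(g["won"] for g in games) / len(games)

    # Stratify only on the variables flagged as significant biases.
    strata = defaultdict(list)
    for g in games:
        key = (
            g["role"] if role_bias_significant else None,
            g["moved_first"] if first_mover_bias_significant else None,
        )
        strata[key].append(g["won"])

    # Equal-weight average of per-stratum win rates.
    adjusted = sum(sum(w) / len(w) for w in strata.values()) / len(strata)
    return {"raw_prob_a": raw, "adjusted_prob_a": adjusted, "adjustment": adjusted - raw}


# Hypothetical data: model A played "buyer" 8 times (6 wins) and "seller" twice (0 wins).
games = (
    [{"role": "buyer", "moved_first": True, "won": True}] * 6
    + [{"role": "buyer", "moved_first": False, "won": False}] * 2
    + [{"role": "seller", "moved_first": False, "won": False}] * 2
)
comparison = adjusted_win_probability(games, role_bias_significant=True)
print(f"raw={comparison['raw_prob_a']:.3f}, adjusted={comparison['adjusted_prob_a']:.3f}")
```

In this example the raw win rate of 0.600 drops to 0.375 after weighting both roles equally, because most of model A's wins came from the role it played more often. With both flags left False there is a single stratum and the adjusted value equals the raw one.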

results.win_statistics.main()

Main function to perform corrected bias analysis on a negotiation log file.

Usage:

python win_statistics.py <log_file.out>

Parameters:

None. The log file path is read from the command-line arguments.

Returns:

Nothing. Results are printed to the console and the corrected data is exported to a CSV file.

Return type:

None

Example

$ python win_statistics.py integrative_negotiation_1975553.out