Issue with distance_metric_params for dowhy.causal_estimators.distance_matching_estimator #1390

@kevinchiv

Describe the bug
When using a distance_metric that requires additional distance_metric_params, the distance matching estimator does not appear to pick up the extra kwargs. In the __init__ method of DistanceMatchingEstimator here, the code looks like this:

    def __init__(
        self,
        identified_estimand: IdentifiedEstimand,
        test_significance: Union[bool, str] = False,
        evaluate_effect_strength: bool = False,
        confidence_intervals: bool = False,
        num_null_simulations: int = CausalEstimator.DEFAULT_NUMBER_OF_SIMULATIONS_STAT_TEST,
        num_simulations: int = CausalEstimator.DEFAULT_NUMBER_OF_SIMULATIONS_CI,
        sample_size_fraction: int = CausalEstimator.DEFAULT_SAMPLE_SIZE_FRACTION,
        confidence_level: float = CausalEstimator.DEFAULT_CONFIDENCE_LEVEL,
        need_conditional_estimates: Union[bool, str] = "auto",
        num_quantiles_to_discretize_cont_cols: int = CausalEstimator.NUM_QUANTILES_TO_DISCRETIZE_CONT_COLS,
        num_matches_per_unit: int = 1,
        distance_metric: str = "minkowski",
        **kwargs,
    ):
        ...
        self.distance_metric_params = {}
        for param_name in self.Valid_Dist_Metric_Params:
            param_val = getattr(self, param_name, None)
            if param_val is not None:
                self.distance_metric_params[param_name] = param_val
        ...
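
To make the suspected failure mode concrete, here is a minimal sketch, not DoWhy's actual implementation. `SketchEstimator`, the `read_from_kwargs` flag, and the contents of `Valid_Dist_Metric_Params` are hypothetical and chosen for illustration only. It assumes the extra kwargs are never set as attributes on the instance, in which case the `getattr` loop finds nothing and `distance_metric_params` stays empty.

```python
class SketchEstimator:
    """Hypothetical stand-in for the pattern quoted above; not DoWhy's class."""

    # Assumed list of accepted metric parameter names (illustrative only).
    Valid_Dist_Metric_Params = ["p", "V", "VI", "w"]

    def __init__(self, distance_metric="minkowski", read_from_kwargs=False, **kwargs):
        self.distance_metric = distance_metric
        self.distance_metric_params = {}
        for param_name in self.Valid_Dist_Metric_Params:
            if read_from_kwargs:
                # One possible fix: look the value up in the kwargs that were passed in.
                param_val = kwargs.get(param_name, None)
            else:
                # Pattern from the issue: only finds values already set on self.
                param_val = getattr(self, param_name, None)
            if param_val is not None:
                self.distance_metric_params[param_name] = param_val


cov = [[1.0, 0.2], [0.2, 2.0]]
as_reported = SketchEstimator(distance_metric="mahalanobis", V=cov)
with_fix = SketchEstimator(distance_metric="mahalanobis", read_from_kwargs=True, V=cov)
print(as_reported.distance_metric_params)  # {}
print(with_fix.distance_metric_params)     # {'V': [[1.0, 0.2], [0.2, 2.0]]}
```

Reading from `kwargs` directly is only one option; setting the kwargs as attributes before the loop runs would have the same effect in this sketch.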

Steps to reproduce the behavior

import numpy as np
import dowhy
from dowhy import CausalModel
from causaldata import black_politicians

data = black_politicians.load_pandas().data
X = data.drop("responded", axis=1).values
y = data["responded"].values

model = CausalModel(
    data=data,
    treatment=["leg_black"],
    outcome="responded",
    common_causes=["medianhhincom", "blackpercent", "leg_democrat"]
)

identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)

# This call raises the ValueError shown below
model.estimate_effect(
    identified_estimand=identified_estimand,
    method_name="backdoor.distance_matching",
    target_units="att",
    method_params={
        "distance_metric": "mahalanobis",
        "V": np.cov(X.T)
    }
)

The error will be:

ValueError: Must provide either V or VI for Mahalanobis distance
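
For context, the Mahalanobis distance between two covariate vectors x and y is defined in terms of a covariance matrix V, or equivalently its inverse VI. This is presumably why the metric cannot be evaluated when neither value reaches the underlying nearest-neighbour code:

```latex
d_M(x, y) = \sqrt{(x - y)^{\top} V^{-1} (x - y)}, \qquad VI = V^{-1}
```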

Expected behavior
When method_name="backdoor.distance_matching" is used with an instance of CausalModel, it should be possible to pass the additional parameters that the chosen distance metric requires through method_params. These additional parameters can be found here.

Version information:

  • DoWhy version 0.14

Labels: bug (Something isn't working)