Regression Class
- class regmmd.regression.MMDRegressor(model, fit_intercept=True, par_v=None, par_c=None, kernel_y='Gaussian', kernel_X='Laplace', bandwidth_y='auto', bandwidth_X='auto', solver=None, random_state=None)
Bases: RegressorMixin, BaseEstimator
Regression using the Maximum Mean Discrepancy (MMD) criterion.
This class implements regression using the MMD criterion, which is a kernel-based method to compare distributions by measuring the distance between mean embeddings in a Reproducing Kernel Hilbert Space (RKHS).
MMDRegressor fits a regression model by minimizing the MMD between the distributions of the observed data and the model’s predictions. It supports various kernel types and bandwidth selection methods for both the input features and the target variables.
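The distance between mean embeddings described above can be estimated directly from two samples. The following is a minimal NumPy sketch of the standard biased (V-statistic) estimate of the squared MMD, not the code used inside MMDRegressor. The helper names `gaussian_gram` and `mmd_squared`, and the Gaussian kernel parameterization exp(-d^2 / (2 h^2)), are assumptions made for illustration; the library may scale its kernels differently.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2)).

    Assumed parameterization; chosen here only for illustration.
    """
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_gram(x, x, bandwidth)
    k_yy = gaussian_gram(y, y, bandwidth)
    k_xy = gaussian_gram(x, y, bandwidth)
    # ||mu_x - mu_y||^2 in the RKHS, expanded into three kernel means.
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(200, 1))
same_b = rng.normal(0.0, 1.0, size=(200, 1))
shifted = rng.normal(3.0, 1.0, size=(200, 1))

mmd_same = mmd_squared(same_a, same_b)      # two samples from one distribution
mmd_shifted = mmd_squared(same_a, shifted)  # samples whose means differ by 3
```

Samples drawn from the same distribution give an estimate close to zero, while the shifted sample gives a clearly larger value. A regression fit under this criterion searches for model parameters that make such a discrepancy small.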
- Parameters:
model (RegressionModel) – The statistical model used for regression, provided as an instance of a RegressionModel class with initialized parameters. This model defines the relationship between the input features and the target variable.
fit_intercept (bool, default=True) – Specifies whether to calculate the intercept for the model. If set to False, the model assumes that the data is already centered, and no intercept will be fitted.
par_v (np.array, optional) – Initial values for the variable parameters of the model. If None, the model will use default initial values.
par_c (np.array, optional) – Initial values for the constant parameters of the model. If None, the model will use default initial values.
kernel_y (str, default="Gaussian") – The kernel type used for the target variable y. Supported options are “Gaussian”, “Laplace”, and “Cauchy”.
kernel_X (str, default="Laplace") – The kernel type used for the input features X. Supported options are “Gaussian”, “Laplace”, and “Cauchy”.
bandwidth_y (Union[str, float], default="auto") – The bandwidth parameter for the kernel applied to the target variable y. If set to “auto”, the bandwidth is determined using a heuristic method, such as the median heuristic.
bandwidth_X (Union[str, float], default="auto") – The bandwidth parameter for the kernel applied to the input features X. If set to “auto”, the bandwidth is determined using a heuristic method, such as the median heuristic.
solver (dict, optional) – A dictionary specifying the solver parameters for the optimization process. It should include keys such as “burnin” (number of burn-in iterations), “n_step” (number of optimization steps), and “stepsize” (learning rate for the optimizer). If None, default solver settings are used.
random_state (int, optional) – Random seed passed to the model and to any sampler used in the SGD optimizers.
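The bandwidth parameters above mention the median heuristic as one possible "auto" rule. As a rough illustration, one common form of that heuristic sets the bandwidth to the median pairwise Euclidean distance between observations. The function name `median_heuristic` is hypothetical, and whether MMDRegressor uses exactly this rule (or a rescaled variant) is an assumption, not something stated in this reference.

```python
import numpy as np


def median_heuristic(values):
    """Median Euclidean distance over all pairs of distinct observations.

    Hypothetical helper; one common variant of the median heuristic.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]  # treat a 1-D target as a single column
    diffs = values[:, None, :] - values[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once and drop the zero diagonal.
    upper = dists[np.triu_indices(len(values), k=1)]
    return float(np.median(upper))


y = np.array([0.0, 1.0, 3.0])
bandwidth_y = median_heuristic(y)  # pairwise distances are 1, 3 and 2
```

A value computed this way could be passed as a float to bandwidth_y or bandwidth_X in place of "auto" when a fixed, reproducible bandwidth is preferred.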
- X_offset
The offset applied to the input features X during preprocessing. This is used when fit_intercept is True.
- Type:
np.array or None
- y_offset
The offset applied to the target variable y during preprocessing.
- Type:
np.array or None
- X_scale
The scale factor applied to the input features X during preprocessing.
- Type:
np.array or None
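To make the role of these three attributes concrete, here is a small sketch of a centering and scaling step. It assumes the offsets are column means and the scale is the per-column standard deviation, and that all three are None when fit_intercept is False. These choices, and the function name `center_and_scale`, are illustrative assumptions; the preprocessing actually performed by MMDRegressor may differ.

```python
import numpy as np


def center_and_scale(X, y, fit_intercept=True):
    """Return transformed (X, y) plus the offsets and scale that were applied.

    Hypothetical helper: offsets are means, scale is the standard deviation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if not fit_intercept:
        # Data is assumed to be centered already; nothing is recorded.
        return X, y, None, None, None
    X_offset = X.mean(axis=0)
    y_offset = y.mean(axis=0)
    X_scale = X.std(axis=0)
    X_scale[X_scale == 0.0] = 1.0  # leave constant columns unscaled
    return (X - X_offset) / X_scale, y - y_offset, X_offset, y_offset, X_scale


X = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
y = np.array([2.0, 4.0, 6.0])
X_t, y_t, X_offset, y_offset, X_scale = center_and_scale(X, y)
```

Storing the offsets and scale is what allows new inputs to be transformed in the same way as the training data before prediction.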
Notes
The fit method preprocesses the data, fits the model using the specified solver, and updates the model parameters.
The predict method uses the fitted model to make predictions on new data.
- fit(X, y)
Fit the MMD regression model according to the given training data.
- Parameters:
X (np.ndarray, shape (n_samples, n_features)) – Training input samples.
y (np.ndarray, shape (n_samples,)) – Target values.
- Returns:
res – A dictionary containing the results of the optimization process, including the estimated parameters and the optimization trajectory.
- Return type:
MMDResult