Provider Interface

This module defines the abstract base classes that all quantization providers must implement.

If you are building a custom quantization strategy or extending Qwix, you will implement the QuantizationProvider interface.

class qwix.QuantizationProvider(rules: Sequence[QuantizationRule], *, disable_jit: bool = False)

Interface for model integration.

A provider can either be called explicitly by model authors or be injected into the model implicitly through interception.py.

get_intercept_map() → dict[str, Callable[[...], Any]]

Returns the intercept map for interception.wrap_func_intercepted.

get_interceptors() → Sequence[Callable[[], Interceptor]]

Returns a list of interceptor factories.

The default implementation returns a single interceptor that handles all ops. Subclasses can override this method to return multiple interceptors if needed (e.g. ODML providers).

get_unused_rules() → Sequence[QuantizationRule]

Returns the quantization rules that did not match any operations.

This should be called after model quantization (e.g., quantize_model) to verify that all rules were applied as expected. A rule is considered unused if its module_path regex did not match any module’s path, or if its op_names did not match any intercepted operation within a matching module.

Returns: A sequence of unused quantization rules.
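The "unused" criterion described above can be illustrated with a small standalone sketch. It does not import Qwix: `Rule` and `find_unused_rules` are hypothetical stand-ins, and the choices to match `module_path` with `re.fullmatch` and to treat an empty `op_names` as "match every op" are assumptions made for the illustration, not confirmed Qwix behavior. The sketch also ignores any precedence between rules.

```python
import re
from dataclasses import dataclass
from typing import Collection, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in with only the two matching fields of a rule."""
    module_path: str = ".*"
    op_names: Collection[str] = ()


def find_unused_rules(
    rules: Sequence[Rule],
    intercepted_ops: Sequence[tuple[str, str]],
) -> list[Rule]:
    """Returns the rules that matched none of the (module_path, op_name) pairs."""
    used = set()
    for module_path, op_name in intercepted_ops:
        for index, rule in enumerate(rules):
            # Assumption: the regex must match the whole module path.
            if re.fullmatch(rule.module_path, module_path) is None:
                continue
            # Assumption: an empty op_names collection matches every op.
            if rule.op_names and op_name not in rule.op_names:
                continue
            used.add(index)
    return [rule for index, rule in enumerate(rules) if index not in used]


rules = [
    Rule(module_path="encoder/.*", op_names=("dot_general",)),
    Rule(module_path="decoder/.*"),
    Rule(module_path="encoder/.*", op_names=("conv_general_dilated",)),
]
ops = [("encoder/layer_0", "dot_general"), ("encoder/layer_1", "einsum")]
unused = find_unused_rules(rules, ops)
```

Here the second rule is unused because no module path starts with `decoder/`, and the third is unused because its module matched but none of its ops were intercepted. With a real provider, the equivalent check would be inspecting the result of `get_unused_rules()` after quantization.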

process_model_inputs(model: Any, model_args: Sequence[Any], model_kwargs: dict[str, Any]) → tuple[Any, Sequence[Any], dict[str, Any]]

Process the model and its inputs before it is called.

process_model_output(method_name: str, model_output: Any) → Any

Process the model output before it is returned.
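The ordering of the two hooks around a model call can be sketched as follows. This is a minimal standalone illustration, not Qwix code: `LoggingProvider`, `call_with_provider`, and `TinyModel` are hypothetical names, and only the two method signatures are taken from the documentation above. How Qwix itself invokes the hooks is an assumption here.

```python
from typing import Any, Sequence


class LoggingProvider:
    """Hypothetical stand-in exposing the two documented hook signatures."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def process_model_inputs(
        self, model: Any, model_args: Sequence[Any], model_kwargs: dict[str, Any]
    ) -> tuple[Any, Sequence[Any], dict[str, Any]]:
        # Runs before the model is called; passes everything through unchanged.
        self.calls.append("inputs")
        return model, model_args, model_kwargs

    def process_model_output(self, method_name: str, model_output: Any) -> Any:
        # Runs after the model returns; passes the output through unchanged.
        self.calls.append(f"output:{method_name}")
        return model_output


def call_with_provider(provider, model, method_name, *args, **kwargs):
    """Assumed call order: inputs hook, model method, output hook."""
    model, args, kwargs = provider.process_model_inputs(model, args, kwargs)
    output = getattr(model, method_name)(*args, **kwargs)
    return provider.process_model_output(method_name, output)


class TinyModel:
    def __call__(self, x, scale=1):
        return x * scale


provider = LoggingProvider()
result = call_with_provider(provider, TinyModel(), "__call__", 3, scale=2)
```

A subclass that needs to rewrite inputs or post-process outputs would return modified values from these hooks instead of passing them through.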

class qwix.QuantizationRule(*, module_path: str = '.*', op_names: Collection[str] = (), weight_qtype: str | type[Any] | dtype | SupportsDType | None = None, act_qtype: str | type[Any] | dtype | SupportsDType | None = None, tile_size: int | float | None = None, act_static_scale: bool | None = None, weight_calibration_method: str = 'absmax', act_calibration_method: str | None = None, act_batch_axes: Collection[int] = (0,))

A quantization rule that matches operations and configures how they are quantized.
