Dequantize a Matrix
Reconstructs an approximate floating-point matrix from a quantized representation
produced by mlx_quantize().
Usage
mlx_dequantize(
  w,
  scales,
  biases = NULL,
  group_size = 64L,
  bits = 4L,
  mode = "affine",
  device = mlx_default_device()
)
Arguments
- w
- An mlx array (the quantized weight matrix) 
- scales
- An mlx array (the quantization scales) 
- biases
- An optional mlx array (the quantization biases for affine mode). Default: NULL 
- group_size
- The group size used during quantization. Default: 64 
- bits
- The number of bits used during quantization. Default: 4 
- mode
- The quantization mode used: "affine" or "mxfp4". Default: "affine" 
- device
- Execution target: supply "gpu", "cpu", or an mlx_stream created via mlx_new_stream(). Default: mlx_default_device().
Details
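Dequantization unpacks the low-precision quantized weights and applies the scales (and, in affine mode, the biases) to reconstruct approximate floating-point values. The group_size, bits, and mode arguments should match the values used during quantization. Quantization is lossy: the precision discarded at that step cannot be recovered, so the result only approximates the original matrix.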
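The idea behind affine mode can be illustrated in base R, without any mlx calls. The sketch below is not the package's implementation: the helpers affine_quantize() and affine_dequantize() are hypothetical names, bit packing is omitted, and groups are taken from a flat vector rather than along the rows of a matrix. It assumes that each group of group_size values shares one scale and one bias and that a value is reconstructed as scale * q + bias.

```r
# Hypothetical helpers for illustration only; not part of the mlx R package.
# Assumption: each group shares one scale and one bias, and values are
# reconstructed as scale * q + bias. Bit packing of the codes is omitted.
affine_quantize <- function(x, group_size = 64L, bits = 4L) {
  stopifnot(length(x) %% group_size == 0)
  groups <- matrix(x, nrow = group_size)   # one column per group
  lo <- apply(groups, 2, min)              # per-group bias (group minimum)
  hi <- apply(groups, 2, max)
  max_code <- 2^bits - 1                   # largest integer code, 15 for 4 bits
  scales <- (hi - lo) / max_code
  scales[scales == 0] <- 1                 # constant groups: avoid division by zero
  shifted <- sweep(groups, 2, lo, "-")
  q <- round(sweep(shifted, 2, scales, "/"))
  list(q = q, scales = scales, biases = lo)
}

affine_dequantize <- function(q, scales, biases) {
  # Apply scale * q + bias column-wise, then flatten back to a vector
  scaled <- sweep(q, 2, scales, "*")
  as.vector(sweep(scaled, 2, biases, "+"))
}

set.seed(1)
x <- rnorm(256)
quant <- affine_quantize(x, group_size = 64L, bits = 4L)
x_hat <- affine_dequantize(quant$q, quant$scales, quant$biases)

# Rounding moves each value by at most half a quantization step, so the
# reconstruction error in a group is bounded by half that group's scale.
max_err <- max(abs(x - x_hat))
```

Under these assumptions, a smaller group_size or a larger bits value shrinks the per-group step and therefore the reconstruction error, at the cost of storing more scales and biases or wider codes.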
Examples
if (FALSE) { # \dontrun{
w <- mlx_random_normal(c(512, 256))
quant <- mlx_quantize(w)
w_reconstructed <- mlx_dequantize(quant$w_q, quant$scales, quant$biases)
} # }