Implement quantization for Decimal type when encode#978

Open
NyanFisher wants to merge 1 commit into jcrist:main from NyanFisher:decimal-quantize

@NyanFisher NyanFisher commented Feb 11, 2026

Hello!

Description of the problem solved by this PR

The msgspec library has many useful features, but the current version cannot quantize Decimal values during encoding. In applications involving finance and precise calculations, it is critical to control precision and to return values rounded to a specified scale. Without this functionality, a complete transition from pydantic to msgspec is not possible.

Changes implemented in this PR

  1. An optional decimal_quantize parameter has been introduced. When specified, every decimal.Decimal value is rounded to the scale of this value using the Decimal.quantize() method.
  2. An optional decimal_rounding parameter has been added to set the rounding mode used for quantization. The standard modes from the Python decimal module are supported, such as 'ROUND_DOWN', 'ROUND_HALF_UP', etc. If the parameter is not specified, the default mode 'ROUND_HALF_EVEN' is applied.

This extends the library and makes a transition to msgspec feasible for financial and other applications that require precise control over the representation of decimal numbers.
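For reference, the rounding behaviour that both parameters build on is that of the standard library's Decimal.quantize(). The following is a minimal sketch using only the decimal module, not the PR's encoder code; it shows how the scale and the rounding mode interact.

```python
import decimal
from decimal import Decimal

# The exponent of this value sets the target scale (two decimal places).
scale = Decimal("0.00")

# With no explicit rounding, quantize() uses the current context,
# whose default rounding mode is ROUND_HALF_EVEN.
half_even = Decimal("2.665").quantize(scale)

# An explicit mode overrides the context for this one call.
half_up = Decimal("2.665").quantize(scale, rounding=decimal.ROUND_HALF_UP)
down = Decimal("2.669").quantize(scale, rounding=decimal.ROUND_DOWN)

print(half_even, half_up, down)
```

The tie at 2.665 is where ROUND_HALF_EVEN and ROUND_HALF_UP differ, which is why a rounding option matters for financial values.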

Examples

decimal_quantize

import decimal

import msgspec

encoder = msgspec.json.Encoder(decimal_quantize=decimal.Decimal("0.00"))

encoder.encode(decimal.Decimal("1.23456789"))
b'"1.23"'

decimal_rounding

import decimal

import msgspec

encoder = msgspec.json.Encoder(
    decimal_quantize=decimal.Decimal("0.00"),
    decimal_rounding=decimal.ROUND_UP,
)

encoder.encode(decimal.Decimal("1.235"))
b'"1.24"'

I would appreciate any comments on improving or restructuring the code, as I don't often write in C.

Fix my issue - Closes #848

@NyanFisher NyanFisher changed the title Implement quantization for Decimal type when encode Draft: Implement quantization for Decimal type when encode Feb 11, 2026
@NyanFisher NyanFisher force-pushed the decimal-quantize branch 2 times, most recently from 2ee9741 to 31effee Compare February 11, 2026 13:51
@NyanFisher NyanFisher changed the title Draft: Implement quantization for Decimal type when encode Implement quantization for Decimal type when encode Feb 11, 2026
@NyanFisher NyanFisher force-pushed the decimal-quantize branch 2 times, most recently from 117001b to b5ff6b7 Compare February 11, 2026 15:32
@NyanFisher
Author

CI failures are unrelated to this change:

All build, test, and wheel jobs pass across all platforms.

@Siyet
Collaborator

Siyet commented Apr 10, 2026

Code looks solid and CI is green across the matrix - nice work, especially the test coverage in test_common.py.

One API design question I'd like to raise before this moves forward: the current shape places decimal_quantize / decimal_rounding on the encoder itself, which means every Decimal field in every struct passing through that encoder gets the same scale and rounding mode. In financial code it's common to have heterogeneous Decimal fields in the same payload (e.g. price at scale 4, quantity at scale 0, tax_rate at scale 6) - with an encoder-level setting you'd need separate encoders per shape, which defeats most of the ergonomic win.

An alternative would be to attach quantization to the type via Annotated[Decimal, Meta(...)], e.g.

Price = Annotated[Decimal, Meta(decimal_quantize="0.0001", decimal_rounding="ROUND_HALF_EVEN")]

That composes naturally with per-field configuration, lives next to the type where the constraint is logically defined, and matches how gt/ge/pattern etc. already work today. The downside is more plumbing through TypeNode instead of one encoder kwarg.
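To make the per-field idea concrete, here is a rough sketch built only on the standard library. The Quantize marker, the Order dataclass, and quantize_fields are hypothetical stand-ins for illustration; they are not msgspec APIs, and msgspec's Meta does not currently accept quantization options.

```python
from dataclasses import dataclass, fields
from decimal import Decimal, ROUND_HALF_EVEN
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Quantize:
    """Hypothetical marker standing in for a Meta(...) quantization option."""
    quantum: str
    rounding: str = ROUND_HALF_EVEN


# Each alias carries its own scale next to the type.
Price = Annotated[Decimal, Quantize("0.0001")]
Quantity = Annotated[Decimal, Quantize("1")]


@dataclass
class Order:
    price: Price
    quantity: Quantity


def quantize_fields(obj):
    """Return {field: value}, quantizing fields that carry a Quantize marker."""
    hints = get_type_hints(type(obj), include_extras=True)
    out = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        # Plain (non-Annotated) types have no __metadata__ and are left as-is.
        for meta in getattr(hints[f.name], "__metadata__", ()):
            if isinstance(meta, Quantize):
                value = value.quantize(Decimal(meta.quantum), rounding=meta.rounding)
        out[f.name] = value
    return out


result = quantize_fields(Order(Decimal("19.123456"), Decimal("3.0")))
print(result)
```

Note that this sketch has to read the annotations of the containing class, which is the extra plumbing mentioned above.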

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

cc @jcrist @ofek — this expands the encoder API surface, so I'd like your read on whether the encoder-kwarg shape is the one we want, or whether Meta-based quantization is preferable.

@jcrist
Owner

jcrist commented Apr 10, 2026

Instead of two new options for quantization, how about adding a single decimal_format option to Encoder? This would take either a string to pass to quantize (something like decimal.quantize(Decimal(decimal_format))), or a callable that takes in the decimal and returns a new value to encode. A few examples:

# Uses default rounding
enc = Encoder(decimal_format="0.0001")

# Custom rounding
enc = Encoder(decimal_format=lambda d: d.quantize(decimal.Decimal("0.001"), "ROUND_DOWN"))

I like this since it's more flexible, and also only adds a single new option. Otherwise I'd worry about other users needing further customization, resulting in a number of decimal_* kwargs.

I wouldn't expect a callable here to have a perf cost - calling into python here is negligible, most of the time will be in the quantize call itself.
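As a rough illustration of the "string or callable" shape, the two forms can be normalized into a single Decimal -> Decimal function. This is a pure-Python sketch, not the encoder's implementation; resolve_decimal_format is a hypothetical helper name, and the sketch ignores any other string values the option may already accept.

```python
from decimal import Decimal, ROUND_DOWN
from typing import Callable, Union

DecimalFormat = Union[str, Callable[[Decimal], Decimal]]


def resolve_decimal_format(decimal_format: DecimalFormat) -> Callable[[Decimal], Decimal]:
    """Turn either accepted shape into one Decimal -> Decimal callable."""
    if callable(decimal_format):
        return decimal_format
    # Assumption: a string is a quantum such as "0.0001".
    quantum = Decimal(decimal_format)
    # No rounding argument, so the current context's mode applies.
    return lambda d: d.quantize(quantum)


default_rounding = resolve_decimal_format("0.0001")
custom_rounding = resolve_decimal_format(
    lambda d: d.quantize(Decimal("0.001"), rounding=ROUND_DOWN)
)

print(default_rounding(Decimal("1.23456789")))
print(custom_rounding(Decimal("1.23456789")))
```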

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

In msgspec, (currently) encoding doesn't have any type-level information, it only has the values. This means customization for encoding cannot rely on information in annotations, it has to rely on the actual object instances themselves. This is admittedly less flexible in cases where you might want to encode different values differently, but keeps the encoder simple and supports values that exist outside of containers with attached annotations (e.g. encode(decimal_object) wouldn't have annotations, but encode(struct_with_a_decimal_field) would).
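A toy model may help show what "only has the values" implies. The sketch below is not msgspec code: to_builtins here is a hypothetical pure-Python converter, and a dataclass stands in for a struct. It dispatches on the runtime type of each value and never consults annotations, so one setting necessarily applies to every Decimal it meets, whether bare or nested.

```python
from dataclasses import dataclass, fields, is_dataclass
from decimal import Decimal


def to_builtins(obj, quantum=None):
    """Hypothetical value-driven converter: dispatches on the value's type only."""
    if isinstance(obj, Decimal):
        if quantum is not None:
            obj = obj.quantize(quantum)
        return str(obj)
    if is_dataclass(obj) and not isinstance(obj, type):
        # Field annotations are never read; only the field values are visited.
        return {f.name: to_builtins(getattr(obj, f.name), quantum) for f in fields(obj)}
    if isinstance(obj, (list, tuple)):
        return [to_builtins(item, quantum) for item in obj]
    if isinstance(obj, dict):
        return {key: to_builtins(val, quantum) for key, val in obj.items()}
    return obj


@dataclass
class Invoice:
    total: Decimal
    tax_rate: Decimal


two_places = Decimal("0.01")
bare = to_builtins(Decimal("1.23456789"), two_places)
nested = to_builtins(Invoice(Decimal("10.005"), Decimal("0.0725")), two_places)
print(bare, nested)
```

In this model tax_rate is cut to two places along with total, which is the flexibility trade-off described above.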

For now a single setting on an Encoder is both straightforward to implement, and matches the current conventions.

@NyanFisher
Author

@Siyet @jcrist Hello! Thanks for the review!

@Siyet

Did you consider the Meta-based approach? If so, what made you land on encoder-level? Both have trade-offs and I'd rather get the API right before merge.

I hadn't considered using Meta, but I think that approach would result in a large number of TypeNodes. I work at a bank and know that a single Price type isn't enough, since it's too general a concept. But it's a good idea for the future 😃

@jcrist

Instead of two new options for quantization, how about adding a single decimal_format option to Encoder?

I like this idea, but the decimal_format parameter already exists. If you plan to extend the interface to accept additional types, I don't think this is the best solution, as it will confuse users. I suggest a separate parameter called decimal_quantize, typed Decimal | Callable[[Decimal], Decimal], which would be responsible exclusively for quantization.
This way, we retain the ability to convert Decimal to "string"/"number", add quantization, and maintain backward compatibility.

@Siyet
Collaborator

Siyet commented Apr 15, 2026

After thinking it through I'm coming around to @jcrist's single-kwarg shape. One slot for everything is, in my view, the right call here.

Encoder(decimal_format=lambda d: d.quantize(Decimal("0.001"), ROUND_DOWN))  # custom
Encoder(decimal_format="string")                                            # existing
Encoder(decimal_format="number")                                            # existing

We could split it along dataclasses.field(default=..., default_factory=...) lines (value in one kwarg, callable in another), but that split exists specifically to disambiguate "the value is a callable" from "call this to produce the value", and neither "string" nor "number" is callable. Introducing a separate decimal_hook just to satisfy a pattern we do not actually need feels like overcomplicating the interface.

There is also the naming angle: decimal_format reads as a verb just as naturally as it reads as a noun ("how to format the decimal"), which makes "pass a callable that does the formatting" fit the name rather than fight it.

@NyanFisher regarding your concern about overloading an existing kwarg: the three shapes ("string" / "number" / callable) dispatch unambiguously on type (string vs. callable), so the dispatch logic in C stays simple and the user-facing docs just enumerate the three accepted shapes in one place.
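The type-based dispatch can be sketched in a few lines of Python. This is only an illustration of the idea, not the C implementation: render_decimal is a hypothetical helper, and writing a callable's result as a JSON string is an assumption, since the thread does not settle the output form for that case.

```python
from decimal import Decimal, ROUND_DOWN


def render_decimal(value, decimal_format="string"):
    """Hypothetical dispatch over the three accepted shapes; returns JSON text."""
    if callable(decimal_format):
        # Assumption: the callable's result is then written as a string.
        return render_decimal(decimal_format(value), "string")
    if decimal_format == "string":
        return f'"{value}"'
    if decimal_format == "number":
        return str(value)
    raise ValueError(f"unsupported decimal_format: {decimal_format!r}")


as_string = render_decimal(Decimal("1.50"), "string")
as_number = render_decimal(Decimal("1.50"), "number")
custom = render_decimal(
    Decimal("1.2399"),
    lambda d: d.quantize(Decimal("0.001"), rounding=ROUND_DOWN),
)
print(as_string, as_number, custom)
```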


Development

Successfully merging this pull request may close these issues.

Decimal is a custom type

3 participants