Accept URL-safe and unpadded base64 in from_dict for bytes fields #227

Open
gaoflow wants to merge 1 commit into betterproto:main from gaoflow:fix-from-dict-urlsafe-base64
@gaoflow gaoflow commented Jun 14, 2026

The proto3 JSON mapping accepts a bytes value encoded with either the standard or URL-safe base64 alphabet, with or without padding, and the reference protobuf implementation decodes all of these forms.

from_dict used plain base64.b64decode, which only handles the standard alphabet:

  • URL-safe input (-/_) was silently corrupted: b64decode drops characters outside the standard alphabet, so e.g. from_dict({"v": "--__"}) returned b"" instead of the real bytes, with no error.
  • Unpadded input was rejected with binascii.Error: Incorrect padding.
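
Both failure modes can be reproduced with the standard library alone. The snippet below is a minimal sketch that calls base64.b64decode directly rather than going through from_dict; the variable names are illustrative only.

```python
import binascii
from base64 import b64decode

# '-' and '_' are not in the standard alphabet. With the default
# validate=False, b64decode discards them, leaving nothing to decode.
dropped = b64decode("--__")
print(dropped)  # b''

# Unpadded input ("YQ" instead of "YQ==") is rejected outright.
padding_error = None
try:
    b64decode("YQ")
except binascii.Error as exc:
    padding_error = str(exc)
    print(padding_error)
```

The first case is the more dangerous one, since it yields wrong data without raising.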

Fix

Normalize to the standard alphabet and restore the optional padding before decoding (betterproto2/src/betterproto2/__init__.py):

value = value.replace("-", "+").replace("_", "/")
value += "=" * (-len(value) % 4)
return b64decode(value)

Standard padded input is unchanged; URL-safe and unpadded inputs now decode correctly.
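
As a standalone sketch, the same three lines can be wrapped in a helper and exercised outside betterproto. The function name decode_json_bytes is hypothetical and does not appear in the patch; only the body mirrors the change above.

```python
from base64 import b64decode


def decode_json_bytes(value: str) -> bytes:
    """Hypothetical helper mirroring the normalization in this PR."""
    # Map the URL-safe alphabet onto the standard one.
    value = value.replace("-", "+").replace("_", "/")
    # Pad to a multiple of four characters; adds nothing if already padded.
    value += "=" * (-len(value) % 4)
    return b64decode(value)


url_safe_result = decode_json_bytes("--__")
unpadded_result = decode_json_bytes("YQ")
print(url_safe_result)  # b'\xfb\xef\xff'
print(unpadded_result)  # b'a'
```

Because -len(value) % 4 is 0 for input whose length is already a multiple of four, padded input passes through untouched.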

Test

tests/test_from_dict_base64.py parametrizes the four accepted forms (standard/URL-safe x padded/unpadded) plus a to_dict/from_dict round trip. Without the fix, the unpadded and URL-safe cases fail (corruption / Incorrect padding); with it, all pass.
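
The four-form matrix can be sketched without pytest or a generated message class. This is not the contents of tests/test_from_dict_base64.py; normalize_and_decode is a hypothetical stand-in for the patched decoding path, and the sample bytes are chosen so the encoding needs both a non-alphanumeric character and padding.

```python
from base64 import b64decode, b64encode, urlsafe_b64encode


def normalize_and_decode(value: str) -> bytes:
    # Hypothetical stand-in for the patched bytes decoding in from_dict.
    value = value.replace("-", "+").replace("_", "/")
    value += "=" * (-len(value) % 4)
    return b64decode(value)


raw = b"\xfb\xff"
standard = b64encode(raw).decode()          # "+/8="
url_safe = urlsafe_b64encode(raw).decode()  # "-_8="

# standard/URL-safe x padded/unpadded
forms = [standard, standard.rstrip("="), url_safe, url_safe.rstrip("=")]
results = {form: normalize_and_decode(form) for form in forms}

for form, decoded in results.items():
    assert decoded == raw, form
```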


This change was prepared by an AI agent under my direction; I reviewed and verified it.

The proto3 JSON mapping accepts bytes encoded with either the standard or
URL-safe base64 alphabet, with or without padding, and the reference
protobuf implementation decodes all of these. from_dict used plain
base64.b64decode, which only handles the standard alphabet: it silently
discarded the URL-safe '-'/'_' characters (corrupting the value to empty
bytes) and rejected unpadded input with 'Incorrect padding'. Normalize to
the standard alphabet and restore the optional padding before decoding.