pynenc_mongo.util.chunked_data
Utilities for compressing and splitting large strings into chunks.
Provides functions to compress data with zlib and split it into chunks that fit within MongoDB’s BSON document size limit (16 MiB per document). Each chunk is a plain byte sequence stored with an index so the chunks can be reassembled in order.
Key components:
compress / decompress: zlib-based string compression
split_into_chunks / reassemble_chunks: size-based splitting and reassembly
exceeds_bson_threshold: check if data needs chunking
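The flow described above can be sketched end to end with the standard library alone. This is a minimal illustration, not the module's implementation: the helper names `to_chunk_docs` and `from_chunk_docs`, the `index` and `data` field names, and the 64-byte chunk size are all assumptions chosen for the example.

```python
import zlib


def to_chunk_docs(text: str, chunk_size: int) -> list[dict]:
    """Compress text, then wrap each chunk in a dict that carries its index.

    Hypothetical helper: the field names "index" and "data" are assumptions.
    """
    compressed = zlib.compress(text.encode("utf-8"), 6)
    return [
        {"index": i, "data": compressed[start:start + chunk_size]}
        for i, start in enumerate(range(0, len(compressed), chunk_size))
    ]


def from_chunk_docs(docs: list[dict]) -> str:
    """Sort chunk dicts by index, join their bytes, and decompress."""
    ordered = sorted(docs, key=lambda d: d["index"])
    compressed = b"".join(d["data"] for d in ordered)
    return zlib.decompress(compressed).decode("utf-8")


payload = "".join(f"item-{i}," for i in range(2000))
docs = to_chunk_docs(payload, chunk_size=64)

# The stored index lets the chunks be fetched in any order and still reassembled.
restored = from_chunk_docs(list(reversed(docs)))
print(len(docs), "chunks, round trip ok:", restored == payload)
```

Keeping the index next to each chunk means correctness does not depend on the order in which the chunks are read back.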
Module Contents
Functions
compress | Compress a string using zlib.
decompress | Decompress zlib-compressed bytes back to a string.
split_into_chunks | Split bytes into ordered chunks of at most chunk_size bytes.
reassemble_chunks | Reassemble ordered chunks into the original bytes.
exceeds_bson_threshold | Check if a string or dict of strings exceeds the size threshold for a single BSON document.
Data
logger
API
- pynenc_mongo.util.chunked_data.logger
‘getLogger(…)’
- pynenc_mongo.util.chunked_data.compress(data: str) -> bytes
Compress a string using zlib.
Uses compression level 6 for a balanced speed/ratio tradeoff. Level 6 is also zlib's default level; it is significantly faster than the maximum level (9), with only a small difference in output size.
- Parameters:
data – UTF-8 string to compress
- Returns:
Compressed bytes
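As a rough illustration of the behaviour described for compress, the sketch below encodes a string as UTF-8 and compresses it at level 6 with the standard `zlib` module. The name `compress_sketch` is a hypothetical stand-in, and the sample text is made up for the example.

```python
import zlib


def compress_sketch(data: str) -> bytes:
    """Encode as UTF-8 and compress at zlib level 6 (hypothetical stand-in)."""
    return zlib.compress(data.encode("utf-8"), 6)


text = "status=ok;" * 1000
blob = compress_sketch(text)

# Repetitive text compresses well; exact sizes depend on the input.
print(len(text.encode("utf-8")), "bytes ->", len(blob), "bytes")
```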
- pynenc_mongo.util.chunked_data.decompress(data: bytes) -> str
Decompress zlib-compressed bytes back to a string.
- Parameters:
data – Compressed bytes
- Returns:
Decompressed UTF-8 string
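The inverse step can be sketched the same way. `decompress_sketch` is a hypothetical stand-in, and the error handling shown is what the standard `zlib` module does on invalid input; whether the real function propagates that error unchanged is an assumption.

```python
import zlib


def decompress_sketch(data: bytes) -> str:
    """Decompress zlib bytes and decode them as UTF-8 (hypothetical stand-in)."""
    return zlib.decompress(data).decode("utf-8")


original = "naïve café " * 50
restored = decompress_sketch(zlib.compress(original.encode("utf-8"), 6))

# Bytes that are not a valid zlib stream raise zlib.error in the standard library.
try:
    decompress_sketch(b"not zlib data")
    rejected = False
except zlib.error:
    rejected = True

print(restored == original, rejected)
```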
- pynenc_mongo.util.chunked_data.split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]
Split bytes into ordered chunks of at most chunk_size bytes.
- Parameters:
data – The bytes to split
chunk_size – Maximum size per chunk in bytes
- Returns:
List of byte chunks in order
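Size-based splitting of this kind can be sketched with plain slicing. `split_sketch` is a hypothetical name, and two behaviours here are assumptions rather than documented facts: rejecting a non-positive chunk_size, and returning an empty list for empty input.

```python
def split_sketch(data: bytes, chunk_size: int) -> list[bytes]:
    """Slice data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        # Assumption: the documented function may handle this case differently.
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


chunks = split_sketch(b"abcdefghij", 4)
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

Only the last chunk can be shorter than chunk_size; every other chunk is exactly chunk_size bytes.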
- pynenc_mongo.util.chunked_data.reassemble_chunks(chunks: list[bytes]) -> bytes
Reassemble ordered chunks into the original bytes.
- Parameters:
chunks – List of byte chunks in order
- Returns:
Reassembled bytes
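Reassembly amounts to concatenation, which is why the chunks must already be in order. The sketch below uses a hypothetical `reassemble_sketch` to show both the correct case and what happens when the order is wrong.

```python
def reassemble_sketch(chunks: list[bytes]) -> bytes:
    """Concatenate chunks in the order given (hypothetical stand-in)."""
    return b"".join(chunks)


parts = [b"abcd", b"efgh", b"ij"]
whole = reassemble_sketch(parts)
scrambled = reassemble_sketch([parts[2], parts[0], parts[1]])

print(whole)      # b'abcdefghij'
print(scrambled)  # b'ijabcdefgh'
```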
- pynenc_mongo.util.chunked_data.exceeds_bson_threshold(data: dict[str, str] | str, threshold: int) -> bool
Check if a string or dict of strings exceeds the size threshold for a single BSON document.
- Parameters:
data – The string or dict of strings to check
threshold – Size threshold in bytes
- Returns:
True if the encoded data exceeds the threshold
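How the size is measured is not spelled out here, so the following is only one plausible reading: count UTF-8 bytes, summing keys and values for a dict, and compare strictly against the threshold. The name `exceeds_sketch`, the inclusion of keys in the total, and the strict comparison are all assumptions.

```python
from typing import Union


def exceeds_sketch(data: Union[dict[str, str], str], threshold: int) -> bool:
    """Return True if the UTF-8 encoded size is strictly above threshold.

    Assumption: for a dict, both keys and values count toward the size.
    """
    if isinstance(data, str):
        size = len(data.encode("utf-8"))
    else:
        size = sum(
            len(key.encode("utf-8")) + len(value.encode("utf-8"))
            for key, value in data.items()
        )
    return size > threshold


print(exceeds_sketch("hello", 5))        # False: 5 bytes is not above 5
print(exceeds_sketch("héllo", 5))        # True: "é" takes 2 bytes in UTF-8
print(exceeds_sketch({"k": "abcd"}, 4))  # True: 1 + 4 bytes
```

Measuring encoded bytes rather than character count matters because non-ASCII characters occupy more than one byte in UTF-8.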