feat: add binary row serializers and blob implementation #31
leaves12138 left a comment
Thanks for the PR. I found two blockers that should be fixed before merge: a missing include that breaks compilation, and a possible dereference of a moved-from unique_ptr in Blob construction.
#include "paimon/common/memory/memory_segment.h"
#include "paimon/common/memory/memory_segment_utils.h"
#include "paimon/common/memory/memory_slice.h"
#include "paimon/common/utils/var_length_int_utils.h"
This header is not present in the PR head or the current base branch, so including paimon/common/utils/var_length_int_utils.h makes this code fail to compile. Please add the missing utility header/implementation or switch to an existing varint helper.
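For illustration only, here is a minimal sketch of what a small varint helper could look like if the missing utility has to be added. It uses the common scheme of 7 data bits per byte with the high bit as a continuation flag. The names `EncodeVarUInt32` and `DecodeVarUInt32` are hypothetical and are not the API of the missing `var_length_int_utils.h`; whether this byte layout matches Paimon's on-disk encoding is also an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not the PR's actual utility.
// Appends `value` to `out`, 7 data bits per byte, least significant group first.
// The high bit of each byte is set when more bytes follow.
inline void EncodeVarUInt32(uint32_t value, std::vector<uint8_t>* out) {
    while (value >= 0x80) {
        out->push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out->push_back(static_cast<uint8_t>(value));
}

// Hypothetical helper, not the PR's actual utility.
// Decodes one value starting at `*pos`. On success, advances `*pos` past the
// encoded bytes. Returns std::nullopt (leaving `*pos` unchanged) if the input
// is truncated or does not fit in 32 bits.
inline std::optional<uint32_t> DecodeVarUInt32(const std::vector<uint8_t>& in,
                                               size_t* pos) {
    uint32_t result = 0;
    size_t p = *pos;
    for (int shift = 0; shift <= 28; shift += 7) {
        if (p >= in.size()) {
            return std::nullopt;  // ran out of bytes mid-value
        }
        const uint8_t byte = in[p++];
        // The fifth byte may only carry the top 4 bits and no continuation flag.
        if (shift == 28 && (byte & 0xF0) != 0) {
            return std::nullopt;
        }
        result |= static_cast<uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            *pos = p;
            return result;
        }
    }
    return std::nullopt;
}
```

Returning `std::optional` keeps the sketch independent of the project's `Status`/`Result` types; an in-tree version would presumably report errors through those instead.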
PAIMON_ASSIGN_OR_RAISE(std::string normalized_path, PathUtil::NormalizePath(path));
PAIMON_ASSIGN_OR_RAISE(std::unique_ptr<BlobDescriptor> descriptor,
                       BlobDescriptor::Create(normalized_path, offset, length));
auto impl = std::make_unique<Blob::Impl>(std::move(descriptor), descriptor->Uri());
This passes std::move(descriptor) and descriptor->Uri() in the same function call. The argument evaluation order is unspecified, so descriptor->Uri() may run after the unique_ptr has been moved from and become null. Please capture the URI before moving descriptor; the same pattern below in FromDescriptor needs the same fix.
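The requested fix can be sketched as follows. `Descriptor`, `Impl`, and `MakeImpl` are hypothetical stand-ins for `BlobDescriptor`, `Blob::Impl`, and the factory in the PR; the real constructor signature and the return type of `Uri()` are assumptions here.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for BlobDescriptor.
class Descriptor {
 public:
    explicit Descriptor(std::string uri) : uri_(std::move(uri)) {}
    const std::string& Uri() const { return uri_; }

 private:
    std::string uri_;
};

// Hypothetical stand-in for Blob::Impl; the real constructor may differ.
struct Impl {
    Impl(std::unique_ptr<Descriptor> d, std::string u)
        : descriptor(std::move(d)), uri(std::move(u)) {}

    std::unique_ptr<Descriptor> descriptor;
    std::string uri;
};

inline std::unique_ptr<Impl> MakeImpl(std::unique_ptr<Descriptor> descriptor) {
    // Copy the URI in its own statement, while `descriptor` is known to be
    // non-null. This statement is sequenced before the call below.
    std::string uri = descriptor->Uri();
    // Now the pointer can be handed over without any ordering concern.
    return std::make_unique<Impl>(std::move(descriptor), std::move(uri));
}
```

When a callee takes the `unique_ptr` by value, the move happens while the arguments are being initialized, so the ordering hazard is direct. Capturing the URI first keeps the call correct regardless of how the callee declares its parameters, and it makes the intent obvious to the next reader.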
Purpose
Introduce serialization infrastructure for Paimon's binary row format, including BinaryRowSerializer, BinarySerializerUtils, RowCompactedSerializer, and the Blob implementation.

Changes
- BinaryRowSerializer: serializes BinaryRow
- BinarySerializerUtils: converts InternalRow, InternalArray, InternalMap into their binary counterparts (BinaryRow, BinaryArray, BinaryMap)
- RowCompactedSerializer: serializes InternalRow, using bitset-based null tracking and variable-length integer encoding; provides SerializeToBytes() / Deserialize() for row-level serialization
- Blob: Blob class for managing blob descriptors and blob data

Tests
- BinaryRowSerializerTest
- BinarySerializerUtilsTest
- RowCompactedSerializerTest
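
As a rough illustration of the bitset-based null tracking mentioned for RowCompactedSerializer, the sketch below keeps one bit per field in a byte array placed ahead of the field data. `NullBitset` is a hypothetical class, not the PR's implementation, and the bit order (least significant bit first within each byte) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: one bit per field, set when the field is null.
class NullBitset {
 public:
    // Reserves ceil(field_count / 8) zeroed bytes.
    explicit NullBitset(size_t field_count) : bits_((field_count + 7) / 8, 0) {}

    // Marks field `i` as null. The caller must keep `i` below field_count.
    void SetNull(size_t i) {
        bits_[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }

    // Reports whether field `i` was marked null.
    bool IsNull(size_t i) const {
        return ((bits_[i / 8] >> (i % 8)) & 1u) != 0;
    }

    // Number of header bytes the bitset occupies in the serialized row.
    size_t ByteSize() const { return bits_.size(); }

 private:
    std::vector<uint8_t> bits_;
};
```

A reader of such a row would consult `IsNull(i)` before decoding field `i` and skip fields marked null, so null fields need no bytes in the data section.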