// Protocol Buffers - Google's data interchange format
// Copyright 2008 Google Inc. All rights reserved.
// https://developers.google.com/protocol-buffers/
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
package com.google.protobuf;
/**
* RawMessageInfo stores the same amount of information as {@link MessageInfo} but in a more compact
* format.
*/
final class RawMessageInfo implements MessageInfo {
  private final MessageLite defaultInstance;

/**
* The compact format packs everything into a String object and an Object[] array. The String
* object encodes the field number, field type, hasbits offset, oneof index, etc., whereas the
* Object[] array contains field references, class references, instance references, etc.
*
* <p>The String object encodes a sequence of integers as UTF-16 characters. Each int is encoded
* into 1 to 3 UTF-16 characters depending on its unsigned value:
*
* <ul>
* <li>1 char: [c1: 0x0000 - 0xD7FF] = int of the same value.
* <li>2 chars: [c1: 0xE000 - 0xFFFF], [c2: 0x0000 - 0xD7FF] = (c2 << 13) | (c1 & 0x1FFF)
* <li>3 chars: [c1: 0xE000 - 0xFFFF], [c2: 0xE000 - 0xFFFF], [c3: 0x0000 - 0xD7FF] = (c3 << 26)
* | ((c2 & 0x1FFF) << 13) | (c1 & 0x1FFF)
* </ul>
*
* <p>Note that we don't use UTF-16 surrogate code units [0xD800 - 0xDFFF] because they have to
* come in pairs to form a valid UTF-16 char sequence and don't help us encode values more
* efficiently.
*
* <p>The integer sequence encoded in the String object has the following layout:
*
* <ul>
* <li>[0]: flags, flags & 0x1 = is proto2?, flags & 0x2 = is MessageSet wire format?
* <li>[1]: field count. If 0, this is the end of the integer sequence and the corresponding
* Object[] array should be null.
* <li>[2]: oneof count
* <li>[3]: hasbits count, how many hasbits integers are generated.
* <li>[4]: min field number
* <li>[5]: max field number
* <li>[6]: total number of entries that need to be allocated
* <li>[7]: map field count
* <li>[8]: repeated field count, not including map fields
* <li>[9]: size of checkInitialized array
* <li>[...]: field entries
* </ul>
*
* <p>Each field entry starts with a field number and the field type:
*
* <ul>
* <li>[0]: field number
* <li>[1]: field type with extra bits:
* <ul>
* <li>v & 0xFF = field type as defined in the FieldType class
* <li>v & 0x100 = is required?
* <li>v & 0x200 = is checkUtf8?
* <li>v & 0x400 = needs isInitialized check?
* <li>v & 0x800 = is map field with proto2 enum value?
* </ul>
* </ul>
*
* <p>If the file is proto2 and this is a singular field:
*
* <ul>
* <li>[2]: hasbits offset
* </ul>
*
* <p>If the field is in a oneof:
*
* <ul>
* <li>[2]: oneof index
* </ul>
*
* <p>For other types, the field entry only has the field number and field type.
*
* <p>The Object[] array has 3 sections:
*
* <ul>
* <li>---- oneof section ----
* <ul>
* <li>[0]: value field for oneof 1.
* <li>[1]: case field for oneof 1.
* <li>...
* <li>[.]: value field for oneof n.
* <li>[.]: case field for oneof n.
* </ul>
* <li>---- hasbits section ----
* <ul>
* <li>[.]: hasbits field 1
* <li>[.]: hasbits field 2
* <li>...
* <li>[.]: hasbits field n
* </ul>
* <li>---- field section ----
* <ul>
* <li>[...]: field entries
* </ul>
* </ul>
*
* <p>In the Object[] array, field entries are ordered in the same way as field entries in the
* String object. The size of each entry is determined by the field type.
*
* <ul>
* <li>Oneof field:
* <ul>
* <li>Oneof message field:
* <ul>
* <li>[0]: message class reference.
* </ul>
* <li>Oneof enum field in proto2:
* <ul>
* <li>[0]: EnumLiteMap
* </ul>
* <li>For all other oneof fields, the field entry in the Object[] array is empty.
* </ul>
* <li>Repeated message field:
* <ul>
* <li>[0]: field reference
* <li>[1]: message class reference
* </ul>
* <li>Proto2 singular/repeated enum field:
* <ul>
* <li>[0]: field reference
* <li>[1]: EnumLiteMap
* </ul>
* <li>Map field with a proto2 enum value:
* <ul>
* <li>[0]: field reference
* <li>[1]: map default entry instance
* <li>[2]: EnumLiteMap
* </ul>
* <li>Map field with other value types:
* <ul>
* <li>[0]: field reference
* <li>[1]: map default entry instance
* </ul>
* <li>All other field types:
* <ul>
* <li>[0]: field reference
* </ul>
* </ul>
*
* <p>In order to read the field info from this compact format, a reader needs to progress through
* the String object and the Object[] array simultaneously.
*/
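  private final String info;
  private final Object[] objects;
  private final int flags;

  RawMessageInfo(MessageLite defaultInstance, String info, Object[] objects) {
    this.defaultInstance = defaultInstance;
    this.info = info;
    this.objects = objects;
    // Eagerly decode the first integer of the sequence, which holds the flags.
    int position = 0;
    int value = info.charAt(position++);
    if (value < 0xD800) {
      // Single-char encoding: the char is the value itself.
      flags = value;
    } else {
      // Multi-char encoding: each leading char (>= 0xE000) carries 13 payload bits, lowest bits
      // first; the terminating char (< 0xD800) carries the remaining high bits.
      int result = value & 0x1FFF;
      int shift = 13;
      while ((value = info.charAt(position++)) >= 0xD800) {
        result |= (value & 0x1FFF) << shift;
        shift += 13;
      }
      flags = result | (value << shift);
    }
  }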
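
  /*
   * Illustrative sketch, not part of this class: the constructor above decodes only the first
   * integer of the info string. The standalone example below shows the same 1-to-3 char scheme
   * as a round trip. The class name CharPackedIntsSketch and both method names are hypothetical,
   * and the encoder is an assumption inferred from the decoding rules documented on the info
   * field; the encoder that actually produces these strings may differ.
   *
```java
final class CharPackedIntsSketch {
  // Appends one int, treated as unsigned, using 1 to 3 chars.
  static void encode(StringBuilder out, int value) {
    // While the remaining value does not fit in a terminating char (< 0xD800), emit its low
    // 13 bits in a leading char tagged with 0xE000.
    while (Integer.compareUnsigned(value, 0xD800) >= 0) {
      out.append((char) (0xE000 | (value & 0x1FFF)));
      value >>>= 13;
    }
    out.append((char) value);
  }

  // Decodes the int that starts at position[0] and advances position[0] past it.
  static int decode(String info, int[] position) {
    int value = info.charAt(position[0]++);
    if (value < 0xD800) {
      return value;
    }
    int result = value & 0x1FFF;
    int shift = 13;
    while ((value = info.charAt(position[0]++)) >= 0xD800) {
      result |= (value & 0x1FFF) << shift;
      shift += 13;
    }
    return result | (value << shift);
  }

  public static void main(String[] args) {
    int[] samples = {0, 5, 0xD7FF, 0xD800, Integer.MAX_VALUE};
    StringBuilder out = new StringBuilder();
    for (int sample : samples) {
      encode(out, sample);
    }
    String info = out.toString();
    int[] position = {0};
    for (int sample : samples) {
      System.out.println(sample + " -> " + decode(info, position));
    }
  }
}
```
   *
   * The position is passed as a one-element array so that a caller can keep reading consecutive
   * integers from the same string, which is how the layout on the info field is meant to be
   * consumed.
   */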

  String getStringInfo() {
    return info;
  }
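
  /*
   * Illustrative sketch, not part of this class: a reader of the string returned by
   * getStringInfo() would walk the header in the order documented on the info field. The
   * example below reads the ten header integers into an array. The class name InfoHeaderSketch,
   * the method names, and the sample header values are all made up for illustration, and the
   * sample stops after the header, whereas a real info string continues with field entries.
   *
```java
final class InfoHeaderSketch {
  // Reads the next encoded int from info and advances position[0] past it.
  static int next(String info, int[] position) {
    int value = info.charAt(position[0]++);
    if (value < 0xD800) {
      return value;
    }
    int result = value & 0x1FFF;
    int shift = 13;
    while ((value = info.charAt(position[0]++)) >= 0xD800) {
      result |= (value & 0x1FFF) << shift;
      shift += 13;
    }
    return result | (value << shift);
  }

  // Returns the ten header ints in layout order. Entries [2] to [9] stay 0 when the field
  // count is 0, because the integer sequence ends after the field count in that case.
  static int[] readHeader(String info) {
    int[] header = new int[10];
    int[] position = {0};
    header[0] = next(info, position); // flags
    header[1] = next(info, position); // field count
    if (header[1] == 0) {
      return header;
    }
    for (int i = 2; i < header.length; i++) {
      header[i] = next(info, position);
    }
    return header;
  }

  public static void main(String[] args) {
    // Made-up header: proto2, 2 fields, 0 oneofs, 1 hasbits int, field numbers 1 to 5,
    // 2 entries to allocate, no map fields, no repeated fields, empty checkInitialized array.
    String info = new String(new char[] {1, 2, 0, 1, 1, 5, 2, 0, 0, 0});
    int[] header = readHeader(info);
    System.out.println("proto2=" + ((header[0] & 0x1) != 0));
    System.out.println("fieldCount=" + header[1]);
    System.out.println("fieldNumbers=" + header[4] + ".." + header[5]);
  }
}
```
   */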

  Object[] getObjects() {
    return objects;
  }

  @Override
  public MessageLite getDefaultInstance() {
    return defaultInstance;
  }

  @Override
  public ProtoSyntax getSyntax() {
    return (flags & 0x1) == 0x1 ? ProtoSyntax.PROTO2 : ProtoSyntax.PROTO3;
  }

  @Override
  public boolean isMessageSetWireFormat() {
    return (flags & 0x2) == 0x2;
  }
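
  /*
   * Illustrative sketch, not part of this class: the flag tests above use the same masking
   * pattern that applies to the "field type with extra bits" integer of each field entry, as
   * documented on the info field. The example below unpacks that integer. The class name
   * FieldTypeBitsSketch, the constant names, and the sample value 0x308 are hypothetical; only
   * the mask values come from the layout description.
   *
```java
final class FieldTypeBitsSketch {
  static final int TYPE_MASK = 0xFF;
  static final int REQUIRED_BIT = 0x100;
  static final int CHECK_UTF8_BIT = 0x200;
  static final int NEEDS_IS_INITIALIZED_BIT = 0x400;
  static final int MAP_WITH_PROTO2_ENUM_BIT = 0x800;

  // Returns the numeric field type id stored in the low 8 bits.
  static int typeId(int typeWithExtraBits) {
    return typeWithExtraBits & TYPE_MASK;
  }

  // Returns true if the given single-bit mask is set.
  static boolean has(int typeWithExtraBits, int bit) {
    return (typeWithExtraBits & bit) != 0;
  }

  static String describe(int typeWithExtraBits) {
    return "type=" + typeId(typeWithExtraBits)
        + " required=" + has(typeWithExtraBits, REQUIRED_BIT)
        + " checkUtf8=" + has(typeWithExtraBits, CHECK_UTF8_BIT)
        + " needsIsInitializedCheck=" + has(typeWithExtraBits, NEEDS_IS_INITIALIZED_BIT)
        + " mapWithProto2EnumValue=" + has(typeWithExtraBits, MAP_WITH_PROTO2_ENUM_BIT);
  }

  public static void main(String[] args) {
    // 0x308 = type id 8 with the required and checkUtf8 bits set.
    System.out.println(describe(0x308));
  }
}
```
   */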
}