public final class BytesToNameCanonicalizer extends Object
A caching symbol table implementation used for canonicalizing JSON field names
(as Names which are constructed directly from a byte-based
input source).
Complications arise from trying to reuse and merge symbol tables
efficiently, so that the vocabulary usually shared by subsequent
parsing runs can be exploited.

| Modifier and Type | Field and Description |
|---|---|
| protected int | _collCount: Total number of Names in collision buckets (included in _count along with primary entries) |
| protected int | _collEnd: Index of the first unused collision bucket entry (== size of the used portion of the collision list): less than or equal to 0xFF (255), since the maximum number of entries is 255 (8-bit, minus 0 used as 'empty' marker) |
| protected com.fasterxml.jackson.core.sym.BytesToNameCanonicalizer.Bucket[] | _collList: Array of heads of collision bucket chains; size dynamically |
| protected int | _count: Total number of Names in the symbol table; only used for child tables. |
| protected boolean | _failOnDoS: Flag that indicates whether an exception should be thrown if enough hash collisions are detected (true), or whether they should just be worked around (false). |
| protected int[] | _hash: Array of 2^N size, each entry of which combines 24 bits of hash (0 to indicate 'empty' slot) and an 8-bit collision bucket index (0 to indicate empty collision bucket chain; otherwise subtract one from index) |
| protected int | _hashMask: Mask used to truncate 32-bit hash value to current hash array size; essentially, hash array size - 1 (since hash array sizes are 2^N). |
| protected boolean | _intern: Whether canonical symbol Strings are to be intern()ed before being added to the table or not. |
| protected int | _longestCollisionList: Length of the longest collision list; tracked both to indicate problems with attacks and to allow flushing for other cases. |
| protected Name[] | _mainNames: Array that contains Name instances matching entries in _mainHash. |
| protected BitSet | _overflows: Lazily constructed structure that is used to keep track of collision buckets that have overflowed once: this is used to detect likely attempts at denial-of-service attacks that use hash collisions. |
| protected BytesToNameCanonicalizer | _parent: Reference to the root symbol table, for child tables, so that they can merge table information back as necessary. |
| protected AtomicReference<com.fasterxml.jackson.core.sym.BytesToNameCanonicalizer.TableInfo> | _tableInfo: Member that is only used by the root table instance: root passes immutable state into child instances, and children may return new state if they add entries to the table. |
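
The slot layout described for _hash and _hashMask can be illustrated with a small stand-alone sketch. It assumes the 24 hash bits sit in the upper part of each int and the 8-bit collision marker in the lowest byte; that bit placement, the class name HashSlotSketch and all of its method names are illustrative assumptions, not the library's code.

```java
public final class HashSlotSketch {
    /** Packs the low 24 bits of a hash with an 8-bit collision marker (0 = no chain). Assumed layout. */
    public static int pack(int hash, int collisionMarker) {
        return (hash << 8) | (collisionMarker & 0xFF);
    }

    /** True if the 24 hash bits stored in the slot equal the low 24 bits of the given hash. */
    public static boolean hashMatches(int slot, int hash) {
        return (((slot >> 8) ^ hash) << 8) == 0;
    }

    /** Returns the collision bucket index (marker minus one), or -1 if the slot has no chain. */
    public static int bucketIndex(int slot) {
        int marker = slot & 0xFF;
        return (marker == 0) ? -1 : marker - 1;
    }

    /** Truncates a 32-bit hash to an index into a hash array of 2^N entries. */
    public static int slotIndex(int hash, int tableSize) {
        int hashMask = tableSize - 1; // valid only because tableSize is a power of two
        return hash & hashMask;
    }

    public static void main(String[] args) {
        int hash = 0x12ABCDEF;
        int slot = pack(hash, 3);
        System.out.println("hash matches: " + hashMatches(slot, hash));
        System.out.println("bucket index: " + bucketIndex(slot));
        System.out.println("slot index in 64-entry table: " + slotIndex(hash, 64));
    }
}
```

Storing the marker as index plus one is what lets 0 mean "no collision chain", which is also why at most 255 bucket chains can be addressed.
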
| Modifier and Type | Method and Description |
|---|---|
| Name | addName(String name, int[] q, int qlen) |
| Name | addName(String name, int q1, int q2) |
| int | bucketCount() |
| int | calcHash(int q1) |
| int | calcHash(int[] q, int qlen) |
| int | calcHash(int q1, int q2) |
| protected static int[] | calcQuads(byte[] wordBytes) |
| int | collisionCount(): Method mostly needed by unit tests; calculates number of entries that are in collision list. |
| static BytesToNameCanonicalizer | createRoot(): Factory method to call to create a symbol table instance with a randomized seed value. |
| protected static BytesToNameCanonicalizer | createRoot(int seed): Factory method that should only be called from unit tests, where seed value should remain the same. |
| Name | findName(int q1): Finds and returns name matching the specified symbol, if such name already exists in the table. |
| Name | findName(int[] q, int qlen): Finds and returns name matching the specified symbol, if such name already exists in the table; or if not, creates name object, adds to the table, and returns it. |
| Name | findName(int q1, int q2): Finds and returns name matching the specified symbol, if such name already exists in the table. |
| static Name | getEmptyName() |
| int | hashSeed() |
| BytesToNameCanonicalizer | makeChild(boolean canonicalize, boolean intern): Deprecated. |
| BytesToNameCanonicalizer | makeChild(int flags): Factory method used to create actual symbol table instance to use for parsing. |
| int | maxCollisionLength(): Method mostly needed by unit tests; calculates length of the longest collision chain. |
| boolean | maybeDirty(): Method called to quickly check whether a child symbol table may have gotten additional entries. |
| void | release(): Method called by the using code to indicate it is done with this instance. |
| protected void | reportTooManyCollisions(int maxLen) |
| int | size() |
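
The root/child hand-off described for _parent, _tableInfo, maybeDirty() and release() can be pictured with a simplified stand-alone model. This is a sketch under stated assumptions rather than the library's implementation: SharedTableSketch and canonicalize() are hypothetical names, a plain Map stands in for the hash arrays, and the "adopt the child's table only if it is larger" merge policy is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public final class SharedTableSketch {
    private final SharedTableSketch parent;                     // null for the root
    private final AtomicReference<Map<String, String>> shared;  // used by the root only
    private Map<String, String> names;                          // child's view; copied on first write
    private boolean dirty;

    private SharedTableSketch(SharedTableSketch parent, Map<String, String> state) {
        this.parent = parent;
        this.shared = (parent == null) ? new AtomicReference<>(state) : null;
        this.names = state;
    }

    public static SharedTableSketch createRoot() {
        return new SharedTableSketch(null, Collections.<String, String>emptyMap());
    }

    /** Children start from the root's current immutable snapshot. */
    public SharedTableSketch makeChild() {
        return new SharedTableSketch(this, shared.get());
    }

    /** Hypothetical helper: returns the canonical instance, adding the name if it is new. */
    public String canonicalize(String name) {
        String existing = names.get(name);
        if (existing != null) {
            return existing;
        }
        if (!dirty) {                       // copy-on-write: never mutate the shared snapshot
            names = new HashMap<>(names);
            dirty = true;
        }
        names.put(name, name);
        return name;
    }

    public boolean maybeDirty() {
        return dirty;
    }

    /** Offers the child's additions back to the root (assumed policy: larger table wins). */
    public void release() {
        if (parent != null && dirty) {
            final Map<String, String> snapshot = Collections.unmodifiableMap(names);
            parent.shared.updateAndGet(cur -> snapshot.size() > cur.size() ? snapshot : cur);
            dirty = false;                  // any later write copies again
        }
    }

    public int size() {
        return (parent == null) ? shared.get().size() : names.size();
    }
}
```

A parser-like caller would take one child per parsing run, canonicalize names through it, and call release() when done, so that later children start with the vocabulary already seen.
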
protected final BytesToNameCanonicalizer _parent
protected final AtomicReference<com.fasterxml.jackson.core.sym.BytesToNameCanonicalizer.TableInfo> _tableInfo
protected boolean _intern
NOTE: non-final to allow disabling intern()ing in case of excessive collisions.
protected final boolean _failOnDoS
protected int _count
protected int _longestCollisionList
protected int _hashMask
protected int[] _hash
protected Name[] _mainNames
Array that contains Name instances matching entries in _mainHash. Contains nulls for unused entries.
protected com.fasterxml.jackson.core.sym.BytesToNameCanonicalizer.Bucket[] _collList
protected int _collCount
Total number of Names in collision buckets (included in _count along with primary entries)
protected int _collEnd
protected BitSet _overflows
public static BytesToNameCanonicalizer createRoot()
protected static BytesToNameCanonicalizer createRoot(int seed)
public BytesToNameCanonicalizer makeChild(int flags)
@Deprecated public BytesToNameCanonicalizer makeChild(boolean canonicalize, boolean intern)
public void release()
public int size()
public int bucketCount()
public boolean maybeDirty()
public int hashSeed()
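public int collisionCount()
Method mostly needed by unit tests; calculates number of entries that are in collision list. Value can be at most (size() - 1), but should usually be much lower, ideally 0.
public int maxCollisionLength()
Method mostly needed by unit tests; calculates length of the longest collision chain. This should typically be a low number, but may be up to size() - 1 in the pathological case.
public static Name getEmptyName()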
public Name findName(int q1)
Note: separate methods exist to optimize the common case of short element/attribute names (4 or fewer ASCII characters).
Parameters:
q1 - int32 containing first 4 bytes of the name; if the whole name is shorter than 4 bytes, padded with zero bytes in front (zero MSBs, i.e. right aligned)
public Name findName(int q1, int q2)
Note: separate methods exist to optimize the common case of relatively short element/attribute names (8 or fewer ASCII characters).
Parameters:
q1 - int32 containing first 4 bytes of the name.
q2 - int32 containing bytes 5 through 8 of the name; if the name is shorter than 8 bytes, padded with up to 3 zero bytes in front (zero MSBs, i.e. right aligned)
public Name findName(int[] q, int qlen)
Note: this is the general purpose method that can be called for names of any length. However, if the name is less than 9 bytes long, it is preferable to call the version optimized for short names.
Parameters:
q - Array of int32s, each of which contains 4 bytes of the encoded name
qlen - Number of int32s, starting from index 0, in the quads parameter
public int calcHash(int q1)
public int calcHash(int q1, int q2)
public int calcHash(int[] q, int qlen)
protected static int[] calcQuads(byte[] wordBytes)
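The quad encoding that the findName parameters describe (4 name bytes per int32, with a short final quad right aligned and zero padded in its most significant bytes) can be sketched as follows. This is an illustration under the assumption that the first byte of each group lands in the most significant occupied position; QuadPackingSketch and toQuads are hypothetical names, and the code is not the library's calcQuads.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public final class QuadPackingSketch {
    /** Packs name bytes into int32 "quads", 4 bytes per int; a partial last quad stays right aligned. */
    public static int[] toQuads(byte[] nameBytes) {
        int[] quads = new int[(nameBytes.length + 3) / 4];
        for (int i = 0; i < nameBytes.length; i++) {
            // Shift the bytes already in this quad left and append the next byte at the low end,
            // so unused leading (most significant) bytes remain zero.
            quads[i >> 2] = (quads[i >> 2] << 8) | (nameBytes[i] & 0xFF);
        }
        return quads;
    }

    public static void main(String[] args) {
        for (String name : new String[] { "id", "name", "value" }) {
            int[] quads = toQuads(name.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(name).append(" ->");
            for (int q : quads) {
                sb.append(String.format(" 0x%08X", q));
            }
            System.out.println(sb + " (qlen=" + quads.length + ")");
        }
        System.out.println(Arrays.toString(toQuads(new byte[0])));
    }
}
```

Under this assumed layout, a name of up to 4 bytes yields a single quad suitable for the one-argument lookup, and a name of up to 8 bytes yields two quads for the two-argument form.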
protected void reportTooManyCollisions(int maxLen)
Copyright © 2014-2015 FasterXML. All Rights Reserved.