Package net.sf.saxon.serialize.charcode
Class UTF16CharacterSet
- java.lang.Object
-
- net.sf.saxon.serialize.charcode.UTF16CharacterSet
-
- All Implemented Interfaces:
CharacterSet
public class UTF16CharacterSet extends java.lang.Object implements CharacterSet
A class holding static constants and methods for processing UTF-16 and surrogate pairs.
-
-
Field Summary
Modifier and Type    Field
static int    NONBMP_MAX
static int    NONBMP_MIN
static char    SURROGATE1_MAX
static char    SURROGATE1_MIN
static char    SURROGATE2_MAX
static char    SURROGATE2_MIN
-
Method Summary
static int combinePair(char high, char low)
    Return the non-BMP character corresponding to a given surrogate pair.
static boolean containsSurrogates(java.lang.CharSequence s)
    Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).
static int firstInvalidChar(java.lang.CharSequence chars, java.util.function.IntPredicate predicate)
    Test whether all the characters in a CharSequence are valid XML characters.
java.lang.String getCanonicalName()
    Get the preferred Java name of the character set.
static UTF16CharacterSet getInstance()
    Get the singular instance of this class.
static char highSurrogate(int ch)
    Return the high surrogate of a non-BMP character.
boolean inCharset(int c)
    Determine if a character is present in the character set.
static boolean isHighSurrogate(int ch)
    Test whether the given character is a high surrogate.
static boolean isLowSurrogate(int ch)
    Test whether the given character is a low surrogate.
static boolean isSurrogate(int c)
    Test whether a given character is a surrogate (high or low).
static char lowSurrogate(int ch)
    Return the low surrogate of a non-BMP character.
static void main(java.lang.String[] args)
-
-
-
Field Detail
-
NONBMP_MIN
public static final int NONBMP_MIN
- See Also:
- Constant Field Values
-
NONBMP_MAX
public static final int NONBMP_MAX
- See Also:
- Constant Field Values
-
SURROGATE1_MIN
public static final char SURROGATE1_MIN
- See Also:
- Constant Field Values
-
SURROGATE1_MAX
public static final char SURROGATE1_MAX
- See Also:
- Constant Field Values
-
SURROGATE2_MIN
public static final char SURROGATE2_MIN
- See Also:
- Constant Field Values
-
SURROGATE2_MAX
public static final char SURROGATE2_MAX
- See Also:
- Constant Field Values
-
-
Method Detail
-
getInstance
public static UTF16CharacterSet getInstance()
Get the singular instance of this class.
- Returns:
- the singular instance of this class
-
inCharset
public boolean inCharset(int c)
Description copied from interface: CharacterSet
Determine if a character is present in the character set.
- Specified by:
inCharset in interface CharacterSet
-
getCanonicalName
public java.lang.String getCanonicalName()
Description copied from interface: CharacterSet
Get the preferred Java name of the character set. Note that Java in many cases also supports a "historic name".
- Specified by:
getCanonicalName in interface CharacterSet
-
combinePair
public static int combinePair(char high, char low)
Return the non-BMP character corresponding to a given surrogate pair.
- Parameters:
high - The high surrogate.
low - The low surrogate.
- Returns:
- the Unicode codepoint represented by the surrogate pair
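The arithmetic for combining a surrogate pair is fixed by the UTF-16 encoding form, so it can be sketched without Saxon on the classpath. The class `SurrogatePairSketch` below is a hypothetical illustration, not Saxon's source. For well-formed pairs it should give the same result as `java.lang.Character.toCodePoint`.

```java
final class SurrogatePairSketch {

    // Hypothetical helper, not Saxon's implementation: applies the standard
    // UTF-16 decoding formula to a well-formed high/low surrogate pair.
    static int combinePair(char high, char low) {
        // Each surrogate carries 10 payload bits; the decoded value is
        // offset by 0x10000, the first code point outside the BMP.
        return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        // U+1F600 is encoded in UTF-16 as the pair D83D DE00.
        int cp = combinePair('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp)); // prints "1f600"
    }
}
```

The sketch does not validate its arguments. Passing characters outside the surrogate ranges yields a meaningless value, and this page does not say how the real method handles that case.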
-
highSurrogate
public static char highSurrogate(int ch)
Return the high surrogate of a non-BMP character.
- Parameters:
ch - The Unicode codepoint of the non-BMP character to be divided.
- Returns:
- the first character in the surrogate pair
-
lowSurrogate
public static char lowSurrogate(int ch)
Return the low surrogate of a non-BMP character.
- Parameters:
ch - The Unicode codepoint of the non-BMP character to be divided.
- Returns:
- the second character in the surrogate pair
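Splitting a non-BMP codepoint into its two surrogates is the inverse of combining a pair. The following is a minimal sketch under the assumption that the input lies in the range 0x10000 to 0x10FFFF. The class name `SurrogateSplitSketch` is hypothetical and the code is not taken from Saxon.

```java
final class SurrogateSplitSketch {

    // Hypothetical helpers, not Saxon's implementation. Both assume ch is a
    // non-BMP code point in the range 0x10000..0x10FFFF.
    static char highSurrogate(int ch) {
        // The upper 10 bits of (ch - 0x10000) select the high surrogate.
        return (char) (((ch - 0x10000) >> 10) + 0xD800);
    }

    static char lowSurrogate(int ch) {
        // The lower 10 bits of (ch - 0x10000) select the low surrogate.
        return (char) (((ch - 0x10000) & 0x3FF) + 0xDC00);
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        char high = highSurrogate(cp);
        char low = lowSurrogate(cp);
        // The pair round-trips through the JDK's own decoder.
        System.out.println(Character.toCodePoint(high, low) == cp); // prints "true"
    }
}
```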
-
isSurrogate
public static boolean isSurrogate(int c)
Test whether a given character is a surrogate (high or low).
- Parameters:
c - the character to test
- Returns:
- true if the character is the high or low half of a surrogate pair
-
isHighSurrogate
public static boolean isHighSurrogate(int ch)
Test whether the given character is a high surrogate.
- Parameters:
ch - The character to test.
- Returns:
- true if the character is the first character in a surrogate pair
-
isLowSurrogate
public static boolean isLowSurrogate(int ch)
Test whether the given character is a low surrogate.
- Parameters:
ch - The character to test.
- Returns:
- true if the character is the second character in a surrogate pair
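The three surrogate tests reduce to range checks. The sketch below uses the surrogate ranges defined by the Unicode Standard. It is an assumption, not something this page confirms, that the SURROGATE1_* and SURROGATE2_* fields hold these same values. The class name `SurrogateRangeSketch` is hypothetical.

```java
final class SurrogateRangeSketch {

    // Values from the Unicode Standard; assumed (not confirmed by this page)
    // to match the SURROGATE1_* and SURROGATE2_* fields listed above.
    static final char SURROGATE1_MIN = 0xD800; // first high surrogate
    static final char SURROGATE1_MAX = 0xDBFF; // last high surrogate
    static final char SURROGATE2_MIN = 0xDC00; // first low surrogate
    static final char SURROGATE2_MAX = 0xDFFF; // last low surrogate

    // Hypothetical helpers, not Saxon's implementation.
    static boolean isHighSurrogate(int ch) {
        return ch >= SURROGATE1_MIN && ch <= SURROGATE1_MAX;
    }

    static boolean isLowSurrogate(int ch) {
        return ch >= SURROGATE2_MIN && ch <= SURROGATE2_MAX;
    }

    static boolean isSurrogate(int c) {
        return isHighSurrogate(c) || isLowSurrogate(c);
    }

    public static void main(String[] args) {
        System.out.println(isHighSurrogate(0xD83D)); // prints "true"
        System.out.println(isLowSurrogate(0xD83D));  // prints "false"
        System.out.println(isSurrogate('A'));        // prints "false"
    }
}
```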
-
containsSurrogates
public static boolean containsSurrogates(java.lang.CharSequence s)
Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).
- Parameters:
s - the string to be tested
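A check of this kind can be sketched as a linear scan over the UTF-16 code units. The class `ContainsSurrogatesSketch` is hypothetical and relies only on `java.lang.Character.isSurrogate`. Whether Saxon's method is implemented this way is not stated here.

```java
final class ContainsSurrogatesSketch {

    // Hypothetical helper, not Saxon's implementation: returns true as soon
    // as any UTF-16 code unit in s falls in a surrogate range.
    static boolean containsSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsSurrogates("plain text"));         // prints "false"
        System.out.println(containsSurrogates("face \uD83D\uDE00"));  // prints "true"
    }
}
```

A caller might use such a test as a fast path: when no surrogates are present, every `char` is a complete codepoint and no pair handling is needed.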
-
firstInvalidChar
public static int firstInvalidChar(java.lang.CharSequence chars, java.util.function.IntPredicate predicate)
Test whether all the characters in a CharSequence are valid XML characters.
- Parameters:
chars - the character sequence to be tested
predicate - the predicate that all characters must satisfy
- Returns:
- the codepoint of the first invalid character in the character sequence (according to the supplied predicate); or -1 if all characters in the character sequence are valid
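The contract described above can be illustrated with a standalone sketch. It assumes that surrogate pairs are combined into a single codepoint before the predicate is applied, and that an unpaired surrogate is passed to the predicate unchanged. Neither point is confirmed by this page. The class `FirstInvalidCharSketch` and the predicate `isXml10Char` (which follows the Char production of XML 1.0) are hypothetical.

```java
import java.util.function.IntPredicate;

final class FirstInvalidCharSketch {

    // Hypothetical helper, not Saxon's implementation. Assumption: surrogate
    // pairs are decoded to one code point before the predicate sees them.
    static int firstInvalidChar(CharSequence chars, IntPredicate predicate) {
        int i = 0;
        while (i < chars.length()) {
            int cp = Character.codePointAt(chars, i);
            if (!predicate.test(cp)) {
                return cp; // first code point rejected by the predicate
            }
            i += Character.charCount(cp); // advance by 1 or 2 code units
        }
        return -1; // every code point satisfied the predicate
    }

    // Example predicate: the Char production of XML 1.0.
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    public static void main(String[] args) {
        IntPredicate xml10 = FirstInvalidCharSketch::isXml10Char;
        System.out.println(firstInvalidChar("a\bc", xml10));  // prints "8"
        System.out.println(firstInvalidChar("hello", xml10)); // prints "-1"
    }
}
```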
-
main
public static void main(java.lang.String[] args)
-
-