Class Utf8.UnsafeProcessor

  • Enclosing class:
    Utf8

    static final class Utf8.UnsafeProcessor
    extends Utf8.Processor
    Utf8.Processor that uses sun.misc.Unsafe where possible to improve performance.
    • Constructor Summary

      Constructors 
      Constructor Description
      UnsafeProcessor()  
    • Method Summary

      Modifier and Type Method Description
      (package private) java.lang.String decodeUtf8​(byte[] bytes, int index, int size)
      Decodes the given byte array slice into a String.
      (package private) java.lang.String decodeUtf8Direct​(java.nio.ByteBuffer buffer, int index, int size)
      Decodes direct ByteBuffer instances into String.
      (package private) int encodeUtf8​(java.lang.String in, byte[] out, int offset, int length)
      Encodes an input character sequence (in) to UTF-8 in the target array (out).
      protected void encodeUtf8Internal​(java.lang.String in, java.nio.ByteBuffer out)
      Encodes the input character sequence to a direct ByteBuffer instance.
      (package private) static boolean isAvailable()
      Indicates whether or not all required unsafe operations are supported on this platform.
      boolean isValidUtf8​(byte[] bytes, int index, int limit)
      Returns true if the given byte array slice is a well-formed UTF-8 byte sequence.
      protected boolean isValidUtf8BufferDirect​(java.nio.ByteBuffer buffer, int index, int limit)
      Must only be called on Direct buffers.
      private static int unsafeEstimateConsecutiveAscii​(byte[] bytes, long offset, int maxChars)
      Counts (approximately) the number of consecutive ASCII characters starting from the given position, using the most efficient method available to the platform.
      private static int unsafeEstimateConsecutiveAscii​(long address, int maxChars)
      Same as Utf8.estimateConsecutiveAscii(ByteBuffer, int, int) except that it uses the most efficient method available to the platform.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • UnsafeProcessor

        UnsafeProcessor()
    • Method Detail

      • isAvailable

        static boolean isAvailable()
        Indicates whether or not all required unsafe operations are supported on this platform.
      • isValidUtf8

        public boolean isValidUtf8​(byte[] bytes,
                                   int index,
                                   int limit)
        Description copied from class: Utf8.Processor
        Returns true if the given byte array slice is a well-formed UTF-8 byte sequence. The range of bytes to be checked extends from index index, inclusive, to limit, exclusive.
        Specified by:
        isValidUtf8 in class Utf8.Processor
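        The contract above (check the half-open range from index to limit) can be illustrated with JDK classes alone. This is a minimal sketch, not the implementation used by this class: it relies on a strict java.nio.charset.CharsetDecoder rather than sun.misc.Unsafe, and the class name Utf8ValiditySketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the documented API.
final class Utf8ValiditySketch {

    /** Returns true if bytes[index, limit) is a well-formed UTF-8 byte sequence. */
    static boolean isValidUtf8(byte[] bytes, int index, int limit) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                // REPORT makes the decoder throw instead of substituting U+FFFD.
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                // wrap(array, offset, length): the length is limit - index.
                .decode(ByteBuffer.wrap(bytes, index, limit - index));
            return true;
        } catch (CharacterCodingException e) {
            // Malformed or truncated sequence somewhere in the range.
            return false;
        }
    }
}
```

        Because limit is exclusive, a range that ends in the middle of a multi-byte sequence is reported as invalid even when the full array is valid.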
      • isValidUtf8BufferDirect

        protected boolean isValidUtf8BufferDirect​(java.nio.ByteBuffer buffer,
                                                  int index,
                                                  int limit)
        Description copied from class: Utf8.Processor
        Must only be called on Direct buffers. This exists as a separate method only so that the UnsafeProcessor can optimize specially for that case.
        Overrides:
        isValidUtf8BufferDirect in class Utf8.Processor
      • encodeUtf8

        int encodeUtf8​(java.lang.String in,
                       byte[] out,
                       int offset,
                       int length)
        Description copied from class: Utf8.Processor
        Encodes an input character sequence (in) to UTF-8 in the target array (out). For a string, this method is functionally identical to
        
         byte[] a = in.getBytes(UTF_8);
         System.arraycopy(a, 0, out, offset, a.length);
         return offset + a.length;
         
        but may be implemented differently for efficiency purposes.

        As with String.getBytes(UTF_8), unpaired surrogates are replaced with a replacement character.

        To ensure sufficient space in the output buffer, either call Utf8.encodedLength(java.lang.String) to compute the exact amount needed, or leave room for Utf8.MAX_BYTES_PER_CHAR * in.length(), which is the largest possible number of bytes that any input can be encoded to.

        Specified by:
        encodeUtf8 in class Utf8.Processor
        Parameters:
        in - the input character sequence to be encoded
        out - the target array
        offset - the starting offset in bytes to start writing at
        length - the maximum number of bytes that may be written, starting from offset
        Returns:
        the new offset, equivalent to offset + Utf8.encodedLength(in)
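        The reference behavior and the worst-case sizing rule described above can be sketched with the JDK alone. The class name Utf8EncodeSketch and the method encodeReference are hypothetical, and the constant value 3 is an assumption about Utf8.MAX_BYTES_PER_CHAR (3 is the most bytes a single UTF-16 code unit can need). The sketch omits the length parameter and performs no explicit bounds check.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the documented API.
final class Utf8EncodeSketch {

    // Assumption: mirrors Utf8.MAX_BYTES_PER_CHAR.
    static final int MAX_BYTES_PER_CHAR = 3;

    /**
     * Encodes in as UTF-8 into out, starting at offset, and returns the new offset.
     * Follows the getBytes/arraycopy equivalence given in the description.
     */
    static int encodeReference(String in, byte[] out, int offset) {
        byte[] encoded = in.getBytes(StandardCharsets.UTF_8);
        // Throws an IndexOutOfBoundsException if out is too small.
        System.arraycopy(encoded, 0, out, offset, encoded.length);
        return offset + encoded.length;
    }

    /** Allocates a buffer large enough for any encoding of in written at offset. */
    static byte[] worstCaseBuffer(String in, int offset) {
        return new byte[offset + in.length() * MAX_BYTES_PER_CHAR];
    }
}
```

        A caller would allocate with worstCaseBuffer(in, offset), call encodeReference(in, out, offset), and treat the returned value as the exclusive end of the encoded bytes.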
      • encodeUtf8Internal

        protected void encodeUtf8Internal​(java.lang.String in,
                                          java.nio.ByteBuffer out)
        Description copied from class: Utf8.Processor
        Encodes the input character sequence to a direct ByteBuffer instance.
        Specified by:
        encodeUtf8Internal in class Utf8.Processor
      • unsafeEstimateConsecutiveAscii

        private static int unsafeEstimateConsecutiveAscii​(byte[] bytes,
                                                          long offset,
                                                          int maxChars)
        Counts (approximately) the number of consecutive ASCII characters starting from the given position, using the most efficient method available to the platform.
        Parameters:
        bytes - the array containing the character sequence
        offset - the memory offset of the starting position within the array (equal to index + arrayBaseOffset)
        maxChars - the maximum number of characters to count
        Returns:
        the number of ASCII characters found. The stopping position will be at or before the first non-ASCII byte.
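        One way to read "approximately" is a scan that checks eight bytes at a time and stops at the first word containing a byte with its high bit set, so the count can fall short of the true ASCII run but never passes a non-ASCII byte. The sketch below illustrates that idea under stated assumptions: it reads through a java.nio.ByteBuffer view instead of sun.misc.Unsafe, takes a plain array index rather than a memory offset, and the class, method, and constant names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not the implementation of this class.
final class AsciiEstimateSketch {

    // One high bit per byte lane: any set bit means some byte is >= 0x80.
    private static final long ASCII_MASK_LONG = 0x8080808080808080L;

    /**
     * Returns a lower bound on the number of consecutive ASCII bytes starting at
     * index, examining at most maxChars bytes. The caller must ensure that
     * index + maxChars does not exceed bytes.length.
     */
    static int estimateConsecutiveAscii(byte[] bytes, int index, int maxChars) {
        ByteBuffer view = ByteBuffer.wrap(bytes);
        int count = 0;
        // Advance one 8-byte word at a time while every byte in the word is ASCII.
        while (count + 8 <= maxChars
                && (view.getLong(index + count) & ASCII_MASK_LONG) == 0L) {
            count += 8;
        }
        // Any trailing bytes (fewer than 8) are left for the caller to check.
        return count;
    }
}
```

        The result is always a multiple of eight, which is why a caller still has to examine the bytes after the returned position one at a time.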