Class FileEncoding

java.lang.Object
org.freebsd.file.FileEncoding

public class FileEncoding extends Object
Tries to guess the encoding of a byte sequence. Original code taken from https://github.com/file/file/blob/master/src/encoding.c
  • Field Details

    • type

      private String type
    • code

      private String code
    • code_mime

      private String code_mime
    • F

      private static final byte F
    • T

      private static final byte T
    • I

      private static final byte I
    • X

      private static final byte X
    • text_chars

      private byte[] text_chars
    • ebcdic_to_ascii

      private static final char[] ebcdic_to_ascii
    • ebcdic_1047_to_8859

      private static final char[] ebcdic_1047_to_8859
  • Constructor Details

    • FileEncoding

      public FileEncoding()
  • Method Details

    • getCodeMime

      public String getCodeMime()
    • getType

      public String getType()
    • getCode

      public String getCode()
    • guessFileEncoding

      public boolean guessFileEncoding(byte[] buf)
      Tries to determine whether the text in buf is in a character encoding that can be identified. EBCDIC is also identified, by converting it to ISO-8859-1.
      Returns:
      true if it could guess an encoding.
    • looks_ascii

      private boolean looks_ascii(byte[] buf, int nbytes)
    • looks_latin1

      private boolean looks_latin1(byte[] buf, int nbytes)
    • looks_extended

      private boolean looks_extended(byte[] buf, int nbytes)
    • looks_utf8

      protected int looks_utf8(byte[] buf, int nbytes)
    • looks_utf8_with_BOM

      private boolean looks_utf8_with_BOM(byte[] buf, int nbytes)
    • looks_utf7

      private boolean looks_utf7(byte[] buf, int nbytes)
    • looks_ucs16

      private int looks_ucs16(byte[] buf, int nbytes)
    • from_ebcdic

      private byte[] from_ebcdic(byte[] buf, int nbytes)
    • unsignedByte

      private int unsignedByte(byte value)
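
The description of guessFileEncoding above says only that the method tries to identify the character encoding of a byte array. As a rough illustration of that kind of check, here is a minimal, self-contained sketch. It is not the code of FileEncoding and does not call it: the class name EncodingGuessSketch, its method names, and the returned labels are all hypothetical. It uses the JDK's strict UTF-8 decoder rather than the table-driven logic of encoding.c, and it covers only 7-bit ASCII and UTF-8 (with or without a byte order mark).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of org.freebsd.file.FileEncoding.
class EncodingGuessSketch {

    // True if buf starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] buf) {
        return buf.length >= 3
                && (buf[0] & 0xFF) == 0xEF
                && (buf[1] & 0xFF) == 0xBB
                && (buf[2] & 0xFF) == 0xBF;
    }

    // True if every byte is below 0x80. This is cruder than a per-byte
    // classification table: control characters are accepted too.
    static boolean isSevenBit(byte[] buf) {
        for (byte b : buf) {
            if ((b & 0xFF) >= 0x80) {
                return false;
            }
        }
        return true;
    }

    // True if the JDK's decoder, configured to report errors instead of
    // replacing them, accepts buf as well-formed UTF-8.
    static boolean isValidUtf8(byte[] buf) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(buf));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns an illustrative label, or null if none of the checks match.
    static String guess(byte[] buf) {
        if (hasUtf8Bom(buf) && isValidUtf8(buf)) {
            return "utf-8 (with BOM)";
        }
        if (isSevenBit(buf)) {
            return "us-ascii";
        }
        if (isValidUtf8(buf)) {
            return "utf-8";
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] sample = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(guess(sample));
    }
}
```

The checks run from most to least specific, so a buffer that is pure ASCII is reported as such even though it is also valid UTF-8. Masking each byte with 0xFF is needed because Java bytes are signed, which is presumably the purpose of the unsignedByte helper listed above.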