Class ObjectChecker


  • public class ObjectChecker
    extends java.lang.Object
    Verifies that an object is formatted correctly.

    Verifications made by this class only check that the fields of an object are formatted correctly. The ObjectId checksum of the object is not verified, and connectivity links between objects are also not verified. It is assumed that the caller can provide both of these validations on its own.

    Instances of this class are not thread safe, but an instance may be reused to perform multiple object validations. Call reset() between validations to clear the internal state (e.g. the list returned by getGitsubmodules()).

    • Field Detail

      • tree

        public static final byte[] tree
        Header "tree "
      • parent

        public static final byte[] parent
        Header "parent "
      • author

        public static final byte[] author
        Header "author "
      • committer

        public static final byte[] committer
        Header "committer "
      • encoding

        public static final byte[] encoding
        Header "encoding "
      • object

        public static final byte[] object
        Header "object "
      • type

        public static final byte[] type
        Header "type "
      • tag

        public static final byte[] tag
        Header "tag "
      • tagger

        public static final byte[] tagger
        Header "tagger "
      • dotGitmodules

        private static final byte[] dotGitmodules
        Path ".gitmodules"
      • allowInvalidPersonIdent

        private boolean allowInvalidPersonIdent
      • windows

        private boolean windows
      • macosx

        private boolean macosx
      • gitsubmodules

        private final java.util.List<GitmoduleEntry> gitsubmodules
    • Constructor Detail

      • ObjectChecker

        public ObjectChecker()
    • Method Detail

      • setSkipList

        public ObjectChecker setSkipList​(@Nullable
                                         ObjectIdSet objects)
        Enable accepting specific malformed (but not horribly broken) objects.
        Parameters:
        objects - collection of object names known to be broken in a non-fatal way that should be ignored by the checker.
        Returns:
        this
        Since:
        4.2
      • setIgnore

        public ObjectChecker setIgnore​(@Nullable
                                       java.util.Set<ObjectChecker.ErrorType> ids)
        Configure error types to be ignored across all objects.
        Parameters:
        ids - error types to ignore. The caller's set is copied.
        Returns:
        this
        Since:
        4.2
      • setIgnore

        public ObjectChecker setIgnore​(ObjectChecker.ErrorType id,
                                       boolean ignore)
        Configure whether a single error type is ignored across all objects.
        Parameters:
        id - error type to ignore.
        ignore - true to ignore this error; false to treat the error as an error and throw.
        Returns:
        this
        Since:
        4.2
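
        The two setIgnore overloads can be pictured with a small stand-alone model. This is a sketch only, not JGit's implementation: the class IgnoreSetSketch, its isIgnored helper, and the two-value ErrorType enum are hypothetical names introduced for illustration. It shows the documented behaviour that the caller's set is copied, so later changes to that set do not affect the checker.

```java
import java.util.EnumSet;
import java.util.Set;

final class IgnoreSetSketch {
    // Hypothetical subset of error types, for illustration only.
    enum ErrorType { ZERO_PADDED_FILEMODE, EMPTY_NAME }

    private EnumSet<ErrorType> ignored = EnumSet.noneOf(ErrorType.class);

    // Replace the ignore set; the caller's set is copied, not retained.
    IgnoreSetSketch setIgnore(Set<ErrorType> ids) {
        ignored = EnumSet.noneOf(ErrorType.class);
        if (ids != null) {
            ignored.addAll(ids);
        }
        return this;
    }

    // Toggle a single error type on or off.
    IgnoreSetSketch setIgnore(ErrorType id, boolean ignore) {
        if (ignore) {
            ignored.add(id);
        } else {
            ignored.remove(id);
        }
        return this;
    }

    boolean isIgnored(ErrorType id) {
        return ignored.contains(id);
    }

    public static void main(String[] args) {
        Set<ErrorType> mine = EnumSet.of(ErrorType.ZERO_PADDED_FILEMODE);
        IgnoreSetSketch checker = new IgnoreSetSketch().setIgnore(mine);
        mine.add(ErrorType.EMPTY_NAME); // does not leak into the checker
        System.out.println(checker.isIgnored(ErrorType.EMPTY_NAME));
        checker.setIgnore(ErrorType.EMPTY_NAME, true);
        System.out.println(checker.isIgnored(ErrorType.EMPTY_NAME));
    }
}
```

        Both methods return this, so calls can be chained in the same way as the real setters.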
      • setAllowLeadingZeroFileMode

        public ObjectChecker setAllowLeadingZeroFileMode​(boolean allow)
        Enable accepting leading zero mode in tree entries.

        Some broken Git libraries generated leading zeros in the mode part of tree entries. This is technically incorrect but gracefully allowed by git-core. JGit rejects such trees by default, but may need to accept them on broken histories.

        Same as setIgnore(ZERO_PADDED_FILEMODE, allow).

        Parameters:
        allow - allow leading zero mode.
        Returns:
        this.
        Since:
        3.4
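
        To make "leading zero mode" concrete: a canonical tree entry is the octal mode in ASCII, a space, the name, a NUL byte, and the raw object id. The sketch below builds such entries and tests whether the mode starts with '0'. It is a simplified illustration, not JGit's code; ZeroPaddedModeSketch, entry, and hasZeroPaddedMode are hypothetical names, and the object id bytes are left as zeros.

```java
import java.nio.charset.StandardCharsets;

final class ZeroPaddedModeSketch {
    // Build one raw tree entry: "<mode> <name>\0" followed by a 20-byte id.
    // The id is left as zeros because only the mode matters here.
    static byte[] entry(String mode, String name) {
        byte[] head = (mode + " " + name + "\0").getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[head.length + 20];
        System.arraycopy(head, 0, out, 0, head.length);
        return out;
    }

    // True if the entry starting at ptr has a mode that begins with '0',
    // e.g. "040000" instead of the canonical "40000".
    static boolean hasZeroPaddedMode(byte[] raw, int ptr) {
        return ptr < raw.length && raw[ptr] == '0';
    }

    public static void main(String[] args) {
        System.out.println(hasZeroPaddedMode(entry("040000", "dir"), 0));
        System.out.println(hasZeroPaddedMode(entry("40000", "dir"), 0));
    }
}
```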
      • setAllowInvalidPersonIdent

        public ObjectChecker setAllowInvalidPersonIdent​(boolean allow)
        Enable accepting invalid author, committer and tagger identities.

        Some broken Git versions/libraries allowed users to create commits and tags with invalid formatting between the name, email and timestamp.

        Parameters:
        allow - if true accept invalid person identity strings.
        Returns:
        this.
        Since:
        4.0
      • setSafeForWindows

        public ObjectChecker setSafeForWindows​(boolean win)
        Restrict trees to only names legal on Windows platforms.

        Also rejects any mixed case forms of reserved names (.git).

        Parameters:
        win - true if Windows name checking should be performed.
        Returns:
        this.
        Since:
        3.4
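
        As a rough picture of what "legal on Windows" can mean for a single path segment, the sketch below applies two general Windows naming rules: certain characters are reserved, and a name may not end in a space or a dot. The exact set of rules JGit applies is not listed in this page, so treat WindowsNameSketch and both of its methods as hypothetical helpers that illustrate the idea rather than reproduce the library.

```java
import java.nio.charset.StandardCharsets;

final class WindowsNameSketch {
    // Characters Windows does not allow in a file name, plus ASCII controls.
    // '/' is omitted because it already separates segments in a Git path.
    static boolean isReservedWindowsChar(byte c) {
        switch (c) {
        case '"': case '*': case ':': case '<':
        case '>': case '?': case '\\': case '|':
            return true;
        default:
            return 1 <= c && c <= 31;
        }
    }

    // Checks the segment raw[ptr, end): end is one past the last byte.
    static boolean isLegalOnWindows(byte[] raw, int ptr, int end) {
        if (ptr >= end) {
            return false; // empty name
        }
        for (int i = ptr; i < end; i++) {
            if (isReservedWindowsChar(raw[i])) {
                return false;
            }
        }
        byte last = raw[end - 1];
        return last != ' ' && last != '.'; // no trailing space or dot
    }

    public static void main(String[] args) {
        byte[] ok = "file.txt".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "a:b".getBytes(StandardCharsets.UTF_8);
        System.out.println(isLegalOnWindows(ok, 0, ok.length));
        System.out.println(isLegalOnWindows(bad, 0, bad.length));
    }
}
```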
      • setSafeForMacOS

        public ObjectChecker setSafeForMacOS​(boolean mac)
        Restrict trees to only names legal on Mac OS X platforms.

        Rejects any mixed case forms of reserved names (.git) for users working on HFS+ in case-insensitive (default) mode.

        Parameters:
        mac - true if Mac OS X name checking should be performed.
        Returns:
        this.
        Since:
        3.4
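
        The mixed-case rule exists because a case-insensitive file system treats ".GIT" or ".Git" as the same directory as ".git". A minimal sketch of such a comparison, using ASCII case folding only, is shown below. DotGitCaseSketch and its methods are hypothetical names, and the sketch does not attempt the Unicode handling that HFS+ also performs.

```java
import java.nio.charset.StandardCharsets;

final class DotGitCaseSketch {
    // Fold ASCII upper-case letters to lower case; other bytes are unchanged.
    static byte toLowerAscii(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + ('a' - 'A')) : b;
    }

    // True if raw[ptr, end) spells ".git" in any mix of upper and lower case.
    static boolean isDotGitIgnoreCase(byte[] raw, int ptr, int end) {
        final byte[] dotGit = { '.', 'g', 'i', 't' };
        if (end - ptr != dotGit.length) {
            return false;
        }
        for (int i = 0; i < dotGit.length; i++) {
            if (toLowerAscii(raw[ptr + i]) != dotGit[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] name = ".GiT".getBytes(StandardCharsets.UTF_8);
        System.out.println(isDotGitIgnoreCase(name, 0, name.length));
    }
}
```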
      • check

        public void check​(int objType,
                          byte[] raw)
                   throws CorruptObjectException
        Check an object for parsing errors.
        Parameters:
        objType - type of the object. Must be a valid object type code in Constants.
        raw - the raw data which comprises the object. This should be in the canonical format (that is the format used to generate the ObjectId of the object). The array is never modified.
        Throws:
        CorruptObjectException - if an error is identified.
      • check

        public void check​(@Nullable
                          AnyObjectId id,
                          int objType,
                          byte[] raw)
                   throws CorruptObjectException
        Check an object for parsing errors.
        Parameters:
        id - identity of the object being checked.
        objType - type of the object. Must be a valid object type code in Constants.
        raw - the raw data which comprises the object. This should be in the canonical format (that is the format used to generate the ObjectId of the object). The array is never modified.
        Throws:
        CorruptObjectException - if an error is identified.
        Since:
        4.2
      • checkId

        private boolean checkId​(byte[] raw)
      • checkCommit

        public void checkCommit​(byte[] raw)
                         throws CorruptObjectException
        Check a commit for errors.
        Parameters:
        raw - the commit data. The array is never modified.
        Throws:
        CorruptObjectException - if any error was detected.
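
        The header constants listed under Field Detail ("tree ", "parent ", "author ", "committer ") appear at the start of lines in a canonical commit, in that order, with zero or more parent lines. The sketch below checks only that ordering. It is an illustration under that assumption, not the checks JGit performs: CommitHeaderSketch and its methods are hypothetical, and object ids and identity strings are not validated.

```java
import java.nio.charset.StandardCharsets;

final class CommitHeaderSketch {
    static final byte[] TREE = ascii("tree ");
    static final byte[] PARENT = ascii("parent ");
    static final byte[] AUTHOR = ascii("author ");
    static final byte[] COMMITTER = ascii("committer ");

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Returns the offset just past header if raw has it at ptr, else -1.
    static int match(byte[] raw, int ptr, byte[] header) {
        if (ptr < 0 || ptr + header.length > raw.length) {
            return -1;
        }
        for (int i = 0; i < header.length; i++) {
            if (raw[ptr + i] != header[i]) {
                return -1;
            }
        }
        return ptr + header.length;
    }

    // Returns the offset of the first byte after the next '\n', or raw.length.
    static int nextLine(byte[] raw, int ptr) {
        while (ptr < raw.length) {
            if (raw[ptr++] == '\n') {
                return ptr;
            }
        }
        return ptr;
    }

    // tree, then any number of parents, then author, then committer.
    static boolean hasExpectedHeaderOrder(byte[] raw) {
        int ptr = 0;
        if (match(raw, ptr, TREE) < 0) {
            return false;
        }
        ptr = nextLine(raw, ptr);
        while (match(raw, ptr, PARENT) >= 0) {
            ptr = nextLine(raw, ptr);
        }
        if (match(raw, ptr, AUTHOR) < 0) {
            return false;
        }
        ptr = nextLine(raw, ptr);
        return match(raw, ptr, COMMITTER) >= 0;
    }

    public static void main(String[] args) {
        String id = "4b825dc642cb6eb9a060e54bf8d69288fbee4904";
        byte[] raw = ascii("tree " + id + "\n"
                + "author A <a@example.com> 1 +0000\n"
                + "committer C <c@example.com> 1 +0000\n\nmsg\n");
        System.out.println(hasExpectedHeaderOrder(raw));
    }
}
```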
      • checkTag

        public void checkTag​(byte[] raw)
                      throws CorruptObjectException
        Check an annotated tag for errors.
        Parameters:
        raw - the tag data. The array is never modified.
        Throws:
        CorruptObjectException - if any error was detected.
      • duplicateName

        private static boolean duplicateName​(byte[] raw,
                                             int thisNamePos,
                                             int thisNameEnd)
      • checkTree

        public void checkTree​(byte[] raw)
                       throws CorruptObjectException
        Check a canonical formatted tree for errors.
        Parameters:
        raw - the raw tree data. The array is never modified.
        Throws:
        CorruptObjectException - if any error was detected.
      • checkTree

        public void checkTree​(@Nullable
                              AnyObjectId id,
                              byte[] raw)
                       throws CorruptObjectException
        Check a canonical formatted tree for errors.
        Parameters:
        id - identity of the object being checked.
        raw - the raw tree data. The array is never modified.
        Throws:
        CorruptObjectException - if any error was detected.
        Since:
        4.2
      • checkPath

        public void checkPath​(byte[] raw,
                              int ptr,
                              int end)
                       throws CorruptObjectException
        Check tree path entry for validity.

        Unlike checkPathSegment(byte[], int, int), this version scans a multi-directory path string such as "src/main.c".

        Parameters:
        raw - buffer to scan.
        ptr - offset to first byte of the name.
        end - offset to one past last byte of name.
        Throws:
        CorruptObjectException - path is invalid.
        Since:
        3.6
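
        The relationship between the two methods can be sketched as a loop that splits the range raw[ptr, end) on '/' and validates each piece separately. The code below is a stand-alone illustration of that idea and of the half-open offset convention (end is one past the last byte). PathScanSketch is hypothetical, and its per-segment rule (reject empty names, "." and "..") is an assumed simplification, not the full set of checks JGit applies.

```java
import java.nio.charset.StandardCharsets;

final class PathScanSketch {
    // Assumed per-segment rule: reject empty names, "." and "..".
    static boolean isValidSegment(byte[] raw, int ptr, int end) {
        int len = end - ptr;
        if (len == 0) {
            return false;
        }
        if (len == 1 && raw[ptr] == '.') {
            return false;
        }
        if (len == 2 && raw[ptr] == '.' && raw[ptr + 1] == '.') {
            return false;
        }
        return true;
    }

    // Scan raw[ptr, end) and validate every '/'-separated segment.
    static boolean isValidPath(byte[] raw, int ptr, int end) {
        int start = ptr;
        for (int i = ptr; i < end; i++) {
            if (raw[i] == '/') {
                if (!isValidSegment(raw, start, i)) {
                    return false;
                }
                start = i + 1;
            }
        }
        return isValidSegment(raw, start, end); // last segment
    }

    public static void main(String[] args) {
        byte[] path = "src/main.c".getBytes(StandardCharsets.UTF_8);
        System.out.println(isValidPath(path, 0, path.length));
    }
}
```

        Because the methods take offsets rather than a copy of the name, a caller can validate a path embedded in a larger buffer without allocating.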
      • checkPathSegment

        public void checkPathSegment​(byte[] raw,
                                     int ptr,
                                     int end)
                              throws CorruptObjectException
        Check tree path entry for validity.
        Parameters:
        raw - buffer to scan.
        ptr - offset to first byte of the name.
        end - offset to one past last byte of name.
        Throws:
        CorruptObjectException - name is invalid.
        Since:
        3.4
      • toHexString

        private static java.lang.String toHexString​(byte[] raw,
                                                    int ptr,
                                                    int end)
      • isInvalidOnWindows

        private static boolean isInvalidOnWindows​(byte c)
      • isGit

        private static boolean isGit​(byte[] buf,
                                     int p)
      • isGitmodules

        private boolean isGitmodules​(byte[] buf,
                                     int start,
                                     int end,
                                     @Nullable
                                     AnyObjectId id)
                              throws CorruptObjectException
        Check if the filename contained in buf[start:end] could be read as a .gitmodules file when checked out to the working directory. This ought to be a simple comparison, but some filesystems have peculiar rules for normalizing filenames: NTFS has backward-compatibility support for 8.3 synonyms of long file names (see https://web.archive.org/web/20160318181041/https://usn.pw/blog/gen/2015/06/09/filenames/ for details). NTFS is also case-insensitive. MacOS's HFS+ folds away ignorable Unicode characters in addition to case folding.
        Parameters:
        buf - byte array to decode
        start - position where a supposed filename is starting
        end - position where a supposed filename is ending
        id - object id for error reporting
        Returns:
        true if the filename in buf could be a ".gitmodules" file
        Throws:
        CorruptObjectException
      • matchLowerCase

        private boolean matchLowerCase​(byte[] b,
                                       int ptr,
                                       byte[] src)
      • isNTFSGitmodules

        private boolean isNTFSGitmodules​(byte[] buf,
                                         int start,
                                         int end)
      • isGitTilde1

        private static boolean isGitTilde1​(byte[] buf,
                                           int p,
                                           int end)
      • isNormalizedGit

        private static boolean isNormalizedGit​(byte[] raw,
                                               int ptr,
                                               int end)
      • match

        private boolean match​(byte[] b,
                              byte[] src)
      • toLower

        private static char toLower​(byte b)
      • isPositiveDigit

        private static boolean isPositiveDigit​(byte b)
      • normalize

        private java.lang.String normalize​(byte[] raw,
                                           int ptr,
                                           int end)
      • getGitsubmodules

        public java.util.List<GitmoduleEntry> getGitsubmodules()
        Get the list of ".gitmodules" files found in the pack. For each, report its blob id (e.g. to validate its contents) and the tree where it was found (e.g. to check if it is in the root).
        Returns:
        List of pairs of ids <tree, blob>.
        Since:
        4.7.5
      • reset

        public void reset()
        Reset the invocation-specific state of this instance. Specifically, this clears the list of .gitmodules files encountered (see getGitsubmodules()). Configuration such as errors to filter, skip lists, or the specified O.S. (set via setSafeForMacOS(boolean) or setSafeForWindows(boolean)) is NOT cleared.
        Since:
        5.2
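
        The split between configuration and invocation-specific state can be modelled with a toy class. This is a hypothetical stand-in written for illustration, not JGit's ObjectChecker: ResettableCheckerSketch, record, and getFound are invented names. It shows only the contract described above, where reset() empties the collected list and leaves the configured flags alone.

```java
import java.util.ArrayList;
import java.util.List;

final class ResettableCheckerSketch {
    private boolean safeForWindows;                        // configuration
    private final List<String> found = new ArrayList<>();  // per-run state

    ResettableCheckerSketch setSafeForWindows(boolean win) {
        safeForWindows = win;
        return this;
    }

    boolean isSafeForWindows() {
        return safeForWindows;
    }

    // Stand-in for a check that notices ".gitmodules" entries as it goes.
    void record(String name) {
        if (name.equals(".gitmodules")) {
            found.add(name);
        }
    }

    List<String> getFound() {
        return found;
    }

    // Clears per-run state only; configuration is kept.
    void reset() {
        found.clear();
    }

    public static void main(String[] args) {
        ResettableCheckerSketch checker =
                new ResettableCheckerSketch().setSafeForWindows(true);
        checker.record(".gitmodules");
        System.out.println(checker.getFound().size());
        checker.reset(); // ready for the next, unrelated validation run
        System.out.println(checker.getFound().size());
        System.out.println(checker.isSafeForWindows());
    }
}
```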