add BFG #4

Description

@adamreimer

I recently had an issue where I could not push a repository because of its size, and some of the offending files were deep in my history. I tracked them down and removed the largest files with an add-on called BFG. It is complicated but may be worth adding to the book, as I could see this situation coming up. First you need to identify the problem...

To find out what is in your index
git ls-files --stage

To identify the largest tracked files in the repository's current checkout (this does not see files that exist only in earlier commits)
git ls-files -z | xargs -0 ls -l | sort -nrk5 | head -n 10
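Since the offending files here were deep in the history, a listing that walks every commit may help as well. The following is a minimal sketch using only stock git plumbing; the function name `largest_blobs` is made up for illustration and is not part of git or BFG.

```shell
# Hypothetical helper: list the N largest blobs reachable from any ref
# (default 10), largest first. Output is "<size in bytes> <path>".
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { print substr($0, 6) }' |  # drop the leading "blob "
    sort -nr -k1,1 |
    head -n "${1:-10}"
}
```

Running `largest_blobs 20` from inside the repository should show the 20 largest blobs, including files that were deleted in later commits.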

Once you know what needs to go...
Add the file to your .gitignore

Remove it from the index
git rm --cached 'path/to/file'

Now create a copy of your repository, as this is dangerous. Download bfg-1.14.0.jar from https://rtyley.github.io/bfg-repo-cleaner/ and put the file in the repository folder.
java -jar bfg-1.14.0.jar --strip-blobs-bigger-than 50M

(The 50M part can be replaced with other sizes.) Be careful, this is dark magic. In my case it removed more than I thought it would (I wasn't sure where to find the 50M threshold in the output of the earlier commands) and deleted something I only wanted removed from the history. This was not a problem (I had a backup of the deleted file, and the other things it removed should have been removed anyway), but it was scary.
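One way to see in advance which files sit above a size threshold is to list every blob in history that exceeds it. This is a rough sketch, not a BFG feature: `blobs_bigger_than` is a hypothetical helper, and it assumes that BFG reads `50M` as 50 * 1024 * 1024 bytes. It also does not account for any files BFG may choose to leave alone.

```shell
# Hypothetical helper: print "<size in bytes> <path>" for every blob in
# history larger than the given size in MiB (default 50).
# Assumption: 1M means 1024 * 1024 bytes.
blobs_bigger_than() {
  limit=$(( ${1:-50} * 1024 * 1024 ))
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v limit="$limit" '
      $1 == "blob" && $2 > limit {
        # the path starts after "<type> <size> "
        print $2, substr($0, length($1) + length($2) + 3)
      }'
}
```

Running `blobs_bigger_than 50` before the BFG command gives a list to check against a backup.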

At a later time I realized I had committed a file which was confidential and should not be posted to GitHub (publicly or even within our organization). This time I ran:
java -jar bfg-1.14.0.jar --delete-files "Copy of Sport Fish survey comments_sg.xlsx"

I think I would do the same for file size issues in the future, as it felt safer to specify the file I wanted removed. Another weird thing: when I ran BFG with a size limit, it left a record of the file in each commit with "-DELETED" appended to the name. When I specified the file name, it was gone from the git history with no record of having ever been there.
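To confirm that a file really is gone from the history, and to drop the old objects from the local repository, something like the following may work. The two function names are hypothetical; the reflog and gc commands are the ones BFG's documentation suggests running after a cleanup. The path is taken relative to the current directory.

```shell
# Hypothetical helper: exit 0 if any commit on any ref touched this path.
path_in_history() {
  [ -n "$(git log --all --oneline -- "$1")" ]
}

# Hypothetical helper: expire reflog entries and prune unreachable objects
# so the removed blobs no longer sit in the local .git directory.
expire_and_prune() {
  git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive
}
```

For example, `path_in_history "Copy of Sport Fish survey comments_sg.xlsx" || echo "not in history"` checks the file named above. The rewritten history still has to be force-pushed to replace what is on the remote, and existing clones keep the old commits until they are re-cloned or cleaned as well.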

I still need to research the implications of doing this when the repository has already been shared.
