Cloning this repo consumes 1.4GB on disk #771
Thanks for bringing this up, @kfranqueiro. Please find below my additional investigation and comments.
Size of the repository: 1.3 GB
count: 0
size: 0
in-pack: 1383802
packs: 1
size-pack: 1362244
prune-packable: 0
garbage: 0
size-garbage: 0
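The figures above have the shape of `git count-objects -v` output, where `size-pack` is reported in KiB. As a rough sketch (the function names `kib_to_gib` and `pack_size_gib` are made up for this example), the packed size can be converted to GiB like this:

```shell
# Convert a size in KiB (the unit git uses for size-pack) to GiB.
kib_to_gib() {
  awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1024 / 1024 }'
}

# Print the packed size of the repository in the current directory, in GiB.
pack_size_gib() {
  kib=$(git count-objects -v | awk '/^size-pack:/ { print $2 }')
  kib_to_gib "$kib"
}
```

With the value reported above, `kib_to_gib 1362244` prints `1.30`, which matches the 1.3 GB quoted for the repository.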
Finding large objects
Results of script from Atlassian documentation
Main large files:
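For anyone wanting to run a similar check locally, here is a minimal sketch. It is not the Atlassian script mentioned above, but a commonly used alternative built on `git rev-list` and `git cat-file`; the function name `largest_blobs` is hypothetical.

```shell
# List the N largest blobs reachable from any ref, largest first.
# Output columns: size in bytes, object type, object id, path.
largest_blobs() {
  count="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objectsize) %(objecttype) %(objectname) %(rest)' |
    awk '$2 == "blob"' |
    sort -n -r |
    head -n "$count"
}
```

Run inside a clone, `largest_blobs 20` would list the twenty largest files anywhere in history, including files that have since been deleted from every branch.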
Possible solutions
Host videos elsewhere
Videos used in ACT rules could be hosted on https://media.w3.org/ and removed from Git history (see Atlassian documentation for removing large files).
I do not think we need to use Git LFS (at least for now): we do not use many large files (such as videos) in this project, and these files are not frequently updated, so they do not need to be tracked with Git. Hosting them directly somewhere else appears good enough to me.
Warning
From Atlassian documentation about reducing repository size:
Optimize images
As you noted, there are many PNG files in the
Many could be replaced with JPG files or smaller versions (images used in How People with Disabilities Use the Web subpages in particular). The PNG/larger versions could then be removed from the Git repository.
Squash gh-pages history (from @kfranqueiro)
After the above cleaning, I agree with the solution of regularly squashing
I would exclude a more radical approach of using this branch as an orphan branch: we want multiple commits in
Test squash with my fork
After squashing the
Maybe y'all have already communicated this: Ken is pursuing adding some videos to a new subdirectory of https://media.w3.org/wai/
@kfranqueiro @shawna-slh @iadawn I have squashed the history of the I now get the following results:
Upon cloning this repo for the first time today, I was somewhat unnerved to find that git needed to download 1.2GB of data, ultimately occupying 1.4GB of disk space.
Further digging leads to the conclusion that the gh-pages branch is the main culprit:
Moreover, most of this isn't due to files currently present on the branch, but rather files in its history:
(i.e., there is roughly 300 MB actually on the branch, while the other 1.3GB is in commit history)
The content-assets and content-images folders account for most of the presently occupied space on the branch, containing a handful of MP4 videos and many PNG images.
Possible solutions
Squash gh-pages history
Fully remedying the current state would likely require rewriting the git history of the gh-pages branch, to squash the majority of the 1.3 GB of history that has no bearing on the present contents. The silver lining is that I suspect only automated systems push to gh-pages most of the time, so this wouldn't be nearly as disruptive as having to do so on the main branch.
Git LFS for binary files
Using git for long-term storage of binary assets inevitably leads to this result. Maybe we should look into using Git LFS with this repo?
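To make the squash option under "Possible solutions" more concrete, here is a minimal sketch of one way to collapse old history while keeping the most recent commits. It assumes a linear history on the branch (merge commits would be flattened by the rebase) and a clean working tree; the function name `squash_old_history` and its arguments are hypothetical. The technique is `git commit-tree` to create a new root commit, followed by `git rebase --onto`.

```shell
# Collapse everything older than the last KEEP commits of BRANCH into a
# single new root commit, then replay the last KEEP commits on top of it.
# Assumes linear history and a clean working tree.
squash_old_history() {
  branch="$1"
  keep="$2"
  # The commit that becomes the boundary: everything up to and including
  # it is replaced by one root commit with the same tree.
  base=$(git rev-parse "$branch~$keep") || return 1
  new_root=$(git commit-tree "$base^{tree}" -m "Squash history before $base") || return 1
  # Replay the commits after the boundary onto the new root.
  git rebase -q --onto "$new_root" "$base" "$branch"
}
```

For example, `squash_old_history gh-pages 50` would leave 51 commits on the branch with an unchanged final tree. The rewritten branch would then have to be force-pushed, which is why this is less disruptive on a branch that mostly automated systems push to.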
Potential workarounds
If you want to clone the repo in its current state without pulling down over a gigabyte of data, you can clone only the default branch using git clone --single-branch repo-url. (This costs roughly 160MB of space rather than 1.4GB.)
If you need to check out a particular existing branch after doing the above, you can explicitly add only the branch you want:
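A minimal sketch of one way to do this, assuming the remote is named origin (the default after `git clone`); the function name `add_remote_branch` is hypothetical:

```shell
# Inside a clone made with --single-branch, start tracking one more remote
# branch, fetch only that branch, and check it out.
add_remote_branch() {
  branch="$1"
  # Add the branch to the list of branches fetched from origin.
  git remote set-branches --add origin "$branch" &&
    git fetch -q origin "$branch" &&
    # With origin/<branch> now present, checkout creates a local tracking branch.
    git checkout -q "$branch"
}
```

After `git clone --single-branch repo-url`, running `add_remote_branch some-branch` inside the clone downloads only the objects needed for that branch, so the large gh-pages history is not fetched unless gh-pages itself is requested.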