improve visibility date of last edit/revision of the Git Pro book #1930

Open
nbehrnd opened this issue Dec 7, 2024 · 4 comments

@nbehrnd

nbehrnd commented Dec 7, 2024

URL for broken page

https://git-scm.com/book/en/v2

Problem

A visitor to the page sees the cover of the Pro Git book, the string 2nd Edition (2014), and icons to fetch the documentation as a single pdf/epub/mobi file. This is somewhat misleading, because work on the source didn't stop in 2014: page ii of the current pdf explicitly reads Version 2.1.439, 2024-12-03. (However, there is no such note in the html version, nor in the epub file; perhaps this is a question of how the workflows are organized.) It would be better to see a note like 2nd Edition (2014, last revision 2024-12-03) at first view, right there on the web site itself, without needing to download a file.

Because I don't know how one can relay this information from the source of the Pro Git book to the web site, I am not able to prepare a PR.

Operating system and browser

Linux Debian 13/trixie (currently branch testing); Firefox 128.4.0esr (64-bit).

Steps to reproduce

not applicable here

Other details

Ben Straub agrees this could represent an improvement, see progit/progit2#1996.

@dscho
Member

dscho commented Dec 7, 2024

Because I don't know how one can relay this information from the source of the Pro Git book to the web site, I am not able to prepare a PR.

The flow from progit/progit2 to git-scm.com goes like this:

flowchart TD
    A[progit/progit2] -->|scheduled trigger| B
    B(<a href="https://github.com/git/git-scm.com/actions/workflows/update-book.yml">Update Progit Book</a> GitHub workflow) -->|pushes to| C[git/git-scm.com]
    B -->|deploys to| D[[<a href="https://git-scm.com/book/en/v2">https://git-scm.com/book/en/v2</a>]]

This Update Progit Book GitHub workflow essentially clones the progit2 repository (or any of the repositories containing the translated versions) and calls the Ruby script update-book2.rb (which uses book.rb) to render the book.

Therefore, the date would have to be updated by that script.

Do note, though, that it is not necessarily easy to determine the date of the latest update: many a time, there is a new commit in progit/progit2 but the actual contents have not changed. Have a look e.g. at 6f5f6b2: it corresponds to https://github.com/progit/progit2/compare/181ebf9cd07b0338cf659fd64267dcf6fd74dbc4..051ae88b8c5d95298b39dc1d40a2c2b55c09b04c which only updated the version of the json dependency, which resulted in an update of the PDF/ebook versions of the book (and hence in an update of the links on the front page, too), yet there has not been any actual update of any visible kind to the contents of the book.

If you can come up with a robust way to determine the date of the latest content update, then I would like to encourage you to work on this and open a PR. I could imagine that a workable strategy would be to store the date in the book's metadata and update it in the GitHub workflow, right before committing, but only when git diff --quiet --exit-code -I '^ *(sha|ebook_(pdf|epub|mobi)): [^ ]*$' -- external/book/content/${{ matrix.language.lang }} returns a non-zero exit code, which would indicate an actual content update. The book's front page would then use said metadata, most likely via something like <div style="light">2nd Edition (2014){{ if isset .Params.book "latest_update" }} (latest update: {{ .Params.book.latest_update }}){{ end }}</div>.

@nbehrnd
Author

nbehrnd commented Dec 11, 2024

@dscho I was just re-reading part of a workflow in a fork of a project which proceeds only if there was any change at all (lines 69-72 of .github/workflows/automate_xfpm.yml), which it determines by reading out the last commit hash:

          # collect the new raw data only if there was a change
          git ls-remote https://github.com/Beliavsky/Fortran-code-on-GitHub main | \
            cmp - previous_check.txt --silent || \
              wget https://raw.githubusercontent.com/Beliavsky/Fortran-code-on-GitHub/main/README.md

As you outline, "any change" in the remote repository (presuming https://github.com/git/git-scm.com.git as the local, and https://github.com/progit/progit2.git as the remote to monitor) is not a good criterion for a more complex repository like the present one. I agree with your argument. Instead, I suspect the following sequential approach to monitoring content changes might be suitable:

  • Get to know the remote repository via a clone. Because the interest is not in the managed files themselves but in their (recent) changes, a sparse checkout to fetch the git objects could suffice. It would also require less data to transfer than a full checkout.
  • Ask git log to report the last commit dates (--pretty, --date), constrained to the set of *.asc files, which are presumed to be the sole containers of content relevant to compiling the Pro Git pdf/epub/mobi. If the result of this query differs from a local record of the last check, advance; else stop.
  • If the new list from git log differs from the local log, identify the date and abbreviated commit hash of the most recent commit to an .asc file (eventually to be relayed to the web page), and update/overwrite the local log file. Then close this file, and end this check/update gracefully.
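
The last two steps could be sketched roughly as follows. This is only an illustration under assumptions: the function names latest_asc_commit and check_for_update, and the idea of a one-line record file, are hypothetical and not part of either repository. The first argument is the path to an existing local clone of the book repository.

```shell
#!/bin/sh
# Sketch only: function names and the record file are hypothetical.

# Print "<date> <abbreviated hash>" of the newest commit that touched any
# *.asc file in the clone at $1. The pathspec '*.asc' also matches files in
# subdirectories.
latest_asc_commit() {
    git -C "$1" log -1 --date=short --format='%ad %h' -- '*.asc'
}

# Compare the current result with the one stored in the record file $2.
# Prints "unchanged" if they agree; otherwise overwrites the record and
# prints "updated: <date> <hash>".
check_for_update() {
    repo=$1
    record=$2
    current=$(latest_asc_commit "$repo")
    if [ -f "$record" ] && [ "$current" = "$(cat "$record")" ]; then
        echo "unchanged"
    else
        printf '%s\n' "$current" > "$record"
        echo "updated: $current"
    fi
}
```

Since git log with a pathspec only needs commits and trees, a clone made with git clone --filter=blob:none --no-checkout might be enough for the first step; that is a substitute for the sparse checkout mentioned above and untested here.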

In comparison to my fork of the "Fortran packages monitor", I do not know yet whether a GitHub token would be useful or necessary here. It is used there because the GitHub API then offers higher performance. Maybe the GitHub API also allows collecting the commit date / commit hash of the latest commit to an .asc file of the remote repository; this remains to be tested.

@dscho
Member

dscho commented Dec 12, 2024

@nbehrnd I believe that it can be a lot simpler than that.

We already have a scheduled GitHub workflow that picks up any changes in the ProGit book repositories, and then re-renders the respective part of external/book/, commits, pushes and deploys it.

That is already in place. It runs every day, at 4:29am UTC. (I picked the time of day semi-randomly.)

So there is actually no need for any complicated test whether anything has changed: after committing the changes we have all the information we need at our fingertips:

  • git diff --quiet --exit-code -I '^ *(sha|ebook_(pdf|epub|mobi)): [^ ]*$' HEAD^ -- external/book/content/${{ matrix.language.lang }} will tell whether there was any actual content change, and
  • git -C progit-clone show -s --format=%ad delivers the author date of the latest commit in the ProGit repository, and
  • external/data/book/data/${{ matrix.language.lang }}.yml is the YAML-formatted file where we can store the date as latest_update attribute.

At this stage, I think it would only need someone to actually go implement it, verify that it works as intended, and then open a PR.
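
As a starting point, the three pieces above might be combined along these lines. This is a rough, unverified sketch, not the workflow's actual code: the function name record_latest_update, its parameters, the --date=short formatting, and the assumption that latest_update is a top-level key in the YAML file are all made up for illustration. The -I option requires Git 2.30 or newer.

```shell
#!/bin/sh
# Sketch only: record_latest_update and its parameters are hypothetical.
# Usage: record_latest_update SITE_DIR CONTENT_DIR BOOK_CLONE YAML_FILE
#   SITE_DIR    checkout of the site repository, after the commit was made
#   CONTENT_DIR rendered book directory, relative to SITE_DIR
#   BOOK_CLONE  clone of the ProGit repository (e.g. progit-clone)
#   YAML_FILE   metadata file that should carry the latest_update attribute
record_latest_update() {
    site_dir=$1
    content_dir=$2
    book_clone=$3
    yaml_file=$4

    # Exit code 0: the last commit changed nothing except lines matching
    # the sha/ebook link pattern, so the stored date stays as it is.
    if git -C "$site_dir" diff --quiet --exit-code \
        -I '^ *(sha|ebook_(pdf|epub|mobi)): [^ ]*$' \
        HEAD^ -- "$content_dir"
    then
        echo "no content change"
        return 0
    fi

    # Author date of the latest commit in the book repository.
    # --date=short (YYYY-MM-DD) is an assumption about the desired format.
    latest=$(git -C "$book_clone" show -s --date=short --format=%ad)

    # Drop any previous latest_update line, then append the new one.
    # Assumes a flat, top-level key; adjust if the attribute is nested.
    sed '/^latest_update:/d' "$yaml_file" > "$yaml_file.tmp"
    printf 'latest_update: %s\n' "$latest" >> "$yaml_file.tmp"
    mv "$yaml_file.tmp" "$yaml_file"
    echo "latest_update set to $latest"
}
```

In the workflow this would run after the commit step, so that HEAD^ refers to the state before the re-render; the changed YAML file would then still need to be committed or amended into that commit.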

@dscho
Member

dscho commented Jan 12, 2025

@nbehrnd are you planning on working on this? If not, I think we should close this ticket.
