
How to make a Git repo from a Google Doc

Published on Feb 28, 2023 by Arpad Ray

It occurred to me the other day that Google Docs' revision history has a lot in common with version control systems like Git, and it might be useful to actually generate a Git repo from a doc so you can use your preferred Git tools to inspect the history and changes.

Some tasks, like finding out who changed a certain line, are vastly easier with a Git repo (simply git blame) than in the UI provided by Google Docs.

Using GitKraken's blame view to browse the history of an example doc

Fortunately the revision history is easily accessible using the Google Drive API, although it's a bit of a hassle doing the OAuth dance.
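
As a rough illustration of what "accessible using the Google Drive API" means in practice, here is a minimal sketch that builds the request URL for one page of a file's revisions. It assumes the Drive API v3 `revisions.list` endpoint and its `fields` and `pageToken` query parameters. The function name `revisionsListUrl` and the exact field selection are my own choices for illustration, not necessarily what doc2git does.

```typescript
// Assumption: these are the revision fields a tool like this would need.
// Drive API v3 only returns fields that are explicitly requested.
const REVISION_FIELDS =
  "nextPageToken,revisions(id,modifiedTime,lastModifyingUser(displayName,emailAddress),exportLinks)";

// Hypothetical helper: build the URL for one page of revisions of a file.
function revisionsListUrl(fileId: string, pageToken?: string): string {
  const base =
    "https://www.googleapis.com/drive/v3/files/" +
    encodeURIComponent(fileId) +
    "/revisions";
  const params: string[] = ["fields=" + encodeURIComponent(REVISION_FIELDS)];
  if (pageToken !== undefined) {
    // Pass the nextPageToken from the previous response to get the next page.
    params.push("pageToken=" + encodeURIComponent(pageToken));
  }
  return base + "?" + params.join("&");
}
```

The request itself would be sent with the OAuth access token in an `Authorization: Bearer ...` header, repeating with each `nextPageToken` until none is returned.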

I've built a quick web app which does all this, creating a Git repo in your browser with a commit for each revision with the correct time and author: doc2git

If you'd prefer to do the OAuth setup and run it yourself then you can check out doc2git on GitHub.

Roughly, the steps are:

  1. Sign in with Google and use OAuth to get the drive.readonly scope. This is done client-side, so the access token isn't saved anywhere; Google's OAuth library updates the Drive API client automatically.
  2. Create a Git repo (using isomorphic-git)
  3. Retrieve all the revisions for the selected doc. Each revision has a set of URLs which can be used to "export" the doc at that revision. I'm currently just retrieving the text/plain version since that's easiest to diff, but there are also other formats like text/rtf and even application/pdf. The content of each revision is downloaded from its export URL, saved as doc.txt, and then committed with the author name, email address and time from the revision.
  4. Create a zip file of the repo (using zip.js) including the .git directory
  5. Provide a download link to the zip file
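
To make step 3 more concrete, here is a minimal sketch of turning one revision into the details needed for a commit. The `DriveRevision` shape follows the Drive API v3 revision resource as I understand it (`modifiedTime`, `lastModifyingUser`, `exportLinks`). The `planCommit` helper, the commit message format and the fallback author values are hypothetical. The `author` object is shaped to match what isomorphic-git's `commit` accepts, with `timestamp` in seconds since the Unix epoch.

```typescript
// Subset of the Drive API v3 revision resource used in this sketch.
interface DriveRevision {
  id: string;
  modifiedTime: string; // RFC 3339 date-time
  lastModifyingUser?: { displayName?: string; emailAddress?: string };
  exportLinks?: { [mimeType: string]: string };
}

// Hypothetical intermediate type: what to download and how to commit it.
interface CommitPlan {
  exportUrl: string;
  message: string;
  author: { name: string; email: string; timestamp: number };
}

function planCommit(rev: DriveRevision): CommitPlan {
  // Pick the plain text export, since that is the easiest format to diff.
  const exportUrl = rev.exportLinks ? rev.exportLinks["text/plain"] : undefined;
  if (exportUrl === undefined) {
    throw new Error("Revision " + rev.id + " has no text/plain export link");
  }
  const millis = Date.parse(rev.modifiedTime);
  if (isNaN(millis)) {
    throw new Error("Revision " + rev.id + " has an unparseable modifiedTime");
  }
  const user = rev.lastModifyingUser;
  return {
    exportUrl,
    message: "Revision " + rev.id, // assumed message format
    author: {
      // Assumed fallbacks for revisions without visible author details.
      name: (user && user.displayName) || "Unknown",
      email: (user && user.emailAddress) || "unknown@example.com",
      timestamp: Math.floor(millis / 1000), // Git timestamps are in seconds
    },
  };
}
```

The content fetched from `exportUrl` would then be written to doc.txt, staged, and committed with `message` and `author`.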

It can take a while to process the revisions for docs with lots of history. This could be made a lot faster by downloading several revisions at once, but of course they would have to be put back into order for the Git commits to make sense. I haven't bothered for now; I'm happy to leave it running and make a cup of tea in the meantime!
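
For what it's worth, the "several at once, then back in order" idea can be sketched fairly compactly. This is not something doc2git does today; `mapInOrder` is a hypothetical helper that runs a limited number of downloads concurrently and stores each result at the index of its input, so the output order matches the revision order regardless of which download finishes first.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight,
// returning results in the same order as the inputs.
async function mapInOrder<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each lane repeatedly claims the next unclaimed item until none are left.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next;
      next += 1;
      // Writing by index is what restores the original order.
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) {
    lanes.push(runLane());
  }
  await Promise.all(lanes);
  return results;
}
```

Here `worker` would download one revision's exported text. The commits themselves would still be created one after another by looping over the returned array, since each commit depends on its parent.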

What is Repography?

Repography is a web app which creates data visualizations for your Git repos. You can use our command line script to try out Repography on any Git repo, or install our GitHub app.

Our dashboards are kept up-to-date automatically and are designed to be embedded in your README.md.

Our posters are available for purchase both as downloads ($5) and as framed prints (from $109).
