The git-media extension lets Git store blobs in separate storage outside the repository.

Conditions:

  • Use AWS S3 for storage

Setup AWS S3 (remote)

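  • Create an S3 bucket, here called foo. Bucket names are globally unique, so substitute your own and use the same name in the IAM policy and in git-media.s3bucket.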

Setup AWS IAM users and groups (remote)

  • Create users (johnd, etc.).
  • Create credentials for each user and store the “access key” and “secret key” somewhere safe.
  • Create the group foo.
  • Attach an appropriate policy to foo, such as the following.

    {
       "Version": "2012-10-17",
       "Statement": [
         {
           "Effect": "Allow",
           "Action": ["s3:ListAllMyBuckets"],
           "Resource": "arn:aws:s3:::*"
         },
         {
           "Effect": "Allow",
           "Action": [
             "s3:ListBucket",
             "s3:GetBucketLocation"
           ],
           "Resource": "arn:aws:s3:::foo"
         },
         {
           "Effect": "Allow",
           "Action": [
             "s3:PutObject",
             "s3:GetObject",
             "s3:DeleteObject"
           ],
           "Resource": "arn:aws:s3:::foo/*"
         }
       ]
    }
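
The console steps above can also be scripted. This is a minimal sketch, assuming the AWS CLI is installed and configured with sufficiently privileged credentials and that the policy above is saved as policy.json. The policy name git-media-s3 and the step that adds johnd to the group are assumptions, not something stated above. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: create the bucket, user, group, and policy with the AWS CLI.
# Assumptions: policy.json holds the policy shown above; "git-media-s3"
# is a hypothetical policy name; bucket, group, and user names are
# placeholders.
BUCKET=foo
GROUP=foo
IAM_USER=johnd
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_aws() {
  run aws s3 mb "s3://$BUCKET"
  run aws iam create-user --user-name "$IAM_USER"
  run aws iam create-group --group-name "$GROUP"
  # Assumption: the user inherits the policy through group membership.
  run aws iam add-user-to-group --group-name "$GROUP" --user-name "$IAM_USER"
  run aws iam put-group-policy --group-name "$GROUP" \
      --policy-name git-media-s3 --policy-document file://policy.json
  # Prints the access key and secret key once; store them safely.
  run aws iam create-access-key --user-name "$IAM_USER"
}

setup_aws
```

Reviewing the printed commands before running with DRY_RUN=0 is a cheap way to catch a wrong bucket or group name.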
    

Setup git-media (local)

Set up a Ruby environment, since git-media is distributed as a Ruby gem (for now).

Set up rbenv (my choice here).

% sudo apt-get install rbenv ruby-build
% $EDITOR ~/.zshrc
...
# rbenv
eval "$(rbenv init -)"
...
% source ~/.zshrc

Set up git-media. I am using Bundler here.

% mkdir testdir && cd testdir
% rbenv local x.x.x-pxxx
% rbenv exec gem install --no-ri --no-rdoc bundler
% $EDITOR Gemfile
...
% cat Gemfile
source 'https://rubygems.org'
gem 'trollop'
gem 's3'
gem 'ruby-atmos-pure'
gem 'right_aws'
gem 'git-media'
% rbenv exec bundle install
% rbenv rehash
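
If you prefer not to open an editor, the Gemfile can be written from the shell. This is only a sketch of the same file content as above, using a here-document; run it inside testdir before bundle install.

```shell
#!/bin/sh
# Write the Gemfile non-interactively (same content as shown above).
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > Gemfile <<'EOF'
source 'https://rubygems.org'
gem 'trollop'
gem 's3'
gem 'ruby-atmos-pure'
gem 'right_aws'
gem 'git-media'
EOF
```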

rbenv rehash creates ~/.rbenv/shims/git-media, so you can invoke git-media simply by running “git media foo”, where foo is a subcommand. (Remember to add the eval "$(rbenv init -)" line to your .bashrc or .zshrc.)

Check that git-media can be invoked as git media foo.

% ls -al ~/.rbenv/shims/git-media
% echo $PATH
% git media status
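
Git runs "git media" by looking for an executable named git-media on PATH, so the manual checks above can be folded into a small helper. This is a minimal sketch; check_cmd is a hypothetical function name.

```shell
#!/bin/sh
# Report whether a command is resolvable through PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

check_cmd git
check_cmd git-media ||
  echo 'hint: run "rbenv rehash" and check that ~/.rbenv/shims is on PATH'
```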

Setup Git repository (local)

Create and configure a Git repository.

% git init testrepo && cd testrepo

% git config filter.media.clean 'git-media filter-clean'
% git config filter.media.smudge 'git-media filter-smudge'

% git config media.auto-download false
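
To see what a clean filter does, here is a toy sketch that does not use git-media at all. It assumes a hypothetical filter named demo whose clean command is git hash-object --stdin, so Git stores a hash in place of the file content. git-media's real filters also move the content to separate storage, which this sketch does not attempt.

```shell
#!/bin/sh
# Toy demonstration of a Git clean filter, run in a throwaway repository.
# "demo" is a hypothetical filter name; git-media itself is not involved.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .

# The clean filter receives the file content on stdin and prints what
# Git should store instead. Here it prints the content's SHA-1.
git config filter.demo.clean 'git hash-object --stdin'
echo '*.pdf filter=demo' > .gitattributes

echo 'Fake PDF file.' > foo.pdf
git add foo.pdf

# The working tree still holds the original content...
cat foo.pdf
# ...while the staged blob holds only the 40-character hash.
staged=$(git cat-file -p :foo.pdf)
echo "$staged"
```

A smudge filter is the reverse direction: it receives the stored blob on checkout and prints what should appear in the working tree.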

Prepare .gitattributes.

% $EDITOR .gitattributes
% cat .gitattributes
*.pdf filter=media -crlf
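
Before committing anything, it can help to confirm which paths the attribute line matches. This sketch uses git check-attr in a throwaway repository; the file names are only examples and do not need to exist.

```shell
#!/bin/sh
# Confirm that the attribute line matches the intended paths.
# Runs in a throwaway repository so it does not touch real work.
set -e
attr_dir=$(mktemp -d)
cd "$attr_dir"
git init -q .
echo '*.pdf filter=media -crlf' > .gitattributes

# git check-attr prints "<path>: <attribute>: <value>".
pdf_attr=$(git check-attr filter -- foo.pdf)
txt_attr=$(git check-attr filter -- README)
echo "$pdf_attr"   # foo.pdf: filter: media
echo "$txt_attr"   # README: filter: unspecified
```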

Configure git-media for S3.

% git config git-media.transport s3
% git config git-media.s3user johnd
% git config git-media.s3bucket foo
% $EDITOR .git/config
... (add s3key and s3secret to the [git-media] section) ...

% cat .git/config
[core]
...
[git-media]
         transport = s3
         s3user = johnd
         s3bucket = foo
         s3key = ....................
         s3secret = ........................................
[filter "media"]
         clean = git-media filter-clean
         smudge = git-media filter-smudge
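
The two secret values can also be set with git config rather than by editing the file by hand. This is a minimal sketch, assuming the keys are exported beforehand in the environment variables S3_KEY and S3_SECRET (hypothetical names); set_s3_credentials is likewise a hypothetical helper and must be run inside the repository.

```shell
#!/bin/sh
# Store the S3 credentials with "git config" instead of editing
# .git/config by hand. S3_KEY and S3_SECRET are hypothetical environment
# variable names; export them before calling the function.
set_s3_credentials() {
  # Abort with a message if either variable is unset or empty.
  : "${S3_KEY:?set S3_KEY to the access key}"
  : "${S3_SECRET:?set S3_SECRET to the secret key}"
  git config git-media.s3key "$S3_KEY"
  git config git-media.s3secret "$S3_SECRET"
  # .git/config now holds secrets; make it readable by the owner only.
  chmod 600 "$(git rev-parse --git-dir)/config"
}
```

Calling set_s3_credentials from within testrepo writes the same s3key and s3secret entries shown above. Since .git/config is not part of any commit, the keys stay local to this clone.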

Test

% echo 'readme' > README
% git add README
% git commit -m 'Initial commit'

% git add .gitattributes
% git commit -m 'Add .gitattributes'

% echo 'Fake PDF file.' > foo.pdf
% git add foo.pdf
% git commit -m 'Add foo.pdf.'

% git media sync
% git media status
== Expanded Media ==
    (..b)    foo.pdf

== Already Pushed Media ==
    (..b)    ........
%

Resources

  • https://github.com/schacon/git-media
  • http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html#iampolicy-example-s3
  • http://endorno.github.io/blog/2013/10/06/how-to-use-git-media/ (Japanese)

Alternatives

Other software with a similar purpose:

  • git-annex
  • git-fat

Pros and cons of git-media:

  • Pros:
    • Relatively simple.
    • Real files in the working directory, not symlinks.
    • Supports several types of storage, including S3.
  • Cons:
    • Depends on a few Ruby gems.