Hosting Maven Repos on Github

UPDATE: If you’re using Clojure and Leiningen, read no further. Just use s3-wagon-private to deploy artifacts to S3. (The deployed artifacts can be private or public, depending on the scheme you use to identify the destination bucket, i.e. s3://... vs. s3p://....)

Hosting Maven repos has gotten easier and easier over the years.  We've run the free version of Nexus for a couple of years now, which owns all the other options feature-wise as far as I can tell, and is a cinch to set up and maintain.  There's a raft of alternatives, too: other free Maven repository servers, plain-Jane FTP, and various recipes on the 'nets for serving up a Hudson instance's local Maven repository for remote usage.  Finally, Sonatype began offering free Maven repository hosting (via Nexus) for open source projects earlier this year, which comes with automatic syncing/promotion to Maven Central if you meet the attendant requirements.

Despite all these options, I continue to run into people who are intimidated by the notion of running a Maven repo to support their own projects – something that is increasingly necessary in the Clojure community, where all of the build tools at hand (clojure-maven-plugin, Leiningen, Clojuresque [a Gradle plugin], and those brave souls that use Ant + Ivy) require Maven-style artifact repositories.  Some recent discussions on this topic reminded me of a technique I came across a few months ago for hosting a Maven repository on Google Code (also available for those that use Kenai).  This approach (ab)uses a Google Code subversion repo as a Maven repo (over either webdav or svn protocols). At the time, I thought that it would be nice to do the same with Github, since I've long since sworn off svn, but I didn't pursue it any further then.

So, perhaps people might find hosting Maven artifacts on Github more approachable than running a proper Maven repository.  Thankfully, it's remarkably easy to get set up; there's no rocket science here, and the approach is fundamentally the same as using a subversion repository as a Maven repo, but a walkthrough is warranted nonetheless.  I'll demonstrate the workflow using clutch, the Clojure CouchDB library that I've contributed to a fair bit.

1. Set up/identify your Maven repo project on Github

You need to figure out where you're going to deploy (and then host) your project artifacts.  I've created a new repo, and checked it out at the root of my dev directory.

[catapult:~/dev] chas% git clone git@github.com:cemerick/cemerick-mvn-repo.git
Initialized empty Git repository in ~/dev/cemerick-mvn-repo/.git/
warning: You appear to have cloned an empty repository.

Because Maven namespaces artifacts based on their group and artifact IDs, you should probably have only one Github-hosted Maven repository for all of your projects and other miscellaneous artifact storage.  I can't see any reason to have a repository-per-project.

2. Set up separate snapshots and releases directories.

Snapshots and releases should be kept separate in Maven repositories.  This isn't a technical necessity, but will generally be expected by your repo's consumers, especially if they're familiar with Maven.  (Repository managers such as Nexus actually require that individual repositories' types be declared upon creation, as either snapshot or release.)

[catapult:~/dev] chas% cd cemerick-mvn-repo/
[catapult:~/dev/cemerick-mvn-repo] chas% mkdir snapshots
[catapult:~/dev/cemerick-mvn-repo] chas% mkdir releases

3. Deploy your project's artifacts to your Maven repo

A properly-useful pom.xml contains a <distributionManagement> configuration that specifies the repositories to which one's project artifacts should be deployed.  If you're only going to use Github-hosted Maven repositories, then we just need to stub this configuration out (doing this will not be necessary in the future [1]):

<distributionManagement>
	<repository>
		<id>repo</id>
		<url>https://github.com/cemerick/cemerick-mvn-repo/raw/master/releases</url>
	</repository>
	<snapshotRepository>
		<id>snapshot-repo</id>
		<url>https://github.com/cemerick/cemerick-mvn-repo/raw/master/snapshots</url>
	</snapshotRepository>
</distributionManagement>

Usually, URLs provided here would describe a Maven repository server's API endpoint (e.g. a webdav URL, etc).  That's obviously not available if Github is going to be hosting the contents of the Maven repos, so I'm just using the root URLs from which my git Maven repos will be served; as a side effect, this will cause mvn deploy to fail if I don't provide a path to my clone of the Github Maven repo.

Now let's run the clutch build and deploy our artifacts (which handily implies running all of the project's tests), providing a path to our repo's clone directory using the altDeploymentRepository system property (heavily edited console output below) [2]:

[catapult:~/dev/cemerick-mvn-repo/] chas% cd ../vendor/clutch
[catapult:~/dev/vendor/clutch] chas% mvn -DaltDeploymentRepository=snapshot-repo::default::file:../../cemerick-mvn-repo/snapshots clean deploy
[INFO] Building jar: ~/dev/vendor/clutch/target/clutch-0.2.3-SNAPSHOT.jar
[INFO] Using alternate deployment repository snapshot-repo::default::file:../../cemerick-mvn-repo/snapshots
[INFO] Retrieving previous build number from snapshot-repo
Uploading: file:../../cemerick-mvn-repo/snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar
729K uploaded  (clutch-0.2.3-SNAPSHOT.jar)
[INFO] BUILD SUCCESSFUL

That looks happy-making.  Let's take a look:

[catapult:~/dev/vendor/clutch] chas% find ~/dev/cemerick-mvn-repo/snapshots
/Users/chas/dev/cemerick-mvn-repo/snapshots/
/Users/chas/dev/cemerick-mvn-repo/snapshots//com
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar.md5
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar.sha1
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom.md5
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom.sha1
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml.md5
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml.sha1
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/maven-metadata.xml
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/maven-metadata.xml.md5
/Users/chas/dev/cemerick-mvn-repo/snapshots//com/ashafa/clutch/maven-metadata.xml.sha1

That there is a Maven repository.  Just to briefly dissect the altDeploymentRepository argument:

snapshot-repo::default::file:../../cemerick-mvn-repo/snapshots

It's a three-part descriptor of sorts:

  1. snapshot-repo is the ID of the repository we're defining, and can refer to one of the repositories specified in the <distributionManagement> section of the pom.xml.  This allows one to change a repository's URL while retaining other <distributionManagement> configuration that might be set.
  2. default is the repository layout; unless you're monkeying with Maven 1-style repositories (hardly anyone is these days), this is the value you want.
  3. file:../../cemerick-mvn-repo/snapshots is the actual repository URL, and has to be either absolute or relative to the root of your project.  Shell shorthand such as ~ won't be expanded here.
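
Since the URL part is the piece that is easiest to get wrong, one option is to build the descriptor from a resolved absolute path. What follows is only a minimal sketch, assuming a POSIX shell; alt_repo_descriptor is a hypothetical helper name (not part of Maven), and newer versions of the deploy plugin may expect a different descriptor format than the three-part one used in this post, so check the documentation for the version you run.

```shell
#!/bin/sh
# Build an altDeploymentRepository descriptor (id::layout::url) from a
# directory that may be given as a relative path.
# alt_repo_descriptor is an illustrative helper, not something Maven provides.
alt_repo_descriptor() {
    repo_id=$1     # repository ID, e.g. snapshot-repo
    repo_dir=$2    # existing directory inside the clone of the Maven repo
    # Resolve to an absolute path; fail if the directory does not exist.
    abs_dir=$(cd "$repo_dir" && pwd) || return 1
    printf '%s::default::file:%s\n' "$repo_id" "$abs_dir"
}

# Example invocation (not executed here):
#   mvn "-DaltDeploymentRepository=$(alt_repo_descriptor snapshot-repo ../../cemerick-mvn-repo/snapshots)" clean deploy
```

Resolving the path up front means the same command works regardless of which directory the build is started from, at the cost of requiring the target directory to exist beforehand.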

4. Push to Github

Remember that your Maven repo is just like any other git repo, so changes need to be committed and pushed up in order to be useful.

[catapult:~/dev/cemerick-mvn-repo] chas% git add *
[catapult:~/dev/cemerick-mvn-repo] chas% git commit -m "clutch 0.2.3-SNAPSHOT"
[master f177c06] clutch 0.2.3-SNAPSHOT
 12 files changed, 164 insertions(+), 2 deletions(-)
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar.md5
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.jar.sha1
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom.md5
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/clutch-0.2.3-SNAPSHOT.pom.sha1
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml.md5
 create mode 100644 snapshots/com/ashafa/clutch/0.2.3-SNAPSHOT/maven-metadata.xml.sha1
[catapult:~/dev/cemerick-mvn-repo] chas% git push origin master
Counting objects: 24, done.
Delta compression using 2 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (19/19), 669.07 KiB, done.
Total 19 (delta 1), reused 0 (delta 0)
To git@github.com:cemerick/cemerick-mvn-repo.git
 f57ccba..f177c06  master -> master
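
The deploy, commit, and push steps can be chained in a small wrapper so that none of them gets forgotten. This is a sketch under stated assumptions rather than a tested release process: run and publish_snapshot are hypothetical names, the snapshot-repo ID and master branch are taken from the walkthrough above, and the repo clone is assumed to be passed as an absolute path. Setting DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Sketch: deploy into a local clone of the git-hosted Maven repo, then
# commit and push the result. All function names here are illustrative.

# Print a command; execute it unless DRY_RUN is set to 1.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

publish_snapshot() {
    project_dir=$1   # directory containing the project's pom.xml
    repo_clone=$2    # absolute path to the clone of the Maven repo
    message=$3       # commit message, e.g. "clutch 0.2.3-SNAPSHOT"

    # Each step runs only if the previous one succeeded.
    (cd "$project_dir" &&
        run mvn "-DaltDeploymentRepository=snapshot-repo::default::file:$repo_clone/snapshots" clean deploy) &&
    run git -C "$repo_clone" add snapshots &&
    run git -C "$repo_clone" commit -m "$message" &&
    run git -C "$repo_clone" push origin master
}
```

For example, DRY_RUN=1 followed by publish_snapshot ~/dev/vendor/clutch ~/dev/cemerick-mvn-repo "clutch 0.2.3-SNAPSHOT" would list the four commands without touching either repository (the shell expands ~ before the function sees it).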

5. Use your new Maven repository

Your repository's root will be at

https://github.com/<your-github-username>/<your-github-maven-project>/raw/master/

Just append snapshots or releases to that root URL, as appropriate for your project's dependencies.

You can use your Github-hosted Maven repository in all the same ways as you would use a "normal" Maven repo – configure projects to depend on artifacts from it, proxy and aggregate it with Maven repository servers like Nexus, etc. The most common case of projects depending upon artifacts in the repo only requires a corresponding <repository> entry in their pom.xml, e.g.:

<repositories>
    <repository>
        <id>cemerick-snapshots</id>
        <url>https://github.com/cemerick/cemerick-mvn-repo/raw/master/snapshots</url>
    </repository>
</repositories>
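
It can help to know exactly which URL a consumer's build will request, for instance when checking by hand that a push made an artifact reachable. The sketch below maps Maven coordinates onto the default repository layout seen in the find output earlier; artifact_path and artifact_url are hypothetical helper names. It assumes file names of the form artifactId-version.ext, which holds for releases and for the non-timestamped snapshot shown in this post, but not for timestamped snapshot builds.

```shell
#!/bin/sh
# Map Maven coordinates to a path in the default repository layout:
#   <groupId, dots as slashes>/<artifactId>/<version>/<artifactId>-<version>.<ext>
# Both helpers are illustrative names, not part of any tool.
artifact_path() {
    group_path=$(printf '%s' "$1" | tr '.' '/')
    # $2 = artifactId, $3 = version, $4 = extension (defaults to jar)
    printf '%s/%s/%s/%s-%s.%s\n' "$group_path" "$2" "$3" "$2" "$3" "${4:-jar}"
}

# Prefix the layout path with a repository root URL.
artifact_url() {
    repo_root=${1%/}   # drop one trailing slash, if present
    shift
    printf '%s/%s\n' "$repo_root" "$(artifact_path "$@")"
}
```

Calling artifact_url with the snapshots root above and the coordinates com.ashafa clutch 0.2.3-SNAPSHOT pom yields the URL of the deployed pom, which can then be fetched with any HTTP client to confirm the repository is being served.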

Advantages

  1. Administrative simplicity: there are no servers to maintain, no additional accounts to obtain (i.e. compared to using Sonatype's OSS Nexus hosting service), and the workflow is very familiar (at least to those that use git).
  2. Configuration simplicity: compared to (ab)using Google Code, Kenai, or any other subversion host as a Maven repository, the project configuration described above is far simpler.  Subversion options require adding a build extension (for either wagon-svn or wagon-webdav) and specifying svn credentials in one's ~/.m2/settings.xml.
  3. Tighter alignment with the "real world": ideally, every artifact would be in Maven Central, and every project would use Hudson and deploy SNAPSHOT artifacts to a Maven repo.  In reality, if you want to depend upon the bleeding edge of most projects – which aren't regularly built in a continuous integration environment and which don't regularly throw off artifacts from HEAD – having an artifact repository that is (roughly) co-located with the source repository that you control is very handy.  This is true even if you have your own hosted Maven repo, such as Nexus; especially for SNAPSHOTs of "vendor" artifacts, it's often easier to simply deploy to a local clone of your git-hosted Maven repo and push that than it is to recall the URL for your Maven server's third-party snapshots repository, or constantly be adding/modifying a <distributionManagement> element to the projects' pom.xml.

Caveats / Cons

  1. This practice may be considered harmful by some.  Quoting here from comments on another description of how to host Maven repositories via svn:

    This basically introduces microrepository. Users of these projects require either duplicate entries in their pom: one for the dependency and one for the repository, or put great burden on the maintainer of the local repository to add each microrepository by hand....So please instead of this solution have your Maven build post artifacts to a real repository like Java’s, Maven’s or Sontatype’s.
    – tbee

    Please do NOT use this approach. Sonatype provide free hosting of a Maven Repo for open source projects and BONUS! you get syncing to Maven Central too for free!!!
    – Stephen Connolly

    tbee's point is that the more "microrepositories" there are, the more work there is for those who maintain their own "proper" Maven repository servers.  All of those servers provide a proxying feature that allows their users (almost always in a corporate environment, as a hedge against network issues, artifact rot, and security/provenance issues, among other things) to specify only one source repository in their projects' pom.xml configurations, even if artifacts actually originate from dozens or hundreds of upstream repositories.  I don't really buy that argument, insofar as the cat is definitely out of the bag vis-à-vis a proliferation of Maven repositories. Our Nexus install proxies around 15 Maven repositories in addition to Maven Central, and I suspect that that's a very, very low number compared to the typical Nexus site. I'll bet there are hundreds – perhaps thousands – of moderately-active Maven repositories in the world already.

    I agree with Stephen that if you are willing to get set up with deployment rights to Sonatype's OSS Maven repo, you should do so.  Not everyone is though – and I'd rather see people using dependency management than not, and sharing SNAPSHOT builds rather than not. In any case, if you use this method, know that you've strayed from the Maven pack to some extent.

  2. The notion of having to commit the results of a mvn deploy invocation is very foreign.  This is particularly odd for release artifacts, the deployment of which is ostensibly transactional (yes, the git commit / git push workflow is atomic as far as other users of the git/Maven repository are concerned, but I'm referring to the deployer's perspective here).  The subversion-based Maven hosting arrangements don't suffer from this workflow oddity, which is nice.  I suppose one could add a post-commit hook to immediately push results, but that's just illustrating the law of conservation of strangeness, sloughing off unusual semantics from Maven-land to the Realm of git.  You could fall back to using Github's svn write support, but then you're back in subversion-land, with the configuration complexity I noted earlier.
  3. Without a proper Maven repository server receiving deployed artifacts, there will never be any indexes offered by these git-hosted Maven repos.  The same goes for subversion-hosted Maven repos as well. Such indexes are a bit of a niche feature, but are well-liked by those using Maven-capable tooling (such as the Eclipse and NetBeans IDEs).  It would be possible to generate and update those indexes in conjunction with the deployment process, but that would likely require a new Maven plugin – a configuration complication, and perhaps enough additional work to make deploying to Sonatype's OSS repo or hosting one's own Maven repository worthwhile.

Footnotes

  1. The fact that Maven currently requires a stubbed-out <distributionManagement> configuration is a known bug, slated to be fixed for Maven 2.5. Specifying the deployment repository location via the altDeploymentRepository property will then be sufficient.
  2. An alternative to using the altDeploymentRepository option would be to make the git Maven repository a submodule of the project repository.  This would require that the git Maven repo be the canonical Maven repo for the project, and would imply that all the usual git submodule gymnastics be used when deploying to the git Maven repo.  I've not tried this workflow myself, but it might be worth experimenting with.
This entry was posted in Clojure, geek, Maven, Random Software Geekery.

33 Responses to Hosting Maven Repos on Github

  1. Brian Fox says:

    Although I was skeptical of the content when I saw the title, this is a good and balanced description of the available options. Naturally I would prefer you use our free hosted solution, but if you’re just getting started on a project I would agree that’s overkill and admin overhead that isn’t needed in the beginning.

    Once you start to have users outside of your project consuming release artifacts, that would be the proper time to get setup to host and sync your artifacts to Central. It makes everyone’s life easier in the long run and should dramatically lower the barrier to entry for your users.

    Just keep in mind that the requirements we place on artifacts for Central are there for the public good. We try to make them not cumbersome by providing poms you can use to inherit the correct setup.

    • Chas Emerick says:

      Thanks for your kind words, Brian. I was somewhat expecting that this approach wouldn’t find much of any favor over at Sonatype. :-)

      AFAICT, not many people necessarily care about syncing to Central, but they would be very happy to have access to a proper repository. If I were to make two suggestions, the first one would be to split off the items related to syncing to Central from the OSS guide. It’s not clear (even to me) whether conforming to the Central sync requirements is necessary to use the OSS repo at all; if it is, then that’s unfortunate. If not, that should be clarified.

      Second, especially for those that don’t care about syncing to central, a path to getting set up with OSS repo privs that doesn’t require what looks like a very convoluted JIRA ticket process (compared to an account signup with any other online service) would probably put OSS repo usage through the roof.

      Just one data point: I have about a half-dozen internal projects that I plan on open-sourcing over the next few months. I will give getting set up in the OSS repo a shot, but I suspect (just based on the info in the OSS repo setup guide) that I’m going to find the process far too onerous to go through for that smattering of fairly minor libraries, regardless of how many external users they garner – and this is coming from a mostly happy user of Nexus, Maven, etc. *shrug*

  2. Pingback: links for 2010-08-25 – Magpiebrain

  3. Pingback: Hosting a new maven project on Github, g3-java-client, a Gallery3 remote client library | Anthony Dahanne's blog

  4. tommy says:

    I’m following your guide for my project:
    https://github.com/tc/google-pagerank
    and publishing to
    https://github.com/tc/tc-maven-repo

    unfortunately, when maven pulls the project using the github repo url, it grabs:
    pagerank-1.0.pom

    301 Moved Permanently

    301 Moved Permanently
    nginx/0.7.67

    Is this happening to your repo as well?

  5. Pingback: Spring is coming | Philipp Kölmel

  6. Piwaï says:

    Thanks, this post has been a real time saver!

    I successfully deployed maven artifacts to a project’s GitHub repository using the mvn release plugin. So now, I just have to “mvn release:prepare”, “mvn release:perform” and push to the git repo to create a new release. Neat!

    I configured the altDeploymentRepository property directly in the POM : https://github.com/pyricau/BuilderGen/blob/master/pom.xml

    The only trick was that the pom.xml had to be at the root of the git repository.

    Thanks to you, the releasing instructions are really simple.

  7. eltimn says:

    Not sure if these were available when this was written, but I think a slightly better approach is to use GitHub user pages: http://pages.github.com/

    I just did this and created a maven2 directory and put my releases and snapshots in there.

    Then my repo url is just http://eltimn.github.com/maven2/releases, which is a little cleaner.

  8. Did someone else notice that this approach is not working anymore? I used to use it successfully just a few weeks ago, but now the /raw/master URL is rendering a 404:

    $curl https://github.com/kaeppler/maven2/raw/master/

    var NREUMQ=[];NREUMQ.push(["mark","firstbyte",new Date().getTime()]);
    404 – GitHub
    …….

    I wonder if GitHub rolled out a change to prohibit this kind of use, since it’s basically abusing GitHub as a content/repository server.

  9. Hi there,
    I added the stub section in my project’s POM, and then ran the command

    $ mvn -DaltDeploymentRepository=snapshot-repo::default::file:/path/to/my-mvn-repo/snapshots clean deploy

    This fails with the following error message:

    [ERROR] Failed to parse plugin descriptor for org.apache.maven.plugins:maven-deploy-plugin:2.5

    This happens even if I remove the stub from my pom.

    Any ideas?

  10. Chas Emerick says:

    By “stub section”, do you mean the distributionManagement addition?

    No, no ideas offhand. Maybe if you can paste your pom.xml somewhere? — hopefully not in a comment! ;-)

  11. Thanks for a good explanation.
    It is always nice to find when someone implemented thing you’ve just thought about :)

  12. Any idea how to get this running if you want to keep the github repo private? I can’t seem to figure out how to get my

    Also, any reason why this wouldn’t work with `lein deploy` and setting up deploy repositories in ~/.lein/init.clj? I set my init.clj up like so:

    (def settings
      {:deploy-repositories {:local-snapshots "file:///absolute/repo/uri/snapshots"}})

    And lein deploy seems to return that it works — but the files never seem to make it into the repo. `mvn -DaltDeploymentRepository=snapshot-repo::default::file:///absolute/repo/uri/snapshots clean deploy` does work. My thoughts are it might be the `default` repo layout? Though digging through the lein source, it seems to use the default layout as well (though, my local ~/.m2 repo isn’t laid out this way … so there might be a clash somehow).

    Any thoughts?

  13. Pingback: Managing and Building version-controlled Maven Repos using Git, Gradle and Nexus Server « Marcello de Sales's Weblog

  14. yanivtalmusic says:

    I think this is great for small companies / private usage. I would hope that most large projects that have external consumers would post to a central repository to make it easier for others… though I wouldn’t fault anyone for having any number of backups.

  15. Pingback: How to include external libraries in Maven - Charlie Wu

  16. Dave says:

    How is this going to scale? Git is terrible at hosting binary files. As a publisher, you’d have to clone the entire repo before you can push a new artifact, right?

  17. Pingback: How to add external libraries in Maven – - Charlie Wu

  18. Mike says:

    Great article, worked first time. I think this is a great solution for single user projects, r&d etc..

  19. Dirk says:

    Is this possible for private github repos?
    I tried a repository entry in my settings.xml and even a password (encrypted and unencrypted) for the repository id.

  20. Hi Chas,

    Very useful post. I was able to get the uploading to happen automatically as part of the deploy phase of the build. Just type “mvn deploy” and your artifacts are automatically deployed to github as though you were using a real nexus server.

    More details available here:
    http://stackoverflow.com/a/14013645/82156
