How to cache Maven dependencies in Docker


I'm working on a project with ~200MB of dependencies and I'd like to avoid useless uploads because of my limited bandwidth.

When I push the image built from the Dockerfile below, I always get a ~200MB upload even if I didn't touch the pom.xml:

FROM maven:3.6.0-jdk-8-slim

WORKDIR /app

ADD pom.xml /app

RUN mvn verify clean --fail-never

COPY ./src /app/src

RUN mvn package

ENV CONFIG_FOLDER=/app/config
ENV DATA_FOLDER=/app/data
ENV GOLDENS_FOLDER=/app/goldens
ENV DEBUG_FOLDER=/app/debug

WORKDIR target

CMD ["java","-jar","-Dlogs=/app/logs", "myProject.jar"]

This Dockerfile builds a 200MB fat JAR that includes all the dependencies, which is why the ~200MB upload happens every time. What I would like to achieve is to build a layer with all the dependencies and "tell" the packaging phase not to include the dependency JARs in the fat JAR but to look for them inside a given directory.

I was thinking of writing a script that runs mvn dependency:copy-dependencies before the build and then copies that directory into the container, and then building a "non-fat" JAR in which all those dependencies are only linked and not actually copied.

Is this possible?

EDIT: I discovered that the Maven local repository of the container is located under /root/.m2. So I ended up writing a very simple script like this:

BuildDocker.sh

mvn verify clean --fail-never
mv ~/.m2 ~/git/myProjectRepo/.m2

sudo docker build -t myName/myProject:"$1" .

And edited Dockerfile like:

# Use the official Maven image as a parent image
FROM maven:3.6.0-jdk-8-slim

# Copy my Maven local repository into the container, thus creating a new layer
COPY ./.m2 /root/.m2

# Set the working directory to /app
WORKDIR /app

# Copy the pom.xml
ADD pom.xml /app

# Resolve and Download all dependencies: this will be done only if the pom.xml has any changes
RUN mvn verify clean --fail-never

# Copy source code and configs 
COPY ./src /app/src

# create a ThinJAR
RUN mvn package


# Run the jar
...

After the build I checked that /root/.m2 contains all the directories I need, but as soon as I launch the JAR I get:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Priority
    at myProject.ThreeMeans.calculate(ThreeMeans.java:17)
    at myProject.ClusteringStartup.main(ClusteringStartup.java:7)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Priority
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more

Maybe I shouldn't run it through java -jar?


If I understand correctly what you'd like to achieve, the problem is to avoid creating a fat JAR with all Maven dependencies at each Docker build (to reduce the size of the Docker layers to be pushed after a rebuild).

If yes, you may be interested in the Spring Boot Thin Launcher, which is also applicable for non-Spring-Boot projects. Some comprehensive documentation is available in the README.md of the corresponding GitHub repo: https://github.com/dsyer/spring-boot-thin-launcher#readme

To sum up, it should suffice to add the following plugin declaration in your pom.xml:

<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
            <!--<version>${spring-boot.version}</version>-->
            <dependencies>
                <dependency>
                    <groupId>org.springframework.boot.experimental</groupId>
                    <artifactId>spring-boot-thin-layout</artifactId>
                    <version>1.0.19.RELEASE</version>
                </dependency>
            </dependencies>
        </plugin>
    </plugins>
</build>

Ideally, this solution should be combined with a standard Dockerfile setup to benefit from Docker's cache (see below for a typical example).

Leverage Docker's cache mechanism for a Java/Maven project

The archetype of a Dockerfile that avoids re-downloading all Maven dependencies at each build if only source code files (src/*) have been touched is given in the following reference: https://whitfin.io/speeding-up-maven-docker-builds/

To be more precise, the proposed Dockerfile is as follows:

# our base build image
FROM maven:3.5-jdk-8 as maven

WORKDIR /app

# copy the Project Object Model file
COPY ./pom.xml ./pom.xml

# fetch all dependencies
RUN mvn dependency:go-offline -B

# copy your other files
COPY ./src ./src

# build for release
# NOTE: my-project-* should be replaced with the proper prefix
RUN mvn package && cp target/my-project-*.jar app.jar


# smaller, final base image
FROM openjdk:8u171-jre-alpine
# OPTIONAL: copy dependencies so the thin jar won't need to re-download them
# COPY --from=maven /root/.m2 /root/.m2

# set deployment directory
WORKDIR /app

# copy over the built artifact from the maven image
COPY --from=maven /app/app.jar ./app.jar

# set the startup command to run your binary
CMD ["java", "-jar", "/app/app.jar"]

Note that it relies on the so-called multi-stage build feature of Docker (presence of two FROM directives), implying the final image will be much smaller than the maven base image itself. (If you are not interested in that feature during the development phase, you can remove the lines FROM openjdk:8u171-jre-alpine and COPY --from=maven /app/app.jar ./app.jar.)
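
If the main goal is to shrink what gets pushed, one possible variation on the Dockerfile above is to put the dependency JARs in their own layer, so that only the small application JAR changes between builds. The following is a minimal sketch, not a tested setup. It assumes that the pom.xml no longer builds a fat JAR (no shade or assembly step), that the artifact is named myProject.jar and the main class is myProject.ClusteringStartup (both taken from the question), and the /app/lib path is only an illustrative choice.

```dockerfile
# build stage (same base image as in the example above)
FROM maven:3.5-jdk-8 AS maven
WORKDIR /app

# copy only the POM so the next two layers stay cached until it changes
COPY ./pom.xml ./pom.xml
RUN mvn dependency:go-offline -B

# copy the runtime dependencies to /app/lib (hypothetical path, kept outside target/)
RUN mvn dependency:copy-dependencies -B -DincludeScope=runtime -DoutputDirectory=/app/lib

COPY ./src ./src
# assumption: the POM produces a plain (non-fat) JAR named myProject.jar
RUN mvn package -B

# smaller, final base image
FROM openjdk:8u171-jre-alpine
WORKDIR /app
# dependency layer: should only change when pom.xml changes
COPY --from=maven /app/lib ./lib
# application layer: small, changes with the source code
COPY --from=maven /app/target/myProject.jar ./myProject.jar
# java expands the lib/* classpath wildcard itself, so the exec form works
CMD ["java", "-cp", "myProject.jar:lib/*", "myProject.ClusteringStartup"]
```

Starting the application with -cp instead of -jar is deliberate here: a plain JAR launched with java -jar only sees the entries listed in its manifest Class-Path, which is a likely cause of the NoClassDefFoundError shown in the question.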

In this approach, the Maven dependencies are fetched with RUN mvn dependency:go-offline -B before the line COPY ./src ./src (to benefit from Docker's cache).

Note however that the standard dependency:go-offline goal is not "perfect", as a few dynamic dependencies/plugins may still trigger some re-downloading at the mvn package step. If this is an issue for you (e.g. if at some point you'd really want to work offline), you could take a look at that other SO answer, which suggests using a dedicated plugin that provides the de.qaware.maven:go-offline-maven-plugin:resolve-dependencies goal.
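
As a rough sketch of that alternative, the dependency-fetching step of the Dockerfile above could be swapped for the plugin's goal. No plugin version is pinned here, so Maven uses whichever release it resolves; declaring and pinning the plugin in the pom.xml is probably preferable in practice. The --offline flag on the last step is an addition for illustration, not part of the referenced answer.

```dockerfile
# build stage only; the final stage can stay as in the example above
FROM maven:3.5-jdk-8 AS maven
WORKDIR /app

COPY ./pom.xml ./pom.xml
# resolve dependencies and plugins with the go-offline-maven-plugin
RUN mvn de.qaware.maven:go-offline-maven-plugin:resolve-dependencies -B

COPY ./src ./src
# assumption: everything needed was resolved above; with --offline the build
# fails if something is still missing instead of silently re-downloading it
RUN mvn package -B --offline
```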



In general, a Docker build works in layers: each time you build, the existing layers are available in the cache and are reused if nothing has changed. Ideally, your build should have worked the same way.

By default, Maven looks for dependencies in the .m2 folder in the user's home directory (for example /home/username/.m2 on Ubuntu).

If the required JARs are not available there, Maven downloads them to .m2 and uses them.

So you can copy this .m2 folder (zipping it first if convenient) after one successful build and place it in the home directory of the user inside the Docker container.

Do this before you run the build command.

Note: you might need to replace the existing .m2 folder in the container.

So your Dockerfile would be something like this:

FROM maven:3.6.0-jdk-8-slim

WORKDIR /app

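# Copy the pre-populated local repository into the container user's home
# (the official maven image runs as root, so that is /root)
COPY .m2 /root/.m2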

ADD pom.xml /app

RUN mvn verify clean --fail-never

COPY ./src /app/src

RUN mvn package
...



The documentation of the official Maven Docker images also points out different ways to achieve better caching of dependencies.

Basically, it recommends either mounting the local Maven repository as a volume and sharing it across containers, or using a special directory (/usr/share/maven/ref/) whose contents are copied to the local repository on container startup.
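
The second option could look roughly like the following, adapted from the image's documentation. It relies on the settings-docker.xml file shipped in the image, which points the local repository at /usr/share/maven/ref/repository; the /tmp/pom.xml location and the choice of base image tag (taken from the question) are only examples.

```dockerfile
FROM maven:3.6.0-jdk-8-slim

# copy only the POM that lists the dependencies to pre-package
COPY pom.xml /tmp/pom.xml

# resolve the dependencies into /usr/share/maven/ref/repository;
# the image copies the contents of /usr/share/maven/ref on container startup
RUN mvn -B -f /tmp/pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve
```

The volume option needs no Dockerfile changes: the repository directory (/root/.m2 in these images) is mounted with -v when the container is run.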
