Running Yarn offline

Posted Nov 24, 2016 by Konstantin Raev

Repeatable and reliable builds are vital for large JavaScript projects. If your builds depend on dependencies being downloaded from the network, your build system is neither repeatable nor reliable.

One of the main advantages of Yarn is that it can install node_modules from files located on the file system. We call it “Offline Mirror” because it mirrors the files downloaded from the registry during the first build and stores them locally for future builds.

“Offline mirror” is different from the cache that both the npm CLI and Yarn have. Caches store already unzipped tarballs downloaded from the registry. They can also be implementation specific and may become invalid between versions of the CLI tools. The tarballs in “Offline mirror” can be consumed by any version of Yarn, which will build its cache from them. Files are also easier to store when they are compressed.

Let’s set up “Offline mirror” for a simple JS project:

{
  "name": "yarn-offline",
  "version": "1.0.0",
  "main": "index.js",
  "license": "MIT",
  "dependencies": {
    "is-array": "^1.0.1",
    "left-pad": "^1.1.3",
    "mime-types": "^2.1.13"
  }
}

When you run yarn install, the generated yarn.lock file has sections for every dependency:

# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.
# yarn lockfile v1

is-array@^1.0.1:
  version "1.0.1"
  resolved "https://registry.yarnpkg.com/is-array/-/is-array-1.0.1.tgz#e9850cc2cc860c3bc0977e84ccf0dd464584279a"

left-pad@^1.1.3:
  version "1.1.3"
  resolved "https://registry.yarnpkg.com/left-pad/-/left-pad-1.1.3.tgz#612f61c033f3a9e08e939f1caebeea41b6f3199a"

mime-db@~1.25.0:
  version "1.25.0"
  resolved "https://registry.yarnpkg.com/mime-db/-/mime-db-1.25.0.tgz#c18dbd7c73a5dbf6f44a024dc0d165a1e7b1c392"

mime-types@^2.1.13:
  version "2.1.13"
  resolved "https://registry.yarnpkg.com/mime-types/-/mime-types-2.1.13.tgz#e07aaa9c6c6b9a7ca3012c69003ad25a39e92a88"
  dependencies:
    mime-db "~1.25.0"

Each of these dependencies has a resolved field with a remote URL. If you delete your node_modules folder and run yarn install again, Yarn will download the same resolved dependencies specified in this lockfile. It will even guarantee that no one has modified the files since your first install by verifying the checksum of each of them.
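To make the checksum part concrete, here is a minimal Node.js sketch of how such a check could be done by hand. It assumes that the part after the `#` in a resolved value is the hex SHA-1 digest of the tarball (the 40-character values above are consistent with that). The helper names `splitResolved` and `matchesChecksum` are made up for this illustration and are not part of Yarn.

```javascript
// Sketch: split a yarn.lock "resolved" value and check tarball bytes against it.
// Assumption: the part after "#" is the hex SHA-1 digest of the .tgz file.
const crypto = require("crypto");

// "https://host/pkg.tgz#abc123" -> { location: "https://host/pkg.tgz", hash: "abc123" }
function splitResolved(resolved) {
  const index = resolved.lastIndexOf("#");
  if (index === -1) {
    return { location: resolved, hash: null };
  }
  return { location: resolved.slice(0, index), hash: resolved.slice(index + 1) };
}

// Returns true when the tarball bytes hash to the expected digest.
function matchesChecksum(tarballBytes, expectedSha1Hex) {
  const actual = crypto.createHash("sha1").update(tarballBytes).digest("hex");
  return actual === expectedSha1Hex;
}
```

You could feed `matchesChecksum` the result of `fs.readFileSync` on a downloaded tarball together with the `hash` returned by `splitResolved`. Yarn performs its own verification during install, so this is only meant to show the idea.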

However, if for some reason these URLs are unreachable during your build, the build will fail. To solve this, we’ll need an “Offline mirror”.

Set up .yarnrc

First we need to set up a directory to be our “Offline mirror” storage. We can do that with the yarn config command:

$ yarn config set yarn-offline-mirror ./npm-packages-offline-cache
yarn config v0.17.8
success Set "yarn-offline-mirror" to "./npm-packages-offline-cache".
✨  Done in 0.06s.

./npm-packages-offline-cache is an example location, relative to the home folder, where all the source .tar.gz files downloaded from the registry will be stored.

This will create a .yarnrc file in your HOME directory. Let’s move this file to the project root so that the offline mirror is used only for this project.

$ mv ~/.yarnrc ./
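If a build script needs to know where the mirror lives, it can read the setting back from the project’s .yarnrc. The sketch below assumes the file stores one `key "value"` pair per line, which is how the setting is expected to look after the `yarn config set` call above. `readOfflineMirror` is a hypothetical helper, not a Yarn API.

```javascript
// Sketch: read the "yarn-offline-mirror" setting from the text of a .yarnrc file.
// Assumption: the file holds one `key "value"` pair per line (quotes optional).
function readOfflineMirror(yarnrcText) {
  for (const line of yarnrcText.split(/\r?\n/)) {
    const match = /^\s*yarn-offline-mirror\s+"?([^"]+?)"?\s*$/.exec(line);
    if (match) {
      return match[1];
    }
  }
  return null; // setting not present
}
```

For example, `readOfflineMirror(fs.readFileSync(".yarnrc", "utf8"))` would return the configured path, or `null` when the project has no offline mirror configured.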

Initialize the new lockfile

Remove node_modules and yarn.lock that were generated previously and run yarn install again:

$ rm -rf node_modules/ yarn.lock
$ yarn install
yarn install v0.17.8
info No lockfile found.
[1/4] 🔍  Resolving packages...
[2/4] 🚚  Fetching packages...
[3/4] 🔗  Linking dependencies...
[4/4] 📃  Building fresh packages...
success Saved lockfile.
✨  Done in 0.57s.

The dependency resolutions in your yarn.lock should now look like this:

# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.
# yarn lockfile v1

is-array@^1.0.1:
  version "1.0.1"
  resolved is-array-1.0.1.tgz#e9850cc2cc860c3bc0977e84ccf0dd464584279a

left-pad@^1.1.3:
  version "1.1.3"
  resolved left-pad-1.1.3.tgz#612f61c033f3a9e08e939f1caebeea41b6f3199a

mime-db@~1.25.0:
  version "1.25.0"
  resolved mime-db-1.25.0.tgz#c18dbd7c73a5dbf6f44a024dc0d165a1e7b1c392

mime-types@^2.1.13:
  version "2.1.13"
  resolved mime-types-2.1.13.tgz#e07aaa9c6c6b9a7ca3012c69003ad25a39e92a88
  dependencies:
    mime-db "~1.25.0"

The resolved field for each resolution should now contain a file name relative to the npm-packages-offline-cache folder that was configured earlier. Each resolved dependency also contains a checksum after the file name to ensure that no one tampers with the downloaded files.

Inside the “Offline mirror” folder we now have the .tgz files that Yarn will use for subsequent builds without reaching out to the network.

$ ls npm-packages-offline-cache/
is-array-1.0.1.tgz    left-pad-1.1.3.tgz    mime-db-1.25.0.tgz    mime-types-2.1.13.tgz
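Before relying on the mirror in CI, it can be handy to confirm that every entry in yarn.lock has a matching tarball in the folder. The following is a rough sketch under a few assumptions: it treats the last path segment of each resolved value as the tarball file name, it does not try to handle scoped packages, and the function names are invented for this example.

```javascript
// Sketch: list lockfile entries whose tarball is absent from the mirror folder.
const path = require("path");

// Collect the tarball file name from every "resolved" line in yarn.lock text.
// Handles bare file names as well as full URLs, quoted or unquoted.
function resolvedFileNames(lockText) {
  const names = [];
  for (const line of lockText.split(/\r?\n/)) {
    const match = /^\s+resolved\s+"?([^"#\s]+)/.exec(line);
    if (match) {
      names.push(path.posix.basename(match[1]));
    }
  }
  return names;
}

// mirrorFiles: file names found in the mirror, e.g. from fs.readdirSync(dir).
function missingTarballs(lockText, mirrorFiles) {
  const present = new Set(mirrorFiles);
  return resolvedFileNames(lockText).filter((name) => !present.has(name));
}
```

Calling `missingTarballs(fs.readFileSync("yarn.lock", "utf8"), fs.readdirSync("npm-packages-offline-cache"))` should return an empty array when the mirror is complete for this project.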

In a nutshell, to enable “Offline mirror” for your project you need to:

  • add the “yarn-offline-mirror” configuration to the .yarnrc file
  • generate a new yarn.lock with the “yarn install” command

A few tips and tricks

You can check the “Offline mirror” into a Git or Mercurial repository

The “Offline Mirror” can be shared between build servers or development machines in any way that is convenient: a Dropbox folder, source control, or a network drive. At Facebook the offline mirror lives inside our big Mercurial “monorepo”.

Whether to commit binary files into a repository depends on the number and size of your project’s dependencies. For example, out of 849 React Native dependencies totaling 23MB, only 10% are larger than 30KB.

[Figure: list of React Native .tgz files sorted by size, descending. Caption: Size of all .tgz files used in React Native.]
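To make the same judgment for your own project, you can compute a similar summary over your mirror folder. This is a small sketch, not the script used to produce the React Native numbers. `summarizeSizes` and `tarballSizes` are illustrative names, and the threshold is whatever cutoff you care about.

```javascript
// Sketch: summarize tarball sizes in an offline mirror folder.
const fs = require("fs");
const path = require("path");

// sizes: array of file sizes in bytes; thresholdBytes: cutoff for "large" files.
function summarizeSizes(sizes, thresholdBytes) {
  const totalBytes = sizes.reduce((sum, size) => sum + size, 0);
  const largeCount = sizes.filter((size) => size > thresholdBytes).length;
  return {
    count: sizes.length,
    totalBytes,
    largeCount,
    largeShare: sizes.length === 0 ? 0 : largeCount / sizes.length,
  };
}

// Read the size of every .tgz file directly inside mirrorDir.
function tarballSizes(mirrorDir) {
  return fs
    .readdirSync(mirrorDir)
    .filter((name) => name.endsWith(".tgz"))
    .map((name) => fs.statSync(path.join(mirrorDir, name)).size);
}
```

For instance, `summarizeSizes(tarballSizes("npm-packages-offline-cache"), 30 * 1024)` reports how many tarballs exceed 30KB and what share of the mirror they make up.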

Many Facebook teams, including the React Native team, decided to check in their “Offline mirror”. They all share the same “Offline mirror”, which means that most dependencies for a new project are already checked into that folder. The more projects use it, the lower the cost of storing the packages in source control.

Let’s compare checking in node_modules to checking in “Offline mirror”

The React Native team used to check in the node_modules folder but they hit several limits:

  • node_modules contained more than 37,000 files (and more than 100,000 files back when we were using npm2), which hurt the performance of our Mercurial repository.
  • Reviewing Pull Requests that changed a dependency was quite hard because all the files added to and removed from node_modules created a ton of noise, making code reviews unpleasant.

In comparison, updating a third-party dependency with the Offline Mirror adds just a few files that are very easy to review:

$ yarn add shelljs@0.7.0 --dev
yarn add v0.17.8
[1/4] 🔍  Resolving packages...
[2/4] 🚚  Fetching packages...
[3/4] 🔗  Linking dependencies...
[4/4] 📃  Building fresh packages...
success Saved lockfile.
success Saved 4 new dependencies.
├─ interpret@1.0.1
├─ rechoir@0.6.2
└─ shelljs@0.7.0
│  └─ [email protected]
✨  Done in 8.15s.

$ git diff
diff --git a/package.json b/package.json
index 4619f16..7acb42f 100644
--- a/package.json
+++ b/package.json
@@ -220,7 +220,7 @@
     "mock-fs": "^3.11.0",
     "portfinder": "0.4.0",
     "react": "~15.3.1",
-    "shelljs": "0.6.0",
+    "shelljs": "0.7.0",
     "sinon": "^2.0.0-pre.2"
   }
 }
diff --git a/yarn.lock b/yarn.lock
index 11ce116..f5d81ba 100644
--- a/yarn.lock
+++ b/yarn.lock
...
-shelljs@0.6.0:
-  version "0.6.0"
-  resolved shelljs-0.6.0.tgz#ce1ed837b4b0e55b5ec3dab84251ab9dbdc0c7ec
+shelljs@0.7.0:
+  version "0.7.0"
+  resolved shelljs-0.7.0.tgz#3f6f2e4965cec565f65ff3861d644f879281a576
+  dependencies:
+    glob "^7.0.0"
+    interpret "^1.0.0"
+    rechoir "^0.6.2"

 shellwords@^0.1.0:
   version "0.1.0"

$ git status
On branch testing-yarn
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   package.json
    modified:   yarn.lock

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    yarn-offline-mirror/interpret-1.0.1.tgz
    yarn-offline-mirror/rechoir-0.6.2.tgz
    yarn-offline-mirror/shelljs-0.7.0.tgz

no changes added to commit (use "git add" and/or "git commit -a")

Did you know that Yarn is also distributed as a single bundled JS file that can be used on CI systems without internet access?

The files distributed with Yarn releases, **yarn-<version>.js** (for Node 5+) and **yarn-legacy-<version>.js** (for Node 4), can be used stand-alone on CI systems without installing Yarn.

Just check it into your project’s repository and use it in the build script:

node ./yarn-0.17.8.js install

This is quite convenient for teams that use multiple operating systems and want to have atomic updates for Yarn.
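If your build is itself driven by a Node script, the same call can be wrapped so that it always uses the Node binary running the build. This is a minimal sketch: the bundle file name comes from the example above, and `yarnInstallCommand` and `runYarnInstall` are hypothetical helper names.

```javascript
// Sketch: run a checked-in Yarn bundle from a Node build script.
const { spawnSync } = require("child_process");
const path = require("path");

// Build the command line for `node <bundle> install`.
// process.execPath is the Node binary running this script;
// bundlePath is resolved against the current working directory.
function yarnInstallCommand(bundlePath) {
  return { command: process.execPath, args: [path.resolve(bundlePath), "install"] };
}

// Run the install inside projectDir and return the exit code
// (null if the process was terminated by a signal).
function runYarnInstall(bundlePath, projectDir) {
  const { command, args } = yarnInstallCommand(bundlePath);
  const result = spawnSync(command, args, { cwd: projectDir, stdio: "inherit" });
  if (result.error) {
    throw result.error;
  }
  return result.status;
}
```

A build step could then call `runYarnInstall("./yarn-0.17.8.js", process.cwd())` and fail the build when the returned status is not 0.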

This is going to get better

The “Offline mirror” was implemented early in Yarn’s development cycle and we are working on improving it in a backwards compatible way:

  • The resolved field is used both for offline mirror paths and registry URIs. This means that the yarn.lock file that the React Native team uses internally can’t be shared with the open source community, because the React Native team does not sync the offline mirror with the open source version of React Native. Issue.
  • There is an improved workflow being considered for future versions of Yarn. It is not drastically different but some settings and lock files may change.
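To illustrate the first point, the sketch below lists the lockfile entries whose resolved value is a mirror file name rather than a registry URL. These are the entries that someone without access to the mirror could not fetch. It is only an illustration of the limitation, with an invented function name, and it assumes registry entries always start with http:// or https://.

```javascript
// Sketch: find lockfile entries that point at the offline mirror
// instead of a registry URL.
function mirrorOnlyEntries(lockText) {
  const entries = [];
  for (const line of lockText.split(/\r?\n/)) {
    const match = /^\s+resolved\s+"?([^"\s]+)/.exec(line);
    // Assumption: anything that is not an http(s) URL is a mirror-relative path.
    if (match && !/^https?:\/\//.test(match[1])) {
      entries.push(match[1]);
    }
  }
  return entries;
}
```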