Previously, on “Is DT a monorepo?”
In a previous post, I talked about the layout of DefinitelyTyped and how it was indeed a monorepo, albeit a funky one. In short, packages were laid out (more or less) like this:
types/
gensync/
tsconfig.json
index.d.ts
node/
tsconfig.json
index.d.ts
package.json
react/
tsconfig.json
index.d.ts
package.json
And so on. Each tsconfig.json file contained bits like:
{
"compilerOptions": {
"baseUrl": "../",
"typeRoots": [
"../"
],
"types": []
}
}
This config means that when a types package looks for itself or another package,
it can map to other directories in types; no package.json is needed. At
publish time, we detected dependencies and add them. If a package needed an
external dependency, then the package would need a package.json file with
that dependency declared.
This provided a monorepo-like feel without any symlinking, but with many downsides, including:
- Long
npm installtimes when external dependencies are needed, especially when testing the entire repo. The tooling just looped over every folder with apackage.jsonand rannpm install. - Completely unrealistic module resolution (no
node16/nodenext, noexportmaps, etc.) thanks to the use ofbaseUrl,typeRoots, andpaths. Not eventypesVersionsworks.
I also talked about what we could do to remedy the situation, which boils down to “what if we were just a monorepo like everyone else uses in the JS ecosystem and let a package manager handle things”?
Making fetch happen
Obviously, all of that was a good 6 months ago. There were some unresolved blockers that made me put the project on the backburner. What changed?
Recently, Andrew merged
fetch support
into @types/node. Yay!
But, you might have noticed that only @types/node@20 got this feature.
Surprise! It’s the second bullet point from above. DefinitelyTyped’s fake module
resolution
broke resolution
inside undici-types, the package @types/node depends on to provide fetch
types (without depending on undici itself, which is vendored into Node). The
effect of this is that we could only add fetch to the latest types for Node,
not to any older versions. In fact, if @types/node@22 were needed, we’d have
to drop it from @types/node@20! Boo.
This problem shuffled the whole “DefinitelyTyped monorepo” thing straight to the top of our interest list.
And so with that, I’m happy to say that after a few weeks of effort from Nathan, Andrew, and myself, we’re actually doing it! DefinitelyTyped is becoming a monorepo!
Hello, pnpm
If you’ve read the previous posts, you won’t be surprised to find that we’re
using pnpm to do this. All modern package managers have some sort of monorepo
support these days, but DefinitelyTyped’s unique situation limits what we can
use. Specifically, DefinitelyTyped contains multiple versions of the same
package. For example, we currently have @types/react v15-v18, @types/node
v16-v20, and so on. Both npm and yarn exit early when they see two workspace
packages with the same name. Understandable!1 But, we need to
do it somehow.
With pnpm, this “just works”. Internally within pnpm, workspace packages are
identified by their paths, so there’s no conflict. Then, when pnpm goes to
resolve packages, it only cares about the name and version. This actually
means we get something better than just “it doesn’t fail”; it can actually
resolve to these workspace packages based on their versions! It behaves just
as though the packages were provided by the npm registry. So long, paths.
There’s a bunch more goodness pnpm provides, but for now, let’s just look at
the new layout.
The new layout
Anyone who’s worked in a monorepo will not be surprised by the new layout. Now, we have:
types/
gensync/
tsconfig.json
index.d.ts
package.json # new!
node/
tsconfig.json
index.d.ts
package.json
react/
tsconfig.json
index.d.ts
package.json
Every @types package now requires a package.json, even if it doesn’t have
any external dependencies. Let’s take a look at what’s inside. Here’s the new
bits of package.json for @types/jsdom:
{
"private": true,
"name": "@types/jsdom",
"version": "21.1.9999",
"projects": [
"https://github.com/jsdom/jsdom"
],
"minimumTypeScriptVersion": "4.5",
"dependencies": {
"@types/node": "*",
"@types/tough-cookie": "*",
"parse5": "^7.0.0"
},
"devDependencies": {
"@types/jsdom": "workspace:."
},
"owners": [
{ "name": "Leonard Thieu", "githubUsername": "leonard-thieu" },
{ "name": "Johan Palmfjord", "githubUsername": "palmfjord" },
{ "name": "ExE Boss", "githubUsername": "ExE-Boss" }
]
}
That’s a lot of stuff. Much of this is information that was previously a part of
the index.d.ts “header”, i.e. something like:
// Type definitions for jsdom 21.1
// Project: https://github.com/jsdom/jsdom
// Definitions by: Leonard Thieu <https://github.com/leonard-thieu>
// Johan Palmfjord <https://github.com/palmfjord>
// ExE Boss <https://github.com/ExE-Boss>
// Definitions: https://github.com/DefinitelyTyped/DefinitelyTyped
// Minimum TypeScript Version: 4.5
In the new layout, we’re going to be using a package manager, so we need to
declare a name and version to get packages to link up. That’s the first bit
of the header. At that point, we may as well just move everything into JSON and
be done with it. Additionally, this means that tools wanting to grab info about
a DefinitelyTyped package don’t need to parse the header text; it’s all in
package.json.
Let’s break down the fields.
private
{
"private": true
}
This is always set to true, telling pnpm to not attempt to publish this
package to the registry. The DefinitelyTyped publisher handles publishing.
Packages that had a package.json previously already had this set, so this is
nothing new.
name
{
"name": "@types/jsdom"
}
This is new! Previously this was declared only via the directory name, but now
we’re going to be using pnpm to handle things, so we need to specify this.
version
{
"version": "21.1.9999"
}
This one’s funky; it’s almost what we put in the header, but with a patch
version of 9999. When DT packages are published (automatically on a schedule),
the patch version is generated; it’s just whatever the previous version was,
plus one. So the patch version that’s actually in the repo never matters.
At development time, we’re making use of the fact that pnpm can resolve to
local versions. Normally, we could set prefer-workspace-packages, which would
force pnpm to always link to the local workspace package. But we actually have
a few packages which intentionally point to old versions of @types packages.
If we were to do the much nicer thing of using 0 as our patch version, the
version from the registry would always be chosen instead. So, we can’t use
prefer-workspace-packages. Instead, we just pick an arbitrarily high patch
version, such that it will always be newer than what’s in the registry, hence
9999. 9999 publishes ought to be enough for anyone, right?
projects
{
"projects": [
"https://github.com/jsdom/jsdom"
]
}
This is an array of helpful links to info about a project. Usually it contains a GitHub link, but can sometimes contain more.
minimumTypeScriptVersion
{
"minimumTypeScriptVersion": "4.5"
}
This defines the minimum supported version of TypeScript version for a package.
dependencies
{
"dependencies": {
"@types/node": "*",
"@types/tough-cookie": "*",
"parse5": "^7.0.0"
}
}
This isn’t new, but it is bigger! Dependencies on @types packages are now
explicit. No longer can every package access every other package; pnpm won’t
link them. This removes the complexity of the infrastructure; we don’t need to
parse the code or rely on heuristics to figure out what packages depend on what.
Additionally, this makes pnpm fully aware of how the packages interrelate,
meaning that we can use fun features like --filter (more on that later).
devDependencies
{
"devDependencies": {
"@types/jsdom": "workspace:."
}
}
This is new. For the most part, this will contain just one thing; a
self-dependency. Without the baseUrl / typeRoots / paths combo, a package
can’t find itself anymore, but that’s the API that we’re wanting to test. pnpm
doesn’t yet support creating self-links, so we do it ourselves using a
workspace:. specifier.2
This list can also contain packages that are needed for testing. This
technically an improvement over the previous setup, which didn’t allow
devDependencies at all. But, it’s generally better to not have any testing
dependencies anyhow.
owners
{
"owners": [
{ "name": "Leonard Thieu", "githubUsername": "leonard-thieu" },
{ "name": "Johan Palmfjord", "githubUsername": "palmfjord" },
{ "name": "ExE Boss", "githubUsername": "ExE-Boss" }
]
}
This is a list of the users that “own” the package. They get pings when people
send PRs to packages and can approve them. This used to be in the as URLs (as
that’s the syntax needed for the contributors array in package.json), but
our tooling only wants usernames and loads of people incorrectly typed their
GitHub profile URLs. For owners that aren’t directly on github, url can still
be passed (though not shown above).
nonNpm, nonNpmDescription
{
"nonNpm": true,
"nonNpmDescription": "Google Maps JavaScript API"
}
My example didn’t have these two, but some of the packages in DefinitelyTyped
describe things that aren’t npm packages at all. For example,
@types/google.maps describes the Google Maps API (a global), and had a header
like:
// Type definitions for non-npm package Google Maps JavaScript API 3.54
This info is used to inform various checks and is carried into the published package. In the new layout, this information is represented in JSON.3
Installing dependencies
Let’s start by doing the naive thing and just run pnpm install in the root of
the repo.
$ pnpm install
Scope: all 9114 workspace projects
...
Done in 3m 35.4s
Wow, that’s a lot of install. But it’s a major improvement over the previous
layout, where installing the entire repo (with 10x fewer package.json files)
took some 30 minutes.
The good news is that those working on DT don’t actually need to install the
entire repo. pnpm supports filtering. Let’s say
I’m working on @types/node, and want to be able to test it and any packages it
depends on. I can run:
$ pnpm install -w --filter '...@types/node...'
Scope: 2722 of 9114 workspace projects
...
Done in 1m 2.5s
That’s a good bit better! Since we have explicit dependencies, pnpm can
actually figure out what packages are needed for @types/node and install
those, but also figure out which packages depend on @types/node and install
those too. The -w tells pnpm to also install the workspace root, which is
needed to get the DefinitelyTyped tooling, linters, dprint, etc.
What about package that isn’t so hefty?
$ pnpm install -w --filter '...@types/lodash...'
Scope: 372 of 9114 workspace projects
...
Done in 7.2s
Now we’re talking. Most packages won’t need to do a huge install, so long as people read the docs (🙃) to know how to avoid the big install.
From this point on, the workflow is the same as DefinitelyTyped was before.
Filtering in CI
There’s one other cool trick that we can use in CI; not only can we filter by package name, but we can also filter by what changed since a specific git ref. In a PR build, we can use:
$ pnpm install -w --filter '...[origin/master]'
And only get what we need.
Other misc improvements
There’s also a grab bag of other improvements that come with this change. In no particular order:
- Having to redo a bunch of the DefinitelyTyped tooling has led to improvements
like
dtslint-runnerno longer bailing early on certain kinds of errors. Many more things are collected for reporting at the end such that doing one thing wrong doesn’t hide the problem until a second run. - As a part of making everything work in the new monorepo, we manually fixed a
few hundred packages. These packages were silently broken (or at least,
weird) in various ways. For example, multiple packages imported the
eventslibrary. This could mean@types/node, but it could also mean@types/events. In practice, it resolved to the latter, but then sometimes, the tooling would say the package depended on neither (probably due to a bug in the implicit dependency resolution). Now, each package actually has to say which they need. There are other weird things besides just this; invalidreferencesdirectives, packages depending on the wrong versions of things, etc. - Having a complete working
package.jsonfor every package means that one can theoretically justnpm packand get a working tarball. This is likely to become useful for tools like Are The Types Wrong, although the publisher still does a bunch of stuff (notably, deciding which files actually get included in each package, which is still mostly implicit). - It turns out that a load of
react-based types packages are broken at the moment due to@types/reactusingtypesVersions. SincetypesVersionsis inpackage.json, butbaseUrl/typesRoot/pathsskippackage.jsonresolution, packages that depend on@types/reactalways get the types meant for TS 5.1 and above. Oops. With actualnode_moduleslinking, this isn’t a problem and things work as intended. Another reason to speed this along.
It’s not all rainbows and sunshine
Everything I’ve described so far has been an improvement over the previous layout. But there are some warts left to figure out.
Using shared-workspace-lockfile=false
The astute reader may have noticed that the performance of pnpm install seems
way slower than expected, especially given the numbers I achieved in the
previous post about making pnpm faster.
The reason for this slowdown is our use of shared-workspace-lockfile=false.
This is a lesser-used option which instructs pnpm to instead handle each
workspace package individually. Each package gets is own dependency graph, and
calculating that 9000 some times is a lot slower than doing it once.
Why enable it, then? It’s a bit hard to fully explain. I filed
an issue for it upstream, but the
gist is that since pnpm is stricter about how it handles dependencies (they
aren’t just all hoisted to the top), the combo of so many packages forgetting to
declare dependencies on @types packages along with some packages (outside DT)
explicitly depending on @types/node@17 (why??) causes the “fallback”
@types/node to point to that awkward v17 version.
I haven’t quite figured out what the best solution is; it’s possible that pnpm
could gain a yarn-like hoisting-limit system to avoid this problem, or always
resolve “fallback” dependencies to the latest version.
I had previously skipped working on this project due to this problem, but the
upsides of the migration (especially with fetch getting thrown in the mix)
tipped the scales. Although shared-workspace-lockfile=false is quite a bit
slower, it’s still an improvement when installing the entire repo, and filtering
provides a very straightforward way to reduce the cost of package installs.
Just to show what the difference is, here’s the install without this setting.
$ pnpm install
Scope: all 9114 workspace projects
...
Done in 1m 5.8s
$ pnpm install -w --filter '...@types/node...'
Scope: 2722 of 9114 workspace projects
...
Done in 30.2s
$ pnpm install -w --filter '...@types/lodash...'
Scope: 372 of 9114 workspace projects
...
Done in 7.9s
Not helpful super for a small number of packages, but quite a bit faster if you ever need to work with the whole thing.
git clean is broken on Windows
pnpm uses symlinks under the hood. On POSIX-ish platforms like Linux and
macOS, this is all good; symlinks work for any user and behave as expected. On
Windows, however, the story is different. For a very long time, the only way to
get “real” symlinks was to gain elevated permissions. Without being an admin,
the best you could hope for was a “junction”. I’m not sure I could explain the
ins and outs of junctions other than to say that they’re like a symlink, but
only for directories, and they act kinda weird sometimes. But, they do the job.
There’s a gotcha; if you run git clean, git treats junctions as directories!
This is normally not a big deal, but every one of our packages has a self link,
which makes the symlinks recursive. And since git doesn’t treat junctions like
symlinks, git clean will just keep recursing infinitely until it hits the max
path length. It may eventually finish, but without loads of errors and
complaining.
There are two ways forward:
- Make
gittreat junctions as symlinks. I sent a PR for this earlier this year, but it hasn’t yet been accepted. I personally think this is the correct solution; ifgit cleanhad been implemented in shell scripts (like much ofgit), it would have treated junctions as symlinks and just worked. Butgit cleanis written in C, so instead of going throughgit-bash, it goes through the shims which translate the POSIX-y file system accesses into the Windows API, and those shims disagree withgit-bash.4 - Make
pnpmuse real symlinks. You’re probably confused; didn’t I just say that you needed elevated privileges to do that? Normally, yes, but if you enable Developer Mode, any user is able to make symlinks! And, I fully suspect that most people developing on Windows have this enabled. You even need enabled to enable WSL. This is probably a good idea whether or notgitchanges; real symlinks don’t have the same warts as junctions. I’ll disclaim that I haven’t actually proposed this change upstream. There are some gotchas in that it may be awkward to enable this automatically (what happens if you end up with anode_moduleswith both junctions and symlinks?), but I think it should be straightforward to detect.
For now, DefinitelyTyped has included a script Windows users can run to clean up
node_modules; pnpm run clean-node-modules will find and delete all
node_modules directories within the repo. Good enough for now.
Removing a package won’t expose breaking changes in newly-typed packages
This one’s subtle. Imagine we have a package @types/foo. Another package (in
the repo or even external) depends on @types/foo. But, foo has just gained
types, which means that it’s time to remove @types/foo from DefinitelyTyped.
In the old layout, we’d delete the directory and add it to notNeededPackages.
When the PR that does this is merged, the publisher will publish one final
version of @types/foo that contains only a package.json with a dependency on
the real foo.
But, when you’re actually working on the PR that does this, the shim package
hasn’t yet been published! If another package within DefinitelyTyped depends on
@types/foo it will stop pointing to the one in the repo (it’s been deleted).
In the old layout, things would stop compiling until dependencies are updated.
But in the new layout, pnpm will just resolve to the latest version of
@types/foo in the registry, which will be exactly the same code that is being
deleted. This means that the PR will definitely pass CI, when it may actually
fail later if the real upstream foo package has types which differ enough to
break things.
There’s not really a great solution to this other than to ban external
dependencies on @types packages that aren’t contained in the repo; that
handles some of the situations but not all. (If you have any clever ideas, let
me know.)
Removing a package isn’t reflected in pnpm git filter
In addition to the above, when we delete a package, pnpm’s behavior for
--filter '...[origin/master]' doesn’t pick up on the removed package. In an
ideal world, it’d see the package removed, and then figure out which local
packages are affected by the removal. But, it doesn’t do that, either simply
because the package is gone (so there’s no more edges to check), or due to the
previous section (where the package is still there, just not in the repo). The
workaround is to instead use
pnpm ls --depth -1 --filter '...@types/removed...' to get some sort of list of
what may need to be tested. In CI, if a package.json is removed from the repo,
we also don’t use pnpm install --filter '...[origin/master]', resorting to a
complete install.
Future work
After this is merged, we’re still not done! There is some exiting stuff that gets unlocked:
- Since there’s no more header, there’s nothing special about
index.d.tsanymore. Since we’re trying to get DT as close to the upstream packages as possible, it may actually be wrong to have anindex.d.tsif the package doesn’t containindex.js. We should be able to remove the requirement that packages have anindex.d.ts. - Right now, there’s no way to really verify that types work properly for people
using
nodenextresolution (though, some analysis has shown that DefinitelyTyped does better in this regard than those publishing types themselves).Now that everything works vianode_modules, we could finally use enable those options intsconfig.jsonand verify that things work. Maybe even required. - We could offload even more of
dtslint-runnerontopnpmscripts. - We could move tests out into their own packages, forcing them to use the
public APIs of the packages they’re testing. Or even, have multiple
tsconfigs to make sure things work with various settings.
Anyway…
I hope this info dump was interesting; I’m really excited to see this change happen.
Big thanks to everyone involved with this big migration, including
Nathan and
Andrew from the TypeScript team,
Zoltan of pnpm fame, and everyone who spent time
finding more ways to speed pnpm and semver up for our ridiculous use case.
-
Honestly, this is a little unsatisfying. If you think about it, all package managers already need to be able to handle multiple versions of the same package when they’re sourced from the npm registry. Theoretically, they could support multiple versions of workspace packages, but alas, no. ↩︎
-
Technically,
link:.would also work, but for “reasons”,workspace:.has better performance. ↩︎ -
There’s still some clarity needed about what these fields are supposed to represent. There are quite a few packages (~200) that don’t have this field set but aren’t on
npmeither. We’ll get it sorted; my hope is that this field becomes defined specifically as “this package is not on npm, do not look at npm for it, but if you do find an npm package with this name, then that may be a problem so CI should fail until we triage the problem”. ↩︎ -
It’s absolutely possible that my take is the wrong one here; I know Go recently recently changed things to treat these special reparse points as some sort of “irregular” file. Honestly, I have no clue. ↩︎