At Lightning AI, we made the decision to use a monorepo a while back. By and large, we’ve been really happy with it. It’s made supporting developers within a diverse and complex code ecosystem much simpler.
That said, we’ve run into a handful of pain points along the way. None of them are particularly surprising, nor are they necessarily fatal – but they’re worth being aware of before you commit, particularly if you’re a smaller organization:
Many SaaS Analysis Tools Don’t Work
Many SaaS-based analysis tools bake in a deep assumption that repo == project. Some look for a lockfile at the root of the repo and simply won’t proceed without one.
Some tools we evaluated just weren’t going to work with a monorepo:
Some tools that did pass this particular filter:
Some tools are stuck in a middle ground of kinda-sorta working – maybe not perfectly, and maybe not once the needs of different sub-projects diverge. For example, CodeFactor forces you to choose a single Python version repo-wide. We have some projects in Python 2.x and some in 3.x. Not a perfect fit – but not necessarily fatal just yet. We’re hobbling along with the switch set to 3.x.
Code Coverage Reporting Can Be a Mess
Most coverage tracking plugins want to report paths relative to the project root, not the repo root. This can lead to intermingled results, and bad data. So far we’ve had to tweak the configuration of every coverage-reporting plugin to compensate for this.
For example, our setup for SimpleCov and CodeCov looks something like this:
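A hedged sketch of the general shape (the `projects/my_service` directory and file paths here are hypothetical – the real layout will differ). On the SimpleCov side, each project re-roots its reporting at the repo root so results from different projects don’t intermingle:

```ruby
# projects/my_service/spec/spec_helper.rb (hypothetical path)
require 'simplecov'

SimpleCov.start do
  # Report file paths relative to the repo root rather than this
  # project's root, so results don't collide across projects.
  root File.expand_path('../../..', __dir__)
  # Keep each project's coverage output in its own directory.
  coverage_dir 'projects/my_service/coverage'
end
```

And on the CodeCov side, a repo-root `codecov.yml` can rewrite report paths for projects whose tooling can’t be re-rooted:

```yaml
# codecov.yml at the repo root
fixes:
  # Prepend the project directory so per-project report paths
  # line up with repo-root-relative paths.
  - "::projects/my_service/"
```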
We’ve also been bitten by coverage-reduction warnings on PRs that touch projects where coverage isn’t collected at all: line counts increase, pushing the covered percentage down. Our only solution so far has been to explicitly configure CodeCov to ignore every project directory where coverage isn’t actually enabled.
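Concretely, that ignore list is just a `codecov.yml` stanza (the directory names here are illustrative, not our actual layout):

```yaml
# codecov.yml
ignore:
  # Projects with no coverage instrumentation; without this, their
  # line counts drag the repo-wide percentage down on unrelated PRs.
  - "projects/legacy_tool/**"
  - "projects/infra_scripts/**"
```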
Tools Get Slower
While a PR might only touch one project, all the analyses for all the projects will be executed. This can slow things down considerably as your codebase grows.
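For the CI jobs you do control, path filters can at least skip projects a PR doesn’t touch – though SaaS analyzers that hook the whole repo typically won’t respect this, which is part of the problem. A sketch, assuming GitHub Actions and a hypothetical `projects/my_service` layout:

```yaml
# .github/workflows/my_service.yml (hypothetical)
name: my_service checks
on:
  pull_request:
    paths:
      - "projects/my_service/**"  # run only when this project changes
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical test entrypoint for this project
      - run: cd projects/my_service && make test
```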
It’s Better If You Roll Your Own
All of these problems share one thing in common: if we were rolling our own tooling, we could avoid them. Organizations with thousands of engineers often have both the time and the need to customize their tools (e.g. Facebook’s work on Mercurial) or to build new ones (e.g. the Android team’s Repo tool), and thus may feel fewer of the monorepo pain points that a smaller org would.
We’re a smaller team without the resources to set up a more custom CI / analysis flow just yet. In the meantime, we have a few tools set to not block builds – and other tools we’re just passing on, or running locally. Developers have been advised to be mindful of whether any particular results are relevant to their PRs, and ignore as necessary. Not a scalable solution, but perfectly reasonable for now.
If you’re accustomed to a model of merging into a development branch, and then periodically merging that into a master branch, prepare for much greater coordination overhead. You’re probably better off with a simpler flow: merge feature branches directly to master, with no intermediate staging branch.
We love our monorepo, and we’re sticking with the technique. It has pulled forward the timeline for building our own in-house CI and analysis tools, which adds to the cost side of the ledger – but we feel the reduced day-to-day overhead is well worth it.