“How could our change possibly cause the slew of errors we’re seeing now,” the exasperated lead developer practically shouted at his manager, “we changed a single file. I checked, double-checked, and triple-checked the changelog. It was just one file. And a static HTML file at that!”
The exchange had become commonplace in the hours-long frantic mix of debugging, blaming, and doing whatever it took to fix production. It was a slew of cryptic errors that made even less sense the more they were analyzed. There were things like “invalid format string” cropping up in places that hadn’t been changed in months, random “missing method exceptions”, and the always-elusive “object reference not set to an instance of an object” exceptions.
“Waitasec,” one of the development engineers jumped up, “I think I figured it out. Is anyone familiar with ServiceStack.Text.dll? That seemed to change… and there’s been a… uhh… binding redirect added to web.config!?”
As it turned out, everyone had been looking in the wrong place. They had assumed that, just because there were no changes to the source code, no changes to their build machine, and no changes to their infrastructure, the application they built the day before would be the same as the application they built today. It was a good assumption, but for one fact: they were using NuGet, and as a result, their build had changed.
NuGet as That Tool
NuGet has become a four-letter word at a lot of organizations. And it happened the exact same way that any other platform, technology, or methodology turns pariah: the organization tries it (knowingly or unknowingly), something really bad happens, a blame game ensues, and eventually, everyone’s finger gets pointed at That Tool. It’s easy to blame the tool.
For at least a generation, the mere mention of That Tool will result in scornful recollections of The Great Failure. Going so far as to recommend That Tool again will cost years of hard-earned credibility. This tool blame phenomenon is so bad that, in agile circles, some practitioners suggest using words like “lean” and “nimble” to broach the topic of Agile, just on the off-chance that the organization has become anti-agile.
Unfortunately, it’s way too easy for NuGet to become That Tool. It was not built with enterprise organizations in mind and as such, using it carelessly in the wrong context can lead to serious and unexpected consequences. But before we get into that (and how to overcome these problems), let’s explain what I mean by exploring what NuGet is good at and what it was built to do.
A World Without NuGet
Picture this. You, your boss, and a few other teammates in the conference room. “Good news, everyone,” the boss beams, “we just sold the unlimited enterprise mega edition, including the new PDF export feature. So, when can you install it for them?”
Naturally, there is no “mega edition” and certainly no PDF export feature, which means that you and your teammates will have to quickly implement PDF functionality. The first step is of course Google, which reveals a plethora of commercial and open-source third-party libraries. Knowing that your overall tools budget is $0, you download the PdfSharp library, extract the .dll to your /lib folder, add it as a project reference, and just like that you can begin exporting PDFs.
So far, there’s not a lot of room for NuGet in this story, but what happens when the boss sells the ZIP extraction feature? And the SAP plug-in? And the Facebook integration?
Not only will searching for these libraries be painful, but you’ll soon find yourself with dozens of third-party libraries to manage. Each will have its own set of DLLs, some of which will overlap with other DLLs, creating a whole mess of versions to test and manage. Getting the project to build on a teammate’s workstation will be difficult enough, let alone on the build server. And that’s before you even consider doing library upgrades. This world of pain is called DLL Hell.
NuGet… to the rescue?
This is where NuGet fits in. According to its creators, NuGet is the “Visual Studio extension that makes it easy to install and update third-party libraries.” If there’s one word that should be emboldened, underlined, and possibly even blinking, it’s “easy”. NuGet is incredibly easy to use.
Open package manager, search for something (say, “PDF”), then click install on the package you’d like to use. Within seconds, you’ll be ready to code. NuGet automatically manages dependencies and makes upgrading just as easy. There’s a good reason it has become so popular.
Unfortunately, NuGet’s “ease of use” can have detrimental consequences in enterprise development environments if used wrongly, which is why it so easily becomes That Tool. It’s not that things in enterprises should be difficult, but tools that are easy to use are often easy to abuse – and NuGet is no exception.
But before we dig into the specific problems and how to overcome them, let’s dive a bit more into how NuGet works.
Dependency Management Overview
The problems that NuGet solves – dependency management and integration – are certainly not new. In fact, the problem space is fairly well known and has been solved numerous times in other platforms. As such, we can look towards other implementations to come up with a common set of concepts.
The package is the key component of all dependency managers, and what everything else is based upon. In NuGet, they’re called Packages (and contain DLLs and other assets); in Java, they’re called Artifacts (and contain jars); and in Ruby, they’re gems (.rb files, etc).
Package metadata describes the contents of a Package, and has three key elements:
- Package ID – a name to uniquely identify a package; Newtonsoft.Json, for example
- Version Number – a specific version of the Package; 4.5.11, for example
- Dependency List – describes other packages which the package requires; see below
There are of course other metadata fields (Description, Authors, etc.), but these three are the most important to consider.
A dependency consists of two key elements:
- Package ID – the name of the package to depend on
- Version Spec – a range of compatible versions of the package
Each dependency manager has their own syntax for a version spec, and allows you to define things like “everything between 1.0 and 3.0, but not 3.0”. In NuGet, the version spec for that would be “[1,3)”.
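To make the interval notation concrete, here is a minimal sketch of a version spec parser. The names (parse_spec, VersionSpec, allows) are made up for illustration and are not part of any NuGet tool, and pre-release suffixes are deliberately ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '4.5.11' into (4, 5, 11, 0); pre-release suffixes are not handled."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad to four components so that "3" and "3.0" compare as equal.
    return tuple(parts + [0] * (4 - len(parts)))


@dataclass(frozen=True)
class VersionSpec:
    low: Optional[Version]   # None means "no lower bound"
    low_inclusive: bool
    high: Optional[Version]  # None means "no upper bound"
    high_inclusive: bool

    def allows(self, version: str) -> bool:
        v = parse_version(version)
        if self.low is not None:
            if v < self.low or (v == self.low and not self.low_inclusive):
                return False
        if self.high is not None:
            if v > self.high or (v == self.high and not self.high_inclusive):
                return False
        return True


def parse_spec(text: str) -> VersionSpec:
    text = text.strip()
    if text[0] not in "[(":
        # A bare version such as "1.0" is a minimum, not an exact pin.
        return VersionSpec(parse_version(text), True, None, False)
    low_inclusive = text[0] == "["
    high_inclusive = text[-1] == "]"
    body = text[1:-1]
    if "," not in body:
        # "[1.0]" names exactly one version.
        exact = parse_version(body)
        return VersionSpec(exact, True, exact, True)
    low_text, high_text = (part.strip() for part in body.split(","))
    low = parse_version(low_text) if low_text else None
    high = parse_version(high_text) if high_text else None
    return VersionSpec(low, low_inclusive, high, high_inclusive)


spec = parse_spec("[1,3)")
print(spec.allows("1.0"), spec.allows("2.9"), spec.allows("3.0"))
```

Square brackets include the boundary and parentheses exclude it, which is why 3.0 is rejected by “[1,3)” while 1.0 is accepted.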
A dependency graph is the result of mapping a specific package and all of its dependencies’ versions. This is where the Version Spec comes in: if one package requires Foo[1,3) and another requires Foo[2,3], the dependency graph must include the latest allowable version of Foo. That might be Foo 2.9.
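The Foo example can be sketched in a few lines. This follows the “latest allowable version” rule exactly as described above; it is not a model of what any particular client does. The list of published versions is hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple

Version = Tuple[int, int]


class Range(NamedTuple):
    low: Version
    low_inclusive: bool
    high: Version
    high_inclusive: bool

    def allows(self, v: Version) -> bool:
        above = v > self.low or (v == self.low and self.low_inclusive)
        below = v < self.high or (v == self.high and self.high_inclusive)
        return above and below


def latest_allowed(available: List[Version], ranges: List[Range]) -> Optional[Version]:
    """Return the highest available version that every range accepts."""
    candidates = [v for v in available if all(r.allows(v) for r in ranges)]
    return max(candidates) if candidates else None


# Foo[1,3) and Foo[2,3] from the text; the published versions are made up.
first = Range((1, 0), True, (3, 0), False)
second = Range((2, 0), True, (3, 0), True)
published = [(1, 0), (1, 5), (2, 0), (2, 9), (3, 0)]

print(latest_allowed(published, [first, second]))  # prints (2, 9)
```

Foo 3.0 satisfies the second range but not the first, so the two requirements only overlap on versions from 2.0 up to, but not including, 3.0.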
A package repository is “like source control, but for packages.” Most users are only familiar with the publicly-available repositories (nuget.org, maven.org, rubygems.org, etc), but a package repository could be any place where packages may be searched and retrieved – inside a firewall or on a public website.
Package Management Platform
The platform defines not only all of the aforementioned concepts, but includes the totality of associated technologies. Thus, the NuGet platform includes:
- NuGet (the Visual Studio extension) – the extension itself
- NuGet Gallery – the publicly available repository hosted at nuget.org/packages
- NuGet Package – a zip file with a .nupkg extension that contains files (e.g. DLLs) and metadata
- NuGet Package Repository – a server where NuGet packages are hosted
- NuGet Client Tools – the handful of command-line and GUI-based tools, such as nuget.exe
Brief History of NuGet
NuGet made its big debut on October 5th, 2010, when a small cadre of Microsoft’s community bloggers – Scott Hanselman, Phil Haack, and Scott Guthrie – simultaneously announced the launch of NuGet (or, NuPack in those days). It was a project they (and a few other Web Platform and Tools members) had been working on for months, and that was officially blessed by Outercurve (Microsoft’s open source initiative). Overnight, it became the de facto standard for .NET package management.
Before that fateful day, there was obviously lots of coding, lots of planning, lots of design, and most certainly, lots of research into existing package management systems on other platforms. At the time, there were two well-established systems.
- Maven – used by Java developers (especially in enterprise circles), with a largely decentralized approach (i.e. there are numerous public repositories)
- Ruby Gems – from the Ruby community, a virtual “must have” for developing Ruby applications, with almost all packages (called “gems”) residing at rubygems.org
Although .NET is a much closer fit to Java – both technologically and in usage – NuGet’s designers chose to use Ruby Gems for much of their inspiration. There were a lot of reasons for this, but a major one was to help cement Microsoft’s commitment to open source and to grow the .NET Open Source community. And it certainly worked – by the time of this writing, there are over 10,000 packages hosted on NuGet, and it’s become the hub for publishing new, free, and open source packages.
But what works in the open source world does not necessarily work in enterprise development, and this is especially true with NuGet.
The Enterprise Development Challenge
One of the major tenets of software development in the enterprise is risk management. Whether it’s the risk of budget overrun, or the risk of not being ready on time, or the risk of production failure, software produced to power business functions needs to be as predictable as any other business activity – even if that comes at the cost of new features or higher-quality code.
While no business activity is without risk, organizations need to focus their risk-taking on their core business functions – ideally, leaving everything else to be as risk-free as buying electricity from the utility company.
Open source software, on the other hand, doesn’t share this tenet, and instead can focus on new features and adapting to the community’s needs. And since NuGet is built to support the open source community, those needs are often different than enterprise development’s needs.
Enterprise Annoyances with NuGet
Open source libraries existed long before NuGet came around, and many of these libraries have become trusted by the .NET community. They’re written about on blogs (sometimes books), used in talks, and given as solutions on many a Q&A site. The C# Zip Library SharpZipLib, for example, is one of the most popular tools for working with ZIP files in .NET.
A Google search for “sharp zip library” takes you straight to the official source of the library (icsharpcode.net), but a NuGet search brings up a number of different packages:
Which one is the official SharpZipLib? Turns out none of them are. Anyone (in this case, the user “bigsan”) can upload anything they want, and then call it whatever they want, so long as that exact name isn’t taken. There is no concept of namespace ownership, so anyone could easily upload a log4net-winrt that works exactly like log4net, except that it scans for strings that look like credit card numbers and sends them to a server in Russia. There is no verification or validation in NuGet, which means one can only hope that 50,000+ other downloaders weren’t wrong when they picked bigsan’s SharpZipLib binary.
When you install a package, NuGet will automatically install the packages that package depends on. What can be confusing, however, is how NuGet determines which versions of the packages are needed.
If an author specifies <dependency id="ExamplePackage" version="1.2.3" /> in his package, NuGet will not interpret that to be “ExamplePackage v1.2.3”, as many package authors might think. According to the NuGet documentation, the dependency element specifies a dependency on version 1.2.3 or higher of the package named ExamplePackage.
But even that’s not entirely accurate. The specific behavior is rather difficult to describe (this 2011 blog post from one of the NuGet authors somewhat explains it), but it could get 1.2.3, 188.8.131.52, 1.2.4, depending on the other versions of the package available. As your project depends on more and more packages (and those packages depend on more packages), knowing exactly what version of what packages you’ll need becomes difficult.
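One way to take some of the guesswork out of this is to audit a package’s dependency list for anything that isn’t an exact pin. The sketch below assumes a simplified .nuspec fragment; apart from ExamplePackage, the package names are hypothetical, as are the helper names. It relies only on the documented notation in which “[1.2.3]” means exactly 1.2.3 and a bare “1.2.3” means 1.2.3 or higher.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical .nuspec fragment; only the first dependency comes from the text.
SAMPLE_NUSPEC = """<package>
  <metadata>
    <id>Example.Consumer</id>
    <version>1.0.0</version>
    <dependencies>
      <dependency id="ExamplePackage" version="1.2.3" />
      <dependency id="PinnedPackage" version="[4.5.11]" />
      <dependency id="RangedPackage" version="[1,3)" />
    </dependencies>
  </metadata>
</package>"""


def local_name(tag: str) -> str:
    """Drop any '{namespace}' prefix so the code works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def is_exact_pin(spec: str) -> bool:
    """Only the '[x.y.z]' form (brackets, no comma) names a single version."""
    spec = spec.strip()
    return spec.startswith("[") and spec.endswith("]") and "," not in spec


def unpinned_dependencies(nuspec_xml: str) -> List[Tuple[str, str]]:
    """List (id, version spec) pairs that can resolve to more than one version."""
    root = ET.fromstring(nuspec_xml)
    result = []
    for element in root.iter():
        if local_name(element.tag) != "dependency":
            continue
        spec = element.get("version", "")
        if not is_exact_pin(spec):
            result.append((element.get("id"), spec))
    return result


for package_id, spec in unpinned_dependencies(SAMPLE_NUSPEC):
    print(f"{package_id}: '{spec}' can resolve to more than one version")
```

A report like this won’t tell you which version you will actually get, but it does show where the answer depends on what happens to be published.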
Unexpected and Irreversible Project Changes
With project files, different framework versions, multiple languages, and so on, .NET solutions are significantly more complicated than Ruby projects, which are just a directory of files. To ensure that NuGet package installation “just works”, NuGet offers package authors a lot of tools to change the project’s configuration and thus make sure packages are installed painlessly. It’s these same tools, however, that can lead towards a lot of unexpected and irreversible pain during deployment.
While these changes are theoretically reversible – just carefully roll back the broken changes to an earlier version of working code in source control – they may not be detected until it’s too late, and the consequences of an errant change have already been deployed to production.
When dependency resolution fails – for example, if one project requests 1.2.3 of a package, and another requests 1.2.4 – NuGet may silently add a binding redirect to the application configuration file to attempt to resolve this conflict. This effectively bypasses compile-time verification and advises the runtime that it’s OK to load a different version of an assembly.
In general, binding redirects lead towards confusing and difficult to debug error messages. If, for example, a method signature changed in 1.2.4 (a parameter was added), the error won’t be noticed until that specific method is called at runtime.
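Since a redirect is just a few lines of XML in the configuration file, it’s easy to surface one before it reaches production. The following is a minimal sketch that lists bindingRedirect entries and reports the ones that appeared between two versions of a config file. The sample config and its version numbers are made up; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple

# Namespace used by the assemblyBinding section of .NET configuration files.
ASM_NS = "{urn:schemas-microsoft-com:asm.v1}"

# Hypothetical web.config after a package install; version numbers are made up.
SAMPLE_CONFIG = """<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="ServiceStack.Text" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-1.2.4.0" newVersion="1.2.4.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>"""

Redirect = Tuple[str, str, str]  # (assembly name, oldVersion, newVersion)


def find_binding_redirects(config_xml: str) -> List[Redirect]:
    """Return every binding redirect declared in the given config text."""
    root = ET.fromstring(config_xml)
    found = []
    for dependent in root.iter(ASM_NS + "dependentAssembly"):
        identity = dependent.find(ASM_NS + "assemblyIdentity")
        redirect = dependent.find(ASM_NS + "bindingRedirect")
        if identity is None or redirect is None:
            continue
        found.append((identity.get("name"),
                      redirect.get("oldVersion"),
                      redirect.get("newVersion")))
    return found


def added_redirects(before_xml: str, after_xml: str) -> Set[Redirect]:
    """Redirects present in the newer config but not in the older one."""
    return set(find_binding_redirects(after_xml)) - set(find_binding_redirects(before_xml))


print(added_redirects("<configuration />", SAMPLE_CONFIG))
```

Run against the last deployed config and the one about to be deployed, a check like this would have pointed straight at ServiceStack.Text in the opening story.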
.config File Transforms
When NuGet sees that a package contains an “app.config.transform” file, it modifies the application’s configuration file (without confirmation) by applying whatever changes the “.transform” file calls for. This is generally used to add configuration sections and necessary application settings, but it’s up to the package author to make sure it doesn’t interfere with existing configuration.
NuGet offers an option to “download missing packages during build” – a feature which allows both build servers and developers without NuGet to have packages downloaded automatically. The specific behavior of this feature changes frequently, but as of NuGet 2.3 clicking “Enable Package Restore” will:
- Download NuGet.exe and NuGet.targets from NuGet.org
- Add a solution folder named .nuget containing these two files
- Change every project in the solution to import the NuGet.targets MsBuild task
The NuGet.targets file also changes fairly frequently, but in general it will do the following at build time:
- Download NuGet.exe from NuGet.org
- Run NuGet.exe, which will then go to NuGet.org and download the latest version of NuGet.exe
- Download each package in packages.config from NuGet.org and install it to the packages directory
Advanced users can certainly edit the .targets file to behave differently, but many developers will unknowingly keep this default behavior.
Automatic Import of Build Targets and Properties
Starting in NuGet 2.5, when a package includes .targets and .props files, NuGet will automatically import these into each project. Because .targets files can be used to execute any code (or process for that matter) on the machine where the build runs, this can lead to far more than just broken builds. A malicious author could design a .targets file that modifies the build machine itself to inject a backdoor into the binaries it compiles. Because these backdoors would only be added by the build machine, they’d never be apparent in the source code.
Arbitrary PowerShell Script Execution
If the NuGet package contains any of the following files – install.ps1, init.ps1, uninstall.ps1 – NuGet will simply execute those scripts without confirmation upon package installation. While this PowerShell script is only run on the developer’s workstation, it could do anything. Even without administrative privileges, a malicious script could perform a network scan and, using the developer’s integrated credentials, identify and exploit internal network vulnerabilities.
Like most software, virtually all packages on NuGet are copyrighted works that are licensed for use; they’re not free, public domain works. Even if these licenses are not read, they’re still a legally-binding agreement between the licensee and the licenser – and that agreement can contain some unexpected terms. GPLv3, for example, requires that all licensees publish their source code as open source GPLv3 code. Not doing so is a breach of the licensing agreement, which is effectively copyright infringement.
NuGet makes it incredibly easy for a developer to bind his employer to a license agreement: just install the package.
Getting Around NuGet Annoyances
In the beginning narrative, NuGet’s tendency to automatically add binding redirects was the cause of the team’s production failures – but any one of these gotchas can lead to similar problems in production. The solution is twofold: carefully monitor changes to the entire codebase (not just code files) and ensure that developers don’t treat NuGet as an all-you-can-eat buffet of unlimited free and open source code.
While the former can be tedious, the latter is relatively easy to solve with a private NuGet repository.
Private NuGet Repository
A private repository serves as an in-house, exclusive NuGet.org – only the packages that have been approved by the organization are hosted there, and it’s the only source that developers (and, most importantly, build machines) can use for packages. While a private repository needn’t be any more than a network share (just as source code technically doesn’t need to be in source control), there are several high-quality products that specifically address the problem of maintaining an in-house private repository.
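A private repository restricts what can be downloaded, but it can also help to verify what a project actually references. As a rough illustration, and not a feature of any repository product, the sketch below compares a packages.config against an approved list. The approved list, the sample file, and the function name are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple

# Hypothetical approved list; in practice it might be exported from the private feed.
# IDs are stored in lowercase because the comparison below ignores case.
APPROVED: Set[Tuple[str, str]] = {
    ("newtonsoft.json", "4.5.11"),
    ("examplepackage", "1.2.3"),
}

# Hypothetical packages.config as it might appear in a project.
SAMPLE_PACKAGES_CONFIG = """<packages>
  <package id="Newtonsoft.Json" version="4.5.11" targetFramework="net40" />
  <package id="ExamplePackage" version="1.2.4" targetFramework="net40" />
</packages>"""


def unapproved_packages(packages_config_xml: str,
                        approved: Set[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (id, version) pairs that are not on the approved list."""
    root = ET.fromstring(packages_config_xml)
    rejected = []
    for package in root.iter("package"):
        package_id = package.get("id", "")
        version = package.get("version", "")
        if (package_id.lower(), version) not in approved:
            rejected.append((package_id, version))
    return rejected


for package_id, version in unapproved_packages(SAMPLE_PACKAGES_CONFIG, APPROVED):
    print(f"Not approved: {package_id} {version}")
```

Here ExamplePackage is flagged because the project references 1.2.4 while only 1.2.3 was approved, which is exactly the kind of drift that is hard to spot by eye.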
At a minimum an enterprise-grade private repository should offer the following:
- Security – give different groups of users different access – read, write, overwrite, administer, etc
- Multiple Feeds – allows for not only different groups to have different packages, but allows for internal publishing of packages
- Connectors – allow a private feed to display filtered packages from another feed (such as NuGet.org)
Following is a comparison chart listing different private repository options that meet these basic requirements, and offer more.
Preparing Packages for the Enterprise
Even with a private NuGet repository, third-party packages can produce undesired behavior when using NuGet client tools. The good news is that these packages are easy to download, inspect, and edit.
Before adding a package to your private repository, you can use the free NuGet Package Explorer or simply rename the package file from “.nupkg” to “.zip” and then open it with Windows Explorer. Packages follow a fairly-stable (but poorly documented) packaging convention. As such, you can mitigate the risk of undesired behavior by inspecting and modifying the following from package zip files:
- tools folder – stores the PowerShell scripts that are executed
- build folder – stores the MSBuild .targets and .props files that are automatically imported into the project (NuGet 2.5 and later)
- content folder – configuration file transforms (*.transform) and other files that are automatically added to the solution
- lib folder – the actual .dll libraries that are added to the project
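Because a package is just a zip file, this inspection can be scripted. The sketch below groups a package’s entries by the kind of side effect they can have; it classifies by file extension rather than by folder, so a script or targets file is flagged wherever it sits. The sample package is built in memory purely for demonstration, and all file and function names are hypothetical.

```python
import io
import zipfile
from typing import Dict, List


def classify(entry_name: str) -> str:
    """Label a package entry by the kind of change it can make."""
    name = entry_name.replace("\\", "/").lower()
    if name.endswith(".ps1"):
        return "powershell script"
    if name.endswith((".targets", ".props")):
        return "msbuild import"
    if name.endswith(".transform"):
        return "config transform"
    if name.startswith("lib/") and name.endswith(".dll"):
        return "library"
    return "other"


def inspect_package(nupkg) -> Dict[str, List[str]]:
    """Group the entries of a .nupkg (path or file-like object) by label."""
    report: Dict[str, List[str]] = {}
    with zipfile.ZipFile(nupkg) as archive:
        for entry in archive.namelist():
            report.setdefault(classify(entry), []).append(entry)
    return report


def build_sample_package() -> io.BytesIO:
    """Create a tiny fake package in memory so the sketch runs on its own."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("Example.nuspec",
                     "lib/net40/Example.dll",
                     "tools/install.ps1",
                     "content/web.config.transform",
                     "build/Example.targets"):
            archive.writestr(name, "")
    buffer.seek(0)
    return buffer


for label, entries in sorted(inspect_package(build_sample_package()).items()):
    print(f"{label}: {', '.join(entries)}")
```

Anything that shows up as a script, an import, or a transform deserves a careful read before the package is published to the private feed.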
To avoid confusing a modified third-party package for internal use and the “official” package from NuGet.org, it may be worth renaming the package by prefixing it with “ThirdParty.” or something to that effect. This is easily accomplished with NuGet Package Explorer, but may also be done by renaming the file and editing the “nuspec” file within the package.
Note that modifying a third-party package (even if it’s free and open source) may be a violation of the package’s license agreement – as with all third-party libraries, make sure to read the license before integrating the code.
Alternative Client Tools
In the Java community, the Maven space has grown to include not only multiple types of private repositories (Archiva, Artifactory, Nexus), but also multiple client tools (Maven2, AntMaven, Ivy, etc). Each of these behaves differently, and each is appropriate for different development environments. The same is happening in the NuGet space.
Currently, the primary alternative to the official NuGet client tools is the ProGet Client Tools from Inedo. As of this writing, they’re still early in the lifecycle (0.2), but are shaping up to be a suitable alternative for enterprise development scenarios. Not only do they form a tighter integration with private repositories, but they also mitigate the NuGet annoyances by:
- letting users “opt out” of non-reversible changes (config transforms, PowerShell scripts, etc)
- making installation of older package versions easy
- requiring explicit unbounded dependencies for resolution instead of assuming “newer is better”
As the NuGet user base expands, tools to help the enterprise will only grow and improve.
Taking NuGet in the Enterprise
Although NuGet was not designed for enterprise development scenarios, it can be used to solve the dependency management problem that enterprise organizations face. NuGet and the surrounding ecosystem have evolved quite a bit over the past few years, and while its introduction still warrants a careful approach, it’s a necessary step for organizations to take. Not only will developers start expecting NuGet to be available as a development tool, but Microsoft has started distributing their own libraries and platforms on NuGet: Visual Studio 2012 now ships with NuGet, and ASP.NET MVC 4 applications require NuGet.
Because NuGet will continue to be developed through Microsoft’s open source initiatives (at least for the foreseeable future), its evolution and development may continue to be misaligned with Microsoft’s enterprise initiatives – but the platform is here to stay, and it’s no longer a question of if it should be implemented, but when. With the pace that development is accelerating – and the fact that NuGet isn’t that challenging to implement – the when may as well be now.
Beyond Dependency Management
Several tools – both open source and commercial – have repurposed NuGet components for uses other than dependency management. While these components will obviously share NuGet’s annoyances, the same techniques can be applied to mitigate these annoyances in the enterprise.
One such tool is Chocolatey. It’s described as “somewhat like apt-get, but built with Windows in mind,” and allows users to install programs like Notepad++, Git, and 7Zip with a single command at the Command Prompt. The Chocolatey client accomplishes this by downloading the corresponding .nupkg file from chocolatey.org and executing the install.ps1 contained within.
Obviously, most of NuGet’s dependency management annoyances aren’t applicable with Chocolatey, but Package Verification, Arbitrary PowerShell Script Execution, and Unexpected Licensing are equally – if not more – problematic. But they can all be mitigated with a private NuGet repository and careful package preparation.
Another tool that uses NuGet components is Red Gate’s Deployment Manager. Essentially, it retrieves application components (which have been packaged as NuGet packages) from a private NuGet repository and deploys those components to various servers. But in this case, since the packages come from known sources (i.e. built by the organization), are used outside of the context of development, and are already housed in a private repository, none of NuGet’s annoyances have transferred.
All new tools brought into the enterprise need to be carefully adopted, but as these examples show, just because a tool uses NuGet components doesn’t mean that it inherits NuGet’s annoyances. And even if it does (as is the case with Chocolatey), it’s just as easy to mitigate.