Two Tips for Go’s pkgsite

2023-04-23

Here are two things that I learned after I listed a Go module on pkg.go.dev. Neither was obvious to me; hopefully, this will be useful to other people too.

Tip 1: Be Careful with Public Repos

After (deliberately) listing something I wrote on pkg.go.dev, I discovered that several repositories of mine were already there. I hadn’t listed those repos myself, so I was initially confused. After reading the about page on pkgsite and talking to people in Go’s Slack, I realized that I was only surprised because I didn’t understand how listings end up on pkg.go.dev.

For no good reason, I assumed that pkg.go.dev worked like CPAN, RubyGems, or LuaRocks. On those sites, users create accounts and then publish their libraries. Without worrying about the details, the key thing is that someone has to deliberately list their code for it to end up on CPAN, RubyGems, or LuaRocks.

Things are different at pkg.go.dev. Someone can deliberately list their package there, but the package can also end up there in other ways. Most importantly, if anyone runs go get or go install on a public repository with Go code, then that repository will probably (see below for more details) end up listed on pkg.go.dev.

Taking a step back, there are three ways to get something listed on pkg.go.dev. (Nothing is uploaded in any of them; the site fetches the code from the public repository.)

  1. Visit what would be the page for the package on pkg.go.dev and click the “Request” button. (See the documentation for an example of what I mean by what would be the page for the package.)
  2. Make a GET request to the appropriate endpoint on proxy.golang.org. (See the documentation for an example of what I mean by the appropriate endpoint.)
  3. Download the package with, e.g., go get or go install.
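
To make the first two options concrete, here is a minimal sketch of how the two URLs are formed. It assumes the layout described in the module proxy protocol (module path, then /@v/, then the version plus .info) and that protocol's case encoding, where each uppercase letter becomes "!" followed by its lowercase form. The module path example.com/my/module, the version v1.0.0, and the helper names are placeholders of mine, not anything pkg.go.dev provides.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the case encoding used by the module proxy protocol:
// every uppercase ASCII letter is replaced by '!' plus its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// pkgsiteURL returns what would be the page for the module on pkg.go.dev.
// (Hypothetical helper name.)
func pkgsiteURL(modulePath string) string {
	return "https://pkg.go.dev/" + modulePath
}

// proxyInfoURL returns the proxy.golang.org endpoint that describes one
// version of a module. (Hypothetical helper name.)
func proxyInfoURL(modulePath, version string) string {
	return "https://proxy.golang.org/" + escapePath(modulePath) +
		"/@v/" + escapePath(version) + ".info"
}

func main() {
	// Placeholder module path and version; substitute your own.
	fmt.Println(pkgsiteURL("example.com/my/module"))
	fmt.Println(proxyInfoURL("example.com/my/module", "v1.0.0"))
}
```

Opening the first URL in a browser corresponds to option 1, and a GET request to the second corresponds to option 2. The sketch only builds the strings; it makes no network requests.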

The third case is the tricky one. In the first two cases, someone deliberately tries to list a package. But I suspect that many people have no idea that if they download a package using go, they will cause the package to be listed publicly on pkg.go.dev.

Why does this matter? Well, I have lots of repositories on GitHub that I share publicly (because why not), but which I don’t consider worth listing on a website where people go to find serious programming libraries. Unfortunately, several of those are now listed on pkg.go.dev. However they got there, it’s pretty silly to have things like my notes and exercises for The Go Programming Language available on pkg.go.dev.

All that said, it is easy to avoid this problem (once you know to avoid it). On projects that you may share publicly in a git repository but that you don’t consider serious enough for pkg.go.dev, don’t use a proper, full import path. Instead of go mod init github.com/user/package, use go mod init package. If the module path is invalid, that is, if it doesn’t match a location that the module can be downloaded from, then pkg.go.dev won’t list it. (Even better, use go mod init test or go mod init example/foo. As an unmerged update to Go’s documentation about modules explains, “the paths example and test are reserved for users.” Go’s standard library will never use them.)
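
As a rough illustration of why a bare name like package keeps a module off the site, here is a small sketch. It rests on one convention as I understand it: the first element of a downloadable module path is a domain name, so it contains a dot. The function name is my own, and this is only a heuristic for checking your own go.mod, not the validation that pkg.go.dev or the go command actually performs.

```go
package main

import (
	"fmt"
	"strings"
)

// firstElemHasDot reports whether the first element of a module path
// (the part before the first slash) contains a dot, as a domain name would.
// Hypothetical helper: a rough heuristic, not the official validation.
func firstElemHasDot(modulePath string) bool {
	first, _, _ := strings.Cut(modulePath, "/")
	return strings.Contains(first, ".")
}

func main() {
	paths := []string{
		"github.com/user/package", // full import path
		"package",                 // bare name
		"test",                    // reserved for users
		"example/foo",             // reserved for users
	}
	for _, p := range paths {
		fmt.Printf("%-26s domain-like first element: %v\n", p, firstElemHasDot(p))
	}
}
```

Only the first path has a domain-like first element; the other three are names that cannot be fetched under those paths.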

Tip 2: How to Hide a README

Many git hosting sites give README files special treatment. They are often treated as landing pages for repositories, and CSS can make them even more pleasant to read. If you have a README in the root of a repository that you list on pkg.go.dev, the site will automatically place the contents of the README at the top of the package’s documentation. I’m sure that this is often what people want, but sometimes it isn’t. One solution, which I found on Stack Overflow, is to put the README into a subfolder. (As an added convenience, GitHub will still use, e.g., docs/README.md as a landing page, but pkg.go.dev will leave the README out of the documentation if you put it in the subfolder.)

Why Share What You Don’t Want to Share?

One more thought. On Go’s Slack, several people asked me, in effect, “Why share what you don’t want to share?” That is, they wondered why I cared about things ending up on pkg.go.dev if I had already shared them on GitHub. I hadn’t thought about this before, but here’s my rough answer.

In my mind, there are several different situations. In some cases, I can’t share things because they are for work or contain private data. Things like that clearly don’t go into public repositories. In other cases, I write something that I think is terrible (even if it gets some job done). I’m ashamed of it, so I keep it private. In a very few cases, I write something I think is worth sharing, and I share it. But almost everything else (which at a guess is about 90% of everything I write) is neither awful nor worth much to other people. (A long time ago, I read that an overwhelming percentage of academic articles are only ever cited by their original author(s). I don’t remember the exact numbers, but it was as high as 75% or 80%. I suspect a ton of open-source software is the same way.)

I often make this code, which is fine but not especially useful for other people, public on GitHub, and I even license it properly, but I make zero effort beyond that to share it. To put this another way, I think that there is (and should be) a significant distinction between sharing something on GitHub (or the equivalent) and sharing something on RubyGems or pkg.go.dev. In effect, the people I talked with in Go’s Slack didn’t think that this distinction was real or useful. Maybe they’re right. I wonder what other people think.