Refactoring a product is tricky
It's easy to talk about refactoring at the codebase level because there's a reflexive definition: a refactor is a change to the codebase that preserves all existing functionality. If the code change modifies the functionality in any way, it is not a refactor and therefore needs to be considered in a different light. One of the reasons I love refactors is that they are why our codebase is, for the most part, very clean: we do a good job of rigorously reorganizing our implementations over time. Doing the same from a product or design perspective is definitionally impossible, because any change is a change in behavior. Therefore, any change at sufficient scale needs to be handled with the requisite weight that comes from knowing that you're moving somebody's cheese.
If you can't refactor a product and you can't clean up after your poor design decisions, what do you do? One answer is to do nothing and to let poor choices sit in perpetuity. I liken this model to that of a college campus that has lovingly maintained, for history's sake, its old and crumbling buildings and built newer and better ones around them instead of bulldozing the old ones. You can sense this in some older products too: it's the feeling of going from one page to another and discovering a completely different design language.
This is abstract. Let me make it more concrete. Buttondown has a webhooks feature that is exactly like every other webhooks feature, and it predates our automations feature, which is, to some extent, exactly like every other automations feature. That's not a critique so much as a shorthand, because the design of either feature is not particularly germane. At some point, I realized that webhooks are a strict subset of automations. They both have the exact same backend implementation and, in fact, share many code paths. When a certain event occurs in the codebase, we check to see if there's a webhook for that given event, and then we check to see if there are one or more automations for that given event. This kind of dual-track functionality upsets my engineering brain. From here, we have four options:
- Do nothing. Having a parallel implementation of automations is not really causing any problems. It objectively adds a bit of technical debt, but only just. The same cannot be said of the UX. Right now, our webhooks UX is admittedly janky. We've invested a lot and are investing even more in improving the automations UX because it's such a load-bearing feature for us, but webhooks are relatively malnourished.
- Unify the backend while maintaining separate front ends. Get rid of all the webhook-specific code and just have a flag or metadata on the automation that says this is technically a webhook object. We get the benefits of combining the two pieces of infrastructure without having to worry too much about breaking changes on the front end or on the API. This solves the backend problem, which we have just said is not that big, and leaves the larger problem, the malnourished front end, still kicking.
- Get rid of webhooks entirely. Just delete them, backfill all webhooks to automations to preserve existing functionality, and call it a day. This definitively falls into the realm of breaking customers' expectations, if not their functionality. And while we get to clean up a lot of our documentation, the front end will take on a large burden in communicating these changes to users and helping them adapt, which seems particularly relevant given that webhooks tend to be the kind of thing that you set and forget.
- Update the UX alongside the backend unification mentioned in option two, in some way that hints at the fact that webhooks are secretly backed by an automation and are not a true standalone primitive. The biggest advantage of this approach is that it's something that we want to use for other purposes, because webhooks are not the only thing that is secretly an automation in a trench coat. Subscriber cleanup, subscriber reminders, and a few other things fit this bill. However, this is difficult to do because now we're not talking about a refactor but a full-on redesign, and one with wide-ranging implications for our core application.
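To make the backend half of this concrete, here is a minimal sketch of the dual-track dispatch described above, next to the unified version from option two. It is an illustration under stated assumptions, not our actual implementation: every name in it (`Webhook`, `Automation`, `is_webhook`, `post_to`, `backfill`, the `subscriber.created` event) is hypothetical, and the HTTP call is replaced with a string so the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable

# Every name below is hypothetical and exists only for illustration.


@dataclass
class Webhook:
    event: str
    url: str


@dataclass
class Automation:
    event: str
    action: Callable[[dict], str]
    # Option two: a flag saying "this automation is technically a webhook".
    is_webhook: bool = False


def post_to(url: str) -> Callable[[dict], str]:
    """Stand-in for an HTTP POST: describes the request instead of sending it."""
    def action(payload: dict) -> str:
        return f"POST {url} {sorted(payload)}"
    return action


def dispatch_dual_track(event, payload, webhooks, automations):
    """The current shape: check for webhooks, then check for automations."""
    results = [post_to(w.url)(payload) for w in webhooks if w.event == event]
    results += [a.action(payload) for a in automations if a.event == event]
    return results


def backfill(webhooks):
    """Convert each webhook into an automation flagged as a webhook."""
    return [Automation(w.event, post_to(w.url), is_webhook=True) for w in webhooks]


def dispatch_unified(event, payload, automations):
    """The unified shape: a single track, and webhooks are just automations."""
    return [a.action(payload) for a in automations if a.event == event]


webhooks = [Webhook("subscriber.created", "https://example.com/hook")]
automations = [Automation("subscriber.created", lambda payload: "send welcome email")]
payload = {"email": "reader@example.com"}

before = dispatch_dual_track("subscriber.created", payload, webhooks, automations)
after = dispatch_unified("subscriber.created", payload, backfill(webhooks) + automations)

# Same observable behavior on both tracks: this part really is a refactor.
assert before == after
```

The final assertion is the whole point: the backend half can be verified as behavior-preserving, which is exactly what the UX half of options three and four cannot offer.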
The trouble with all of this is that the problem we're trying to solve is not a big one. People don't write in to complain about the UX of setting up webhooks. The feature is a technical skeuomorph that pattern-matches onto industry conventions. So if we're going to do something relatively onerous, we need to be confident that the change brings value commensurate with the cost. Sometimes that conviction is easy to muster. Other times, it is less so.
I started writing out this essay in hopes that the process of enumeration and navel-gazing would reveal the right choice. And unfortunately, we've come to the end and I am still torn, and that in and of itself is a kind of failure mode.
Thinking hard for an hour about something and not having anything to show for it, when there are so many other things to spend an hour thinking hard on, strikes me as a bit of a process failure.
But this, I think, usefully illustrates why back-office, table-stakes features like webhooks end up in a state of perpetual malnourishment. It's really easy to back yourself into a corner where the easiest decision at any given point is to do nothing.