Patreon Engineering attempted to build for the future by creating DB schema and code that supported features that didn't exist yet. While the schema formed the basis of a successful product, our attempt to eliminate a source of tech debt added friction and confusion to development and didn’t make implementation of those future features any easier.
Patreon has two basic units of identity: a User represents a person (someone who logs in), and a Campaign represents all the information about a creator's creative business: their creator page, billing, tiers, benefits, etc. Patreon Engineering made a forward-looking decision when we were building our schema: even though the product we were building would only allow a single user per campaign and a single campaign per user, we'd design for a future where a campaign could have many users, and users could have many campaigns. So: we have the campaigns_users join table in our database. In early 2019, we released the Team Accounts feature that allows multiple users per campaign, but we still only allow a single campaign per user.
Even though our lower-level backend layers support many campaigns per user, much of the app is explicitly built to hide that: the User API object, for example, has a single "campaign" relationship, not an array of "campaigns". When we handle a request, we always assume there's either a single campaign or no campaign at all for the logged-in user. For example, when a user makes a new post from our web client, the client doesn't pass up a
campaign_id, we just assume that we're posting to that user's (only) campaign.
In our controller-layer code, we introduced a series of methods with the suffix
get_campaign_id_by_user_id_hack. The naming served a twofold purpose of
- reminding devs that they should be thinking about a future world where there are multiple campaigns per user and
- when we make that world a reality and implement multi-campaign users, making it easy to track down all the code we need to change by just grepping for
In theory this could allow us to avoid tech debt: most coders will have already built the systems that support multiple campaigns as they're coding, because the
_hack suffix reminded them that they should handle the multiple-campaigns case gracefully. There are only a few places where anyone will have called the
_hack method, and when it comes time to implement multi-campaign users, we mess with the API a little and voila, everything just works.
In practice this is not the case.
Every time a dev comes across a place where they'd like to look up a user's campaign, they face a choice: do they call the
_hack method or do they handle the multiple-campaigns case? Calling the
_hack method is scary for devs new to our system (uh-oh, a hack!?) and shame-inducing for experienced devs (I am increasing the entropy of our system 😔). Further, handling the multiple-campaigns case isn't free:
- There are product decisions to be made (if they have two campaigns, how does this API route respond?)
- There is extra code to be written (loop through campaigns, apply changes to all campaigns, look up more campaigns, etc.)
- This extra coding and thinking time has a real cost
- Most importantly: that extra code will never be exercised in production
It's those last points that really drove home to me the idea that we should be coding to our system in its current state and not to a vision of the future. Let's imagine that no one ever used the
_hack methods and everyone always spent some time thinking through how their methods, API endpoints, and route handlers worked when there are two or more campaigns per user. They wrote tests, the tests are good, the tests occasionally catch bugs where a dev didn't properly account for multiple campaigns.
Now product decides it's time to roll out multi-campaign users. Sure, we have to make some significant changes to our UI, but the backend is already all set to go, right? We can just flip the switch.
That was a joke.
The odds that everything will just work in this scenario are very small.
There's lots of code out there that worked fine with multiple campaigns two years ago when it was written. That code's been hammered on, debugged, fixed, edited, and now it's stable. But no one's ever used it with two campaigns.
It has tests, so under some contrived scenarios it will work. If they're really good tests, it will work over all scenarios. Let's even put on our rose-colored QA glasses and imagine that all the places that we queried a user's campaigns were well tested and will just work. Then we can get our switch-flipping finger out and go, right?
Remember that the coder in each situation had to think through how the system should react every time they coded up a method that dealt with a user's campaign. Here's an actual example, this code was written in 2016 and 2017:
def block_user( blocker_id: int, blocked_id: int, block_reason_id: int=None, created_at: datetime=None, ) -> BlockUser: if blocker_id == blocked_id: raise ParameterInvalid('blocker', 'User cannot block self') # Various logic and side effects... campaign_ids = CampaignIdsFromUserId.get(blocker_id) for campaign_id in campaign_ids: do_internal_blocking_stuff( patron_id=blocked_id, campaign_id=campaign_id, ) # Cleanup...
The logic is that if User A blocks User B, that means User A is blocking User B on all the campaigns that A owns. Is that really the logic that we want? It's possible, but it's also reasonable that we really want to be able to block a user from a particular campaign, one at a time. That's something that our product owners and designers probably never thought about, because before they never had to. If we decide that we really do want to be able to block users from one campaign at a time, we need to change logic up and down the chain to let Campaign Z (that User A owns) block User B, separately for each campaign.
We didn't save any work by investing upfront in multi-campaign; instead we had slightly more logic to test (or fail to test: there are no automated tests for this particular function that test multiple campaigns), a little more friction at development time, and code that we must revisit and re-test anyway when we start actually designing interactions and APIs for multi-campaign users.
Does this mean we should never design for features that aren't yet implemented? Of course not. It was an excellent decision to put the user <-> campaign relation in
campaigns_users rather than, say,
campaigns.user_id. It was a choice that enabled us to implement Team Accounts much more quickly. However while the data-modeling choice was a good one, adding the
*_hack methods had a negative impact overall.
So next time visions of the ultimate end-state of your product dance in front of your editor, what should you do? If you have a good guess that you'll need a feature within a reasonable time horizon, go ahead and engineer for the future. What's a reasonable time horizon? That will depend entirely on your product and your roadmap, and how much will have to change if you guess wrong. For Patreon, our initial guess that we would be adding multi-user campaigns or multi-campaign users within a year of writing the
*_hack methods was off by two years, and we paid some small but real amount for that miscalculation. Remember that avoiding debt means upfront investment, and you want to make sure that that's effort well spent.