Are analytics engineers software engineers?
How to scale data-as-a-product and avoid the trap of ad-hoc analyses
Before becoming a software engineer, I was an analytics engineer. So I enjoyed Tristan Handy’s latest entry in the Analytics Engineering Roundup.
Tristan’s post is a follow-up to Emily Thompson’s excellent article about scaling your data team. He focuses on one major reason data teams have trouble scaling: the cost of maintenance. Specifically, he looks at how software engineering teams approach the problem.
Software teams often spend too much time on new features and not enough on maintenance. This is a common mistake I see software teams make, especially in the startup world.
But the most common failure of analytics engineering teams is slightly different. Too often, they put all their effort into what Tristan calls data-as-a-service. Data teams constantly get pings to “do a quick pull for me” and cannot spend as much time as they’d like on everything else.
What can we do about this?
“Data as a product” gets us most of the way there. It is the idea that your data team should not be a service center for one-off requests. Instead, you should use product management techniques to build data products. These data products solve the needs of your customers (internal business users).
Yet, data-as-a-product faces a few challenges (aside from the ones Emily points out):
Sometimes you need to do analyses, not build a product. Some decisions need in-depth statistics to get right. Even if they don’t, business stakeholders often don’t have the time to dive into data.[1]
At smaller companies, data-as-a-product often just means telling analytics engineers to have a “product mindset.” While this can be useful, asking analytics engineers to both set the product vision and execute on it is a recipe for burnout. Neither will be done well.[2]
Analytics engineers too often see data models, BI tools, and dashboards as the only outputs we can build. We build based on the tools available to us, not users' needs.
These challenges suggest some areas where we can push data-as-a-product forward. Software engineers aren’t expected to be PMs, nor are they the primary users of the products they build. At the same time, they don’t limit themselves to working with only a few tools.
Analytics engineers as software engineers
To make data-as-a-product succeed, we need to change how we view analytics engineers. They aren't “full-stack data analysts.” Nor does their job begin and end with data modeling. Instead, we should see them as software engineers who build internal data products. And we need to structure teams the right way to make that possible.
In my ideal data team structure, analytics engineering is set up like a software team. The team builds products for data analysts and business users:
Business users are responsible for (and have the tools to do) “quick pulls.” Meanwhile, analysts (aligned to business units) do in-depth analyses.
Back-end analytics engineers maintain the warehouse and build data models and APIs to expose the data.[3]
Front-end analytics engineers build user-facing products on top of the data. That may include a BI tool and dashboards. But it could also mean (for example) building data into tools like Salesforce via Reverse ETL. Or creating internal Hex (or even React) apps for stakeholders.
A data PM sets the product vision and roadmap. They bring the same level of focus on customer needs that a PM would bring to a software engineering team.[4]
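To make the back-end and front-end split above a little more concrete, here is a minimal sketch in Python. It uses the standard-library `sqlite3` module as a stand-in for a real warehouse, and every name in it (`orders`, `revenue_by_region`, `build_warehouse`, `revenue_report`) is hypothetical, invented only for illustration. The idea it tries to show: the back end owns the raw tables and publishes a stable, documented interface (here a SQL view), and the front end reads only from that interface.

```python
from __future__ import annotations

import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Back end: load raw data and publish a documented view (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real warehouse
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 100.0), ("west", 250.0), ("east", 50.0)],
    )
    # The view is the stable interface; the raw table can change behind it.
    conn.execute(
        """
        CREATE VIEW revenue_by_region AS
        SELECT region, SUM(amount) AS revenue
        FROM orders
        GROUP BY region
        """
    )
    return conn


def revenue_report(conn: sqlite3.Connection) -> dict[str, float]:
    """Front end: read only from the published view, never from raw tables."""
    rows = conn.execute(
        "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    ).fetchall()
    return {region: revenue for region, revenue in rows}


if __name__ == "__main__":
    print(revenue_report(build_warehouse()))  # {'east': 150.0, 'west': 250.0}
```

In a real team the view might be a dbt model and the front end a dashboard, a Reverse ETL sync, or an internal app. The boundary between the two matters more than the specific tools.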
But achieving this vision is hard.
Why? For one, it requires investing in many roles whose responsibilities we usually throw on one person or the entire data team.
Often that’s done to save costs, especially at smaller startups and nonprofits. That is fair enough. As Emily says, “data teams are always under-resourced.” And I am not opposed to having “full stack analytics engineers.” Many software engineers work across the front end and back end successfully! But I find front-end vs. back-end is a useful way to think about analytics engineering even if it’s just one person doing the work.[5]
The second problem is that business users often don’t have the tools, knowledge, or confidence to do “quick pulls” themselves. Without those, simple data pulls will just get pushed onto analysts’ plates.
However, this isn’t just about roles. It’s also about responsibilities, and analytics engineers will have to take on some new ones to make self-service possible:
Focusing on building scalable, user-friendly data applications, not just data models. Today’s BI tools are frankly not good enough to be the only front end we expose to users.
Using the right tool for the job (whether an off-the-shelf tool or a programming language). Analytics engineers should be comfortable working with multiple languages and even managing infrastructure.
As Tristan mentions, saying “no” to asks that don’t add value or that create heavy maintenance burdens.
Often, it seems like we define an analytics engineer as “someone who uses dbt to build data models.” But a better definition is “someone who uses code and tools (often dbt) to build internal data products.” Put that way, it is clear that analytics engineering isn't just similar to software engineering. It is software engineering.
Analytics engineers are a subset of software engineers, just like web or mobile developers. We will all benefit when analytics engineers start to see themselves that way.
Thanks to David Jayatillake for his generous feedback and advice on this article.
[1] This is why I’m building Hermes, an open-source tool that lets business users ask data questions in plain English.
[2] Note that we almost never ask the same of a software engineer.
[3] This doesn’t have to be a REST API. It could be well-documented SQL interfaces.
[4] PMs will intuitively get Emily’s great point that “anyone who shows interest in data, no matter what the motivation, is a potential customer” and be able to help build the right products for them.
[5] The data PM role, however, should always be separate.