Webinar: Starting a Data Team? Tips to Nail It.

Webinar recap from Erathos: how to build a data team, the minimum stack, and common pitfalls — with CTO Luca Piermartiri.

Oct 7, 2024

Slide from Erathos's webinar on how to build a data function from scratch

In October 2024, Erathos hosted the webinar “How to start a data area? Tips to get it right”, led by our CTO, Luca Piermartiri.

In this post, we summarize what was presented and the main takeaways from the event.

We opened by introducing the concept of a data-driven culture within companies (covered in more detail in a separate post), and then looked at the behaviors within data teams that most often hold back business results and block the adoption of that culture. Below are the main dysfunctions we mapped:

Main dysfunctions of data teams

Data silos

If you use more than one system or platform to run different business processes, your information is isolated in each of them, and the result is data silos. This dysfunction causes major delays in analysis: reports have to be extracted manually from each system and consolidated somewhere, for example in an Excel file, before anyone can connect the information and draw insights from it. Quite a lot of work, right?

Manual processes

To escape the complexity created by data silos, someone on the IT team often finds their own way to unify these manual processes, for example one large Excel file, or a Python script that runs only on that person's computer, to “solve” the problem. This does not help the company, since it spends too much time on work that adds no value, and it also creates dependency on one specific professional, because that person is the only one who truly knows what happens to the data.

Lack of trust

With data silos and manual processes, among other aggravating factors, trust issues inevitably arise, both around the information presented and between areas of the company. Data that is often wrong because of integration problems, or processes that only patch a challenge temporarily, create uncertainty and hinder decision-making. It becomes very hard to deliver quality information within the deadlines that matter for taking action.

Analytical monopolization

Another very common problem is analytical monopolization by the data team, which creates a bottleneck and leaves other areas dependent on it for any analysis. Each area should have enough basic analytical capability to handle simple tasks on its own, such as calculating the sales team's commission, without relying on IT.

“IT-ization of processes”

The dependency other areas develop on the IT team overloads the data area, as multiple tickets are opened for demands both simple and complex. Here we return to the same point: the team cannot deliver everything (much less with the desired quality), and whatever gets prioritized, either data project development or the progress of the business areas ends up suffering.

Building data debt

Lack of process and project documentation is a common and very costly mistake. It creates dependency on specific people to understand what was actually done, and it forces the team to rely on memory to work out at what step something went wrong, for example when the same metric is calculated in different analyses and shows different results. Data quality is strongly influenced by good documentation practices and, unfortunately, by the time this is realized it may be too late for the project.
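To make the "same metric, different results" problem concrete, here is a minimal sketch in Python of one way to avoid it: the metric is defined and documented once, and every analysis reuses that definition. The `Order` record, the `net_revenue` function, and its business rule (only paid orders count) are hypothetical and were not part of the webinar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical order record used only for this illustration."""
    order_id: str
    amount: float
    status: str  # e.g. "paid", "refunded", "cancelled"


def net_revenue(orders):
    """Single documented definition of the metric.

    Assumed rule for this sketch: net revenue is the sum of `amount`
    over orders whose status is "paid". Refunded and cancelled orders
    are excluded.
    """
    return sum(o.amount for o in orders if o.status == "paid")


orders = [
    Order("A1", 100.0, "paid"),
    Order("A2", 40.0, "refunded"),
    Order("A3", 60.0, "paid"),
]

# Two different reports call the same function instead of each
# re-implementing the rule, so they cannot drift apart.
sales_report_total = net_revenue(orders)
finance_report_total = net_revenue(orders)
```

The docstring doubles as the documentation: if the rule changes, it changes in one place, and every report that depends on it changes with it.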

Ticket factory

Tickets opened by different areas, whether shallow tasks that are relatively easy to complete or complex ones that take more time, end up consuming a large share of the data team's hours. The goal stops being to generate value and becomes making sure things work as they should and that the information is correct for consumption (in other words, putting out fires). Building quick fixes to meet urgent data demands is a very common mistake: it avoids addressing the root cause of the problem, which often lies in the initial infrastructure.

Difficulty measuring impact

These dysfunctions also make it hard to measure the data team's impact, since a large share of working hours goes to keeping the existing solution running rather than to value-generating projects. The team starts to be seen as a cost center, because the value of the analyses it delivers is hard to measure and it has little capacity for long-term projects: short-term demands always end up being prioritized.

Three Analysis Levels: Strategic, Hands-on, and Process

Strategic view

When starting a data area from scratch, a strategic perspective raises some highly relevant questions. The first is: what is the role of this data area, what activities will it need to perform, and how will this role change over time?

The answer varies from company to company, and the data area's activities depend on which activities are handed over and taught to other areas. The tools and processes to be built must be clearly defined, and agility, simplicity, and continuous improvement should be core values of the data area.

Hands-on view

You only need four tools in your data area:

One to extract/collect information;
One to store information in an analytical data environment;
One to transform this data and generate value for the business;
One to visualize it for analysis and decision-making.
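The four roles above can be sketched as a tiny end-to-end pipeline. This is a minimal illustration under stated assumptions, not a recommended stack: the sample orders are hard-coded, an in-memory SQLite database stands in for the analytical data environment, a SQL aggregation stands in for the transformation tool, and a text bar chart stands in for the visualization tool. All function and table names are hypothetical.

```python
import sqlite3


def extract():
    """Stage 1, extract: collect raw records.

    Hard-coded here; a real extractor would read from source systems.
    """
    return [
        {"order_id": "A1", "region": "south", "amount": 100.0},
        {"order_id": "A2", "region": "north", "amount": 40.0},
        {"order_id": "A3", "region": "south", "amount": 60.0},
    ]


def load(conn, rows):
    """Stage 2, store: land the raw records in the analytical store."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :region, :amount)", rows
    )


def transform(conn):
    """Stage 3, transform: derive a business-facing table from raw data."""
    conn.execute(
        """
        CREATE TABLE revenue_by_region AS
        SELECT region, SUM(amount) AS revenue
        FROM raw_orders
        GROUP BY region
        """
    )


def visualize(conn):
    """Stage 4, visualize: print one text bar per region (one '#' per 20 units)."""
    rows = conn.execute(
        "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    ).fetchall()
    for region, revenue in rows:
        print(f"{region:<6} {'#' * int(revenue // 20)} {revenue:.0f}")
    return rows


conn = sqlite3.connect(":memory:")
load(conn, extract())
transform(conn)
report = visualize(conn)
```

Each stage only talks to the next one through the store, which is what lets any single tool be swapped without rewriting the others.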

Process view

To implement all of this, it is necessary to:

Solve the cold start: begin the groundwork and deliver analyses that improve how each area operates.
Get clarity on where effort is going: ideally, track the time and effort dedicated to each business area, either through a dedicated tool or through an Excel spreadsheet.
Move closer to generating value for the company: once analyses have been delivered to the different areas, it makes sense to reduce the budget (time and effort) allocated to them so that it covers only maintenance, and to define one larger quarterly project plus a primary metric that tracks the value generated by the data team (for example, number of sales).
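As a rough illustration of the effort tracking described above, the sketch below totals logged hours per business area and computes how much of the team's time goes to each kind of work. The log format, the area names, and the work categories are assumptions made for this example; a spreadsheet with the same three columns would serve equally well.

```python
from collections import defaultdict

# Hypothetical time log: (business area, kind of work, hours).
time_log = [
    ("sales", "maintenance", 3.0),
    ("sales", "new_analysis", 5.0),
    ("finance", "maintenance", 2.0),
    ("data_team", "quarterly_project", 10.0),
]


def hours_by_area(log):
    """Total hours the data team spent on each business area."""
    totals = defaultdict(float)
    for area, _kind, hours in log:
        totals[area] += hours
    return dict(totals)


def share_by_kind(log):
    """Fraction of all logged hours spent on each kind of work."""
    totals = defaultdict(float)
    for _area, kind, hours in log:
        totals[kind] += hours
    grand_total = sum(totals.values())
    return {kind: hours / grand_total for kind, hours in totals.items()}


area_hours = hours_by_area(time_log)
kind_share = share_by_kind(time_log)

for area, hours in sorted(area_hours.items()):
    print(f"{area:<10} {hours:5.1f} h")
for kind, share in sorted(kind_share.items()):
    print(f"{kind:<18} {share:.0%}")
```

Watching the maintenance share fall while the quarterly project share rises is one simple way to see whether the team is moving from firefighting toward value generation.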

Every minute spent on data problems is one less minute spent on business problems. In the early days of a data team, this, more than anything, determines whether it succeeds.

How we do it at Erathos

Erathos as the data extraction solution
Data loaded into BigQuery, with transformations done in dbt
Looker Studio as the visualization tool
We don’t build anything that isn’t necessary
Weekly meetings keep clear, well-defined metrics tracked as part of our routine