A report that should have been ready Monday morning arrives Thursday afternoon. The CEO needs to make a decision about market expansion, but the data is still being "compiled" by the IT team. Meanwhile, the opportunity passes. This scenario repeats itself every week in dozens of Brazilian companies with revenues above R$100 million — and most leaders still believe the problem is a lack of analysts, when in reality it is a lack of architecture.

Over more than 20 years working with technology at companies like IBM and AWS, and serving organizations such as BTG, B3, XP and Bradesco, I have identified a consistent pattern: data slowness is not a people problem, it is a structural problem. And the solution does not lie in hiring more engineers — it lies in rethinking the data strategy from scratch.

This article explains why your company is still experiencing this bottleneck, what a modern architecture based on a data lakehouse is, how self-service BI transforms the autonomy of business teams, and what the practical path looks like to move from analytical chaos toward real-time decisions.

The real diagnosis: why reports take so long

The first thing I do when I begin a diagnostic engagement with a new client is ask: "How long does it take for a business manager to get a number they themselves requested?" The average answer, in the Brazilian market, is between 3 and 7 business days. In companies with heavier legacy systems, it can take up to 2 weeks.

The reason is not laziness or incompetence. It is structure. Most companies operate with a fragmented data architecture, built up over years of one-off decisions: a data warehouse here, a transactional database there, a CRM system that does not talk to the ERP, marketing data scattered across spreadsheets on Google Drive. The result is an ecosystem where no single tool sees everything, and every meaningful analysis requires manual integration work.

There is also a critical human bottleneck: every data request goes through the IT or data engineering team. The business analyst wants to understand churn by customer segment? Open a ticket. The CFO needs a revenue projection by channel? Another ticket. This model does not scale. And as long as it persists, the company is making decisions with yesterday's data, or last week's.

What has changed in data architecture over the last 5 years

For a long time, the dominant standard was the data warehouse: a centralized, structured repository optimized for analytical queries. It was expensive, rigid and required data to arrive already transformed and organized. Then came data lakes, which promised to store everything in raw format at a lower cost. The problem is that a data lake without governance quickly degenerates into a "data swamp": a repository that no one can use with confidence.

The concept of the data lakehouse emerged precisely to resolve this impasse. It combines the best of both worlds: the flexibility and low storage cost of a data lake with the governance, quality and performance capabilities of a data warehouse. Platforms such as Databricks, open table formats such as Delta Lake and Apache Iceberg, and query engines such as Amazon Redshift Spectrum have popularized this approach in recent years.

In practice, a data lakehouse allows you to store raw, semi-structured and structured data in the same repository, apply progressive transformations (bronze, silver, gold — the medallion architecture), and deliver reliable data to both data scientists and BI tools — all within the same infrastructure. This unification drastically reduces data duplication and maintenance costs.
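To make the medallion flow concrete, here is a minimal sketch in PySpark with Delta Lake. The paths, column names and cleaning rules are illustrative assumptions, not a real schema; the point is the progression from bronze to silver to gold inside a single repository.

    # Minimal medallion-architecture sketch with PySpark and Delta Lake.
    # Paths, columns and rules are illustrative assumptions, not a real schema.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

    # Bronze: raw data, kept exactly as it was ingested (immutable, auditable).
    bronze = spark.read.format("delta").load("s3://lake/bronze/orders")

    # Silver: cleaned, deduplicated and standardized.
    silver = (
        bronze
        .dropDuplicates(["order_id"])                     # drop reprocessed events
        .filter(F.col("order_total") > 0)                 # discard invalid rows
        .withColumn("order_date", F.to_date("order_ts"))  # standardize types
    )
    silver.write.format("delta").mode("overwrite").save("s3://lake/silver/orders")

    # Gold: modeled for analytical consumption (daily net revenue by channel).
    gold = (
        silver.groupBy("order_date", "channel")
        .agg(F.sum("order_total").alias("net_revenue"))
    )
    gold.write.format("delta").mode("overwrite").save("s3://lake/gold/daily_revenue")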

A financial sector company I worked with reduced pipeline processing time by 68% after migrating from a fragmented environment (separate Oracle + S3 + Redshift) to a unified lakehouse architecture on AWS. Infrastructure costs fell 40% in the first year.

Self-service BI: returning autonomy to the business

Even with a modern data architecture, a company can still face a critical bottleneck if all analytical access depends on an engineer writing SQL. This is where self-service BI comes in — the ability for anyone with business knowledge to explore data, build visualizations and answer questions without depending on IT.

Tools such as Power BI, Tableau, Looker and Amazon QuickSight have evolved significantly in recent years. But self-service BI is not about the tool; it is about the semantic model. For a manager to be able to drag and drop dimensions without knowing SQL, someone must first have built a consistent semantic layer: each metric defined once and in only one way, clear data hierarchies, business rules encapsulated. Without this layer, self-service becomes chaos, with each department calculating the same KPI differently.
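To make the idea concrete, the sketch below shows one way to centralize metric definitions in code. It is not tied to any specific BI tool, and the metric names and formulas are illustrative assumptions; the point is that "net revenue" or "churn rate" is defined once and every consumer reuses that single definition.

    # Illustrative metric registry (not the API of any specific semantic-layer tool).
    # Each metric is defined once; dashboards and notebooks reuse the same definition.
    from dataclasses import dataclass
    from typing import Callable

    import pandas as pd

    @dataclass(frozen=True)
    class Metric:
        name: str
        description: str
        compute: Callable[[pd.DataFrame], float]

    METRICS = {
        "net_revenue": Metric(
            name="net_revenue",
            description="Gross revenue minus refunds, in BRL",
            compute=lambda df: float((df["gross_revenue"] - df["refunds"]).sum()),
        ),
        "churn_rate": Metric(
            name="churn_rate",
            description="Churned customers divided by customers active at period start",
            compute=lambda df: float(df["churned"].sum() / df["active_at_start"].sum()),
        ),
    }

    def evaluate(metric_name: str, df: pd.DataFrame) -> float:
        # Marketing and finance call the same function, so the number is the same.
        return METRICS[metric_name].compute(df)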

The pillars of a well-implemented self-service BI include:

  • Centralized semantic model: A single definition for critical metrics such as revenue, churn, CAC and LTV — preventing marketing and finance from presenting different numbers for the same indicator.
  • Granular access control: Each user sees only the data they have permission to see, without limiting the exploration experience.
  • Data catalog: A navigable inventory where anyone in the company can discover what data exists, what it means and where it comes from.
  • Reliable and monitored pipelines: Self-service is worthless if data arrives late or with inconsistencies. Pipeline reliability is the foundation of everything.
  • Training and analytical culture: Technology enables, but adoption depends on people who know how to ask the right questions.

When these pillars are in place, the impact is immediate. Companies that have correctly implemented self-service BI report a 60% to 80% reduction in data request tickets to IT, and a significant increase in the frequency of dashboard use by business leaders — which is a direct indicator that decisions are being made with more data.

Modern data architecture in practice: from raw data to decision

Let me describe how a modern, functional data architecture is organized, without unnecessary abstraction.

At the ingestion layer, data arrives from multiple sources: transactional systems (ERP, CRM, core banking), external APIs, real-time events (clickstream, IoT, financial transactions), batch files and third-party data. Tools such as AWS Glue, Apache Kafka, Airbyte or Fivetran handle collecting this data and delivering it to the central repository.
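For the streaming side of ingestion, a hedged sketch with Spark Structured Streaming is shown below: events are read from a Kafka topic and appended, untouched, to a bronze Delta table. Broker address, topic name and paths are assumptions for illustration; batch sources (Airbyte, Fivetran, Glue jobs) would land in the same bronze area.

    # Illustrative streaming ingestion: Kafka events appended raw to the bronze layer.
    # Broker, topic and paths are assumptions; requires the spark-sql-kafka and
    # delta-spark packages to be available on the cluster.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ingest-clickstream").getOrCreate()

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "clickstream")
        .load()
        .selectExpr("CAST(value AS STRING) AS payload", "timestamp AS ingested_at")
    )

    # Bronze keeps the payload exactly as it arrived; parsing happens later, in silver.
    (
        events.writeStream.format("delta")
        .option("checkpointLocation", "s3://lake/_checkpoints/clickstream")
        .outputMode("append")
        .start("s3://lake/bronze/clickstream")
    )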

In storage, the data lakehouse stores everything in layers. The bronze layer holds raw data exactly as it arrived — untransformed, immutable, auditable. The silver layer applies cleaning, deduplication and standardization. The gold layer contains data ready for analytical consumption, modeled according to business needs. Technologies such as Delta Lake or Apache Iceberg guarantee ACID transactions even at petabyte scale.
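Those ACID guarantees are what make routine operations, such as the silver layer's deduplication, safe to rerun. Below is a minimal sketch with the delta-spark Python API, assuming a hypothetical order_id business key and table paths.

    # Illustrative incremental upsert from bronze into silver using Delta Lake MERGE.
    # Table paths and the order_id key are assumptions for the example.
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("silver-upsert").getOrCreate()

    updates = (
        spark.read.format("delta").load("s3://lake/bronze/orders")
        .dropDuplicates(["order_id"])   # one row per business key
    )

    silver = DeltaTable.forPath(spark, "s3://lake/silver/orders")

    # MERGE is atomic: concurrent readers never see a half-applied batch.
    (
        silver.alias("t")
        .merge(updates.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )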

At the transformation layer, frameworks such as dbt (data build tool) have gained enormous adoption by allowing data transformations to be written as versioned, tested and documented SQL code — treating data pipelines with the same software engineering discipline.
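dbt models are usually plain SQL files; to keep the examples in a single language, the sketch below expresses the same idea as a dbt Python model, which some adapters (such as Databricks) support. Model and column names are hypothetical and mirror the earlier sketches.

    # models/gold/daily_revenue_by_channel.py (hypothetical dbt Python model)
    # dbt transformations are normally SQL; Python models are supported on some
    # adapters, such as Databricks, and follow the same ref/test/document workflow.
    def model(dbt, session):
        dbt.config(materialized="table")

        # dbt.ref() resolves the upstream silver model and records the dependency,
        # so the transformation is versioned, tested and documented like software.
        orders = dbt.ref("silver_orders")

        return (
            orders.groupBy("order_date", "channel")
            .agg({"order_total": "sum"})
            .withColumnRenamed("sum(order_total)", "net_revenue")
        )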

At the consumption layer, BI tools connect directly to the gold layer of the lakehouse, or to an intermediate semantic layer (such as Cube.dev or dbt Semantic Layer), ensuring that metrics like "net revenue" mean exactly the same thing across all of the company's reports.

This flow, when well implemented, reduces the time for new data to become available from days to hours — and in streaming cases, to minutes or seconds.

What prevents Brazilian companies from moving forward

In the projects I have led in the Brazilian market, I identified four recurring barriers that block data modernization:

  • Organizational silos: Marketing, finance and operations data live in separate systems, managed by teams that do not communicate. Data modernization requires a political decision before it becomes a technical one.
  • Fear of migration: IT leaders are afraid to move critical data to the cloud or to a new architecture. This fear is legitimate, but manageable with the right approach — incremental migration with parallel validation of results.
  • Absence of data ownership: Without a clear owner for each data domain (the Data Mesh concept), no one guarantees quality, and responsibility becomes diluted.
  • Fragmented investment: Companies buy BI tools without building the data foundation that supports them. It is like buying a Formula 1 car and driving it on a dirt road.

The solution to each of these barriers exists and has already been successfully applied in Brazilian companies across all sectors. What is missing, in most cases, is not technology — it is a cohesive strategy that integrates people, processes and platform.

How to start a modern data strategy without reinventing everything at once

The good news is that you do not need to throw away what you already have to get started. Data modernization can be incremental and oriented toward business value. The first step is always the diagnosis: understanding where the critical data lives, what the most urgent analytical use cases are, and what the current maturity level of the team is.

From there, a pragmatic approach follows three phases. In the first phase, you consolidate the most critical data sources into a unified cloud repository, establish the first reliable pipelines, and deliver one or two high-impact dashboards to leadership — building credibility for the program. In the second phase, you expand data coverage, implement the semantic model, and begin enabling self-service for the most mature business teams. In the third phase, you add advanced capabilities: real-time data, predictive models, integration with generative AI for natural language analysis.

This path can take anywhere from 6 to 18 months, depending on the complexity of the environment. But the first tangible results, reports that previously took days now delivered in hours, appear as early as the first phase, and that is enough to sustain the investment politically within the organization.

If you are a CEO, CTO or founder and recognized your company in any of the scenarios in this article, the problem will not solve itself. Every month that passes with slow reports and fragmented data is a month of decisions falling short of their potential — and of competitive advantage handed on a silver platter to those who have already made this move. The technical path exists, is mature, and has already been traveled by companies like the ones I serve. What is missing is taking the first step with the right strategy.

If you would like to understand how this model applies to the specific reality of your company, get in touch at abraao.tech. I will conduct an initial analysis of your data environment and provide an honest diagnosis of where your biggest bottlenecks are — and how to resolve them.