🚀
Building Agentic BI systems | dbt · MCP · Python · Claude API
eduardocornelsen/README.md
Data Science Cover

Eduardo Cornelsen's Applied Data Portfolio

   

Total Time Coded

 

🇺🇸 English Version

👋 About Me

Analytics Engineer & Data Analyst | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP

Analytics Engineer & Data Analyst with 10+ years across consultative ERP sales (Omie — Brazil's leading cloud ERP), business consulting, and analytics. I diagnose where SaaS, Ecommerce, and Fintech companies lose revenue and build the governed AI pipelines to capture it.

I've sat in the revenue meetings. I know what the VP of Sales needs by Monday morning — and I build the analytics to answer it before they ask.

🛠️ Tech Stack

Python · SQL · dbt · ELT Pipelines · BigQuery · Databricks · MCP Servers · LangChain · Claude/Gemini APIs · FastAPI · Looker Studio · Streamlit · React/NextJS

🧠 AI & Agentic BI

Building governed AI analytics systems where natural language queries return deterministic, dbt-governed insights in 15 seconds — no hallucinations, no metric drift.

📊 Business Stack

BI Architecture · Revenue Analytics · Marketing & Funnel Analytics · SaaS Unit Economics (CAC, LTV, MRR, Churn, ROAS) · Financial Modeling · ELT Architecture

🎓 TripleTen Data Science Resident (700h+) · Six Sigma Green Belt · Final-round candidate — Epic Games 2026


🛠️ Full Tech & Business Stack

🧠 AI & Agentic BI

MCP · Claude · LangChain · Generative AI · n8n · FastAPI

💻 Data Engineering & Analytics Engineering

Python · SQL · dbt · ELT Pipelines · BigQuery · Databricks · Snowflake

📊 BI & Visualization

Looker Studio · Tableau · Power BI · Streamlit

🏢 Business Stack

RevOps · Business Intelligence · Product Analytics


🚀 Featured Projects

| Category | Project | Stack |
|---|---|---|
| 🏆 AI / Analytics Eng | Full-Funnel AI Analytics Platform | dbt · MCP · BigQuery · XGBoost · Claude/Gemini |
| RevOps / AI | RevOps Lead Engine: AI-Powered B2B Command Center | Python · Streamlit · Plotly · XAI |
| Product Strategy / UXR | Epic Games Store: 2026 Ecosystem Intelligence Audit | Python · Streamlit · Random Forest · NLP |
| GenAI / BI | Conversational BI & Generative Analytics (Music Trends) | Streamlit · LLMs · LangChain |
| Auto / Market | Automotive Market Intelligence & AI Agent | Python · Gemini AI · Plotly |
| Sales / Strategy | Strategic Revenue & Churn Analysis (Telecom) | Python · Stats · Business Logic |
| Gaming / Strategy | Global Gaming Market Strategy | Python · Stats · EDA |
| Edu / SQL | PunkSQL: Mobile SQL Learning Platform | Next.js · SQLite/WASM · Vercel · Google OAuth |
| Infra / AI Chatbot | Portfolio Website + AI Chatbot Infrastructure | React · Gemini · Cloud Run · LangFuse · Docker |


🏆 Full-Funnel AI Marketing Analytics Platform

Natural language queries · dbt Semantic Layer · 7 MCP Servers · ML Lead Scoring · $0/month base cost

Full-Funnel AI Analytics — /marketing command live demo

Python · dbt · BigQuery · Claude · XGBoost · FastAPI · MLflow

Business Context: Companies run ads across Google, Meta, and organic channels. Marketing claims leads. Sales says they're low quality. The CEO asks: "Where should we spend next quarter?"

Answering this requires joining data from 5+ platforms, building attribution models, scoring leads, and making it all accessible to non-technical stakeholders. This project builds the production system — at $0/month base cost.

Architecture: Three Heads, One Spine

  • AI Layer — 7 MCP servers + Claude Desktop, OpenCode, Gemini CLI, Antigravity. Ask questions in plain English, get production-grade React dashboards in 15 seconds.
  • BI Layer — Looker Studio + Streamlit + Claude React artifacts. Same governed metrics across every output.
  • ML Layer — XGBoost trained on 93K rows + FastAPI /score endpoint + n8n auto-routing. Hot leads bypass the queue instantly.
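The routing step in the ML layer can be pictured roughly as follows. This is a minimal sketch, not the project's code: the `Lead` class, the `route` function, the queue names, and the 0.80 threshold are all assumptions made for illustration. In the real system the score comes from the FastAPI /score endpoint and the routing is handled by n8n.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the README does not state the threshold actually used.
HOT_THRESHOLD = 0.80


@dataclass
class Lead:
    lead_id: str
    score: float  # model probability in [0, 1], e.g. as returned by a scoring endpoint


def route(lead: Lead, hot_threshold: float = HOT_THRESHOLD) -> str:
    """Return the (hypothetical) queue name a lead is sent to, based on its score."""
    if not 0.0 <= lead.score <= 1.0:
        raise ValueError(f"score out of range: {lead.score}")
    # Hot leads skip the standard queue; everything else waits its turn.
    return "priority" if lead.score >= hot_threshold else "standard"


leads = [Lead("a1", 0.93), Lead("b2", 0.41), Lead("c3", 0.80)]
routed = {lead.lead_id: route(lead) for lead in leads}
print(routed)
```

Keeping the threshold as a parameter makes it easy to tune against sales capacity without retraining the model.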

Key Numbers:

  • 29 dbt models — full Medallion Architecture (Bronze → Silver → Gold)
  • 2.2M+ rows of real Olist ecommerce data + synthetic marketing data
  • 7 MCP servers — Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
  • 4 attribution models — First-Touch, Last-Touch, Linear, Time-Decay
  • 5 warehouses — BigQuery, DuckDB, Snowflake, Databricks, Supabase
  • 4 AI clients — Claude Desktop, OpenCode, Gemini CLI, Antigravity
  • $0/month base infrastructure cost
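To make the four attribution models concrete, here is a small sketch of how each one splits credit for a single conversion across its touchpoints. The function names, the 7-day half-life, and the example journey are assumptions for illustration; the project itself implements attribution in dbt models, which may differ in detail.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def first_touch(n: int) -> List[float]:
    """All credit to the first touchpoint (assumes n >= 1)."""
    return [1.0] + [0.0] * (n - 1)


def last_touch(n: int) -> List[float]:
    """All credit to the last touchpoint (assumes n >= 1)."""
    return [0.0] * (n - 1) + [1.0]


def linear(n: int) -> List[float]:
    """Equal credit to every touchpoint (assumes n >= 1)."""
    return [1.0 / n] * n


def time_decay(days_before_conversion: Sequence[float],
               half_life_days: float = 7.0) -> List[float]:
    """Credit halves for every `half_life_days` a touch sits before the conversion."""
    raw = [0.5 ** (d / half_life_days) for d in days_before_conversion]
    total = sum(raw)
    return [w / total for w in raw]


def credit(channels: Sequence[str], weights: Sequence[float],
           revenue: float) -> Dict[str, float]:
    """Distribute revenue over channels, summing repeated channels."""
    out: Dict[str, float] = defaultdict(float)
    for channel, weight in zip(channels, weights):
        out[channel] += weight * revenue
    return dict(out)


# Hypothetical journey: three touches at 14, 7, and 0 days before a 100.0 sale.
channels = ["google_ads", "meta_ads", "organic"]
days = [14, 7, 0]
for name, weights in [
    ("first_touch", first_touch(3)),
    ("last_touch", last_touch(3)),
    ("linear", linear(3)),
    ("time_decay", time_decay(days)),
]:
    print(name, credit(channels, weights, 100.0))
```

With a 7-day half-life, the time-decay weights for this journey work out to 1/7, 2/7, and 4/7, so the most recent touch earns the largest share.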

The Core Insight — Why Governance Matters: Most AI-to-SQL tools fail because they lack a source of truth. This project solves that with the dbt Semantic Layer (MetricFlow): define "ROAS" once in YAML, and every AI client, dashboard, and ML pipeline consumes the exact same definition — forever. No hallucinations. No metric drift.
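The "define once, consume everywhere" idea can be illustrated with a plain-Python stand-in. This is not MetricFlow and not the project's YAML: the `METRICS` registry, the `compute` helper, and the two consumer functions are hypothetical names, and ROAS is taken here as attributed revenue divided by ad spend.

```python
from typing import Callable, Dict, Mapping

Row = Mapping[str, float]


def roas(row: Row) -> float:
    """Return on ad spend: attributed revenue divided by ad spend."""
    if row["ad_spend"] == 0:
        raise ValueError("ad_spend is zero; ROAS is undefined")
    return row["revenue"] / row["ad_spend"]


# Single registry: the only place a metric formula lives.
METRICS: Dict[str, Callable[[Row], float]] = {"roas": roas}


def compute(metric: str, row: Row) -> float:
    """Look up a governed metric by name; unknown names fail loudly."""
    if metric not in METRICS:
        raise KeyError(f"unknown metric: {metric!r}")
    return METRICS[metric](row)


# Two hypothetical consumers that never re-implement the formula.
def dashboard_tile(row: Row) -> str:
    return f"ROAS: {compute('roas', row):.2f}x"


def chat_answer(row: Row) -> str:
    return f"Return on ad spend was {compute('roas', row):.2f}."


row = {"revenue": 12000.0, "ad_spend": 4000.0}
print(dashboard_tile(row))
print(chat_answer(row))
```

Because both consumers go through `compute`, a change to the formula lands in every output at once, and a request for an undefined metric raises an error instead of producing a guess.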

View Repo · Architecture Demo · Live Demo



🚀 RevOps Lead Engine: AI-Powered B2B Command Center

Full-cycle autonomous lead generation · Predictive scenario modeling · Post-sales retention analytics

RevOps Lead Engine Dashboard

Python · Streamlit · Plotly · Pydantic

Business Context: Most B2B sales teams burn 60–70% of SDR time on manual prospecting. This project builds a fully autonomous RevOps platform — from ICP-driven lead discovery to AI-powered scoring, predictive revenue modeling, and post-sales retention tracking.

Key Features:

  • AI RevOps Copilot: Natural-language chat for pipeline risk analysis and quota pacing
  • Revenue Scenario Modeler: 4 levers (Volume, Win Rate, ACV, Cycle Time) → 90-day S-curve revenue projections
  • Post-Sales NDR Module: Net Dollar Retention, Account Health scoring, ARR Waterfall charts
  • Explainable AI (XAI): Every lead score includes transparent reasoning — no black-box models
  • 10 Integrated Modules: Revenue Dashboard, Lead Intelligence, Sales Navigator, Pipeline Analytics, and more
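For readers less familiar with Net Dollar Retention, the sketch below uses the common definition: starting ARR plus expansion, minus contraction and churn, divided by starting ARR. The function names and the example figures are made up for illustration and are not taken from the project.

```python
from typing import List, Tuple


def net_dollar_retention(starting_arr: float, expansion: float,
                         contraction: float, churned: float) -> float:
    """NDR for one cohort over one period, as a ratio (1.05 means 105%)."""
    if starting_arr <= 0:
        raise ValueError("starting_arr must be positive")
    return (starting_arr + expansion - contraction - churned) / starting_arr


def arr_waterfall(starting_arr: float, expansion: float,
                  contraction: float, churned: float) -> List[Tuple[str, float]]:
    """Ordered steps for an ARR waterfall chart; signed deltas between the totals."""
    ending = starting_arr + expansion - contraction - churned
    return [
        ("Starting ARR", starting_arr),
        ("Expansion", expansion),
        ("Contraction", -contraction),
        ("Churn", -churned),
        ("Ending ARR", ending),
    ]


# Made-up example figures.
ndr = net_dollar_retention(1_000_000, 150_000, 30_000, 70_000)
print(f"NDR: {ndr:.1%}")
for label, value in arr_waterfall(1_000_000, 150_000, 30_000, 70_000):
    print(f"{label:<14}{value:>12,.0f}")
```

An NDR above 100% means the existing customer base grew on its own, before counting any new logos.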

Live Demo · View Repo



⚡ Epic Games Store Ecosystem Intelligence: Strategic Audit (2026)

Epic Games Store Strategic Audit Cover

Python · Streamlit · Scikit-Learn · GenAI

Business Context: A strategic audit of the Epic Games Store's transition from a digital storefront to an "Ecosystem of Intelligence." Random Forest regression (R²=0.39) and K-Means clustering are used to decode the "UX Alpha": the model explains about 39% of the variance in user ratings, which suggests that roughly 60% of player satisfaction is driven by intangible factors beyond price and specs.

Key Findings:

  • The "Hardware Wall": High system requirements correlate negatively, though weakly (-0.133), with user ratings, marking a potential churn zone
  • Behavioral Segmentation: 4 Product Personas mapping the "Premium Friction" risk in high-cost Indie titles
  • Final-round candidate for Epic Games Data Analyst role (2026)

View UXR Slides · View Dashboard · View Repo · View Notebook



>_ PunkSQL — Mobile SQL Learning Platform

Duolingo meets LeetCode · 80 challenges · Real in-browser SQL execution · Cyberpunk CLI aesthetic

PunkSQL — Mobile SQL Learning Platform Demo

Next.js · SQLite · Supabase · Vercel

What it is: A mobile-first SQL learning platform with a cyberpunk terminal aesthetic. Write real SQL queries that execute in your browser using SQLite compiled to WebAssembly. No backend, no signup required to play.

Key Features:

  • 80 SQL challenges across 8 modules — SELECT → CTEs, sequential unlocking
  • Real SQLite execution via WASM — queries validated against expected output
  • Gamified — 20 levels, XP system, 10 achievements, sound effects
  • Google OAuth — persistent progress across sessions
  • Bilingual — full EN/PT-BR support
  • $0/month infrastructure — fully client-side
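Validating a learner's query against an expected output can be sketched with Python's built-in `sqlite3` module as a stand-in for the in-browser SQLite/WASM engine. The `players` table, the `check_answer` helper, and the choice to ignore row order by default are assumptions for illustration; PunkSQL's actual checker may work differently.

```python
import sqlite3


def check_answer(conn: sqlite3.Connection, user_sql: str,
                 expected_sql: str, ordered: bool = False) -> bool:
    """Compare the rows returned by the learner's query with a reference query."""
    try:
        got = conn.execute(user_sql).fetchall()
    except sqlite3.Error:
        # Syntax errors, unknown columns, etc. count as a wrong answer.
        return False
    want = conn.execute(expected_sql).fetchall()
    if ordered:
        return got == want
    # Order-insensitive comparison; repr keys keep None values sortable.
    return sorted(got, key=repr) == sorted(want, key=repr)


# Hypothetical challenge data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE players (id INTEGER, name TEXT, xp INTEGER);
    INSERT INTO players VALUES (1, 'ada', 120), (2, 'lin', 80), (3, 'kai', 200);
    """
)

reference = "SELECT name FROM players WHERE xp > 100"
print(check_answer(conn, "SELECT name FROM players WHERE xp >= 120", reference))
print(check_answer(conn, "SELECT name FROM players", reference))
print(check_answer(conn, "SELEC name FROM players", reference))
```

Comparing result sets rather than query text lets learners reach the same answer with different SQL, which suits challenges that have more than one valid solution.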

Play Now · View Repo



⚙️ Portfolio Website + AI Chatbot Infrastructure

Streaming AI chatbot · Google Cloud Run · GitHub Actions CI/CD · LangFuse LLMOps · 4-layer security · $0/month

Eduardo Cornelsen Portfolio — Live Site

React · TypeScript · Gemini · Docker · Cloud Run · GitHub Actions · LangFuse

What it is: The production infrastructure powering this portfolio — not a side project, but the live system you're looking at right now. Built to be a proof of work in itself: a streaming AI chatbot, containerized via multi-stage Docker, deployed on Google Cloud Run through a zero-touch GitHub Actions pipeline, with every conversation traced in LangFuse for cost, latency, and security analysis.

Architecture:

GitHub push → Actions: Docker build → Artifact Registry → Cloud Run deploy
Browser → React 19 + Vite (SSE streaming) → Express proxy → Gemini 2.5 Flash
                                                           → LangFuse (trace every token)

Key Features:

  • Streaming AI chatbot — Server-Sent Events deliver token-by-token responses; no waiting for the full reply
  • 4-layer security model — client sanitization, IP rate-limiting, canary token injection, intent classification (jailbreak detection)
  • LLMOps with LangFuse: every conversation is traced with cost ($0.15 per 1M input tokens), latency, and a safety flag
  • Zero-touch CI/CD — push to main, site is live in ~3 minutes via OIDC-authenticated GitHub Actions
  • $0/month at rest — Cloud Run scales to zero when idle
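Two of the four security layers, IP rate-limiting and canary token detection, can be sketched in a few lines. The real proxy is an Express server, so this Python version only illustrates the idea: the class name, the limits, and the canary string are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per IP within the last `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Hypothetical canary: a marker planted in the system prompt that should
# never appear in a reply unless the prompt has been leaked.
CANARY = "cnry-7f3a"


def leaked_canary(model_output: str, canary: str = CANARY) -> bool:
    """True if the reply contains the planted marker."""
    return canary in model_output


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 60.5)]
print(decisions)
print(leaked_canary("Here is my system prompt: cnry-7f3a ..."))
```

The `now` parameter exists only so the limiter can be exercised with fixed timestamps; in normal use it falls back to a monotonic clock.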

Open source — replicate it yourself:

Live Site · View Repo



🎵 Conversational BI & Generative Analytics

"Chat with Data" Agent — 100 Years of Music History

Streamlit App

MusicInsights AI — Conversational BI Dashboard

Python · LangChain · OpenAI · Streamlit

Business Context: Stakeholders need quick answers but lack SQL skills. This solves that by integrating an LLM directly into the dashboard — ask "What was the most popular genre in the 80s?" and get an instant data-backed answer. Analysis of 170k+ tracks spanning 100 years.

View Repo · Launch App


🚗 Automotive Market Intelligence & AI Agent

Advanced EDA with Tool-Calling AI for Market Trends

Streamlit App

Python · Gemini · Plotly

Business Context: Tool-Calling AI Agent using Gemini 2.5 Flash that writes and executes Python code in real-time to answer complex questions about vehicle depreciation, market saturation, and pricing trends.

View Repo · Launch App


📈 Strategic Revenue Optimization (Telecom)

Statistical Analysis for Plan Profitability & Churn

Python · Stats

Business Context: Comparative revenue analysis of mobile plans to identify user behaviors, inform marketing budget allocation, and maximize ARPU. Statistical hypothesis testing to validate profitability strategies.

View Repo · View Story


🎮 Global Gaming Market Strategy

Predicting Sales Success to Mitigate Launch Risks

Python · Stats

Business Context: Identifies success drivers in the console/PC gaming industry by analyzing global historical sales data. Builds a predictive framework to support Go-To-Market strategies and mitigate launch risks.

📓 Available as a deep-dive technical notebook — extensive EDA, hypothesis testing, and strategic analysis. Portuguese only.

View Technical Notebook


🇧🇷 Versão em Português

👋 Sobre Mim

Analytics Engineer & Data Analyst | Revenue Operations (RevOps) · AI & Agentic BI | SQL · Python · dbt · BigQuery · MCP

Analytics Engineer & Data Analyst com mais de 10 anos entre vendas consultivas de ERP (Omie — maior ERP cloud do Brasil), consultoria de negócios e analytics. Eu identifico onde empresas de SaaS, E-commerce e Fintech perdem receita e construo os pipelines de IA governados para capturá-la.

Já estive nas reuniões de receita. Sei o que o VP de Vendas precisa na segunda de manhã — e construo a analytics para responder antes de ser perguntado.

🛠️ Stack Técnica

Python SQL dbt Pipelines ELT BigQuery Databricks MCP Servers LangChain Claude/Gemini APIs FastAPI Looker Studio Streamlit React/NextJS

🧠 IA & Agentic BI

Construindo sistemas de analytics com IA governada — onde consultas em linguagem natural retornam insights determinísticos, governados pelo dbt, em 15 segundos. Sem alucinações. Sem metric drift.

📊 Stack de Negócios

Arquitetura de BI · Revenue Analytics · Analytics de Marketing & Funil · Unit Economics SaaS (CAC, LTV, MRR, Churn, ROAS) · Modelagem Financeira · Arquitetura ELT

🎓 Residente em Data Science na TripleTen (700h+) · Green Belt Lean Six Sigma · Finalista — Epic Games 2026


🚀 Projetos em Destaque

Categoria Projeto Stack
🏆 IA / Analytics Eng Full-Funnel AI Analytics Platform dbt · MCP · BigQuery · XGBoost · Claude/Gemini
RevOps / IA RevOps Lead Engine: Central de Comando B2B Python · Streamlit · Plotly · XAI
Estratégia de Produto / UXR Epic Games Store: Auditoria de Inteligência (2026) Python · Streamlit · Random Forest · NLP
GenAI / BI BI Conversacional & Analytics Generativo Streamlit · LLMs · LangChain
Auto / IA Inteligência de Mercado Automotivo & Agente IA Python · Gemini AI · Plotly
Vendas / Estratégia Otimização de Receita & Churn (Telecom) Python · Stats · Lógica de Negócios
Gaming / Estratégia Estratégia Global de Mercado de Games Python · Stats · EDA
Edu / SQL PunkSQL — Plataforma de Aprendizado de SQL Next.js · SQLite/WASM · Vercel · Google OAuth
Infra / AI Chatbot Infraestrutura Portfolio Website + AI Chatbot React · Gemini · Cloud Run · LangFuse · Docker


🏆 Full-Funnel AI Marketing Analytics Platform

Consultas em linguagem natural · dbt Semantic Layer · 7 MCP Servers · ML Lead Scoring · $0/mês

Full-Funnel AI Analytics — demo de consulta em linguagem natural

Contexto de Negócio: Empresas rodam anúncios no Google, Meta e canais orgânicos. Marketing reivindica os leads. Vendas diz que a qualidade é baixa. O CEO pergunta: "Onde devemos investir no próximo trimestre?"

Responder isso exige unir dados de 5+ plataformas, construir modelos de atribuição, pontuar leads e tornar tudo acessível para stakeholders não-técnicos. Este projeto constrói o sistema de produção — com custo base de $0/mês.

Números-chave:

  • 29 modelos dbt — Arquitetura Medallion completa (Bronze → Silver → Gold)
  • 2,2M+ linhas de dados reais Olist + dados sintéticos de marketing
  • 7 MCP Servers — Google Ads, Meta Ads, GA4, HubSpot, Salesforce, BigQuery, dbt Semantic Layer
  • 4 modelos de atribuição — First-Touch, Last-Touch, Linear, Time-Decay
  • 5 warehouses — BigQuery, DuckDB, Snowflake, Databricks, Supabase
  • $0/mês de custo base de infraestrutura

Ver Repo Demo Arquitetura Demo ao Vivo



🚀 RevOps Lead Engine: Central de Comando B2B com IA

RevOps Lead Engine Dashboard

Contexto: Plataforma RevOps totalmente autônoma — da descoberta de leads por ICP ao scoring com IA, modelagem preditiva de receita e rastreamento de retenção pós-venda. IA Explicável (XAI) em cada score.

Demo ao Vivo Ver Repo



⚡ Epic Games Store: Auditoria Estratégica de Ecossistema (2026)

Descobertas principais: Hardware Wall (-0.133 correlação), 4 Personas de Produto, 60% da satisfação do jogador impulsionada por fatores intangíveis. Finalista para a vaga de Data Analyst na Epic Games (2026).

Ver Apresentação Acessar Dashboard Ver Repo



>_ PunkSQL — Plataforma de Aprendizado de SQL

PunkSQL — Demo Plataforma de Aprendizado de SQL

Plataforma mobile-first para aprender SQL com estética cyberpunk. 80 desafios, 8 módulos (SELECT → CTEs), execução real de SQL no browser via SQLite/WASM, gamificação completa, Google OAuth, bilíngue EN/PT-BR.

Jogar Agora Ver Repo



⚙️ Infraestrutura Portfolio Website + AI Chatbot

AI chatbot com streaming · Google Cloud Run · CI/CD GitHub Actions · LLMOps com LangFuse · segurança 4 camadas · $0/mês

React TypeScript Gemini Docker Cloud Run GitHub Actions LangFuse

Eduardo Cornelsen Portfolio — Site ao Vivo

O que é: A infraestrutura de produção que roda este portfólio — não um projeto paralelo, mas o sistema ao vivo que você está vendo agora. Construído para ser prova de trabalho em si mesmo: chatbot de IA com streaming, containerizado via Docker multi-stage, implantado no Google Cloud Run via pipeline GitHub Actions zero-touch, com cada conversa rastreada no LangFuse para custo, latência e segurança.

Destaques:

  • Chatbot com streaming — Server-Sent Events entregam respostas token a token, sem espera
  • Segurança em 4 camadas — sanitização no cliente, rate-limit por IP, canary token, classificação de intenção (detecção de jailbreak)
  • LLMOps com LangFuse — cada conversa rastreada com custo estimado, latência e flag de segurança
  • CI/CD zero-touch — push para main, site no ar em ~3 minutos via GitHub Actions com OIDC
  • $0/mês em repouso — Cloud Run escala para zero quando ocioso

Open source — replique você mesmo:

Site ao Vivo Ver Repo


🎵 Conversational BI & Generative Analytics

Streamlit App

MusicInsights AI — Dashboard de BI Conversacional

Dashboard interativo com Consultor IA — gestores perguntam em português e recebem respostas baseadas em 100 anos de dados musicais (170k+ tracks).

Ver Repo Acessar App


🚗 Inteligência de Mercado Automotivo & AI Agent

Streamlit App

Agente de IA (Gemini 2.5 Flash) com Tool-Calling que escreve e executa código Python em tempo real para responder perguntas sobre depreciação, saturação de mercado e tendências de preço.

Ver Repo Acessar App


📈 Otimização de Receita e Estratégia de Churn (Telecom)

Python Stats

Análise comparativa de planos móveis para maximizar ARPU e identificar padrões de comportamento. Testes de hipóteses para validar estratégias de precificação e retenção.

Ver Repo Ver Story


🎮 Estratégia Global de Mercado de Games

Predição de Sucesso de Vendas para Mitigação de Riscos

Python Stats

Contexto de Negócio: Identifica drivers de sucesso na indústria de games analisando dados históricos globais de vendas. Framework preditivo para suportar estratégias Go-To-Market e mitigar riscos de lançamento.

📓 Disponível como notebook técnico — EDA extensivo, testes de hipóteses e análise estratégica. Somente em português.

Ver Notebook Técnico


🏆 Certifications & Education

Data Scientist - TripleTen · Business Administration - Insper · Green Belt - Falconi



⚡ Neural Link Activity

From: 24 October 2025 - To: 19 April 2026

Total Time: 324 hrs 31 mins

Python                     245 hrs 35 mins       ██████████████████▓░░░░░░   74.12 %
Markdown                   38 hrs 28 mins        ███░░░░░░░░░░░░░░░░░░░░░░   11.61 %
HTML                       10 hrs 43 mins        ▓░░░░░░░░░░░░░░░░░░░░░░░░   03.24 %
TypeScript                 9 hrs 29 mins         ▓░░░░░░░░░░░░░░░░░░░░░░░░   02.86 %
Other                      6 hrs 48 mins         ▓░░░░░░░░░░░░░░░░░░░░░░░░   02.06 %

📊 GitHub Stats




© 2026 Eduardo Cornelsen — Analytics Engineer & Data Analyst

Diagnosing revenue leakage. Building governed AI pipelines to capture it.

Pinned

  1. full-funnel-ai-analytics (Public)

    Full-Funnel AI Marketing Analytics. A modern data stack powered by dbt MetricFlow and MCP. Natural language insights across Google/Meta Ads, CRM, and 5 data warehouses. Includes XGBoost lead scorin…

    Python 3

  2. cv-educornelsen (Public)

    Production portfolio with a streaming AI chatbot (Gemini + SSE), containerized via Docker, deployed to Google Cloud Run through GitHub Actions CI/CD, with LangFuse LLMOps tracing every conversation…

    TypeScript 4

  3. punksql (Public)

    Mobile-first SQL learning platform (Next.js + SQLite/WASM + Vercel) - 80 challenges, 8 modules (SELECT → CTEs), in-browser SQL execution, Google OAuth, XP/levels/achievements. Bilingual EN/PT-BR.

    JavaScript 1

  4. revops_lead_engine (Public)

    🤖 Autonomous B2B RevOps Command Center. End-to-end SDR automation: from programmatic lead discovery & enrichment to AI lead scoring (XAI), outreach, and 90-day revenue scenario modeling. Features a…

    Python

  5. epic-store-analysis (Public)

    ⚡ Epic Game Store (EGS) Ecosystem Intelligence: A strategic Data Science & UXR audit of the Epic Games Store. Uses K-Means Clustering, NLP, and Predictive Modeling to identify UX friction points, h…

    HTML 1

  6. spotify-music-insights-ai (Public)

    Why are some songs popular? This Streamlit dashboard uses a LangChain AI agent (Gemini) to execute Pandas code on Spotify data, analyzing song features (danceability, energy, etc.) and popularity o…

    Python