I build data pipelines and automation systems that remove friction from technical workflows.
I come from industrial electronics and audio, but most of what I build today is Python pipelines: LLM-powered prospecting tools, market data observatories, financial scoring systems, and scraping infrastructure.
When a process is slow, manual, or repetitive, I build the tool that replaces it.
📍 Barcelona, Spain
| Project | Description | Tech |
|---|---|---|
| 🔍 Prospector B2B | 4-phase LLM pipeline for company discovery & prospecting: web audit, LinkedIn, YouTube, fit scoring | Python, Claude, SerpAPI, Selenium, SQLite |
| 💳 WealthOptimizer | Interactive credit risk scoring dashboard with SHAP explainability. AUC-ROC 0.918 | Python, Streamlit, Scikit-learn, SHAP |
| 🎹 EU Synth Market Observatory | Open dataset of second-hand synthesizer prices from European marketplaces. CC BY 4.0 | Python, SQLite, Docker |
| 🤖 Article Generator | Converts technical notes into structured articles using local LLMs. v6.0: multi-format (.md .txt .ipynb .rst .pdf), Mermaid support, section-chunking for large docs | Python, Ollama, Automation |
| 🔊 Clasificación Frecuencial MDS | Frequency classifier for audio samples using MFCC cross-correlation & MDS | R, warbleR, igraph |
| 🛠️ Bash Tools | CLI utilities: Conda env manager, Git search, file deduplication | Bash, Automation |
| 🏦 synth-bank-es | Full pipeline: INE microdata → CTGAN / GaussianCopula / TVAE → statistical evaluation → Streamlit dashboard | Python, SDV, CTGAN, Streamlit, Scipy |
| 📊 Synthetic Banking Dataset Generator | Rule-based generator of 1M+ synthetic banking customers across 3 economic scenarios. 52 fields, calibrated default model | Python, Pandas, NumPy, Multiprocessing |
| 🎸 Hispasonic Market Analysis | End-to-end pipeline on 5,962 second-hand synth listings: ETL, EDA, supply/demand modelling, HHI concentration, lagged correlation | Python, Pandas, Scipy, Seaborn, JupyterLab |
| 👔 compare_jobs | CLI tool to compare job listings across sources and analyse market demand by role, location and salary | Python, Pandas, CLI |
| 📷 OCR Translation Pipeline | Automated OCR, translation and AI post-processing workflow using local Ollama models. Runs fully offline | Bash, Tesseract, Ollama |
| 💼 Dashboard Mercado Laboral | Data Science job market and salary analysis dashboard | Python, Streamlit, Pandas |
| 💰 Cashbot | Financial analysis and trading strategy experiments with Python | Python, Pandas, Finance |
| 📚 Dataquest Projects | Data analysis portfolio from Dataquest learning path: market research, traffic analysis, survey data | Python, Pandas, Matplotlib |
| 🕷️ Using Scrapy & XPath | Web scraping foundations: Scrapy framework, XPath selectors and pipeline structure | Python, Scrapy, XPath |
mindmap
  root((Alberto Jiménez))
    B2B & Automation
      Prospector B2B
        4-phase pipeline · LLM
      compare_jobs
        Job market demand analysis
      Article Generator
        Ollama · multi-format · v6.0
      OCR Translation Pipeline
        Tesseract · Ollama · offline
      Bash Tools
        CLI utilities
    Finance & Data
      WealthOptimizer
        Credit Scoring · SHAP
      Dashboard Mercado Laboral
        Salaries · BI
      Cashbot
        Trading · financial analysis
      synth-bank-es
        CTGAN · GaussianCopula · TVAE
      Synthetic Banking Generator
        Rule-based · 1M customers · 3 scenarios
    Synthesizer Market
      IntelliSynthPrice
        Market observatory
      EU Synth Market Data
        Open dataset CC BY 4.0
      Hispasonic Market Analysis
        ETL · EDA · Supply/Demand · HHI
    Audio ML
      Clasificación Frecuencial MDS
        MFCC + MDS
    Learning & Portfolio
      Dataquest Projects
        Market analysis · traffic
      Using Scrapy & XPath
        Web scraping · foundations
- 🧩 LEGO Serious Play Facilitator - Certified
- 📊 Google Business Intelligence Professional - Coursera 2024
- 📈 Google Data Analytics - Coursera 2023
- 🐍 Data Scientist in Python - Dataquest
- 🎓 Master in BI & Data Science - IEBS 2020
