albertjimrod/README.md

Alberto Jiménez

Data Scientist · Pipeline Builder | Python · LLM Pipelines · Automation

Portfolio Kaggle LinkedIn Email


About Me

I build data pipelines and automation systems that remove friction from technical workflows.

My background is in industrial electronics and audio, but most of what I build today is Python pipelines: LLM-powered prospecting tools, market data observatories, financial scoring systems, and scraping infrastructure.

When a process is slow, manual, or repetitive, I build the tool that replaces it.

📍 Barcelona, Spain


Featured Projects

| Project | Description | Tech |
| --- | --- | --- |
| 🔍 Prospector B2B | 4-phase LLM pipeline for company discovery & prospecting: web audit, LinkedIn, YouTube, fit scoring | Python · Claude · SerpAPI · Selenium · SQLite |
| 💳 WealthOptimizer | Interactive credit risk scoring dashboard with SHAP explainability. AUC-ROC 0.918 | Python · Streamlit · Scikit-learn · SHAP |
| 🎹 EU Synth Market Observatory | Open dataset of second-hand synthesizer prices from European marketplaces. CC BY 4.0 | Python · SQLite · Docker |
| 🤖 Article Generator | Converts technical notes into structured articles using local LLMs. v6.0: multi-format (.md .txt .ipynb .rst .pdf), Mermaid support, section-chunking for large docs | Python · Ollama · Automation |
| 🔊 Clasificación Frecuencial MDS | Frequency classifier for audio samples using MFCC cross-correlation & MDS | R · warbleR · igraph |
| 🛠️ Bash Tools | CLI utilities: Conda env manager, Git search, file deduplication | Bash · Automation |
| 🏦 synth-bank-es | Full pipeline: INE microdata → CTGAN / GaussianCopula / TVAE → statistical evaluation → Streamlit dashboard | Python · SDV · CTGAN · Streamlit · Scipy |
| 📊 Synthetic Banking Dataset Generator | Rule-based generator of 1M+ synthetic banking customers across 3 economic scenarios. 52 fields, calibrated default model | Python · Pandas · NumPy · Multiprocessing |
| 🎸 Hispasonic Market Analysis | End-to-end pipeline on 5,962 second-hand synth listings: ETL, EDA, supply/demand modelling, HHI concentration, lagged correlation | Python · Pandas · Scipy · Seaborn · JupyterLab |
| 👔 compare_jobs | CLI tool to compare job listings across sources and analyse market demand by role, location and salary | Python · Pandas · CLI |
| 📷 OCR Translation Pipeline | Automated OCR, translation and AI post-processing workflow using local Ollama models. Runs fully offline | Bash · Tesseract · Ollama |
| 💼 Dashboard Mercado Laboral | Data Science job market and salary analysis dashboard | Python · Streamlit · Pandas |
| 💰 Cashbot | Financial analysis and trading strategy experiments with Python | Python · Pandas · Finance |
| 📚 Dataquest Projects | Data analysis portfolio from Dataquest learning path: market research, traffic analysis, survey data | Python · Pandas · Matplotlib |
| 🕷️ Using Scrapy & XPath | Web scraping foundations: Scrapy framework, XPath selectors and pipeline structure | Python · Scrapy · XPath |

Project Map

mindmap
  root((Alberto Jiménez))
    B2B & Automation
      Prospector B2B
        4-phase pipeline · LLM
      compare_jobs
        Job market demand analysis
      Article Generator
        Ollama · multi-format · v6.0
      OCR Translation Pipeline
        Tesseract · Ollama · offline
      Bash Tools
        CLI utilities
    Finance & Data
      WealthOptimizer
        Credit Scoring · SHAP
      Dashboard Mercado Laboral
        Salaries · BI
      Cashbot
        Trading · financial analysis
      synth-bank-es
        CTGAN · GaussianCopula · TVAE
      Synthetic Banking Generator
        Rule-based · 1M customers · 3 scenarios
    Synthesizer Market
      IntelliSynthPrice
        Market observatory
      EU Synth Market Data
        Open dataset CC BY 4.0
      Hispasonic Market Analysis
        ETL · EDA · Supply/Demand · HHI
    Audio ML
      Clasificación Frecuencial MDS
        MFCC + MDS
    Learning & Portfolio
      Dataquest Projects
        Market analysis · traffic
      Using Scrapy & XPath
        Web scraping · fundamentals

Tech Stack

Languages

Python SQL R Bash

LLM & AI

Claude Ollama

Data & ML

Pandas NumPy Scikit-learn XGBoost SHAP BigQuery

Web & Scraping

BeautifulSoup Selenium Flask

Infrastructure

Docker SQLite Linux

BI & Visualization

Streamlit Looker Studio Matplotlib


Certifications

  • 🧩 LEGO Serious Play Facilitator - Certified
  • 📊 Google Business Intelligence Professional - Coursera 2024
  • 📈 Google Data Analytics - Coursera 2023
  • 🐍 Data Scientist in Python - Dataquest
  • 🎓 Master in BI & Data Science - IEBS 2020

GitHub Stats

Top Languages


Let's connect!

Data Science · Pipeline Builder · Automation — always building.

Portfolio

Pinned

  1. bash_tools (Public)

    Tools I build for myself as I need them

    Shell

  2. dashboard-mercado-laboral (Public)

    Data Science salary dashboard analysis

  3. dataquest-projects (Public)

    Data analysis projects from Dataquest learning path

    Jupyter Notebook

  4. hispasonic (Public)

    Hispasonic Web Scraping & Data Analysis

    Jupyter Notebook

  5. Clasificacion-Frecuencial-MDS (Public)

    Frequency classification of audio samples using Multidimensional Scaling (MDS) and MFCC cross-correlation

    HTML

  6. article-generator (Public)

    Transforms technical notes into professional articles. Supports .md, .txt, .rst, .ipynb and .pdf. Outputs in Spanish, English or bilingual.

    Python