Who I Am

Andrea Alberti
GenAI Engineer & Data Scientist
I hold a double degree in Management and Computer Science (Data Science) and have gained substantial experience in multidisciplinary projects. I specialize in applying machine learning, deep learning, and most recently Generative AI techniques to develop innovative and automated solutions.
My academic journey, from Management Engineering (110/110 cum laude) to a Master's in Data Science (110/110 cum laude), has given me a unique blend of technical expertise and strategic thinking. This allows me to approach complex problems from both a business and a technological perspective.
Professionally, I focus on building intelligent systems—from multi-agent architectures that automate complex business processes to advanced conversational AI—to drive efficiency and reduce operational costs. My goal is to continuously improve myself, using AI to create tangible value.
MSc. Data Science - Computer Engineering
University of Pavia
BSc. Management Engineering
University of Brescia
Certification
Google Cloud Professional ML Engineer
Google Cloud
Technical Skills
Core Skills
AI Tools
Cloud & Infrastructure
Programming Languages
Frameworks & Libraries
Professional Projects
A selection of professional projects where I applied Generative AI and Machine Learning to solve real-world problems.
AI-Powered Tyre Selection Assistant
Refined and tested a Dialogflow CX conversational agent for the UK market, guiding users in tyre selection through a multi-agent system with RAG and external APIs. Successfully migrated the solution to Google Agent Development Kit (ADK), expanding to international markets and new vehicle categories with cross-cloud AWS-GCP integration.
Automated Document Validation System
Full-stack implementation of a multi-agent system to automate verification and validation of documents for public funding requests. Developed backend in Python and frontend in HTML, CSS, and JavaScript. Designed a modular and flexible architecture that allows agent behavior adaptation without code modification.
Advanced Knowledge Base Chatbot
Implementation of an advanced chatbot agent architecture that answers user questions from a knowledge base of web pages and PDF documents. The system uses a main orchestrator agent that routes requests to five specialized sub-agents. Developed a data management pipeline including PDF parsing, a chunking strategy, and a security layer that filters inappropriate questions.
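A chunking step of this kind can be sketched as follows. The project description does not say which strategy was used, so this assumes plain fixed-size character windows with overlap; `chunk_text` and its parameters are hypothetical names chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Assumption: character-based windows. A real pipeline might split on
    sentences, headings, or tokens instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap keeps a sentence that straddles a boundary visible in two neighbouring chunks, at the cost of storing some text twice.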
Real-Time Multimodal Agent
Development and implementation of a real-time multimodal conversational agent based on Google Agent Development Kit (ADK). The system can process audio and video inputs simultaneously, sustain fluid conversations, and autonomously perform browser operations through reasoning and tools. It uses an asynchronous dual-server architecture with WebSocket communication for low-latency bidirectional streaming.
Luxury Yacht Virtual Assistant
Implementation of a virtual assistant that answers user questions from a corporate knowledge base. Configured as a RAG (Retrieval-Augmented Generation) application, it uses Google Agent Development Kit (ADK) and Vertex AI Search to retrieve relevant information from a dedicated datastore. The architecture uses Gemini models to process requests and generate accurate, contextualized responses in Italian.
Multi-Agent Ticketing System
Implementation of a multi-agent ticketing system to automate user responses. Developed as a Proof of Concept (POC), the project involved creating specialized agents, each capable of interacting with external databases via APIs to provide accurate and comprehensive replies. The multi-agent architecture routes each user request to the most competent agent, reducing staff workload and management costs.
LLM-Based Email Classification Pipeline
Within a broader project aimed at automating a manual process and reducing operational costs, my contribution focused on creating an LLM-based pipeline for the classification and dispatch of certified emails (PEC). Key activities included prompt engineering and few-shot learning to refine model outputs, along with the development of metrics and analytical tools for evaluating system performance.
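The few-shot prompting mentioned above can be illustrated with a small prompt builder. This is only a sketch: the category names, example emails, and function names are invented for illustration and are not taken from the project, and the call to the actual LLM is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str   # body of a previously labelled email
    label: str  # its known category


def build_few_shot_prompt(labels: list[str], examples: list[Example], email_text: str) -> str:
    """Assemble an instruction, the labelled examples, and the new email."""
    lines = [
        "Classify the email into exactly one of these categories: "
        + ", ".join(labels) + ".",
        "",
    ]
    for ex in examples:
        lines += [f"Email: {ex.text}", f"Category: {ex.label}", ""]
    # The prompt ends with an open "Category:" slot for the model to fill.
    lines += [f"Email: {email_text}", "Category:"]
    return "\n".join(lines)


def parse_label(raw_output: str, labels: list[str], fallback: str = "unclassified") -> str:
    """Map the raw model output onto a known label, or onto the fallback."""
    cleaned = raw_output.strip().lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return fallback


if __name__ == "__main__":
    labels = ["invoice", "complaint"]  # hypothetical categories
    shots = [
        Example("Please find the attached invoice.", "invoice"),
        Example("I am unhappy with the service.", "complaint"),
    ]
    print(build_few_shot_prompt(labels, shots, "Invoice for March attached."))
```

Mapping any unexpected model output to a fallback label keeps the dispatch step from acting on a category that does not exist.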
Insurance Liquidation Engine - Demo Project
Developed a comprehensive multi-agent collaborative architecture for an internal Sharing Days demonstration. The system automatically analyzes, processes, and issues liquidation judgments on insurance claims. The architecture combines parallel agents for document analysis with sequential agents for final evaluation, using custom tools, built-in Google Cloud services, and MCP integration.
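The parallel-then-sequential pattern described above can be sketched with plain `asyncio`, without any agent framework. The two analysis coroutines and the decision rule are toy stand-ins invented for this example; they do not reflect the demo's real agents, tools, or business rules.

```python
import asyncio


async def analyze_policy(claim: dict) -> dict:
    """Toy 'agent': checks the claimed amount against the policy limit."""
    await asyncio.sleep(0)  # placeholder for an I/O-bound model or API call
    return {"within_limit": claim["amount"] <= claim["policy_limit"]}


async def analyze_invoice(claim: dict) -> dict:
    """Toy 'agent': totals the invoice line items."""
    await asyncio.sleep(0)
    return {"invoice_total": sum(claim["invoice_items"])}


def evaluate(claim: dict, findings: dict) -> str:
    """Sequential final step: runs only after all analyses have finished."""
    if findings["within_limit"] and findings["invoice_total"] == claim["amount"]:
        return "approve"
    return "manual_review"


async def run_pipeline(claim: dict) -> str:
    # Parallel stage: both document analyses run concurrently.
    results = await asyncio.gather(analyze_policy(claim), analyze_invoice(claim))
    findings: dict = {}
    for result in results:
        findings.update(result)
    # Sequential stage: the evaluator sees the merged findings.
    return evaluate(claim, findings)


if __name__ == "__main__":
    claim = {"amount": 800, "policy_limit": 1000, "invoice_items": [300, 500]}
    print(asyncio.run(run_pipeline(claim)))
```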
Research & Academic Work

Heart Disease Detection from Audio Signals
Advanced Biomedical Machine Learning
This study presents two machine learning models (MLP_Ensemble5 and MLP_Ensemble2) aimed at enhancing heart disease detection from heart sound recordings using ensemble techniques. The project involved data preprocessing, feature extraction (MFCCs, Chroma STFT, etc.), and feature selection using data from the PASCAL CHSC2011 challenge.

Disease Prediction with Graph Machine Learning
Financial Data Science
This study investigated the complex relationships between symptoms and diseases using network analysis techniques on a large dataset. A bipartite graph was created and analyzed using metrics like degree distribution, Hidalgo's method of reflections, betweenness centrality, and community detection. Novel features derived from these network metrics were used to train predictive models (Logistic Regression, Random Forest, MLP).
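For readers unfamiliar with the method of reflections, the sketch below applies it to a tiny disease-symptom bipartite graph. At step 0 each node's value is its degree; at each later step a disease takes the average of its symptoms' previous values, and a symptom takes the average of its diseases' previous values. The toy edges are invented for illustration and are not from the study's dataset.

```python
def method_of_reflections(edges: list[tuple[str, str]], iterations: int):
    """Return (disease_values, symptom_values) after the given number of reflections.

    edges: (disease, symptom) pairs of the bipartite graph.
    """
    disease_to_symptoms: dict[str, set[str]] = {}
    symptom_to_diseases: dict[str, set[str]] = {}
    for disease, symptom in edges:
        disease_to_symptoms.setdefault(disease, set()).add(symptom)
        symptom_to_diseases.setdefault(symptom, set()).add(disease)

    # Step 0: plain degrees.
    k_d = {d: float(len(s)) for d, s in disease_to_symptoms.items()}
    k_s = {s: float(len(d)) for s, d in symptom_to_diseases.items()}

    for _ in range(iterations):
        # Each side averages the previous-step values of its neighbours.
        new_d = {d: sum(k_s[s] for s in ss) / len(ss)
                 for d, ss in disease_to_symptoms.items()}
        new_s = {s: sum(k_d[d] for d in ds) / len(ds)
                 for s, ds in symptom_to_diseases.items()}
        k_d, k_s = new_d, new_s
    return k_d, k_s


if __name__ == "__main__":
    toy_edges = [("flu", "fever"), ("flu", "cough"), ("cold", "cough")]
    print(method_of_reflections(toy_edges, iterations=1))
```

After one reflection, a disease's value is the average number of diseases that share its symptoms, which is one way to turn network structure into a feature for a predictive model.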

Review Helpfulness Prediction with Big Data
Data Science & Big Data Analytics
This project involved a comprehensive analysis of an Amazon book review dataset using big data tools (Hadoop, Spark) and data science techniques. It investigated factors influencing review helpfulness, such as review length, sentiment, and user rating. Predictive models (Random Forest, SVR, MLP) were built using Word2Vec embeddings to estimate review helpfulness.

Clickbait Detection in News Headlines
Machine Learning
Implemented and compared Multinomial Naïve Bayes (MNB) and Logistic Regression (LR) classifiers for identifying clickbait headlines from a dataset of 32,000 examples. The project evaluated performance under two scenarios: maximizing overall accuracy and minimizing the False Positive Rate (FPR). Utilized Bag-of-Words feature representation with different vocabulary sizes.
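One common way to trade accuracy for a lower False Positive Rate is to raise the decision threshold on the classifier's scores. The sketch below assumes that approach, which the project description does not confirm; the function names and sample scores are hypothetical.

```python
def false_positive_rate(y_true: list[int], y_pred: list[int]) -> float:
    """FPR = FP / (FP + TN), computed over the true negatives (label 0)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def pick_threshold(y_true: list[int], scores: list[float], max_fpr: float) -> float:
    """Return the lowest score threshold whose FPR does not exceed max_fpr.

    FPR can only fall as the threshold rises, so the first candidate that
    satisfies the limit also flags the most headlines as clickbait.
    """
    for threshold in sorted(set(scores)):
        preds = [1 if s >= threshold else 0 for s in scores]
        if false_positive_rate(y_true, preds) <= max_fpr:
            return threshold
    return float("inf")  # no threshold works: predict "not clickbait" for all


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1]            # 1 = clickbait (made-up sample)
    scores = [0.1, 0.4, 0.6, 0.5, 0.9]  # made-up classifier scores
    print(pick_threshold(labels, scores, max_fpr=0.0))
```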

DDoS Attack Detection and Mitigation
Enterprise Digital Infrastructure
This project experimentally assessed the impact of DNS reflection and amplification attacks within a controlled local network environment. It explored how different DNS request types affect amplification factors and analyzed the consequences on the target system's latency and the DNS server's resource utilization (CPU, RAM). Attacks were simulated using custom Scapy scripts with IP address spoofing.

Cake Classification Features Analysis
Machine Learning
This project developed models to classify images into 15 different cake categories. It compared two main approaches: using a Multi-Layer Perceptron (MLP) with low-level image features (Color Histogram, Edge Direction Histogram, Co-occurrence Matrix), and using an MLP with neural features extracted from a pre-trained CNN (PVMLNet). Transfer learning was also explored by adapting the pre-trained CNN.

Vanishing Points Detection in Images
Computer Vision
Developed two image processing programs. The first focuses on image binarization using a custom histogram-based thresholding technique, offering automatic and manual tuning. The second program detects vanishing points and lines using techniques like the Canny edge detector, probabilistic Hough transform, and the RANSAC algorithm. Both programs include command-line interfaces.

Sentiment Analysis on Social Media
Machine Learning
This project developed and compared Multinomial Naive Bayes (MNB) and Logistic Regression (LR) classifiers for predicting sentiment (positive/negative) in movie reviews, using the dataset from Maas et al. The study explored the impact of vocabulary size and preprocessing techniques like stopword removal and stemming on classifier performance. Analysis included evaluating accuracy, overfitting trends, and identifying the most impactful words.
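A minimal from-scratch Multinomial Naive Bayes over Bag-of-Words counts shows the idea behind the first classifier. This is a teaching sketch with whitespace tokenization and Laplace smoothing on four invented reviews; it is not the project's code, and it omits stopword removal and stemming.

```python
import math
from collections import Counter


def train_mnb(docs: list[str], labels: list[str], alpha: float = 1.0):
    """Estimate log priors and smoothed per-class word log-likelihoods."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    log_prior: dict[str, float] = {}
    log_likelihood: dict[str, dict[str, float]] = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d.lower().split())
        # Laplace smoothing: add alpha to every vocabulary word's count.
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_likelihood


def predict_mnb(model, doc: str) -> str:
    """Return the class with the highest log posterior; unseen words are skipped."""
    log_prior, log_likelihood = model
    scores = {
        c: log_prior[c] + sum(log_likelihood[c][w]
                              for w in doc.lower().split()
                              if w in log_likelihood[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    docs = ["great fun film", "boring dull film", "great acting", "dull plot"]
    labels = ["pos", "neg", "pos", "neg"]
    model = train_mnb(docs, labels)
    print(predict_mnb(model, "great film"))
```

The per-class word log-likelihoods are also what make it possible to rank the most impactful words: a word with a large gap between its positive and negative log-likelihood pushes the prediction strongly one way.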
Interactive AI Demos
Explore cutting-edge AI capabilities through interactive demonstrations. From RAG-powered research tools to autonomous multi-agent systems.
Research Paper Explorer
Chat with my academic papers using RAG. Ask questions about my research in ML, DL, NLP, and Computer Vision.
AI Board of Directors
Multi-agent system simulating a board of expert advisors. Watch agents debate and reach consensus on strategic decisions.
Autonomous Research Assistant
AI agent with tools for web search, data analysis, and report generation. Demonstrates agentic workflows and tool use.
Beyond Work
Other things I love to do

Active Body, Active Mind
Making time to look after my health matters to me. I set up a small home gym, and I also love playing tennis and football.

The Perfect Mix of Passion, Ability, and Strategy
I love Formula 1 because it represents the pinnacle of racing, combining cutting-edge technology with human skill and strategic thinking.
If you no longer go for a gap which exists, you are no longer a racing driver. — Ayrton Senna

A Long-Lasting Family Tradition
I inherited my passion for football from my grandfather. I love watching and analyzing matches, and I have been a big AC Milan fan for as long as I can remember.
Let's Connect
Open to opportunities and collaborations. Feel free to reach out!