Hands-On LLM Engineering with Python (Part 2)
Who is this for?
Students, developers, and anyone who has completed Hands-On LLM Engineering with Python (Part 1), or who already understands the basics of calling LLMs and wants to go deeper into retrieval systems, vector search, neural embeddings, and multi-agent architectures.
If you enjoyed Part 1 and want to move from “LLM tools” to building real, intelligent systems, this class is for you.
Tired of surface-level RAG tutorials?
Most RAG guides stop at “upload a PDF and ask questions.”
This session goes further, focusing on how retrieval works, how embeddings represent meaning, and how to design proper agent pipelines you can trust in production.
Who is leading the session?
The session is led by Dr. Stelios Sotiriadis, CEO of Warestack and Associate Professor at Birkbeck, University of London, specialising in cloud computing, distributed systems, and AI engineering.
Stelios has worked with Huawei, IBM, Autodesk and several startups, holds a PhD from the University of Derby, completed postdoctoral research at the University of Toronto, and has been teaching in London since 2018.
He founded Warestack in 2021, building developer-focused automation software used internationally.
What we’ll cover
A hands-on deep dive into Retrieval-Augmented Generation (RAG), embeddings, and agent architectures, including:
How embeddings are generated using deep neural networks
Understanding vector spaces and meaning representation
Using FAISS for high-performance similarity search
Designing a real RAG pipeline: indexing → retrieval → generation
Choosing the right embedding model (local or cloud)
Evaluating retrieval quality and fixing common RAG failures
Multi-agent concepts: planners, tools, memory, delegation
Building simple multi-agent workflows with Python
Using ChromaDB or FAISS for vector memory
End-to-end examples: indexing documents, retrieving context, building agents that collaborate
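The end-to-end flow listed above (indexing → retrieval → generation) fits in a short sketch. This is illustrative only, not the workshop's code: the toy embed function stands in for a real embedding model, and the brute-force cosine search stands in for a FAISS or ChromaDB index.

```python
import math
from collections import Counter

DIM = 64

def embed(text):
    """Toy 'embedding': a deterministic hashed bag-of-words vector.
    A real pipeline would call an embedding model here instead."""
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        vec[sum(ord(c) for c in word) % DIM] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

# 1) Indexing: embed each document once and store the vectors.
docs = [
    "FAISS performs fast similarity search over dense vectors",
    "ChromaDB stores embeddings with metadata for retrieval",
    "Multi-agent systems delegate tasks between planner and tools",
]
index = [(doc, embed(doc)) for doc in docs]

# 2) Retrieval: embed the query and rank stored documents by similarity.
query = "fast similarity search with FAISS"
q_vec = embed(query)
context = max(index, key=lambda pair: cosine(q_vec, pair[1]))[0]

# 3) Generation: inject the retrieved context into the LLM prompt.
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(context)
```

The query about "fast similarity search" retrieves the FAISS document, which would then be injected into the prompt sent to the model.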
This session combines theory, fundamentals, and practical code you can reuse.
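Multi-agent delegation, at its simplest, is a planner choosing a tool and routing input to it. A minimal rule-based sketch of that loop (the planner and the tool names here are illustrative stand-ins for LLM-driven components):

```python
def calculator(expression: str) -> str:
    """Toy tool: evaluate a plain arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}))

def word_count(text: str) -> str:
    """Toy tool: count the words in a piece of text."""
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_count": word_count}

def planner(task: str) -> tuple[str, str]:
    """Rule-based stand-in for an LLM planner: choose a tool and its input."""
    if any(ch.isdigit() for ch in task):
        return "calculator", task
    return "word_count", task

def run_agent(task: str) -> str:
    tool_name, tool_input = planner(task)
    return TOOLS[tool_name](tool_input)

print(run_agent("2 + 3 * 4"))          # "14"
print(run_agent("count these words"))  # "3"
```

In the workshop the planner is an LLM and the tools can include retrieval over a vector store, but the delegation structure is the same.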
Why FAISS and deeper theory?
To build reliable retrieval systems, you must understand:
how embeddings capture meaning
how similarity search actually works
how to design scalable vector indexes
why agents need structured memory
how RAG interacts with agent workflows
FAISS gives you full control and high performance, and the theory helps you reason about quality, errors, and architectural decisions.
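The operation a flat (exact) FAISS index performs can be reproduced in a few lines of plain Python: store vectors as rows, then rank them by squared L2 distance to the query. FAISS's value is doing this at scale with optimised and approximate index structures; this sketch is only meant to make the underlying computation concrete.

```python
import random

random.seed(0)

# A flat index is just a collection of stored vectors: one per document.
DIM, N_DOCS = 8, 100
index_vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_DOCS)]

def search(query_vec, k=3):
    """Exact k-nearest-neighbour search by squared L2 distance,
    the same result a flat L2 index returns."""
    dists = [
        (sum((a - b) ** 2 for a, b in zip(vec, query_vec)), i)
        for i, vec in enumerate(index_vectors)
    ]
    return sorted(dists)[:k]  # (distance, doc_id) pairs, nearest first

# A slightly perturbed copy of document 42 should come back as document 42.
query_vec = [v + 0.01 * random.gauss(0, 1) for v in index_vectors[42]]
top = search(query_vec)
print(top[0][1])  # 42
```

Understanding this brute-force baseline is what lets you reason about when an approximate index trades accuracy for speed.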
What are the requirements?
Bring a laptop with Python installed (Windows, macOS, or Linux), along with VS Code or a similar IDE. At least 10GB of free disk space and 8GB of RAM are recommended for local embedding models and FAISS indexing.
If you think your laptop may struggle with this, please contact Stelios before registering.
What is the format?
A 3-hour live session including:
Interactive theory
Hands-on coding
Step-by-step exercises
Small-group support
Three short breaks
Q&A and mini quizzes
This is a practical workshop centred on building working retrieval and agent systems.
Prerequisites
You should already be comfortable with Python and have completed:
Hands-On LLM Engineering with Python (Part 1)
OR have equivalent knowledge of calling LLMs and basic embeddings.
What comes after?
Participants will receive an optional small project involving:
building a mini RAG system
evaluating retrieval performance
experimenting with multi-agent workflows
Personalised one-to-one feedback is available.
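As a flavour of the evaluation task, retrieval quality is often summarised with metrics such as recall@k: the fraction of queries whose relevant document appears in the top-k results. A small sketch with made-up data (the query and document ids are illustrative):

```python
def recall_at_k(ranked_results, relevant, k):
    """ranked_results: query -> ranked list of doc ids (best first).
    relevant: query -> the single gold doc id for that query."""
    hits = sum(
        1 for q, ranked in ranked_results.items() if relevant[q] in ranked[:k]
    )
    return hits / len(ranked_results)

ranked_results = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d2", "d9", "d4"],
}
relevant = {"q1": "d1", "q2": "d5"}

print(recall_at_k(ranked_results, relevant, k=3))  # 0.5
```

Here q1's gold document d1 appears in its top 3 but q2's d5 does not, so recall@3 is 0.5.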
Is it just one session?
This is Part 2 in the applied AI sequence.
Upcoming sessions will dive deeper into:
advanced embedding models
LangChain and orchestration frameworks
memory systems
production-ready RAG
multi-agent execution graphs
evaluation and monitoring
You can choose later whether to join the next levels.
How many participants?
To keep the class interactive, only 15 spots are available.
Please register early.