</>

Advanced Programming for Data Scientists

Vibe Coding &
Agents Building

Coding with AI, and building AI that acts.

Tel Aviv University · 3-hour session

Roadmap

Where we're going today

Before class

You already set up your machine ✅

GitHub Student Pack (with your @mail.tau.ac.il email) → Copilot Pro → VS Code → the Copilot extension. If not — do it now, it's on the course site.

02

Vibe Coding

Ask → visualize → test → fix → iterate.

Warm-up ispow2(n)

Give the AI the constraints, not just the goal

sol1.py
# no bin / len / logs / strings — bit ops only. 0 is NOT a power of two.
def ispow2(n):
    return n > 0 and (n & (n - 1)) == 0

Tell it the forbidden built-ins up front — or it reaches for bin().

The mental model

The vibe-coding loop

Ask Visualize Test Refactor safely Iterate

You stay the engineer. The AI is a very fast, very literal pair-programmer.

Demo longest_run(n)

Watch the loop on a real question

longest_run(n) → length of the longest run of consecutive 1 bits.

▶ Live now longest_run

🎬 Switch to VS Code

Full walkthrough, live — ask, visualize, test, fix, iterate.

The takeaway

Passing tests ≠ acceptable solution

🎯 Constraints first

The AI happily breaks the rules unless you state them. bin() worked — but was forbidden.

🧪 Tests = freedom

Once you have tests, you can let the AI rewrite the code and know instantly if it broke.

Before your turn

How to talk to the AI

Your turn

reverse_bits(x, num_bits) — 30 min

Reverse the low num_bits bits of x. Same loop: ask (with constraints) → visualize → test → fix → improve.

Timer + visualizer are on the course site → In Class.

03

Building AI Agents

with PydanticAI

The core idea

A single answer vs. an agent that acts

🗨️ One LLM request

  • Only its training data
  • No live info, no your data
  • Can't verify itself
  • One shot
vs

🤖 Agent + tools

  • Fetches real info via tools
  • Grounds in your sources
  • Retries, loops, checks
  • Structured output

What makes it an agent

Reason → act → observe → repeat

Prompt
LLMdecide
Call toolsearch, run…
Read result
Answerstructured

Remove the loop and the tools → you're back to one chat message.

What is a tool?

A normal typed function the model can call

tool
@agent.tool_plain
def word_count(text: str) -> int:
    """Return the number of words in the given text."""
    return len(text.split())

The docstring is the tool's instruction manual for the model.

Why bother

Tools + a loop beat better prompting

An LLM alone is a brilliant intern with no phone, no internet, no notebook. Tools give it the phone and the notebook.

SWE-benchGAIAτ-bench

On "doing" tasks, tool-using agents complete far more than single prompts.

Your stack = 3 choices

Framework · Model · Harness

🧱 Framework

PydanticAI, LangGraph, LlamaIndex…
We use PydanticAI.

🧠 Model

Gemini, Claude, GPT, Llama…
Swappable in one line. Gateway = API / OpenRouter / Ollama.

🛠️ Harness

Tools · observability · tests · guardrails.
Where the real work is.

Model & framework are easy swaps. The harness is what makes an agent good.

The point

The harness is the agent

Tools Observability Tests & evals Guardrails Memory

Same model, same framework — a strong harness is the difference between a demo and something you'd ship.

Architectures · first cut

Workflow vs. agent

🧭 Workflow

  • You wire the steps in code
  • Path is predefined
  • Predictable, cheap, debuggable
vs

🤖 Agent

  • The model decides next step
  • Path emerges at runtime
  • Flexible — but slower & pricier

Same brick underneath: the augmented LLM (model + tools + memory).

Architectures · the ladder simple → complex

Six patterns you compose

Scaling up · & the one rule

Topologies — and start simple

Hierarchical

Supervisors of workers. High control, big tasks.

Swarm

Peers, no boss. Exploration at scale.

Mesh

3–8 peers, tight loops on one artifact.

Rule: every layer adds cost, latency & failure points. Use the fewest pieces that solve it.

Tools at scale

MCP — “USB-C for AI tools”

One open standard. Plug your agent into ready-made servers — GitHub, Slack, databases, filesystem, browser — and it gains all their tools at once.

Write onceReuse across modelsClient & server

No credit card needed

Free tokens — and a cap so the loop can't run away

Because agents loop, always cap the number of requests.

▶ Live first_agent.py

One tool · free model · structured output · capped

first_agent.py
class Answer(BaseModel):
    result: int

agent = Agent("google-gla:gemini-2.0-flash", output_type=Answer)

@agent.tool_plain
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

out = agent.run_sync("What is 21 + 21? Use the add tool.",
                     usage_limits=UsageLimits(request_limit=5))
print(out.output.result)   # -> 42

Assume it misbehaves

Failure modes → guardrails

At home · workshop

Build a research assistant agent

Skeleton Simple tool Real tool Parallel + merge Observe Eval

Fan out with asyncio.gather, fan in with a synthesizer — then trace it and eval it. Full guide on the site.

Recap

Ask · Visualize · Test
Fix · Iterate

The same loop — from a bit trick to a parallel agent pipeline.

Course site has everything: prep · demo · challenge · workshop

← → to navigate · F fullscreen · press any key
1 / 1