|

A Coding Implementation of an Advanced Tool-Using AI Agent with Semantic Kernel and Gemini


In this tutorial, we build an advanced AI agent using Semantic Kernel combined with Google’s Gemini free model, and we run it seamlessly on Google Colab. We begin by wiring Semantic Kernel plugins as tools, such as web search, math evaluation, file I/O, and note-taking, and then let Gemini orchestrate them through structured JSON outputs. We watch the agent plan, call tools, process observations, and deliver a final answer. Check out the FULL CODES here.

!pip -q install semantic-kernel google-generativeai duckduckgo-search rich


import os, re, json, time, math, textwrap, getpass, pathlib, typing as T
from rich import print
import google.generativeai as genai
from duckduckgo_search import DDGS


GEMINI_API_KEY = os.getenv("GEMINI_API_KEY") or getpass.getpass("🔑 Enter GEMINI_API_KEY: ")
genai.configure(api_key=GEMINI_API_KEY)
GEMINI_MODEL = "gemini-1.5-flash"
model = genai.GenerativeModel(GEMINI_MODEL)


import semantic_kernel as sk
try:
    from semantic_kernel.functions import kernel_function
except Exception:
    from semantic_kernel.utils.function_decorator import kernel_function

We begin by installing the libraries and importing the essential modules, including Semantic Kernel, Gemini, and DuckDuckGo search. We set up our Gemini API key and model to generate responses, and we prepare Semantic Kernel’s kernel_function decorator to register our custom tools. Check out the FULL CODES here.
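
As an optional sanity check (our own addition, not part of the original walkthrough; the prompt text is arbitrary), we can send a trivial request to confirm the key and model respond before building anything on top of them:

# Optional sanity check (assumption: illustrative, not in the original notebook).
# Confirms the Gemini key and model are usable before we wire up the agent.
probe = model.generate_content("Reply with the single word: ready")
print(probe.text.strip())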

class AgentTools:
    """Semantic Kernel-native toolset the agent can call."""


    def __init__(self):
        self._notes: list[str] = []


   @kernel_function(title="web_search", description="Search the net for recent information; returns JSON checklist of {title,href,physique}.")
   def web_search(self, question: str, okay: int = 5) -> str:
       okay = max(1, min(int(okay), 10))
       hits = checklist(DDGS().textual content(question, max_results=okay))
       return json.dumps(hits[:k], ensure_ascii=False)


   @kernel_function(title="calc", description="Consider a protected math expression, e.g., '41*73+5' or 'sin(pi/4)**2'.")
   def calc(self, expression: str) -> str:
       allowed = {"__builtins__": {}}
       for n in ("pi","e","tau"): allowed[n] = getattr(math, n)
       for fn in ("sin","cos","tan","asin", "sqrt","log","log10","exp","flooring","ceil"):
           allowed[fn] = getattr(math, fn)
       return str(eval(expression, allowed, {}))


   @kernel_function(title="now", description="Get the present native time string.")
   def now(self) -> str:
       return time.strftime("%Y-%m-%d %H:%M:%S")


   @kernel_function(title="write_file", description="Write textual content to a file path; returns saved path.")
   def write_file(self, path: str, content material: str) -> str:
       p = pathlib.Path(path).expanduser().resolve()
       p.dad or mum.mkdir(mother and father=True, exist_ok=True)
       p.write_text(content material, encoding="utf-8")
       return str(p)


   @kernel_function(title="read_file", description="Learn textual content from a file path; returns first 4000 chars.")
   def read_file(self, path: str) -> str:
       p = pathlib.Path(path).expanduser().resolve()
       return p.read_text(encoding="utf-8")[:4000]


   @kernel_function(title="add_note", description="Persist a brief notice into reminiscence.")
   def add_note(self, notice: str) -> str:
       self._notes.append(notice.strip())
       return f"Notes saved: {len(self._notes)}"


   @kernel_function(title="search_notes", description="Search notes by key phrase; returns prime matches.")
   def search_notes(self, question: str) -> str:
       q = question.decrease()
       hits = [n for n in self._notes if q in n.lower()]
       return json.dumps(hits[:10], ensure_ascii=False)


kernel = sk.Kernel()
tools = AgentTools()
kernel.add_plugin(tools, "agent_tools")

We define an AgentTools class as our Semantic Kernel toolset, giving the agent abilities such as web search, safe math calculation, time retrieval, file read/write, and lightweight note storage. We then initialize the Semantic Kernel and register these tools as a plugin so that the agent can invoke them during reasoning. Check out the FULL CODES here.
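
Because kernel_function only attaches metadata, the tools remain plain Python methods, so a quick direct check (our own addition, not part of the original flow) confirms they behave as expected before Gemini ever calls them:

# Direct tool invocations (assumption: illustrative check, not in the original post).
print(tools.calc("41*73+5"))      # -> "2998"
print(tools.now())                # current local timestamp
print(tools.add_note("Chandrayaan-3 landed near the lunar south pole."))
print(tools.search_notes("lunar"))  # JSON list containing the note above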

def list_tools() -> dict[str, dict]:
    registry = {}
    for name in ("web_search", "calc", "now", "write_file", "read_file", "add_note", "search_notes"):
        fn = getattr(tools, name)
        desc = getattr(fn, "description", "") or fn.__doc__ or ""
        sig = "()" if name in ("now",) else "(**kwargs)"
        registry[name] = {"callable": fn, "description": desc.strip(), "signature": sig}
    return registry


TOOLS = list_tools()


CATALOG = "\n".join(
    [f"- {n}{v['signature']}: {v['description']}" for n, v in TOOLS.items()]
)


SYSTEM = f"""You're a meticulous tool-using AI agent.
You'll be able to name TOOLS by returning ONLY a JSON object:
{{"software":"<title>","args":{{...}}}}
After ending all steps, reply with:
{{"final_answer":"<reply with citations/steps>"}}


TOOLS obtainable:
{CATALOG}


Guidelines:
- Choose factuality; cite web_search outcomes as [title](url).
- Maintain steps minimal; at most 8 software calls.
- For file outputs, use write_file and point out the saved path.
- If a software error happens, regulate arguments and check out once more.
"""


def extract_json(s: str) -> dict | None:
    for m in re.finditer(r"{.*}", s, flags=re.S):
        try: return json.loads(m.group(0))
        except Exception: continue
    return None

We create a list_tools helper to collect all available tools, their descriptions, and signatures into a registry for the agent. We then build a CATALOG string that lists these tools and embed it into the SYSTEM prompt, which instructs Gemini how to call tools in strict JSON format and return a final answer. Finally, we define extract_json to safely parse tool calls or final answers from the model’s output. Check out the FULL CODES here.
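
For example (an illustrative check we add here, not in the original post; the sample string is arbitrary), extract_json tolerates prose wrapped around the JSON object, which is exactly the failure mode the agent loop below guards against:

# extract_json recovers the JSON object embedded in free-form model text
# (assumption: illustrative example, not part of the original notebook).
sample = 'Sure, let me search.\n{"tool": "web_search", "args": {"query": "Chandrayaan-3", "k": 3}}'
print(extract_json(sample))          # {'tool': 'web_search', 'args': {'query': 'Chandrayaan-3', 'k': 3}}
print(extract_json("no json here"))  # None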

def run_agent(task: str, max_steps: int = 8, verbose: bool = True) -> str:
    transcript: list[dict] = [{"role":"system","parts":[SYSTEM]},
                              {"role":"user","parts":[task]}]
    observations = ""
    for step in range(1, max_steps+1):
        content = []
        for m in transcript:
            role = m["role"]
            for part in m["parts"]:
                content.append({"text": f"[{role.upper()}]\n{part}\n"})
        if observations:
            content.append({"text": f"[OBSERVATIONS]\n{observations[-4000:]}\n"})
        resp = model.generate_content(content, request_options={"timeout": 60})
        text = resp.text or ""
        if verbose:
            print(f"\n[bold cyan]Step {step} - Model[/bold cyan]\n{textwrap.shorten(text, 1000)}")
        cmd = extract_json(text)
        if not cmd:
            transcript.append({"role":"user","parts":[
                "Please output strictly one JSON object per your rules."
            ]})
            continue
        if "final_answer" in cmd:
            return cmd["final_answer"]
        if "tool" in cmd:
            tname = cmd["tool"]; args = cmd.get("args", {})
            if tname not in TOOLS:
                observations += f"\nToolError: unknown tool '{tname}'."
                continue
            try:
                out = TOOLS[tname]["callable"](**args)
                out_str = out if isinstance(out, str) else json.dumps(out, ensure_ascii=False)
                if len(out_str) > 4000: out_str = out_str[:4000] + "...[truncated]"
                observations += f"\n[{tname}] {out_str}"
                transcript.append({"role":"user","parts":[f"Observation from {tname}:\n{out_str}"]})
            except Exception as e:
                observations += f"\nToolError {tname}: {e}"
                transcript.append({"role":"user","parts":[f"ToolError {tname}: {e}"]})
        else:
            transcript.append({"role":"user","parts":[
                "Your output must be a single JSON with either a tool call or final_answer."
            ]})
    return "Reached step limit. Summarize findings:\n" + observations[-1500:]

We run an iterative agent loop that feeds the system and user context to Gemini, enforces JSON-only tool calls, executes the requested tools, feeds observations back into the transcript, and returns a final_answer; if the model drifts from the schema, we nudge it, and if it hits the step cap, we summarize the findings. Check out the FULL CODES here.
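
Since run_agent is self-contained, it can also be exercised on lighter tasks before the full demo; this smoke test (our own addition, with an arbitrary prompt) typically resolves in one or two tool calls:

# Lighter smoke test of the agent loop (assumption: example task is ours, not from the post).
quick = run_agent("Use the calc tool to compute sqrt(2)**2 and report the current time.", max_steps=4)
print(quick)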

DEMO = (
    "Find the top 3 concise facts about Chandrayaan-3 with sources, "
    "compute 41*73+5, store a 3-line summary into '/content/notes.txt', "
    "add the summary to notes, then show the current time and return a clean final answer."
)


if __name__ == "__main__":
   print("[bold]🔧 Instruments loaded:[/bold]", ", ".be a part of(TOOLS.keys()))
   ans = run_agent(DEMO, max_steps=8, verbose=True)
   print("n" + "="*80 + "n[bold green]FINAL ANSWER[/bold green]n" + ans + "n")

We define a demo task that makes the agent search, compute, write a file, save notes, and report the current time. We then run the agent end to end, printing the loaded tools and the final answer so we can verify the complete tool-use workflow in a single pass.
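
To extend the blueprint with another tool, a new @kernel_function method can be added to AgentTools; note that list_tools() enumerates tool names explicitly, so the new name must also be registered (and CATALOG/SYSTEM rebuilt) before Gemini can see it. A minimal sketch, where word_count is our own hypothetical example:

# Sketch of adding one more tool (assumption: 'word_count' is a hypothetical example of ours).
class ExtendedTools(AgentTools):
    @kernel_function(name="word_count", description="Count words in a text string.")
    def word_count(self, text: str) -> str:
        return str(len(text.split()))

tools = ExtendedTools()
kernel.add_plugin(tools, "agent_tools_v2")

# list_tools() hard-codes the tool names, so add the new entry by hand
# (alternatively, extend the tuple inside list_tools and rebuild CATALOG/SYSTEM
# so the prompt advertises the new tool to Gemini).
TOOLS = list_tools()
TOOLS["word_count"] = {"callable": tools.word_count,
                       "description": "Count words in a text string.",
                       "signature": "(**kwargs)"}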

In conclusion, we see how Semantic Kernel and Gemini work together to form a compact yet powerful agentic system inside Colab. We not only test tool calls but also see how results flow back into the reasoning loop to produce a clean final answer. We now have a reusable blueprint for extending the agent with more tools or tasks, and we show that building a practical, advanced AI agent can be both simple and efficient with the right combination of frameworks.



