
Ukoni: Architecting an agent-driven interface

I built the CRUD powering Ukoni (https://ukoni.app) and then tried to use it for the first time to plan our shopping this weekend. It was a pretty painful experience. There were usability issues, such as typing in a search term and then having to type it again to create the item if it doesn't exist; data model issues, such as not being able to add a product (rather than a product variant) to the shopping list; some mobile responsiveness problems; and some missing features, such as creating meal plans and then generating shopping lists from them. After writing out this laundry list, it might seem like the app is complete garbage, but Rome wasn't built in a day; they were laying bricks every hour.

I think the biggest takeaway though (and it's something I alluded to in my introductory note for this project) is that a traditional UX sucks for this kind of thing.

As I put it in that note: "Another very important factor is the interface. All that I've listed out here was covered by the Notion database template I mentioned above. However, it fell into disuse because keeping it up to date was incredibly tedious. I'm keen to try to build a different, more natural language interface for this system, which should mean I don't abandon using it after two shopping trips."

I had to create each product and each variant as we went through the things to put on the list. Part of that is a cold-start problem and shouldn't need to be done to the same extent again, but it's still not a great way to interact with data. It's true that forms and form inputs are how we've entered and edited most data for the longest time, but that has always been a process we've adapted to rather than the ideal interaction.

It reminds me of something my Human-Computer Interaction (HCI) lecturer at university said: cars were originally designed with manual transmission as the way to drive them, but automatic transmission was a much better interaction, and ultimately driverless cars would be an even better interaction pattern. The constraints the builders of those eras worked under meant they couldn't go straight from inventing the combustion engine to driverless cars.

Now that we can use natural language even for structured input, it opens up a much better user experience and better affordances for users. I've already started seeing this in the wild, even before the terminology of agents took off. In 2023, for example, I completed referencing checks for a tenancy application via a chatbot on Goodlord. Best referencing check I've had to do.

So, after successfully building my first LLM-enhanced app the other day with puzzle generation on 8Words, I'm going to attempt to build a natural-language-driven input for Ukoni.

Current Flow

The current flow is a typical CRUD application flow. Let's walk through creating a shopping list as an example.

sequenceDiagram
    actor User
    participant UI as Web / Mobile UI
    participant API
    participant DB as Database
    User->>UI: Log in
    UI->>API: Authenticate user
    API-->>UI: Auth success
    User->>UI: Navigate to Shopping Lists
    UI->>API: Fetch shopping lists
    API->>DB: Get lists for user
    DB-->>API: Lists
    API-->>UI: Lists
    User->>UI: Create new shopping list
    UI->>API: Create list (name)
    API->>DB: Create shopping list
    DB-->>API: List created
    API-->>UI: List details
    loop For each item
        User->>UI: Add item
        UI->>API: Search product
        API->>DB: Search products
        DB-->>API: Search results
        API-->>UI: Results
        alt Product found
            User->>UI: Select product
            alt Variant known and found
                User->>UI: Select variant
            else Variant known but not found
                User->>UI: Create variant
                UI->>API: Create variant (name, properties)
                API->>DB: Create variant
                DB-->>API: Variant created
                API-->>UI: Variant
            else Variant not specified
                UI->>UI: Use generic/default variant
            end
        else Product not found
            User->>UI: Create product
            UI->>API: Create product
            API->>DB: Create product
            DB-->>API: Product created
            API-->>UI: Product
            alt Variant known
                User->>UI: Create variant
                UI->>API: Create variant
                API->>DB: Create variant
                DB-->>API: Variant created
                API-->>UI: Variant
            else Variant not specified
                UI->>UI: Use generic/default variant
            end
        end
        User->>UI: Add item properties (outlet, notes)
        UI->>API: Add item to list
        API->>DB: Create list item
        DB-->>API: Item added
        API-->>UI: Updated list
    end
    User->>UI: Shopping list complete

Somewhat complex, but also a fairly vanilla data-driven system.

Natural Language Flow

Ideally, we could remove most of the back and forth with the user and replace it with natural language, so that it's a much simpler flow.

sequenceDiagram
    actor User
    participant UI as Web / Mobile UI
    participant API
    participant Agent as NL Agent
    participant DB as Database
    User->>UI: Log in
    UI->>API: Authenticate user
    API-->>UI: Auth success
    User->>UI: Enter shopping list in natural language
    UI->>API: Submit NL prompt
    API->>Agent: Extract list name + items
    Agent-->>API: Structured intent (list, items, variants)
    API->>DB: Check if list exists (by name)
    alt List exists
        DB-->>API: Existing list id
    else List does not exist
        API->>DB: Create shopping list
        DB-->>API: New list id
    end
    loop For each extracted item
        API->>DB: Check product existence
        alt Product exists
            DB-->>API: Product (+ variants)
        else Product does not exist
            API->>DB: Create product
            DB-->>API: Product created
        end
        alt Variant specified
            API->>DB: Check / create variant
            DB-->>API: Variant
        else No variant specified
            API->>API: Use canonical / default variant
        end
        API->>API: Stage list item (not committed)
    end
    API-->>UI: Proposed list changes (preview)
    User->>UI: Review & edit items
    UI->>API: Confirm changes
    API->>DB: Commit staged items
    DB-->>API: Items committed
    API-->>UI: Updated shopping list

With this, while the underlying data model complexity is exactly the same, the rigour of adding all that structured data is mostly hidden from the user.
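
To make that concrete, here is roughly the kind of structured intent the agent would need to extract before any CRUD calls happen. This is a sketch: the field names are my own illustration, not Ukoni's actual schema.

// Illustrative sketch only: the field names are assumptions, not Ukoni's actual schema.
// A prompt like "Weekend shop: 2 litres of semi-skimmed milk, sourdough bread and
// dishwasher tablets from Tesco" might be extracted into something like this.
type ExtractedIntent = {
  listName: string;
  items: {
    product: string;
    variant?: string;   // omitted when the user didn't specify one
    quantity?: number;
    outlet?: string;
    notes?: string;
  }[];
};

const example: ExtractedIntent = {
  listName: "Weekend shop",
  items: [
    { product: "Milk", variant: "Semi-skimmed 2L", quantity: 1, outlet: "Tesco" },
    { product: "Bread", variant: "Sourdough" },
    { product: "Dishwasher tablets" }, // no variant: fall back to a generic/default variant
  ],
};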

Existing component diagram

I'm historically bad at diagrams in general, so forgive these, but hopefully they convey the idea. In the classic flow, the client hits the server with CRUD actions the user has set up through the user interface, and the server then performs those actions against a database.

flowchart LR
    U[User]
    WC[Web Client]
    API[API Server]
    DB[(Postgres DB)]
    U -->|CRUD actions| WC
    WC -->|CRUD actions| API
    API -->|CRUD actions| DB
    DB -->|Read results| API
    API -->|Responses| WC
    WC -->|UI updates| U

Proposed component diagram

In this proposed flow, the client hits a so-called "agent server" with a natural language payload in which the user has described the actions they want taken. The agent server communicates with an LLM over an API (or does local inference) to extract structure from the user's prompt, and then calls CRUD actions against the API server, which eventually persists the changes to the database.

flowchart LR
    U[User]
    WC[Web Client]
    AS[Agent Server]
    API[API Server]
    DB[(Postgres DB)]
    U -->|Natural language| WC
    WC -->|Enhanced prompt| AS
    AS -->|CRUD actions| API
    API -->|CRUD actions| DB
    DB -->|Read results| API
    API -->|Responses| AS
    AS -->|Proposed changes / summaries| WC
    WC -->|UI updates| U
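
As a rough sketch of that boundary, the agent server could be little more than a single endpoint that accepts the natural language payload, runs the agent loop (sketched further down) and returns the proposed changes for the client to preview. The route, payload shape and runAgentLoop helper here are all hypothetical.

// Hypothetical sketch of the agent server's HTTP boundary, using Express.
// The route, payload shape and runAgentLoop helper are illustrative, not Ukoni's real code.
import express from "express";
import { runAgentLoop } from "./agent"; // the agent loop, sketched later in this post

const app = express();
app.use(express.json());

app.post("/agent/shopping-lists", async (req, res) => {
  const { prompt, userToken } = req.body;

  // The loop calls the LLM with tool definitions and executes any tool calls
  // against the existing API server, then returns a set of proposed changes.
  const proposal = await runAgentLoop(prompt, userToken);

  // Nothing is committed yet; the web client shows this preview for confirmation.
  res.json({ proposal });
});

app.listen(3001);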

The Agent Server

"So called" because that's how I described it above. It's a harness for the agent in the same way that applications like Claude Code or OpenCode are: they orchestrate calls to the model, enhance prompts and manage context. Our harness will be far simpler than the illustrious examples I've cited, but I think the idea is similar.

I've learned a lot about this from reading GC Nwogu's Anatomy of an AI Agent (which inspired me to explore this) as well as Thorsten Ball's How to build an AI agent (which I read long ago and never acted on — until now). So a lot of the ideas for how the agent will work are inspired by them.

As Thorsten writes in that article, an agent is:

an LLM with access to tools, giving it the ability to modify something outside the context window.

In July 2025, I attended Georges Haidar's talk "Bridging the gap between your API and LLMs" at the Stripe London Developer Meetup. That was the moment the concept of tools clicked for me: a tool is just an API plus a description the LLM can use. So we'll essentially be converting our entire REST API surface into callable tools for the agent. The harness effectively becomes another client of the API, which keeps the mental model simple and doesn't require much change on the API side.
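
Concretely, a single endpoint could map to one tool: a name, a description the model can reason about, a JSON Schema for the arguments, and a function that just calls the API like any other client would. The endpoint, URL and schema below are made up for illustration.

// Hypothetical example of one REST endpoint exposed as a tool.
// The URL and schema are made up for illustration, not Ukoni's real API.
const createShoppingListTool = {
  name: "create_shopping_list",
  description: "Create a new shopping list for the authenticated user.",
  parameters: {
    type: "object",
    properties: {
      name: { type: "string", description: "Name of the shopping list" },
    },
    required: ["name"],
  },
  // When the model calls this tool, the harness is just another API client.
  execute: async (args: { name: string }, token: string) =>
    fetch("https://api.ukoni.app/shopping-lists", {
      method: "POST",
      headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
      body: JSON.stringify(args),
    }).then((r) => r.json()),
};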

We'll use the OpenAPI spec to programmatically generate these tools, so that we don’t have to worry about drift between our API changes and the model calls. In the future, this could evolve into an MCP server for the app, but for now we’ll do something simpler with function calling from within our agent server.
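
A first pass at that generation could just walk the spec's paths and turn each operation into a tool definition. The sketch below assumes each operation has an operationId, a summary and a JSON request body; a real version would also need to handle path and query parameters, auth and $ref resolution.

// Rough sketch of deriving tool definitions from a parsed OpenAPI document.
// Assumes operationId, summary and a JSON request body exist on each operation;
// path/query parameters, auth and $ref resolution are left out.
type ToolDefinition = {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
};

function toolsFromOpenApi(spec: any): ToolDefinition[] {
  const tools: ToolDefinition[] = [];
  for (const [path, operations] of Object.entries<any>(spec.paths ?? {})) {
    for (const [method, op] of Object.entries<any>(operations)) {
      tools.push({
        name: op.operationId ?? `${method}_${path.replace(/\W+/g, "_")}`,
        description: op.summary ?? `${method.toUpperCase()} ${path}`,
        parameters:
          op.requestBody?.content?.["application/json"]?.schema ??
          { type: "object", properties: {} },
      });
    }
  }
  return tools;
}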

Function calling works by specifying a set of functions, their descriptions and their arguments as tools, and passing those to the model along with a prompt. The model can then respond plainly or with a tool call that names one of your defined functions along with arguments. You parse this and run the function, then pass the result back to the model, and the process continues until the task is complete.

sequenceDiagram
    participant Agent as Agent Server
    participant Model as LLM
    participant Tool as Function / Tool
    Agent->>Model: Prompt + tool definitions
    alt Plain response
        Model-->>Agent: Natural language response
    else Tool call
        Model-->>Agent: Tool call (function + arguments)
        Agent->>Tool: Execute function
        Tool-->>Agent: Function result
        Agent->>Model: Result as new context
        loop Until task complete
            Model-->>Agent: Next response or tool call
        end
    end
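
A minimal version of that loop, assuming the OpenAI Node SDK, the toolsFromOpenApi sketch above and a hypothetical executeTool helper that dispatches each call to the REST API, might look something like this. It's a sketch, not the finished harness: error handling, streaming and context management are all omitted.

// Minimal sketch of the loop above, using the OpenAI Node SDK.
// toolsFromOpenApi is sketched earlier; executeTool and the parsed spec are assumed
// to exist elsewhere in the agent server.
import OpenAI from "openai";
import { toolsFromOpenApi } from "./tools"; // the OpenAPI-derived tools, sketched above

declare const spec: unknown; // the parsed OpenAPI document, loaded at startup
declare function executeTool(name: string, args: unknown, token: string): Promise<unknown>;

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function runAgentLoop(prompt: string, token: string) {
  const messages: OpenAI.ChatCompletionMessageParam[] = [{ role: "user", content: prompt }];
  const tools = toolsFromOpenApi(spec).map((t) => ({
    type: "function" as const,
    function: { name: t.name, description: t.description, parameters: t.parameters },
  }));

  // Cap the number of turns so a confused model can't loop forever.
  for (let turn = 0; turn < 10; turn++) {
    const response = await client.chat.completions.create({
      model: "gpt-4o-mini",
      messages,
      tools,
    });

    const message = response.choices[0].message;
    messages.push(message);

    // Plain response: the model is done, so return its answer.
    if (!message.tool_calls?.length) return message.content;

    // Tool calls: execute each one against the API and feed the result back as context.
    for (const call of message.tool_calls) {
      const result = await executeTool(
        call.function.name,
        JSON.parse(call.function.arguments),
        token,
      );
      messages.push({ role: "tool", tool_call_id: call.id, content: JSON.stringify(result) });
    }
  }

  throw new Error("Agent did not finish within the turn limit");
}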

I intend to support multiple providers with a “bring your own API key” policy. The good news is that the Google Gemini API and the OpenAI API (the two I'm starting with) both support function calling with fairly similar structures.
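
The wrapping differs slightly per provider, but the core of each tool (name, description, JSON-Schema-like parameters) stays the same, so the harness can keep one internal tool shape and adapt it at the edge. The fragments below follow the providers' documented request formats as I understand them; treat them as approximate.

// The same tool definition wrapped for each provider's function calling format.
// Shapes follow the providers' documented request formats; treat as approximate.
const tool = {
  name: "create_shopping_list",
  description: "Create a new shopping list for the authenticated user.",
  parameters: {
    type: "object",
    properties: { name: { type: "string" } },
    required: ["name"],
  },
};

// OpenAI Chat Completions: tools is a list of { type, function } entries
const openAiTools = [{ type: "function", function: tool }];

// Google Gemini: tools is a list of { functionDeclarations } entries
const geminiTools = [{ functionDeclarations: [tool] }];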

Conclusion

I've got tired of writing, so I'm wrapping up here. Good plan. Now for the execution.

This is only the second model-driven application I'm making. Pretty excited to see how it turns out.