Swift Academy Podcast, Episode 10, Season 2

Beyond the Prompt: Foundation Models as an Architectural Layer in iOS

Mohammad Azam on why Apple’s on-device LLM is less an AI feature and more a redefinition of where intelligence lives in your app: @Generable, @Guide, instructions, tools, and adapter-based fine-tuning, viewed through the lens of iOS layered architecture.

🎧 Swift Academy Podcast · Episode 10, Season 2 · Foundation Models • On-Device LLM • iOS Architecture

“Foundation Models is not a chat surface. It is a typed, on-device, structured capability you wire into your architecture, the way you wire in URLSession or SwiftData. Treat it like a chatbot and you will build bad apps. Treat it like a layer and you will build defensible ones.”

Editorial framing of the conversation with Mohammad Azam

About This Episode

Mohammad Azam has been writing iOS code since 2010, back in the Objective-C days. Since then he has led mobile teams, shipped multiple App Store apps, taught iOS bootcamps, and, for the last several years, split his time between Swift and a Python/AI bootcamp. That dual perspective matters. When Apple introduced the Foundation Models framework, Azam was not reacting as someone discovering LLMs; he was reacting as someone who had already built CreateML pipelines, trained models in scikit-learn, and exported them through CoreML tools.

The headline of this episode is not that Apple shipped an LLM. It is that Apple shipped an architectural primitive. Foundation Models is not a chat product; it is a typed, on-device, deterministic-ish capability that you wire into your app the way you wire in URLSession or SwiftData. This conversation reframes the framework from that altitude: where on-device intelligence belongs in a layered iOS architecture, why instructions outweigh prompts, how @Generable turns probabilistic text into typed Swift, and what adapter-based fine-tuning means for long-lived products.

📖 The themes of this episode are covered in depth in Chapter 6 of AI-Driven Swift Architecture, which is dedicated to designing production-grade integrations of Apple’s Foundation Models inside a Clean Architecture iOS codebase.

An Architectural Deep-Dive into Apple’s Foundation Models Framework

AI Is Not a Feature, It Is a Layer

The most common failure mode in the current AI moment is the “Ask AI” button. Azam describes it bluntly: tools that used to work with a single click now demand a prompt. The friction is up, the affordance is down, and the user experience is worse. Forcing an LLM into a workflow that did not need one is a product mistake disguised as a technology investment.

Foundation Models pushes back on that framing precisely because it is so cheap to invoke: no API key, no per-token cost, no network call. When the marginal cost approaches zero, the right question stops being “can we add AI?” and becomes “where does intelligence belong in our architecture?” That shift, from feature to layer, is the real paradigm change. The model becomes a service the domain layer can call when its rules genuinely benefit from inference, not a UI surface bolted onto the side of the app.

Understanding Apple’s Foundation Models from a Developer Perspective

For an iOS engineer with no LLM background, the framework is best described as a heavily structured subset of ChatGPT, running entirely on the Neural Engine, addressable through Swift. There is no inference server, no token billing, no conversation history leaving the device. The model is small, a few billion parameters at most, and ships with the OS.

That single sentence contains the entire trade-off matrix:

  • Privacy and offline come for free. Health data, journal entries, and personal contracts never leave the phone.
  • Latency is dominated by the Neural Engine, not by network round-trips, which makes streaming UI viable for most prompts.
  • The capability ceiling is real. The model has no knowledge of last week’s events, will not answer broad world-knowledge questions reliably, and is not competitive with frontier cloud LLMs for open-ended reasoning.
  • Hardware gating is a constraint you must design around. Pre–iPhone 15 Pro devices will not run it at all, so any feature that depends on it needs an explicit non-AI fallback path.
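
That last point is worth making concrete. Below is a minimal sketch of availability gating, assuming the `SystemLanguageModel.default.availability` check and `LanguageModelSession` API from Apple’s documentation; the tip service and its rule-based fallback are hypothetical stand-ins, not the framework’s requirements.

```swift
import FoundationModels

// Hypothetical service that prefers the on-device model but always keeps a non-AI path.
struct DailyTipService {
    func dailyTip(for activity: String) async throws -> String {
        // Unsupported hardware, Apple Intelligence turned off, or a model that is
        // still downloading all surface here as unavailable.
        guard case .available = SystemLanguageModel.default.availability else {
            // Deterministic, rule-based fallback so the feature works on every device.
            return "Take ten minutes to review what you finished today."
        }

        let session = LanguageModelSession(
            instructions: "You suggest one short, actionable tip per day."
        )
        let response = try await session.respond(
            to: "Suggest a tip for someone who \(activity)."
        )
        return response.content
    }
}
```

The important part is the shape, not the specific strings: the fallback path is designed in from the start, not retrofitted after the feature ships.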

Azam’s expectation, which mirrors where the platform is clearly heading, is that the on-device model will eventually be paired with an Apple-hosted private inference tier, a hybrid where the device tries first and escalates only when needed. Architect today for that future: do not let “Foundation Models is here” become “Foundation Models is everywhere in our codebase.”

Tools, Instructions, and Determinism: The Real Surface Area of the Framework

The part of the framework that deserves the most architectural attention is the one developers tend to skip past: @Generable, @Guide, and instructions.

Structured generation is the contract. A @Generable struct is not a convenience; it is the boundary between non-deterministic text generation and deterministic Swift code. The model is constrained to populate the schema; your app receives a typed value. @Guide tightens that contract per-property: a numeric range, a description, a constraint. Done well, the model becomes a function with a typed return value rather than a string firehose you have to parse.
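
Here is a minimal sketch of what that contract can look like, assuming the @Generable and @Guide macro shapes from Apple’s documentation; the recipe schema, property names, and guide values are illustrative, and the exact guide syntax may differ across SDK releases.

```swift
import FoundationModels

// The schema is the contract: the model is constrained to fill these fields,
// and the app receives a typed Recipe rather than free-form text to parse.
@Generable
struct Recipe {
    @Guide(description: "A short, appetizing name for the dish")
    var name: String

    @Guide(description: "Ingredients drawn only from what the user supplied")
    var ingredients: [String]

    @Guide(description: "Preparation time in minutes", .range(10...90))
    var prepMinutes: Int
}

// With the schema in place, the model reads like a function with a typed return value.
func suggestRecipe(from ingredients: [String]) async throws -> Recipe {
    let session = LanguageModelSession(
        instructions: "You are a recipe assistant. Use only the ingredients provided."
    )
    let response = try await session.respond(
        to: "Suggest one recipe using: \(ingredients.joined(separator: ", "))",
        generating: Recipe.self
    )
    return response.content
}
```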

Instructions are system design, not prompts. Azam is emphatic on this point: instructions matter more than the prompt itself. The prompt is the request; the instructions define the agent: its role, its tone, its decision boundaries, when it should reach for tools, when it should refuse. In a layered architecture, instructions belong with the service that owns the capability, versioned alongside the schema. Treat them as configuration of a domain service, not as ad-hoc strings near the call site.
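
A sketch of what “instructions as configuration of a domain service” might look like; the naming, the version constant, and the gardening domain are assumptions, not framework requirements.

```swift
import FoundationModels

// Instructions live with the service that owns the capability and are versioned
// like any other configuration, not scattered as ad-hoc strings near call sites.
enum GardenAdvisorInstructions {
    static let version = "2025-06-01"
    static let text = """
        You are a gardening advisor. Recommend only plants suited to the user's \
        hardiness zone and the current season. If asked about anything other than \
        gardening, decline briefly. Prefer calling tools over guessing data.
        """
}

struct GardenAdvisorService {
    private let session = LanguageModelSession(instructions: GardenAdvisorInstructions.text)

    func recommendation(zone: String, season: String) async throws -> String {
        try await session.respond(to: "Hardiness zone \(zone), season: \(season).").content
    }
}
```

The version constant is there for the team, not the framework: when the instruction text changes, the regression suite for the generables it governs should run against the new version.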

Tools are controlled escape hatches. A tool is a Swift type the model can invoke: a weather lookup, a database query, an HTTP client. Conceptually it is the same idea as MCP, scoped to a single app: a typed interface that lets the model reach beyond its training data without losing the structured-output guarantee. The interesting design decision is not whether to expose a tool; it is how narrowly to scope it. Azam’s recipe-app example is instructive: a tool that fires only when rice is among the ingredients, deferring to a curated rice-dish API rather than the model’s general knowledge. That kind of fine-grained gating is where instruction quality and tool design meet.
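
A sketch of that rice-dish escape hatch, assuming the Tool protocol shape (a name, a description, a @Generable Arguments type, and a call(arguments:) method) shown in Apple’s introduction of the framework; the curated source here is a hypothetical stand-in for your own API or database.

```swift
import FoundationModels

// Hypothetical curated source; a stand-in for your own API or database.
enum CuratedRiceDishes {
    static func dishes(pairingWith ingredients: [String]) async throws -> [String] {
        ["Mushroom risotto", "Vegetable fried rice"] // placeholder data
    }
}

// A narrowly scoped tool: the model may call it only for rice dishes, deferring to
// curated data instead of reinventing them from general knowledge.
struct RiceDishLookupTool: Tool {
    let name = "riceDishLookup"
    let description = "Looks up curated rice dishes when rice is among the ingredients."

    @Generable
    struct Arguments {
        @Guide(description: "The non-rice ingredients to pair with rice")
        var otherIngredients: [String]
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        let dishes = try await CuratedRiceDishes.dishes(pairingWith: arguments.otherIngredients)
        return ToolOutput(dishes.joined(separator: "\n"))
    }
}

// The tool is registered on the session, and the instructions gate when it fires.
let session = LanguageModelSession(
    tools: [RiceDishLookupTool()],
    instructions: "When rice is among the ingredients, call riceDishLookup instead of inventing dishes."
)
```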

Integrating Foundation Models into iOS Layered Architecture

The framework is well-behaved precisely because it does not push itself onto your architecture. It slots into the data and service layer like any other capability. Generables are DTOs. The model is a service. Your domain types remain the source of truth.

The boundary that matters is persistence. A @Generable struct cannot wear @Model; it is not a SwiftData entity, and conflating the two is a category error. If a generated recipe needs to survive past the session, your service layer maps the generable into a SwiftData model and persists that. If the output is ephemeral (a one-shot summary, a transient suggestion), a view can consume the generable directly. The rule is the same as with any DTO/entity split: do not let an external schema bleed into your storage layer.
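
A sketch of that split, with hypothetical types on both sides: the @Generable struct is what the model fills in, the @Model class is what SwiftData owns, and the service layer does the mapping explicitly.

```swift
import FoundationModels
import SwiftData

// What the model produces: a DTO at the service boundary.
@Generable
struct GeneratedRecipe {
    @Guide(description: "Name of the dish")
    var name: String

    @Guide(description: "Ordered preparation steps")
    var steps: [String]
}

// What the app persists: a SwiftData entity owned by the storage layer.
@Model
final class SavedRecipe {
    var name: String
    var steps: [String]
    var savedAt: Date

    init(name: String, steps: [String], savedAt: Date = .now) {
        self.name = name
        self.steps = steps
        self.savedAt = savedAt
    }
}

// The mapping happens in the service layer, only when the output must outlive the session.
func save(_ generated: GeneratedRecipe, in context: ModelContext) throws {
    context.insert(SavedRecipe(name: generated.name, steps: generated.steps))
    try context.save()
}
```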

This is also why the framework composes cleanly with existing apps. Adding Foundation Models to a Clean-ish architecture does not force a rewrite. It adds a new service implementation. The view models do not change shape; the domain does not learn about Apple Intelligence; the persistence layer keeps owning its types. If introducing the framework is invasive in your codebase, the framework is not the problem; your layering is.
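
One way to keep that composition honest is a protocol the domain already owns, with the Foundation Models session as just one implementation behind it; the protocol, the gardening use case, and the type names below are illustrative.

```swift
import FoundationModels

// The domain defines the capability it needs; it never imports FoundationModels.
protocol PlantRecommending {
    func recommendPlants(zone: String, month: Int) async throws -> [String]
}

// One implementation happens to be backed by the on-device model.
struct FoundationModelPlantRecommender: PlantRecommending {
    @Generable
    struct Recommendation {
        @Guide(description: "Plant names suited to the zone and month", .count(5))
        var plants: [String]
    }

    func recommendPlants(zone: String, month: Int) async throws -> [String] {
        let session = LanguageModelSession(
            instructions: "Recommend plants for a home garden, specific to zone and month."
        )
        let response = try await session.respond(
            to: "Hardiness zone \(zone), month \(month).",
            generating: Recommendation.self
        )
        return response.content.plants
    }
}

// A rule-based implementation stands in on unsupported hardware and in unit tests.
struct StaticPlantRecommender: PlantRecommending {
    func recommendPlants(zone: String, month: Int) async throws -> [String] {
        ["Tomatoes", "Basil", "Peppers"] // placeholder lookup table
    }
}
```

The view model depends on PlantRecommending; which implementation it receives is a composition-root decision, exactly as it would be for a networking or persistence service.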

Models That Evolve: Adapter-Based Fine-Tuning

The piece that gets less airtime than @Generable but matters more for long-lived products is adapters. Apple exposes an adapter-based fine-tuning path: you take the shipped foundation model and train a small delta on your own data (rental agreements, medical claim formats, your particular domain vocabulary). The base model stays Apple’s; the specialization is yours.

The architectural implication is significant. Adapters give you a knob between “general-purpose model” and “model that understands my product,” without operating an inference cluster and without your data leaving the device. They also introduce a new kind of artifact in your release pipeline: an adapter is versioned, tested, and shipped alongside your binary. Teams that take Foundation Models seriously will end up with a model-evaluation workflow next to their unit tests: regression suites for generables, golden outputs for instruction changes, drift detection between adapter versions. That is a maturity step many iOS teams have not taken yet.
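
That workflow does not need special tooling to start. Here is a hedged sketch of a schema-contract regression test: it asserts against the contract (non-empty fields, values inside the @Guide range) rather than exact wording, since generation is not deterministic. The schema, prompt, and assertions are illustrative.

```swift
import XCTest
import FoundationModels

final class RecipeGenerationTests: XCTestCase {
    @Generable
    struct Recipe {
        @Guide(description: "Name of the dish")
        var name: String

        @Guide(description: "Preparation time in minutes", .range(10...90))
        var prepMinutes: Int
    }

    func testGeneratedRecipeRespectsSchemaContract() async throws {
        // Skip cleanly on hardware or CI configurations where the model cannot run.
        guard case .available = SystemLanguageModel.default.availability else {
            throw XCTSkip("On-device model not available in this environment.")
        }

        let session = LanguageModelSession(instructions: "You are a recipe assistant.")
        let response = try await session.respond(
            to: "Suggest one quick pasta dish.",
            generating: Recipe.self
        )

        // Assert on the contract, not the prose: the fields exist and honor their guides.
        XCTAssertFalse(response.content.name.isEmpty)
        XCTAssertTrue((10...90).contains(response.content.prepMinutes))
    }
}
```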

Real-World Patterns from the Episode

Strip the demos down to their architectural shape and a small set of reusable patterns emerges:

  • Constrained recommendation. The Yummy recipe demo and Azam’s gardening app are the same pattern: the user supplies a constrained input set, the model returns a @Generable list scoped by @Guide ranges, and the view streams it (see the streaming sketch after this list). Use this whenever the answer space is open-ended but the shape of the answer is fixed.
  • On-device summarization over private data. Health metrics, journal entries, transaction history. Cloud LLMs are non-starters here for compliance reasons; on-device LLMs are not. The pattern: summarize locally, persist the summary, surface in widgets or live activities.
  • Structured matching against a user document. The resume vs. job description example. Two unstructured inputs, one structured diff out. This generalizes to contract review, code review, spec compliance, anywhere the value is in structured comparison rather than free-form prose.
  • Tool-routed inference. The rice-recipe escape hatch. The model handles the general case; a tool handles the specific case where you have authoritative data. Use this whenever your product has curated content the model should not try to reinvent.
  • Glanceable AI on wearables. Live Activities and widgets are a natural fit because Foundation Models’ latency is low and its outputs are short. A run summary, a hydration nudge, a weather-aware recommendation, all viable without ever lighting up the network.
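
The streaming half of the first pattern deserves a sketch of its own. This assumes the streamResponse(to:generating:) API and the PartiallyGenerated snapshot type described in Apple’s introduction of the framework; the view model and recipe schema are illustrative.

```swift
import FoundationModels
import Observation

// Streaming a @Generable into SwiftUI: the view observes a partially generated
// snapshot and fills in fields as they arrive instead of waiting for the full response.
@Observable
final class RecipeSuggestionViewModel {
    @Generable
    struct Recipe {
        @Guide(description: "Name of the dish")
        var name: String

        @Guide(description: "Ordered preparation steps")
        var steps: [String]
    }

    // PartiallyGenerated mirrors Recipe with optional fields while generation is in flight.
    private(set) var partial: Recipe.PartiallyGenerated?

    func suggest(from ingredients: [String]) async {
        let session = LanguageModelSession(
            instructions: "Suggest a single recipe using only the listed ingredients."
        )
        do {
            let stream = session.streamResponse(
                to: "Ingredients: \(ingredients.joined(separator: ", "))",
                generating: Recipe.self
            )
            for try await snapshot in stream {
                partial = snapshot
            }
        } catch {
            partial = nil
        }
    }
}
```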

The common thread: every good use case has a clear input shape, a clear output schema, and a fallback for devices and contexts where the model cannot run.

Vibe Coding, Domain Knowledge, and the Limits of LLM-Generated Architecture

The conversation closes on a point that is easy to dismiss as a generic AI critique but is in fact an architectural observation. Azam has shipped several “vibe-coded” small apps and is candid that he could not point at any line in them and explain it. For single-purpose utilities and throwaway tools, that is acceptable. For anything that lives inside a meaningful domain (finance, healthcare, anything with regulatory weight), it is not.

The reason is not that LLMs write bad functions. They write fine functions. The reason is that architecture is not a function-level concern. Where a calculation belongs, how a domain rule should be expressed, what should and should not be coupled, these are decisions that require understanding the business, and understanding the business is precisely what gets skipped when you outsource the code. Use LLMs as an accelerator for the parts you understand; do not let them become the reason you stopped understanding.

About the Guest

Mohammad Azam

Senior iOS Engineer • Educator • Author

Mohammad Azam has been working in the Apple ecosystem since 2010, starting with Objective-C and moving through every major shift in the platform since. He has led mobile teams at large organizations, taught iOS, web, and AI/ML bootcamps, and authored extensive course material on Udemy and his own platform covering SwiftUI, UIKit, iOS architecture, and most recently Foundation Models. He is the developer behind My Veggie Garden, a gardening app on the App Store that uses Apple’s on-device LLM to recommend plantings based on location, weather, and USDA hardiness zone, a concrete example of the architectural patterns discussed in this episode. His perspective on Foundation Models is informed by years of pre-LLM machine-learning work with CreateML, CoreML, scikit-learn, and Pandas, which gives his commentary on the framework an unusually grounded quality.

Go Deeper: Chapter 6 of AI-Driven Swift Architecture

If the framing of this episode resonates (treating Foundation Models as a typed capability inside a layered architecture rather than as a UI feature), the natural next step is Chapter 6 of AI-Driven Swift Architecture, which is dedicated entirely to Apple’s on-device LLM stack.

The chapter takes the patterns discussed here and turns them into production-grade scaffolding: how to design @Generable schemas as service contracts, how to version and test instructions alongside the rest of your domain code, how to scope tool boundaries narrowly, where Foundation Models sits in a Clean Architecture module graph, and how to plan adapter-based fine-tuning as a first-class artifact in your release pipeline. It is the architectural reference companion to the conversation you just watched.

The book itself (530 pages, published by Packt, foreword by Jon Reid) covers the broader picture: Swift 6 concurrency, Clean Architecture for iOS, AI-assisted development with Claude Code, MCP integration, and the engineering practices that make AI-augmented Swift codebases maintainable rather than brittle.

Key Takeaways

  • Foundation Models is an architectural layer, not a UI feature. Place it in your service tier, not your views.
  • @Generable is the contract between probabilistic text and typed Swift. Design schemas as carefully as you would design API responses.
  • Instructions outweigh prompts. Version them, test them, treat them as configuration of a service.
  • Tools are how you keep the model from making things up about your domain. Scope them narrowly.
  • Generables are DTOs, not entities. Map to SwiftData explicitly when you need persistence.
  • Plan for hardware gating from day one. A feature that requires Foundation Models needs a non-AI alternate path.
  • Adapters are how the model becomes yours. Treat them as shippable artifacts with their own evaluation pipeline.
  • Do not add AI because it is available. Add it where the architecture genuinely benefits.

The Real Shift

The shift Apple is making with Foundation Models is quieter than the cloud-LLM narrative but more consequential for the people who build apps. Intelligence becomes a typed, local, composable capability, something you architect with, not something you bolt on. The teams that will get the most out of it are not the ones racing to ship a chat surface; they are the ones rethinking which parts of their domain logic could be expressed as a constrained generation, which workflows could be summarized on-device, where a tool boundary belongs.

That is the right altitude for this conversation. Not “how do I prompt it?” but “where does it live in my system?” Foundation Models rewards architectural thinking and punishes experimentation-as-strategy. Treat it accordingly.

Listen & Subscribe

If this episode changed how you think about on-device AI and the place of Foundation Models in iOS architecture, share it with a colleague building in the Apple ecosystem, particularly anyone exploring SwiftUI, layered architecture, or the practical side of Apple Intelligence integration.

Subscribe to the Swift Academy podcast for weekly deep-dives into iOS development, Swift engineering, and the craft of building software on Apple platforms.