Elliot One

465 posts

@elliot1one

AI Engineer • Teaching 37K+ engineers to build production-grade AI systems • Author of The Modern Engineer • Microsoft MVP

London, UK · Joined August 2023
302 Following · 164 Followers
Elliot One@elliot1one·
Don't call hallucinations a model problem when the real failure is system design.

Too many teams still talk about hallucinations as if they are random model behavior. In production, they are usually a sign that the system was allowed to answer without enough evidence, without clear constraints, or without a reliable way to verify its claims.

That is why grounding matters.

Grounding is not just adding RAG and hoping for better answers. It means designing a system that can retrieve the right context, filter weak evidence, constrain generation to what is actually supported, cite sources for important claims, and abstain when the evidence is incomplete or conflicting.

That last part is still underused. One of the most important behaviors in any high-stakes AI system is the ability to say: "I don't know based on the available evidence."

Strong LLM systems do not try to sound smart at all costs. They prioritize supported answers over fluent ones.

AI engineering today is less about prompt tricks and more about reliability architecture:
- retrieval quality
- evidence ranking
- groundedness checks
- citation discipline
- fallback logic
- human review for uncertain cases
- monitoring in production

RAG helps, but retrieval alone does not solve hallucinations. A model can still pick the wrong chunk, overstate certainty, or invent support that is not there.

The bar is higher now.

P.S. If your application needs trust, grounding has to be part of the system design, evaluation strategy, and production operations from day one.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
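The "filter weak evidence, cite sources, abstain" behavior from the post above can be sketched in a few lines. This is a minimal Python illustration, not code from the post: the `Evidence` type, the `grounded_answer` function, the score scale, and both thresholds are assumptions made for the example, and the "generation" step is a stand-in that simply quotes the supported passages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str   # where the passage came from, kept for citation
    text: str
    score: float  # retrieval relevance in [0, 1]; this scale is an assumption


ABSTAIN = "I don't know based on the available evidence."


def grounded_answer(evidence, min_score=0.75, min_sources=2):
    """Return (answer, citations), or (ABSTAIN, []) when support is too thin."""
    # Filter weak evidence before anything reaches the generator.
    strong = [e for e in evidence if e.score >= min_score]
    # Abstain unless enough distinct sources back the answer.
    sources = sorted({e.source for e in strong})
    if len(sources) < min_sources:
        return ABSTAIN, []
    # Stand-in for constrained generation: use only the supported passages.
    answer = " ".join(e.text for e in strong)
    return answer, sources
```

The point of the design is that abstention is decided by a deterministic rule outside the model, so it can be tested and tuned like any other threshold.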
Elliot One@elliot1one·
Most AI discussions still overfocus on generation. But in production systems, retrieval is often the higher ROI pattern.

The real challenge is not generating answers. It is retrieving the right evidence consistently, with measurable behavior and inspectable scoring.

That is why I wrote Issue 17 of The Modern Engineer: "Deterministic Semantic Retrieval with Embeddings and Vector Search"

In this issue, I build a local-first semantic retrieval pipeline using .NET, Ollama embeddings, deterministic ranking, clustering, and optional Postgres pgvector support.

The interesting part is not the customer feedback example. It is the architecture around semantic retrieval.

The system is designed around a simple operational rule:
Embeddings provide semantic capability. Deterministic systems control behavior.

A few areas explored in the article:
→ Embedding domain text into semantic vector space
→ Deterministic cosine similarity ranking
→ Thresholded retrieval to block weak semantic matches
→ Stable tie-breaking for inspectable retrieval behavior
→ Metadata-aware semantic search
→ Incremental centroid clustering for signal extraction
→ Recall@K evaluation for retrieval quality measurement
→ Optional pgvector integration for scalable persistence

One of the most important ideas: Embeddings are not answers. They are coordinates in semantic space. The production value comes from the deterministic layers around them: ranking, filtering, thresholds, evaluation, and operational control.

The workflow intentionally separates responsibilities:
• Embeddings generate semantic representations
• Deterministic ranking controls retrieval behavior
• Thresholds calibrate confidence
• Metadata keeps results operationally useful
• Clustering extracts recurring themes from noisy datasets

This architecture works because the model is not the control plane. The deterministic system is.

The result is not "AI magic". It is measurable semantic retrieval operating inside explicit constraints.

🔗 Read the full article here: lnkd.in/eDz9hPZs

-- 📌 Subscribe to The Modern Engineer for weekly insights on production-grade AI engineering, software architecture, and real-world systems. No fluff. Just practical engineering. 👉 elliotone.com
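Three of the ideas listed above (thresholded cosine ranking, stable tie-breaking, and Recall@K) fit in a short sketch. The article itself uses .NET; the Python below is only an illustration under assumptions: the function names, the `0.5` threshold, rounding to six decimals, and breaking ties by document id are choices made for this example, not details taken from the issue.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, docs, threshold=0.5, k=3):
    """docs maps doc_id -> vector. Returns up to k (doc_id, score) pairs."""
    # Round scores so tiny floating-point noise cannot reorder equal matches.
    scored = [(doc_id, round(cosine(query_vec, vec), 6)) for doc_id, vec in docs.items()]
    # Thresholded retrieval: weak semantic matches never reach the caller.
    kept = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    # Stable tie-breaking: highest score first, then doc_id ascending.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:k]


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)
```

Because the ordering depends only on the scores and the ids, the same query against the same index always returns the same list, which is what makes the retrieval behavior inspectable.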
Elliot One@elliot1one·
Most APIs should not expose domain entities directly. The moment internal models leak into public contracts, versioning, security, and maintainability become harder to control.

That is why object mapping matters in modern ASP.NET Core systems.

One library that has gained strong adoption in the .NET ecosystem is Mapster. Mapster helps translate domain models into lightweight API contracts while reducing repetitive mapping code and keeping runtime overhead low.

A common example is mapping an Order aggregate into an OrderDto:
• Hiding internal identifiers
• Flattening nested objects
• Simplifying date handling
• Reshaping responses for API consumers

What makes Mapster particularly interesting is its balance between performance and explicitness. Instead of relying heavily on runtime reflection, mappings can be centrally configured using TypeAdapterConfig, scanned during startup, and executed through IMapper or the Adapt() extension method.

This creates a clean separation between:
• Domain models
• Application logic
• Public contracts

In larger systems, that separation becomes operationally important.

A few reasons teams adopt Mapster:
• Less repetitive boilerplate
• Consistent transformations across endpoints
• Lower allocation and runtime overhead
• Clear mapping configuration instead of scattered conversion logic

It also works naturally with collections, Minimal APIs, and layered architectures commonly used in modern ASP.NET Core applications.

That said, mapping libraries are still abstractions. Once transformations become deeply conditional or business-driven, explicit code is often easier to reason about than complex mapping profiles.

Mapster works best when used for simple to moderately complex object transformations where consistency, maintainability, and performance all matter.

P.S. Good architecture is not about eliminating code. It is about placing complexity in the right place.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
Elliot One@elliot1one·
AI agents are becoming easier to build. Reliable AI systems are not.

There is a major difference between an agent that can generate responses and a system that can operate safely inside production environments.

Most agent demos optimize for capability:
→ More tools
→ More autonomy
→ More reasoning
→ More workflow complexity

Production systems optimize for something else entirely: Predictability.

That is why I wrote Issue 16 of The Modern Engineer: "Production-Grade AI Agents with Microsoft Agent Framework and Deterministic Guardrails"

In this issue, I build a local-first incident triage system using Microsoft Agent Framework, Ollama, and deterministic enforcement layers.

The interesting part is not the incident workflow. It is the architecture around the model.

The system is designed around a simple operational rule:
LLMs generate context. Deterministic systems enforce outcomes.

A few areas explored in the article:
→ Structured agent workflows with explicit boundaries
→ Sequential orchestration using AgentWorkflowBuilder
→ Strongly typed contracts instead of free-form output
→ Defensive parsing and fail-fast behavior
→ Policy enforcement outside the agent
→ Reviewer agents for operational validation
→ Explicit fallback paths when structured execution fails
→ Local-first execution for reproducibility and operational visibility

The workflow intentionally separates responsibilities:
• Agents generate proposed triage reports
• Reviewer agents validate operational quality
• Deterministic policy layers normalize severity and domain classification
• Runtime enforcement controls final system behavior

One of the most important ideas: Production agents should not own authority. They should operate inside bounded systems where validation, enforcement, escalation, and operational policy remain deterministic.

Microsoft Agent Framework provides a strong orchestration layer for this approach through components like ChatClientAgent, AgentRunResponse, AgentWorkflowBuilder, and provider-agnostic IChatClient integrations.

The result is not autonomous AI. It is controlled software architecture with probabilistic components operating inside explicit constraints.

🔗 Read the full article here: elliotone.com/newsletter/202…

-- 📌 Subscribe to The Modern Engineer for weekly insights on production-grade AI engineering, software architecture, and real-world systems. No fluff. Just practical engineering. 👉 elliotone.com
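"Defensive parsing", "policy enforcement outside the agent", and "explicit fallback paths" can be illustrated without any agent framework. The sketch below is Python, not the Microsoft Agent Framework code from the article. The report fields, the allowed severities, the fallback values, and the "data loss" escalation rule are all hypothetical and exist only to show the shape of the idea.

```python
import json

ALLOWED_SEVERITIES = ("low", "medium", "high", "critical")
# Hypothetical safe default: treat unparseable reports as high and send to a human.
FALLBACK = {"severity": "high", "summary": "", "needs_human_review": True}


def enforce(raw_model_output):
    """Parse an agent's proposed triage report and apply policy outside the agent."""
    # Defensive parsing: anything that is not valid JSON takes the fallback path.
    try:
        report = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(report, dict) or not isinstance(report.get("summary"), str):
        return dict(FALLBACK)

    # Normalize severity deterministically; unknown values are never trusted.
    severity = str(report.get("severity", "")).strip().lower()
    if severity not in ALLOWED_SEVERITIES:
        return dict(FALLBACK, summary=report["summary"])

    # Hypothetical policy rule: a report mentioning data loss is at least "high".
    if "data loss" in report["summary"].lower() and severity in ("low", "medium"):
        severity = "high"

    return {"severity": severity, "summary": report["summary"], "needs_human_review": False}
```

The agent only proposes; the final severity always comes out of `enforce`, so the same input produces the same outcome regardless of how the model phrased it.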
Elliot One@elliot1one·
EF Core 10 is so optimized now that most developers probably do not need compiled queries anymore.

That says a lot about how far Entity Framework Core has matured. Years ago, query performance concerns often pushed teams toward raw SQL, Dapper, or aggressive manual optimizations.

Today, EF Core already handles a huge amount internally:
• Query shape caching
• SQL reuse
• Reduced allocations
• Faster translation pipelines
• Better runtime efficiency

For most applications, standard LINQ queries are already fast enough.

So where do compiled queries still fit?

Compiled queries using EF.CompileQuery and EF.CompileAsyncQuery allow EF Core to precompile a query shape and reuse it across DbContext instances.

The benefit is not faster SQL execution. It is faster EF Core execution.

Specifically, compiled queries reduce overhead from:
• Expression tree analysis
• LINQ translation
• Query compilation
• Internal query planning

In EF Core 10, the gains are usually much smaller than they used to be. Typical benchmarks now show roughly:
• Standard LINQ query → baseline
• Compiled query → around 8–12% faster in hot paths

That optimization only matters when the same query executes constantly.

Good use cases:
• High throughput APIs
• Background workers
• Polling systems
• Extremely hot read paths

Poor use cases:
• Dynamic queries
• Rarely executed queries
• General CRUD code
• Premature optimization

Compiled queries are now a micro optimization, not a default recommendation. And honestly, that is a good thing. It means EF Core's defaults have become extremely capable.

Most systems will gain far more from:
• Better indexing
• Reducing N+1 queries
• Proper projections
• AsNoTracking() where appropriate
• Efficient schema design
• Smarter caching strategies

P.S. Compiled queries are the final few percentage points after everything else is already healthy. Optimize because profiling proved it matters, not because the feature exists.
Saeed Anwar@saen_dev·
@elliot1one The system discipline framing is the right one. The hard problems in LLM systems are mostly software engineering problems: versioning, observability, failure modes, data contracts. The model is often the smallest variable.
Elliot One@elliot1one·
No one shares a practical roadmap to learn AI Engineering.

Most content is scattered. Too theoretical. Or too focused on tools.

AI Engineering in 2026 is a system discipline. It combines software engineering, machine learning, and LLM systems into one craft. Not models in isolation. End-to-end, production-ready systems.

Here is a clear way to think about it.

AI systems span the full lifecycle:
ㆍData ingestion
ㆍPreprocessing
ㆍModel training
ㆍLLM integration
ㆍRetrieval (RAG)
ㆍAgent workflows
ㆍEvaluation and safety
ㆍDeployment and monitoring

The goal is simple. Build systems that are reproducible, testable, and reliable.

Seven foundations define the field:
1. Programming and system design
ㆍClean architecture. APIs. Concurrency. Testing.
2. Math, statistics, and classical ML
ㆍThe theory behind every model decision.
3. Deep learning
ㆍNeural networks across text, vision, and audio.
4. LLMs and generative systems
ㆍTransformers, embeddings, fine-tuning, inference.
5. Retrieval and agents
ㆍRAG pipelines. Tool use. Multi-agent systems.
6. Evaluation and guardrails
ㆍHallucination control. Safety. Auditability.
7. Deployment and LLMOps
ㆍScaling, latency, cost, monitoring.

Three paths depending on your background:
⇢ Path 1: Data science and ML foundations
⇢ Path 2: LLM and generative systems
⇢ Path 3: Agentic systems and orchestration

A practical progression:
ㆍStart with Python and ML fundamentals
ㆍMove to LLMs and prompting
ㆍAdd retrieval and vector databases
ㆍBuild agent workflows
ㆍFocus on deployment and evaluation
ㆍThen specialize

If you want to stand out, build real systems:
ㆍA classical ML project
ㆍA deep learning model
ㆍAn LLM fine-tuning pipeline
ㆍA RAG system
ㆍAn agent workflow
ㆍA deployed system with monitoring

AI Engineering is not using models. It is designing systems that make them reliable. That is the real skill gap in the market today.

👉 Full deep dive: elliotone.com/newsletter/202…

-- ➕ Subscribe to The Modern Engineer for weekly AI Engineering insights.
📌 Subscribe at elliotone.com
Elliot One@elliot1one·
Everyone is talking about AI coding agents. Almost nobody is talking about control.

Most AI-assisted coding workflows fail for one reason: They start with execution.

The agent is told:
→ "Add authentication"
→ "Refactor the service"
→ "Build the feature"

And immediately begins generating code.

At that point, the model is already making implicit decisions about:
• Architecture
• Dependencies
• Scope
• Abstractions
• System behavior

That works for demos. It breaks in production. Production systems cannot rely on probabilistic assumptions hidden inside generated diffs.

That is why I wrote Issue 15 of The Modern Engineer: "Spec-Driven AI-Assisted Coding and Agent Control"

The core idea is simple: The specification becomes the execution contract. The agent executes only what the spec allows.

Example:
Included:
• Input validation logic in InputValidator.cs
Excluded:
• Routing changes
• Middleware changes
• Dependency updates

This transforms AI coding from open-ended generation into bounded execution.

A few ideas explored in the article:
→ Why prompt-first coding breaks down
→ Specs as infrastructure, not documentation
→ Separation of planning and execution
→ Human-in-the-loop execution gates
→ Deterministic task execution
→ Multi-agent workflows with explicit authority boundaries
→ Why agents should behave like constrained execution engines, not autonomous engineers

One of the most important rules: "The agent that plans must not execute."

Planning requires exploration. Execution requires discipline.

Reliable AI-assisted engineering is not about removing humans. It is about encoding intent, constraints, and boundaries explicitly enough that probabilistic systems cannot silently drift.

Full article: elliotone.com/newsletter/202…
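One way to make "the agent executes only what the spec allows" enforceable is a deterministic check of the files an agent touched against the spec. The following Python sketch is an assumption about how such a gate could look, not the mechanism from the article: the `SPEC` structure, the directory layout, and the glob patterns are hypothetical (only the file name `InputValidator.cs` comes from the post).

```python
from fnmatch import fnmatchcase

# Hypothetical spec: paths and patterns are illustrative only.
SPEC = {
    "included": ["src/Validation/InputValidator.cs"],
    "excluded": ["src/Routing/*", "src/Middleware/*", "*.csproj"],
}


def out_of_scope(changed_files, spec):
    """Return the changed files that the spec does not permit, in input order."""
    violations = []
    for path in changed_files:
        explicitly_excluded = any(fnmatchcase(path, pat) for pat in spec["excluded"])
        explicitly_included = any(fnmatchcase(path, pat) for pat in spec["included"])
        # Default-deny: a file must be included AND not excluded to pass.
        if explicitly_excluded or not explicitly_included:
            violations.append(path)
    return violations
```

A human-in-the-loop gate could then refuse to apply any diff for which `out_of_scope` returns a non-empty list.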
Elliot One@elliot1one·
C# 15 fixed one of its most awkward design problems.

Union types are one of the most exciting additions to modern C#.

Until now, developers had to choose between awkward compromises when a method needed to return one of several valid outcomes:
1. use object and lose clarity
2. force unrelated types into a shared base class or interface
3. build custom Result wrappers again and again

Now there's a cleaner option.

With union types, C# can express something that shows up all the time in real systems: a value can be one of a small, known set of valid shapes.

Why that matters:
1. API contracts become clearer
2. business outcomes become easier to model
3. pattern matching becomes more powerful
4. the compiler can help enforce exhaustive handling of known cases

This is especially useful for things like:
• success vs validation failure vs not found results
• workflow states such as draft, scheduled, and published
• parsing outcomes
• "one item or many items" style APIs

What I like most is that this fills a real gap in C# design. It sits between two extremes:
too loose → "this could be anything"
too heavy → "I need a full hierarchy just to model a few known cases"

Union types give us a middle path: precise, readable, and intentional.

They won't replace classes, interfaces, or inheritance. But for modeling known alternatives, they could make day-to-day C# design much cleaner.

And that's why I think union types could become one of the most practical C# features in years.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
Elliot One@elliot1one·
No one tells you that most AI systems fail because the model became the control plane.

The problem is not the LLM itself. It is letting probabilistic reasoning own routing, tool execution, and system behavior.

Real users do not send clean single-intent requests. They mix questions, omit context, and change direction mid-sentence. Without deterministic control flow, you get tool misuse, hallucinated actions, and unpredictable outcomes.

In Issue 14 of The Modern Engineer [elliotone.com/newsletter/202…], I built a local-first .NET intent routing system designed around one principle: Predictable behavior under messy input.

The architecture treats the LLM as a constrained component, not the orchestrator.

The pipeline is intentionally layered:
• InputValidator blocks invalid or suspicious input before any model call
• Policy handles deterministic logic like security refusals and trivial compute
• IntentClassifier uses rules first and embeddings only for disambiguation
• IntentRouter enforces one-intent-per-turn execution
• ToolRegistry controls allowlists, argument validation, and strict timeouts
• GroundedAnswerComposer generates responses only from verified tool output

A few production patterns matter more than people think:

Rules-first routing keeps decisions explainable and stable.

Embedding-based routing should never execute blindly. Confidence thresholds and ambiguity margins are mandatory if embeddings influence behavior.

Multi-intent detection is not a UX detail. It is a reliability boundary. If a request contains multiple intents, the safest action is often asking a clarifying question and executing nothing.

Tool execution must stay deterministic: allowlists, validated arguments, bounded execution windows, and safe failure modes.

Grounded responses are equally important. If retrieval fails, the system should fail in a controlled way instead of generating confident fiction.

One of the most important lessons from production AI systems is this: The model can assist the workflow, but it should not own the workflow. That distinction is where reliability starts.

Read more: elliotone.com/newsletter/202…
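The "confidence thresholds and ambiguity margins" rule is easy to state as code. This is a hedged Python sketch rather than the .NET IntentRouter from the issue: the `route` function, the `0.70` confidence floor, and the `0.10` margin are assumptions chosen for illustration, and the similarity scores are expected to come from whatever classifier sits upstream.

```python
def route(scores, min_confidence=0.70, min_margin=0.10):
    """scores maps intent -> similarity. Returns ("execute", intent) or ("clarify", None)."""
    if not scores:
        return ("clarify", None)
    # Deterministic order: best score first, intent name breaks ties.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    best_intent, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0

    # Confidence threshold: a weak best match must never trigger execution.
    if best_score < min_confidence:
        return ("clarify", None)
    # Ambiguity margin: two close candidates suggest mixed or unclear intent.
    if best_score - runner_up_score < min_margin:
        return ("clarify", None)
    return ("execute", best_intent)
```

When the router returns `"clarify"`, nothing executes and the user is asked a question, which matches the "ask and execute nothing" behavior described above.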
Elliot One@elliot1one·
Most applications start with a single implementation per interface. One payment provider. One storage backend. One notification channel.

Then the system grows. Suddenly the architecture needs:
• Multiple payment processors
• Different cloud storage providers
• Tenant-specific integrations
• Environment-specific implementations
• Strategy-based execution paths

Before .NET 8, this usually led to one of three patterns:
• Inject every implementation and manually branch
• Build custom factory abstractions around DI
• Pull services from IServiceProvider and hope the service locator problem stays contained

The result was often unnecessary plumbing leaking into application code.

.NET 8 Keyed Services change this significantly. You can now register multiple implementations of the same contract using AddKeyedScoped, AddKeyedSingleton, or AddKeyedTransient and resolve them directly with the [FromKeyedServices] attribute or GetKeyedService.

The important part is not the syntax. It is the architectural impact:
• Cleaner constructors
• Explicit dependency boundaries
• Less runtime branching
• No factory boilerplate
• Better separation between orchestration and implementation details

This becomes especially valuable in:
• Multi-tenant systems
• Provider-based architectures
• Strategy-driven workflows
• Plugin-style applications
• Payment, storage, and notification abstractions

Even better, strongly typed keys such as enums help avoid fragile string-based resolution patterns.

Like most features, this is not something to use everywhere. If your system only has one implementation per contract, keyed services add unnecessary complexity. But once multiple implementations become a core architectural concern, they provide a much cleaner composition model than most pre-.NET 8 approaches.

P.S. The biggest benefit of Keyed Services is not convenience. It is making implementation selection part of the dependency model instead of hidden runtime logic.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
Elliot One@elliot1one·
The biggest problem in production AI is not generation quality. It is undetected failure.

Production AI systems are defined by whether their behavior can be measured, observed, and improved over time.

Most AI teams focus heavily on generation quality.

📌 But they skip the operational layer:
• How do you detect regressions?
• How do you measure reliability?
• How do you enforce deployment gates?
• How do you close the loop when failures occur?

✅ In Issue 13 of The Modern Engineer, I explored how to build a production-aligned evaluation harness for local AI systems in C# using deterministic evaluation, observability, and feedback-driven policy refinement.

📌 The architecture combines:
• EvaluationCase and EvaluationResult contracts
• Deterministic relevance and safety scoring
• TelemetrySink spans + metrics collection
• PromptPolicy constraint composition
• FeedbackProcessor policy hardening loops
• Deployment gates based on pass rate + p95 latency
• Local model execution with OllamaSharp or mock clients

The execution pipeline is intentionally simple and observable:
Evaluation cases → Model execution → Deterministic scoring → Telemetry → Feedback events → Policy updates → Re-run validation

One of the most important design decisions is treating prompts and policies as operational infrastructure instead of static text.

📌 The system continuously tightens constraints when failures occur:
• Safety failures add explicit refusal constraints
• Relevance failures strengthen required operational concepts
• Verbosity failures enforce response discipline

The policy evolves as the system discovers weaknesses.

📌 This creates a controlled feedback loop:
• Detect regressions
• Apply constraints
• Re-run evaluations
• Verify measurable improvement

The deployment decision is not based on intuition.

📌 It is based on measurable thresholds:
• Pass rate
• Safety score
• p95 latency
• Token usage
• Operational telemetry

Reliable AI systems are built through evaluation discipline, observability, and continuous feedback loops. That is where production AI engineering starts.

👉 Read the full article here: elliotone.com/newsletter/202…
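A deployment gate based on pass rate and p95 latency reduces to a few lines of arithmetic. The Python sketch below is illustrative only and is not the C# harness from the issue: the function names, the nearest-rank percentile definition, and the `0.90` and `2000` ms thresholds are assumptions.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def deployment_gate(results, min_pass_rate=0.90, max_p95_ms=2000):
    """results is a list of (passed, latency_ms) pairs from one evaluation run."""
    if not results:
        return False  # no evidence means no deployment
    pass_rate = sum(1 for passed, _ in results if passed) / len(results)
    latency = p95([ms for _, ms in results])
    # Both thresholds must hold; either failure blocks the release.
    return pass_rate >= min_pass_rate and latency <= max_p95_ms
```

With 20 evaluation cases, the nearest-rank p95 is the 19th slowest-ordered latency, so a single outlier does not fail the gate but two do. Whether that tolerance is right for a given system is a policy decision.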
Elliot One@elliot1one·
Quartz.NET is one of the most underused infrastructure tools in modern .NET systems

As applications scale, not all work belongs inside the HTTP request pipeline.

Some workloads need to run independently of user traffic:
• Outbox processing
• Email delivery
• Cache rebuilding
• Scheduled cleanup jobs
• External system synchronization

Many teams initially push this logic into controllers, middleware, or long-running BackgroundService loops. That usually works until scheduling, retries, concurrency, and graceful shutdown behavior become operational problems.

This is where Quartz.NET fits naturally.

Quartz.NET is a production-grade scheduler for .NET that provides:
• Cron-based scheduling
• Persistent job storage
• Retry and misfire handling
• Clustering support
• Clean IHostedService integration

The architecture is intentionally simple:
📌 IJob
↳ Defines the executable unit of work
📌 Trigger
↳ Defines when the job runs
📌 IScheduler
↳ Coordinates execution and scheduling

One of Quartz.NET's biggest strengths is how cleanly it integrates with modern .NET applications. Jobs resolve through dependency injection, scoped services work correctly, and lifecycle management is handled automatically by the host.

No manual thread management. No fragile while(true) background loops. No custom scheduling infrastructure.

Features like [DisallowConcurrentExecution] also make execution boundaries explicit, which becomes critical for workloads like outbox processing and external integrations.

Quartz is not a replacement for queues or event-driven systems. Queues move events. Quartz orchestrates time. That distinction matters.

As systems mature, background processing becomes an architectural concern, not an implementation detail. Quartz.NET solves that problem exceptionally well in the .NET ecosystem.

-- ♻️ Repost if you think background jobs deserve more architectural attention ➕ Follow me (Elliot One) for more content on modern software engineering
Elliot One@elliot1one·
Production AI systems are not defined by whether they work once. They are defined by how they behave when things fail.

Most AI demos stop at successful execution. Real systems operate under uncertainty:
• Tools fail
• Requests time out
• Model output becomes structurally invalid
• Retrieval returns incomplete context
• Users provide unpredictable input

Without constraints, probabilistic systems eventually leak instability into deterministic infrastructure.

In Issue 12 of The Modern Engineer, I explored how to build a guardrailed AI assistant in C# using Ollama and Microsoft.Extensions.AI with a strong focus on deterministic execution, explicit boundaries, and safe failure modes.

The architecture is intentionally small and production-aligned. No autonomous agents. No hidden orchestration loops. No unconstrained tool execution.

The execution pipeline is simple:
User input → Validation → Tool planning → Constrained execution → Grounded response generation

The project combines:
• Input validation and prompt injection guardrails
• Deterministic policy routing for sensitive requests
• Strict ToolPlan JSON contracts
• Tool allowlisting and argument validation
• CancellationTokenSource timeouts for tool execution
• Grounded final responses using validated tool output only

One of the most important design decisions is separating probabilistic reasoning from deterministic execution. The assistant never executes tools directly.

Every ToolPlan is validated before execution, and the runtime enforces:
• Allowed tool boundaries
• Timeout protection
• Safe fallback behavior
• Refusal handling for restricted operations

If the system cannot retrieve enough information, it responds safely instead of hallucinating.

Reliable AI systems are built by constraining uncertainty, not amplifying it. That is where production AI engineering starts.

Read the full article here: elliotone.com/newsletter/202…
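The "validate every ToolPlan, allowlist tools, bound execution time" pattern can be sketched independently of the C# implementation. Everything below is a Python illustration under assumptions: the plan shape (`tool` and `args` keys), the `get_service_status` tool, and the error codes are hypothetical, and a thread-pool timeout stands in for the CancellationTokenSource approach mentioned in the post.

```python
import json
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def get_service_status(service):
    """Hypothetical tool; a real one would query monitoring."""
    return f"{service}: healthy"


# Allowlist: tool name -> (callable, exact set of argument names it accepts).
ALLOWED_TOOLS = {"get_service_status": (get_service_status, {"service"})}


def run_plan(plan_json, timeout_s=2.0):
    """Validate a model-produced tool plan, then execute it inside a time bound."""
    try:
        plan = json.loads(plan_json)
        name, args = plan["tool"], plan["args"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return {"ok": False, "error": "invalid_plan"}
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        return {"ok": False, "error": "tool_not_allowed"}
    func, expected = ALLOWED_TOOLS[name]
    if (not isinstance(args, dict) or set(args) != expected
            or not all(isinstance(v, str) for v in args.values())):
        return {"ok": False, "error": "invalid_arguments"}

    pool = ThreadPoolExecutor(max_workers=1)
    try:
        result = pool.submit(func, **args).result(timeout=timeout_s)
        return {"ok": True, "result": result}
    except FutureTimeout:
        return {"ok": False, "error": "timeout"}
    except Exception:
        return {"ok": False, "error": "tool_failed"}
    finally:
        # Caveat: Python cannot cancel a running thread, so a timed-out tool
        # keeps running in the background; only the caller is unblocked.
        pool.shutdown(wait=False)
```

The model only ever produces `plan_json`. Every path through `run_plan` ends in a structured result, so the response composer can refuse or fall back instead of guessing.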
Elliot One@elliot1one·
Data access patterns that once felt harmless eventually become bottlenecks.

Tables grow into millions of rows. List endpoints become some of the slowest queries in the system. Infinite scrolling and feeds expose hidden database inefficiencies.

Pagination is usually where these problems surface first.

EF Core primarily supports two pagination models:
• Offset pagination
• Keyset pagination

Both solve the same problem with very different scalability characteristics.

Most systems begin with offset pagination using Skip() and Take(). It is simple and works well for:
• Small datasets
• Internal dashboards
• Page-based UIs

But under scale, the weaknesses become obvious:
• Skipped rows are still scanned
• Query cost increases with page depth
• Inserts and deletes can create inconsistent results
• Large offsets eventually degrade performance badly

Keyset pagination takes a different approach. Instead of skipping rows, the client requests records after a known value such as:
• Id > lastId
• CreatedAt < lastTimestamp

Combined with proper indexing, this allows the database to perform index seeks instead of large scans. The result is significantly more stable performance on large datasets.

Keyset pagination is ideal for:
• Feeds
• Timelines
• Infinite scrolling
• Event streams
• High-traffic APIs

However, it also introduces trade-offs:
• No direct "go to page 12" support
• Requires deterministic ordering
• Ordering columns must be indexed correctly

The important architectural point is this: Pagination is not just a UI feature. It is a data access strategy.

Offset pagination optimizes for simplicity. Keyset pagination optimizes for scalability. Choosing intentionally becomes increasingly important as systems and datasets grow.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
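The difference between the two models is visible in the SQL they produce. The sketch below uses Python's built-in `sqlite3` instead of EF Core, purely to keep the example self-contained; the `orders` table and both helper functions are hypothetical. Roughly speaking, Skip()/Take() corresponds to the OFFSET/LIMIT form and `Id > lastId` to the WHERE form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders (id, customer) VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 11)],
)


def offset_page(page, size):
    """Offset pagination: the database still walks past every skipped row."""
    return conn.execute(
        "SELECT id FROM orders ORDER BY id LIMIT ? OFFSET ?",
        (size, (page - 1) * size),
    ).fetchall()


def keyset_page(last_id, size):
    """Keyset pagination: seek directly to the rows after the last seen key."""
    return conn.execute(
        "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()
```

If a row on page 1 is deleted between requests, `offset_page(2, 3)` silently skips a row the client never saw, while `keyset_page(3, 3)` keeps returning the rows after id 3. That is the "inserts and deletes can create inconsistent results" weakness in concrete form.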
Elliot One@elliot1one·
Before AI systems can reason, plan, or act, they first need reliable access to knowledge. Without strong retrieval, even sophisticated AI architectures become unstable.

Yet most AI discussions immediately jump toward agents, orchestration frameworks, memory systems, and autonomous execution loops. In practice, one of the most valuable production-ready AI capabilities is far simpler: semantic retrieval.

In Issue 11, I explored how to build a fully local semantic runbook search engine using C#, Ollama, and Microsoft.Extensions.AI.

The project combines:
• Strongly typed domain models for operational knowledge
• Ollama embeddings for fully local vector generation
• Microsoft.Extensions.AI abstractions for clean integration
• Deterministic cosine similarity scoring using System.Numerics.Tensors
• Clear separation between AI integration and retrieval logic

The architecture is intentionally simple:
User query → Embedding generation → Vector similarity scoring → Ranked runbook retrieval

No autonomous agents, orchestration loops, or hidden reasoning layers.

At startup, operational documents are embedded and indexed locally. Engineers can then describe problems in natural language and retrieve the most relevant runbooks based on meaning rather than keywords.

This matters because incidents rarely follow exact terminology. Engineers describe failures through symptoms, partial observations, and operational context. Traditional keyword search often breaks down when wording changes or pressure increases during outages.

The project uses:
• KnowledgeDocument as an immutable domain model
• OllamaEmbeddingGenerator for local embedding generation
• SemanticSearchEngine for deterministic ranking
• IEmbeddingGenerator abstractions for replaceable AI infrastructure

The retrieval layer contains no generative reasoning. Just deterministic vector math.

This creates several production advantages:
• Fully local and private execution
• Reproducible retrieval behavior
• Traceable similarity scores
• Clear separation of concerns
• Minimal infrastructure complexity

The system also forms a strong foundation for future evolution. Persistent vector storage, hybrid retrieval, RAG pipelines, or conversational exploration can all be layered on top later without changing the core architecture.

Before systems can reason, plan, or act, they must retrieve the right knowledge consistently. That is where production AI systems should start.

Read here: elliotone.com/newsletter/202…
Elliot One@elliot1one·
DbContext lifetime eventually stops being a small implementation detail. It becomes architecture.

What starts as a simple request → response API slowly turns into:
• Background workers processing queues
• Parallel workflows running outside HTTP requests
• Long-lived services
• High throughput endpoints under load

And suddenly the question is no longer: How do we access the database?

It becomes: Who owns the DbContext, for how long, and on which thread?

EF Core gives you multiple lifetime models because different workloads have different constraints.

Here is the practical model:
• AddDbContext
→ Default choice for web APIs
→ Scoped per request
→ Safe and simple for HTTP workloads
• IDbContextFactory
→ Explicit context creation
→ Ideal for hosted services, queues, background jobs, and parallel workflows
→ Avoids relying on ambient request scopes
• AddDbContextPool
→ Reuses DbContext instances
→ Reduces allocation overhead under sustained load
→ Best for high throughput APIs
• AddPooledDbContextFactory
→ Explicit creation + pooled reuse
→ Great for high throughput background processing pipelines

Also note: DbContext is NOT thread safe.

Most production EF Core problems are not caused by SQL. They are caused by lifetime discipline:
• Sharing contexts across threads
• Holding them too long
• Using scoped contexts outside request boundaries
• Treating DbContext like a cache instead of a unit of work

The scary part is that these mistakes usually work at first. Then traffic increases. Concurrency increases. Background processing appears. And the architecture starts leaking.

P.S. Good EF Core architecture is not only queries and indexes. It is also understanding lifetime boundaries.

-- ♻️ Share this if it helped ➕ Follow me [Elliot One] 🔔 Enable Notifications
Elliot One@elliot1one·
Most discussions around AI agents immediately jump toward autonomous systems, planning loops, memory layers, and multi-agent orchestration.

In practice, however, many production-ready AI systems start with something far simpler and far more reliable: reactive agents with deterministic execution boundaries.

In Issue 10, I explored how to build a fully local reactive AI agent using .NET 10, Semantic Kernel, and Ollama.

The project demonstrates how to combine:
• Semantic Kernel ChatCompletionAgent for structured reasoning
• Class-based function calling for deterministic execution
• Ollama for fully local inference and tool selection
• Clear separation between generative reasoning and executable logic

The architecture is intentionally simple:
User input → Agent reasoning → Optional tool invocation → Final response

No hidden orchestration layer, autonomous execution loop, or external cloud dependency.

One of the most important ideas behind this architecture is that instructions are not simply prompts. They become part of the system design itself.

The agent is explicitly constrained to:
• Invoke tools only when appropriate
• Produce structured and predictable outputs
• Avoid hallucinated responses
• Return deterministic failure modes when information is unavailable

This creates a far more reliable foundation for production AI systems than prematurely introducing autonomy everywhere.

The article also explores an important distinction that is often misunderstood in the industry today: the difference between an agent and an agentic system. A reactive agent can reason, invoke tools, and respond intelligently within a constrained request-response cycle. That does not automatically make it autonomous or agentic in the research sense of the term.

This distinction matters because reliability, observability, and deterministic behavior are often more valuable in production than unrestricted autonomy.

The system remains fully extensible. Additional tools, memory layers, planning systems, or multi-agent coordination can be added later without changing the core architectural principles.

A strong AI system is not defined by how autonomous it appears. It is defined by how predictably and safely it behaves under real-world conditions.

Read the full article here: elliotone.com/newsletter/202…
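To make the request-response shape concrete, here is a minimal Python sketch of a reactive loop with one optional tool call and a fixed failure reply. It does not use Semantic Kernel or Ollama: `select_tool` is a hypothetical rule-based stand-in for the model's tool selection, and `get_order_status`, the order data, and the reply strings are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

@dataclass(frozen=True)
class AgentReply:
    text: str
    tool_used: Optional[str] = None

def get_order_status(order_id: str) -> Optional[str]:
    """Hypothetical deterministic tool: a plain lookup, no model involved."""
    orders = {"A100": "shipped", "A200": "processing"}
    return orders.get(order_id)

TOOLS: Dict[str, Callable[[str], Optional[str]]] = {
    "get_order_status": get_order_status,
}

def select_tool(user_input: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the model's reasoning step: pick a tool and argument, or none."""
    words = user_input.replace("?", "").split()
    if "order" in user_input.lower() and words:
        return ("get_order_status", words[-1])
    return None

def run_agent(user_input: str) -> AgentReply:
    """User input -> tool selection -> optional tool call -> final response."""
    choice = select_tool(user_input)
    if choice is None:
        return AgentReply("No tool needed for this request.")
    name, argument = choice
    result = TOOLS[name](argument)
    if result is None:
        # Deterministic failure mode: say so instead of inventing an answer.
        return AgentReply("UNKNOWN: no data available for this request.", tool_used=name)
    return AgentReply(f"Order {argument} is {result}.", tool_used=name)
```

The point of the shape is that the tool and the failure path are ordinary code: whatever the reasoning step picks, the result comes from a lookup that can be tested on its own.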
Elliot One@elliot1one·
AI agents are useless without deterministic execution. That is the real bottleneck most teams are now hitting.

Modern AI systems cannot rely on prompts alone.

✅ They need structured orchestration, predictable tool execution, and clean separation between reasoning and infrastructure.

In Issue 9 of The Modern Engineer, I built a production-ready MCP architecture in C# and .NET 9 using Ollama as the LLM backend.

The system combines:
• MCP Server over STDIO for deterministic tool execution
• Dynamic MCP Client for tool discovery and orchestration
• Ollama integration for grounded AI reasoning

A few important architectural principles emerged:

1. Deterministic vs Generative Separation
Tool execution remains auditable and predictable. LLMs only reason over verified outputs.

2. Protocol Integrity Matters
Even logging design matters in MCP systems. Redirecting logs to STDERR prevents protocol corruption and preserves transport safety.

3. Dynamic Tool Discovery Changes Everything
The client discovers tools at runtime without hardcoded endpoints or signatures. This creates extensible AI systems instead of tightly coupled workflows.

The result is a modular AI architecture where tools, orchestration, and reasoning evolve independently.

This is where AI engineering is heading:
→ Context-aware orchestration
→ Tool-driven reasoning
→ Deterministic AI workflows
→ Protocol-first AI infrastructure

MCP is becoming one of the most important standards in modern AI systems engineering.

👉 Full article: elliotone.com/newsletter/202…
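The STDERR point in principle 2 is easy to show in miniature. The sketch below is not the MCP SDK and not the C# code from the article; it is a simplified Python stand-in that reads newline-delimited JSON-RPC-style requests, writes only protocol messages to the output stream, and sends all diagnostics to a separate log stream. The `echo` method and the function names are hypothetical.

```python
import io
import json
import logging
import sys

def build_logger(stream=None):
    """Create a logger that writes to stderr (or a supplied stream), never stdout."""
    logger = logging.getLogger("tool-server")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from any root handler
    return logger

def handle_request(line, out, logger):
    """Answer one JSON request; only the JSON response is written to `out`."""
    request = json.loads(line)
    logger.info("received method=%s", request.get("method"))
    if request.get("method") == "echo":  # hypothetical tool for the example
        response = {"jsonrpc": "2.0", "id": request.get("id"),
                    "result": request.get("params")}
    else:
        response = {"jsonrpc": "2.0", "id": request.get("id"),
                    "error": {"code": -32601, "message": "Method not found"}}
    out.write(json.dumps(response) + "\n")
    out.flush()

def serve(inp, out, logger):
    """Process every non-empty line from `inp` as one request."""
    for raw in inp:
        if raw.strip():
            handle_request(raw, out, logger)
```

In a real process this would be called as `serve(sys.stdin, sys.stdout, build_logger())`. A client parsing stdout then sees one JSON object per line and nothing else, which is exactly what a stray `print` or console logger would break.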
Elliot One@elliot1one·
Nobody tells you: The real shift in AI is not better models! It is better systems.

Classical machine learning was algorithm-centric for years.

⚠️ You defined a problem, engineered features, selected models like Logistic Regression, Random Forest, or XGBoost, and deployed a static predictor. It worked: it was efficient, interpretable, and reliable under stable conditions.

❌ But it was also rigid.

✅ Modern AI systems operate differently. Intelligence no longer lives inside a single model. It emerges from how systems are composed, orchestrated, and continuously adapted in production.

This is the architectural shift.

✔️ Classical ML was built on:
ㆍStructured data and predefined tasks
ㆍLinear pipelines and static deployments
ㆍModel weights as the primary source of intelligence

✔️ Modern AI systems are built on:
ㆍNatural language interfaces and unstructured data
ㆍOrchestration layers combining LLMs, embeddings, and tools
ㆍContext, retrieval, and memory as first-class system components

The implications are significant.

Adaptation no longer requires retraining. It happens through prompt design, context engineering, and retrieval pipelines. Techniques like Retrieval-Augmented Generation move knowledge out of the model and into the system architecture. Updates become immediate and behavior becomes controllable.

This is why fine-tuning is no longer the default. System design is.

We are also seeing a transition from linear pipelines to iterative agent loops. Systems can plan, execute, reflect, and adjust. Frameworks like Semantic Kernel and protocols like MCP are formalising how models interact with tools, memory, and external systems.

None of this replaces classical machine learning. It reframes it.

In production, the strongest systems are hybrid:
1. Classical ML for structured signals, scoring, and anomaly detection
2. LLMs for reasoning, synthesis, and interaction
3. RAG for grounding in real-time and private data
4. Agents for multi-step execution and automation

The model is no longer the system; it is a component within it.

Engineers are no longer just selecting algorithms. They are designing systems where intelligence is orchestrated, not trained.

That is the real transition.

👉 Read the full article here: elliotone.com/newsletter/202…
Elliot One@elliot1one·
AI systems fail without context.

Most teams optimise prompts. Senior engineers design context.

✅ Context engineering is the difference between a chatbot and a system that actually understands users.

AI does not remember anything by default. No intent, no preferences, no history. It only responds to what exists inside the current context window. That constraint defines system quality.

Prompt engineering improves single interactions. Context engineering defines behaviour across interactions.

It answers four critical questions:
• Who is the user
• What are they trying to achieve
• What has already happened
• What should happen next

Without this, systems feel generic and inconsistent. With it, systems become adaptive, coherent, and useful.

This is an engineering problem, not a prompting trick.

Well-designed context enables:
• Personalised outputs at scale
• Consistency across multi-turn conversations
• Support for complex workflows and agents
• Reduced cognitive load for users

You move from stateless responses to continuous understanding.

In practice, context engineering involves:
• Structuring user identity, roles, and preferences
• Managing session memory and persistent memory
• Integrating tools and external knowledge sources
• Updating context dynamically as conditions change

More context is not better. Irrelevant or stale information degrades performance and creates confusion. Strong systems are selective. They prioritise relevance, recency, and structure.

❌ For example, a vague prompt like "Tell me about Paris" produces generic output.
☑️ A contextual prompt such as "Explain Paris during the French Revolution in three key points" produces focused, useful results.

The model has not changed. The context has.

This is the shift. AI systems are no longer just about generation. They are about managing state over time.

When context is treated as a living system, you unlock:
• Memory-aware interactions
• Tool-driven reasoning
• Reliable multi-step workflows
• Systems that feel consistent rather than repetitive

P.S. Context engineering is where AI engineering becomes real engineering.

👉 Full breakdown: lnkd.in/eGrgpANt
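"Strong systems are selective" can be expressed as a small filter-and-sort step before the prompt is assembled. The Python sketch below is one possible shape, not a prescribed method: `ContextItem`, its `relevance` and `age_turns` fields, and the default cut-offs are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ContextItem:
    text: str
    relevance: float  # 0..1, hypothetical score from a retriever or heuristic
    age_turns: int    # how many conversation turns ago this was recorded

def select_context(items: List[ContextItem], max_items: int = 3,
                   min_relevance: float = 0.4, max_age: int = 20) -> List[ContextItem]:
    """Drop weak or stale items, then keep the most relevant and most recent."""
    usable = [item for item in items
              if item.relevance >= min_relevance and item.age_turns <= max_age]
    usable.sort(key=lambda item: (-item.relevance, item.age_turns, item.text))
    return usable[:max_items]

def build_prompt(user_profile: str, items: List[ContextItem], question: str) -> str:
    """Assemble a structured prompt: who the user is, what is known, what is asked."""
    lines = [f"User: {user_profile}", "Known context:"]
    lines += [f"- {item.text}" for item in select_context(items)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

With a profile, a handful of memory items, and the Paris question from the post, the stale and low-relevance items never reach the model, which is the whole idea: the model stays the same and only the context changes.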