05 May, 2026

Why We Built Solana Agent This Way

Evaluating an AI runtime that touches wallets and on-chain actions requires a clear view of its underlying providers, safety boundaries, and operational model. This post documents the current stack, how transaction proposals are handled, how the service is deployed, and the design choices that shape reliability and trust.

Provider transparency

The provider names and high-level responsibilities are published because they are not the sensitive elements of the system. Credentials, internal configurations, rate limits, queue details, and operational secrets remain private. The published information is intended to help builders evaluate whether the architecture and dependencies align with their requirements for production workloads.

Provider choices

| Layer | Provider | Why it is used |
| --- | --- | --- |
| Primary inference | Cerebras with gpt-oss-120b | Very fast open-weight inference for agent loops where latency matters. The target path is high-throughput, 3,000+ token/sec class generation. |
| Inference fallback | OpenAI with gpt-4.1-mini and priority service tier | A separate fallback path when the primary model is unavailable, degraded, or unsuitable for a request. |
| Embeddings | OpenAI text-embedding-3-small | Compact, reliable embeddings for memory, retrieval, and semantic search. |
| Memory and vector search | MongoDB Atlas auto-scaling cluster with Vector Search and TTL | Persistent memory, indexed retrieval, and automatic expiry for data that should not live forever. |
| Queues and background work | Upstash Redis | Durable queueing, async jobs, retries, and backpressure without turning the app into queue infrastructure. |
| Solana RPC | Helius RPC | High-quality Solana RPC access for reads, transaction preparation, and chain state. |
| Swaps and orders | Jupiter API | Best-route swap infrastructure and trading primitives Solana builders already trust. |
| Lending and yield | Kamino API | Lending, borrow, repay, vault, and position workflows without custom protocol glue. |
| Market data | Birdeye API | Prices, token data, wallet analytics, charts, search, and technical signals. |
| Web and X search | Grok API (xAI) | Real-time web search and X (Twitter) search for current events, sentiment, and external context that agents can ground responses in. |
| Hosted self-custody wallet | Privy API | Embedded wallet infrastructure so apps do not manage raw private keys or seed phrases. |
| Edge and traffic protection | Cloudflare | TLS, caching, DNS, and basic edge protection. |
| Usage settlement | PayAI x402 Merchant Provider | USDC-based x402 payments for usage settlement. |
| API runtime | Latest stable CPython with FastAPI | Inspectable, production-friendly Python API surface. |
| Deployment | OVH Cloud VPS with Dokku | Simple, owned deployment with zero-downtime deploys and operational control. |
| SDKs | Python Client SDK, with TypeScript and Rust SDKs planned | Open-source client surfaces so builders can inspect integration code instead of treating the API as a black box. |

The runtime integrates established services for inference, data persistence, execution, and infrastructure. The integration layer handles memory, tool orchestration, wallet context, transaction preparation, usage-based billing, reporting, and safety checks.

Model uncertainty and transaction safety

AI models can misinterpret instructions, hallucinate values, or generate unsafe plans. The architecture assumes this possibility and does not grant models direct signing authority.

The model generates intent. The runtime maps that intent to structured, typed tool calls. Transaction tools are further constrained by supported actions, live provider data (quotes, balances, routes), wallet state, configured limits, and explicit approval steps.

Model output never directly produces a signed transaction.
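
The flow above, from model intent to a typed and constrained tool call, can be illustrated with a minimal sketch. Every name here (`ToolCall`, `Policy`, `check_tool_call`, `SUPPORTED_ACTIONS`) is hypothetical and chosen for illustration; none of it is taken from the Solana Agent codebase, and the real checks are likely broader.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical allow-list; the actual set of supported actions is not published here.
SUPPORTED_ACTIONS = {"swap", "lend", "repay"}


@dataclass(frozen=True)
class ToolCall:
    """A typed action proposed by the model (illustrative shape only)."""
    action: str
    wallet: str
    token: str
    amount: Decimal


@dataclass(frozen=True)
class Policy:
    """Configured limits for a wallet (illustrative shape only)."""
    max_amount: Decimal
    require_approval: bool = True


def check_tool_call(call: ToolCall, policy: Policy,
                    wallet_balance: Decimal, approved: bool) -> list[str]:
    """Return the reasons a call must not proceed; an empty list means it may be prepared."""
    problems = []
    if call.action not in SUPPORTED_ACTIONS:
        problems.append(f"unsupported action: {call.action}")
    if call.amount <= 0:
        problems.append("amount must be positive")
    if call.amount > policy.max_amount:
        problems.append("amount exceeds configured limit")
    if call.amount > wallet_balance:
        problems.append("amount exceeds wallet balance")
    if policy.require_approval and not approved:
        problems.append("explicit approval missing")
    return problems
```

In this sketch the model only ever produces a `ToolCall`. Whether anything is prepared for signing depends on the checks, which read live wallet state and configured limits rather than trusting values in the model's output.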

Transaction safety model

Solana Agent is designed around layered checks:

  • Structured tool calls. The agent uses typed tools for swaps, orders, lending, market data, wallet actions, memory, and reporting instead of free-form transaction generation.
  • Provider-backed preparation. Jupiter, Kamino, Birdeye, Helius, and Privy are used for known execution surfaces rather than constructing transactions ad hoc.
  • Explicit transaction boundaries. Supported flows should expose the action, wallet, token, amount, route, fee behavior, and expected result before execution.
  • Hosted self-custody wallet model. The app gets a hosted self-custody wallet surface without handling seed phrases or raw private keys in the client app.
  • Gasless accounting. Network-fee coverage is tracked so operators can see gasless savings and fee-payer cost.
  • Wallet-scoped reporting. Spend, usage, provider requests, gasless savings, and forecasts are tied back to the wallet that actually ran the work.
  • Idempotency and queueing. Background jobs and retries run through queue infrastructure so repeated requests can be controlled rather than duplicated blindly.
  • Test coverage. The API is built on the latest stable CPython and FastAPI with 100% test coverage enforced in GitHub CI.
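
To make the idempotency point in the list above concrete, here is a minimal sketch of one common approach: derive a deterministic key from the request and refuse to run the same key twice. The function and class names are hypothetical, and the in-memory dictionary stands in for whatever queue or store the service actually uses; a production system would typically also expire keys and may prefer a client-supplied key.

```python
import hashlib
import json


def idempotency_key(wallet: str, action: str, params: dict) -> str:
    """Hash a canonical JSON form of the request so equal requests share a key."""
    canonical = json.dumps(
        {"wallet": wallet, "action": action, "params": params},
        sort_keys=True,            # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class JobRegistry:
    """In-memory stand-in for queue state (assumption; not the real store)."""

    def __init__(self):
        self._jobs = {}

    def submit(self, key, run):
        """Run `run` once per key. Returns (result, was_executed)."""
        if key in self._jobs:
            return self._jobs[key], False   # duplicate: reuse the earlier result
        result = run()
        self._jobs[key] = result
        return result, True
```

A retry or page refresh that resubmits the same wallet, action, and parameters maps to the same key, so the second call returns the first job's result instead of executing again.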

This approach does not eliminate blockchain or smart-contract risk. It reduces the number of custom systems a builder must design, implement, test, and operate to support agent-driven on-chain actions.

Attack vectors and mitigations

| Risk | What can go wrong | Mitigation |
| --- | --- | --- |
| Prompt injection | A page, user, or retrieved context tries to override the agent’s instructions. | Keep execution behind structured tools, supported action lists, and policy checks. Treat retrieved text as context, not authority. |
| Model hallucination | The model invents a token, route, balance, or action. | Require provider-backed market data, quotes, wallet state, and typed action parameters before transaction preparation. |
| Unsafe transaction intent | The user asks for something ambiguous or risky. | Prefer clarification, explicit action summaries, and configured limits; reject unsupported actions. |
| Provider outage | A model, RPC, market data, or DeFi provider degrades. | Use fallback inference, queue backpressure, retries, provider-level errors, and visible failure states instead of silent execution. |
| Duplicate execution | Retries or refreshes submit the same work twice. | Use idempotency, queue state, and job tracking for background execution paths. |
| Cost abuse | A user or agent loop burns tokens, provider requests, or gas sponsorship. | Use usage-based accounting, wallet-scoped reporting, starter allowance visibility, thresholds, forecasts, and provider-level pricing. |
| Wallet compromise | A client app mishandles keys or exposes secret material. | Do not ask for seed phrases. Use Privy for the hosted self-custody wallet layer and keep private key handling out of the client app. |
| Data retention drift | Memory or logs live longer than intended. | Use MongoDB Atlas TTL-backed storage for expiring memory and retention-bound data. |
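
As an illustration of the cost-abuse mitigation, the sketch below tracks spend per wallet and checks a threshold before more work is accepted. `UsageLedger` and its methods are hypothetical names, the single flat threshold is an assumption, and real accounting would be persisted rather than held in memory.

```python
from decimal import Decimal


class UsageLedger:
    """Wallet-scoped spend tracking with a simple threshold (illustrative only)."""

    def __init__(self, threshold: Decimal):
        self.threshold = threshold
        self._spend = {}

    def spent(self, wallet: str) -> Decimal:
        """Total recorded spend for a wallet; zero if the wallet is unknown."""
        return self._spend.get(wallet, Decimal("0"))

    def record(self, wallet: str, cost: Decimal) -> None:
        """Add a completed unit of work to the wallet's running total."""
        self._spend[wallet] = self.spent(wallet) + cost

    def allows(self, wallet: str, next_cost: Decimal) -> bool:
        """True if the next unit of work keeps the wallet at or under the threshold."""
        return self.spent(wallet) + next_cost <= self.threshold
```

Checking `allows` before each tool call or inference step gives a runaway agent loop a hard stop tied to the wallet that is paying for the work.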

Uptime and deployments

A public SLA is not published because the service depends on multiple upstream providers. The operating model instead emphasizes:

  • Zero-downtime deploys via Dokku release flow.
  • Cloudflare for edge delivery and basic protection.
  • MongoDB Atlas with auto-scaling for data and vector search.
  • Upstash Redis for durable queues, retries, and backpressure.
  • Inference fallback paths.
  • Explicit error handling for provider-dependent actions.
  • Detailed usage and cost reporting for post-execution review.

Teams requiring contractual uptime commitments should discuss terms directly.

Scaling strategy

The API remains a FastAPI/CPython service. Stateless handlers scale horizontally. Background and retryable work is managed through Redis queues. Persistent memory and vector search use MongoDB Atlas auto-scaling with TTL policies. Edge traffic is handled by Cloudflare. Inference uses a primary-plus-fallback configuration. Solana execution is delegated to specialized providers.
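
The primary-plus-fallback inference configuration can be sketched as an ordered list of providers tried in turn. This is a minimal illustration under stated assumptions: the providers are plain callables rather than real Cerebras or OpenAI clients, `generate_with_fallback` and `AllProvidersFailed` are hypothetical names, and the actual routing logic (for example, deciding that a model is "unsuitable for a request") is not shown.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed; carries each error."""


def generate_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, text) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # record the failure and move on to the next provider
            errors.append(f"{name}: {exc}")
    # No silent success: surface every provider error to the caller.
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the text keeps the fallback visible in logs and reports, and raising when all providers fail matches the stated preference for visible failure states over silent execution.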

Operational effort therefore centers on capacity planning, queue management, provider limits, observability, and cost control rather than repeated backend re-architecture.

Build versus integrate

Teams with specific custody, protocol, or provider requirements may prefer to build equivalent functionality themselves.

Solana Agent provides a pre-integrated runtime that combines inference routing, hosted self-custody wallet context, tool execution for swaps/lending/market data, x402 usage settlement, and wallet-scoped reporting. Open-source SDKs and this architecture description allow inspection of the integration points.

The design goal is to let builders focus on agent behavior and application logic instead of assembling and maintaining the supporting infrastructure.

Further reading