BastionSDK

Local AI for Swift Developers

A powerful Swift Package that brings open source LLM frameworks to iOS and macOS with OpenAI-compatible APIs. Build privacy-first AI applications that run entirely on-device.

🚀 OpenAI-Compatible API

A familiar Swift API modeled on OpenAI's client libraries. Migrate from cloud to local AI with minimal code changes.

📱 Apple Platform Native

Built specifically for iOS and macOS with Metal GPU acceleration. Optimized for Apple Silicon and mobile devices.

🔒 Privacy by Design

All AI processing happens locally on the device. No data ever leaves your user's device, giving a complete privacy guarantee.

Hardware Acceleration

Leverages Apple's Neural Engine, Metal GPU, and CPU optimizations. Fast inference on iPhone, iPad, and Mac devices.

🛠️ Production Ready

Streaming and non-streaming completions, chat templates, GBNF grammar support, and robust error handling.

🔧 Developer Friendly

Async/await support, comprehensive documentation, and Swift Package Manager integration. Just add and use.

Your AI, Your Rules

Build AI-powered iOS and macOS apps without compromising on privacy or performance

Core Features

  • OpenAI-Compatible Swift API - BastionChatEngine, ChatCompletionRequest, familiar patterns
  • Local LLM Inference - Support for multiple model engines, including CoreML and open source frameworks
  • Model Management - Async model loading and unloading with resource management
  • Streaming Support - Real-time token streaming with interruptible generation
  • Chat Templates - Robust template application using the model's built-in templates
  • Advanced Parameters - Temperature, Top-K, Top-P, penalties, stop sequences
  • GBNF Grammar - Constrained output generation for structured data
  • Token Statistics - Usage tracking for non-streaming completions

Technical Architecture

  • Swift API Layer - BastionSDK, your application interface
  • Model Engine Layer - CoreML, open source frameworks, and native engines
  • Hardware Acceleration - Metal GPU, Neural Engine, and CPU optimization

Platform Support

  • iOS 15.0+ - iPhone and iPad with Neural Engine
  • macOS 12.0+ - Apple Silicon and Intel Macs
  • Multiple Engines - CoreML, Metal GPU, CPU optimization

Quick Start Guide

1. Add BastionSDK to Your Project

// Add to your Package.swift dependencies
.package(url: "https://github.com/BastionAI/BastionSDK.git", from: "1.0.0")
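For a fuller picture, a minimal Package.swift might look like the sketch below. The platform minimums mirror the support list above; the app and product names are assumptions, so check the package's own manifest for the real product name.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyLocalAIApp",              // your app's name (placeholder)
    platforms: [
        .iOS(.v15),                    // iOS 15.0+ minimum, per the support list
        .macOS(.v12)                   // macOS 12.0+ minimum, per the support list
    ],
    dependencies: [
        .package(url: "https://github.com/BastionAI/BastionSDK.git", from: "1.0.0")
    ],
    targets: [
        .executableTarget(
            name: "MyLocalAIApp",
            // "BastionSDK" as the product name is an assumption
            dependencies: [.product(name: "BastionSDK", package: "BastionSDK")]
        )
    ]
)
```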

2. Basic Usage

import BastionSDK
import Foundation

// Initialize the engine
let engine = BastionChatEngine()

// Configure model engine (CoreML, Open Source, etc.)
let config = ModelConfiguration(
    engine: .coreML,        // or .openSource, .metal
    contextSize: 4096,
    temperature: 0.7,
    useHardwareAcceleration: true
)

// Load your model
try await engine.loadModel(
    modelPath: "Models/your_model.mlpackage", // or .gguf for open source
    configuration: config
)

// Create chat completion
let messages = [
    ChatMessage(role: .system, content: "You are a helpful assistant."),
    ChatMessage(role: .user, content: "Hello! How are you?")
]

let request = ChatCompletionRequest(
    model: "my-model",
    messages: messages,
    temperature: 0.7,
    maxTokens: 150,
    stream: true
)

// Stream response
let stream = try await engine.streamChatCompletion(request: request)
for try await chunk in stream {
    if let content = chunk.choices.first?.delta.content {
        print(content, terminator: "")
    }
}

Advanced Features

Streaming & Non-Streaming

Support for both real-time streaming responses and traditional completion requests. Interruptible streaming allows users to stop generation at any time.
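The interruption pattern can be sketched with standard Swift concurrency. The snippet below is self-contained: mockTokenStream is a stand-in for the SDK's real token stream, not a BastionSDK API.

```swift
import Foundation

// Stand-in for the SDK's token stream, so the sketch runs on its own.
func mockTokenStream(_ tokens: [String]) -> AsyncThrowingStream<String, Error> {
    AsyncThrowingStream { continuation in
        for token in tokens { continuation.yield(token) }
        continuation.finish()
    }
}

// Wrapping consumption in a Task is what makes generation interruptible:
// a "Stop" button can call generation.cancel() at any time.
let generation = Task<String, Error> {
    var output = ""
    for try await token in mockTokenStream(["Local ", "AI ", "runs ", "here."]) {
        try Task.checkCancellation()   // exits promptly once cancelled
        output += token
    }
    return output
}

// generation.cancel()  // e.g. from a "Stop" button in the UI
let text = (try? await generation.value) ?? "(cancelled)"
print(text)
```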

Model Engine Flexibility

Support for multiple model engines including Apple CoreML, open source frameworks, and optimized native engines. Choose the best option for your use case.

GBNF Grammar

Constrained output generation using GBNF grammar rules. Perfect for generating structured data, JSON, or following specific formats.
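As an illustration, a small GBNF grammar that constrains output to a one-field JSON object might look like this. The grammar follows common GBNF conventions; the grammar parameter shown in the comment is an assumed name for how BastionSDK might accept it.

```swift
// A minimal GBNF grammar forcing output of the form {"answer": "..."}.
let jsonGrammar = """
root   ::= "{" ws "\\"answer\\":" ws string ws "}"
string ::= "\\"" [a-zA-Z0-9 .,!?]* "\\""
ws     ::= [ \\t\\n]*
"""

// Hypothetical usage (the `grammar:` parameter name is an assumption):
// let request = ChatCompletionRequest(model: "my-model",
//                                     messages: messages,
//                                     grammar: jsonGrammar)
print(jsonGrammar)
```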

Hardware Acceleration

Leverage Apple's Neural Engine, Metal GPU, and CPU optimizations. Automatic hardware selection for optimal performance on each device.

Robust Error Handling

Comprehensive error types and handling for model loading, inference failures, and resource management. Production-ready reliability.
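The shape of that handling can be sketched as follows. BastionError here is a hypothetical stand-in for the SDK's real error type, which may differ in cases and naming.

```swift
import Foundation

// Hypothetical error type illustrating the do/catch pattern an app would use.
enum BastionError: Error, CustomStringConvertible {
    case modelNotFound(path: String)
    case outOfMemory(requiredMB: Int)
    case inferenceFailed(reason: String)

    var description: String {
        switch self {
        case .modelNotFound(let path): return "Model not found at \(path)"
        case .outOfMemory(let mb):     return "Needs ~\(mb) MB of free memory"
        case .inferenceFailed(let r):  return "Inference failed: \(r)"
        }
    }
}

func loadModelOrExplain(path: String) -> String {
    do {
        // In a real app this would be: try await engine.loadModel(...)
        throw BastionError.modelNotFound(path: path)
    } catch let error as BastionError {
        return error.description      // surface a user-readable message
    } catch {
        return "Unexpected error: \(error)"
    }
}

print(loadModelOrExplain(path: "Models/missing.gguf"))
```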

Resource Management

Efficient memory usage with async model loading/unloading. Proper cleanup and resource management for mobile environments.

Development Status

Core Engine

Model loading, unloading, and basic inference

Streaming Support

Real-time token streaming with interruption

Chat Templates

Built-in template support via open source engines

Hardware Acceleration

Neural Engine, Metal GPU, CPU optimization

🚧 SwiftUI Integration

First available version targets SwiftUI apps

📋 Additional Platforms

Future support for watchOS and tvOS

Use Cases

📱 Personal AI Assistants

Build privacy-first AI assistants that run entirely on iPhone. No data leaves the device, perfect for personal productivity apps.

💬 Offline Chat Apps

Create chat applications that work without internet connectivity. Perfect for remote areas or security-sensitive environments.

📝 Content Creation

Writing assistants, code completion, and creative tools that respect user privacy while providing intelligent suggestions.

🏥 Healthcare Apps

Medical and healthcare applications that process sensitive data entirely on-device, easing HIPAA compliance.

🎓 Educational Tools

Learning applications that work in schools without internet dependency. Personalized tutoring that stays private.

🏢 Enterprise Solutions

Business apps that can process confidential data locally. Perfect for air-gapped environments and sensitive workflows.

Ready to Build Local AI Apps?

Join the privacy-first AI revolution. Start building iOS and macOS apps that respect user privacy while delivering powerful AI capabilities.

Currently in active development. First version available for SwiftUI applications.