BastionSDK
Local AI for Swift Developers
A powerful Swift Package that brings open source LLM frameworks to iOS and macOS with OpenAI-compatible APIs. Build privacy-first AI applications that run entirely on-device.
OpenAI-Compatible API
A familiar Swift API modeled on OpenAI's client libraries, so migrating from cloud to local AI requires minimal code changes.
Apple Platform Native
Built specifically for iOS and macOS with Metal GPU acceleration. Optimized for Apple Silicon and mobile devices.
Privacy by Design
All AI processing happens locally on device. No data ever leaves your users' devices, so privacy is guaranteed by construction.
Hardware Acceleration
Leverages Apple's Neural Engine, Metal GPU, and CPU optimizations. Fast inference on iPhone, iPad, and Mac devices.
Production Ready
Streaming and non-streaming completion, chat templates, GBNF grammar support, and robust error handling.
Developer Friendly
Async/await support, comprehensive documentation, and Swift Package Manager integration. Just add and use.
Your AI, Your Rules
Build AI-powered iOS and macOS apps without compromising on privacy or performance
Core Features
- OpenAI-Compatible Swift API - BastionChatEngine, ChatCompletionRequest, familiar patterns
- Local LLM Inference - Support for multiple model engines, including CoreML and open source frameworks
- Model Management - Async model loading and unloading with resource management
- Streaming Support - Real-time token streaming with interruptible generation
- Chat Templates - Robust template application using the model's built-in templates
- Advanced Parameters - Temperature, Top-K, Top-P, penalties, stop sequences
- GBNF Grammar - Constrained output generation for structured data
- Token Statistics - Usage tracking for non-streaming completions
Technical Architecture
Platform Support
- iOS 15.0+ - iPhone and iPad with Neural Engine
- macOS 12.0+ - Apple Silicon and Intel Macs
- Multiple Engines - CoreML, Metal GPU, CPU optimization
Quick Start Guide
1. Add BastionSDK to Your Project
```swift
// Add to your Package.swift dependencies
.package(url: "https://github.com/BastionAI/BastionSDK.git", from: "1.0.0")
```
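In context, the dependency slots into a manifest like the one below. This is a sketch: the library product name `BastionSDK` and the minimum tools version are assumptions, so match them to what the package actually exposes.

```swift
// swift-tools-version:5.7
// Package.swift (sketch; product name "BastionSDK" is assumed)
import PackageDescription

let package = Package(
    name: "MyApp",
    // Minimum platforms from the "Platform Support" section below
    platforms: [.iOS(.v15), .macOS(.v12)],
    dependencies: [
        .package(url: "https://github.com/BastionAI/BastionSDK.git", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [.product(name: "BastionSDK", package: "BastionSDK")]
        )
    ]
)
```

In Xcode projects, the same URL can instead be added via File > Add Package Dependencies.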
2. Basic Usage
```swift
import BastionSDK
import Foundation

// Initialize the engine
let engine = BastionChatEngine()

// Configure the model engine (CoreML, open source, etc.)
let config = ModelConfiguration(
    engine: .coreML,  // or .openSource, .metal
    contextSize: 4096,
    temperature: 0.7,
    useHardwareAcceleration: true
)

// Load your model
try await engine.loadModel(
    modelPath: "Models/your_model.mlpackage",  // or a .gguf file for open source engines
    configuration: config
)

// Create a chat completion request
let messages = [
    ChatMessage(role: .system, content: "You are a helpful assistant."),
    ChatMessage(role: .user, content: "Hello! How are you?")
]

let request = ChatCompletionRequest(
    model: "my-model",
    messages: messages,
    temperature: 0.7,
    maxTokens: 150,
    stream: true
)

// Stream the response token by token
let stream = try await engine.streamChatCompletion(request: request)
for try await chunk in stream {
    if let content = chunk.choices.first?.delta.content {
        print(content, terminator: "")
    }
}
```
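For non-streaming use, the same request can return the full message plus token statistics in one call. A minimal sketch: the `chatCompletion(request:)` method name and the `usage` field layout (`promptTokens`, `completionTokens`, `totalTokens`) are assumptions modeled on OpenAI's response shape.

```swift
// Non-streaming completion (sketch; method and usage field names are assumed)
let request = ChatCompletionRequest(
    model: "my-model",
    messages: messages,
    temperature: 0.7,
    maxTokens: 150,
    stream: false
)

let response = try await engine.chatCompletion(request: request)
if let reply = response.choices.first?.message.content {
    print(reply)
}

// Token statistics are reported for non-streaming completions
if let usage = response.usage {
    print("tokens - prompt: \(usage.promptTokens), " +
          "completion: \(usage.completionTokens), total: \(usage.totalTokens)")
}
```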
Advanced Features
Streaming & Non-Streaming
Support for both real-time streaming responses and traditional completion requests. Interruptible streaming allows users to stop generation at any time.
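Because the stream is consumed with async/await, interruption can be sketched with standard Swift task cancellation, assuming the stream cooperates with `Task.cancel()`:

```swift
// Interruptible generation via Swift task cancellation (sketch;
// assumes the stream ends cooperatively when its task is cancelled)
let generation = Task {
    let stream = try await engine.streamChatCompletion(request: request)
    for try await chunk in stream {
        if let content = chunk.choices.first?.delta.content {
            print(content, terminator: "")
        }
    }
}

// Later, e.g. wired to a "Stop" button in your UI:
generation.cancel()
```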
Model Engine Flexibility
Support for multiple model engines including Apple CoreML, open source frameworks, and optimized native engines. Choose the best option for your use case.
GBNF Grammar
Constrained output generation using GBNF grammar rules. Perfect for generating structured data, JSON, or following specific formats.
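As an illustration, a small GBNF grammar (the notation used by open source llama.cpp-style engines) that forces output to be a one-field JSON object might look like this. How the grammar is attached to a request is an assumption; the `grammar` parameter name shown here is hypothetical.

```swift
// GBNF grammar constraining output to {"name": "..."} (sketch)
let grammar = """
root   ::= "{" ws "\\"name\\"" ws ":" ws string ws "}"
string ::= "\\"" [a-zA-Z ]* "\\""
ws     ::= [ \\t\\n]*
"""

// The `grammar` parameter is a hypothetical illustration
let request = ChatCompletionRequest(
    model: "my-model",
    messages: [ChatMessage(role: .user, content: "Give me a name as JSON.")],
    grammar: grammar
)
```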
Hardware Acceleration
Leverage Apple's Neural Engine, Metal GPU, and CPU optimizations. Automatic hardware selection for optimal performance on each device.
Robust Error Handling
Comprehensive error types and handling for model loading, inference failures, and resource management. Production-ready reliability.
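In practice this means wrapping load and inference calls in `do`/`catch`. A sketch of the pattern; the `BastionError` type and its cases shown here are hypothetical, named only to illustrate matching on specific failures:

```swift
// Error handling around model loading (sketch; BastionError and its cases are hypothetical)
do {
    try await engine.loadModel(
        modelPath: "Models/your_model.mlpackage",
        configuration: config
    )
} catch let error as BastionError {
    switch error {
    case .modelNotFound(let path):
        print("No model found at \(path)")
    case .insufficientMemory:
        print("Not enough memory to load the model")
    default:
        print("Engine error: \(error)")
    }
} catch {
    print("Unexpected error: \(error)")
}
```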
Resource Management
Efficient memory usage with async model loading/unloading. Proper cleanup and resource management for mobile environments.
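On mobile, that typically means releasing the model when memory is tight or the app is backgrounded. A sketch; `unloadModel()` is an assumed counterpart to `loadModel` and may differ in the actual API:

```swift
// Free model memory under pressure (sketch; `unloadModel()` is assumed)
func handleMemoryWarning() {
    Task {
        await engine.unloadModel()  // release weights and context
    }
}
```

Wiring this to `UIApplication.didReceiveMemoryWarningNotification` (or `onChange(of: scenePhase)` in SwiftUI) keeps the app a good citizen on iOS.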
Development Status
Core Engine
Model loading, unloading, and basic inference
Streaming Support
Real-time token streaming with interruption
Chat Templates
Built-in template support via open source engines
Hardware Acceleration
Neural Engine, Metal GPU, CPU optimization
SwiftUI Integration
First available version targets SwiftUI apps
Additional Platforms
Future support for watchOS and tvOS
Use Cases
📱 Personal AI Assistants
Build privacy-first AI assistants that run entirely on iPhone. No data leaves the device, perfect for personal productivity apps.
💬 Offline Chat Apps
Create chat applications that work without internet connectivity. Perfect for remote areas or security-sensitive environments.
📝 Content Creation
Writing assistants, code completion, and creative tools that respect user privacy while providing intelligent suggestions.
🏥 Healthcare Apps
Medical and healthcare applications can process sensitive data entirely on device, simplifying compliance with regulations such as HIPAA.
🎓 Educational Tools
Learning applications that work in schools without internet dependency. Personalized tutoring that stays private.
🏢 Enterprise Solutions
Business apps that can process confidential data locally. Perfect for air-gapped environments and sensitive workflows.
Ready to Build Local AI Apps?
Join the privacy-first AI revolution. Start building iOS and macOS apps that respect user privacy while delivering powerful AI capabilities.
Currently in active development. First version available for SwiftUI applications.