Spring AI: Building AI Applications with Java and Spring Boot (2026)

Spring AI brings portable, Spring-style abstractions to AI model integration. Write your application against the Spring AI API once, then swap between OpenAI, Anthropic Claude, Google Gemini, Azure OpenAI, or a locally running Ollama model by changing the starter dependency and a few configuration properties, usually with no application code changes.

1. Setup and Dependencies

<!-- pom.xml -->
<properties>
    <spring-ai.version>1.0.0</spring-ai.version>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

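<dependencies>
    <!-- OpenAI (starter artifacts were renamed for the 1.0 release) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>
    <!-- Or Anthropic Claude -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-anthropic</artifactId>
    </dependency>
    <!-- Vector store (PGVector) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
    </dependency>
    <!-- QuestionAnswerAdvisor, used in sections 5 and 7 -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-advisors-vector-store</artifactId>
    </dependency>
    <!-- TikaDocumentReader, used in section 5 -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
    </dependency>
</dependencies>

# application.yaml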
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o
          temperature: 0.7
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: claude-sonnet-4-5

2. ChatClient API

ChatClient is the fluent API for building AI interactions:

@Service
public class CustomerSupportService {

    private final ChatClient chatClient;

    public CustomerSupportService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder
            .defaultSystem("You are a helpful customer support agent for Techoral. " +
                          "Be concise, friendly and accurate.")
            .build();
    }

    // Simple prompt → response
    public String answer(String question) {
        return chatClient.prompt()
            .user(question)
            .call()
            .content();
    }

    // Streaming response
    public Flux<String> answerStreaming(String question) {
        return chatClient.prompt()
            .user(question)
            .stream()
            .content();
    }

    // Multi-turn conversation
    public String chat(String userMessage, List<Message> history) {
        return chatClient.prompt()
            .messages(history)
            .user(userMessage)
            .call()
            .content();
    }
}

3. Prompt Templates

@Service
public class CodeReviewService {

    private final ChatClient chatClient;

    // Template with variables
    private static final String CODE_REVIEW_TEMPLATE = """
        Review the following {language} code for bugs, security issues, and improvements.

        Code:
        ```{language}
        {code}
        ```

        Provide feedback in these categories:
        1. Bugs
        2. Security issues
        3. Performance improvements
        4. Style suggestions
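        """;

    public CodeReviewService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    // Fills the {language} and {code} placeholders, then sends the rendered prompt
    public String reviewCode(String language, String code) {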
        return chatClient.prompt()
            .user(u -> u.text(CODE_REVIEW_TEMPLATE)
                       .param("language", language)
                       .param("code", code))
            .call()
            .content();
    }
}
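
The `.param(...)` calls above fill the `{language}` and `{code}` placeholders before the prompt is sent. As a rough mental model only, the substitution step can be sketched in plain Java. `TemplateSketch` is a hypothetical helper written for this illustration; Spring AI uses its own template renderer, and this is not that implementation.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative stand-in for placeholder substitution (hypothetical, not a Spring AI class).
final class TemplateSketch {

    // Matches {name} where name consists of letters, digits, or underscores
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {name} with its value and fails fast when a value is missing.
    static String render(String template, Map<String, String> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = params.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement stops '$' and '\' in the value from being read as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Because the matcher scans only the original template, braces inside a substituted value (common in source code) are copied through untouched rather than treated as further placeholders.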

4. Structured Output

Map AI responses directly to Java objects:

// Define the output schema as a Java record
public record ProductRecommendation(
    String productName,
    String reason,
    double confidenceScore,
    List<String> alternativeProducts
) {}

@Service
public class RecommendationService {

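    private final ChatClient chatClient;

    public RecommendationService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }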

    public ProductRecommendation recommend(String userQuery, String category) {
        return chatClient.prompt()
            .system("You are a product recommendation engine. Return structured recommendations.")
            .user("User query: " + userQuery + "\nCategory: " + category)
            .call()
            .entity(ProductRecommendation.class);  // automatic JSON parsing
    }

    // List of structured outputs
    public List<ProductRecommendation> recommendMultiple(String query) {
        return chatClient.prompt()
            .user(query)
            .call()
            .entity(new ParameterizedTypeReference<List<ProductRecommendation>>() {});
    }
}

5. RAG with VectorStore

Retrieval Augmented Generation — ground AI responses in your own data:

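import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.pgvector.PgVectorStore;

@Configuration
public class VectorStoreConfig {

    // Optional: the PGVector starter auto-configures a VectorStore bean.
    // Define your own only when you need to customize it.
    @Bean
    public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel) {
        // Spring AI 1.0 replaced the public constructor with a builder
        return PgVectorStore.builder(jdbcTemplate, embeddingModel).build();
    }
}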

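@Service
public class DocumentIngestionService {

    private final VectorStore vectorStore;

    public DocumentIngestionService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Ingest documents into the vector store
    public void ingestPdf(Resource pdfResource) {
        // Extract text with Apache Tika (needs the spring-ai-tika-document-reader dependency)
        List<Document> documents = new TikaDocumentReader(pdfResource).get();
        // Split into token-bounded chunks so each one fits the embedding model's input limit
        List<Document> chunks = new TokenTextSplitter().apply(documents);
        // Embeds every chunk and stores the vectors
        vectorStore.add(chunks);
    }
}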

@Service
public class RagChatService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagChatService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public String answerWithContext(String question) {
        // Retrieve the 5 most similar chunks (SearchRequest uses a builder in Spring AI 1.0)
        List<Document> relevantDocs = vectorStore.similaritySearch(
            SearchRequest.builder().query(question).topK(5).build()
        );

        // Build context from retrieved documents (getText() replaced getContent() in 1.0)
        String context = relevantDocs.stream()
            .map(Document::getText)
            .collect(Collectors.joining("\n\n"));

        return chatClient.prompt()
            .system("Answer based only on the provided context. " +
                   "If you don't know, say so.\n\nContext:\n" + context)
            .user(question)
            .call()
            .content();
    }

    // Using QuestionAnswerAdvisor for automatic RAG
    public String answerWithAdvisor(String question) {
        return chatClient.prompt()
            .advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
            .user(question)
            .call()
            .content();
    }
}
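
What `similaritySearch` with a top-K limit does conceptually can be sketched in plain Java: score every stored embedding against the query embedding and keep the K best. This is an illustration under stated assumptions, not how PgVector or Spring AI implement it. `SimilaritySketch` is a hypothetical class, cosine similarity is only one common metric, and a real vector store uses an index rather than a full scan.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Brute-force top-K retrieval over in-memory vectors (hypothetical, for illustration only).
final class SimilaritySketch {

    // Cosine similarity between two equal-length vectors; 0.0 if either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k stored vectors most similar to the query, best match first.
    static List<String> topK(double[] query, Map<String, double[]> store, int k) {
        return store.entrySet().stream()
            .sorted(Comparator.comparingDouble(
                (Map.Entry<String, double[]> e) -> cosine(query, e.getValue())).reversed())
            .limit(k)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }
}
```

In the services above, the embedding model produces the vectors and the vector store performs the ranking, so application code only supplies the query text and the value of K.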

6. Function Calling / Tools

import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;

// Define a tool the AI can call
@Component
public class WeatherService {

    @Tool(description = "Get current weather for a city")
    public String getWeather(
            @ToolParam(description = "City name") String city,
            @ToolParam(description = "Unit: celsius or fahrenheit") String unit) {
        // Real implementation would call a weather API.
        // Comparing from the literal tolerates "Celsius" and a null unit.
        return String.format("Weather in %s: 22°%s, Sunny", city, "celsius".equalsIgnoreCase(unit) ? "C" : "F");
    }
}

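@Service
public class WeatherChatService {

    private final ChatClient chatClient;
    private final WeatherService weatherService;

    public WeatherChatService(ChatClient.Builder chatClientBuilder, WeatherService weatherService) {
        this.chatClient = chatClientBuilder.build();
        this.weatherService = weatherService;
    }

    public String chat(String message) {
        // The model decides when to call getWeather(); Spring AI invokes the method
        // and passes the result back to the model before the final answer
        return chatClient.prompt()
            .tools(weatherService)  // register the @Tool methods of this object
            .user(message)
            .call()
            .content();
    }
}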

7. Advisors API

Advisors are reusable request/response interceptors:

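import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;

@Service
public class ChatService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;
    // In-memory sliding window of recent messages (InMemoryChatMemory was removed in 1.0)
    private final ChatMemory chatMemory = MessageWindowChatMemory.builder().build();

    public ChatService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public String chat(String sessionId, String message) {
        return chatClient.prompt()
            // RAG: automatically retrieves and injects context
            .advisors(QuestionAnswerAdvisor.builder(vectorStore)
                .searchRequest(SearchRequest.builder().topK(4).build())
                .build())
            // Memory: replays the conversation history stored under sessionId
            .advisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
            // Logging: logs requests and responses at DEBUG level (useful for debugging/compliance)
            .advisors(new SimpleLoggerAdvisor())
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, sessionId))
            .user(message)
            .call()
            .content();
    }
}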
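
To make the interceptor idea concrete, here is a minimal plain-Java sketch of a chain in which each link can change the request, call the rest of the chain, and inspect the response. `AdvisorChainSketch` and `Interceptor` are hypothetical names for this illustration, not Spring AI types; real advisors implement Spring AI's advisor interfaces and are sorted by their order value rather than by list position.

```java
import java.util.List;
import java.util.function.Function;

// Chain-of-responsibility sketch of request/response interception (hypothetical types).
final class AdvisorChainSketch {

    // An interceptor may alter the request, delegate to the rest of the chain,
    // and then inspect or alter the response on the way back.
    interface Interceptor {
        String around(String request, Function<String, String> next);
    }

    // Wraps the terminal model call with the interceptors; the first in the list runs outermost.
    static Function<String, String> chain(List<Interceptor> interceptors,
                                          Function<String, String> model) {
        Function<String, String> next = model;
        // Build from the inside out so that index 0 ends up as the outermost wrapper
        for (int i = interceptors.size() - 1; i >= 0; i--) {
            Interceptor current = interceptors.get(i);
            Function<String, String> downstream = next;
            next = request -> current.around(request, downstream);
        }
        return next;
    }
}
```

With a logging interceptor placed first and a context-injecting interceptor second, the logger sees the original request on the way in and the final response on the way out, while the model receives the request with the context already added.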

Frequently Asked Questions

How do I switch between OpenAI and Anthropic without code changes?

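Use Spring profiles. Define two profiles, one with the spring.ai.openai config and one with the spring.ai.anthropic config. Inject ChatModel or ChatClient.Builder (not a provider-specific class such as OpenAiChatModel) into your service. Switch with spring.profiles.active=openai or spring.profiles.active=anthropic. If both starters are on the classpath, two ChatModel beans are created, so each profile should also select one of them; in Spring AI 1.0 the spring.ai.model.chat property (set to openai or anthropic) is intended for this.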

How does Spring AI handle rate limiting and retries?

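Spring AI uses Spring Retry under the hood: model calls go through an auto-configured RetryTemplate. Tune it with the spring.ai.retry.* properties (for example spring.ai.retry.max-attempts and spring.ai.retry.backoff.initial-interval) or define your own RetryTemplate bean. Spring AI does not ship a rate limiter, so implement a custom Advisor that adds delays or queuing. To cap concurrent requests, limit the HTTP client's connection pool.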
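
The delay logic that a custom rate-limiting Advisor would need can be sketched in plain Java. `MinIntervalLimiter` is a hypothetical class written for this illustration, not part of Spring AI, and it enforces only a minimum spacing between requests; a production limiter would more likely track the provider's per-minute request and token quotas.

```java
// Spaces requests at least minIntervalMillis apart (hypothetical helper, not a Spring AI class).
final class MinIntervalLimiter {

    private final long minIntervalMillis;
    private long nextAllowedAt = 0L;  // earliest time at which the next request may start

    MinIntervalLimiter(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("minIntervalMillis must not be negative");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    // Reserves the next free slot and returns how long the caller should wait before sending.
    // The current time is a parameter so the logic can be tested without sleeping.
    synchronized long reserveDelayMillis(long nowMillis) {
        long start = Math.max(nowMillis, nextAllowedAt);
        nextAllowedAt = start + minIntervalMillis;
        return start - nowMillis;
    }
}
```

An advisor built on this sketch would call `reserveDelayMillis(System.currentTimeMillis())`, sleep for the returned duration, and then pass the request on to the rest of the chain. Concurrent callers queue up because each reservation pushes the next free slot further out.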

What vector databases does Spring AI support?

Spring AI supports: PgVector (PostgreSQL), Pinecone, Weaviate, Chroma, Qdrant, Redis, Azure AI Search, MongoDB Atlas, Neo4j, Cassandra and more. All implement the VectorStore interface — swap implementations by changing the dependency and config.

Is Spring AI production-ready?

Spring AI reached 1.0 GA in 2025 and is production-ready. Organisations like Broadcom (VMware) and many Spring ecosystem users have deployed it in production. Monitor the GitHub repo for updates — the AI landscape moves fast and Spring AI releases frequently.