| Feature | Spring AI | LangChain4j |
|---|---|---|
| Backing | Spring team (Broadcom) — official Spring project | Open-source community (LangChain4j org) |
| Spring Integration | Excellent — first-class auto-config, Spring starters | Good — Spring Boot starters available, less seamless |
| LLM Providers | OpenAI, Anthropic, Azure OpenAI, Ollama, Mistral, Google VertexAI, Bedrock, Groq | 20+ including all of the above + Cohere, HuggingFace, Jlama, Qianfan, Zhipu, WatsonX |
| RAG Support | QuestionAnswerAdvisor, VectorStore, ETL pipeline | RetrievalAugmentor, ContentRetriever, re-ranking, hybrid search |
| Vector Stores | PgVector, Redis, Chroma, Pinecone, Qdrant, Weaviate, OpenSearch, Milvus | Same + Cassandra, Vearch, Azure AI Search, more community stores |
| Function Calling | Yes — @Bean function registration, auto-discovery | Yes — @Tool annotation on any method, very ergonomic |
| Streaming | Yes — Flux<ChatResponse> / SSE | Yes — StreamingResponseHandler / Flux |
| Maturity (2026) | 1.0 GA — production ready | 0.36+ — stable, widely used in production |
| Learning Curve | Low for Spring devs | Medium — new concepts (AiServices, pipeline) |
| License | Apache 2.0 | Apache 2.0 |
For years, Python dominated AI and ML development. Libraries like LangChain (Python), LlamaIndex, and Hugging Face Transformers made Python the de facto language for building LLM-powered applications. Java developers watched from the sidelines, stitching together raw HTTP calls to OpenAI or wrapping Python scripts in ProcessBuilder. That era is over.
In 2026, Java has two mature, production-grade frameworks for LLM integration: Spring AI and LangChain4j. Both can do the heavy lifting — chatbots, RAG pipelines, agent loops, embeddings, tool/function calling, and streaming responses. Both support the major LLM providers. The choice between them is now a genuine architectural decision, not a question of capability.
The numbers back this up. LangChain4j crossed 5,000 GitHub stars and 100 contributors in 2025. Spring AI shipped its 1.0 GA release, triggering adoption across thousands of Spring Boot applications already in production. Stack Overflow's 2026 developer survey showed "Java AI" as one of the fastest-growing tag combinations, up 340% year-over-year.
This guide is written for Java developers who already understand Spring Boot, dependency injection, and REST APIs — and want a clear, technical, code-first answer to the question: should I use Spring AI or LangChain4j for my next project? We will look at both frameworks in depth, compare the same tasks side by side, and give you a decision framework for real-world scenarios.
Spring AI is an official Spring project from the Broadcom/Spring team. It follows the same design philosophy as the rest of the Spring ecosystem: convention over configuration, dependency injection, and auto-configuration that just works. If you have ever set up a Spring Data JPA repository or a Spring Security configuration, Spring AI will feel immediately familiar.
At its core, Spring AI provides a portable abstraction layer over LLM providers. You write your application code against the Spring AI interfaces (ChatClient, ChatModel, EmbeddingModel, VectorStore), and switching from OpenAI to Anthropic Claude to a local Ollama instance is a matter of changing Maven dependencies and a few application.properties entries — zero code changes in most cases.
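To make the portability idea concrete, here is a minimal plain-Java sketch of the pattern, not Spring AI itself. The `SimpleChatModel` interface, the `ai.provider` and `ai.model` keys, and the fake providers are all hypothetical stand-ins; the point is only that application code depends on an interface while configuration selects the implementation.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a provider-neutral chat interface (not the Spring AI type).
interface SimpleChatModel {
    String call(String prompt);
}

final class ProviderSwitchSketch {

    // Each entry plays the role of one provider starter on the classpath.
    private static final Map<String, Function<String, SimpleChatModel>> PROVIDERS = Map.of(
            "openai", model -> prompt -> "[openai/" + model + "] " + prompt,
            "ollama", model -> prompt -> "[ollama/" + model + "] " + prompt);

    // Picks an implementation from configuration values, as a properties file would.
    static SimpleChatModel fromConfig(Map<String, String> config) {
        String provider = config.getOrDefault("ai.provider", "openai");
        String model = config.getOrDefault("ai.model", "default");
        Function<String, SimpleChatModel> factory = PROVIDERS.get(provider);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown provider: " + provider);
        }
        return factory.apply(model);
    }

    // Application code depends only on the interface, so it does not change.
    static String summarize(SimpleChatModel model, String text) {
        return model.call("Summarize: " + text);
    }

    public static void main(String[] args) {
        SimpleChatModel cloud = fromConfig(Map.of("ai.provider", "openai", "ai.model", "gpt-4o"));
        SimpleChatModel local = fromConfig(Map.of("ai.provider", "ollama", "ai.model", "llama3"));
        System.out.println(summarize(cloud, "release notes"));
        System.out.println(summarize(local, "release notes"));
    }
}
```

Only the configuration map differs between the two calls; `summarize` is untouched, which is the property the Spring AI abstractions aim for.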
Add the Spring AI OpenAI starter to your pom.xml, set spring.ai.openai.api-key in your properties file, and Spring Boot auto-configures a fully functional ChatClient bean. No factories, no boilerplate HTTP clients, no JSON parsing. This is the same zero-friction experience Spring Boot devs have with databases, messaging, and caching — now extended to AI.
The ChatClient is Spring AI's primary user-facing API since 1.0. It offers a fluent builder pattern for constructing prompts, attaching system messages, enabling advisors (middleware for RAG, logging, safety), and controlling output format:
// Spring AI ChatClient fluent API
String response = chatClient.prompt()
.system("You are a senior Java architect.")
.user("Explain reactive streams in 3 sentences.")
.call()
.content();
Advisors are Spring AI's elegant solution to cross-cutting concerns. They wrap the ChatClient call pipeline, intercepting requests and responses. The built-in QuestionAnswerAdvisor injects retrieved vector store context into every prompt automatically — this is how Spring AI implements RAG without you writing a retrieval loop.
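The interception idea can be illustrated without any framework. The sketch below is an assumption-laden simplification: the `Advisor` interface and `chain` helper are hypothetical names, not the Spring AI types, and the "model" is a plain function that echoes its input.

```java
import java.util.List;
import java.util.function.UnaryOperator;

final class AdvisorChainSketch {

    // Hypothetical advisor contract: may alter the prompt before the call
    // and the answer after it. This is not the Spring AI Advisor interface.
    interface Advisor {
        String around(String prompt, UnaryOperator<String> next);
    }

    // Builds one function in which the first advisor is the outermost wrapper.
    static UnaryOperator<String> chain(List<Advisor> advisors, UnaryOperator<String> model) {
        UnaryOperator<String> call = model;
        for (int i = advisors.size() - 1; i >= 0; i--) {
            Advisor advisor = advisors.get(i);
            UnaryOperator<String> next = call;
            call = prompt -> advisor.around(prompt, next);
        }
        return call;
    }

    public static void main(String[] args) {
        // Request side: prepend retrieved context to the prompt (RAG-style).
        Advisor context = (prompt, next) -> next.apply("Context: pgvector docs\n" + prompt);
        // Response side: post-process the answer.
        Advisor trim = (prompt, next) -> next.apply(prompt).trim();
        // Fake model that echoes what it received, padded with spaces.
        UnaryOperator<String> model = prompt -> "  echo<" + prompt + ">  ";

        System.out.println(chain(List.of(trim, context), model).apply("What is HNSW?"));
    }
}
```

Each advisor decides what to do before and after delegating to `next`, which is why retrieval, logging, and safety checks can be layered without touching the calling code.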
Spring AI 1.0 ships first-class support for: OpenAI (GPT-4o, o3, o4-mini), Anthropic (Claude 3.5/3.7 Sonnet, Claude Opus 4), Google (Gemini 2.0 Flash, Gemini 2.5 Pro), Ollama (any local model), Mistral AI, Amazon Bedrock (Claude, Llama, Titan), Azure OpenAI, and Groq. Each has its own Spring Boot starter artifact.
Full working example: a REST controller that calls Anthropic Claude with a default system prompt and exposes blocking, streaming (Server-Sent Events), and structured-output endpoints:
// pom.xml dependency
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-starter-model-anthropic</artifactId>
<version>1.0.0</version>
</dependency>
// application.properties
spring.ai.anthropic.api-key=${ANTHROPIC_API_KEY}
spring.ai.anthropic.chat.options.model=claude-sonnet-4-5
spring.ai.anthropic.chat.options.max-tokens=1024
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
@RestController
@RequestMapping("/api/ai")
public class AiController {
private final ChatClient chatClient;
public AiController(ChatClient.Builder builder) {
this.chatClient = builder
.defaultSystem("You are a helpful Java expert. " +
"Answer concisely with code examples.")
.build();
}
// Blocking call — simple use case
@GetMapping("/ask")
public String ask(@RequestParam String question) {
return chatClient.prompt()
.user(question)
.call()
.content();
}
// Streaming call — returns Server-Sent Events
@GetMapping(value = "/stream", produces = "text/event-stream")
public Flux<String> stream(@RequestParam String question) {
return chatClient.prompt()
.user(question)
.stream()
.content();
}
// Structured output — map response to a Java record
record CodeReview(String verdict, String explanation, int score) {}
@PostMapping("/review")
public CodeReview review(@RequestBody String code) {
return chatClient.prompt()
.user("Review this Java code and rate it 1-10:\n\n" + code)
.call()
.entity(CodeReview.class); // Spring AI deserializes automatically
}
}
The ChatClient.Builder is auto-configured by Spring Boot. You inject it, set defaults, and your endpoint is live. No API client boilerplate, no serialization code for the structured output case — Spring AI injects the JSON schema into the prompt and deserializes the response automatically.
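As a rough illustration of the "schema in the prompt" step, the sketch below derives format instructions from a record using reflection. It is a simplification under stated assumptions: the real framework generates a proper JSON Schema and parses the reply with a JSON library, whereas `formatInstructions` and `jsonType` here are hypothetical helpers that only show where the field names and types come from.

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

final class SchemaPromptSketch {

    record CodeReview(String verdict, String explanation, int score) {}

    // Maps a Java type to a JSON type name; anything unknown is treated as an object.
    static String jsonType(Class<?> type) {
        if (type == String.class) return "string";
        if (type == int.class || type == long.class || type == double.class) return "number";
        if (type == boolean.class) return "boolean";
        return "object";
    }

    // Builds the format instructions that would be appended to the user prompt.
    static String formatInstructions(Class<? extends Record> recordType) {
        StringJoiner fields = new StringJoiner(", ", "{", "}");
        for (RecordComponent component : recordType.getRecordComponents()) {
            fields.add("\"" + component.getName() + "\": <" + jsonType(component.getType()) + ">");
        }
        return "Respond only with JSON in this shape: " + fields;
    }

    public static void main(String[] args) {
        System.out.println(formatInstructions(CodeReview.class));
    }
}
```

Because the instructions are derived from the target type, adding a field to the record changes the prompt and the expected reply together.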
LangChain4j is a Java port of the Python LangChain library, but it has evolved into its own distinct framework with Java-first idioms. Rather than slavishly copying Python patterns, LangChain4j embraces Java strengths: interfaces, annotations, generics, and compile-time safety. The result is a framework that feels natural to Java developers while providing the full suite of LLM building blocks.
The key design difference from Spring AI: LangChain4j is framework-agnostic by default. You can use it in a plain Java main method, a Quarkus service, a Micronaut app, or — with the Spring Boot starter — a Spring Boot application. This flexibility comes at the cost of slightly more explicit configuration compared to Spring AI's auto-wire-everything approach.
LangChain4j's most celebrated feature is AiServices. You define a Java interface with annotated methods, and LangChain4j generates a proxy at runtime that handles the entire LLM interaction — prompt templating, memory injection, tool invocation, and response parsing — all transparently.
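import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.service.*;
// Define what you want; LangChain4j builds the implementation.
// CodeReview is assumed to be a record like the one in the Spring AI example.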
interface JavaMentor {
@SystemMessage("You are a senior Java architect with 20 years experience.")
String advise(@UserMessage String question);
@SystemMessage("You are a code reviewer. Be strict and concise.")
@UserMessage("Review this code and give a score 1-10: {{code}}")
CodeReview review(@V("code") String javaCode);
// Streaming variant
TokenStream streamAdvice(@UserMessage String question);
}
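// Create the service (model and streamingModel are assumed to be configured elsewhere)
JavaMentor mentor = AiServices.builder(JavaMentor.class)
        .chatLanguageModel(model)
        .streamingChatLanguageModel(streamingModel) // needed for the TokenStream method
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .build();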
// Use it like a regular Java object
String advice = mentor.advise("When should I use virtual threads?");
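For readers curious how an interface can get an implementation at runtime, here is a minimal sketch of the general technique using the JDK's `java.lang.reflect.Proxy`. It is not LangChain4j's code: the `SystemPrompt` annotation, the `create` method, and the echoing "model" are hypothetical, and only single-argument methods are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.util.function.UnaryOperator;

final class AiServiceProxySketch {

    // Hypothetical annotation standing in for a system-message annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SystemPrompt {
        String value();
    }

    interface Mentor {
        @SystemPrompt("You are a senior Java architect.")
        String advise(String question);
    }

    // Creates an implementation of the interface at runtime. Each call is turned
    // into a prompt (system text plus the single argument) and sent to the model.
    static <T> T create(Class<T> api, UnaryOperator<String> model) {
        Object proxy = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] {api},
                (instance, method, methodArgs) -> {
                    if (methodArgs == null || methodArgs.length != 1) {
                        throw new UnsupportedOperationException(method.getName());
                    }
                    SystemPrompt system = method.getAnnotation(SystemPrompt.class);
                    String prefix = system == null ? "" : "[system] " + system.value() + "\n";
                    return model.apply(prefix + "[user] " + methodArgs[0]);
                });
        return api.cast(proxy);
    }

    public static void main(String[] args) {
        UnaryOperator<String> fakeModel = prompt -> "MODEL SAW:\n" + prompt;
        Mentor mentor = create(Mentor.class, fakeModel);
        System.out.println(mentor.advise("When should I use virtual threads?"));
    }
}
```

The caller only ever sees `Mentor`; the prompt assembly lives in one invocation handler, which is what keeps the interface declarative.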
LangChain4j ships MessageWindowChatMemory (keep last N messages) and TokenWindowChatMemory (keep within token budget). Both are injected into AiServices automatically — the framework manages the conversation state and injects prior messages into each new request behind the scenes.
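The "keep the last N messages" policy is easy to picture as a bounded queue. The following is a simplified sketch with a hypothetical class name; it stores plain strings and ignores details such as special handling of system messages.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class WindowMemorySketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones once the window is full.
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // Snapshot of what would accompany the next request, oldest first.
    List<String> messages() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        WindowMemorySketch memory = new WindowMemorySketch(2);
        memory.add("user: hi");
        memory.add("ai: hello");
        memory.add("user: what are virtual threads?");
        System.out.println(memory.messages());
    }
}
```

A token-based window follows the same shape, except that eviction is driven by a running token count rather than the number of messages.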
In LangChain4j, any class method annotated with @Tool becomes available to the LLM as a callable function. The framework handles schema generation, invocation, and result injection automatically:
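import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.agent.tool.P;
import java.util.List;
public class DatabaseTools {
    // OrderRepository, CatalogService, and Order are application classes assumed to exist
    private final OrderRepository orderRepository;
    private final CatalogService catalogService;
    public DatabaseTools(OrderRepository orderRepository, CatalogService catalogService) {
        this.orderRepository = orderRepository;
        this.catalogService = catalogService;
    }
    @Tool("Look up a customer's order history by their ID")
    public List<Order> getOrderHistory(@P("The customer ID") String customerId) {
        return orderRepository.findByCustomerId(customerId);
    }
    @Tool("Get the current price of a product")
    public double getProductPrice(@P("Product SKU") String sku) {
        return catalogService.getPrice(sku);
    }
}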
// Wire tools into AiServices
CustomerAgent agent = AiServices.builder(CustomerAgent.class)
.chatLanguageModel(model)
.tools(new DatabaseTools(orderRepository, catalogService))
.chatMemory(MessageWindowChatMemory.withMaxMessages(20))
.build();
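Behind an annotation-driven tool API there is typically a reflection step: find the annotated methods, describe them to the model, and invoke the one the model names. The sketch below shows that mechanism in plain Java. The `ToolSketch` annotation and registry class are hypothetical, and real frameworks also generate a parameter schema and convert JSON arguments, which is omitted here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

final class ToolRegistrySketch {

    // Hypothetical marker annotation carrying the tool description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ToolSketch {
        String value();
    }

    // Example tool holder with one annotated method.
    public static final class PriceTools {
        @ToolSketch("Get the current price of a product")
        public double getProductPrice(String sku) {
            return "SKU-1".equals(sku) ? 19.99 : 0.0;
        }
    }

    private final Object target;
    private final Map<String, Method> tools = new LinkedHashMap<>();

    // Discovers every annotated method on the target object.
    ToolRegistrySketch(Object target) {
        this.target = target;
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(ToolSketch.class)) {
                tools.put(method.getName(), method);
            }
        }
    }

    // Tool names and descriptions that would be sent to the model.
    Map<String, String> describe() {
        Map<String, String> descriptions = new LinkedHashMap<>();
        tools.forEach((name, method) ->
                descriptions.put(name, method.getAnnotation(ToolSketch.class).value()));
        return descriptions;
    }

    // Executes the tool the model asked for and returns its result.
    Object invoke(String toolName, Object... args) {
        Method method = tools.get(toolName);
        if (method == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        try {
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Tool call failed: " + toolName, e);
        }
    }

    public static void main(String[] args) {
        ToolRegistrySketch registry = new ToolRegistrySketch(new PriceTools());
        System.out.println(registry.describe());
        System.out.println(registry.invoke("getProductPrice", "SKU-1"));
    }
}
```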
LangChain4j provides a rich document processing ecosystem. You can load PDFs, Word docs, HTML pages, GitHub repos, S3 objects, and plain text. Documents are split by DocumentSplitter implementations (recursive character, sentence, token-aware), embedded via any EmbeddingModel, and stored in an EmbeddingStore.
import dev.langchain4j.data.document.*;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
// 1. Load documents
List<Document> docs = FileSystemDocumentLoader.loadDocuments("/docs/api-specs");
// 2. Configure the embedding store (PostgreSQL + pgvector)
EmbeddingStore<TextSegment> store = PgVectorEmbeddingStore.builder()
.host("localhost")
.port(5432)
.database("techoral_ai")
.user("postgres")
.password(System.getenv("DB_PASSWORD"))
.table("api_docs_embeddings")
.dimension(384) // all-MiniLM-L6-v2 output dimension
.build();
// 3. Ingest: split → embed → store
EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
.documentSplitter(DocumentSplitters.recursive(512, 64))
.embeddingModel(new AllMiniLmL6V2EmbeddingModel())
.embeddingStore(store)
.build();
ingestor.ingest(docs);
// 4. Build RAG-enabled AiService
interface ApiAdvisor {
@SystemMessage("You are a helpful API documentation assistant. " +
"Answer based on the provided context only.")
String answer(@UserMessage String question);
}
ApiAdvisor advisor = AiServices.builder(ApiAdvisor.class)
.chatLanguageModel(openAiModel)
.contentRetriever(EmbeddingStoreContentRetriever.builder()
.embeddingStore(store)
.embeddingModel(new AllMiniLmL6V2EmbeddingModel())
.maxResults(5)
.minScore(0.75)
.build())
.build();
// 5. Query
String answer = advisor.answer("How do I authenticate with the payments API?");
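The two numbers passed to the splitter in step 3 are a maximum segment size and an overlap. To show what overlap means, here is a deliberately naive sketch that cuts fixed-size character windows. It is an illustration only: a recursive splitter prefers paragraph and sentence boundaries, which this hypothetical `split` helper does not attempt.

```java
import java.util.ArrayList;
import java.util.List;

final class OverlapChunkerSketch {

    // Splits text into windows of at most maxChars characters, where each window
    // repeats the last `overlap` characters of the previous one.
    static List<String> split(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("Require 0 <= overlap < maxChars");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the final window already reaches the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Small numbers so the overlap is visible: prints [abcd, defg, ghij]
        System.out.println(split("abcdefghij", 4, 1));
    }
}
```

The repeated characters give each chunk some of its neighbor's context, so a sentence that straddles a boundary is still retrievable from at least one segment.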
Task: Call an LLM with a system prompt + user message and return the response as a structured Java object.
// Spring AI — structured output via ChatClient
record ProductDescription(
String name,
String elevator_pitch,
List<String> key_features,
String target_audience
) {}
@Service
public class ProductService {
private final ChatClient chatClient;
public ProductService(ChatClient.Builder builder) {
this.chatClient = builder
.defaultSystem("You are a product marketing expert. " +
"Always respond with valid JSON matching the schema provided.")
.build();
}
public ProductDescription generateDescription(String rawNotes) {
return chatClient.prompt()
.user("Generate a product description from these notes:\n\n" + rawNotes)
.call()
.entity(ProductDescription.class);
// Spring AI automatically injects JSON schema into prompt
// and deserializes the LLM response into ProductDescription
}
}
// LangChain4j — structured output via AiServices
record ProductDescription(
String name,
@Description("One sentence that sells the product")
String elevator_pitch,
List<String> key_features,
String target_audience
) {}
interface ProductMarketer {
@SystemMessage("You are a product marketing expert.")
@UserMessage("""
Generate a product description from these notes:
{{notes}}
""")
ProductDescription generateDescription(@V("notes") String rawNotes);
}
@Service
public class ProductService {
private final ProductMarketer marketer;
public ProductService(ChatLanguageModel model) {
this.marketer = AiServices.create(ProductMarketer.class, model);
// LangChain4j also injects JSON schema automatically
// and deserializes the response
}
public ProductDescription generateDescription(String rawNotes) {
return marketer.generateDescription(rawNotes);
}
}
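The two versions produce the same result but signal intent differently. Spring AI asks for structured output explicitly at the call site (.entity(Class)), while LangChain4j's return type on the interface method carries the intent, which is arguably cleaner for teams that want to keep AI concerns in the interface layer.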
Retrieval-Augmented Generation (RAG) is the most common production AI pattern: ground the LLM's responses in your own documents by retrieving relevant context at query time. Both frameworks support RAG but with different abstractions.
Spring AI uses the QuestionAnswerAdvisor which plugs into the ChatClient pipeline. The ETL (Extract, Transform, Load) pipeline handles document ingestion:
// Spring AI — Document Ingestion (ETL Pipeline)
@Component
public class DocumentIngestionService {
private final VectorStore vectorStore;
private final TokenTextSplitter splitter;
public DocumentIngestionService(VectorStore vectorStore) {
this.vectorStore = vectorStore;
this.splitter = new TokenTextSplitter(512, 64, 5, 10000, true);
}
public void ingest(Resource pdfResource) {
    // Read -> Split with the configured splitter, then add all chunks at once;
    // the VectorStore computes embeddings when documents are added
    vectorStore.add(splitter.apply(new PagePdfDocumentReader(pdfResource).get()));
}
}
// Spring AI — RAG Query with QuestionAnswerAdvisor
@Service
public class KnowledgeBaseService {
private final ChatClient chatClient;
public KnowledgeBaseService(ChatClient.Builder builder, VectorStore vectorStore) {
this.chatClient = builder
.defaultSystem("Answer based on the provided context. " +
"If the context does not contain the answer, say so.")
.defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore)
        .searchRequest(SearchRequest.builder()
                .topK(5)
                .similarityThreshold(0.75)
                .build())
        .build())
.build();
}
public String query(String question) {
return chatClient.prompt()
.user(question)
.call()
.content();
// QuestionAnswerAdvisor intercepts the call,
// runs a vector similarity search,
// injects the top-K chunks into the prompt,
// then forwards to the LLM
}
}
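The steps listed in the comments above (similarity search, top-K selection, prompt injection) can be sketched in plain Java. This is a toy under stated assumptions: the in-memory map stands in for a vector store, the two-dimensional vectors stand in for real embeddings, and `search` and `augment` are hypothetical helper names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopKRetrievalSketch {

    record Scored(String text, double score) {}

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every stored chunk, drops those below the threshold, keeps the best topK.
    static List<Scored> search(Map<String, double[]> store, double[] query, int topK, double threshold) {
        return store.entrySet().stream()
                .map(e -> new Scored(e.getKey(), cosine(e.getValue(), query)))
                .filter(s -> s.score() >= threshold)
                .sorted((x, y) -> Double.compare(y.score(), x.score()))
                .limit(topK)
                .toList();
    }

    // Places the retrieved chunks ahead of the question in the final prompt.
    static String augment(String question, List<Scored> hits) {
        StringBuilder prompt = new StringBuilder("Context:\n");
        hits.forEach(h -> prompt.append("- ").append(h.text()).append('\n'));
        return prompt.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("pgvector supports HNSW indexes", new double[] {1.0, 0.0});
        store.put("Redis is an in-memory store", new double[] {0.0, 1.0});
        store.put("PostgreSQL extensions overview", new double[] {0.8, 0.6});

        List<Scored> hits = search(store, new double[] {1.0, 0.0}, 5, 0.75);
        System.out.println(augment("Which index types does pgvector offer?", hits));
    }
}
```

Raising the threshold trades recall for precision, while top-K caps how much context is spent on retrieved text.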
LangChain4j exposes more RAG internals, allowing you to customize each step of the pipeline — query transformation, retrieval, re-ranking, and content injection:
// LangChain4j — Advanced RAG with RetrievalAugmentor
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.content.retriever.*;
import dev.langchain4j.rag.query.router.DefaultQueryRouter;
import dev.langchain4j.rag.query.transformer.*;
// Query compression transformer: rewrites the question using chat history
QueryTransformer transformer = CompressingQueryTransformer.builder()
        .chatLanguageModel(compressionModel)
        .build();
// Primary retriever: dense vector search
ContentRetriever denseRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(pgVectorStore)
        .embeddingModel(embeddingModel)
        .maxResults(10)
        .minScore(0.70)
        .build();
// Secondary retriever: live web search. This is not keyword/BM25 search;
// googleSearchEngine is a WebSearchEngine assumed to be configured elsewhere
ContentRetriever webRetriever = WebSearchContentRetriever.builder()
        .webSearchEngine(googleSearchEngine)
        .maxResults(3)
        .build();
// Route each query to both retrievers; the default ContentAggregator merges the results
RetrievalAugmentor augmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(transformer)
        .queryRouter(new DefaultQueryRouter(denseRetriever, webRetriever))
        .build();
interface TechAdvisor {
@SystemMessage("You are a senior software architect.")
String advise(@UserMessage String question);
}
TechAdvisor advisor = AiServices.builder(TechAdvisor.class)
.chatLanguageModel(model)
.retrievalAugmentor(augmentor)
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.build();
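When several retrievers answer the same query, their ranked results have to be merged into one list. One common technique for this is Reciprocal Rank Fusion (RRF), sketched below in plain Java. It illustrates the general idea and makes no claim about the exact behavior of any framework's aggregator; the `fuse` helper and the constant `k = 60` are assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RankFusionSketch {

    // Reciprocal Rank Fusion: each list contributes 1 / (k + rank) to an item's
    // score, where rank starts at 1. Items found by several retrievers rise to the top.
    static List<String> fuse(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return scores.entrySet().stream()
                .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        List<String> dense = List.of("A", "B", "C"); // e.g. vector search results
        List<String> web = List.of("B", "D");        // e.g. web search results
        // B appears in both lists, so it ranks first: prints [B, A, D, C]
        System.out.println(fuse(List.of(dense, web), 60));
    }
}
```

RRF uses only ranks, not raw scores, which makes it suitable for combining retrievers whose similarity scores are not comparable.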
| Provider | Spring AI | LangChain4j | Notes |
|---|---|---|---|
| OpenAI (GPT-4o, o3, o4-mini) | Yes | Yes | Both support function calling, vision, streaming |
| Anthropic Claude (Sonnet, Opus) | Yes | Yes | Extended thinking supported in both |
| Ollama (local models) | Yes | Yes | Run Llama 3, Mistral, Phi-3 locally |
| Mistral AI | Yes | Yes | Mistral Large, Mixtral |
| Google Gemini | Yes (VertexAI) | Yes | Gemini 2.0 Flash, 2.5 Pro |
| Azure OpenAI | Yes | Yes | Enterprise with private deployment |
| Amazon Bedrock | Yes | Yes | Claude on Bedrock, Titan, Llama on Bedrock |
| Cohere | No | Yes | Command R+, rerank models |
| Hugging Face | No | Yes | Inference API, 200k+ models |
| Groq | Yes | Yes | Ultra-fast inference |
| IBM WatsonX | No | Yes | Enterprise IBM environments |
| Jlama (JVM-native models) | No | Yes | Run small models in-process on JVM |
This is where Spring AI has a structural advantage. It is built by the Spring team, for the Spring ecosystem, and it shows at every level.
Add one starter, set one property — everything is wired:
<!-- Spring AI: ONE dependency, fully configured -->
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-starter-model-openai</artifactId>
<version>1.0.0</version>
</dependency>
# application.properties
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4o
spring.ai.openai.chat.options.temperature=0.7
spring.ai.openai.chat.options.max-tokens=2048
# That's it. ChatClient, ChatModel, EmbeddingModel beans are ready.
LangChain4j also provides Spring Boot starters, but configuration is slightly more explicit:
<!-- LangChain4j: two dependencies needed for Spring Boot -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-spring-boot-starter</artifactId>
<version>0.36.0</version>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
<version>0.36.0</version>
</dependency>
# application.properties
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o
langchain4j.open-ai.chat-model.temperature=0.7
langchain4j.open-ai.chat-model.max-tokens=2048
# AiServices beans are auto-scanned if you use @AiService annotation (0.36+)
Spring AI integrates natively with Spring Boot Actuator, Micrometer, and OpenTelemetry. You get metrics (spring.ai.chat.observations), traces that propagate through your AI calls, and health endpoints — all with zero extra configuration if you already use the Spring observability stack. LangChain4j requires manual instrumentation or the community observability module to achieve the same.
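For a sense of what manual instrumentation involves, the sketch below wraps a model call in a decorator that counts calls and accumulates latency. It is a bare-bones illustration with hypothetical names; a production setup would publish these values to a metrics library rather than keep them in fields.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

// Decorator that records how often the wrapped model is called and how long calls take.
final class TimedModelSketch implements UnaryOperator<String> {

    private final UnaryOperator<String> delegate;
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    TimedModelSketch(UnaryOperator<String> delegate) {
        this.delegate = delegate;
    }

    @Override
    public String apply(String prompt) {
        long start = System.nanoTime();
        try {
            return delegate.apply(prompt);
        } finally {
            // Recorded for failed calls as well, so errors still show up in the metrics.
            calls.incrementAndGet();
            totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls.get();
    }

    double averageMillis() {
        long count = calls.get();
        return count == 0 ? 0.0 : totalNanos.get() / (double) count / 1_000_000.0;
    }

    public static void main(String[] args) {
        TimedModelSketch timed = new TimedModelSketch(prompt -> "echo: " + prompt);
        timed.apply("first");
        timed.apply("second");
        System.out.println(timed.calls() + " calls, avg " + timed.averageMillis() + " ms");
    }
}
```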
Testing: @SpringBootTest works as usual, including with a mocked ChatClient.
<!-- pom.xml: Spring AI with OpenAI -->
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.3.5</version>
</parent>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-bom</artifactId>
<version>1.0.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
</dependencies>
// src/main/java/.../Application.java
@SpringBootApplication
public class Application {
public static void main(String[] args) {
SpringApplication.run(Application.class, args);
}
}
// src/main/java/.../ChatController.java
@RestController
public class ChatController {
private final ChatClient chatClient;
public ChatController(ChatClient.Builder builder) {
this.chatClient = builder.build();
}
@GetMapping("/chat")
public String chat(@RequestParam String message) {
return chatClient.prompt()
.user(message)
.call()
.content();
}
}
# src/main/resources/application.properties
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4o-mini
<!-- pom.xml — LangChain4j with OpenAI -->
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-spring-boot-starter</artifactId>
<version>0.36.0</version>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
<version>0.36.0</version>
</dependency>
</dependencies>
// Define your AI service interface
@AiService // LangChain4j 0.36+ — auto-creates a Spring bean
interface Assistant {
@SystemMessage("You are a helpful assistant.")
String chat(@UserMessage String message);
}
// Inject and use it like any Spring bean
@RestController
public class ChatController {
private final Assistant assistant;
public ChatController(Assistant assistant) {
this.assistant = assistant;
}
@GetMapping("/chat")
public String chat(@RequestParam String message) {
return assistant.chat(message);
}
}
# src/main/resources/application.properties
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini
Both versions expose the same /chat?message=... endpoint backed by GPT-4o-mini. The difference is architectural: Spring AI's ChatClient vs LangChain4j's @AiService-annotated interface.
Spring AI is production-ready: it reached its 1.0 GA milestone in mid-2025. It ships with stable APIs for chat, embedding, image generation, RAG advisors, and function calling, backed by the full Spring team at Broadcom. The 1.x line follows Spring's standard release cadence with LTS guarantees.
LangChain4j works with Spring Boot: it ships official Spring Boot starter modules (langchain4j-spring-boot-starter and model-specific starters like langchain4j-open-ai-spring-boot-starter). Since version 0.36, the @AiService annotation automatically registers your interface as a Spring bean, making it indistinguishable from hand-written Spring services at the injection site.
LangChain4j supports a wider range of LLM providers out of the box — over 20 integrations including OpenAI, Anthropic, Ollama, Mistral, Cohere, Hugging Face, Azure OpenAI, Google Gemini, and more. Spring AI covers the major providers well but has fewer total integrations. If your deployment requires Cohere, HuggingFace Inference API, IBM WatsonX, or Jlama (in-JVM inference), LangChain4j is currently the only option.
Both are capable for RAG. Spring AI uses an Advisor pattern that integrates cleanly into the ChatClient fluent API — simple, idiomatic, sufficient for most use cases. LangChain4j offers a more explicit pipeline with RetrievalAugmentor, content injectors, query transformers (compression, HyDE), and content retrievers that give you fine-grained control. For complex RAG with re-ranking, hybrid dense+sparse search, or RAPTOR-style recursive summarization, LangChain4j currently provides more hooks.
Using Spring AI and LangChain4j in the same application is technically possible, but it is not recommended. Both frameworks configure beans with similar names (chatModel, embeddingModel, vectorStore) and their auto-configurations can conflict. You would need to carefully qualify beans and disable conflicting auto-configs. Pick one as the primary AI framework per service. In a microservices architecture, different services can use different frameworks without conflict.
The Java AI landscape in 2026 is genuinely mature. Both Spring AI and LangChain4j can power production LLM applications — chatbots, RAG pipelines, AI agents, document processing, and more. The choice comes down to context.
Spring AI is the natural home for Java developers already in the Spring ecosystem. Its zero-configuration auto-wiring, native Actuator/Micrometer/OTel integration, and backing by the Spring team make it the lower-friction choice for teams who want to add AI capabilities to existing Spring Boot services without introducing new paradigms.
LangChain4j is the more flexible, framework-agnostic option with broader LLM provider coverage and a richer RAG pipeline model. The AiServices interface pattern is elegant and scales well as AI complexity grows. If you need Cohere, Hugging Face, advanced RAG control, or you are not on Spring, LangChain4j is the answer.
For teams starting fresh in 2026: default to Spring AI if you are Spring-native, and consider LangChain4j when you hit the edge of what Spring AI offers. Both are Apache-licensed, actively maintained, and worthy of production trust.