Building an AI Chatbot in Java With Langchain4j and MongoDB Atlas
2025-04-24 12:07 UTC by Kostiantyn Ivanov

1. Overview

Chatbot systems enhance the user experience by providing quick and intelligent responses, making interactions more efficient.

In this tutorial, we’ll walk through the process of building a chatbot using Langchain4j and MongoDB Atlas.

LangChain4j is a Java library inspired by LangChain, designed to help build AI-powered applications using LLMs. We use it to develop applications such as chatbots, summarization engines, or intelligent search systems.

We’ll use MongoDB Atlas Vector Search to give our chatbot the ability to retrieve relevant information based on meaning, not just keywords. Traditional keyword-based search methods rely on exact word matches, often leading to irrelevant results when users phrase their questions differently or use synonyms.

By using vector stores and vector search, our app maps both the user’s query and the stored content into a high-dimensional vector space and compares their meaning. This allows the chatbot to understand and respond to complex, natural-language questions with greater contextual accuracy, even when the exact words don’t appear in the source content.
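
To make the idea of comparing meanings concrete, here is a minimal sketch of the kind of similarity computation a vector store performs internally; the vectors and the cosineSimilarity() helper are illustrative and not part of our application:

double cosineSimilarity(float[] a, float[] b) {
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i];    // accumulate the dot product
        normA += a[i] * a[i];  // accumulate the squared norms
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

Embeddings that point in similar directions score close to 1, which is why a rephrased question can still match the right article even without shared keywords.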

2. AI Chatbot Application Architecture

Let’s take a look at our application components:

Chat Bot Components Schema

Our application exposes HTTP endpoints for interacting with the chatbot. It has two flows: a document loading flow and a chatbot flow.

For the document loading flow, we’ll take a dataset of articles. Then, we’ll generate vector embeddings using an embedding model. Finally, we’ll save the embeddings alongside our data in MongoDB. These embeddings represent the semantic content of the articles, enabling efficient similarity search.

For the chatbot flow, we’ll perform a similarity search in our MongoDB instance based on user input to retrieve the most relevant documents. After this, we’ll use the retrieved articles as context for the LLM prompt and generate the chatbot’s response based on the LLM output.
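
Conceptually, the chatbot flow boils down to a few LangChain4j calls. The following sketch is only an illustration; it assumes the embeddingModel, embeddingStore, and chatModel beans we configure in the next sections, and later in the tutorial LangChain4j’s AiServices will perform these steps for us:

String answer(String userQuestion) {
    // 1. Embed the user's question
    Embedding queryEmbedding = embeddingModel.embed(userQuestion).content();
    // 2. Retrieve the most similar article segments from the vector store
    EmbeddingSearchResult<TextSegment> result = embeddingStore.search(EmbeddingSearchRequest.builder()
      .queryEmbedding(queryEmbedding)
      .maxResults(10)
      .build());
    String context = result.matches().stream()
      .map(match -> match.embedded().text())
      .collect(Collectors.joining("\n\n"));
    // 3. Ask the LLM, passing the retrieved segments as context
    return chatModel.generate("Answer using this context:\n" + context + "\n\nQuestion: " + userQuestion);
}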

3. Dependencies and Configuration

Let’s start by adding the spring-boot-starter-web dependency since we’ll build the HTTP API:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>3.3.2</version>
</dependency>

Next, let’s add the langchain4j-mongodb-atlas dependency, which provides interfaces to communicate with the MongoDB vector store and the embedding model:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-mongodb-atlas</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

Finally, we’ll add the langchain4j dependency. This gives us the interfaces we need to interact with both the embedding model and the LLM:

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

For demonstration purposes, we’ll set up a local MongoDB cluster and obtain an OpenAI API key. Then, we can configure the MongoDB URL, database name, and OpenAI API key in the application.properties file:

app.mongodb.url=mongodb://chatbot:password@localhost:27017/admin
app.mongodb.db-name=chatbot_db
app.openai.apiKey=${OPENAI_API_KEY}

Now, let’s create a ChatBotConfiguration class. Here, we’ll define the MongoDB client bean and the beans related to embeddings:

@Configuration
public class ChatBotConfiguration {
    @Value("${app.mongodb.url}")
    private String mongodbUrl;
    @Value("${app.mongodb.db-name}")
    private String databaseName;
    @Value("${app.openai.apiKey}")
    private String apiKey;
    @Bean
    public MongoClient mongoClient() {
        return MongoClients.create(mongodbUrl);
    }
    @Bean
    public EmbeddingStore<TextSegment> embeddingStore(MongoClient mongoClient) {
        String collectionName = "embeddings";
        String indexName = "embedding";
        Long maxResultRatio = 10L;
        CreateCollectionOptions createCollectionOptions = new CreateCollectionOptions();
        Bson filter = null;
        IndexMapping indexMapping = IndexMapping.builder()
          .dimension(TEXT_EMBEDDING_3_SMALL.dimension())
          .metadataFieldNames(new HashSet<>())
          .build();
        Boolean createIndex = true;
        return new MongoDbEmbeddingStore(
          mongoClient,
          databaseName,
          collectionName,
          indexName,
          maxResultRatio,
          createCollectionOptions,
          filter,
          indexMapping,
          createIndex
        );
    }
    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
          .apiKey(apiKey)
          .modelName(TEXT_EMBEDDING_3_SMALL)
          .build();
    }
}

We’ve built the EmbeddingModel bean using OpenAI’s text-embedding-3-small model; of course, we can choose another embedding model that meets our needs. We’ve also created a MongoDbEmbeddingStore bean. The store is backed by a MongoDB Atlas collection where embeddings are saved and indexed for fast semantic retrieval. We set the index dimension to the default text-embedding-3-small dimension; the dimension configured in the IndexMapping must match the dimension of the vectors our embedding model produces.
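
As a quick sanity check, we can embed a sample string and compare the resulting vector length with the dimension configured in the IndexMapping. This snippet is purely illustrative and not part of the tutorial code:

// illustrative check: the produced vector must match the index mapping dimension
Embedding probe = embeddingModel.embed("dimension check").content();
if (probe.vector().length != TEXT_EMBEDDING_3_SMALL.dimension()) {
    throw new IllegalStateException("Embedding dimension doesn't match the vector index mapping");
}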

4. Load Documentation Data to Vector Store

We’ll use a dataset of MongoDB articles as our chatbot’s data. For demonstration purposes, we can manually download the dataset from Hugging Face and save it as an articles.json file in the resources folder.

We want to ingest these articles during application startup by converting them into vector embeddings and storing them in our MongoDB Atlas vector store.

Now, let’s add the property to the application.properties file. We’ll use it to control whether the data load is needed:

app.load-articles=true

4.1. ArticlesRepository

Now, let’s create ArticlesRepository, responsible for reading the dataset, generating embeddings, and storing them:

@Component
public class ArticlesRepository {
    private static final Logger log = LoggerFactory.getLogger(ArticlesRepository.class);
    private final EmbeddingStore<TextSegment> embeddingStore;
    private final EmbeddingModel embeddingModel;
    private final ObjectMapper objectMapper = new ObjectMapper();
    @Autowired
    public ArticlesRepository(@Value("${app.load-articles}") Boolean shouldLoadArticles, 
      EmbeddingStore<TextSegment> embeddingStore, EmbeddingModel embeddingModel) throws IOException {
        this.embeddingStore = embeddingStore;
        this.embeddingModel = embeddingModel;
        
        if (shouldLoadArticles) {
            loadArticles();
        }
    }
}

Here we’ve set up the EmbeddingStore, the embedding model, and a config flag. If app.load-articles is set to true, we trigger the document ingestion on startup. Now, let’s implement the loadArticles() method:

private void loadArticles() throws IOException {
    String resourcePath = "articles.json";
    int maxTokensPerChunk = 8000;
    int overlapTokens = 800;
    List<TextSegment> documents = loadJsonDocuments(resourcePath, maxTokensPerChunk, overlapTokens);
    log.info("Documents to store: " + documents.size());
    for (TextSegment document : documents) {
        Embedding embedding = embeddingModel.embed(document.text()).content();
        embeddingStore.add(embedding, document);
    }
    log.info("Documents are uploaded");
}

Here, we use the loadJsonDocuments() method to load the data stored in the resources folder and create a collection of TextSegment instances. For each TextSegment, we create an embedding and store it in the vector store. The maxTokensPerChunk variable specifies the maximum number of tokens in a single chunk of a vector store document; this value must stay below the embedding model’s maximum input token limit. Also, we use overlapTokens to indicate how many tokens may overlap between adjacent text segments. This helps us preserve the context between segments.

4.2. loadJsonDocuments() Implementation

Next, let’s introduce the loadJsonDocuments() method, which reads raw JSON articles, parses them into LangChain4j Document objects, and prepares them for embedding:

private List<TextSegment> loadJsonDocuments(String resourcePath, int maxTokensPerChunk, int overlapTokens) throws IOException {
    InputStream inputStream = ArticlesRepository.class.getClassLoader().getResourceAsStream(resourcePath);
    if (inputStream == null) {
        throw new FileNotFoundException("Resource not found: " + resourcePath);
    }
    int batchSize = 500;
    List<Document> batch = new ArrayList<>();
    List<TextSegment> textSegments = new ArrayList<>();
    // try-with-resources makes sure the reader is closed once the file has been processed
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream))) {
        String line;
        while ((line = reader.readLine()) != null) {
            JsonNode jsonNode = objectMapper.readTree(line);
            String title = jsonNode.path("title").asText(null);
            String body = jsonNode.path("body").asText(null);
            JsonNode metadataNode = jsonNode.path("metadata");
            if (body != null) {
                addDocumentToBatch(title, body, metadataNode, batch);
                if (batch.size() >= batchSize) {
                    textSegments.addAll(splitIntoChunks(batch, maxTokensPerChunk, overlapTokens));
                    batch.clear();
                }
            }
        }
    }
    if (!batch.isEmpty()) {
        textSegments.addAll(splitIntoChunks(batch, maxTokensPerChunk, overlapTokens));
    }
    return textSegments;
}

Here, we read the JSON file line by line and parse each entry. Then, we add the article title, body, and metadata as a Document to the batch. Once the batch reaches 500 entries (or at the end of the file), we process it using splitIntoChunks() to break the content into manageable text segments. The method returns a complete list of TextSegment objects, ready for embedding and storage.

Let’s implement the addDocumentToBatch() method:

private void addDocumentToBatch(String title, String body, JsonNode metadataNode, List<Document> batch) {
    String text = (title != null ? title + "\n\n" + body : body);
    Metadata metadata = new Metadata();
    if (metadataNode != null && metadataNode.isObject()) {
        Iterator<String> fieldNames = metadataNode.fieldNames();
        while (fieldNames.hasNext()) {
            String fieldName = fieldNames.next();
            metadata.put(fieldName, metadataNode.path(fieldName).asText());
        }
    }
    Document document = Document.from(text, metadata);
    batch.add(document);
}

The article’s title and body are concatenated into a single text block. If we have metadata present, we parse the fields and add them to a Metadata object. The combined text and metadata are wrapped in a Document object, which we add to the current batch.

4.3. splitIntoChunks() Implementation and Getting the Upload Results

Once we’ve assembled our articles as Document objects, with both content and metadata, the next step is to split them into smaller, token-aware chunks that are compatible with our embedding model’s limits. Let’s see what splitIntoChunks() looks like:

private List<TextSegment> splitIntoChunks(List<Document> documents, int maxTokensPerChunk, int overlapTokens) {
    OpenAiTokenizer tokenizer = new OpenAiTokenizer(OpenAiEmbeddingModelName.TEXT_EMBEDDING_3_SMALL);
    DocumentSplitter splitter = DocumentSplitters.recursive(
            maxTokensPerChunk,
            overlapTokens,
            tokenizer
    );
    List<TextSegment> allSegments = new ArrayList<>();
    for (Document document : documents) {
        List<TextSegment> segments = splitter.split(document);
        allSegments.addAll(segments);
    }
    return allSegments;
}

First, we initialize a tokenizer compatible with OpenAI’s text-embedding-3-small model. Then, we use a DocumentSplitter to split the documents into chunks while preserving the overlap between adjacent chunks. Each Document is processed and split into several TextSegment instances, which are then returned for embedding and storage in our vector store (MongoDB). During application startup, we should see the following logs:

Documents to store: 328

Documents are uploaded

Also, if we review what was stored in MongoDB using MongoDB Compass, we’ll see all the document content with the generated embeddings:

MongoDB loaded articles embeddings

This process is important because most embedding models have token limits, meaning only a certain amount of text can be embedded into a vector at once. Chunking lets us stay within these limits, while the overlap helps maintain continuity between segments, which is especially important for paragraph-based content.
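
To see why a particular article needs chunking, we can estimate its token count with the same tokenizer the splitter uses. This is just a sketch, with articleBody standing in for an arbitrary article’s text:

// estimates tokens locally, without calling the OpenAI API
OpenAiTokenizer tokenizer = new OpenAiTokenizer(OpenAiEmbeddingModelName.TEXT_EMBEDDING_3_SMALL);
int tokenCount = tokenizer.estimateTokenCountInText(articleBody);
// anything above maxTokensPerChunk (8,000 in our setup) is split into several overlapping segments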

We’ve used only a part of the entire dataset for our demo purposes. Uploading the entire dataset may take some time and require more credits.

5. Chatbot API

Now, let’s implement the chatbot API flow (our chatbot interface). Here, we’ll create a few beans that retrieve documents from the vector store and communicate with the LLM to produce context-aware responses. Finally, we’ll build the chatbot API and verify how it works.

5.1. ArticleBasedAssistant Implementation

We’ll start by creating a ContentRetriever bean that performs a vector search in MongoDB Atlas using the user’s input:

@Bean
public ContentRetriever contentRetriever(EmbeddingStore<TextSegment> embeddingStore, EmbeddingModel embeddingModel) {
    return EmbeddingStoreContentRetriever.builder()
      .embeddingStore(embeddingStore)
      .embeddingModel(embeddingModel)
      .maxResults(10)
      .minScore(0.8)
      .build();
}

This retriever uses the embedding model to encode the user’s query and compares it against the stored article embeddings. We also specified the maximum number of results to return and the minimum score, which controls how closely a retrieved document must match the query.
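
Before wiring the retriever into the assistant, we can query it directly to get a feel for what it returns; the sample question and logging below are purely illustrative:

// illustrative only: inspect the segments retrieved for a sample question
List<Content> matches = contentRetriever.retrieve(Query.from("How do I connect Spring Boot to MongoDB?"));
matches.forEach(content -> log.info(content.textSegment().text()));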

Next, let’s create a ChatLanguageModel bean that will generate responses based on the retrieved content:

@Bean
public ChatLanguageModel chatModel() {
    return OpenAiChatModel.builder()
      .apiKey(apiKey)
      .modelName("gpt-4o-mini")
      .build();
}

In this bean, we’ve used the gpt-4o-mini model, but we can always choose another model that better matches our requirements.

Now, we’ll create an ArticleBasedAssistant interface. Here, we’ll define the answer() method, which takes a text request and returns a text response:

public interface ArticleBasedAssistant {
    String answer(String question);
}

LangChain4j dynamically implements this interface by combining the configured language model and content retriever. Next, let’s create a bean for our assistant interface:

@Bean
public ArticleBasedAssistant articleBasedAssistant(ChatLanguageModel chatModel, ContentRetriever contentRetriever) {
    return AiServices.builder(ArticleBasedAssistant.class)
      .chatLanguageModel(chatModel)
      .contentRetriever(contentRetriever)
      .build();
}

This setup means we can now call assistant.answer(“…”), and under the hood, the query is embedded, and relevant documents are fetched from the vector store. These documents are used as context in the LLM prompt, and a natural language answer is generated and returned.

5.2. ChatBotController Implementation and Testing the Results

Finally, let’s create ChatBotController, which maps a GET request to the chatbot logic:

@RestController
public class ChatBotController {
    private final ArticleBasedAssistant assistant;
    @Autowired
    public ChatBotController(ArticleBasedAssistant assistant) {
        this.assistant = assistant;
    }
    @GetMapping("/chat-bot")
    public String answer(@RequestParam("question") String question) {
        return assistant.answer(question);
    }
}

Here, we’ve implemented the chatbot endpoint, integrating it with the ArticleBasedAssistant. This endpoint accepts a user query via the question request parameter, delegates it to the ArticleBasedAssistant, and returns the generated response as plain text.

Let’s call our chatbot API and see what it responds with:

@AutoConfigureMockMvc
@SpringBootTest(classes = {ChatBotConfiguration.class, ArticlesRepository.class, ChatBotController.class})
class ChatBotLiveTest {
    Logger log = LoggerFactory.getLogger(ChatBotLiveTest.class);
    @Autowired
    private MockMvc mockMvc;
    @Test
    void givenChatBotApi_whenCallingGetEndpointWithQuestion_thenExpectedAnswersIsPresent() throws Exception {
        String chatResponse = mockMvc
          .perform(get("/chat-bot")
            .param("question", "Steps to implement Spring boot app and MongoDB"))
          .andReturn()
          .getResponse()
          .getContentAsString();
        log.info(chatResponse);
        Assertions.assertTrue(chatResponse.contains("Step 1"));
    }
}

In our test, we called the chatbot endpoint and asked it for the steps to create a Spring Boot application with MongoDB integration. We then logged the response and verified that it contains the expected content. The complete response is also visible in the logs:

To implement a MongoDB Spring Boot Java Book Tracker application, follow these steps. This guide will help you set up a simple CRUD application to manage books, where you can add, edit, and delete book records stored in a MongoDB database.
### Step 1: Set Up Your Environment
1. **Install Java Development Kit (JDK)**:
   Make sure you have JDK (Java Development Kit) installed on your machine. You can download it from the [Oracle website](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html) or use OpenJDK.
2. **Install MongoDB**:
   Download and install MongoDB from the [MongoDB official website](https://www.mongodb.com/try/download/community). Follow the installation instructions specific to your operating system.
//shortened

6. Conclusion

In this article, we implemented a chatbot web application using LangChain4j and MongoDB Atlas. Using our application, we can interact with the chatbot to get information from the downloaded articles. For further improvements, we could add query preprocessing and handle ambiguous queries. Besides that, we can easily extend the datasets our chatbot bases its answers on.

As always, the code is available over on GitHub.
