Published: 2026-06-01 • Updated: 2026-06-20

Integrating AWS Bedrock and Amazon SageMaker with Spring Boot

As enterprise Java applications evolve, integrating artificial intelligence and machine learning directly into business logic has become a common requirement. AWS provides two ecosystems for this: AWS Bedrock (officially Amazon Bedrock), which gives serverless access to foundation models such as Anthropic Claude, Meta Llama, and Amazon Titan, and Amazon SageMaker, which trains, deploys, and hosts custom-built machine learning models. This guide walks you through integrating both services into a production-grade Spring Boot application using the AWS SDK for Java v2.

If you have not yet provisioned the cloud infrastructure or IAM roles needed to host these runtimes, review our end-to-end infrastructure automation playbook: Provisioning AWS AI Infrastructure with Terraform.

Understanding the Architecture

Before writing code, it is crucial to understand when to use AWS Bedrock versus Amazon SageMaker. Bedrock is ideal for Generative AI tasks where you want to leverage pre-trained, state-of-the-art foundation models via a simple API. SageMaker is designed for traditional machine learning (like classification, regression, and clustering) or custom deep learning models where you manage the underlying hosting endpoints.

+-------------------------------------------------------------+
|                     Spring Boot Application                 |
+------------------------------+------------------------------+
                               |
                +--------------+--------------+
                |                             |
                v (Serverless API)            v (Custom Endpoint)
        +---------------+             +------------------+
        |  AWS Bedrock  |             | Amazon SageMaker |
        |  (Claude,     |             | (Custom XGBoost, |
        |  Llama, etc.) |             | PyTorch, etc.)   |
        +---------------+             +------------------+
        

To evaluate how these two integration paths fit into a microservices architecture before building concrete service endpoints, check out Designing AI-Driven Microservices Architectures.

Prerequisites and Dependencies

To interact with AWS Bedrock and SageMaker, your Spring Boot application requires the AWS SDK for Java v2. Add the following dependencies to your Maven pom.xml file. We manage the SDK versions using the AWS Bill of Materials (BOM) to ensure compatibility.

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>bom</artifactId>
            <version>2.25.15</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <!-- AWS SDK for Bedrock Runtime -->
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bedrockruntime</artifactId>
    </dependency>
    <!-- AWS SDK for SageMaker Runtime -->
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>sagemakerruntime</artifactId>
    </dependency>
    <!-- Spring Boot Starter Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>

For a detailed walkthrough of setting up your local workstation and compiler configuration to use these libraries, see Setting up Java Development Environment for AI.

Integrating AWS Bedrock in Spring Boot

AWS Bedrock allows you to execute inference on foundation models. In this example, we will configure the BedrockRuntimeClient and invoke the Anthropic Claude 3 Sonnet model using a JSON payload.

1. Configuration Class

We define a configuration class to initialize the BedrockRuntimeClient bean. It automatically resolves credentials from the default AWS credential provider chain (IAM roles, environment variables, or local AWS profiles).

package com.example.ai.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

@Configuration
public class AwsBedrockConfig {

    @Bean
    public BedrockRuntimeClient bedrockRuntimeClient() {
        return BedrockRuntimeClient.builder()
                .region(Region.US_EAST_1)
                .build();
    }
}
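
The bean above relies on the SDK's default timeouts. A variant with explicit limits might look like the following minimal sketch. The class name AwsBedrockTimeoutConfig and all duration values are assumptions chosen for illustration, and the sketch assumes the software.amazon.awssdk:apache-client module is on the compile classpath (its version is managed by the same BOM). Register this class instead of AwsBedrockConfig, not alongside it, so that only one BedrockRuntimeClient bean exists.

```java
package com.example.ai.config;

import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.http.apache.ApacheHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Hypothetical variant of AwsBedrockConfig; durations are illustrative, not recommendations.
@Configuration
public class AwsBedrockTimeoutConfig {

    @Bean
    public BedrockRuntimeClient bedrockRuntimeClient() {
        return BedrockRuntimeClient.builder()
                .region(Region.US_EAST_1)
                // Transport-level limits: TCP connect time and the wait for data on the socket.
                .httpClientBuilder(ApacheHttpClient.builder()
                        .connectionTimeout(Duration.ofSeconds(5))
                        .socketTimeout(Duration.ofSeconds(60)))
                // SDK-level limits: a single attempt, and the whole call including retries.
                .overrideConfiguration(ClientOverrideConfiguration.builder()
                        .apiCallAttemptTimeout(Duration.ofSeconds(60))
                        .apiCallTimeout(Duration.ofSeconds(120))
                        .build())
                .build();
    }
}
```

The socket and attempt timeouts are set generously here because text generation can legitimately take many seconds; tune them against the latency you actually observe for your model and prompt sizes.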

2. Bedrock Service Implementation

Next, we create a service that constructs the required JSON payload for the target model, invokes the Bedrock API, and parses the response. Note that different foundation models expect different JSON structures.

package com.example.ai.service;

import org.springframework.stereotype.Service;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelRequest;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelResponse;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

@Service
public class BedrockService {

    private final BedrockRuntimeClient bedrockClient;
    private final ObjectMapper objectMapper;

    public BedrockService(BedrockRuntimeClient bedrockClient, ObjectMapper objectMapper) {
        this.bedrockClient = bedrockClient;
        this.objectMapper = objectMapper;
    }

    public String generateText(String prompt) {
        try {
            // Request body in the Anthropic Messages format; other model families use different fields.
            ObjectNode rootNode = objectMapper.createObjectNode();
            rootNode.put("anthropic_version", "bedrock-2023-05-31");
            rootNode.put("max_tokens", 1000);
            
            var messagesArray = rootNode.putArray("messages");
            var messageObj = objectMapper.createObjectNode();
            messageObj.put("role", "user");
            
            var contentArray = messageObj.putArray("content");
            var contentObj = objectMapper.createObjectNode();
            contentObj.put("type", "text");
            contentObj.put("text", prompt);
            contentArray.add(contentObj);
            messagesArray.add(messageObj);

            String jsonPayload = objectMapper.writeValueAsString(rootNode);

            InvokeModelRequest request = InvokeModelRequest.builder()
                    .modelId("anthropic.claude-3-sonnet-20240229-v1:0")
                    .contentType("application/json")
                    .accept("application/json")
                    .body(SdkBytes.fromUtf8String(jsonPayload))
                    .build();

            InvokeModelResponse response = bedrockClient.invokeModel(request);
            String responseBody = response.body().asUtf8String();

            // Claude returns its generated text under content[0].text in the response JSON.
            var responseNode = objectMapper.readTree(responseBody);
            return responseNode.get("content").get(0).get("text").asText();

        } catch (Exception e) {
            throw new RuntimeException("Failed to invoke AWS Bedrock", e);
        }
    }
}
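
To make the point about model-specific JSON structures concrete, here is a small framework-free sketch of the request bodies for three model families. The class BedrockPayloads and its method names are hypothetical, and the hand-written escape helper only stands in for the ObjectMapper used in the service above so that the example runs on a plain JDK. It also assumes model IDs that start with the provider name; IDs with a regional prefix would need extra handling.

```java
/** Sketch: request-body shapes for three model families on Bedrock. */
final class BedrockPayloads {

    private BedrockPayloads() {}

    /** Anthropic Claude (Messages format), the same shape BedrockService builds. */
    static String claude(String prompt, int maxTokens) {
        return "{\"anthropic_version\":\"bedrock-2023-05-31\","
                + "\"max_tokens\":" + maxTokens + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":"
                + "[{\"type\":\"text\",\"text\":\"" + escape(prompt) + "\"}]}]}";
    }

    /** Meta Llama: a flat prompt string plus max_gen_len. */
    static String llama(String prompt, int maxTokens) {
        return "{\"prompt\":\"" + escape(prompt) + "\",\"max_gen_len\":" + maxTokens + "}";
    }

    /** Amazon Titan Text: inputText plus a nested textGenerationConfig. */
    static String titan(String prompt, int maxTokens) {
        return "{\"inputText\":\"" + escape(prompt) + "\","
                + "\"textGenerationConfig\":{\"maxTokenCount\":" + maxTokens + "}}";
    }

    /** Picks a builder from the provider prefix of the model ID. */
    static String forModel(String modelId, String prompt, int maxTokens) {
        if (modelId.startsWith("anthropic.")) return claude(prompt, maxTokens);
        if (modelId.startsWith("meta.")) return llama(prompt, maxTokens);
        if (modelId.startsWith("amazon.titan-text")) return titan(prompt, maxTokens);
        throw new IllegalArgumentException("No payload builder for model: " + modelId);
    }

    /** Minimal JSON string escaping; in real code let Jackson do this. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

The response side differs in the same way: Claude returns text under content[0].text, Llama under generation, and Titan Text under results[0].outputText, so the parsing line in generateText has to change together with the payload.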

To see how to expose this generative capability through REST controller endpoints, read Building AI-Powered Spring Boot REST APIs. If you want to test your model integrations locally before moving them to the cloud, review our local playground module: Integrating OpenAI, HuggingFace, and Local LLMs via Ollama.

Integrating Amazon SageMaker in Spring Boot

If you have a custom-trained model (e.g., a fraud detection model built with XGBoost or TensorFlow) deployed on an active SageMaker endpoint, you can query it using the SageMakerRuntimeClient.

1. Configuration Class

Define a configuration bean for the SageMaker Runtime client.

package com.example.ai.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sagemakerruntime.SageMakerRuntimeClient;

@Configuration
public class AwsSageMakerConfig {

    @Bean
    public SageMakerRuntimeClient sageMakerRuntimeClient() {
        return SageMakerRuntimeClient.builder()
                .region(Region.US_EAST_1)
                .build();
    }
}

2. SageMaker Service Implementation

This service sends a CSV payload to a hosted SageMaker endpoint and returns the raw prediction output. For a model that expects JSON, change the contentType and accept values to application/json.

package com.example.ai.service;

import org.springframework.stereotype.Service;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.sagemakerruntime.SageMakerRuntimeClient;
import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointRequest;
import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointResponse;

@Service
public class SageMakerService {

    private final SageMakerRuntimeClient sageMakerClient;

    public SageMakerService(SageMakerRuntimeClient sageMakerClient) {
        this.sageMakerClient = sageMakerClient;
    }

    public String predict(String endpointName, String csvData) {
        try {
            InvokeEndpointRequest request = InvokeEndpointRequest.builder()
                    .endpointName(endpointName)
                    .contentType("text/csv")
                    .accept("text/csv")
                    .body(SdkBytes.fromUtf8String(csvData))
                    .build();

            InvokeEndpointResponse response = sageMakerClient.invokeEndpoint(request);
            return response.body().asUtf8String();
        } catch (Exception e) {
            throw new RuntimeException("Failed to invoke SageMaker endpoint", e);
        }
    }
}
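
The predict method above works with raw strings on both sides. The following sketch shows one way to build the CSV row and read numeric scores back. The class CsvInference is hypothetical, and it assumes the endpoint returns plain numbers separated by commas or newlines, which depends entirely on how your model container formats its output.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Sketch: helpers around SageMakerService.predict for a text/csv endpoint. */
final class CsvInference {

    private CsvInference() {}

    /** Joins feature values into one CSV row, in the column order the model was trained on. */
    static String toCsvRow(double... features) {
        return Arrays.stream(features)
                .mapToObj(Double::toString)
                .collect(Collectors.joining(","));
    }

    /** Parses a response body of comma- or newline-separated numbers into scores. */
    static double[] parseScores(String responseBody) {
        return Arrays.stream(responseBody.trim().split("[,\\s]+"))
                .filter(token -> !token.isEmpty())
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}
```

A caller would pass the result of toCsvRow as the csvData argument of predict and hand the returned string to parseScores. Keeping the feature order in one helper reduces the risk of sending columns in an order the model was not trained on.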

Abstracting Integrations with the Spring AI Ecosystem

While the direct AWS SDK v2 clients give you fine-grained, low-level control over client configuration and request payloads, you can also work at a higher level. Spring AI provides abstractions that shield your application code from vendor-specific request formats and client setup.

To learn how to call model endpoints through these vendor-neutral abstractions, read our overview on Introduction to the Spring AI Framework. If you want to use these components to implement a production-grade multi-turn chat environment, read our guide on Managing Chat Memory and Conversational Context in Spring Boot.

Context Enrichment and Vector Data Pathways

In production applications, calling a base foundation model like Claude on its own often produces answers that lack enterprise context, because the model has never seen your internal data. To resolve this, modern microservice designs rely on Retrieval-Augmented Generation (RAG) pipelines, which retrieve the stored documents most similar to the user's question from a vector store and add them to the prompt.
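
As a rough illustration of the retrieval step, the sketch below ranks stored text chunks by cosine similarity to a query embedding and prepends the best matches to the prompt. Everything here is an assumption for illustration: the class RagSketch, the in-memory list standing in for a vector database, and the tiny hand-written vectors standing in for real embeddings. It assumes Java 17 and non-zero vectors of equal length.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: in-memory retrieval and prompt augmentation for a RAG pipeline. */
final class RagSketch {

    /** A stored text chunk with its precomputed embedding. */
    record Chunk(String text, double[] embedding) {}

    private RagSketch() {}

    /** Cosine similarity: dot product divided by the product of the vector lengths. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks most similar to the query, best match first. */
    static List<Chunk> topK(double[] query, List<Chunk> chunks, int k) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> cosine(query, c.embedding())).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    /** Builds the final prompt: retrieved context first, then the user's question. */
    static String augment(String question, List<Chunk> context) {
        String joined = context.stream()
                .map(Chunk::text)
                .collect(Collectors.joining("\n- ", "- ", ""));
        return "Answer using only the context below.\n\nContext:\n" + joined
                + "\n\nQuestion: " + question;
    }
}
```

The string returned by augment is what you would pass to a method such as generateText. A real pipeline replaces the linear scan with a vector database query and produces the embeddings with an embedding model.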

To understand how to convert raw database content into mathematical vectors, see Understanding Vector Databases and Embeddings in Java. Once you have structured your vector database layers, implement clean contextual workflows by reading Implementing RAG with Spring AI.

Containerization, Cluster Deployment, and Scaling

Once your Spring Boot application can interact with AWS AI runtimes, you need to package and deploy it to an enterprise-grade execution platform like Kubernetes.

Learn how to package your application into small, optimized images by exploring Containerizing AI-Enabled Java Applications with Docker. To launch these containers into a production-ready cloud architecture, review Deploying AI Java Microservices to Kubernetes.

If your workload needs to process inference requests asynchronously through a message broker, configure your consumers according to Asynchronous AI Processing with Spring Boot and Kafka.

For applications deployed specifically within managed AWS environments, follow our deployment playbook at Deploying Java AI Microservices on AWS EKS. If your pods run inference locally on model weights loaded onto their own hardware rather than calling serverless endpoints, monitor and tune your resource profiles using Kubernetes Scaling and GPU Resources for AI Workloads.

Enterprise Security, Observability, and Cost Control

Exposing API routes to AI models introduces unique vulnerabilities, such as prompt injection and other inputs crafted to manipulate model behavior. To protect your inputs and data channels, apply the safeguards detailed in Securing AI APIs, Prompts, and Data Pipelines in Spring Boot.

Additionally, production-grade systems require deep metrics tracking to monitor performance and cost. To implement comprehensive tracing and metrics dashboarding across your cluster workloads, follow our setup guide: Observability Strategies for AI Apps via Prometheus and Grafana.

Finally, to optimize your cloud spend, ensure your Java runtimes are lean and highly efficient. You can drastically reduce container memory overhead and boot times by using GraalVM. Learn how by checking out Optimizing Java AI Applications: GraalVM Native Images & Cost Management.

Real-World Use Cases

  • Intelligent Customer Support (AWS Bedrock): Integrating Claude or Llama into a Spring Boot customer portal to answer complex policy questions by passing dynamic context retrieved from databases.
  • Real-Time Transaction Fraud Detection (Amazon SageMaker): Passing transaction attributes (amount, location, velocity) from a Spring Boot microservice to a custom SageMaker XGBoost model endpoint to get a fraud probability score before authorizing payments.
  • Document Summarization Pipelines: Using Spring Integration to listen to an S3 bucket, trigger a Bedrock summarization model when a new document is uploaded, and save the summary to a relational database.

Common Mistakes to Avoid

  • Blocking the Main Thread: Large language models (LLMs) on AWS Bedrock can take several seconds to generate a complete response. Do not block your HTTP servlet threads for that long. Offload the call with Spring's asynchronous support (@Async), or use reactive programming (Spring WebFlux) with Bedrock's streaming API (InvokeModelWithResponseStream), which the Java SDK exposes on the asynchronous client, BedrockRuntimeAsyncClient.
  • Hardcoding AWS Credentials: Never hardcode AWS access keys and secret keys in application.properties. Always use the default credential provider chain, which picks up temporary credentials from EC2 instance profiles, ECS task roles, or IAM roles bound to Kubernetes service accounts.
  • Ignoring Payload Size Limits: SageMaker and Bedrock enforce limits on request payload size. For very large inputs, upload the data to an S3 bucket first and use an invocation mode that reads from S3, such as SageMaker asynchronous inference or batch transform, or Bedrock batch inference, rather than sending raw bytes in a synchronous HTTP request.
  • Omitting Timeout Configurations: Network glitches can cause requests to hang. Always configure explicit HTTP timeouts on the AWS SDK client builders to prevent thread starvation in your Spring Boot container.
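
To illustrate the first item in the list, here is a framework-free sketch of what offloading a blocking inference call looks like: the call runs on a small dedicated pool and the caller gets a future with a deadline. The class InferenceOffloader is hypothetical, and the Function passed to it stands in for a blocking method such as BedrockService.generateText. The pool size and timeout are values you would choose for your own workload.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Sketch: run a blocking inference call off the request thread, with a deadline. */
final class InferenceOffloader implements AutoCloseable {

    private final ExecutorService pool;
    private final Function<String, String> blockingCall;
    private final long timeoutMillis;

    InferenceOffloader(Function<String, String> blockingCall, int threads, long timeoutMillis) {
        this.blockingCall = blockingCall;
        // A bounded pool, separate from the servlet threads, caps concurrent model calls.
        this.pool = Executors.newFixedThreadPool(threads);
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns immediately; the future fails with TimeoutException if the call overruns. */
    CompletableFuture<String> generate(String prompt) {
        return CompletableFuture
                .supplyAsync(() -> blockingCall.apply(prompt), pool)
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the pool and interrupts any calls still running. */
    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

Spring MVC can return a CompletableFuture from a controller method and will release the servlet thread while it completes. Note that orTimeout only fails the future; it does not cancel the underlying call, which is one more reason to configure timeouts on the SDK client itself.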

Interview Preparation Notes

  • What is the difference between Bedrock and SageMaker? Bedrock is a fully managed serverless service for consuming foundation models via APIs. SageMaker is an end-to-end platform for building, training, tuning, and deploying custom models on managed infrastructure.
  • How do you handle rate-limiting and throttling in Bedrock/SageMaker? Implement exponential backoff and retries. The AWS SDK for Java has built-in retry strategies that can be configured when building the client.
  • How do you secure data in transit when calling these services? All communication with AWS Bedrock and SageMaker is encrypted in transit using HTTPS/TLS. IAM policies should be configured using the principle of least privilege to restrict access to specific models and endpoints.
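
For the throttling question, it helps to be able to sketch the backoff logic itself. The example below is a plain-JDK illustration of exponential backoff with full jitter, not the SDK's internal implementation. The class Backoff, its method names, and the retryable predicate are hypothetical; in a real service the predicate would test for the SDK exception that signals throttling.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Sketch: exponential backoff with full jitter around a retryable call. */
final class Backoff {

    private Backoff() {}

    /** Upper bound for the wait after failed attempt number `attempt` (0-based): min(cap, base * 2^attempt). */
    static long ceilingMillis(long baseMillis, long capMillis, int attempt) {
        long exponential = baseMillis << Math.min(attempt, 20); // clamp the shift to limit growth
        return Math.min(capMillis, exponential);
    }

    /** Full jitter: a uniformly random wait between 0 and the ceiling, inclusive. */
    static long jitteredMillis(long baseMillis, long capMillis, int attempt) {
        return ThreadLocalRandom.current().nextLong(ceilingMillis(baseMillis, capMillis, attempt) + 1);
    }

    /** Calls `action` up to maxAttempts times, sleeping between attempts while `retryable` accepts the failure. */
    static <T> T retry(Supplier<T> action, Predicate<RuntimeException> retryable,
                       int maxAttempts, long baseMillis, long capMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Give up when attempts are exhausted or the error is not worth retrying.
                if (attempt + 1 >= maxAttempts || !retryable.test(e)) throw e;
                try {
                    Thread.sleep(jitteredMillis(baseMillis, capMillis, attempt));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

The jitter matters as much as the exponent: without it, many clients that were throttled at the same moment retry at the same moment again. In production code, prefer the retry configuration built into the SDK client and keep a hand-written loop like this for cases the SDK does not cover.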

Summary

Integrating machine learning into Spring Boot applications is highly streamlined using the AWS SDK v2. AWS Bedrock provides a quick, serverless route to integrating state-of-the-art Generative AI models, while Amazon SageMaker allows you to harness custom-trained models for domain-specific predictions. By configuring robust clients, handling payloads asynchronously, and adhering to AWS security best practices, you can build highly scalable, production-grade AI pipelines.

For further reading on securing your microservices before exposing these AI endpoints, refer to our comprehensive security guide: Securing AI APIs, Prompts, and Data Pipelines in Spring Boot.

About the Author

Naresh Kumar

Senior Java Backend Engineer experienced in Banking, Payments, ISO 20022, Spring Boot, Microservices, Kafka, Docker, Kubernetes, AWS and Cloud Native Systems.

Built enterprise payment solutions, transaction processing systems, API platforms and scalable microservices used in production.
