AWS Bedrock Converse chat model integration.

Setup: Install @langchain/aws and set the following environment variables:

npm install @langchain/aws
export BEDROCK_AWS_REGION="your-aws-region"
export BEDROCK_AWS_SECRET_ACCESS_KEY="your-aws-secret-access-key"
export BEDROCK_AWS_ACCESS_KEY_ID="your-aws-access-key-id"

Runtime args can be passed as the second argument to any of the base runnable methods .invoke, .stream, .batch, etc. They can also be passed via .bind, or as the second argument to the .bindTools method, as shown in the examples below:

// When calling `.bind`, call options should be passed via the first argument
const llmWithArgsBound = llm.bind({
  stop: ["\n"],
  tools: [...],
});

// When calling `.bindTools`, call options should be passed via the second argument
const llmWithTools = llm.bindTools(
  [...],
  {
    tool_choice: "auto",
  }
);
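
Call options can also be passed directly as the second argument to .invoke or .stream for a single call; a minimal sketch (the stop sequence here is only illustrative):

// Passing call options for a single call only
const resultWithStop = await llm.invoke(messages, {
  stop: ["\n"],
});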
Instantiate
import { ChatBedrockConverse } from '@langchain/aws';

const llm = new ChatBedrockConverse({
model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
temperature: 0,
maxTokens: undefined,
timeout: undefined,
maxRetries: 2,
region: process.env.BEDROCK_AWS_REGION,
credentials: {
secretAccessKey: process.env.BEDROCK_AWS_SECRET_ACCESS_KEY!,
accessKeyId: process.env.BEDROCK_AWS_ACCESS_KEY_ID!,
},
// other params...
});

Invoking
const messages = [
  {
    type: "system" as const,
    content: "You are a helpful translator. Translate the user sentence to French.",
  },
  {
    type: "human" as const,
    content: "I love programming.",
  },
];
const result = await llm.invoke(messages);
console.log(result);
AIMessage {
  "id": "81a27f7a-550c-473d-8307-c2fbb9c74956",
  "content": "Here's the translation to French:\n\nJ'adore la programmation.",
  "response_metadata": {
    "$metadata": {
      "httpStatusCode": 200,
      "requestId": "81a27f7a-550c-473d-8307-c2fbb9c74956",
      "attempts": 1,
      "totalRetryDelay": 0
    },
    "metrics": {
      "latencyMs": 1109
    },
    "stopReason": "end_turn",
    "usage": {
      "inputTokens": 25,
      "outputTokens": 19,
      "totalTokens": 44
    }
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 19,
    "total_tokens": 44
  }
}

Streaming Chunks
for await (const chunk of await llm.stream(messages)) {
  console.log(chunk);
}
AIMessageChunk {
  "content": ""  
  "response_metadata": {
    "messageStart": {
      "p": "abcdefghijk",
      "role": "assistant"
    }
  }  
}
AIMessageChunk {
  "content": "Here"
}
AIMessageChunk {
  "content": "'s"
}
AIMessageChunk {
  "content": " the translation"
}
AIMessageChunk {
  "content": " to"
}
AIMessageChunk {
  "content": " French:\n\nJ"
}
AIMessageChunk {
  "content": "'adore la"
}
AIMessageChunk {
  "content": " programmation."
}
AIMessageChunk {
  "content": ""  
  "response_metadata": {
    "contentBlockStop": {
      "contentBlockIndex": 0,
      "p": "abcdefghijk"
    }
  }  
}
AIMessageChunk {
  "content": ""  
  "response_metadata": {
    "messageStop": {
      "stopReason": "end_turn"
    }
  }  
}
AIMessageChunk {
  "content": ""  
  "response_metadata": {
    "metadata": {
      "metrics": {
        "latencyMs": 838
      },
      "p": "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123",
      "usage": {
        "inputTokens": 25,
        "outputTokens": 19,
        "totalTokens": 44
      }
    }
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 19,
    "total_tokens": 44
  }
}

Aggregate Streamed Chunks
import { AIMessageChunk } from '@langchain/core/messages';
import { concat } from '@langchain/core/utils/stream';

const stream = await llm.stream(messages);
let full: AIMessageChunk | undefined;
for await (const chunk of stream) {
  full = !full ? chunk : concat(full, chunk);
}
console.log(full);
AIMessageChunk {
  "content": "Here's the translation to French:\n\nJ'adore la programmation.",
  "response_metadata": {
    "messageStart": {
      "p": "ab",
      "role": "assistant"
    },
    "contentBlockStop": {
      "contentBlockIndex": 0,
      "p": "abcdefghijklmnopqrstuvwxyzABCDEFGHIJK"
    },
    "messageStop": {
      "stopReason": "end_turn"
    },
    "metadata": {
      "metrics": {
        "latencyMs": 838
      },
      "p": "abcdefghijklmnopqrstuvwxyz",
      "usage": {
        "inputTokens": 25,
        "outputTokens": 19,
        "totalTokens": 44
      }
    }
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 19,
    "total_tokens": 44
  }
}

Bind tools
import { z } from 'zod';

const GetWeather = {
  name: "GetWeather",
  description: "Get the current weather in a given location",
  schema: z.object({
    location: z.string().describe("The city and state, e.g. San Francisco, CA")
  }),
}

const GetPopulation = {
  name: "GetPopulation",
  description: "Get the current population in a given location",
  schema: z.object({
    location: z.string().describe("The city and state, e.g. San Francisco, CA")
  }),
}

const llmWithTools = llm.bindTools(
  [GetWeather, GetPopulation],
  {
    // strict: true // enforce tool args schema is respected
  }
);
const aiMsg = await llmWithTools.invoke(
  "Which city is hotter today and which is bigger: LA or NY?"
);
console.log(aiMsg.tool_calls);
[
  {
    id: 'tooluse_hIaiqfweRtSiJyi6J4naJA',
    name: 'GetWeather',
    args: { location: 'Los Angeles, CA' },
    type: 'tool_call'
  },
  {
    id: 'tooluse_nOS8B0UlTd2FdpH4MSHw9w',
    name: 'GetWeather',
    args: { location: 'New York, NY' },
    type: 'tool_call'
  },
  {
    id: 'tooluse_XxMpZiETQ5aVS5opVDyIaw',
    name: 'GetPopulation',
    args: { location: 'Los Angeles, CA' },
    type: 'tool_call'
  },
  {
    id: 'tooluse_GpYvAfldT2aR8VQfH-p4PQ',
    name: 'GetPopulation',
    args: { location: 'New York, NY' },
    type: 'tool_call'
  }
]

Structured Output
import { z } from 'zod';

const Joke = z.object({
  setup: z.string().describe("The setup of the joke"),
  punchline: z.string().describe("The punchline to the joke"),
  rating: z.number().optional().describe("How funny the joke is, from 1 to 10")
}).describe('Joke to tell user.');

const structuredLlm = llm.withStructuredOutput(Joke);
const jokeResult = await structuredLlm.invoke("Tell me a joke about cats");
console.log(jokeResult);
{
  setup: "Why don't cats play poker in the jungle?",
  punchline: 'Too many cheetahs!',
  rating: 7
}

Multimodal
import { HumanMessage } from '@langchain/core/messages';

const imageUrl = "https://example.com/image.jpg";
const imageData = await fetch(imageUrl).then(res => res.arrayBuffer());
const base64Image = Buffer.from(imageData).toString('base64');

const message = new HumanMessage({
  content: [
    { type: "text", text: "describe the weather in this image" },
    {
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${base64Image}` },
    },
  ]
});

const imageDescriptionAiMsg = await llm.invoke([message]);
console.log(imageDescriptionAiMsg.content);
The weather in this image appears to be clear and pleasant. The sky is a vibrant blue with scattered white clouds, suggesting a sunny day with good visibility. The clouds are light and wispy, indicating fair weather conditions. There's no sign of rain, storm, or any adverse weather patterns. The lush green grass on the rolling hills looks well-watered and healthy, which could indicate recent rainfall or generally favorable weather conditions. Overall, the image depicts a beautiful, calm day with blue skies and sunshine - perfect weather for enjoying the outdoors.

Usage Metadata
const aiMsgForMetadata = await llm.invoke(messages);
console.log(aiMsgForMetadata.usage_metadata);
{ input_tokens: 25, output_tokens: 19, total_tokens: 44 }

Stream Usage Metadata
const streamForMetadata = await llm.stream(messages);
let fullForMetadata: AIMessageChunk | undefined;
for await (const chunk of streamForMetadata) {
  fullForMetadata = !fullForMetadata ? chunk : concat(fullForMetadata, chunk);
}
console.log(fullForMetadata?.usage_metadata);
{ input_tokens: 25, output_tokens: 19, total_tokens: 44 }

Response Metadata
const aiMsgForResponseMetadata = await llm.invoke(messages);
console.log(aiMsgForResponseMetadata.response_metadata);
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '5de2a2e5-d1dc-4dff-bb02-31361f4107bc',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  metrics: { latencyMs: 1163 },
  stopReason: 'end_turn',
  usage: { inputTokens: 25, outputTokens: 19, totalTokens: 44 }
}

Properties

client: BedrockRuntimeClient
model: string = "anthropic.claude-3-haiku-20240307-v1:0"

The model to use, e.g. "anthropic.claude-3-haiku-20240307-v1:0". This is equivalent to the modelId property in the list-foundation-models API. See the link below for a full list of models.

https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns

region: string

The AWS region, e.g. us-west-2. Falls back to the AWS_DEFAULT_REGION environment variable or the region specified in ~/.aws/config if it is not provided here.

streamUsage: boolean = true

Whether or not to include usage data, such as token counts, in the streamed response chunks. Passing this as a call option takes precedence over the class-level setting.

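For example, usage data can be turned off for a single call by passing streamUsage as a call option; a minimal sketch:

// Disable usage metadata for this stream only; the call option takes
// precedence over the class-level streamUsage setting.
const streamWithoutUsage = await llm.stream(messages, {
  streamUsage: false,
});
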
streaming: boolean = false

Whether or not to stream responses.

additionalModelRequestFields?: DocumentType

Additional inference parameters that the model supports, beyond the base set of inference parameters that the Converse API supports in the inferenceConfig field. For more information, see the model parameters link below.
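
As an illustration, a provider-specific parameter such as Anthropic's top_k (which is not part of the base inferenceConfig) could be passed this way; the exact field names depend on the model provider, so treat this as a sketch:

// Hypothetical sketch: pass a provider-specific inference parameter
// (Anthropic's top_k) outside the base Converse inferenceConfig.
const llmWithExtraFields = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  region: process.env.BEDROCK_AWS_REGION,
  additionalModelRequestFields: {
    top_k: 250,
  },
});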

endpointHost?: string

Override the default endpoint hostname.

guardrailConfig?: GuardrailConfiguration

Configuration information for a guardrail that you want to use in the request.
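
A sketch of attaching a guardrail, assuming an existing guardrail; the identifier and version below are placeholders:

// Placeholder guardrail identifier and version -- replace with your own.
const llmWithGuardrail = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  region: process.env.BEDROCK_AWS_REGION,
  guardrailConfig: {
    guardrailIdentifier: "your-guardrail-id",
    guardrailVersion: "DRAFT",
    trace: "enabled",
  },
});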

maxTokens?: number = undefined

The maximum number of tokens to generate in the response.

supportsToolChoiceValues?: ("any" | "auto" | "tool")[]

Which types of tool_choice values the model supports.

Inferred if not specified. Inferred as ['auto', 'any', 'tool'] if a 'claude-3' model is used, ['auto', 'any'] if a 'mistral-large' model is used, empty otherwise.
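
For example, passing tool_choice: "any" at bind time forces the model to call at least one of the bound tools; this sketch reuses the GetWeather and GetPopulation definitions from the Bind tools example above:

// Force the model to call at least one tool, reusing the tool
// definitions from the "Bind tools" example.
const llmForcedToolUse = llm.bindTools(
  [GetWeather, GetPopulation],
  {
    tool_choice: "any",
  }
);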

temperature?: number = undefined

Temperature.

topP?: number

The percentage of most-likely candidates that the model considers for the next token. For example, if you choose a value of 0.8 for topP, the model selects from the top 80% of the probability distribution of tokens that could be next in the sequence. The default value is the default value for the model that you are using. For more information, see the inference parameters for foundation models link below.
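
For instance, topP can be set at construction time alongside or instead of temperature; a minimal sketch:

// Sample from the top 80% of the probability distribution at each step.
const llmWithTopP = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  region: process.env.BEDROCK_AWS_REGION,
  topP: 0.8,
});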

Methods

bindTools

  • Parameters

    • tools: any[]
    • Optional kwargs: Partial<unknown>

    Returns Runnable<BaseLanguageModelInput, AIMessageChunk, this["ParsedCallOptions"]>