Log LLM calls

This guide covers how to log LLM calls to LangSmith when you use a custom model or a custom input/output format. To make the most of LangSmith's LLM trace processing, log your LLM traces in one of the specified formats. LangSmith provides the following benefits for LLM traces:

Rich, structured rendering of message lists
Token and cost tracking per LLM call, per trace, and across traces over time

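If you don't log your LLM traces in the suggested formats, you can still log the data to LangSmith, but it may not be processed or rendered as expected. If you call language models through LangChain OSS or the LangSmith wrappers (OpenAI, Anthropic), traces are logged in the correct format automatically.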

The examples on this page use the traceable decorator/wrapper to log model runs (the recommended approach for Python and JS/TS). The same ideas apply if you use the RunTree or the API directly.

Messages format

When tracing a custom model or a custom input/output format, follow the LangChain format, the OpenAI completions format, or the Anthropic messages format. For details, see the OpenAI Chat Completions or Anthropic Messages documentation. The LangChain format is:

LangChain format

messages

array

required

A list of messages containing the content of the conversation.

role

string

required

Identifies the message type. One of: system | reasoning | user | assistant | tool

content

array

required

The content of the message. A list of typed dictionaries.

Content options

type

string

required

text

type

literal('text')

required

text

string

required

The text content.

annotations

object[]

A list of annotations for the text.

extras

object

Additional provider-specific data.

reasoning

type

literal('reasoning')

required

text

string

required

The text content.

extras

object

Additional provider-specific data.

image

type

literal('image')

required

url

string

A URL that points to the image location.

base64

string

required

Base64-encoded image data.

string

A reference ID for an externally stored image (e.g., in a provider's file system or bucket).

mime_type

string

The image MIME type (e.g., image/jpeg, image/png).

file (e.g., PDF)

type

literal('file')

required

url

string

A URL that points to the file.

base64

string

required

Base64-encoded file data.

string

A reference ID for an externally stored file (e.g., in a provider's file system or bucket).

mime_type

string

The file MIME type (e.g., application/pdf).

audio

type

literal('audio')

required

url

string

A URL that points to the audio file.

base64

string

required

Base64-encoded audio data.

string

A reference ID for an externally stored audio file (e.g., in a provider's file system or bucket).

mime_type

string

The audio MIME type (e.g., audio/mpeg, audio/wav).

video

type

literal('video')

required

url

string

A URL that points to the video file.

base64

string

required

Base64-encoded video data.

string

A reference ID for an externally stored video file (e.g., in a provider's file system or bucket).

mime_type

string

The video MIME type (e.g., video/mp4, video/webm).

tool_call

type

literal('tool_call')

required

name

string

args

object

required

The arguments to pass to the tool.

string

A unique identifier for this tool call.

server_tool_call

type

literal('server_tool_call')

required

string

required

A unique identifier for this tool call.

name

string

required

The name of the tool to call.

args

object

required

The arguments to pass to the tool.

server_tool_result

type

literal('server_tool_result')

required

tool_call_id

string

required

The identifier of the corresponding server tool call.

string

A unique identifier for this tool call.

status

string

required

The execution status of the server-side tool. One of: success | error.

output

The output of the executed tool.

tool_call_id

string

Must match the id of a tool_calls[i] entry in a prior assistant message. Only valid when role is tool.

usage_metadata

object

Use this field to send token counts and/or costs along with your model's output. See this guide for details.

Examples

inputs = {
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hi, can you tell me the capital of France?"
        }
      ]
    }
  ]
}

outputs = {
  "messages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "The capital of France is Paris."
        },
        {
          "type": "reasoning",
          "text": "The user is asking about..."
        }
      ]
    }
  ]
}
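
The same format extends to tool use. The following is a minimal sketch, not an official helper: the function names (`text_block`, `tool_call_block`, `message`), the `get_weather` tool, and the placement of the call identifier under an `id` key on the block are all assumptions made for illustration. It builds an assistant message carrying a `tool_call` content block and the `tool` message that answers it.

```python
from typing import Any


def text_block(text: str) -> dict[str, Any]:
    """Build a `text` content block."""
    return {"type": "text", "text": text}


def tool_call_block(call_id: str, name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Build a `tool_call` content block (the `id` key is an assumption)."""
    return {"type": "tool_call", "id": call_id, "name": name, "args": args}


def message(role: str, content: list[dict[str, Any]], **extra: Any) -> dict[str, Any]:
    """Build a message from a role and a list of typed content blocks."""
    allowed = {"system", "reasoning", "user", "assistant", "tool"}
    if role not in allowed:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content, **extra}


# Hypothetical conversation: the assistant calls a made-up `get_weather` tool.
conversation = {
    "messages": [
        message("user", [text_block("What is the weather in Paris?")]),
        message("assistant", [tool_call_block("call_1", "get_weather", {"city": "Paris"})]),
        message("tool", [text_block("18°C and sunny")], tool_call_id="call_1"),
    ]
}
```

Keeping the `tool_call_id` on the tool message equal to the identifier of the earlier tool call is what ties the result back to the request.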

Converting custom I/O formats into LangSmith-compatible formats

If you use a custom input or output format, you can convert it into a LangSmith-compatible format with the process_inputs/processInputs and process_outputs/processOutputs arguments of the @traceable decorator (Python) or the traceable function (TS). Each argument accepts a function that transforms the inputs or outputs of a specific trace before they are logged to LangSmith. These functions receive the trace's inputs and outputs and can return a new dictionary containing the processed data. The following boilerplate shows how to use process_inputs and process_outputs to convert a custom I/O format into a LangSmith-compatible format:


from typing import Any

from langsmith import traceable
from pydantic import BaseModel


class OriginalInputs(BaseModel):
    """Your app's custom request shape."""

class OriginalOutputs(BaseModel):
    """Your app's custom response shape."""

class LangSmithInputs(BaseModel):
    """The input format LangSmith expects."""

class LangSmithOutputs(BaseModel):
    """The output format LangSmith expects."""

def process_inputs(inputs: dict) -> dict:
    """Dict -> OriginalInputs -> LangSmithInputs -> dict"""

def process_outputs(output: Any) -> dict:
    """OriginalOutputs -> LangSmithOutputs -> dict"""


@traceable(run_type="llm", process_inputs=process_inputs, process_outputs=process_outputs)
def chat_model(inputs: dict) -> dict:
    """
    Your app's model call. It keeps your custom I/O shape.
    The decorator calls process_* to log a LangSmith-compatible format.
    """

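To make the boilerplate concrete, here is a minimal sketch of what the two conversion functions might look like. The custom request shape (`{"system": ..., "question": ...}`) and response shape (`{"answer": ...}`) are hypothetical, as is the assumption that the raw request may arrive nested under the traced function's argument name (`inputs`).

```python
from typing import Any


def process_inputs(inputs: dict) -> dict:
    """Convert a hypothetical {"system": ..., "question": ...} request into messages."""
    # Assumption: the traced function takes one argument named `inputs`,
    # so the raw request may arrive nested under that key.
    request = inputs.get("inputs", inputs)
    messages = []
    if request.get("system"):
        messages.append(
            {"role": "system", "content": [{"type": "text", "text": request["system"]}]}
        )
    messages.append(
        {"role": "user", "content": [{"type": "text", "text": request["question"]}]}
    )
    return {"messages": messages}


def process_outputs(output: Any) -> dict:
    """Convert a hypothetical {"answer": ...} response into an assistant message."""
    return {
        "messages": [
            {"role": "assistant", "content": [{"type": "text", "text": output["answer"]}]}
        ]
    }


# Both functions are plain Python, so they can be checked without sending a trace.
logged_inputs = process_inputs(
    {"inputs": {"system": "Be brief.", "question": "What is the capital of France?"}}
)
logged_outputs = process_outputs({"answer": "Paris."})
```

Because the converters are pure functions, they can be unit-tested on their own and then passed to the decorator through its process_inputs and process_outputs arguments.
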
Identifying a custom model in traces

When using a custom model, it is recommended that you also provide the following metadata fields so the model can be identified when viewing and filtering traces.

ls_provider: The provider of the model (e.g., "openai", "anthropic").
ls_model_name: The name of the model (e.g., "gpt-4o-mini", "claude-3-opus-20240229").

from langsmith import traceable

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]
output = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Sure, what time would you like to book the table for?"
            }
        }
    ]
}

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    return output

chat_model(inputs)

This code logs the following trace:

[Figure: LangSmith UI showing an LLM call trace named ChatOpenAI with system and user inputs and an AI output.]

If you implement a custom streaming chat_model, you can "reduce" the streamed outputs into the same format as the non-streaming version. This is currently supported only in Python.

def _reduce_chunks(chunks: list):
    all_text = "".join([chunk["choices"][0]["message"]["content"] for chunk in chunks])
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}

@traceable(
    run_type="llm",
    reduce_fn=_reduce_chunks,
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def my_streaming_chat_model(messages: list):
    for chunk in ["Hello, " + messages[1]["content"]]:
        yield {
            "choices": [
                {
                    "message": {
                        "content": chunk,
                        "role": "assistant",
                    }
                }
            ]
        }

list(
    my_streaming_chat_model(
        [
            {"role": "system", "content": "You are a helpful assistant. Please greet the user."},
            {"role": "user", "content": "polly the parrot"},
        ],
    )
)

If ls_model_name is not present in extra.metadata, other fields may be used to estimate token counts. The following fields are used, in order of precedence:

metadata.ls_model_name
inputs.model
inputs.model_name
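
The precedence above can be sketched as a simple first-match lookup. This is only an illustration of the ordering, not LangSmith's implementation; the `resolve_model_name` function and the dictionary shape of `run` are hypothetical.

```python
from typing import Any, Optional


def resolve_model_name(run: dict[str, Any]) -> Optional[str]:
    """Return the first model name found, following the order listed above."""
    metadata = run.get("extra", {}).get("metadata", {})
    inputs = run.get("inputs", {})
    candidates = (
        metadata.get("ls_model_name"),
        inputs.get("model"),
        inputs.get("model_name"),
    )
    for name in candidates:
        if name:
            return name
    return None


# Hypothetical run: the metadata value takes precedence over inputs.model.
run = {
    "extra": {"metadata": {"ls_model_name": "my_model"}},
    "inputs": {"model": "other_model"},
}
resolved = resolve_model_name(run)
```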

To learn more about how to use the metadata fields, see the guide on adding metadata and tags.

Providing token and cost information

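LangSmith automatically calculates costs from a model pricing table when token counts are provided. To learn how LangSmith calculates token-based costs, see this guide. Many models include token counts as part of the response. You can provide token counts to LangSmith in one of two ways: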

Extract the usage inside your traced function and set the usage_metadata field on the run's metadata.
Return a usage_metadata field in your traced function's outputs.

In either case, the usage metadata you send should contain a subset of the following LangSmith-recognized fields:

You cannot set any fields other than the ones listed below. You do not need to include all of them.

from typing import TypedDict


class UsageMetadata(TypedDict, total=False):
    input_tokens: int
    """The number of tokens used for the prompt."""
    output_tokens: int
    """The number of tokens generated as output."""
    total_tokens: int
    """The total number of tokens used."""
    input_token_details: dict[str, float]
    """The details of the input tokens."""
    output_token_details: dict[str, float]
    """The details of the output tokens."""
    input_cost: float
    """The cost of the input tokens."""
    output_cost: float
    """The cost of the output tokens."""
    total_cost: float
    """The total cost of the tokens."""
    input_cost_details: dict[str, float]
    """The cost details of the input tokens."""
    output_cost_details: dict[str, float]
    """The cost details of the output tokens."""

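The usage data can also include cost information, which is useful when you don't want to rely on LangSmith's token-based cost formula, for example for models whose pricing is not linear in the number of tokens of each type.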
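
As an illustration of supplying costs yourself, the sketch below computes a tiered (non-linear) input cost and assembles a usage metadata dictionary from the recognized fields. The pricing scheme, every rate, the 1000-token threshold, and the function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def tiered_input_cost(
    tokens: int,
    threshold: int = 1000,
    base_rate: float = 1e-6,
    surcharge_rate: float = 2e-6,
) -> float:
    """Charge `base_rate` per token up to `threshold`, `surcharge_rate` beyond it."""
    base = min(tokens, threshold)
    extra = max(tokens - threshold, 0)
    return base * base_rate + extra * surcharge_rate


def build_usage_metadata(
    input_tokens: int, output_tokens: int, output_rate: float = 4e-6
) -> dict:
    """Assemble usage metadata with explicit costs, using only recognized fields."""
    input_cost = tiered_input_cost(input_tokens)
    output_cost = output_tokens * output_rate
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }


# 1000 tokens at the base rate plus 500 at the surcharge rate, and 200 output tokens.
usage = build_usage_metadata(input_tokens=1500, output_tokens=200)
```

The resulting dictionary can be sent through either of the two mechanisms described in this section.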

Setting run metadata

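You can modify the current run's metadata from within your traced function to include usage information. The advantage of this approach is that you don't need to change the runtime output of your traced function. For example: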

Requires langsmith>=0.3.43 (Python) and langsmith>=0.3.30 (JS/TS).

from langsmith import traceable, get_current_run_tree

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    llm_output = {
        "choices": [
            {
                "message": {
                    "role": "assistant",
                    "content": "Sure, what time would you like to book the table for?"
                }
            }
        ],
        "usage_metadata": {
            "input_tokens": 27,
            "output_tokens": 13,
            "total_tokens": 40,
            "input_token_details": {"cache_read": 10},
            # If you wanted to specify costs:
            # "input_cost": 1.1e-6,
            # "input_cost_details": {"cache_read": 2.3e-7},
            # "output_cost": 5.0e-6,
        },
    }
    run = get_current_run_tree()
    run.set(usage_metadata=llm_output["usage_metadata"])
    return llm_output["choices"][0]["message"]

chat_model(inputs)

Setting run outputs

You can add a usage_metadata key to your function's response to set token counts and costs manually.

from langsmith import traceable

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]
output = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Sure, what time would you like to book the table for?"
            }
        }
    ],
    "usage_metadata": {
        "input_tokens": 27,
        "output_tokens": 13,
        "total_tokens": 40,
        "input_token_details": {"cache_read": 10},
        # If you wanted to specify costs:
        # "input_cost": 1.1e-6,
        # "input_cost_details": {"cache_read": 2.3e-7},
        # "output_cost": 5.0e-6,
    },
}

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    return output

chat_model(inputs)
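
In practice the token counts usually come from the provider's response rather than from literals. The sketch below maps an OpenAI-style `usage` object onto the usage_metadata fields. The `to_usage_metadata` helper is hypothetical, and treating the provider's cached prompt tokens as `cache_read` is an assumption based on the key used in the examples above.

```python
def to_usage_metadata(usage: dict) -> dict:
    """Map an OpenAI-style `usage` dict onto the usage_metadata fields."""
    input_tokens = usage.get("prompt_tokens", 0)
    output_tokens = usage.get("completion_tokens", 0)
    result = {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": usage.get("total_tokens", input_tokens + output_tokens),
    }
    # Only report cache details when the provider actually sent a non-zero count.
    cached = (usage.get("prompt_tokens_details") or {}).get("cached_tokens")
    if cached:
        result["input_token_details"] = {"cache_read": cached}
    return result


provider_usage = {
    "prompt_tokens": 27,
    "completion_tokens": 13,
    "total_tokens": 40,
    "prompt_tokens_details": {"cached_tokens": 10},
}
usage_metadata = to_usage_metadata(provider_usage)
```

The result has the same shape as the usage_metadata literal in the example above, so it can be attached to the returned output in the same way.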

Time-to-first-token

If you use traceable or one of the SDK wrappers, LangSmith automatically populates time-to-first-token for streaming LLM runs. If you use the RunTree API directly, however, you need to add a new_token event to the run tree for time-to-first-token to be populated correctly. For example:

from langsmith.run_trees import RunTree
run_tree = RunTree(
    name="CustomChatModel",
    run_type="llm",
    inputs={ ... }
)
run_tree.post()
llm_stream = ...
first_token = None
for token in llm_stream:
    if first_token is None:
        first_token = token
        run_tree.add_event({
            "name": "new_token"
        })
run_tree.end(outputs={ ... })
run_tree.patch()
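
To see what the metric captures, the following sketch measures time-to-first-token locally over a stand-in stream. It does not use LangSmith and is not how LangSmith computes the value; `fake_stream` and `consume_with_ttft` are hypothetical names, and the delay is artificial.

```python
import time
from typing import Iterable, Iterator, Optional


def fake_stream(tokens: Iterable[str], delay: float = 0.01) -> Iterator[str]:
    """Stand-in for a model stream: yields tokens with a small delay."""
    for token in tokens:
        time.sleep(delay)
        yield token


def consume_with_ttft(stream: Iterator[str]) -> tuple[str, Optional[float]]:
    """Consume a stream, returning the joined text and seconds until the first token."""
    start = time.perf_counter()
    ttft = None
    parts = []
    for token in stream:
        if ttft is None:
            # This is the point where the example above adds a "new_token" event.
            ttft = time.perf_counter() - start
        parts.append(token)
    return "".join(parts), ttft


text, ttft = consume_with_ttft(fake_stream(["Hel", "lo"]))
```

The first-token timestamp is taken once, on the first iteration only, which mirrors why the RunTree example adds the event only when `first_token` is still unset.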
