Test Drive: Sambanova by Nat Taylor

Test Drive: Sambanova

Published on Sep 29, 2024.

Today I’m test driving Sambanova, “the World’s Fastest AI Inference.” Inference is typically measured in tokens per second and Sambanova is 10X faster. I’ve been imagining free text inputs that do live validation of the results of extraction by an LLM. e.g. Imagine this textarea below for a user to input job details like “product manager in boston”.

The process is:

Get an API key from Sambanova
Write a prompt
Call the model

Here’s my sample code, which runs in about 500ms. So… yes, you could do live validation. Neat!

from litellm import completion
from textwrap import dedent

response = completion(
    model="sambanova/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": dedent("""\
            Extract the job details as JSON with keys:
            - location: str
            - job_title: str
            - level: Literal['junior', 'mid', 'senior']
            Respond only with valid JSON. Do not write an introduction or summary.
        """)},
        {
            "role": "user",
            "content": dedent("""\
                VP Product in boston
            """),
        }
    ],
    
)
print(response.choices[0].message.content)
# {
#   "location": "Boston",
#   "job_title": "VP Product",
#   "level": "senior"
# }

Nat Taylor — Blog, AI, Product Management & Tinkering

Test Drive: Sambanova

Post Navigation