An end-to-end guide covering integration with the Sleeper API, creation of a Streamlit UI, and deployment via AWS CDK
It’s embarrassing how much time I spend thinking about my fantasy football team.
Managing a squad means processing a firehose of information — injury reports, expert projections, upcoming bye weeks, and favorable matchups. And it’s not just the volume of information, but the ephemerality — if your star RB tweaks a hamstring during Wednesday practice, you better not be basing lineup decisions off of Tuesday’s report.
This is why general-purpose chatbots like Anthropic’s Claude and OpenAI’s ChatGPT are essentially useless for fantasy football recommendations, as they’re limited to a static training corpus that cuts off months, even years ago.
For instance, if we ask Claude 3.5 Sonnet who the current best running back is, we see names like Christian McCaffrey, Breece Hall, and Travis Etienne, who have had injury-riddled or otherwise disappointing seasons so far in 2024. There is no mention of Saquon Barkley or Derrick Henry, the obvious frontrunners at this point. (Though to Claude’s credit, it discloses its limitations.)
Apps like Perplexity are more accurate because they do access a search engine with up-to-date information. Still, they of course have no knowledge of my entire roster situation, the state of our league’s playoff picture, or the nuances of our keeper rules.
There is an opportunity to tailor a fantasy football-focused Agent with tools and personalized context for each user.
Let’s dig into the implementation.
Architecture Overview
The heart of the chatbot will be a LangGraph Agent based on the ReAct framework. We’ll give it access to tools that integrate with the Sleeper API for common operations like checking the league standings, rosters, player stats, expert analysis, and more.
In addition to the LangGraph API server, our backend will include a small Postgres database and Redis cache, which are used to manage state and route requests. We’ll use Streamlit for a simple but effective UI.
For development, we can run all of these components locally via Docker Compose, but I’ll also provide the infrastructure-as-code (IaC) to deploy a scalable stack with AWS CDK.
Sleeper API Integration
Sleeper graciously exposes a public, read-only API that we can tap into for user & league details, including a full list of players, rosters, and draft information. Though it’s not documented explicitly, I also found some GraphQL endpoints that provide vital statistics, projections, and — perhaps most valuable of all — recent expert analysis by NFL reporters.
I created a simple API client to access the various methods, which you can find here. The one trick that I wanted to highlight is the requests-cache
library. I don’t want to be a greedy client of Sleeper’s freely-available datasets, so I cache responses in a local Sqlite database with a basic TTL mechanism.
Not only does this minimize the amount of redundant API traffic bombarding Sleeper’s servers (reducing the chance that they blacklist my IP address), but it significantly reduces latency for my clients, making for a better UX.
Setting up and using the cache is dead simple, as you can see in this snippet —
import requests_cache
from urllib.parse import urljoin
from typing import Union, Optional
from pathlib import Path

class SleeperClient:
    def __init__(self, cache_path: str = '../.cache'):

        # config
        self.cache_path = cache_path
        self.session = requests_cache.CachedSession(
            Path(cache_path) / 'api_cache',
            backend='sqlite',
            expire_after=60 * 60 * 24,  # 24-hour TTL
        )

        ...

    def _get_json(self, path: str, base_url: Optional[str] = None) -> dict:
        url = urljoin(base_url or self.base_url, path)
        return self.session.get(url).json()

    def get_player_stats(self, player_id: Union[str, int], season: Optional[int] = None, group_by_week: bool = False):
        return self._get_json(
            f'stats/nfl/player/{player_id}?season_type=regular&season={season or self.nfl_state["season"]}{"&grouping=week" if group_by_week else ""}',
            base_url=self.stats_url,
        )
So running something like
self.session.get(url)
first checks the local Sqlite cache for an unexpired response to that particular request. If it’s found, we can skip the API call and just read from the database.
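To see the cache in action, here’s a minimal usage sketch — the import path and player id are placeholders, and I’m assuming the client defines base_url, stats_url, and nfl_state as in the full implementation:
from fantasy_chatbot.client import SleeperClient  # hypothetical module path

client = SleeperClient()

# first call hits the Sleeper API and writes the response to the Sqlite cache
stats = client.get_player_stats(player_id=12345, season=2024)  # placeholder player id

# an identical call within the 24-hour TTL is served from the local cache,
# returning almost instantly and sending no traffic to Sleeper
stats_again = client.get_player_stats(player_id=12345, season=2024)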
Defining the Tools
I want to turn the Sleeper API client into a handful of key functions that the Agent can use to inform its responses. Because these functions will effectively be invoked by the LLM, I find it important to annotate them clearly and ask for simple, flexible arguments.
For example, Sleeper’s APIs often ask for numeric player ids, which makes sense for a programmatic interface. However, I want to abstract that concept away from the LLM and just have it input player names for these functions. To add some flexibility and allow for things like typos, I implemented a basic “fuzzy search” method to map player name searches to their associated player id.
# file: fantasy_chatbot/league.py

def get_player_id_fuzzy_search(self, player_name: str) -> tuple[str, str]:
    # we need a simple search engine to go from player name to player id without needing exact matches
    # returns the player_id and matched player name as a tuple
    nearest_name = process.extract(query=player_name, choices=self.player_names, scorer=fuzz.WRatio, limit=1)[0]
    return self.player_name_to_id[nearest_name[0]], self.player_names[nearest_name[2]]
# example usage in a tool
def get_player_news(self, player_name: Annotated[str, "The player's name."]) -> str:
    """
    Get recent news about a player for the most up-to-date analysis and injury status.
    Use this whenever naming a player in a potential deal, as you should always have the
    latest context for a recommendation.
    If sources are provided, include markdown-based link(s)
    (e.g. [Rotoballer](https://www.rotoballer.com/player-news/saquon-barkley-has-historic-night-sunday/1502955) )
    at the bottom of your response to provide proper attribution
    and allow the user to learn more.
    """
    player_id, player_name = self.get_player_id_fuzzy_search(player_name)

    # news
    news = self.client.get_player_news(player_id, limit=3)
    player_news = f"Recent News about {player_name}\n\n"
    for n in news:
        player_news += f"**{n['metadata']['title']}**\n{n['metadata']['description']}"
        if analysis := n['metadata'].get('analysis'):
            player_news += f"\n\nAnalysis:\n{analysis}"
        if url := n['metadata'].get('url'):
            # markdown link to source
            player_news += f"\n[{n['source'].capitalize()}]({url})\n\n"

    return player_news
This is better than a simple map of name to player id because it allows for misspellings and other typos, e.g. saquon
→ Saquon Barkley
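If you haven’t seen rapidfuzz before, here’s a minimal sketch of what the underlying call returns — the player list here is made up for illustration:
from rapidfuzz import process, fuzz

player_names = ['Saquon Barkley', 'Derrick Henry', 'Breece Hall']

# extract() returns (match, score, index) tuples, best match first
matches = process.extract(query='saquon', choices=player_names, scorer=fuzz.WRatio, limit=1)
print(matches[0][0])  # 'Saquon Barkley'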
I created a number of useful tools based on these principles:
- Get League Status (standings, current week, no. playoff teams, etc.)
- Get Roster for Team Owner
- Get Player News (up-to-date articles / analysis about the player)
- Get Player Stats (weekly points scored this season with matchups)
- Get Player Current Owner (essential for proposing trades)
- Get Best Available at Position (the waiver wire)
- Get Player Rankings (performance so far, broken down by position)
You can probably think of a few more functions that would be useful to add, like details about recent transactions, league head-to-heads, and draft information.
LangGraph Agent
The impetus for this whole project was an opportunity to learn the LangGraph ecosystem, which is becoming the de facto standard for constructing agentic workflows.
I’ve hacked together agents from scratch in the past, and I wish I had known about LangGraph at the time. It’s not just a thin wrapper around the various LLM providers; it provides immense utility for building, deploying, & monitoring complex workflows. I’d encourage you to check out the Introduction to LangGraph course by LangChain Academy if you’re interested in diving deeper.
As mentioned before, the graph itself is based on the ReAct framework, which is a popular and effective way to get LLMs to interact with external tools like those defined above.
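For reference, LangGraph ships a prebuilt helper that wires tools into a standard ReAct loop. Here’s a rough sketch of that pattern — the model choice and tool list are placeholders, not my exact configuration:
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model='gpt-4o')  # placeholder model

# tools are plain annotated functions, like get_player_news above
tools = [league.get_player_news, league.get_player_stats]  # assumes a league instance

agent = create_react_agent(llm, tools=tools)
result = agent.invoke({'messages': [('user', 'Any injury news on Saquon?')]})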
I’ve also added a node to persist long-term memories about each user, so that information can be retained across sessions. I want our agent to “remember” things like users’ concerns, preferences, and previously-recommended trades, as this isn’t a feature that’s implemented particularly well in the chatbots I’ve seen. In graph form, it looks like this:
Pretty simple, right? Again, you can check out the full graph definition in the code, but I’ll highlight the write_memory
node, which is responsible for writing & updating a profile for each user. This allows us to track key interactions while being efficient about token use.
def write_memory(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """Reflect on the chat history and save a memory to the store."""

    # get the username from the config
    username = config["configurable"]["username"]

    # retrieve existing memory if available
    namespace = ("memory", username)
    existing_memory = store.get(namespace, "user_memory")

    # format the memories for the instruction
    if existing_memory and existing_memory.value:
        memory_dict = existing_memory.value
        formatted_memory = (
            f"Team Name: {memory_dict.get('team_name', 'Unknown')}\n"
            f"Current Concerns: {memory_dict.get('current_concerns', 'Unknown')}\n"
            f"Other Details: {memory_dict.get('other_details', 'Unknown')}"
        )
    else:
        formatted_memory = None

    system_msg = CREATE_MEMORY_INSTRUCTION.format(memory=formatted_memory)

    # invoke the model to produce structured output that matches the schema
    new_memory = llm_with_structure.invoke([SystemMessage(content=system_msg)] + state['messages'])

    # overwrite the existing user profile
    key = "user_memory"
    store.put(namespace, key, new_memory)
These memories are surfaced in the system prompt, where I also gave the LLM basic details about our league and how I want it to handle common user requests.
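To give a sense of how the pieces fit together, here’s a simplified sketch of the graph assembly — the agent node and the in-memory persistence are stand-ins for the real definition:
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.store.memory import InMemoryStore

builder = StateGraph(MessagesState)
builder.add_node('agent', agent_node)           # ReAct agent + tool loop (defined elsewhere)
builder.add_node('write_memory', write_memory)  # the node shown above
builder.add_edge(START, 'agent')
builder.add_edge('agent', 'write_memory')
builder.add_edge('write_memory', END)

# the store passed at compile time is what write_memory receives as its store argument
graph = builder.compile(checkpointer=MemorySaver(), store=InMemoryStore())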
Streamlit UI and Demo
I’m not a frontend developer, so the UI leans heavily on Streamlit’s components and familiar chatbot patterns. Users input their Sleeper username, which is used to look up their available leagues and persist memories across threads.
I also added a few bells and whistles, like implementing token streaming so that users get instant feedback from the LLM. The other important piece is a “research pane,” which surfaces the results of the Agent’s tool calls so that users can inspect the raw data informing each response.
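The streaming piece boils down to a pattern like this — a rough sketch, where stream_tokens is a hypothetical generator standing in for the call that pulls tokens from the LangGraph API:
import streamlit as st

if prompt := st.chat_input('Ask about your league'):
    with st.chat_message('user'):
        st.write(prompt)
    with st.chat_message('assistant'):
        # st.write_stream renders tokens as they arrive and returns the full text
        response = st.write_stream(stream_tokens(prompt))  # hypothetical generator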
Here’s a quick demo.
Deployment
For development, I recommend deploying the components locally via the provided docker-compose.yml
file. This will expose the API locally at http://localhost:8123
, so you can rapidly test changes and connect to it from a local Streamlit app.
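Once the containers are up, you can sanity-check the server from Python with the langgraph_sdk client — a minimal sketch:
from langgraph_sdk import get_client

client = get_client(url='http://localhost:8123')

async def smoke_test():
    # list the assistants served by the local LangGraph API
    assistants = await client.assistants.search()
    print(assistants)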
I’ve also included IaC for an AWS CDK-based deployment that I use to host the app on the internet. Most of the resources are defined here. Notice the parallels between the docker-compose.yml
and the CDK code related to the ECS setup:
Snippet from docker-compose.yml
for the LangGraph API container:
# from docker-compose.yml
langgraph-api:
  image: "fantasy-chatbot"
  ports:
    - "8123:8000"
  healthcheck:
    test: curl --request GET --url http://localhost:8000/ok
    timeout: 1s
    retries: 5
    interval: 5s
  depends_on:
    langgraph-redis:
      condition: service_healthy
    langgraph-postgres:
      condition: service_healthy
  env_file: "../.env"
  environment:
    REDIS_URI: redis://langgraph-redis:6379
    POSTGRES_URI: postgres://postgres:postgres@langgraph-postgres:5432/postgres?sslmode=disable
And here is the analogous setup in the CDK stack:
// fantasy-football-agent-stack.ts
const apiImageAsset = new DockerImageAsset(this, 'apiImageAsset', {
  directory: path.join(__dirname, '../../fantasy_chatbot'),
  file: 'api.Dockerfile',
  platform: assets.Platform.LINUX_AMD64,
});

const apiContainer = taskDefinition.addContainer('langgraph-api', {
  containerName: 'langgraph-api',
  image: ecs.ContainerImage.fromDockerImageAsset(apiImageAsset),
  portMappings: [{
    containerPort: 8000,
  }],
  environment: {
    ...dotenvMap,
    REDIS_URI: 'redis://127.0.0.1:6379',
    POSTGRES_URI: 'postgres://postgres:postgres@127.0.0.1:5432/postgres?sslmode=disable'
  },
  logging: ecs.LogDrivers.awsLogs({
    streamPrefix: 'langgraph-api',
  }),
});
apiContainer.addContainerDependencies(
{
container: redisContainer,
condition: ecs.ContainerDependencyCondition.HEALTHY,
},
{
container: postgresContainer,
condition: ecs.ContainerDependencyCondition.HEALTHY,
},
)
Aside from some subtle differences, it’s effectively a 1:1 translation, which is always something I look for when comparing local environments to “prod” deployments. The DockerImageAsset
is a particularly useful resource, as it handles building and deploying (to ECR) the Docker image during synthesis.
Note: Deploying the stack to your AWS account via
npm run cdk deploy
WILL incur charges. In this demo code I have not included any password protection on the Streamlit app, meaning anyone who has the URL can use the chatbot! I highly recommend adding some additional security if you plan to deploy it yourself.
Takeaways
You want to keep your tools simple. This app does a lot, but it’s still missing some key functionality, and it will start to break down if I simply add more tools. In the future, I want to break up the graph into task-specific sub-components, e.g. a “News Analyst” Agent and a “Statistician” Agent.
Traceability and debugging are more important with Agent-based apps than traditional software. Despite significant advancements in models’ ability to produce structured outputs, LLM-based function calling is still inherently less reliable than conventional programs. I used LangSmith extensively for debugging.
In an age of commoditized language models, there is no replacement for reliable reporters. We’re at a point where you can put together a reasonable chatbot in a weekend, so how do products differentiate themselves and build moats? This app (or any other like it) would be useless without access to high-quality reporting from analysts and experts. In other words, the Ian Rapoports and Matthew Berrys of the world are more valuable than ever.
Repo
All images, unless otherwise noted, are by the author.