Deploy your own chatGPT-like assistant on AWS

Posted by Chris McKinnel - 21 November 2023
7 minute read

There is a heap of turmoil surrounding the abrupt exit of Sam Altman from OpenAI this week, and apparently a heap of staff at OpenAI are also leaving or threatening to leave. I suspect this will mean feature releases of chatGPT slow down a bit while they figure out how to ship code with a development team in tatters.

I don't know about you, but I'm not interested in letting the Silicon Valley Game of Thrones affect my ability to crank out code at a pace only seen in movies like Swordfish... Nah, just kidding - the real reason I wrote this blog post is to show you how to get most of the good bits of chatGPT without any of the bad bits, and without having to buy a chatGPT Enterprise license. All on AWS! Sweet!

First off, let me be clear that I'm not rubbishing chatGPT with this post. I think it's a revolutionary tool that either has or will soon create a step-change in the productivity of tech workers globally. The impact it'll have can't be overstated.

BUT, it doesn't fill me with joy that whatever I put into the chatGPT UI can be used to train their models. I'm sure it's great for them, and maybe great for humanity, but it ain't great for the NDAs and contracts that have my signature on them. This is a huge handbrake for using it at work - you can't just copy and paste in some Terraform or customer code with a related error and have it magically fixed like you do in your personal projects (and if you do, good luck with the lawsuit).

Sure, you can buy chatGPT Enterprise and run it in your Azure subscription, but I wonder how much it costs? If you have to ask, then you can't afford it. And I definitely have to ask.

Anyway, what exactly are the things that are stopping you using chatGPT in your day-to-day at work the way you really want to use it?

Your data is probably being used to re-train the models
You have to upload your data to a public SaaS
Once your data is up there, there's no getting it back
Larry the Lawyer will have your guts for garters if you share the wrong thing

Borrrrrrrrrring.

In more interesting news, last week we had AWS run a Bedrock / GenAI training session for us at CCL. It was great because it showed everyone what was possible with the AWS Bedrock service, LangChain and the currently available models in Bedrock. Thanks to Shivonne Londt for delivering this for us - it really was great!

During this session we were introduced to a Python library called Streamlit. It takes some of the pain away for backend developers like me that want to fight someone every time they open a .js or .css file. Pre-built, beautifully styled UI elements with a fully functional API to back them all. Whoa.

I sat down last night and started trying to build a chatGPT-like interface for Bedrock with Streamlit. I got a decent way through and was asking various LLMs and Google how to solve this error:

As I was looking into this error, I was thinking to myself: "man, all I'd need to do is save the chat histories in DynamoDB, spruce up the UI and I'd be away laughing".
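
For a feel of how little "save the chat histories" involves, here is a minimal sketch. It is not code from the half-built app or from the project mentioned below: the table layout (a `conversation_id` partition key with the messages stored as a list), the table name and the helper names are all assumptions made up for illustration. The helpers take a boto3 DynamoDB `Table` resource as an argument, so they can be tried out without touching AWS.

```python
import time


def make_message(role, content):
    """Build one chat message; role is 'user' or 'assistant'."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role}")
    return {"role": role, "content": content}


def save_history(table, conversation_id, messages):
    """Overwrite the stored history for one conversation.

    `table` is expected to behave like a boto3 DynamoDB Table resource,
    e.g. boto3.resource("dynamodb").Table("chat-history"), where
    "chat-history" is a hypothetical table name.
    """
    table.put_item(
        Item={
            "conversation_id": conversation_id,  # assumed partition key
            "messages": messages,
            "updated_at": int(time.time()),
        }
    )


def load_history(table, conversation_id):
    """Return the stored messages, or an empty list for a new conversation."""
    response = table.get_item(Key={"conversation_id": conversation_id})
    return response.get("Item", {}).get("messages", [])
```

One design caveat: a single DynamoDB item tops out at 400 KB, so a real app would eventually need to split very long conversations across items.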

Then, lo and behold, I stumbled across a couple of Japanese guys that did exactly this 3 months ago. Yep, you read that right - they have written a chatGPT-like UI for Bedrock and Claude.

Yahtzee!

First I thought: "damn it, beaten to the punch again", and then I thought "wait, this is awesome, I don't have to do any work!".

The architecture looks like this:

They've made it totally serverless, and in true AWS fashion you only pay for what you use. So you can have this thing deployed 24/7 and pay sweet FA until you get down to coding. I know I paid for chatGPT for a month or two without using it, in between bouts of coding.

Let's deploy it!

Before we hit go on the CDK deploy, we need to update a couple of settings so this thing isn't accessible by the public, and is usable as a coding assistant.

Fire up a CloudShell and check out the repo.

Update the IP addresses to expose the web app to:

cdk/cdk.json
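
If you'd rather script the change than hand-edit the JSON, something like the sketch below would do it. Treat the details as assumptions: the context key name `allowedIpV4AddressRanges` is a guess at what the project uses, so check it against the actual `cdk/cdk.json` in the repo, and the function name is invented here. The one thing the sketch adds over a text editor is that it validates each CIDR range before writing anything.

```python
import ipaddress
import json

# Assumed context key; check the repo's cdk.json for the real name.
IPV4_KEY = "allowedIpV4AddressRanges"


def set_allowed_ipv4_ranges(cdk_config, cidrs):
    """Return a copy of the parsed cdk.json with the IPv4 allowlist replaced."""
    for cidr in cidrs:
        # Raises ValueError on typos such as "203.0.113.7/33" or on
        # ranges with host bits set such as "203.0.113.7/24".
        network = ipaddress.ip_network(cidr)
        if network.version != 4:
            raise ValueError(f"not an IPv4 range: {cidr}")
    updated = json.loads(json.dumps(cdk_config))  # cheap deep copy
    updated.setdefault("context", {})[IPV4_KEY] = list(cidrs)
    return updated


# Usage, run from the repo root (203.0.113.7 is a documentation address,
# swap in your own public IP):
#
#   with open("cdk/cdk.json") as f:
#       config = json.load(f)
#   config = set_allowed_ipv4_ranges(config, ["203.0.113.7/32"])
#   with open("cdk/cdk.json", "w") as f:
#       json.dump(config, f, indent=2)
```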

Disable self-register:

cdk/lib/constructs/auth.ts

Change the token limit:

backend/app/config.py

Let's goooooo

Now all we need to do is run `bin.sh` (and yes, you should read what it does first).

This will deploy a CDK Toolkit stack and a CodeBuild stack, which will then kick off a CDK deploy for the frontend and backend.

After about 10 minutes, you'll get a CloudFront URL. When you hit this URL, you shouldn't be able to self-register, but it turns out that flag in auth.ts doesn't quite work.

The WAF IP ranges don't seem to work either, but that could be because I've deployed this stack a few times and there are leftover resources. I wish they had used Terraform instead of CDK - maybe that'll be a mini-project to keep me busy over Christmas.

While self-register is enabled, you may as well make yourself a user.

Make sure you manually go into Cognito and disable self-register next, and check the WAF IP list while you're at it.
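
The console is the simplest way to do this, but if you prefer a script, the Cognito side can be sketched with boto3 as below. This is a sketch under assumptions, not part of the project: the function names are invented, and the client is passed in so you decide the region and credentials. One real gotcha: `update_user_pool` resets any setting you don't pass back to its default, so on a pool with customised settings either use the console or send the existing values along with the change.

```python
def disable_self_signup(cognito_client, user_pool_id):
    """Switch a user pool to admin-only account creation.

    `cognito_client` is expected to be boto3.client("cognito-idp").
    Caution: update_user_pool resets attributes that are not supplied
    to their defaults, so review the pool's settings afterwards.
    """
    cognito_client.update_user_pool(
        UserPoolId=user_pool_id,
        AdminCreateUserConfig={"AllowAdminCreateUserOnly": True},
    )


def self_signup_disabled(cognito_client, user_pool_id):
    """Return True if only administrators can create users in the pool."""
    pool = cognito_client.describe_user_pool(UserPoolId=user_pool_id)["UserPool"]
    admin_config = pool.get("AdminCreateUserConfig", {})
    return admin_config.get("AllowAdminCreateUserOnly", False)
```

Calling `self_signup_disabled` before and after is a cheap way to confirm the change actually stuck.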

Let's give it a whirl

Use the Claude Instant model for basic stuff like summarising your notes, writing emails for you, etc.

Use Claude v2 for writing code / config, as it's the most like the GPT-4 model.
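
If you're curious what choosing a model amounts to underneath the UI, here is a minimal sketch of calling either model through the Bedrock runtime API. It is an illustration only, not how the deployed app is necessarily written: the function names and the `max_tokens` default are assumptions, and the client is passed in (normally `boto3.client("bedrock-runtime")`). The `max_tokens_to_sample` field caps the length of the reply, which is why a coding assistant wants a generous token limit.

```python
import json

# Bedrock model IDs for the two Claude models discussed above.
MODEL_IDS = {
    "instant": "anthropic.claude-instant-v1",
    "v2": "anthropic.claude-v2",
}


def build_prompt(messages):
    """Render chat messages into Claude's Human/Assistant text format."""
    parts = []
    for message in messages:
        speaker = "Human" if message["role"] == "user" else "Assistant"
        parts.append(f"\n\n{speaker}: {message['content']}")
    parts.append("\n\nAssistant:")  # cue the model to answer next
    return "".join(parts)


def ask_claude(bedrock_runtime, model, messages, max_tokens=2000):
    """Send the conversation to Bedrock and return the completion text."""
    body = json.dumps({
        "prompt": build_prompt(messages),
        "max_tokens_to_sample": max_tokens,  # upper bound on reply length
    })
    response = bedrock_runtime.invoke_model(
        modelId=MODEL_IDS[model],
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing a JSON document.
    return json.loads(response["body"].read())["completion"]
```

With that in place, `ask_claude(client, "instant", history)` covers the quick stuff and `ask_claude(client, "v2", history)` covers the code.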

Limitations

It doesn't have the ability to do stuff that OpenAI just released, like reading PDFs, but I don't see that as a super hard thing to implement.

I'd love to see the open source community rally behind this thing!

Summary

Things are moving extremely quickly in the GenAI space, and I suspect this post will be out of date shortly after it's written. But I wrote it anyway because the ability to run something like this in a private AWS account basically for free is a huge leap forward in achieving the big productivity gains I think are possible with this technology.

Deploying this thing in your own AWS account gives you certainty that your data won't be used to train any models, and won't be shared outside the bounds of your AWS account. You have the freedom to feed in sensitive documents / data, and the ability to write your own features / customisations.

And you can tell Larry the Lawyer to go and give someone else a speeding ticket.

Bring on the next 12 months!

Deleting the stack(s)

Just a final note on deleting the stack as it's a bit painful. As long as you kill the following bits it should be all good, though.

Delete CDK Toolkit S3 bucket objects / versions
Delete CDK Toolkit bucket
Delete access logs S3 bucket objects
Delete access logs S3 bucket
Delete Cognito user pool
Delete CDK tools CloudFormation stack
Delete CodeBuild CloudFormation stack
Run `cdk destroy --all` in the `cdk` directory
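
The bucket steps are the fiddly part, because a versioned bucket can't be deleted until every object version and delete marker is gone. A minimal boto3 sketch of that step follows. The function name and the bucket name in the docstring are hypothetical, so look up your real bucket names in the S3 console first, and be aware this is irreversible.

```python
def empty_and_delete_bucket(bucket):
    """Remove every object version and delete marker, then the bucket itself.

    `bucket` is expected to behave like a boto3 S3 Bucket resource, e.g.
    boto3.resource("s3").Bucket("my-cdk-toolkit-bucket"), where the bucket
    name is a hypothetical placeholder.
    """
    # On a versioned bucket, deleting objects alone only adds delete markers,
    # so the versions collection is what has to be cleared.
    bucket.object_versions.delete()
    bucket.delete()
```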