Proxy Guide (Free)

3.2k

9.5k

(Last Updated: July 6th 2026)

Use proxy. JLLM is not that good, and it can't handle many tokens. (I swear, it's addicting).

I tested a few proxies for my bots (all with free plans), and here is how you can use each one of them:

The way proxy works has made me rework this guide entirely many times up to now, so I decided to change my approach slightly. This guide will now be divided into a few segments, being a "General Guide" at the top — for an explanation of proxy implementation through any routers — and specific guides (such as "Openrouter Guide") below, with more summed up instructions on how to set up your proxy with recommended routers, and recommended models.

Because of that, since information is now a bit spread out, I will leave what I believe to be the best service and AI model here at the top.

Best Model(s):

TokenReply: minimaxai/minimax-m3 | moonshotai/kimi-k2.6

Openrouter: google/gemma-4-31b-it:free

Literouter (Limited): deepseek-v4-flash:free

I have also seen some people having trouble setting it up from this guide. Since the comments have a character limit, if you need any help feel free to send me a DM in Discord: nalabyst

General Guide (Janitor)

For any configuration you want to set up, you will always have to follow the same steps with slight differences. While in Janitor:

• Start a chat with any character

• Click on the purple button above saying "using janitor" or "using [something]" and a new tab will pop up

• Select "Proxy"

• Scroll down to Proxy Configurations and click in "+ New", and it will ask for a few settings:

° Name: you can put whatever you want in here

° Model: model ID (for example: "deepseek-free")

° Proxy URL: endpoint for the router you are using (for example: https://openrouter.ai/api/v1/chat/completions)

° API Key: the API Key that you will create in a specific router

(for all of these I am going to present more information in the specific guides below)

• There are also two sections for prompts, that are important to lead the LLM to behave as you wish:

° Global Prompt: I mostly use this for setting up the prompt, and I like the prompt from this guide

° Custom Prompt (for each model): I only use this when I notice that a certain model needs some fine tuning in the instructions, or for models like Gemini that can flag jailbreaks or NSFW content

• At last, click on "Save", refresh the page, and it's done!

Generation Settings:

The range in brackets represents the usual range for that setting across most models.

• Temperature [0.6 - 1.2]: how "creative" your model will be — higher values means it's more creative. Turning this up too high will make the AI say incoherent stuff. This varies a lot depending on the model you are using.

• Max Tokens [0 - 2k]: how many tokens the AI can generate as output (response). You can set it to 0 and it will be unlimited, and this is not that important, I leave mine at 1.5k just to avoid some glitches.
Note: If you are using "thinking" or "reasoning" models, set this to 0, or it will limit the amount of tokens it can use to generate it's thought process.

• Context Size [32k / 64k / 128k]: how many tokens the AI can take as input (context). For this one you will usually want to set it to the max that the model you are using can take. Most good models are in the range of 32k or 64k contexts, some even 128k and 256k. Always check the limit of the model you are using, and set the context size to be inside that limit.

Advanced:

If you don't want to mess too much with this, you can just set everything to 0 and Janitor will use the model's default. But here's a summary of what each setting is for:

• Top K [30 - 50]: the amount of words the AI has to pick from — the higher, the more words it will gain access to use.

• Top P [0.9 - 0.99]: the probability of the AI picking a more unique word, basically how much of the vocabulary from Top K it can really access.

• Repetition Penalty [0.8 - 1.3]: higher = makes the AI repeats less words in a message.
Note: If you are using "thinking" or "reasoning" models, set this to 0, or it will limit the repetition of words in the model's reasoning.

• Frequency Penalty [0.1 - 0.3]: higher = makes the AI repeats less words in the total chat.
Note: If you are using "thinking" or "reasoning" models, I also recommend setting this to 0. This is not as bad as the other two configurations with this note, but it still affects the model's reasoning over long chats.

Prefill:

You can enable it if you are getting errors, but leave it disabled otherwise.

TokenReply Guide

TokenReply is still my favorite model provider, with decent models (including unlimited ones), as well as availability for each model. Though, some free models seem to get really unstable with high usage.

TokenReply has a general rate limit of 40 requests per day for limited models, and a 3 requests per minute limit for unlimited ones.

Setup:

• Enter tokenreply.com

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://api.tokenreply.com/v1

• Modified (Sophia's Lorebay): (custom — more information in the "Sohpia's Lorebay Guide" below)

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to "Console" at the top

• Copy your API Key and paste it on Janitor

Note: After you create your account, you can use any model labeled as "Free" in the model list without any hard limits. However, sometimes there are a $1 claim you can take, or you can use the daily check-ins to add $0.01 to your wallet — which allow for you to use the "Weekly Featured" models.

Models:

You can search for the available models on your own and test them, but here are my recommendations:

Free (Fully Unlimited):

• minimaxai/minimax-m3

• moonshotai/kimi-k2.6

• stepfun-ai/step-3.5-flash | stepfun-ai/step-3.7-flash (for some reason Step 3.5 Flash works better for me)

• qwen/qwen3.5-397b-a17b

• grok-4.3-high

Weekly Featured (Limited):

• gemini-2.5-flash

• deepseek-v4-pro

• deepseek-v4-flash-thinking

• kimi-k2.7

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

Each one of the models you set up will have better settings that change drastically their performance (specially the Temperature). If you want you can search it up, or if you are too lazy just ask another AI, like Gemini or ChatGPT, what are the best parameters for each one of the models you find.

P.S.: If you are getting too many errors with a model using this router, check the model availability. If it's below 50%, you should probably switch to another model until it goes back up.

Openrouter Guide

Openrouter is probably the most known model router available, being the most stable one. Recently, it had been in it's "dark ages", but some good models have now showed up again.

Openrouter has a general limit rate of around 50 requests per day for free.

Setup:

• Enter openrouter.ai

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://openrouter.ai/api/v1/chat/completions

• Modified (Sophia's Lorebay): https://api.lorebary.com/openrouter

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to you account settings (usually in the drop down menu by your icon)

• Go to "API Keys"

• Click on "Create"

• It will ask for a few settings, but they are not important — you can just set a name of your choosing and click "Create"

• Copy your key and paste it on Janitor

Models:

You can search for the available models on your own and test them, but here are my recommendations:

• google/gemma-4-31b-it:free

• nousresearch/hermes-3-llama-3.1-405b:free (I couldn't even test this because of errors 429, but I heard it's decent)

• nvidia/nemotron-3-ultra-550b-a55b:free

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

Literouter Guide

Literouter has some good models and they have a limit per model instead of being per account. However, recently I found out that the context window of the models in the free plan are limited to 5k, so I would not recommend using it, as that is pretty low.

Literouter has a limit rate defined per model, most having a limit of 30 requests per day, with some even being completely unlimited.

Setup:

• Go to literouter.com (for some reason if you search for "literouter" on Google it doesn't show the page up)

• Click on "Get Started"

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://api.literouter.com/v1/chat/completions

• Modified (Sophia's Lorebay): (custom — more information in the "Sohpia's Lorebay Guide" below)

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to "API Keys" on the sidebar

• Click on "Create New API Key"

• Click on "Copy" by the key you just created and paste it on Janitor

Models:

You can search for the available models on your own and test them, but here are my recommendations:

Limited (Daily Rate Limit: 30):

• deepseek-v4-flash:free

• deepseek-v3.2:free

• kimi-k2.5:free

Unlimited:

• openrouter:free:full-context

• minimax-m2.1:free

Honestly, these are so bad I would recommend not using Literouter right now if you want decent unlimited ones.

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

Other Routers

There are a few other router options that I tested and I think are decent. The process is the same:

• Create an account and log in

• Create and copy an API Key then paste it on Janitor

• Search the docs for the Proxy URL (and if it doesn't work try adding "/v1" or "/v1/chat/completions" at the end) and paste it on Janitor

• Search for models and paste the ID on Janitor

Other options I tried, if you want to check out for yourself, are:

• Electron Hub (model examples: Deepseek v4 Flash, Kimi K2.5 — it's actually very good too, but it's currently unavailable for free)

• NavyAI (model examples: Gemini v2.5 Pro — but has a very low token limitation per day)

• MeganovaAI (model examples: Sapphira-L3.3-70B-0.1 — decent but not as good as the others)

Sophia's Lorebay Guide

For any routers you can set them up to work with Sophia's Lorebay. This is a service made by the community that has many functions to improve your chats, the only downside I have heard is that it has some added restrictions for extreme NSFW content.

Setup:

• Go to lorebary.sophiamccarty.com

• Create an account and log in

• Hover over "Proxy Portal" and go to "Connect Proxy"

• There are default URLs for popular routers (such as Openrouter) for you to pick from, but if you don't see the service you are using there, click on "Cheating on Us?"

• Click on "Add Proxy" and set it up:

° Endpoint URL: paste the default Proxy URL for the router you are using

° Nickname: give it a name of your choosing

• Click on "Create & Test"

• Most of the time it will give an error, but just click on "Continue anyway"

• Copy the URL and click on "Got it"

• Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide)

Server Commands:

With Sophia's URL set up, you can use some "commands", which are just prompt injections that will give instructions to the AI to enhance specific behaviors. You can look at the list of full commands by going to "Proxy Portal" and then "Server Commands".

You can insert the commands in your custom prompt, in your chat memory, or the bot's definition. A few examples of commands I recommend:

<NOOMNISCIENCE>
Characters only know what they witnessed, were told, or logically deduced. Stops NPCs from magically knowing secrets or reacting to things they could not have seen.

<NOCLICHES>
Kills the cringe. No more "orbs" for eyes, "shivers down spines", or dramatic monologues. Fresh expressions, simple gestures, understated reactions.

<REALISTICDIALOGUE>
Messy human conversation - interruptions, filler words, trailing off, awkward pauses, talking over each other, mumbling. No perfect speeches.

proxy allowed

Published chats

comments

Leave a comment or feedback for the creator ❤️

Proxy Guide (Free)

3.2k

9.5k

by:@nalabyst

(Last Updated: July 6th 2026)

Use proxy. JLLM is not that good, and it can't handle many tokens. (I swear, it's addicting).

I tested a few proxies for my bots (all with free plans), and here is how you can use each one of them:

Because of that, since information is now a bit spread out, I will leave what I believe to be the best service and AI model here at the top.

Best Model(s):

TokenReply: minimaxai/minimax-m3 | moonshotai/kimi-k2.6

Openrouter: google/gemma-4-31b-it:free

Literouter (Limited): deepseek-v4-flash:free

I have also seen some people having trouble setting it up from this guide. Since the comments have a character limit, if you need any help feel free to send me a DM in Discord: nalabyst

General Guide (Janitor)

For any configuration you want to set up, you will always have to follow the same steps with slight differences. While in Janitor:

• Start a chat with any character

• Click on the purple button above saying "using janitor" or "using [something]" and a new tab will pop up

• Select "Proxy"

• Scroll down to Proxy Configurations and click in "+ New", and it will ask for a few settings:

° Name: you can put whatever you want in here

° Model: model ID (for example: "deepseek-free")

° Proxy URL: endpoint for the router you are using (for example: https://openrouter.ai/api/v1/chat/completions)

° API Key: the API Key that you will create in a specific router

(for all of these I am going to present more information in the specific guides below)

• There are also two sections for prompts, that are important to lead the LLM to behave as you wish:

° Global Prompt: I mostly use this for setting up the prompt, and I like the prompt from this guide

• At last, click on "Save", refresh the page, and it's done!

Generation Settings:

The range in brackets represents the usual range for that setting across most models.

Advanced:

If you don't want to mess too much with this, you can just set everything to 0 and Janitor will use the model's default. But here's a summary of what each setting is for:

• Top K [30 - 50]: the amount of words the AI has to pick from — the higher, the more words it will gain access to use.

• Top P [0.9 - 0.99]: the probability of the AI picking a more unique word, basically how much of the vocabulary from Top K it can really access.

Prefill:

You can enable it if you are getting errors, but leave it disabled otherwise.

TokenReply Guide

TokenReply has a general rate limit of 40 requests per day for limited models, and a 3 requests per minute limit for unlimited ones.

Setup:

• Enter tokenreply.com

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://api.tokenreply.com/v1

• Modified (Sophia's Lorebay): (custom — more information in the "Sohpia's Lorebay Guide" below)

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to "Console" at the top

• Copy your API Key and paste it on Janitor

Models:

You can search for the available models on your own and test them, but here are my recommendations:

Free (Fully Unlimited):

• minimaxai/minimax-m3

• moonshotai/kimi-k2.6

• stepfun-ai/step-3.5-flash | stepfun-ai/step-3.7-flash (for some reason Step 3.5 Flash works better for me)

• qwen/qwen3.5-397b-a17b

• grok-4.3-high

Weekly Featured (Limited):

• gemini-2.5-flash

• deepseek-v4-pro

• deepseek-v4-flash-thinking

• kimi-k2.7

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

P.S.: If you are getting too many errors with a model using this router, check the model availability. If it's below 50%, you should probably switch to another model until it goes back up.

Openrouter Guide

Openrouter is probably the most known model router available, being the most stable one. Recently, it had been in it's "dark ages", but some good models have now showed up again.

Openrouter has a general limit rate of around 50 requests per day for free.

Setup:

• Enter openrouter.ai

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://openrouter.ai/api/v1/chat/completions

• Modified (Sophia's Lorebay): https://api.lorebary.com/openrouter

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to you account settings (usually in the drop down menu by your icon)

• Go to "API Keys"

• Click on "Create"

• It will ask for a few settings, but they are not important — you can just set a name of your choosing and click "Create"

• Copy your key and paste it on Janitor

Models:

You can search for the available models on your own and test them, but here are my recommendations:

• google/gemma-4-31b-it:free

• nousresearch/hermes-3-llama-3.1-405b:free (I couldn't even test this because of errors 429, but I heard it's decent)

• nvidia/nemotron-3-ultra-550b-a55b:free

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

Literouter Guide

Literouter has a limit rate defined per model, most having a limit of 30 requests per day, with some even being completely unlimited.

Setup:

• Go to literouter.com (for some reason if you search for "literouter" on Google it doesn't show the page up)

• Click on "Get Started"

• Create an account and log in

Proxy URL:

Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide):

• Default: https://api.literouter.com/v1/chat/completions

• Modified (Sophia's Lorebay): (custom — more information in the "Sohpia's Lorebay Guide" below)

API Key:

Paste it in the "API Key" field on Janitor (mentioned in the General Guide):

• Go to "API Keys" on the sidebar

• Click on "Create New API Key"

• Click on "Copy" by the key you just created and paste it on Janitor

Models:

You can search for the available models on your own and test them, but here are my recommendations:

Limited (Daily Rate Limit: 30):

• deepseek-v4-flash:free

• deepseek-v3.2:free

• kimi-k2.5:free

Unlimited:

• openrouter:free:full-context

• minimax-m2.1:free

Honestly, these are so bad I would recommend not using Literouter right now if you want decent unlimited ones.

You can just paste the model IDs in the "Model" field on Janitor (mentioned in the General Guide).

Other Routers

There are a few other router options that I tested and I think are decent. The process is the same:

• Create an account and log in

• Create and copy an API Key then paste it on Janitor

• Search the docs for the Proxy URL (and if it doesn't work try adding "/v1" or "/v1/chat/completions" at the end) and paste it on Janitor

• Search for models and paste the ID on Janitor

Other options I tried, if you want to check out for yourself, are:

• Electron Hub (model examples: Deepseek v4 Flash, Kimi K2.5 — it's actually very good too, but it's currently unavailable for free)

• NavyAI (model examples: Gemini v2.5 Pro — but has a very low token limitation per day)

• MeganovaAI (model examples: Sapphira-L3.3-70B-0.1 — decent but not as good as the others)

Sophia's Lorebay Guide

Setup:

• Go to lorebary.sophiamccarty.com

• Create an account and log in

• Hover over "Proxy Portal" and go to "Connect Proxy"

• There are default URLs for popular routers (such as Openrouter) for you to pick from, but if you don't see the service you are using there, click on "Cheating on Us?"

• Click on "Add Proxy" and set it up:

° Endpoint URL: paste the default Proxy URL for the router you are using

° Nickname: give it a name of your choosing

• Click on "Create & Test"

• Most of the time it will give an error, but just click on "Continue anyway"

• Copy the URL and click on "Got it"

• Paste it in the "Proxy URL" field on Janitor (mentioned in the General Guide)

Server Commands:

You can insert the commands in your custom prompt, in your chat memory, or the bot's definition. A few examples of commands I recommend:

<NOOMNISCIENCE>
Characters only know what they witnessed, were told, or logically deduced. Stops NPCs from magically knowing secrets or reacting to things they could not have seen.

<NOCLICHES>
Kills the cringe. No more "orbs" for eyes, "shivers down spines", or dramatic monologues. Fresh expressions, simple gestures, understated reactions.

<REALISTICDIALOGUE>
Messy human conversation - interruptions, filler words, trailing off, awkward pauses, talking over each other, mumbling. No perfect speeches.

proxy allowed