About My Bots and API reccomendations, and setting reccomendations.
Hello everyone!
This isn't a normal update and this will be the last post I make before I finally upload the next batch of bots.
While working on Yang, I thought of something:
Some people may not know the best APIs to use for my bots. Or what is a cheaper way of using APIs?
I'll get into it now.
First of all, I do not make bots for JLLM. JLLM is seriously outdated to me, and I always end up making high-token bots, which seriously kills JLLMs context window.
No, what I actually use is a mix of APIs.
-Deepseek is the most popular right now but can be expensive depending on the size of the character, and long roleplays. With this, I would seriously reccomend saving your memory every 12 to 15 messages to free up space. Deepseek is very good at conversational and regular RPs, too.
-I also use Grok, mainly, for NSFW roleplay. I find that it writes significantly better than Deepseek during ERP, and is less repetitive. That being said: It's still more expensive to run.
But I find a mix of both works well. I also found that Grok tends to follow the scripts and lorebooks a thousand times better. Deepseek is a mixed bag too, and is getting older.
Other APIs I have used that are good replacements:
Trinity- This one I just tested on Monika, and I thought it wrote very well. And its cheaper.
Qwen3 235B A22B Instruct 2507- One of the cheaper options, but gets repetitive.
Step 3.5 Flash- There was a free version of this on openrouter, but they tend to not allow extreme NSFW. And they tend to throttle at weird times.
I do not use Chatgpt, I have noticed that it can be good at writing but it is way too expensive for me. Same with GLM 5 (despite it being among the best for censorship). That and Chatgpt censors basically everything.
Settings:
Temp: For deepseek: .30-.50 (though, it can get a bit "too" creative) For the rest: 1.0-1.20
Max Tokens: I have mine at about a thousand, but it really is up to you. I will say that the more tokens it produces per message, the more expensive.
Context size: Mines maxed out, but again up to you.
Advanced Settings:
Top K: 80. This setting reduces nonsensical generations.
Top P: .8-.85. an AI parameter that controls text generation creativity by limiting the next-word selection to the smallest subset of top-ranked words whose cumulative probability exceeds a threshold
Repetition Penalty: Around 1.0-1.20. This penalizes the AI for repeating certain words too often.
Frequency Penalty: 0.4-0.6. reduces token repetition by applying a penalty proportional to how often a word has already appeared in the text.
And yes, I use all these settings the same across all LLMs. Except for temperature settings.
I usually put 10$ on Openrouter once a month and it usually lasts 4-6 weeks, using this method.
Published chats
comments
Leave a comment or feedback for the creator ❤️