
<span class="emp-dg"> SoloTagger is an LLM-based image captioning tool. </span>
Ever since I built this little tool, **SoloTagger**, it has become my main tool for dataset annotation. Recently all of my dataset labeling work has been done with it.
To create high quality **LoRAs**, high quality dataset annotations are essential. Generating captions with a large language model is similar to generating images with one. If you want good results, a well designed prompt is necessary.
Different datasets and different goals require different tagging styles, which means different prompts. Over the past few days I have been tweaking and optimizing the prompts used by SoloTagger. During this process I ran into an annoying issue. In the previous version all prompts were stored in a JSON file, and manually editing JSON files is something I really hate.
So **SoloTagger v0.12** was born. 😄
## What's Changed
- Added a `prompt.txt` file to store prompts.
- The `config.json` file now only keeps two parameters: `API_URL` and `models`.
## Prompt Editing Guide
The structure of the `prompt.txt` file looks like this:
```
==Natural Language==
Describe the image content in natural language, in no more than 200 words.
==Tags==
Describe the content of the image using tags.
==Titre==
Veuillez décrire le contenu de l'image en détail.
==Заголовок==
Пожалуйста, подробно опишите содержание изображения.
==标题==
请描述图片内容
==タイトル==
画像の内容を詳しく説明してください。
```
You can store multiple prompts in the file. Each prompt starts with a title in the format `==Title==`, followed by the actual prompt content.
The title is only used as a display label so users can select different prompts. It is not sent to the language model.
The title must be written exactly in this format: <span class="emp-dr">==Title==</span>.
Below the title is the prompt content that will be sent to the model. You can write these prompts based on your own needs. Different prompts should be separated by a blank line.
When editing prompts, you do not need to worry about characters or formatting. Before sending the prompt, SoloTagger will automatically escape line breaks, quotation marks, and other characters when necessary.
The `prompt.txt` file uses **UTF-8 encoding**, so it supports multiple languages.
For detailed instructions on using SoloTagger and installing and setting up **LM Studio on Windows**, please see the previous article:
https://sololo.xyz/article/24-solotagger-local-joycaption-beta-one-gguf-setup-on-windows-via-lm-studio
## Usage Notes
I deployed **JoyCaption beta one** and **Qwen3.5 0.8B, 4B, and 9B** on my laptop, all in **GGUF** format.
After a few days of testing and comparison, here are some of my personal impressions:
- **Qwen3.5 0.8B** is the smallest model, so it is the fastest. The tagging quality is not bad, but it is slightly weaker than the larger models.
- **JoyCaption beta one** and **Qwen3.5 9B** both handle tagging tasks very well, and the speed is also quite good.
- **Qwen3.5** follows prompts more reliably.
- When using longer prompts, around **500 words**, **Qwen3.5 9B** produces better outputs and can even outperform **JoyCaption**.
<br><br>
**Download:** <a href="https://assets.sololo.xyz/articles/025/SoloTagger_v0.12.zip">SoloTagger_v0.12.zip</a>