Usage guidelines

[Hinata intro video](https://www.youtube.com/watch?v=H-maLsw-pRI)

This model is designed specifically for Danbooru tags and performs very well with them. However, quality tags such as "best quality" and "masterpiece" and score tags such as "score_9" are not supported.

Style

Artist tags: artist:ningenmame
Year tags: year 2024

Character

Character name: furina (genshin impact)

Content

Danbooru tags

PixAI tutorial

See the PixAI tutorials ([English](https://www.youtube.com/@PixaiART/videos), [Japanese](https://www.youtube.com/@PixAIJP)) and the [Danbooru wiki](https://danbooru.donmai.us/wiki_pages/howto%3Atag).

Example

artist:ningenmame, artist:ciloranko, artist:shosho_lwlw, artist:as109, furina (genshin impact), 1girl, blue eyes, solo, hat, long hair, looking at viewer, smile, blue hair, underwater, bangs, bubble, hair between eyes, white hair, upper body, closed mouth, crown, multicolored hair, air bubble, light rays, ribbon
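The tag order used in the example above (artist tags, character name, then content tags) can also be assembled programmatically. The following is a minimal sketch and not part of any PixAI tooling: `build_prompt` and `is_unsupported` are hypothetical helpers, placing the year tag right after the artist tags is an assumption, and treating every tag that starts with `score_` as a score tag generalizes from the single `score_9` example.

```python
# Quality tags the guidelines list as unsupported by Hinata.
UNSUPPORTED_QUALITY_TAGS = {"best quality", "masterpiece"}


def is_unsupported(tag: str) -> bool:
    """Return True for quality tags and for score tags such as 'score_9'."""
    normalized = tag.strip().lower()
    # Assumption: any tag starting with 'score_' is a score tag.
    return normalized in UNSUPPORTED_QUALITY_TAGS or normalized.startswith("score_")


def build_prompt(artists, character, content_tags, year=None):
    """Join artist tags, an optional year tag, the character, and content tags."""
    parts = [f"artist:{name}" for name in artists]
    if year is not None:
        # Assumption: the year tag goes with the other style tags.
        parts.append(f"year {year}")
    parts.append(character)
    # Drop tags the model does not support instead of passing them through.
    parts.extend(tag for tag in content_tags if not is_unsupported(tag))
    return ", ".join(parts)


prompt = build_prompt(
    artists=["ningenmame", "ciloranko"],
    character="furina (genshin impact)",
    content_tags=["1girl", "solo", "masterpiece", "underwater", "score_9"],
    year=2024,
)
print(prompt)
```

Filtering unsupported tags up front makes it easier to reuse prompts written for other models without editing them by hand.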

Setting the resolution far above 1024×1024 may degrade image quality; this will be addressed in the next version.

LoRA compatibility: many LoRAs trained on SDXL or other "XL" models will work to some degree with Hinata; for best results, use LoRAs trained on Hinata itself.
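The resolution guidance above can be turned into a small helper that picks a width and height close to the 1024×1024 pixel budget for a given aspect ratio. This is only a sketch: `pick_resolution` is a hypothetical name, and rounding each side to a multiple of 64 is an assumption based on common practice with SDXL-style models, not something the Hinata guidelines state.

```python
import math


def pick_resolution(aspect_ratio, max_pixels=1024 * 1024, multiple=64):
    """Return (width, height) near the pixel budget for a width/height ratio."""
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    # Real-valued size whose area equals the pixel budget.
    height = math.sqrt(max_pixels / aspect_ratio)
    width = height * aspect_ratio
    # Snap each side to the nearest multiple (assumed to be 64 here).
    width = max(multiple, round(width / multiple) * multiple)
    height = max(multiple, round(height / multiple) * multiple)
    return width, height


print(pick_resolution(1.0))      # (1024, 1024)
print(pick_resolution(16 / 9))   # (1344, 768)
```

Because each side is rounded to the nearest multiple, the resulting area can land slightly above or below the budget; it stays close to 1024×1024 worth of pixels rather than far above it.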

Tip

Try adjusting the Shift parameter (available starting late March 2025) in the Advanced section of the image generation settings panel. It adds variety and changes the look and feel of an image while preserving its composition.
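The guidelines do not say how Shift is implemented. In rectified flow samplers such as the one described for Stable Diffusion 3, a shift value commonly remaps the noise schedule as `t' = shift * t / (1 + (shift - 1) * t)`. The sketch below assumes Hinata's Shift behaves this way, which is not confirmed by the source; `shift_timestep` and `shifted_schedule` are hypothetical names.

```python
def shift_timestep(t, shift):
    """Remap a noise level t in [0, 1] (1 = pure noise) with a shift factor."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    if shift <= 0:
        raise ValueError("shift must be positive")
    return shift * t / (1.0 + (shift - 1.0) * t)


def shifted_schedule(num_steps, shift):
    """Return num_steps + 1 noise levels from 1.0 down to 0.0 after shifting."""
    uniform = [1.0 - i / num_steps for i in range(num_steps + 1)]
    return [shift_timestep(t, shift) for t in uniform]


# shift = 1.0 leaves the schedule unchanged; larger values keep the
# sampler at high noise levels for more of the steps.
print(shifted_schedule(4, 1.0))
print(shifted_schedule(4, 3.0))
```

Under this assumption the endpoints stay fixed at 0 and 1, so only the spacing of the intermediate steps changes.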

Technical training details

Hinata follows the SDXL architecture. We trained the model in three stages:

Started from a pretrained model with limited anime image knowledge
Trained on 10 million anime images to establish basic knowledge, then refined with 3 million high-quality images to improve overall quality
Enhanced aesthetics using carefully selected small anime image sets

As in Stable Diffusion 3 (https://arxiv.org/abs/2403.03206), we used the rectified flow (RF) algorithm in all stages. RF offers the following benefits:

A simpler modeling process
Improved theoretical properties
Conceptual clarity
Faster convergence across data distributions
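To illustrate why the modeling process is simple, the sketch below shows the rectified flow training target in the form given in the Stable Diffusion 3 paper: a noisy sample is a straight-line mix of data and noise, and the network is trained to predict the constant velocity along that line. This is not Hinata's training code, which the source does not describe; the function names are hypothetical, plain Python lists stand in for image latents, and timestep sampling is left out.

```python
def rf_training_pair(x0, noise, t):
    """Return (x_t, velocity_target) for one rectified flow training sample.

    x0 is the clean data, noise is a Gaussian sample of the same shape,
    and t in [0, 1] is the noise level (0 = clean data, 1 = pure noise).
    """
    # Straight-line interpolation between data and noise.
    x_t = [(1.0 - t) * a + t * e for a, e in zip(x0, noise)]
    # The velocity along that line does not depend on t.
    velocity = [e - a for a, e in zip(x0, noise)]
    return x_t, velocity


def rf_loss(predicted_velocity, velocity_target):
    """Mean squared error between predicted and target velocity."""
    diffs = [(p - v) ** 2 for p, v in zip(predicted_velocity, velocity_target)]
    return sum(diffs) / len(diffs)


x0 = [0.2, -0.4]
noise = [1.0, 0.0]
x_t, target = rf_training_pair(x0, noise, t=0.25)

# A perfect velocity prediction recovers the clean data in one step.
recovered = [xt - 0.25 * v for xt, v in zip(x_t, target)]
print(x_t, target, recovered)
```

A model that predicts the target exactly has zero loss, and stepping back along the predicted velocity from `x_t` returns `x0`.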

The current maximum training resolution is 1024×1024, with plans to increase it to 1536×1536 in the next version.

We are continuously improving the model. The next version will be more stable, more diverse, and more aesthetically pleasing. We look forward to your feedback!

Usage guidelines

Hinata intro video

[Watch the Hinata intro video](https://www.youtube.com/watch?v=H-maLsw-pRI)

This model is optimized for Danbooru tags and performs well when they are used. (Quality tags such as "best quality" and "masterpiece" and score tags such as "score_9" are not supported.)

Style

Artist tags: artist:ningenmame
Year tags: year 2024

Character

Character name: furina (genshin impact)

Content

Danbooru tags are supported

Learning how to use it

See the PixAI tutorials ([English](https://www.youtube.com/@PixaiART/videos), [Japanese](https://www.youtube.com/@PixAIJP)) and the [Danbooru Wiki](https://danbooru.donmai.us/wiki_pages/howto%3Atag).

Tag example

artist:ningenmame, artist:ciloranko, artist:shosho_lwlw, artist:as109, furina (genshin impact), 1girl, blue eyes, solo, hat, long hair, looking at viewer, smile, blue hair, underwater, bangs, bubble, hair between eyes, white hair, upper body, closed mouth, crown, multicolored hair, air bubble, light rays, ribbon

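Setting the resolution far above 1024×1024 may degrade image quality. This will be improved in the next version. Many LoRAs trained for SDXL or "XL" models also work to some degree with Hinata. For best results, we recommend LoRAs trained for Hinata.

Tip

Adjusting the Shift parameter, added to the Advanced section of the image generation settings panel in late March 2025, changes the look and feel of an image while preserving its composition.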

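Technical training details

Hinata is based on the SDXL architecture.
The model was developed through the following three-stage training process.

Started from a pretrained model with limited anime image knowledge
Built basic knowledge with 10 million anime images, then improved overall quality with 3 million high-quality images
Enhanced aesthetics with selected small anime image sets

As in Stable Diffusion 3 ([paper](https://arxiv.org/abs/2403.03206)), we adopted the Rectified Flow (RF) algorithm in all training stages.
The RF algorithm offers the following benefits:

A simpler modeling process
Improved theoretical stability
Conceptual clarity
Faster convergence across data distributions

The current maximum training resolution is 1024×1024, and we plan to extend it to 1536×1536 in the next version.

The next version is planned to be more stable, more diverse, and more aesthetically pleasing.
We look forward to your feedback!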
