Big Nonsense Plan

From near zero to custom vtuber front end

What it needs to do: the front end app and the way I manage it should save me time and effort, allow me to use Unity features that other front ends do not, make streaming easier for me specifically, and enable fun chat interactions in ways that Vnyan and Warudo do (and more).

Why a custom front end?

In short: popular 3d vtuber frontends have serious issues, but I do not want to make a vtuber app from the ground up. Building a front end will let me freely use a wide variety of Unity features while taking advantage of existing lipsync and motion tracking tools and software.

A 3d vtuber front end is a program that does not do any motion tracking itself. It is a program that displays a vtuber model, and applies motion tracking data from external sources to it. In addition, vtuber frontends can do things like Twitch and YouTube integration, display 3d environments with models and lights, apply custom motion animations to avatars, display visual effects, and have interactive stream objects like 3d and 2d props and throwable items. Front ends add interactivity and visual interest to 3D vtubing.

Most 3d vtubing programs provide both motion tracking and front end functions. However, it is easy to use separate programs for motion tracking and the final display. Some vtuber apps are front end only, although it is rare - Vnyan started out this way, and Vtuber Plus is still like this. As it is easy to separate these functions, there is no need to make a vtubing app from the ground up. The available 3d motion tracking and lipsync software is very good! The VMC protocol makes it so that motion tracking from different devices can be easily combined into a single input, or blended into preset poses and animations. Motion tracking data can be readily applied to 3D vtuber avatars by using VMC protocol-compatible motion tracking. There are lots of apps and trackers that support it, and it is free and easy to use.

VMC works well because it is built for a standardised avatar format - VRM. VRM is a very constrained format. This makes it easy for it to work with VMC out of the box, because the motion tracking is standardised. While some of its features and restrictions are far from ideal (linear transforms between shapekey states, no built-in way to drive shapekeys with bone positions, heavy restrictions on armature layout), it is possible to deal with those issues with tools available in Unity. Dealing with those issues requires components and settings that are not part of VRM, or other formats built on top of it. The additional settings, components, and other various bits of chewing gum and loose bandaids cannot be freely exported out of Unity as part of a VRM avatar to be used in pre-existing vtuber programs, front end or not. Existing frontends use either VRM format avatars, or their own formats built on top of VRM, with specific sets of additional features that they support.

Because front ends have to be standalone packages that can import avatar and asset files, they serve as a filter on the Unity components and features available to vtubers. They often don't support third-party plugins, and when they do, they may only support an older version of a plugin. For example, Vnyan supports Zibra Liquid, but only a version that is no longer available. Warudo also supports Zibra Liquid, but when I tried importing Zibra Liquid objects as environment assets, they didn't work! Sometimes devs can't be bothered to fix things when third-party addons don't work. For example, the free VRM springbone components don't work in Vnyan when they are applied to a prop. Dynamic Bone components do, and so the dev recommends them instead, but they require a paid asset to use. What devs are willing to work on and support is the limiting factor on the third-party tools available in those front end apps.

For compatibility with VRM, most frontends (with the exception of Warudo Pro) use Unity's Built-In Render Pipeline. While it's lightweight, it's quite old and it's not scriptable. Making my own frontend will give me the option of using Universal Render Pipeline, or even HDRP if I want to go utterly wild. While MToon and Poiyomi don't work with URP, other anime shaders like RealToon, Anime Shader Plus, and LilToon shaders work with URP just fine. MToon shaders can be automatically converted to LilToon shaders in URP. There are also modified versions of MToon for URP.

Another reason to make a custom front end is the ability to write scripts of my own. Warudo allows custom scripts and plugins, with some limitations, and does not allow the use of third party assets as part of plugins. Warudo only supports URP for the Pro version, and has a confusing UI. While it supports imported C# scripts, it imports them as nodes, and uses a node system. Vnyan also uses a node system. Nodes are convenient for clicking and dragging, and making sure that outputs match inputs. However, nodes obfuscate the Unity functions that they invoke, and restrict some functions which can theoretically take a large number of parameters to specific numbers of parameters. You could say they abstract away those specifics, but the documentation on Vnyan nodes is scarce and it's hard to figure out exactly what the hell is going on sometimes.

So, I need to make my own front end. Having investigated Unreal Engine and UPBGE as options, I have begrudgingly settled on Unity. However, before I set up the app itself, I need to create a system for managing and importing the (very many) 3D assets involved.

Managing Game Assets and Settings in Unity Editor:

In addition to the app itself, having a custom front end and environments means managing and importing 3d models and environments in the Unity project folders and beyond. Here, I use words like 3d models, prefabs, and assets interchangeably. I know that the word assets also means scripts, sounds, textures and so on. However, I am a 3d vtuber and I have a lot of 3d junk to set up and manage. Therefore, I am going to care about managing 3D models first and foremost. Furthermore, any system that can manage 3D assets could be extended to managing other assets.

Streamlining and automating import of 3d assets

The way that 3d vtuber assets for Vnyan and Warudo are imported and set up in Unity is way too labour intensive, as there is a lot of manual setting up. Most Vtubers don't have nearly as many 3d assets as I do, or use assets made by others. But I have a ton of assets, and I make them myself. When I import them, I have to set them up anew each time. Much of this is done via the Unity GUI. It also means that different batches of assets use different versions of components, and have different settings. It's inconsistent and inefficient, and it can't be the way that game devs use Unity. I need to learn how Unity devs automate asset setup and import, and how they manage assets and settings across those assets. I need to solve a related cluster of problems, and there are probably already tools to do it.

In short: Tools for automatically generating variants of an existing asset, eg by pulling files from a folder and automatically naming them sequentially, and then applying components and settings to them (including automatic re-sizing and rotation). Means to mass apply settings to game objects based on properties (but only as long as those properties hold at the time the settings are applied), or manually-assigned groups. Means to manually assign game objects into named groups based on properties, even if those properties do not hold after the assignment.

I have a lot of throwable ducks. While a lot of the ducks use the same model, some use a variant thereof. Each duck has to be set up and imported manually. I can copy over some of the properties, but it's such a pain. I have to do this with every 3d asset, and half the time I forget to lock the shader before export and so I have to re-export the asset. Blender and Unity also use different 3d axis orientation conventions, so I have to rotate assets every time. This manual task is an unnecessary barrier to creating fun new things for the stream.

It'd be nice to tell Unity hey, pull 3d models from this folder, and apply these components/settings according to this template. This would save time repeating things that are the same for every model of a type, for example, every duck uses the same shader and has the same components to make it throwable. This bit doesn't require any kind of conditionals, just mass-applying settings.

A more challenging thing would be to automate making adjustments for 3d assets that have significant variations and need different settings accordingly. Some of the older duck models don't have the same scale as the newer duck models. It would be good to consistently and automatically scale models on import, based on mesh properties. For example, all ducks should be the same size from chest to tail, so the import step should detect that measurement and scale the model accordingly. For assets that use primitive colliders, it would be good to detect the scale of a mesh and then automatically transform any primitive colliders to match. Models with PBR textures should get additional, different shader settings to those without.
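The scale-on-import idea boils down to a bit of arithmetic, sketched below in plain Python rather than as a Unity editor script. Everything here is an assumption for illustration: the `Bounds` type, the function names, the choice of the z axis as 'chest to tail', and the 0.20 target length are all made up, and a real version would read the bounds from the imported mesh.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Axis-aligned size of a mesh along x, y, z (hypothetical stand-in for mesh bounds)."""
    x: float
    y: float
    z: float


def uniform_scale(bounds: Bounds, axis: str, target_length: float) -> float:
    """Return the uniform scale factor that makes `axis` exactly `target_length` long."""
    current = getattr(bounds, axis)
    if current <= 0:
        raise ValueError(f"mesh has no extent along {axis}")
    return target_length / current


def fit_box_collider(bounds: Bounds, scale: float) -> Bounds:
    """Size of a box collider that matches the mesh after uniform scaling."""
    return Bounds(bounds.x * scale, bounds.y * scale, bounds.z * scale)


# An older, oversized duck: 0.60 from chest to tail (assumed to be the z axis).
old_duck = Bounds(x=0.30, y=0.36, z=0.60)
scale = uniform_scale(old_duck, "z", target_length=0.20)
collider = fit_box_collider(old_duck, scale)
```

The same factor is reused for the collider so that the mesh and its primitive collider cannot drift apart.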

After initial setup it should be possible to create asset variants quickly, and standardise the shader settings and how things look at import. It should also be easy to create new asset templates, both manually and based on a model exemplar asset.

Grouping assets manually and based on property or set of properties

I need an easy way to sort and keep track of game objects (assets). It should be possible to label and sort assets based on manually-assigned labels ('I declare that this is a duck'), file path ('all assets in this folder are ducks'), and properties ('all objects that have a mesh of this size are ducks'). Property-based groupings should be live by default: they should apply only for as long as the property holds. However, it should also be possible to give a group a name based on properties at sort time, and have that name persist past property changes ('if this asset has a mesh of this size now, it will from now on be known as a duck'). It should be possible to chain properties into boolean expressions to select assets ('if an asset has the word "duck" in the name or is in this folder, it is a duck').
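To pin down the difference between a live grouping and a label that sticks, here is a rough Python sketch. It is not Unity code, and the `Asset` fields, helper names, and file paths are all hypothetical. Rules are small functions that can be chained with `any_of` and `all_of`.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Asset:
    name: str
    path: str
    labels: set = field(default_factory=set)


Rule = Callable[[Asset], bool]


def name_contains(word: str) -> Rule:
    return lambda a: word.lower() in a.name.lower()


def in_folder(folder: str) -> Rule:
    return lambda a: a.path.startswith(folder.rstrip("/") + "/")


def any_of(*rules: Rule) -> Rule:
    return lambda a: any(r(a) for r in rules)


def all_of(*rules: Rule) -> Rule:
    return lambda a: all(r(a) for r in rules)


def live_group(assets, rule):
    """Membership is re-evaluated on every call, so it follows property changes."""
    return [a for a in assets if rule(a)]


def stamp_label(assets, rule, label):
    """Membership is decided once: matching assets keep the label afterwards."""
    for a in assets:
        if rule(a):
            a.labels.add(label)


def labelled(assets, label):
    return [a for a in assets if label in a.labels]


assets = [
    Asset("RubberDuck_01", "Assets/Ducks/RubberDuck_01.fbx"),
    Asset("Gem_Red", "Assets/Gems/Gem_Red.fbx"),
    Asset("mystery_bird", "Assets/Ducks/mystery_bird.fbx"),
]
is_duck = any_of(name_contains("duck"), in_folder("Assets/Ducks"))
stamp_label(assets, is_duck, "Duck")

# Move one asset out of the folder: the live rule drops it, the label stays.
assets[2].path = "Assets/Misc/mystery_bird.fbx"
live_names = [a.name for a in live_group(assets, is_duck)]
labelled_names = [a.name for a in labelled(assets, "Duck")]
```

After the move, `live_names` only contains `RubberDuck_01`, while `labelled_names` still contains both ducks.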

Groups should be able to overlap without being nested, eg it should be possible to have a combo of 'Duck' and 'Toon Shading', or 'Duck' and 'PBR Shading' groupings, while any other type of thing can be 'Toon' or 'PBR'. However, it should be possible to nest groups without having to do anything to the individual objects (eg if the objects are already labelled as 'Ducks' and I want to create a group 'Birds', it should be possible to make it so that all settings in 'Birds' apply to all 'Ducks' by specifying that where 'Ducks' properties are configured, and not doing something like applying new components to all 'Ducks').
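One way to get nesting without touching individual objects is to keep the group hierarchy in its own table and resolve settings at lookup time. A minimal Python sketch of that idea follows; the group names come from the examples above, but the setting keys and values are invented, and this says nothing about how Unity itself would store it.

```python
# Hypothetical tables: which groups sit inside which, and each group's settings.
GROUP_PARENTS = {"Ducks": ["Birds"], "Birds": []}
GROUP_SETTINGS = {
    "Birds": {"bounciness": 0.5, "shader": "Toon"},
    "Ducks": {"bounciness": 0.8},
    "PBR Shading": {"shader": "PBR"},
}


def ancestors_first(group, parents):
    """Return `group` preceded by all of its ancestors, outermost first."""
    order, seen = [], set()

    def visit(g):
        if g in seen:
            return
        seen.add(g)
        for parent in parents.get(g, []):
            visit(parent)
        order.append(g)

    visit(group)
    return order


def resolve_settings(labels, parents, settings):
    """Merge settings for an asset's labels; later and more specific groups win."""
    merged = {}
    for label in labels:
        for group in ancestors_first(label, parents):
            merged.update(settings.get(group, {}))
    return merged


toon_duck = resolve_settings(["Ducks"], GROUP_PARENTS, GROUP_SETTINGS)
pbr_duck = resolve_settings(["Ducks", "PBR Shading"], GROUP_PARENTS, GROUP_SETTINGS)
```

Adding 'Birds' only meant editing `GROUP_PARENTS`. The ducks themselves still only carry the 'Ducks' label, and 'PBR Shading' overlaps with it without being nested in it.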

Some assets use other shaders, like custom shaders (eg Z-Clipping Shader in the 'window to another world' asset) or shaders for specific purposes, eg Crystal Shader in the gem throwables. Browsing assets in the editor grouped according to the shader component they use would save a lot of time.

Mass update settings for a group of 3d assets according to property or label

Because of the piecemeal way I've been adding 3d stream assets, the settings are inconsistent across them. They use different versions of the Poiyomi shader, as I update it every time I make a new batch of assets. The end result is inconsistencies in how shiny certain materials are, outline widths, shading settings, and so on. I want to label assets and have master settings for them. But I also want it to be possible to automatically apply master settings based on some property, eg the presence of a PBR shading texture set in the materials, creation date range, presence of a Unity component, component setting (eg all objects with a specific Physic Material setting).

This should be something that can be combined as a series of conditionals, for example, 'for all objects with PBR shading and Physic material Bounciness value over 0.8, change the bounciness value to 0.8'. Being able to do this with a text-based file or in a terminal would be best (I AM TIRED OF GUI INEFFICIENCY, I WANT CLI POWER).
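As a rough idea of what a text-driven version could look like, the Python sketch below reads rules from a JSON string and applies the bounciness example above. The rule layout (`when` / `set`), the property names, and the asset dictionaries are assumptions of mine, not an existing Unity or third-party format.

```python
import json
import operator

# Comparison operators that a rule file is allowed to use.
OPS = {
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

# Hypothetical rule file contents: every condition in "when" must hold.
RULES_TEXT = """
[
  {"when": [["shading", "==", "PBR"], ["bounciness", ">", 0.8]],
   "set": {"bounciness": 0.8}}
]
"""


def matches(asset, conditions):
    """True if the asset has every named property and passes every comparison."""
    return all(
        key in asset and OPS[op](asset[key], value)
        for key, op, value in conditions
    )


def apply_rules(assets, rules):
    """Apply each matching rule in place and return the names of changed assets."""
    changed = []
    for asset in assets:
        for rule in rules:
            if matches(asset, rule["when"]):
                asset.update(rule["set"])
                changed.append(asset["name"])
    return changed


assets = [
    {"name": "duck_gold", "shading": "PBR", "bounciness": 0.95},
    {"name": "duck_toon", "shading": "Toon", "bounciness": 0.95},
    {"name": "gem_red", "shading": "PBR", "bounciness": 0.4},
]
changed = apply_rules(assets, json.loads(RULES_TEXT))
```

Only `duck_gold` is touched: it is the only asset that is both PBR and bouncier than 0.8.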

Procedurally generating presets for VRM files

Setting up Blendshape material properties for VRMs via the GUI is a PAIN IN THE ASS. Blendshape preset files are text files, it should be easy to generate them procedurally based on what kind of setting they apply. UV Tiling is just basic arithmetic and find and replace, changing material colour and shadow colour is find and replace and a bit of calc (generate shade colour from lit colour), glow properties are straight up inaccessible in the blendshape editor menu. Screw this, this should be a breeze to procedurally generate. Not a problem for most vtubers, but for those that use UV tiling for blendshapes (low poly vtubers who animate their mouths by swapping still images, for example), PAIN. For those who like having a lot of effects, PAIN. A shitty GUI standing between me and some text files should not create so much manual work and pain! Maybe this should be the first thing to try. This could help some niche vtubers save effort and time.
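The arithmetic involved is small enough to sketch. The Python below computes the UV scale and offset for each frame of a sprite sheet and derives a shade colour from a lit colour. It assumes frames are numbered left to right, top to bottom, that UV (0, 0) is the bottom-left corner of the texture, and that shade colour is just the lit colour multiplied by a factor. The dictionary keys are placeholders and do NOT match the real blendshape preset file layout, which would need to be checked against an actual file.

```python
def uv_tile(frame, cols, rows):
    """Return ((scale_x, scale_y), (offset_x, offset_y)) for one sheet frame."""
    if not 0 <= frame < cols * rows:
        raise ValueError("frame index outside the sheet")
    col = frame % cols
    row = frame // cols  # row 0 is the top row of the image
    scale = (1 / cols, 1 / rows)
    offset = (col / cols, 1 - (row + 1) / rows)
    return scale, offset


def shade_colour(lit, factor=0.7):
    """Darken an RGB lit colour by a constant factor (assumed relationship)."""
    return tuple(round(channel * factor, 4) for channel in lit)


def mouth_presets(names, cols, rows, material):
    """Build one placeholder preset entry per mouth frame."""
    presets = []
    for index, name in enumerate(names):
        (sx, sy), (ox, oy) = uv_tile(index, cols, rows)
        presets.append({
            "preset": name,                       # hypothetical key
            "material": material,                 # hypothetical key
            "uv_scale_offset": [sx, sy, ox, oy],  # hypothetical key
        })
    return presets


presets = mouth_presets(["A", "I", "U", "E"], cols=2, rows=2, material="Face")
```

With real field names swapped in, the output of `mouth_presets` could be written out as text instead of clicking through the blendshape editor once per frame.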

This broad class of features is a MUST priority. Managing assets manually would make this project impossible, as I just don't have the time. Even if I did have the time, it would make the project unwieldy. These tools should be some of the first things I make. After that, I can think about the front end app itself.

The Front End App Itself

GUI and General Requirements

Vnyan and Warudo make it hard to tinker around with game object parameters live. They are closed and opaque, which is good because it's possible to mess a lot of stuff up if you are not careful. But I am ok with messing things up. The front end itself should be transparent when it comes to various game object parameters, and allow as much modification of them live as is possible. It's fun to experiment with things live, like making new redeems on a whim, or making one-off adjustments apropos of something on stream.

In Broad Strokes

GUI should be omitted from Spout2 output (easy to do with layers in Unity).

There should be a searchable catalog of adjustable game objects via GUI. This requires a way to read the parameters of an asset, such as the names of blendshapes and bones. Make it possible to adjust game object parameters via GUI.

Rather than an import GUI, it could be easier to have the front end read all contents of a dedicated folder at startup, with some settings as text file presets. Add a button to reload presets only, or to reload all assets if making a change while the app is running. Make it possible to import assets with location, rotation, and scale data, and if they are attached to a specific bone, with that data too. This could be included in the presets.
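A folder of text presets could be as simple as one JSON file per asset. The Python sketch below shows the parsing side only, with defaults for anything a preset leaves out. The key names (`asset`, `position`, `rotation`, `scale`, `bone`) and the `Placement` type are made up for illustration.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Placement:
    asset: str
    position: tuple
    rotation: tuple
    scale: tuple
    bone: Optional[str] = None  # None means the asset sits in world space


def parse_preset(text: str) -> Placement:
    """Turn one preset file's text into a Placement, filling in defaults."""
    raw = json.loads(text)
    return Placement(
        asset=raw["asset"],
        position=tuple(raw.get("position", [0, 0, 0])),
        rotation=tuple(raw.get("rotation", [0, 0, 0])),
        scale=tuple(raw.get("scale", [1, 1, 1])),
        bone=raw.get("bone"),
    )


def load_presets(folder: str) -> list:
    """Read every *.json preset in the folder, in a stable (sorted) order."""
    return [
        parse_preset(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.json"))
    ]


hat = parse_preset('{"asset": "party_hat", "bone": "Head", "position": [0, 0.1, 0]}')
```

A 'reload presets' button would then just call `load_presets` again and re-apply the placements, without re-importing the assets themselves.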

The GUI should include camera controls, such as shifting camera views, adjusting camera properties, and creating new cameras, as well as means to adjust post-processing parameters.

Control over avatar position. Vnyan and Warudo only have draggable cameras, because changing the VRM root bone position can break things (I think that's the reason anyway?). Instead the avatar could be a child of an invisible draggable object, and that way the root bone will maintain a 0 position relative to its parent and be moveable in 3d space (I think?? test it).

MUST: GUI hidden from stream, shifting cameras, adjustable cameras, adjusting post-processing

SHOULD: Easy import, Add Scripts, Searchable Catalog, Read Parameters, Adjustable Parameters, Create new cameras

COULD: Draggable avatar

Environment

By environment, I mean 3d surrounds, like a room, not an environment in a computing sense.

Different modes for different streams: game mode, just chatting mode, special event modes like decorating the Crimsis tree. Animated, cycling, and independently active elements in the environment (eg random trash rain).

3D Random Spin wheel (perhaps one that is physics-based and that I can manually spin by grabbing it with hand tracking?).

MUST: Game mode

SHOULD: Just Chatting Mode

WOULD: Special event mode, spin wheel, environment events

Features in Broad Strokes

Modularity (I think modularity is a term of art so I am probably using it wrong here?? I mean it should be possible to add bits and bobs on no probbie).

It should be possible to extend the front end's functionality by importing assets with C# scripts without having to export a new version of it from Unity.

Control

It should be easy to access parameters that can be modified in the Unity editor (eg shader parameters) via the GUI.

Chat Interaction

Twitch has an API for Unity, and it is possible to incorporate the chat feed into Unity apps in various ways. In Vnyan, that includes things like chat commands, emote confetti, and event triggers.

It should be possible to track and record individual user interaction counts in an independently viewable way. People throw a lot of ducks, and the number/kind of ducks is random. It'd be fun to record the kind and number of ducks each user throws, and make it so they can view them easily outside of chat if they want. The number of times a user uses a command is easy enough to record in StreamerBot or the like, but the results of the random draws like the kinds of duck, hat etc are not. Counting things is easy, but structuring it to be viewable is another matter. The frontend should have an opt-in capacity to track these things, and to record them to a JSON file (or similar). Chatters should then be able to view their stats, either in chat, as a stream widget (easiest probably, locally host it via Slime2) or online on my cool website.
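The counting part could be structured roughly like the Python sketch below: nothing is recorded for a user until they opt in, and the output is a nested user-to-item-to-count JSON object that a widget or website could read. The class name, method names, and JSON shape are my own assumptions.

```python
import json
from collections import defaultdict
from pathlib import Path


class ThrowStats:
    """Per-user counts of randomly drawn items, recorded only for opted-in users."""

    def __init__(self):
        self.opted_in = set()
        self.counts = defaultdict(lambda: defaultdict(int))

    def opt_in(self, user: str) -> None:
        self.opted_in.add(user)

    def record(self, user: str, item: str) -> bool:
        """Count one draw; returns False (and stores nothing) if not opted in."""
        if user not in self.opted_in:
            return False
        self.counts[user][item] += 1
        return True

    def to_json(self) -> str:
        plain = {user: dict(items) for user, items in self.counts.items()}
        return json.dumps(plain, indent=2, sort_keys=True)

    def save(self, path: str) -> None:
        Path(path).write_text(self.to_json(), encoding="utf-8")


stats = ThrowStats()
stats.opt_in("duck_fan")
stats.record("duck_fan", "gold duck")
stats.record("duck_fan", "gold duck")
stats.record("duck_fan", "pirate duck")
ignored = stats.record("lurker", "gold duck")  # lurker never opted in
```

Keeping the file keyed by user first makes the 'view my stats' case a single lookup.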

Cool effects for chat to trigger and interact with

Whole camera post-processing effects (interact with Unity's post-processing and with Aura2)

Room lighting effects (interact with Aura2)

Trigger shader-based effects on vtuber model (update existing shader settings, activate additional shader passes).

3d interactable assets like throwables

Interactable environment objects

Simple built-in chat games ('Duck Gacha?' Plinko?)

SHOULD: Throwables, avatar effects

COULD: Room effects, post-processing effects, interactive environment

WOULD: Chat games

Streamer QoL:

Helpful GUI elements, hidden from stream by putting them on a layer not accessible to Spout2. The front end is for me, and should address things I specifically need.

Sticky note field triggerable by hotkey, for reminders that don't disappear out of my brain's RAM.
Visual timer, customisable but with some basic presets. Triggerable by hotkey.
Chat highlight, similar to sticky note but pulled from chat. Add a timestamp and stopwatch so it is clear how long ago it was posted in chat.
Quick ways to send some preset webhook commands to StreamerBot and OBS, to avoid tabbing out for common or quick tasks (shoutout latest raider, shield mode, BRB screen, mic mute, emote fixer).
'Kill switch' for nonsense, in case it gets out of hand. Reset app to default state.

Vtuber pose adjustment: Sometimes I just want to slouch a lil', but that would look bad with my avatar. Artificially counteracting some of that shrimping by making triggerable bone rotation adjustments couldn't go astray.

Existing Packages and Functions

Much of the stuff I need already exists, either free or paid. Much of it is ez-pez to do in Unity.

Vtuber Avatar Display and Tracking

Import and display avatar models - UniVRM

Motion tracking - Vseeface, MediaPipe, MeowFace

VMC Data input into front end - EVMC4U

Video and Stream

Send output to OBS - Klak Spout Sender for Unity

Sending commands to OBS - webhook commands

Twitch events - API, only need to use listeners for events available to non-affiliates OR Webhooks with StreamerBot

Grabbing emotes and text from chat - Twitch uses IRC, will need to find out how to grab emote images from chat as well
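On the emote question: as far as I understand Twitch's IRC documentation, chat messages can carry an `emotes` tag (when the tags capability is requested) in the form `id:start-end,start-end/id:start-end`, with character positions counted from 0. The Python sketch below parses that tag value and builds an image URL from Twitch's emote CDN template. Both the tag format and the URL pattern should be verified against the current docs before relying on them, and the function name and return shape are my own.

```python
# CDN template as I understand it: .../v2/<id>/<format>/<theme>/<scale>
EMOTE_URL = "https://static-cdn.jtvnw.net/emoticons/v2/{id}/default/dark/1.0"


def parse_emotes(tag_value: str, message: str):
    """Return (emote_text, image_url) pairs in the order they appear in the message."""
    found = []
    if not tag_value:
        return found
    for group in tag_value.split("/"):
        emote_id, _, ranges = group.partition(":")
        for span in ranges.split(","):
            start_text, _, end_text = span.partition("-")
            start, end = int(start_text), int(end_text)
            # The end position is inclusive, hence the +1 when slicing.
            found.append((start, message[start:end + 1], EMOTE_URL.format(id=emote_id)))
    return [(text, url) for _, text, url in sorted(found)]


message = "Kappa hello Kappa"
emotes = parse_emotes("25:0-4,12-16", message)
```

Each URL can then be downloaded once and cached as a texture for things like emote confetti.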

Environment

Fog volume - Aura2. This is BRP only; URP has built-in volumetric effects. This will depend on whether I go with URP in the end.

Liquids - Zibra is great and allows real-time mesh collider interaction between avatar and liquid. Zibra (version 2 onwards) makes it possible for my avatar to splash around in a pool and realistically play with water in real time! The way Zibra interacts with built-in physic materials is wacky, and I need to learn how to set up objects that can work with Zibra properly.

Easy to make with built-in Unity functions

Camera controls

Switching environments

Post processing (Render Pipelines have a post-processing layer)

Triggerable props (you can just show and hide stuff in Unity)

Throwable and droppable props (physic material, colliders, instantiate)

Triggerable particles (the loathed particle system!!!)