
Feedback on skeleton "clip" shader

WiseKodama

Greetings,

When I was profiling our game, I noticed that our masks were quite big, which results in overdraw, not to mention breaking batching.
Masking in our game is quite simple: we have a water surface, and anything below it should be hidden.

I've made the following shader, which works for our needs.

Shader "Spine/Skeleton-Clip" {
   Properties{
      _Cutoff("Shadow alpha cutoff", Range(0,1)) = 0.1
      _WaterSurface("Water surface", Float) = 0.1
      _MaskHeight("Mask height", Float) = 0.1
      _MaskWidth("Mask width", Float) = 0.5
      [NoScaleOffset] _MainTex("Main Texture", 2D) = "black" {}
   }

  SubShader{
     Tags { "Queue" = "Transparent" "IgnoreProjector" = "True" "RenderType" = "Transparent" "PreviewType" = "Plane" }

     Fog { Mode Off }
     Cull Off
     ZWrite Off
     Blend SrcAlpha OneMinusSrcAlpha
     Lighting Off

     Pass {
        CGPROGRAM
        #pragma vertex vert
        #pragma fragment frag
        #include "UnityCG.cginc" 
        sampler2D _MainTex;
        float _WaterSurface;
        float _MaskHeight;
        float _MaskWidth;
        fixed _Cutoff; // must match the "_Cutoff" property name exactly (case-sensitive)

        struct VertexInput {
           float4 vertex : POSITION;
           float2 uv : TEXCOORD0;
           float4 vertexColor : COLOR;
        };

        struct VertexOutput {
           float4 pos : SV_POSITION;
           float2 uv : TEXCOORD0;
           float3 worldPos : TEXCOORD1;
           float4 vertexColor : COLOR;
        };

        VertexOutput vert(VertexInput v) {
           VertexOutput o;
           o.pos = UnityObjectToClipPos(v.vertex);
           o.uv = v.uv;
           o.worldPos = mul(unity_ObjectToWorld, v.vertex).xyz;
           o.vertexColor = v.vertexColor;
           return o;
        }

        float4 frag(VertexOutput i) : SV_Target {
           float4 texColor = tex2D(_MainTex, i.uv);

           const float PI = 3.1415926535897932384626433832795;
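           // Sine-shaped water line: vertical offset of the surface at this world-space x.
           // float rather than fixed: fixed only guarantees a small range (about -2..2).
           float height = _MaskHeight * sin(((i.worldPos.x - _MaskWidth * 0.5) / _MaskWidth) * PI);
           // step() is 0 below the water line and 1 above it, so fragments below become fully transparent.
           texColor.a *= step(_WaterSurface, i.worldPos.y - height);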

           return texColor;
        }
        ENDCG
     }
  }
}

I would love to hear suggestions and comments on how to maximize the upsides and minimize the downsides.
Note: we are targeting mobile devices. That is why I opted out of using clip: the transparent pixels cover only a very small area and thus do not really impact overdraw.

Harald

Your shader looks good to me.

WiseKodama wrote
Note: we are targeting mobile devices. That is why I opted out of using clip: the transparent pixels cover only a very small area and thus do not really impact overdraw.

Not using clip (or any discard operation) can be beneficial, especially on mobile devices, where discard may disable some hardware optimizations (see here, section "Avoid discard", and here), as you're likely already aware. Adding a discard operation at the very end of the shader is pointless anyway and would even be harmful; it would only make sense if you must avoid writing to the z-buffer or stencil buffer. A discard can only improve performance if, for example, texture sampling operations follow it, which might then be reduced to no-ops (or perhaps if a lot of expensive shader code follows it, the code is really branched instead of both paths being executed, and the other fragments of the 2x2 block also take the discard branch).

It's not relevant in your scenario, but for the sake of completeness: a potentially good way to achieve early-out rejection of fully transparent fragments (if expensive shader code follows a potential discard) is to split the shader into two passes. The first pass performs only the simple code up to the discard operation and writes only to the depth buffer. A second pass with ZTest set to Equal then executes the expensive shader code. This way, the expensive code is not run for fragments that early-z rejection discards. Similar things can be achieved using the stencil buffer if the mask object is separate from the masked objects.
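
The two-pass idea can be sketched roughly as follows. This is a minimal illustration for Unity's built-in render pipeline, not a tested production shader: the shader name, the function names fragDepth and fragColor, and the reuse of _Cutoff as the rejection threshold are assumptions made for the example. It also assumes that both passes transform vertices identically, since ZTest Equal only passes when the depth values match exactly.

```shaderlab
// Hypothetical example; all names are illustrative and not from the thread.
Shader "Hypothetical/TwoPassEarlyOut" {
   Properties {
      _Cutoff("Alpha cutoff", Range(0,1)) = 0.1
      [NoScaleOffset] _MainTex("Main Texture", 2D) = "black" {}
   }

   // Code shared by both passes, so they produce identical clip-space positions.
   CGINCLUDE
   #include "UnityCG.cginc"
   sampler2D _MainTex;
   fixed _Cutoff;

   struct VertexInput {
      float4 vertex : POSITION;
      float2 uv : TEXCOORD0;
   };

   struct VertexOutput {
      float4 pos : SV_POSITION;
      float2 uv : TEXCOORD0;
   };

   VertexOutput vert(VertexInput v) {
      VertexOutput o;
      o.pos = UnityObjectToClipPos(v.vertex);
      o.uv = v.uv;
      return o;
   }
   ENDCG

   SubShader {
      Tags { "Queue" = "AlphaTest" "IgnoreProjector" = "True" "RenderType" = "TransparentCutout" }
      Cull Off
      Lighting Off

      // Pass 1: cheap rejection test, writes depth only (no color).
      Pass {
         ZWrite On
         ColorMask 0

         CGPROGRAM
         #pragma vertex vert
         #pragma fragment fragDepth

         fixed4 fragDepth(VertexOutput i) : SV_Target {
            // Discarded fragments write no depth value.
            clip(tex2D(_MainTex, i.uv).a - _Cutoff);
            return 0;
         }
         ENDCG
      }

      // Pass 2: shades only fragments whose depth equals what pass 1 wrote.
      Pass {
         ZWrite Off
         ZTest Equal
         Blend SrcAlpha OneMinusSrcAlpha

         CGPROGRAM
         #pragma vertex vert
         #pragma fragment fragColor

         fixed4 fragColor(VertexOutput i) : SV_Target {
            // Placeholder for the expensive shading code.
            return tex2D(_MainTex, i.uv);
         }
         ENDCG
      }
   }
}
```

Because the second pass contains no discard, the hardware is free to apply early depth testing to it. The trade-off is an extra pass per object, and multi-pass shaders generally do not batch dynamically in Unity.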

WiseKodama wrote
When I was profiling our game, I noticed that our masks were quite big, which results in overdraw, not to mention breaking batching.
Masking in our game is quite simple: we have a water surface, and anything below it should be hidden.

By "our masks", do you mean the objects with the water shader applied (which draws the object and additionally sets pixels below the water line to transparent)? The term "mask" is a bit confusing here, given the shader code that follows it. Or do you have additional mask objects which are separate from the objects using the water shader?

WiseKodama

We use SpriteMasks, which in some cases need to be huge.

It's the big rectangles towards the bottom of the screen.

And it takes 3 draw calls for a single Spine mesh (2 for incrementing/decrementing the stencil, 1 for the mesh), if I am not mistaken.
With the shader it gets down to 1 draw call for a single Spine mesh (our meshes usually have more than the maximum number of vertices allowed for batching, so there is no gain there apart from a few small enemies).

Harald

WiseKodama wrote
And it takes 3 draw calls for a single Spine mesh (2 for incrementing/decrementing the stencil, 1 for the mesh), if I am not mistaken.

Why 2 draw calls per mask? It should only require a single draw call to render a mask into the stencil buffer. Or did I misunderstand you there?

What would make the most difference is whether you need to draw one mask per skeleton or can use a single mask for all your skeletons.

WiseKodama

It is a fake perspective, so one global mask would not suffice, since the skeletons are at different Z/Y positions.

The first draw call increments the stencil, then comes the Spine mesh, then the decrement.
That makes 3 draw calls per skeleton in our case, unless we are doing something wrong.

Harald

WiseKodama wrote
It is a fake perspective, so one global mask would not suffice, since the skeletons are at different Z/Y positions.

I see, thanks for the clarification.

One more solution that comes to my mind is to have an oblique water plane object (intersecting skeletons where the water surface would be) which only renders to the Z-Buffer just before your to-be-masked skeleton objects. Depending on your requirements regarding the water shape, skeleton placement and so on this might however not be a viable option.
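
A depth-only plane of this kind could look roughly like the sketch below. The shader name and the queue value are assumptions for illustration. It relies on the skeleton shader keeping the default depth test (ZTest LEqual), so that skeleton fragments lying behind the plane are rejected, and on the plane actually being drawn before the skeleton it should mask.

```shaderlab
// Hypothetical depth-only shader for a water plane mesh; names are illustrative.
Shader "Hypothetical/DepthOnlyWaterPlane" {
   SubShader {
      // Assumption: a queue value just below Transparent, so the plane is
      // drawn before skeletons that render in the Transparent queue.
      Tags { "Queue" = "Transparent-1" "IgnoreProjector" = "True" }

      Pass {
         ZWrite On     // write the plane's depth
         ColorMask 0   // but output no color, so the plane itself is invisible
      }
   }
}
```

With a fake perspective, a single plane only works if its depth values sort correctly against every skeleton; otherwise one plane per skeleton (and a controlled draw order) would be needed, which gives up part of the draw-call savings.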

WiseKodama wrote
The first draw call increments the stencil, then comes the Spine mesh, then the decrement.
That makes 3 draw calls per skeleton in our case, unless we are doing something wrong.

Ah, thanks for the screenshot, I had never noticed the two passes per mask before. This makes sense when each mask should only affect a set of objects and must then have its effect cleared from the stencil buffer again (since clearing the entire stencil buffer is not an option because of other masks).
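
For illustration, the increment / draw / decrement sequence could be expressed in ShaderLab roughly like this. It is a hypothetical sketch of the general technique, not the actual SpriteMask implementation; the shader names and the reference value 1 are assumptions, and the three draws are assumed to be issued in this order (for example via sorting order).

```shaderlab
// Hypothetical sketch; each Shader block would live in its own .shader file.

// Draw 1: the mask geometry raises the stencil value where it covers the screen.
Shader "Hypothetical/MaskIncrement" {
   SubShader {
      Tags { "Queue" = "Transparent" }
      Pass {
         ColorMask 0   // no color output
         ZWrite Off
         Stencil {
            Comp Always
            Pass IncrSat   // stencil += 1 (saturating)
         }
      }
   }
}

// Draw 2: the skeleton's shader adds a stencil test to its pass, e.g.
//    Stencil {
//       Ref 1
//       Comp Equal   // render only where the stencil value equals 1
//    }

// Draw 3: the same mask geometry is drawn again to undo its effect.
Shader "Hypothetical/MaskDecrement" {
   SubShader {
      Tags { "Queue" = "Transparent" }
      Pass {
         ColorMask 0
         ZWrite Off
         Stencil {
            Comp Always
            Pass DecrSat   // stencil -= 1 (saturating)
         }
      }
   }
}
```

Restoring the stencil value after each skeleton is what lets several independent masks coexist without clearing the whole buffer, at the cost of the two extra draw calls discussed above.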