Monday, September 28, 2015

Deinterlacing with POT on 1080i Videos

A few weeks ago I came across the interesting task of adding deinterlacing to the POT rendering pipeline.
I had never looked into deinterlacing with POT, but the code was actually already present in the repo: by switching the proper enum in the sources, deinterlacing could be enabled. The resulting performance, however, was pretty bad when applied to 1080i videos.
This demo is, instead, the result of a few quick patches:


After a quick investigation it seemed fairly clear that the reason was too much work being done by the VPU. When running without deinterlacing, this is approximately the load on the VPUs as reported by vcdbg:

[...]
set yrange [-0.1:30.1]
set multiplot
set title 'Core 1 (VPU0)'
set ytics ("Idle" 0, "Int timer0" 1, "Int timer2" 2, "Int codec0" 3, "Int codec2" 4, "Int 3d" 5, "Int msync2" 6, "Int dma3" 7, "Int 0x5e" 8, "Int 0x5f" 9, "Int scaler" 10, "DISPSERVX" 11, "SysTimer" 12, "Dispmanx Ch1 EO" 13, "AUDSRV" 14, "HDMI_HOTPLUG_TM" 15, "HDMI TASK" 16, "KHRN_S" 17, "powerman" 18, "temp_check" 19, "v3d_gfxh16_thre" 20, "VCHIQ-0" 21, "VCHIQr-0" 22, "ILCS_VC" 23, "ILClock" 24, "ILVDecode" 25, "ILADecode" 26, "ILEGLRender" 27, "ILAMixer" 28, "ILVScheduler" 29, "ILARender" 30)
plot 'task.dat' index 30 with boxes lt 5 title 'ILARender 3.39%', 'task.dat' index 29 with boxes lt 1 title 'ILVScheduler 0.26%', 'task.dat' index 28 with boxes lt 2 title 'ILAMixer 1.19%', 'task.dat' index 27 with boxes lt 3 title 'ILEGLRender 46.3%', 'task.dat' index 26 with boxes lt 4 title 'ILADecode 0.64%', 'task.dat' index 25 with boxes lt 11 title 'ILVDecode 0.36%', 'task.dat' index 24 with boxes lt 12 title 'ILClock 0.18%', 'task.dat' index 23 with boxes lt 13 title 'ILCS_VC 0.30%', 'task.dat' index 22 with boxes lt 14 title 'VCHIQr-0 0.01%', 'task.dat' index 21 with boxes lt 15 title 'VCHIQ-0 0.30%', 'task.dat' index 20 with boxes lt 10 title 'v3d_gfxh16_thre 0.02%', 'task.dat' index 19 with boxes lt 9 title 'temp_check 0.02%', 'task.dat' index 18 with boxes lt 5 title 'powerman 0.15%', 'task.dat' index 17 with boxes lt 1 title 'KHRN_S 5.13%', 'task.dat' index 16 with boxes lt 2 title 'HDMI TASK 0.04%', 'task.dat' index 15 with boxes lt 3 title 'HDMI_HOTPLUG_TM 0.01%', 'task.dat' index 14 with boxes lt 4 title 'AUDSRV 0.08%', 'task.dat' index 13 with boxes lt 11 title 'Dispmanx Ch1 EO 0.08%', 'task.dat' index 12 with boxes lt 12 title 'SysTimer 0.16%', 'task.dat' index 11 with boxes lt 13 title 'DISPSERVX 0.14%', 'task.dat' using 1:2:($3/2) index 10 with xerrorbars lt 14 title 'Int scaler 0.07%', 'task.dat' using 1:2:($3/2) index 9 with xerrorbars lt 15 title 'Int 0x5f 0.02%', 'task.dat' using 1:2:($3/2) index 8 with xerrorbars lt 10 title 'Int 0x5e 0.10%', 'task.dat' using 1:2:($3/2) index 7 with xerrorbars lt 9 title 'Int dma3 0.17%', 'task.dat' using 1:2:($3/2) index 6 with xerrorbars lt 5 title 'Int msync2 0.00%', 'task.dat' using 1:2:($3/2) index 5 with xerrorbars lt 1 title 'Int 3d 0.07%', 'task.dat' using 1:2:($3/2) index 4 with xerrorbars lt 2 title 'Int codec2 0.11%', 'task.dat' using 1:2:($3/2) index 3 with xerrorbars lt 3 title 'Int codec0 0.02%', 'task.dat' using 1:2:($3/2) index 2 with xerrorbars lt 4 title 'Int timer2 0.10%', 'task.dat' using 
1:2:($3/2) index 1 with xerrorbars lt 11 title 'Int timer0 0.03%', 0 title 'Idle 40.4%'
set nomultiplot
pause -1 'Hit return to continue'

set yrange [-0.1:3.1]
set multiplot
set title 'Core 2 (VPU1)'
set ytics ("Idle" 0, "Int msync3" 1, "khrn_llat_threa" 2, "H264#0 Outer" 3)
plot 'task.dat' index 34 with boxes lt 5 title 'H264#0 Outer 1.96%', 'task.dat' index 33 with boxes lt 1 title 'khrn_llat_threa 1.48%', 'task.dat' using 1:2:($3/2) index 32 with xerrorbars lt 2 title 'Int msync3 0.02%', 0 title 'Idle 96.5%'
set nomultiplot
pause -1 'Hit return to continue'
[...]

By forcing deinterlacing we get this:

[...]
set yrange [-0.1:31.1]
set multiplot
set title 'Core 1 (VPU0)'
set ytics ("Idle" 0, "Int timer0" 1, "Int timer2" 2, "Int codec0" 3, "Int codec2" 4, "Int 3d" 5, "Int msync2" 6, "Int dma3" 7, "Int 0x5e" 8, "Int 0x5f" 9, "Int scaler" 10, "DISPSERVX" 11, "SysTimer" 12, "Dispmanx Ch1 EO" 13, "AUDSRV" 14, "HDMI_HOTPLUG_TM" 15, "HDMI TASK" 16, "KHRN_S" 17, "mbox_read" 18, "powerman" 19, "temp_check" 20, "v3d_gfxh16_thre" 21, "VCHIQ-0" 22, "VCHIQr-0" 23, "ILCS_VC" 24, "ILClock" 25, "ILVDecode" 26, "ILADecode" 27, "ILAMixer" 28, "ILEGLRender" 29, "ILVScheduler" 30, "ILARender" 31)
plot 'task.dat' index 31 with boxes lt 5 title 'ILARender 3.15%', 'task.dat' index 30 with boxes lt 1 title 'ILVScheduler 0.39%', 'task.dat' index 29 with boxes lt 2 title 'ILEGLRender 84.9%', 'task.dat' index 28 with boxes lt 3 title 'ILAMixer 1.42%', 'task.dat' index 27 with boxes lt 4 title 'ILADecode 0.95%', 'task.dat' index 26 with boxes lt 11 title 'ILVDecode 0.35%', 'task.dat' index 25 with boxes lt 12 title 'ILClock 0.36%', 'task.dat' index 24 with boxes lt 13 title 'ILCS_VC 0.36%', 'task.dat' index 23 with boxes lt 14 title 'VCHIQr-0 0.02%', 'task.dat' index 22 with boxes lt 15 title 'VCHIQ-0 0.34%', 'task.dat' index 21 with boxes lt 10 title 'v3d_gfxh16_thre 0.02%', 'task.dat' index 20 with boxes lt 9 title 'temp_check 0.02%', 'task.dat' index 19 with boxes lt 5 title 'powerman 0.14%', 'task.dat' index 18 with boxes lt 1 title 'mbox_read 0.14%', 'task.dat' index 17 with boxes lt 2 title 'KHRN_S 5.10%', 'task.dat' index 16 with boxes lt 3 title 'HDMI TASK 0.03%', 'task.dat' index 15 with boxes lt 4 title 'HDMI_HOTPLUG_TM 0.01%', 'task.dat' index 14 with boxes lt 11 title 'AUDSRV 0.08%', 'task.dat' index 13 with boxes lt 12 title 'Dispmanx Ch1 EO 0.08%', 'task.dat' index 12 with boxes lt 13 title 'SysTimer 0.15%', 'task.dat' index 11 with boxes lt 14 title 'DISPSERVX 0.15%', 'task.dat' using 1:2:($3/2) index 10 with xerrorbars lt 15 title 'Int scaler 0.07%', 'task.dat' using 1:2:($3/2) index 9 with xerrorbars lt 10 title 'Int 0x5f 0.03%', 'task.dat' using 1:2:($3/2) index 8 with xerrorbars lt 9 title 'Int 0x5e 0.11%', 'task.dat' using 1:2:($3/2) index 7 with xerrorbars lt 5 title 'Int dma3 0.19%', 'task.dat' using 1:2:($3/2) index 6 with xerrorbars lt 1 title 'Int msync2 0.01%', 'task.dat' using 1:2:($3/2) index 5 with xerrorbars lt 2 title 'Int 3d 0.06%', 'task.dat' using 1:2:($3/2) index 4 with xerrorbars lt 3 title 'Int codec2 0.13%', 'task.dat' using 1:2:($3/2) index 3 with xerrorbars lt 4 title 'Int codec0 0.01%', 'task.dat' using 1:2:($3/2) index 2 
with xerrorbars lt 11 title 'Int timer2 0.11%', 'task.dat' using 1:2:($3/2) index 1 with xerrorbars lt 12 title 'Int timer0 0.04%', 0 title 'Idle 1.01%'
set nomultiplot
pause -1 'Hit return to continue'

set yrange [-0.1:6.1]
set multiplot
set title 'Core 2 (VPU1)'
set ytics ("Idle" 0, "Int msync3" 1, "SysTimer2" 2, "khrn_llat_threa" 3, "H264#0 Outer" 4, "ILARender" 5, "ILImageFX" 6)
plot 'task.dat' index 38 with boxes lt 5 title 'ILImageFX 21.5%', 'task.dat' index 37 with boxes lt 1 title 'ILARender 0.00%', 'task.dat' index 36 with boxes lt 2 title 'H264#0 Outer 1.58%', 'task.dat' index 35 with boxes lt 3 title 'khrn_llat_threa 1.54%', 'task.dat' index 34 with boxes lt 4 title 'SysTimer2 0.02%', 'task.dat' using 1:2:($3/2) index 33 with xerrorbars lt 11 title 'Int msync3 0.02%', 0 title 'Idle 75.2%'
set nomultiplot
pause -1 'Hit return to continue'
[...]

It is clear that one VPU is completely saturated by the EGL component: on VPU0, ILEGLRender rises from 46.3% to 84.9% and idle time drops from 40.4% to 1.01%.
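
The load figures can be read straight out of the legend titles of the plot commands that vcdbg generates. The following is only a minimal sketch of that idea, not part of POT or vcdbg: the function name `task_loads` and the regular expression are my own assumptions, and the two input strings are abridged copies of the 'Core 1 (VPU0)' plot commands above.

```python
import re

# Matches legend entries such as: title 'ILEGLRender 46.3%'
# Task names may contain spaces (e.g. 'HDMI TASK'), so the name is
# everything up to the last space before the percentage.
TITLE_RE = re.compile(r"title '([^']+) (\d+(?:\.\d+)?)%'")


def task_loads(plot_command):
    """Map each task name found in a plot command to its load in percent."""
    return {name: float(pct) for name, pct in TITLE_RE.findall(plot_command)}


# Abridged copies of the two 'Core 1 (VPU0)' plot commands shown above.
plain = task_loads(
    "plot 'task.dat' index 27 with boxes lt 3 title 'ILEGLRender 46.3%', "
    "0 title 'Idle 40.4%'"
)
forced = task_loads(
    "plot 'task.dat' index 29 with boxes lt 2 title 'ILEGLRender 84.9%', "
    "0 title 'Idle 1.01%'"
)

for label, loads in (("no deinterlacing", plain), ("forced deinterlacing", forced)):
    busy = 100.0 - loads["Idle"]  # everything that is not idle time
    print(f"{label}: ILEGLRender {loads['ILEGLRender']:.1f}%, VPU0 busy {busy:.2f}%")
```

With the idle values above, VPU0 is busy roughly 59.6% of the time without deinterlacing and roughly 98.99% with it.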

After discussing this issue here, I added a couple of patches to POT that now allow reducing the frame rate of the deinterlaced stream and moving the deinterlacing algorithm from the VPU to the QPU. Both are controlled by setting three environment variables before startup:

POT_DEINTERLACE_MODE
POT_DEINTERLACE_QPU
POT_HALF_FRAMERATE_MODE
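
As a rough sketch of how these variables could be passed to a player from a launcher script: only the three variable names come from this post. The helper `pot_environment`, the value "1" and the `./player` binary in the usage comment are placeholders I made up for illustration; check the POT sources for the values that are actually accepted.

```python
import os

# Variable names as listed above. The values they accept are NOT given
# in this post, so the helper passes through whatever the caller provides.
POT_VARIABLES = (
    "POT_DEINTERLACE_MODE",
    "POT_DEINTERLACE_QPU",
    "POT_HALF_FRAMERATE_MODE",
)


def pot_environment(settings, base=None):
    """Return a copy of the environment with the given POT variables set."""
    env = dict(os.environ if base is None else base)
    for name, value in settings.items():
        if name not in POT_VARIABLES:
            raise ValueError(f"unknown POT variable: {name}")
        env[name] = str(value)
    return env


# Hypothetical usage ("1", "./player" and "video.ts" are placeholders):
#   import subprocess
#   env = pot_environment({"POT_DEINTERLACE_QPU": "1"})
#   subprocess.run(["./player", "video.ts"], env=env)
```

Building a copy of the environment instead of modifying `os.environ` keeps the settings local to the launched process.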

The result of halving the frame rate is:

[...]
set yrange [-0.1:30.1]
set multiplot
set title 'Core 1 (VPU0)'
set ytics ("Idle" 0, "Int timer0" 1, "Int timer2" 2, "Int codec0" 3, "Int codec2" 4, "Int 3d" 5, "Int msync2" 6, "Int dma3" 7, "Int 0x5e" 8, "Int 0x5f" 9, "Int scaler" 10, "DISPSERVX" 11, "SysTimer" 12, "Dispmanx Ch1 EO" 13, "AUDSRV" 14, "HDMI_HOTPLUG_TM" 15, "HDMI TASK" 16, "KHRN_S" 17, "powerman" 18, "temp_check" 19, "v3d_gfxh16_thre" 20, "VCHIQ-0" 21, "VCHIQr-0" 22, "ILCS_VC" 23, "ILClock" 24, "ILVDecode" 25, "ILADecode" 26, "ILAMixer" 27, "ILEGLRender" 28, "ILVScheduler" 29, "ILARender" 30)
plot 'task.dat' index 30 with boxes lt 5 title 'ILARender 3.81%', 'task.dat' index 29 with boxes lt 1 title 'ILVScheduler 0.26%', 'task.dat' index 28 with boxes lt 2 title 'ILEGLRender 55.3%', 'task.dat' index 27 with boxes lt 3 title 'ILAMixer 1.49%', 'task.dat' index 26 with boxes lt 4 title 'ILADecode 0.91%', 'task.dat' index 25 with boxes lt 11 title 'ILVDecode 0.39%', 'task.dat' index 24 with boxes lt 12 title 'ILClock 0.19%', 'task.dat' index 23 with boxes lt 13 title 'ILCS_VC 0.37%', 'task.dat' index 22 with boxes lt 14 title 'VCHIQr-0 0.02%', 'task.dat' index 21 with boxes lt 15 title 'VCHIQ-0 0.38%', 'task.dat' index 20 with boxes lt 10 title 'v3d_gfxh16_thre 0.02%', 'task.dat' index 19 with boxes lt 9 title 'temp_check 0.02%', 'task.dat' index 18 with boxes lt 5 title 'powerman 0.15%', 'task.dat' index 17 with boxes lt 1 title 'KHRN_S 4.95%', 'task.dat' index 16 with boxes lt 2 title 'HDMI TASK 0.04%', 'task.dat' index 15 with boxes lt 3 title 'HDMI_HOTPLUG_TM 0.01%', 'task.dat' index 14 with boxes lt 4 title 'AUDSRV 0.08%', 'task.dat' index 13 with boxes lt 11 title 'Dispmanx Ch1 EO 0.08%', 'task.dat' index 12 with boxes lt 12 title 'SysTimer 0.17%', 'task.dat' index 11 with boxes lt 13 title 'DISPSERVX 0.13%', 'task.dat' using 1:2:($3/2) index 10 with xerrorbars lt 14 title 'Int scaler 0.08%', 'task.dat' using 1:2:($3/2) index 9 with xerrorbars lt 15 title 'Int 0x5f 0.03%', 'task.dat' using 1:2:($3/2) index 8 with xerrorbars lt 10 title 'Int 0x5e 0.12%', 'task.dat' using 1:2:($3/2) index 7 with xerrorbars lt 9 title 'Int dma3 0.18%', 'task.dat' using 1:2:($3/2) index 6 with xerrorbars lt 5 title 'Int msync2 0.01%', 'task.dat' using 1:2:($3/2) index 5 with xerrorbars lt 1 title 'Int 3d 0.07%', 'task.dat' using 1:2:($3/2) index 4 with xerrorbars lt 2 title 'Int codec2 0.13%', 'task.dat' using 1:2:($3/2) index 3 with xerrorbars lt 3 title 'Int codec0 0.02%', 'task.dat' using 1:2:($3/2) index 2 with xerrorbars lt 4 title 'Int timer2 0.10%', 'task.dat' using 
1:2:($3/2) index 1 with xerrorbars lt 11 title 'Int timer0 0.04%', 0 title 'Idle 30.3%'
set nomultiplot
pause -1 'Hit return to continue'

set yrange [-0.1:5.1]
set multiplot
set title 'Core 2 (VPU1)'
set ytics ("Idle" 0, "Int msync3" 1, "SysTimer2" 2, "khrn_llat_threa" 3, "H264#0 Outer" 4, "ILImageFX" 5)
plot 'task.dat' index 36 with boxes lt 5 title 'ILImageFX 15.7%', 'task.dat' index 35 with boxes lt 1 title 'H264#0 Outer 1.95%', 'task.dat' index 34 with boxes lt 2 title 'khrn_llat_threa 1.43%', 'task.dat' index 33 with boxes lt 3 title 'SysTimer2 0.03%', 'task.dat' using 1:2:($3/2) index 32 with xerrorbars lt 4 title 'Int msync3 0.02%', 0 title 'Idle 80.8%'
set nomultiplot
pause -1 'Hit return to continue'
[...]

And now enabling QPU deinterlacing:

[...]
set yrange [-0.1:31.1]
set multiplot
set title 'Core 1 (VPU0)'
set ytics ("Idle" 0, "Int timer0" 1, "Int timer2" 2, "Int codec0" 3, "Int codec2" 4, "Int 3d" 5, "Int msync2" 6, "Int dma3" 7, "Int 0x5e" 8, "Int 0x5f" 9, "Int scaler" 10, "DISPSERVX" 11, "SysTimer" 12, "Dispmanx Ch1 EO" 13, "AUDSRV" 14, "HDMI_HOTPLUG_TM" 15, "HDMI TASK" 16, "KHRN_S" 17, "mbox_read" 18, "powerman" 19, "temp_check" 20, "v3d_gfxh16_thre" 21, "VCHIQ-0" 22, "VCHIQr-0" 23, "ILCS_VC" 24, "ILClock" 25, "ILVDecode" 26, "ILADecode" 27, "ILEGLRender" 28, "ILAMixer" 29, "ILVScheduler" 30, "ILARender" 31)
plot 'task.dat' index 31 with boxes lt 5 title 'ILARender 3.67%', 'task.dat' index 30 with boxes lt 1 title 'ILVScheduler 0.26%', 'task.dat' index 29 with boxes lt 2 title 'ILAMixer 1.57%', 'task.dat' index 28 with boxes lt 3 title 'ILEGLRender 50.3%', 'task.dat' index 27 with boxes lt 4 title 'ILADecode 0.97%', 'task.dat' index 26 with boxes lt 11 title 'ILVDecode 0.39%', 'task.dat' index 25 with boxes lt 12 title 'ILClock 0.19%', 'task.dat' index 24 with boxes lt 13 title 'ILCS_VC 0.37%', 'task.dat' index 23 with boxes lt 14 title 'VCHIQr-0 0.02%', 'task.dat' index 22 with boxes lt 15 title 'VCHIQ-0 0.34%', 'task.dat' index 21 with boxes lt 10 title 'v3d_gfxh16_thre 0.02%', 'task.dat' index 20 with boxes lt 9 title 'temp_check 0.02%', 'task.dat' index 19 with boxes lt 5 title 'powerman 0.15%', 'task.dat' index 18 with boxes lt 1 title 'mbox_read 0.13%', 'task.dat' index 17 with boxes lt 2 title 'KHRN_S 4.96%', 'task.dat' index 16 with boxes lt 3 title 'HDMI TASK 0.04%', 'task.dat' index 15 with boxes lt 4 title 'HDMI_HOTPLUG_TM 0.01%', 'task.dat' index 14 with boxes lt 11 title 'AUDSRV 0.08%', 'task.dat' index 13 with boxes lt 12 title 'Dispmanx Ch1 EO 0.08%', 'task.dat' index 12 with boxes lt 13 title 'SysTimer 0.17%', 'task.dat' index 11 with boxes lt 14 title 'DISPSERVX 0.14%', 'task.dat' using 1:2:($3/2) index 10 with xerrorbars lt 15 title 'Int scaler 0.08%', 'task.dat' using 1:2:($3/2) index 9 with xerrorbars lt 10 title 'Int 0x5f 0.03%', 'task.dat' using 1:2:($3/2) index 8 with xerrorbars lt 9 title 'Int 0x5e 0.12%', 'task.dat' using 1:2:($3/2) index 7 with xerrorbars lt 5 title 'Int dma3 0.19%', 'task.dat' using 1:2:($3/2) index 6 with xerrorbars lt 1 title 'Int msync2 0.01%', 'task.dat' using 1:2:($3/2) index 5 with xerrorbars lt 2 title 'Int 3d 0.07%', 'task.dat' using 1:2:($3/2) index 4 with xerrorbars lt 3 title 'Int codec2 0.14%', 'task.dat' using 1:2:($3/2) index 3 with xerrorbars lt 4 title 'Int codec0 0.02%', 'task.dat' using 1:2:($3/2) index 2 
with xerrorbars lt 11 title 'Int timer2 0.10%', 'task.dat' using 1:2:($3/2) index 1 with xerrorbars lt 12 title 'Int timer0 0.04%', 0 title 'Idle 35.3%'
set nomultiplot
pause -1 'Hit return to continue'

set yrange [-0.1:5.1]
set multiplot
set title 'Core 2 (VPU1)'
set ytics ("Idle" 0, "Int msync3" 1, "SysTimer2" 2, "khrn_llat_threa" 3, "H264#0 Outer" 4, "ILImageFX" 5)
plot 'task.dat' index 37 with boxes lt 5 title 'ILImageFX 15.1%', 'task.dat' index 36 with boxes lt 1 title 'H264#0 Outer 2.05%', 'task.dat' index 35 with boxes lt 2 title 'khrn_llat_threa 1.42%', 'task.dat' index 34 with boxes lt 3 title 'SysTimer2 0.03%', 'task.dat' using 1:2:($3/2) index 33 with xerrorbars lt 4 title 'Int msync3 0.02%', 0 title 'Idle 81.3%'
set nomultiplot
pause -1 'Hit return to continue'
[...]

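So the impact is very small, but still positive: compared with the previous run, ILEGLRender on VPU0 drops from 55.3% to 50.3% and idle time rises from 30.3% to 35.3%.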
Being able to deinterlace 1080i may be useful when working with DVB-T/S for instance, so someone may be interested in such a feature.
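
To put the four runs side by side, the key figures from the plots above can be collected in a small table and compared. This is just a sketch: the labels of the runs and the helper `delta` are my own naming, while the numbers are copied from the legends (ILEGLRender and Idle from 'Core 1 (VPU0)', ILImageFX from 'Core 2 (VPU1)').

```python
# Key figures copied from the vcdbg plots above, in percent.
# ILImageFX does not appear in the run without deinterlacing.
RUNS = {
    "no deinterlacing": {"ILEGLRender": 46.3, "Idle": 40.4},
    "forced deinterlacing": {"ILEGLRender": 84.9, "Idle": 1.01, "ILImageFX": 21.5},
    "half frame rate": {"ILEGLRender": 55.3, "Idle": 30.3, "ILImageFX": 15.7},
    "QPU deinterlacing": {"ILEGLRender": 50.3, "Idle": 35.3, "ILImageFX": 15.1},
}


def delta(before, after):
    """Percentage-point change for every task present in both runs."""
    return {
        task: round(after[task] - before[task], 2)
        for task in before
        if task in after
    }


change = delta(RUNS["half frame rate"], RUNS["QPU deinterlacing"])
for task, points in change.items():
    print(f"{task}: {points:+.2f} percentage points")
```

Between the half frame rate run and the QPU run this gives about -5 points for ILEGLRender, +5 points of idle time on VPU0 and -0.6 points for ILImageFX on VPU1.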

NOTE: Of course, to analyse the results properly we should plot multiple samples and draw conclusions accordingly, not just one as I did. However, as time is a little short, I'll end this topic here with these rough and approximate values. If anyone wants to go on with a better analysis, the repository will be updated soon.
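
For anyone who wants to do that, a possible starting point is to average the per-task loads over several vcdbg captures. The sketch below assumes each capture has already been turned into a dictionary of task name to percentage; the function name `summarise` and the data layout are my own choices, not something POT or vcdbg provides.

```python
from statistics import mean, stdev


def summarise(samples):
    """Return {task: (mean, stdev, count)} over per-capture {task: percent} dicts.

    A task missing from a capture (for example mbox_read, which shows up
    only in some of the runs above) is averaged over the captures where it
    appears. With a single value the standard deviation is reported as 0.0.
    """
    values = {}
    for sample in samples:
        for task, pct in sample.items():
            values.setdefault(task, []).append(pct)
    return {
        task: (mean(v), stdev(v) if len(v) > 1 else 0.0, len(v))
        for task, v in values.items()
    }


# The single QPU capture from this post: one sample, so no spread yet.
for task, (avg, spread, count) in summarise([{"ILEGLRender": 50.3, "Idle": 35.3}]).items():
    print(f"{task}: mean {avg:.2f}%, stdev {spread:.2f}, samples {count}")
```

With only one capture per configuration, as in this post, the spread is unknown, which is exactly why the differences above should be taken as rough values.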

5 comments:

  1. Hello Luca Carlon,
    I saw your result and it's very interesting. The result is very good. My question: do you have Pi3 images with this solution? Maybe it is possible to make ts streams with it?

    1. Pi3 is perfectly compatible with Pi2. The same image works properly.
      I think ts streams should work already if the codec used is h264 or something you have a hardware decoder for.

  2. The reason I ask is that I've tried realtime transcoding of a DVB stream with gstreamer, using OMX decode and encode without deinterlacing. It works really well, and now I'm trying to use your solution to deinterlace a 1080i source for this, but it's very difficult. Could you help me with this a little bit?

    1. I'm sorry, at the moment I don't have much spare time. You can open an issue and provide a sample, but I can't guarantee anything.
