A Fast, Small-Radius GPU Median Filter

Published in ShaderX6.

Morgan McGuire, Williams College

3x3 Median Shader (GLSL)
3x3 Median Shader (GLSL, optimized for GeForce 8800)
5x5 Median Shader (GLSL)

Used by Qualcomm [Michael Mangan], libavg [ref], and many research papers. E-mail to list your project here.

Thanks to Kyle Whitson '09, who wrote the demos and GPU performance tests, and to NVIDIA for donating the GeForce 8800 GPUs used in testing.


This chapter describes a very fast median filter for today's GPUs and explains how to port it to future GPUs and to other data-parallel processors, such as DSPs and CPUs with SIMD vector instructions (e.g., MMX). The technique is inherently fast because it is designed with the ideal characteristics for streaming parallel architectures:

  • No branches
  • Single-pass
  • Data-parallel across pixels
  • Data-parallel at each pixel
  • High compute-to-memory ratio
On a GeForce 8800 or comparable GPU, this optimized filter can process multiple 4096x4096 video sequences at over 100 fps, which is important for real-time video processing. We give shaders for the 3x3 and 5x5 kernels, for which our filter is appropriate, and for a sample higher-order real-time non-photorealistic filter built on several applications of the median.
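To make the "no branches" and "data-parallel at each pixel" properties concrete, here is a minimal CPU-side sketch in C of a 3x3 median built only from min/max compare-exchanges. It is an illustration under stated assumptions, not the chapter's shader: the names `median9`, `exchange`, and `min_max_to_ends` are hypothetical, the inputs are assumed to contain no NaNs, and this simple elimination scheme uses more exchanges than a tuned network would. The idea is that the minimum and maximum of any 6 of the 9 values cannot be needed for the median, so they can be discarded and one unread value brought in, repeating with 5, 4, and finally 3 candidates.

```c
#include <assert.h>

/* Conditional selects with no data-dependent control flow; compilers
   typically lower these to min/max or select instructions.
   Assumption: inputs contain no NaNs. */
static inline float min_f(float a, float b) { return a < b ? a : b; }
static inline float max_f(float a, float b) { return a > b ? a : b; }

/* Compare-exchange: afterwards *a <= *b. */
static inline void exchange(float *a, float *b) {
    float lo = min_f(*a, *b);
    float hi = max_f(*a, *b);
    *a = lo;
    *b = hi;
}

/* Move the minimum of w[0..n-1] to w[0] and the maximum to w[n-1].
   The loop bounds depend only on n, never on the data. */
static void min_max_to_ends(float *w, int n) {
    assert(n >= 2);
    for (int i = 1; i < n; ++i)     exchange(&w[0], &w[i]);
    for (int i = 1; i < n - 1; ++i) exchange(&w[i], &w[n - 1]);
}

/* Median of the nine values of a 3x3 neighborhood (hypothetical helper). */
float median9(const float p[9]) {
    float w[6];
    for (int i = 0; i < 6; ++i) w[i] = p[i];

    int next = 6;                  /* index of the next unread input */
    for (int n = 6; n > 3; --n) {
        min_max_to_ends(w, n);     /* w[0] and w[n-1] cannot be needed */
        w[0] = p[next++];          /* overwrite the min with a new value;
                                      the max drops out as n shrinks */
    }
    min_max_to_ends(w, 3);         /* three candidates remain */
    return w[1];                   /* the middle one is the median */
}
```

Every pixel runs the same fixed sequence of exchanges regardless of its data, which is what suits the scheme to streaming hardware. In GLSL, the built-in component-wise `min` and `max` would take the place of `min_f` and `max_f`.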


Source, 3x3, and 5x5 filtered:

Watercolor filter before and after:

Cartoon filter before and after:
(attempting to mimic A Scanner Darkly in real time)

Performance comparison:


  author = {Morgan Mc{G}uire},
  title = {A fast, small-radius GPU median filter},
  booktitle = {ShaderX6},
  year = {2008},
  month = {February},
  url = {http://graphics.cs.williams.edu/papers/MedianShaderX6/},