3 Answers
PO
0

It looks like both TensorFlow methods (`tf.nn.max_pool2d` and `tf.nn.dilation2d`) work, but both are far slower than they could be, as OpenCV demonstrates. The following script yields:

Dilation of 640x480 image with a 25x25 kernel took: 

  545.27ms with maxpool

  228.72ms with dilate

  0.66ms with opencv

These timings are from my computer (MacBook Air with M1).

Source:

import numpy as np
import cv2
import tensorflow as tf
import time


def tf_dilate(heatmap, width: int = 20, use_max_pool_backend: bool = False):
    """ Dilate the heatmap with a square kernel
    Note - this is probably inefficient, as I suspect it's computing a max over a 20x20 box of pixels for each pixel
    """
    if use_max_pool_backend:
        return tf.nn.max_pool2d(heatmap[None, :, :, None], ksize=width, padding='SAME', strides=(1, 1))[0, :, :, 0]
    else:
        return tf.nn.dilation2d(heatmap[None, :, :, None], filters=tf.zeros((width, width, 1), dtype=heatmap.dtype),
                                strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))[0, :, :, 0]


def test_dilation_options(img_shape=(480, 640), kernel_size=25):

    img = np.random.randn(*img_shape).astype(np.float32)**2
    tf_image = tf.constant(img, dtype=tf.float32)
    t0 = time.time()
    result_tf_maxpool = tf_dilate(tf_image, width=kernel_size, use_max_pool_backend=True)
    t1 = time.time()
    result_tf_dilate = tf_dilate(tf_image, width=kernel_size, use_max_pool_backend=False)
    t2 = time.time()
    result_opencv = cv2.dilate(img, kernel=np.ones((kernel_size, kernel_size), dtype=np.float32))
    t3 = time.time()
    assert np.array_equal(result_tf_maxpool.numpy(), result_tf_dilate.numpy()), "Results of two tensorflow dilates not equal"
    assert np.array_equal(result_tf_dilate.numpy(), result_opencv), "Results of tensorflow and opencv not equal"
    print(f'Dilation of {img_shape[1]}x{img_shape[0]} image with a {kernel_size}x{kernel_size} kernel took: '
          f'\n  {(t1-t0)*1000:.2f}ms with maxpool'
          f'\n  {(t2-t1)*1000:.2f}ms with dilate'
          f'\n  {(t3-t2)*1000:.2f}ms with opencv'
          )


if __name__ == '__main__':
    test_dilation_options()
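
A caveat on the numbers above: each backend is timed on a single call, so whatever one-time setup the first TensorFlow call performs is probably included in the measurement. Here is a minimal sketch of a fairer timer. The helper name `best_time_ms` and its `warmup`/`repeats` parameters are my own invention for illustration, not part of the original script.

```python
import time


def best_time_ms(fn, warmup=1, repeats=5):
    """Return the fastest of `repeats` timed calls to fn(), in milliseconds."""
    for _ in range(warmup):
        fn()  # untimed call, meant to absorb one-time setup cost
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return min(timings)
```

In the script above it could wrap each backend, e.g. `best_time_ms(lambda: tf_dilate(tf_image, width=25).numpy())`, where the `.numpy()` call makes sure the result is actually materialised before the clock stops.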
PO
0

It looks like decomposing the square dilation into a row-wise pass followed by a column-wise pass gives roughly a 10x speedup for `tf.nn.dilation2d`, which still leaves us about 47x slower than OpenCV.

Dilation of 640x480 image with a 25x25 kernel took: 

  597.59ms with maxpool

  23.50ms with dilate

  0.50ms with opencv

Modified dilation function:

def tf_dilate(heatmap, width: int = 20, use_max_pool_backend: bool = False):
    """ Dilate the heatmap with a square kernel
    Note - the dilation2d branch splits the square kernel into a 1 x width row pass followed by a width x 1 column pass
    """
    if use_max_pool_backend:
        return tf.nn.max_pool2d(heatmap[None, :, :, None], ksize=width, padding='SAME', strides=(1, 1))[0, :, :, 0]
    else:
        row_dilation = tf.nn.dilation2d(heatmap[None, :, :, None], filters=tf.zeros((1, width, 1), dtype=heatmap.dtype),
                                        strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))
        full_dilation = tf.nn.dilation2d(row_dilation, filters=tf.zeros((width, 1, 1), dtype=heatmap.dtype),
                                         strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))
        return full_dilation[0, :, :, 0]

I've posted a Stack Overflow question to solicit more input on this: https://stackoverflow.com/questions/72733907/efficient-image-dilation-in-tensorflow
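
For anyone wondering why the row/column split gives the same answer: the max over a KxK box is the max of the per-row maxes inside that box. The following is a small pure-Python sketch of that identity. All function names are hypothetical, windows are simply clipped at the image edges, and K is assumed to be odd.

```python
def dilate_rows(img, k):
    """Max over a 1 x k window centred on each pixel (k odd, edges clipped)."""
    r = k // 2
    return [[max(row[max(0, x - r):x + r + 1]) for x in range(len(row))]
            for row in img]


def transpose(img):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*img)]


def dilate_separable(img, k):
    """Row pass, then column pass (done as a row pass on the transpose)."""
    return transpose(dilate_rows(transpose(dilate_rows(img, k)), k))


def dilate_full(img, k):
    """Direct max over the k x k window centred on each pixel."""
    r = k // 2
    h, w = len(img), len(img[0])
    return [[max(img[j][i]
                 for j in range(max(0, y - r), min(h, y + r + 1))
                 for i in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)]
            for y in range(h)]


image = [[0, 0, 0, 0, 0],
         [0, 5, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 9, 0]]
assert dilate_separable(image, 3) == dilate_full(image, 3)
```

The two passes touch about 2K values per pixel instead of K*K, which fits the order-of-magnitude speedup measured above for K=25.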

JW330.00
3

Another option is TensorFlow's max-pooling operation, and it does not have to be TensorFlow 2. Here is how you can do it:

erosion = -tf.nn.max_pool2d(-x, ksize=(k, k), strides=1, padding='SAME', name='erosion2D')
dilation = tf.nn.max_pool2d(x, ksize=(k, k), strides=1, padding='SAME', name='dilation2D')

This is a compact way to get both dilation and erosion in TensorFlow.
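
The erosion line relies on the identity min(x) = -max(-x). Here is a tiny 1D sketch of it in plain Python. The function names are made up for illustration, and windows are clipped at the ends, which I am assuming is close enough to how padded positions are treated for the purpose of this check.

```python
def sliding_max(values, k):
    """Dilation: max over a window of width k centred on each element (k odd)."""
    r = k // 2
    return [max(values[max(0, i - r):i + r + 1]) for i in range(len(values))]


def sliding_min(values, k):
    """Erosion: min over the same centred window."""
    r = k // 2
    return [min(values[max(0, i - r):i + r + 1]) for i in range(len(values))]


def erode_via_max(values, k):
    """Erosion expressed as negate, dilate, negate."""
    return [-v for v in sliding_max([-v for v in values], k)]


signal = [3, 1, 4, 1, 5, 9, 2, 6]
assert erode_via_max(signal, 3) == sliding_min(signal, 3)
```

So only a max primitive is needed; erosion comes for free by flipping signs before and after.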

PO
0

I haven't dug into the efficiency of this implementation. E.g., say you have a KxK square kernel on an HxW image; imagine K=25 or so.

Is it O(H*W*K*K) (i.e. a max over each KxK box per pixel), or something more efficient? It seems like it must be possible to take advantage of the fact that neighbouring boxes mostly overlap. I assume the implementation does not do this, since max pooling is usually done with non-overlapping boxes.
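
To make the "overlapping boxes" idea concrete: a 1D sliding-window max can be computed in O(n) total, independent of K, using a monotonic deque. Applied once along rows and once along columns, that would give O(H*W) for the whole dilation. This is only a sketch of the general technique; I don't know which algorithm TensorFlow or OpenCV actually use, and the function name is hypothetical.

```python
from collections import deque


def sliding_max_linear(values, k):
    """Centred sliding-window max in O(n) total time (k odd, edges clipped)."""
    r = k // 2
    n = len(values)
    out = []
    candidates = deque()  # indices whose values are strictly decreasing
    next_index = 0        # next element that has not entered the window yet
    for i in range(n):
        # Grow the window up to its right edge, dropping dominated candidates.
        while next_index <= min(n - 1, i + r):
            while candidates and values[candidates[-1]] <= values[next_index]:
                candidates.pop()
            candidates.append(next_index)
            next_index += 1
        # Discard candidates that fell off the left edge of the window.
        while candidates[0] < i - r:
            candidates.popleft()
        out.append(values[candidates[0]])  # front of the deque is the max
    return out


signal = [3, 1, 4, 1, 5, 9, 2, 6]
assert sliding_max_linear(signal, 3) == [3, 4, 4, 5, 9, 9, 9, 6]
```

Each index is pushed and popped at most once, which is where the linear bound comes from.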

JO297.00
1

TensorFlow provides two built-in functions for 2D erosion and dilation:

tf.nn.erosion2d(
    value,
    filters,
    strides,
    padding,
    data_format,
    dilations,
    name=None
)

tf.nn.dilation2d(
    input,
    filters,
    strides,
    padding,
    data_format,
    dilations,
    name=None
)

The official documentation for these functions is written for TensorFlow 2.*, but I believe they will work in TensorFlow 1.* as well. See the documentation pages for tf.nn.erosion2d and tf.nn.dilation2d.
