The convolution module of convolutional neural networks is highly computation-demanding. To execute neural network inference on embedded platforms, an efficient implementation of the convolution is required. Low-precision parameters yield an implementation that requires less memory, less computation time, and less power. Moreover, streaming the convolution computation over parallelized processing units saves a substantial amount of memory, a key concern on memory-constrained embedded platforms. In this paper, we show how the convolution module can be implemented on the Epiphany manycore architecture. Low-precision parameters are used, with ternary weights taking the values +1, 0, and -1. The computation is organized as a pipeline that streams data through the processing units. The proposed approach decreases the memory requirements of the CNN implementation and reaches up to 282 GOPS and up to 5.6 GOPS/Watt.
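As a minimal sketch of why ternary weights cut computation, the C snippet below implements a single-channel 3x3 convolution where each weight in {-1, 0, +1} turns a multiply-accumulate into an add, a subtract, or a skip. The function name, kernel size, and the toy input and kernel values are illustrative assumptions, not taken from the paper, and the sketch runs on a plain CPU rather than the Epiphany cores.

```c
#include <stdio.h>
#include <stdint.h>

#define K 3  /* assumed 3x3 kernel for illustration */

/* Ternary 3x3 convolution at output position (x, y):
 * weights restricted to {-1, 0, +1} mean no multiplications occur,
 * only additions, subtractions, and skipped zero taps. */
static int32_t conv3x3_ternary(const uint8_t *img, int width,
                               int x, int y, const int8_t w[K][K])
{
    int32_t acc = 0;
    for (int i = 0; i < K; i++) {
        for (int j = 0; j < K; j++) {
            uint8_t px = img[(y + i) * width + (x + j)];
            if (w[i][j] == 1)       acc += px;  /* +1: add      */
            else if (w[i][j] == -1) acc -= px;  /* -1: subtract */
            /* 0: skip the tap entirely */
        }
    }
    return acc;
}

int main(void)
{
    /* Toy 4x4 input and an edge-detector-like ternary kernel
     * (hypothetical values chosen only for demonstration). */
    const uint8_t img[4 * 4] = {
        10, 10, 50, 50,
        10, 10, 50, 50,
        10, 10, 50, 50,
        10, 10, 50, 50,
    };
    const int8_t w[K][K] = {
        { -1, 0, 1 },
        { -1, 0, 1 },
        { -1, 0, 1 },
    };
    /* Slide the kernel over every valid output position. */
    for (int y = 0; y <= 4 - K; y++)
        for (int x = 0; x <= 4 - K; x++)
            printf("out(%d,%d) = %d\n", y, x,
                   conv3x3_ternary(img, 4, x, y, w));
    return 0;
}
```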