This paper presents a unified design flow that aims to accelerate parallelizable, data-intensive applications in the context of ubiquitous computing. The contribution relies on JubiTool, a set of integrated tools (JubiSplitter and JubiCompiler) that respectively extract and compile the parallelizable parts of applications described in Jubi, an extension of the Java language. By appending hardware directives to a software agent description, the flow combines the inherent flexibility of software with the runtime performance of hardware execution. For typical Perplexus applications, such as a biologically plausible neural network simulator, the flow exploits the intrinsic parallelism of the Perplexus Ubichip, resulting in an expected speedup of one order of magnitude. Finally, we show that this original flow, which enables hardware acceleration, can be adapted to support other types of distributed platforms.