Dear Jan,

 

Thank you very much for your very detailed analysis. We will try to reproduce this as soon as possible.

 

Three questions:

- You only use threads, no MPI parallelization, correct?

- Does your machine have at least 32 cores?

- Do the neurons receive the expected input currents, in particular the same currents independent of the number of threads?

 

Best,

Hans Ekkehard

 

--

 

Prof. Dr. Hans Ekkehard Plesser

Head, Department of Data Science

 

Faculty of Science and Technology

Norwegian University of Life Sciences

PO Box 5003, 1432 Aas, Norway

 

Phone +47 6723 1560

Email hans.ekkehard.plesser@nmbu.no

Home http://arken.nmbu.no/~plesser

 

 

 

On 28/04/2022, 16:22, "Jan Střeleček" <strelda@protonmail.com> wrote:

 

 

Dear NEST developers,

In our group, we are working on a model of the primary visual cortex and use step_current_generator devices to simulate the input currents of the LGN neurons. We noticed that the simulation time of our model is very sensitive to the number of step current generators. While narrowing down the cause, we found that this might be due to an issue with the parallelization of the step_current_generator. A simple system in which the problem can be observed is attached below as simple_example.py. It creates NS step current generators (sources) and connects them to NL neurons of type iaf_cond_exp with a fixed indegree. Increasing the number of sources does not benefit from multithreading as one would expect, in contrast to increasing the number of neurons; see the technical details below. Our rough estimate is that, going from 1 to 32 threads, the cost per source ends up 10 to 20 times higher than ideal parallel scaling would suggest.

 

 

Technical details:

 

We quantified the relative cost of the step current generators by fitting the linear model

  simulation time = a NL + b NS

to the measured simulation times; see slowdown_example.png.

We then calculated the ratio b/a and measured it as a function of the number of threads. A larger difference between the ratio for 1 thread and the ratio for 32 threads indicates a more severe parallelization problem in the step_current_generator.
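For readers who want to repeat the fit, here is a minimal sketch of the regression described above, assuming NumPy is available. The function name fit_slowdown_ratio and the timing values are made up for illustration; they are not taken from simple_example.py or from the measurements in this thread. The model has no intercept, matching the formula above.

```python
import numpy as np


def fit_slowdown_ratio(n_neurons, n_sources, sim_times):
    """Least-squares fit of sim_time = a * NL + b * NS (no intercept).

    Returns the coefficients a and b and the ratio b / a.
    """
    # Design matrix with one column per predictor (NL, NS).
    X = np.column_stack([n_neurons, n_sources]).astype(float)
    y = np.asarray(sim_times, dtype=float)
    (a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    return a, b, b / a


# Synthetic timings for illustration only (hypothetical values, in seconds),
# generated from a = 0.002 s per neuron and b = 0.03 s per source.
NL = np.array([100, 200, 400, 100, 200, 400])
NS = np.array([10, 10, 10, 50, 50, 50])
times = 0.002 * NL + 0.03 * NS

a, b, ratio = fit_slowdown_ratio(NL, NS, times)
print(f"a = {a:.4f}, b = {b:.4f}, b/a = {ratio:.1f}")
```

Running the same fit on timings collected with 1 thread and with 32 threads, and comparing the two b/a values, gives the thread dependence discussed above.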

 

Some additional results:

- interval_dependence.png: the slowdown does not depend on the amplitude_times parameter of the step_current_generator.

- indegree_dependence.png: the slowdown depends on the indegree used in nest.Connect(source, neurons) and is worse for low indegree values. This suggests that the slowdown depends on the number of step current generators created, not on the current injections themselves.

 

 

Are you aware of any lack of parallelization in the step_current_generator or in the current injection itself? If so, are there any plans to improve it?

 

Best regards,

Jan Střeleček