"Since 2002, the limits of power, available instruction-level parallelism, and long memory latency have slowed uniprocessor performance [growth] recently, to about 20% per year [down from 52% per year]." Hennessy and Patterson, Computer Architecture: A Quantitative Approach, Fourth Edition (2007), p.3.
Post-2002 the Moore's Law growth in transistor count has been increasingly devoted to adding additional cores to CPUs, rather than trying to improve single-core performance. In the prior decade single-threaded C/C++ applications were sped up "for free" with successive CPU generations. This free ride has ended, and left programmers with a choice. Either they can rewrite their single-threaded C/C++ applications to be multi-threaded (either explicitly, with pthreads, or more implicitly, say with OpenMP), or they can rewrite their applications in a different more multi-core-friendly programming language.
In choosing an alternative language two important related factors must be examined. First, how much of a single-threaded performance hit is taken moving to the new language? If the new language is say 40 times slower than C/C++, you will need a 40-core CPU to get back to where you started (assuming perfect linear speedup). Second, how much worse than linear is the actual speedup? If the speedup starts out at 1.5 with 2 cores, and drops below 1 at 8 cores, the new language will probably never catch single-threaded C/C++.