The more time I spend building products with machine learning at their core, the clearer it becomes to me that the ideal of a single data science language and framework will never be fully realized. There will always be legacy code written before the framework du jour arrived. And even without legacy code, reality already looks rather polyglot. Consider Spark: when using its Python bindings, one will almost surely end up with a mix of Scala and Python.
To support these polyglot architectures, more and more effort is being spent on ensuring that the underlying algorithms behave the same regardless of the programming language or framework chosen. A good example of such an algorithm is xgboost. Its core is implemented in C++, but it also supports current data processing frameworks, such as Spark and Flink, and data science languages, such as Python and R. One central feature of its design is that xgboost uses its own framework for parallelization: even if you use Spark, parallelization is handled entirely by xgboost once training starts. Another example of this design is CaffeOnSpark.
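To make the "library brings its own parallelization" design concrete, here is a minimal, hypothetical sketch (not xgboost's actual code): a library function that, no matter which host framework invokes it, splits the work across its own thread pool. The inner parallelism is entirely opaque to the caller.

```python
from concurrent.futures import ThreadPoolExecutor

def library_train(data_shards):
    """Hypothetical library routine that handles its own parallelization.

    Whatever framework calls this function (a Spark executor, a plain
    script, ...), the work is distributed across the library's own
    thread pool, mirroring how xgboost takes over once training starts.
    """
    with ThreadPoolExecutor(max_workers=len(data_shards)) as pool:
        # Each worker thread reduces one shard; the library then
        # combines the partial results itself.
        partial_sums = list(pool.map(sum, data_shards))
    return sum(partial_sums)

# A host framework would simply call library_train on its data;
# it never sees or controls the threads inside.
total = library_train([[1, 2], [3, 4], [5, 6]])
```

The point of the sketch is the boundary: the host framework hands over the data and gets back a result, with no say in how the computation is parallelized in between.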
If control could similarly flow back and forth between the machine learning library and the parallelization framework, the algorithm could use the native parallelization mechanisms supplied by the framework. Think of an algorithm implemented in terms of primitives such as allreduce and broadcast, as xgboost is. Whenever the algorithm requests an allreduce, the parallelization framework takes over, but hands back control once the reduction has been performed. This way, each part of the system could handle what it does best: the algorithm library the actual algorithm, the parallelization framework the parallelization.