Second International Workshop on Practical Aspects of High-level Parallel Programming (PAPP 2005)
[1]
A. Benoit and M. Cole.
Two Fundamental Concepts in Skeletal Programming.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
We define the concepts of nesting mode and interaction mode as they arise in the description of skeletal parallel programming systems. We suggest that these new concepts encapsulate fundamental design issues and may play a useful role in defining and distinguishing between the capabilities of competing systems. We present the decisions taken in our own Edinburgh Skeleton Library eSkel, and review the approaches chosen by a selection of other skeleton libraries.
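As an editorial illustration of why a nesting mode matters, the sketch below (plain Python, purely hypothetical and unrelated to the eSkel C API) nests a task-farm skeleton inside one stage of a pipeline skeleton; a skeleton library has to specify whether and how such compositions are allowed.

    from multiprocessing import Pool

    def square(x):
        return x * x

    def farm(worker, inputs, nworkers=4):
        # Task-farm skeleton: apply 'worker' to each input independently.
        with Pool(nworkers) as pool:
            return pool.map(worker, inputs)

    def pipeline(stages, inputs):
        # Pipeline skeleton: feed the result of each stage into the next.
        data = inputs
        for stage in stages:
            data = stage(data)
        return data

    if __name__ == "__main__":
        stage1 = lambda xs: farm(square, xs)      # a farm nested inside a pipeline stage
        stage2 = lambda xs: [x + 1 for x in xs]
        print(pipeline([stage1, stage2], list(range(8))))

The interaction mode question is analogous: whether stage1 may stream results to stage2 item by item or must hand over a complete list, a choice the sketch above fixes implicitly by passing whole lists between stages.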
[2]
S. Campa.
A Formal Framework for Orthogonal Data and Control Parallelism
Handling.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
We propose a semantic framework for parallel programming based on the orthogonalization of data-access and control concerns by means of a set of abstraction mechanisms. These mechanisms cover the description of how data has to be accessed, how data has to be computed, and how data accesses are coupled with patterns of control. Each description is captured by an abstraction mechanism with a formal semantics. The set of semantic specifications defines a method for investigating the structure of the whole application. We demonstrate how this semantics provides a formal, provable method to statically or dynamically evaluate the overall performance of the application and, eventually, to apply optimization rules.
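A minimal sketch of the orthogonalization idea, in Python and with invented names (the paper works at the level of a formal semantics, not a library): the data-access description, the computation, and the coordination pattern are written as three independent pieces and only coupled at the end.

    def block_access(data, nparts):
        # Data-access description: split the data into contiguous blocks.
        size = (len(data) + nparts - 1) // nparts
        return [data[i:i + size] for i in range(0, len(data), size)]

    def computation(x):
        # Computation description: what to do with a single element.
        return x * x

    def map_pattern(blocks, f):
        # Control description: apply f independently within each block.
        return [[f(x) for x in block] for block in blocks]

    # Coupling the three descriptions:
    print(map_pattern(block_access(list(range(10)), 3), computation))

Because each piece has a meaning of its own, any of them can be swapped (a cyclic distribution, a different computation, a reduction pattern) without touching the other two, which is the property the formal framework makes precise.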
[3]
M. Bamha and G. Hains.
An Efficient equi-semi-Join Algorithm for Distributed
Architectures.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
The semi-join is the most widely used technique for optimizing the evaluation of complex relational queries on distributed architectures. However, the overhead of semi-join computation can be very high owing to data skew and to the high cost of communication in distributed architectures. In this paper we present a parallel equi-semi-join algorithm for shared-nothing machines. The algorithm is analyzed using the BSP cost model and is proved to have asymptotically optimal complexity and perfect load balancing even for highly skewed data. This guarantees unlimited scalability in all situations for this key algorithm.
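For readers unfamiliar with the operation itself, the sketch below shows a plain hash-partitioned equi-semi-join R ⋉ S (keep the tuples of R whose join key appears in S) in Python; it deliberately omits the skew handling and BSP cost analysis that are the actual contribution of the paper.

    def partition(tuples, key, nparts):
        # Hash-partition the tuples on the join key, as each node would do.
        parts = [[] for _ in range(nparts)]
        for t in tuples:
            parts[hash(t[key]) % nparts].append(t)
        return parts

    def local_semi_join(r_part, s_part, key):
        # On each node: keep the R tuples whose key occurs among the local S keys.
        s_keys = {t[key] for t in s_part}
        return [t for t in r_part if t[key] in s_keys]

    R = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
    S = [{"id": 2, "w": "x"}, {"id": 3, "w": "y"}]

    nparts = 2
    result = []
    for r_part, s_part in zip(partition(R, "id", nparts), partition(S, "id", nparts)):
        result.extend(local_semi_join(r_part, s_part, "id"))
    print(result)   # the R tuples with id 2 and 3

In a shared-nothing setting each partition pair would live on a different node, and the danger the paper addresses is that hash partitioning alone sends every tuple carrying a popular key to the same node.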
[4]
G. Michaelson, N. Scaife, and S. Horiguchi.
Empirical Parallel Performance Prediction from Semantics-based
Profiling.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
The PMLS parallelizing compiler for Standard ML is based upon the automatic instantiation of algorithmic skeletons at sites of higher-order function (HOF) use. PMLS seeks to optimise run-time parallel behaviour by combining skeleton cost models with Structural Operational Semantics rule counts for HOF argument functions. In this paper, the formulation of a general rule-count cost model as a set of over-determined linear equations is discussed, and its solution by singular value decomposition and by a genetic algorithm is presented.
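The least-squares step can be illustrated directly. In the sketch below (Python/NumPy, with invented rule counts and timings rather than data from the paper), each row of the matrix holds the rule counts observed for one profiled run, and the over-determined system counts · costs ≈ times is solved for the per-rule cost coefficients via the singular value decomposition.

    import numpy as np

    counts = np.array([[120, 30,  5],
                       [240, 60, 10],
                       [100, 90, 20],
                       [300, 10,  2],
                       [180, 45,  8]], dtype=float)   # rule-count matrix A
    times = np.array([1.9, 3.8, 2.6, 3.4, 2.7])        # measured runtimes b

    # Solve min ||A x - b|| using the pseudo-inverse built from the SVD.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    costs = Vt.T @ np.diag(1.0 / s) @ U.T @ times
    print(costs)            # estimated cost per rule firing
    print(counts @ costs)   # runtimes predicted by the fitted model

The genetic algorithm mentioned in the abstract attacks the same fitting problem by searching the coefficient space directly rather than solving the system algebraically.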
[5]
Y. Zhang and E. Luke.
Dynamic Memory Management in the Loci Framework.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
Resource management is a critical concern in high-performance computing software. While management of processing resources to increase performance is the most critical, efficient management of memory resources plays an important role in solving large problems. This paper presents a dynamic memory management scheme for a declarative high-performance data-parallel programming system, the Loci framework. In such systems, some form of automatic resource management is a requirement. We present an automatic memory management scheme that provides a good compromise between memory utilization and speed. In addition to basic memory management, we also develop methods that take advantage of the cache memory subsystem and explore the balance between memory utilization and parallel communication costs.
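One common form of such a scheme is lifetime-based: allocate each container just before its first use in the execution schedule and free it right after its last use. The Python sketch below is a hypothetical illustration of that general idea, not a description of the scheme Loci actually implements.

    schedule = [
        ("step1", {"reads": [],         "writes": ["u"]}),
        ("step2", {"reads": ["u"],      "writes": ["v"]}),
        ("step3", {"reads": ["u", "v"], "writes": ["w"]}),
        ("step4", {"reads": ["w"],      "writes": []}),
    ]

    # Last step at which each container is referenced.
    last_use = {}
    for i, (_, deps) in enumerate(schedule):
        for name in deps["reads"] + deps["writes"]:
            last_use[name] = i

    live = set()
    for i, (step, deps) in enumerate(schedule):
        for name in deps["writes"]:
            if name not in live:           # allocate at first definition
                live.add(name)
                print(f"{step}: allocate {name}")
        print(f"{step}: run  (live = {sorted(live)})")
        for name in list(live):
            if last_use[name] == i:        # free after the last reference
                live.discard(name)
                print(f"{step}: free {name}")

Even this toy exhibits the trade-offs the abstract refers to: freeing aggressively lowers peak memory but may force reallocation and hurt cache reuse, while keeping containers live longer does the opposite.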
[6]
A. Al Zain, P. Trinder, H.-W. Loidl, and G. Michaelson.
Managing Heterogeneity in a Grid Parallel Haskell.
In V. Sunderam, D. van Albada, P. Sloot, and J. Dongarra, editors,
International Conference on Computational Science (ICCS 2005), LNCS.
Springer, 2005.
Grid-GUM2 is a distributed virtual shared-memory implementation of a high-level parallel language for computational Grids. While the implementation delivers good speedups on multiple homogeneous clusters with low-latency interconnect, on heterogeneous clusters poor load balance limits performance. Here we present new load management mechanisms that combine static and partial dynamic information to adapt to heterogeneous Grids. The mechanisms are evaluated by measuring four non-trivial programs with different parallel properties, and show runtime improvements of between 17% and 57%, with the most dynamic program giving the greatest improvement.
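To give a flavour of what combining static and dynamic information means, the toy Python sketch below (an editorial illustration, not Grid-GUM2's actual mechanism) ranks processing elements by a score built from a static relative CPU speed and a dynamically sampled load figure, and picks the best target for the next piece of work.

    nodes = {
        "fast-cluster-0": {"cpu_speed": 2.0, "load": 0.9},   # static, dynamic
        "fast-cluster-1": {"cpu_speed": 2.0, "load": 0.2},
        "slow-cluster-0": {"cpu_speed": 1.0, "load": 0.1},
    }

    def score(info):
        # Effective capacity: static speed discounted by the current load.
        return info["cpu_speed"] * (1.0 - info["load"])

    target = max(nodes, key=lambda name: score(nodes[name]))
    print("send the next work item to", target)

One plausible reading of "partial dynamic information" is that fresh load figures are costly to gather across a Grid, so a scheduler layers occasional samples on top of the static capability data rather than maintaining a complete, up-to-date picture.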