ARTICLES / PAPERS
Revue / Journal

[1]

Virginia Niculescu and Frédéric Loulergue.
Transforming powerlist based divide&conquer programs for an improved
execution model.
The Journal of Supercomputing, 2019.
[ DOI ]
Powerlists are data structures that can be successfully
used for defining parallel programs based on
Divide&Conquer paradigm. These parallel recursive
data structures and their algebraic theories offer
both a methodology to design parallel algorithms and
parallel programming abstractions to ease the
development of parallel applications. The paper
presents a technique for speeding up the parallel
recursive programs defined based on powerlists. The
improvements are achieved by applying transformation
rules that introduce tuple functions and prefix
operators, for which a more efficient execution
model is defined. Together with the execution model,
a cost model is also defined in order to allow a
proper evaluation. The treated examples emphasise
the fact that the transformation leads to important
improvements of the programs. The speeding up is
achieved by reducing the number of recursive calls,
and also by enable the fusion of splitting/combining
operations on different data structures. In
addition, enhancing the function that has to be
computed to other useful functions using a tuple,
could improved the cost reduction even more.

[2]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
MMFilter: A CHRbased solver for generation of executions under
weak memory models.
Comput Lang Syst Str, 53:121142, 2018.
[ DOI ]
With the wide expansion of multiprocessor
architectures, the analysis and reasoning for
programs under weak memory models has become an
important concern. This work presents MMFilter, an
original constraint solver for generating program
behaviors respecting a particular memory model. It
is implemented in Prolog using CHR (Constraint
Handling Rules). The CHR formalism provides a
convenient generic solution for specifying memory
models. It benefits from the existing optimized
implementations of CHR and can be easily extended to
new models. We present MMFilter design, illustrate
the encoding of memory model constraints in CHR and
discuss the benefits and limitations of the proposed
technique.

[3]

Frédéric Loulergue, Wadoud Bousdira, and Julien Tesson.
Calculating Parallel Programs in Coq using List Homomorphisms.
Int J Parallel Prog, 45:300319, 2017.
[ DOI ]
SyDPaCC is a set of libraries for the Coq proof
assistant. It allows to write naive functional
programs (i.e. with high complexity) that are
considered as specifications, and to transform them
into more efficient versions. These more efficient
versions can then be automatically parallelised
before being extracted from Coq into source code for
the functional language OCaml together with calls to
the Bulk Synchronous Parallel ML (BSML) library. In
this paper we present a new core version of SyDPaCC
for the development of parallel programs
correctbycontruction using the theory of list
homomorphisms and algorithmic skeletons implemented
and verified in Coq. The framework is illustrated on
the the maximum prefix sum problem.

[4]

Frédéric Loulergue.
A BSPlibstyle API for Bulk Synchronous Parallel ML.
Scalable Computing: Practice and Experience, 18:261274, 2017.
[ DOI ]
Bulk synchronous parallelism (BSP) offers an
abstract and simple model of parallelism yet allows
to take realistically into account the communication
costs of parallel algorithms. BSP has been used in
many application domains. BSPlib and its variants
are programming libraries for the C language that
support the BSP style. Bulk Synchronous Parallel ML
(BSML) is a library for BSP programming with the
functional language OCaml. It offers parallel
operations on a data structure named parallel
vector. BSML provides a global view of programs,
i.e. BSML programs can be seen as sequential
programs working on a parallel data structure (seq
of par) while a BSPlib program is written in the
SPMD style and understood as a parallel composition
of communicating sequential programs (par of
seq). The communication styles of BSML and BSPlib
are also quite different. The contribution of this
paper is a BSPlibstyle communication API
implemented on top of BSML. It has been designed
without extending BSML, but only using the
imperative features of the underlying function
language OCaml. Programs implemented using this API
are very close to programs implemented using a
BSPlib library for the C language. It therefore
shows that BSML is universal for the BSP model.

[5]

Thibaut Tachon, Chong Li, Gaétan Hains, and Frédéric
Loulergue.
Automated generation of BSP automata.
Parallel Processing Letters, 17(1), 2017.
[ DOI ]
BulkSynchronous Parallel (BSP) is a bridging model
between abstract execution and concrete parallel
architectures that retains enough information to
model performance scalability. In order to model BSP
program executions, Hains adapted the theory of
finite automata to the BSP paradigm by introducing
BSP automata. In the initial description of the
theory, BSP automata had to be explicitly defined as
structured sets of states and transitions. The lack
of a clean and efficient algorithm for generating
them from regular expressions would have prevented
this theory from being used in practice. To
alleviate this problem we introduce in this paper an
algorithm that generates a BSP automaton recognizing
a BSP language defined by a BSP regular
expression. The main practical use of BSP automata
developed in this paper is the parallelization of
finite state automata with an explicit distribution
and a performance model, that enable parallel
matching of regular expressions. Secondarily, BSP
regular expressions provide a convenient structure
for automatic code generation of imperative BSP
program that is also developed in this paper.

[6]

Frédéric Dabrowski, Frédéric Loulergue, and Thomas Pinsard.
A formal semantics of nested atomic sections with thread escape.
Comput Lang Syst Str, 42:221, 2015.
[ DOI ]
The multicore trend is widening the gap between
programming languages and hardware. Taking
parallelism into account in the programs is
necessary to improve performance. Unfortunately,
current mainstream programming languages fail to
provide suitable abstractions to do so. The most
common pattern relies on the use of mutexes to
ensure mutual exclusion between concurrent accesses
to a shared memory. However, this model is
errorprone and scales poorly by lack of modularity.
Recent research proposes atomic sections as an
alternative. The user simply delimits portions of
code that should be free from interference. The
responsibility for ensuring interference freedom is
left either to the compiler or to the runtime
system. In order to provide enough modularity, it
is necessary that both atomic sections could be
nested and threads could be forked inside an atomic
section. In this paper we focus on the semantics of
programming languages providing these features.
More precisely, without being tied to a specific
programming language, we consider program traces
satisfying some basic wellformedness conditions.
Our main contribution is the precise definition of
atomicity, wellsynchronisation and the proof that
the latter implies the strong form of the former. A
formalisation of our results in the Coq proof
assistant is described.

[7]

Louis Gesbert, Frédéric Gava, Frédéric Loulergue, and
Frédéric Dabrowski.
Bulk Synchronous Parallel ML with Exceptions.
Future Gener Comp Sy, 26(3):486490, 2010.
[ DOI ]
Bulk Synchronous Parallel ML is a highlevel
language for programming parallel algorithms. Built
upon OCaml and using the BSP model, it provides a
safe setting for their implementation, avoiding
concurrency related problems (deadlocks,
indeterminism). Only a limited set of the
features of OCaml can be used in BSML to respect its
properties of safety: this paper describes a way to
add exception handling to this set by extending and
adapting OCaml's exceptions. The behaviour of these
new exceptions and the syntactic constructs to
handle them, together with their implementation, are
described in detail, and results over an example are
given.

[8]

Frédéric Gava and Frédéric Loulergue.
A Static Analysis for Bulk Synchronous Parallel ML to Avoid Parallel
Nesting.
Future Gener Comp Sy, 21(5):665671, 2005.
[ DOI ]
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a data structure named parallel
vector, which is given by intention. In order to
have an execution that follows the BSP model, and to
have a simple cost model, nesting of parallel
vectors is not allowed. The novelty of this paper
is a type system which prevents such nesting. This
system is correct w.r.t. the dynamic semantics.

[9]

Frédéric Gava and Frédéric Loulergue.
A Functional Language for Departmental Metacomputing.
Parallel Processing Letters, 15(3):289304, 2005.
[ DOI ]
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel algorithms, a model of
computing. Deadlocks and indeterminism are avoided
and the execution time can be then estimated. For
large scale applications where parallel processing
is helpful, more than one parallel machine is
needed. One speaks about metacomputing. A major
problem in programming parallel application for such
platforms is their hierarchical network structures:
latency and bandwidth of global ones are orders of
magnitude worse than those of local networks. Here
we consider how an extention of the wellsuited BSP
model might be for these kinds of computing and how
extended our functional language can be for this.

[10]

Frédéric Loulergue, Frédéric Gava, Myrto Arapinis, and
Frédéric Dabrowski.
Semantics and Implementation of Minimally Synchronous Parallel ML.
International Journal of Computer and Information Science,
5(3):182199, 2004.
This paper presents a new functional parallel
language: Minimally Synchronous Parallel ML
(MSPML). The execution time can be estimated,
deadlocks and indeterminism are avoided. Programs
are written as usual ML programs but using a small
set of additional functions. Provided functions are
used to access the parameters of the parallel
machine and to create and operate on a parallel data
structure. It follows the execution and cost model
of the Message Passing Machine model (MPM). It
shares with Bulk Synchronous Parallel ML its syntax
and highlevel semantics but it has a minimally
synchronous distributed semantics. Experiments have
been run on a cluster of PC using an implementation
of the Diffusion algorithmic skeleton.

[11]

Gaétan Hains, Frédéric Loulergue, and John Mullins.
Concrete data structures and functional parallel programming.
Theor Comput Sci, 258(12):233267, 2001.
[ DOI ]
A framework is presented for designing parallel
programming languages whose semantics is functional
and where communications are explicit. To this end,
Brookes and Geva's generalized concrete data
structures are specialized with a notion of explicit
data layout to yield a CCC of distributed structures
called arrays. Arrays' symmetric replicated
structures, suggested by the dataparallel SPMD
paradigm, are found to be incompatible with sum
types. We then outline a functional language with
explicitlydistributed (monomorphic) concrete types,
including higherorder, sum and recursive ones. In
this language, programs can be as large as the
network and can observe communication events in
other programs. Such flexibility is missing from
current dataparallel languages and amounts to a
fusion with their socalled annotations, directives
or metalanguages.

[12]

Frédéric Loulergue.
Distributed Evaluation of Functional BSP Programs.
Parallel Processing Letters, 11(4):423437, 2001.
[ DOI ]
The BSlambda_pcalculus is a calculus of functional
bulk synchronous parallel (BSP) programs. It is the
basis for the design of a bulk synchronous parallel
ML language. For dataparallel languages, there are
two points of view: the programming model where a
program is seen as a sequence of operations on
parallel vectors, and the execution model where the
program is a parallel composition of programs run on
each processor of the parallel machine. BSP
algorithms are defined by dataparallel algorithms
with explicit (physical) processes in order to allow
their parallel execution time to be estimated. We
present here a distributed evaluation minimally
synchronous for BSP execution (which corresponds to
the execution model). This distributed evaluation is
correct w.r.t. the callbyvalue strategy of the
BSlmabda_pcalculus (which corresponds to the
programming model).

[13]

Frédéric Loulergue, Gaétan Hains, and Christian Foisy.
A Calculus of Functional BSP Programs.
Sci Comput Program, 37(13):253277, 2000.
[ DOI 
ePub ]
An extension of the lambdacalculus called BSlambda
is introduced as a formal basis for functional
languages expressing bulk synchronous parallel
algorithms. A confluence result is shown. The
application of the calculus is illustrated by
examples of program proofs and the associated notion
of parallel reduction. The reduction process is
interpreted in the BSP cost model.
Chapitre de livre / Book Chapter

[1]

Frédéric Loulergue, Wadoud Bousdira, and Julien Tesson.
Calcul de programmes parallèles avec Coq.
In Nicolas Ollinger, editor, Informatique Mathématique une
photographie en 2015, collection Alpha, pages 87134. CNRS Éditions,
2015.
Numéro spécial de revue / Special Issue

[1]

Frédéric Loulergue.
Special Issue on Practical Aspects of HighLevel Parallel
Programming.
Scalable Computing: Practice and Experience, 18(1), 2017.
[ http ]
Selected extended and revised papers from the
International Workshop and the ACM SAC track on
Practical Aspects of HighLevel Parallel
Programming

[2]

Frédéric Loulergue and A. Tiskin.
Preface: Special Issue on HighLevel Parallel Programming and
Applications.
Parallel Processing Letters, 18(1), 2008.
[ DOI ]

[3]

Anne Benoit and Frédéric Loulergue.
Special Issue on Practical Aspects of HighLevel Parallel
Programming.
Scalable Computing: Practice and Experience, 8(4), 2007.
Selected extended and revised papers from the Third International
Workshop on Practical Aspects of HighLevel Parallel Programming (PAPP
2006).
[ http ]

[4]

Frédéric Loulergue.
Introduction to the special issue on semantics and costs models for
highlevel parallel programming.
Computer Languages, Systems and Structures, 33(34):7981,
2007.
[ DOI ]

[5]

Frédéric Loulergue.
Special Issue on Practical Aspects of HighLevel Parallel
Programming.
Scalable Computing: Practice and Experience, 7(3), 2006.
Selected extended and revised papers from the Second International
Workshop on Practical Aspects of HighLevel Parallel Programming (PAPP
2005).
[ http ]

[6]

Frédéric Loulergue.
Special Issue on Practical Aspects of HighLevel Parallel
Programming.
Scalable Computing: Practice and Experience, 6(4), 2005.
Selected extended and revised papers from the First International
Workshop on Practical Aspects of HighLevel Parallel Programming (PAPP
2004).
[ http ]

[7]

Gaétan Hains and Frédéric Loulergue.
Preface: Special Issue on HighLevel Parallel Programming and
Applications.
Parallel Processing Letters, 13(3), 2003.
[ DOI ]
Colloque international avec publication des actes / International Conference with Published Proceedings

[1]

Allan Blanchard, Frédéric Loulergue, and Nikolai Kosmatov.
Towards Full Proof Automation in FramaC using AutoActive
Verification.
In Nasa Formal Methods, LNCS. Springer, 2019.
to appear.
While deductive verification is increasingly used on
reallife code, making it fully automatic remains
difficult. The development of powerful SMT solvers
has improved the situation, but some proofs still
require interactive theorem provers in order to
achieve full formal verification. Autoactive
verification relies on additional guiding
annotations (assertions, ghost code, lemma
functions, etc.) and provides an important step
towards a greater automation of the proof. However,
the support of this methodology often remains
partial and depends on the verification tool. This
paper presents an experience report on a complete
functional verification of several C programs from
the literature and reallife code using autoactive
verification with the C software analysis platform
FramaC and its deductive verification plugin
. The goal is to use automatic solvers to verify
properties that are classically verified with
interactive provers. Based on our experience, we
discuss the benefits of this methodology and the
current limitations of the tool, as well as
proposals of new features to overcome them.

[2]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Logic against ghosts: Comparison of two proof approaches for a list
module.
In ACM Symposium on Applied Computing (SAC), pages 21862195.
ACM, 2019.
Best Paper Award.
[ DOI ]
Modern verification projects continue to offer new
challenges for formal verification. One of them is
the linked list module of Contiki, a popular
opensource operating system for the Internet of
Things. It has a rich API and uses a particular list
representation that make it different from the
classical linked list implementations. Being widely
used in the OS, the list module is critical for
reliability and security. A recent work verified the
list module using ghost arrays. This article
reports on a new verification effort for this
module. Realized in the FramaC/Wp tool, the new
approach relies on logic lists. A logic list
provides a convenient highlevel view of the linked
list. The specifications of all functions are now
proved faster and almost all automatically, only a
small number of auxiliary lemmas and a couple of
assertions being proved interactively in Coq. The
proposed specifications are validated by proving a
few client functions manipulating lists. During the
verification, a more efficient implementation for
one function was found and verified. We compare the
new approach with the previous effort based on ghost
arrays, and discuss the benefits and drawbacks of
both techniques.

[3]

Jolan Philippe and Frédéric Loulergue.
Parallel programming with Coq: Map and reduce skeletons on trees.
In ACM Symposium on Applied Computing (SAC), pages 15781581.
ACM, 2019.
[ DOI ]
SyDPaCC is a set of libraries for the Coq interactive
theorem prover. It allows to develop correct
functional parallel programs on distributed lists
based on the transformation of naive sequential
programs that are considered as specifications. To
offer the parallelization of functions on other data
structures, the first step is to implement a
parallel version of the considered data structure
and to provide parallel implementations of primitive
functions manipulating it. This paper presents such
a first step: a binary tree extension which includes
new map and reduce pure functional algorithmic
skeletons for binary trees. Such algorithmic
skeletons are templates of parallel algorithms,
realized in a functional context as higherorder
functions implemented in parallel. The use of these
new primitives is illustrated on example
applications.

[4]

Jolan Philippe and Frédéric Loulergue.
PySke: Algorithmic skeletons for Python.
In International Conference on High Performance Computing and
Simulation (HPCS). IEEE, 2019.
to appear.
PySke is a library of parallel algorithmic skeletons in
Python designed for list and tree data structures.
Such algorithmic skeletons are highorder functions
implemented in parallel. An application developed
with PySke is a composition of skeletons. It does
not follow the Single Program Multiple Data (SPMD)
paradigm but offers a global view of parallel
programs. All these features make parallel
programming with PySke easier. PySke offers a good
balance between execution efficiency and programming
productivity. In addition to the library, we
present experiments performed on a highperformance
multicore workstation (shared memory) and a
highperformance computing cluster (distributed
memory) on a set of example applications developed
with PySke.

[5]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Ghosts for Lists: A Critical Module of Contiki Verified in FramaC.
In Nasa Formal Methods, number 10811 in LNCS, pages 3753.
Springer, 2018.
[ DOI ]
Internet of Things (IoT) applications are becoming
increasingly critical and require rigorous formal
verification. In this paper we target Contiki, a
widely used opensource OS for IoT, and present a
verification case study of one of its most critical
modules: that of linked lists. Its API and list
representation differ from the classical linked list
implementations, and are particularly challenging
for deductive verification. The proposed
verification technique relies on a parallel view of
a list through a companion ghost array. This
approach makes it possible to perform most proofs
automatically using the FramaC/WP tool, only a
small number of auxiliary lemmas being proved
interactively in the Coq proof assistant. We
present an elegant segmentbased reasoning over the
companion array developed for the proof. Finally,
we validate the proposed specification by proving a
few functions manipulating lists.

[6]

Frédéric Loulergue, Allan Blanchard, and Nikolai Kosmatov.
Ghosts for lists: from axiomatic to executable specifications.
In Tests and Proofs (TAP), volume 10889 of LNCS, pages
177184. Springer, 2018.
[ DOI ]
Internet of Things (IoT) applications are becoming
increasingly critical and require formal
verification. Our recent work presented formal
verification of the linked list module of Contiki,
an OS for IoT. It relies on a parallel view of a
linked list via a companion ghost array and uses an
inductive predicate to link both views. A few
interactively proved lemmas allow for an automatic
proof of other specifications, expressed in the ACSL
specification language and proved with the
FramaC/WP tool. In a broader verification context,
especially as long as the whole system is not yet
formally verified, it would be very useful to use
runtime verification, in particular, to test client
modules that use the list module. It is not
possible with the current specifications, which
include an inductive predicate and axiomatically
defined functions. In this earlyidea paper we show
how to define a provably equivalent noninductive
predicate and a provably equivalent nonaxiomatic
function that belong to the executable subset
of ACSL and can be transformed into executable C
code. Finally, we propose an extension of FramaC
to handle both axiomatic specifications for
deductive verification and executable specifications
for runtime verification.

[7]

Frédéric Loulergue and Christopher D. Whitney.
Verified programs for frequent itemset mining.
In 15th IEEE International Conference on Advanced and Trusted
Computing (ATC 2018), pages 15161523, Guangzhou, China, 2018. IEEE.
[ DOI ]
Frequent itemset mining is one pillar of machine
learning and is very important for many data mining
applications. There are many different algorithms
for frequent itemset mining, but to our knowledge no
implementation has been proven correct using
computer aided verification. Hu et al. derived on
paper an efficient algorithm for this problem,
starting from an inefficient functional program and
by using program calculation derived an efficient
version. Based on their work, we propose a formally
verified functional implementation for frequent
itemset mining developed with the Coq proof
assistant. All the proposed programs are evaluated
on classical datasets and are compared to a non
verified Java implementation of the Apriori
algorithm.

[8]

Salwa Souaf, Pascal Berthomé, and Frédéric Loulergue.
A Cloud Brokerage Solution: Formal Methods Meet Security in Cloud
Federations.
In International Conference on High Performance Computing
Simulation (HPCS), pages 691699. IEEE, 2018.
[ DOI ]
Cloud Computing has known in the past few years a
fast development, leading to a spike in the number
of companies competing on providing the best Cloud
services. This makes it harder for potential Cloud
customers to chose the adequate provider. Despite
its wide adoption, many are still hesitant due to
the security issues Cloud Computing poses. In this
paper, we propose a brokerage solution that
formalizes security properties under the form of
interVM relations, and gives the possibility of
setting these security requirements to its customers
from the first steps. This solution uses formal
methods and the finite model finder KodKod to verify
the consistency of the customer's requirements, and
to find a placement for his deployment model.

[9]

Julien Tesson and Frédéric Loulergue.
Interactive bulk synchronous parallel functional programming in a
browser.
In 15th IEEE International Conference on Advanced and Trusted
Computing (ATC 2018), pages 15931598, Guangzhou, China, 2018. IEEE.
[ DOI ]
This paper presents the design and implementation of
a sequential simulator for the parallel functional
language BSML based on the parallel computation
model Bulk Synchronous Parallelism (BSP). This
simulator is implemented in sequential, runs in any
browser, and provides a graphical representation of
the parallel executions. Due to the pure functional
nature of BSML, the results obtained by this
simulator are the same than the results that could
be obtained by executing the program in parallel

[10]

Arvid Jakobsson, Frédéric Dabrowski, Wadoud Bousdira,
Frédéric Loulergue, and Gaétan Hains.
Replicated Synchronization for Imperative BSP Programs.
In International Conference on Computational Science (ICCS),
Procedia Computer Science, pages 535544, Zurich, Switzerland, 2017.
Elsevier.
[ DOI ]
The BSP model (Bulk Synchronous Parallel) simplifies
the construction and evaluation of parallel
algorithms, with its simplified synchronization
structure and cost model. Nevertheless, imperative
BSP programs can suffer from synchronization errors.
Programs with textually aligned barriers are free
from such errors, and this structure eases program
comprehension. We propose a simplified
formalization of barrier inference as data flow
analysis, which verifies statically whether an
imperative BSP program has replicated
synchronization, which is a sufficient condition for
textual barrier alignment.

[11]

Frédéric Loulergue.
A verified accumulate algorithmic skeleton.
In Fifth International Symposium on Computing and Networking
(CANDAR), pages 420426, Aomori, Japan, November 1922 2017. IEEE.
[ DOI ]
This paper presents an extension of a library for
the Coq interactive theorem prover that enables the
development of correct functional parallel programs
based on sequential program transformation and
automatic parallelization using an algorithmic
skeleton named accumulate. Such an algorithmic
skeleton is a pattern of a parallel algorithm that
is provided as a highorder function implemented in
parallel. The use of this framework is illustrated
with the bracket matching problem, including
experiments on a parallel machine.

[12]

Frédéric Loulergue.
Imperative BSPlibstyle Communications in Bulk Synchronous Parallel
ML.
In International Conference on Computational Science (ICCS),
Procedia Computer Science, pages 23682372, Zurich, Switzerland, 2017.
Elsevier.
[ DOI ]
Bulk synchronous parallelism (BSP) oﬀers an abstract
and simple model of parallelism yet allows to take
realistically into account the communication costs
of parallel algorithms. BSP has been used in many
application domains. BSPlib and its variants are
programming libraries for the C language that
support the BSP style. Bulk Synchronous Parallel ML
(BSML) is a library for BSP programming with the
functional language OCaml. It is based on an
extension of the lambdacalculus by parallel
operations on a data structure named parallel
vector. BSML oﬀers a global view of programs,
i.e. BSML programs can be seen as sequential
programs working on a parallel data structure (seq
of par) while a BSPlib program is written in the
SPMD style and understood as a parallel composition
of communicating sequential programs (par of
seq). The communication styles of BSML and BSPlib
are also quite diﬀerent.

[13]

Frédéric Loulergue.
Implementing Algorithmic Skeletons with Bulk Synchronous Parallel
ML.
In Parallel and Distributed Computing, Applications and
Technologies (PDCAT), pages 461468. IEEE, 2017.
[ DOI ]
Skeletal parallelism offers a good tradeoff between
programming productivity and execution
efficiency. In this style of parallelism, an
application is a composition of algorithmic
skeletons. An algorithmic skeleton captures a
pattern of parallel algorithm on a distributed data
structure, and is also often associated with a
sequential algorithm on a sequential data structure
that is the counterpart of the parallel data
structure. The algorithmic skeleton approach has
been inspired by functional programming. It is
therefore very natural to embed algorithmic
skeletons in a functional programming language. In
this paper we present a new algorithmic skeleton
library for the statically typed functional language
OCaml, and illustrate its use on some
applications. This functional skeletal parallel
programming library is implemented using the Bulk
Synchronous Parallel ML parallel programming library
for OCaml.

[14]

Virginia Niculescu, Frédéric Loulergue, Darius Bufnea, and Adrian
Sterca.
A Java Framework for High Level Parallel Programming using
Powerlists.
In Parallel and Distributed Computing, Applications and
Technologies (PDCAT), pages 255262. IEEE, 2017.
[ DOI ]
Parallel programs based on the Divide&Conquer
paradigm could be successfully defined in a simple
way using powerlists. These parallel recursive data
structures and their algebraic theories offer both a
methodology to design parallel algorithms and
parallel programming abstractions to ease the
development of parallel applications. The paper
presents how programs based on powerlists can be
implemented in Java using the framework
we developed. The design of this framework is based
on powerlists theory, but in the same time follows
the objectoriented design principles that provide
flexibility and maintainability. Examples are given
and performance experiments are conducted. The
results emphasise the utility and the efficiency of
the framework.

[15]

Allan Blanchard, Nikolai Kosmatov, Matthieu Lemerre, and Frédéric
Loulergue.
conc2seq: A FramaC plugin for verification of parallel
compositions of C programs.
In 16th IEEE International Working Conference on Source Code
Analysis and Manipulation (SCAM), pages 6772, Raleigh, NC, USA, 2016.
IEEE.
[ DOI ]
FramaC is an extensible modular framework for
analysis of C programs that offers different
analyzers in the form of collaborating plugins.
Currently, FramaC does not support the proof of
functional properties of concurrent code. We
present conc2seq, a new code transformation based
tool realized as a FramaC plugin and dedicated
to the verification of concurrent C programs.
Assuming the program under verification respects an
interleaving semantics, conc2seq transforms the
original concurrent C program into a sequential one
in which concurrency is simulated by interleavings.
User specifications are automatically reintegrated
into the new code without manual intervention. The
goal of the proposed code transformation technique
is to allow the user to reason about a concurrent
program through the interleaving semantics using
existing FramaC analyzers.

[16]

Frédéric Dabrowski, Frédéric Loulergue, and Thomas Pinsard.
Nested atomic sections with thread escape: Compilation to threads and
locks.
In ACM Symposium on Applied Computing (SAC), pages 20992106,
Salamanca, Spain, 2015. ACM.
[ DOI ]
For the sake of modularity, programming languages
with atomic sections should offer nesting and inner
parallelism in such sections. The possible escape
of a thread created inside a section makes these
languages also more expressive. In this paper we
address the compilation of such a language towards a
language with threads and locks. The design
decisions of this compilation pass and of the target
language were made with respect to the ultimate goal
of a mechanised proof of semantic preservation.

[17]

Asma Guesmi, Patrice Clemente, Frédéric Loulergue, and Pascal
Berthomé.
Cloud resources placement based on functional and nonfunctional
requirements.
In International Conference on Security and Cryptography
(SECRYPT), pages 335324. ScitePress, 2015.
[ DOI ]
It is difficult for customers to select the
adequate cloud providers which fit their needs, as
the number of cloud offerings increases
rapidly. Many works thus focus on the design of
cloud brokers. Unfortunately, most of them do not
consider precise security requirements of customers.
In this paper, we propose a methodology defined to
place services in a multiprovider cloud
environment, based on functional and nonfunctional
requirements, including security requirements. To
eliminate inner conflicts within customers
requirements, and to match the cloud providers
offers with these customers requirements, we use a
formal analysis tool: Alloy. The broker uses a
matching algorithm to place the required services in
the adequate cloud providers, in a way that fulfills
all customer requirements. We finally present a
prototype implementation of the proposed broker.

[18]

Mohamad Al Hajj Hassan, Mostafa Bamha, and Frédéric Loulergue.
Handling Dataskew Effects in Join Operations using MapReduce.
In International Conference on Computational Science (ICCS),
pages 145158, Cairns, Australia, 2014. Elsevier.
[ DOI ]
For over a decade, MapReduce has become a prominent
programming model to handle vast amounts of raw data
in large scale systems. This model ensures
scalability, reliability and availability aspects
with reasonable query processing time. However
these large scale systems still face some
challenges: data skew, task imbalance, high disk I/O
and redistribution costs can have disastrous effects
on performance. In this paper, we introduce
MRFAJoin algorithm: a new frequency adaptive
algorithm based on MapReduce programming model and a
randomised key redistribution approach for join
processing of largescale datasets. A cost analysis
of this algorithm shows that our approach is
insensitive to data skew and ensures perfect
balancing properties during all stages of join
computation. These performances have been confirmed
by a series of experimentations.

[19]

Frédéric Dabrowski, Frédéric Loulergue, and Thomas Pinsard.
Nested Atomic Sections with Thread Escape: A Formal Definition.
In ACM Symposium on Applied Computing (SAC), pages 15851592,
Gyeongju, Korea, 2014. ACM.
[ DOI 
The paper ]
We consider a simple imperative language with
fork/join parallelism and lexically scoped nested
atomic sections from which threads can escape. In
this context, our contribution is the precise
definition of atomicity, wellsynchronisation and
the proof that the latter implies the strong form of
the former. A formalisation of our results in the
Coq proof assistant is also available.

[20]

Kento Emoto, Frédéric Loulergue, and Julien Tesson.
A Verified GenerateTestAggregate Coq Library for Parallel Programs
Extraction.
In Interactive Theorem Proving (ITP), number 8558 in LNCS,
pages 258274, Wien, Austria, 2014. Springer.
[ DOI ]
The integration of the generateandtest paradigm
and semirings for the aggregation of results
provides a parallel programming framework for large
scale dataintensive applications. The socalled GTA
framework allows a user to define an inefficient
specification of his/her problem as a composition of
a generator of all the candidate solutions, a tester
of valid solutions, and an aggregator to combine the
solutions. Through two calculation theorems a GTA
specification is transformed into a
divideandconquer efficient program that can be
implemented in parallel. In this paper we present a
verified implementation of this framework in the Coq
proof assistant: efficient bulk synchronous parallel
functional programs can be extracted from naive GTA
specifications. We show how to apply this framework
on an example, including performance experiments on
parallel machines.

[21]

Joeffrey Légaux, Sylvain Jubertie, and Frédéric Loulergue.
Development Effort and Performance Tradeoff in HighLevel Parallel
Programming.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 162169, Bologna, Italy, 2014. IEEE.
[ DOI ]
Research on highlevel parallel programming
approaches systematically evaluate the performance
of applications written using these approaches and
informally argue that highlevel parallel
programming languages or libraries increase the
productivity of programmers. In this paper we
present a methodology that allows to evaluate the
tradeoff between programming effort and performance
of applications developed using different
programming models. We apply this methodology on
some implementations of a function solving the all
nearest smaller values problem. The highlevel
implementation is based on a new version of the BSP
homomorphism algorithmic skeleton.

[22]

Frédéric Loulergue, Virginia Niculescu, and Julien Tesson.
Implementing powerlists with Bulk Synchronous Parallel ML.
In Symposium on Symbolic and Numeric Algorithms for Scientific
Computing (SYNASC), pages 325332, Timisoara, Romania, 2014. IEEE.
[ DOI ]
The latest developments of the computation systems
impose using tools and methodologies able to
simplify the development process of parallel
software, but also to assure a high level of
performance and robustness. Powerlists and their
variants are data structures that can be
successfully used in a simple, provably correct,
functional description of parallel programs, which
are divide and conquer in nature. They represent one
of the highlevel algebraic theories which are
appropriate to be used as fundamentals for a model
of parallel computation that assures correctness
proving. The paper presents how programs defined
based on powerlists could be transformed to real
code in the functional language OCaml plus calls to
the parallel functional programming library Bulk
Synchronous Parallel ML. BSML functions follow the
BSP model requirements, and so its advantages are
introduced in OCaml parallel code. The
transformations are based on a framework that
assures: simple, correct, efficient
implementation. Examples are given and concrete
experiments for their executions are conducted. The
results emphasise the utility and the efficiency of
the framework.

[23]

Frédéric Loulergue, Simon Robillard, Julien Tesson, Joeffrey
Légaux, and Zhenjiang Hu.
Formal Derivation and Extraction of a Parallel Program for the All
Nearest Smaller Values Problem.
In ACM Symposium on Applied Computing (SAC), pages 15771584,
Gyeongju, Korea, 2014. ACM.
[ DOI 
The paper ]
The All Nearest Smaller Values (ANSV) problem is an
important problem for parallel programming as it can
be used to solve several problems and is one of the
phases of several other parallel algorithms. We
formally develop by construction a functional
parallel program for solving the ANSV problem using
the theory of Bulk Synchronous Parallel (BSP)
homomorphisms within the Coq proof assistant. The
performances of the Bulk Synchronous Parallel ML
program obtained from Coq is compared to a version
derived without software support (penandpaper) and
implemented using the Orléans Skeleton Library
of algorithmic skeletons, and to a (unproved
correct) direct implementation of the BSP algorithm
of He and Huang.

[24]

Frédéric Dabrowski, Frédéric Loulergue, and Thomas Pinsard.
Nested Atomic Sections with Thread Escape: An Operational
Semantics.
In Parallel and Distributed Computing, Applications and
Technologies (PDCAT), pages 2935, Taiwan, 2013. IEEE.
[ DOI ]
We consider a simple imperative language with
fork/join parallelism and lexically scoped nested
atomic sections from which threads can escape. In
this context, our contribution is a formal
operational semantics of this language that satifies
a specification on execution traces designed in a
companion paper.

[25]

Joeffrey Légaux, Zhenjiang Hu, Frédéric Loulergue, Kiminori
Matsuzaki, and Julien Tesson.
Programming with BSP Homomorphisms.
In EuroPar Parallel Processing, number 8097 in LNCS, pages
446457, Aachen, Germany, 2013. Springer.
[ DOI ]
Algorithmic skeletons in conjunction with list
homomorphisms play an important role in formal
development of parallel algorithms. We have designed
a notion of homomorphism dedicated to bulk
synchronous parallelism. In this paper we derive two
application using this theory: sparse matrix vector
multiplication and the all nearest smaller values
problem. We implement a support for BSP homomorphism
in the Orléans Skeleton Library and experiment
it with these two applications.

[26]

Joeffrey Légaux, Frédéric Loulergue, and Sylvain Jubertie.
Managing Arbitrary Distributions of Arrays in Orléans Skeleton
Library.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 437444, Helsinki, Finland, 2013. IEEE.
[ DOI ]
Structured parallel models such as algorithmic
skeletons offer a global view of the parallel
program in contrast with the fragmented view of the
SPMD style. This makes program easier to write and
to read for users, and offer additional
opportunities for optimisation done by the
libraries, compilers and/or runtime
systems. Algorithmic skeletons are or can be seen as
patterns or higherorder functions implemented in
parallel, often manipulating distributed data
structures. Orléans Skeleton Library (OSL) is a
library of parallel algorithmic skeletons, written
in C++ on top of MPI, which uses metaprogramming
techniques for optimisation. Often such libraries
have no or limited support for arbitrary
distributions of the data structures. In this paper
we detail the new OSL skeletons used to manage
arbitrary distributions of distributed arrays. We
present a parallel regular sampling sort application
as an example of application that requires such
skeletons.

[27]

Joeffrey Légaux, Frédéric Loulergue, and Sylvain Jubertie.
OSL: an algorithmic skeleton library with exceptions.
In International Conference on Computational Science (ICCS),
pages 260269, Barcelona, Spain, 2013. Elsevier.
[ DOI ]
Exception handling is a traditional and natural
mechanism to manage errors and events that disrupt
the normal flow of program instructions. In most
concurrent or parallel systems, exception handling
is done locally or sequentially, and cannot
guarantee the global coherence of the system after
an exception is caught. Working with a
structured parallel model is an advantage in this
respect. Algorithmic skeletons, that are patterns of
parallel algorithms on distributed data structures,
offer such a structured model. However very few
algorithmic skeleton libraries provide a specific
parallel exception mechanism, and no C++based
library. In this paper we propose the design of an
exception mechanism for the C++ Orléans Skeleton
Library that ensures the global coherence of the
system after exceptions are caught. We explain our
design choices, experiment on the performance
penalty of its use, and we illustrate how to
purposefully use this mechanism to extract the
results in the course of some algorithms.

[28]

Frédéric Loulergue, Virginia Niculescu, and Simon Robillard.
Powerlists in Coq: Programming and Reasoning.
In First International Symposium on Computing and Networking
(CANDAR), pages 5765, Matsuyama, Japan, 2013. IEEE Computer Society.
[ DOI ]
For parallel programs, correctness by construction
is an essential feature since debugging is almost
impossible. Building correct programs by
construction is not a simple task, and usually the
methodologies used for this purpose are rather
theoretical and based on a penandpaper style. A
better approach could be based on tools and theories
that allow a user to develop an efficient parallel
application by easily implementing simple programs
satisfying conditions, ideally automatically
proved. Powerlists theory and its variants represent
a good theoretical base for such an approach, and
the Coq proof assistant is a tool that could be used
for automatic proofs. The goal of this paper is to
model the powerlist theory in Coq, and to use this
modelling to program and reason about parallel
programs in Coq. This represents the first step in
building a framework to ease the development of
correct and verifiable parallel programs.

[29]

Wadoud Bousdira, Frédéric Loulergue, and Julien Tesson.
A Verified Library of Algorithmic Skeletons on Evenly Distributed
Arrays.
In Algorithms and Architectures for Parallel Processing
(ICA3PP), number 7439 in LNCS, pages 218232, Fukuoka, Japan, 2012.
Springer.
[ DOI ]
To make parallel programming as widespread as
parallel architectures, more structured parallel
programming paradigms are necessary. One of the
possible approaches are Algorithmic skeletons that
are abstract parallel patterns. They can be seen as
higher order functions implemented in
parallel. Algorithmic skeletons offer a simple
interface to the programmer without all the details
of parallel implementations as they abstract the
communications and the synchronisations of parallel
activities. To write a parallel program, users have
to combine and compose the skeletons. Orléans
Skeleton Library (OSL) is an efficient
metaprogrammed C++ library of algorithmic skeletons
that manipulate distributed arrays. A prototype
implementation of OSL exists as a library written
with the function parallel language Bulk Synchronous
Parallel ML. In this paper we are interested in
verifying the correctness of a subset of this
prototype implementation. To do so, we give a
functional specification (i.e. without the parallel
details) of a subset of OSL and we prove the
correctness of the BSML implementation with respect
to this functional specification, using the Coq
proof assistant. To illustrate how the user could
use these skeletons, we prove the correctness of two
applications implemented with them: a heat diffusion
simulation and the maximum segment sum problem.

[30]

Noman Javed and Frédéric Loulergue.
Verification of a Heat Diffusion Simulation written with Orléans
Skeleton Library.
In 9th International Conference on Parallel Processing and
Applied Mathematics (PPAM 2011), number 7204 in LNCS, pages 91100, Torun,
Poland, 2012. Springer.
[ DOI ]
Orléans Skeleton Library (OSL) provides a set
of algorithmic skeletons as a C++ library on top of
MPI. The parallel programming approach of OSL is a
structure one, which eases the reasoning about the
performance and correctness of the programs. In this
paper we present the verification of a heat
diffusion simulation program, written in OSL, using
the Coq proof assistant. The specification used in
proving the correctness is a discretisation of the
heat equation.

[31]

Frédéric Loulergue, Frédéric Gava, Nikolai Kosmatov, and
Matthieu Lemerre.
Towards Verified Cloud Computing Environments.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 9197, Madrid, Spain, 2012. IEEE.
[ DOI ]
As the usage of cloud becomes pervasive in our
lives, it is needed to ensure the reliability,
safety and security of cloud environments. In this
paper we study a usual software stack of a cloud
environment from the perspective of formal
verification. We argue that most of the layers could
be practically formally verified, even if the work
to verify all levels is huge.

[32]

Joeffrey Légaux, Sylvain Jubertie, and Frédéric Loulergue.
Experiments in Parallel Matrix Multiplication on MultiCore
Systems.
In Algorithms and Architectures for Parallel Processing
(ICA3PP), number 7439 in LNCS, pages 362376, Fukuoka, Japan, 2012.
Springer.
[ DOI ]

[33]

Noman Javed and Frédéric Loulergue.
Parallel Programming and Performance Predictability with Orléans
Skeleton Library.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 257263, Istanbul, Turkey, 2011. IEEE.
[ DOI ]
Orléans Skeleton Library (OSL) is a library of
parallel algorithmic skeletons in C++ on top of
MPI. It provides a structured approach towards
parallel programming. Skeletons in OSL are based
over the bulk synchronous parallelism
model. Applications can be developed using different
combinations and compositions of the skeletons. This
paper illustrates the expressivity and performance
predictability of OSL with two applications: a two
dimensional heat diffusion simulation, and an exact
nbody simulation. Experiments using these
applications are performed on parallel machines.

[34]

Noman Javed and Frédéric Loulergue.
A Formal Programming Model of Orléans Skeleton Library.
In Victor Malyshkin, editor, 11th International Conference on
Parallel Computing Technologies (PaCT), LNCS 6873, pages 4052, Kazan,
Russia, 2011. Springer.
[ DOI ]
Orléans Skeleton Library (OSL) is a library of
parallel algorithmic skeletons in C++ on top of
MPI. It provides a structured approach towards
parallel programming. Skeletons in OSL are based on
the bulk synchronous parallelism model. In this
paper we present formal semantics of OSL: a
programming model and its properties proved with the
Coq assistant.

[35]

Julien Tesson and Frédéric Loulergue.
A Verified Bulk Synchronous Parallel ML Heat Diffusion Simulation.
In International Conference on Computational Science (ICCS),
pages 3645, Singapore, 2011. Elsevier.
[ DOI ]
Bulk Synchronous Parallel ML (BSML) is a structured
parallel functional programming language. It extends
a functional programming language of the ML family
with a polymorphic data structure and a very small
set of primitives. In this paper we describe a
framework for reasoning about BSML programs using
the Coq interactive theorem prover and for
extracting actual parallel programs from
proofs. This framework is illustrated through a
simulation application based on heat equation.

[36]

Louis Gesbert, Zhenjiang Hu, Frédéric Loulergue, Kiminori Matsuzaki,
and Julien Tesson.
Systematic Development of Correct Bulk Synchronous Parallel
Programs.
In Parallel and Distributed Computing, Applications and
Technologies (PDCAT), pages 334340. IEEE, 2010.
[ DOI ]
With the current generalisation of parallel
architectures arises the concern of applying formal
methods to parallelism. The complexity of parallel,
compared to sequential, programs makes them more
errorprone and difficult to verify. Bulk
Synchronous Parallelism (BSP) is a model of
computation which offers a high degree of
abstraction like PRAM models but yet a realistic
cost model based on a structured parallelism. We
propose a framework for refining a sequential
specification toward a functional BSP program, the
whole process being done with the help of the Coq
proof assistant. To do so we define BH, a new
homomorphic skeleton, which captures the essence of
BSP computation in an algorithmic level, and also
serves as a bridge in mapping from high level
specification to low level BSP parallel programs.

[37]

Julien Tesson, H. Hashimoto, Zhenjiang Hu, Frédéric Loulergue, and
Masato Takeichi.
Program Calculation in Coq.
In Algebraic Methodology And Software Technology (AMAST), LNCS
6486, pages 163179. Springer, 2010.
[ DOI ]
Program calculation, being a programming technique
that derives programs from specification by means of
formula manipulation, is a challenging activity. It
requires human insights and creativity, and needs
systems to help human to focus on clever parts of
the derivation by automating tedious ones and
verifying correctness of transformations. Different
from many existing systems, we show in this paper
that Coq, a popular theorem prover, provides a cheap
way to implement a powerful system to support
program calculation, which has not been recognized
so far. We design and implement a set of tactics for
the Coq proof assistant to help the user to derive
programs by program calculation and to write proofs
in calculational form. The use of these tactics is
demonstrated through program calculations in Coq
based on the theory of lists.

[38]

Noman Javed and Frédéric Loulergue.
OSL: Optimized Bulk Synchronous Parallel Skeletons on Distributed
Arrays.
In Y. Don, R. Gruber, and J. Joller, editors, 8th international
Conference on Advanced Parallel Processing Technologies (APPT'09), LNCS
5737, pages 436451. Springer, 2009.
[ DOI ]
The existing solutions to program parallel
architectures range from parallelizing compilers
to distributed concurrent programming. Intermediate
approaches propose a more structured parallelism:
Algorithmic skeletons are higherorder functions
that capture the patterns of parallel
algorithms. The user of the library has just to
compose some of the skeletons to write her parallel
application. When one is designing a parallel
program, the parallel performance is important. It
is thus very interesting for the programmer to rely
on a simple yet realistic parallel performance model
such as the Bulk Synchronous Parallel (BSP)
model. We present OSL, the Orléans Skeleton
Library: it is a library of BSP algorithmic
skeletons in C++. It offers dataparallel skeletons
on arrays as well as communication oriented
skeletons. The performances of OSL are demonstrated
with two applications: heat equation and FFT.

[39]

Louis Gesbert and Frédéric Loulergue.
Semantics of an Exception Mechanism for Bulk Synchronous Parallel
ML.
In Parallel and Distributed Computing, Applications and
Technologies (PDCAT), pages 201208. IEEE Computer Society, 2007.
[ DOI ]
Bulk Synchronous Parallel ML is a highlevel
language for programming parallel algorithms. Built
upon OCaml and using the BSP model, it provides a
safe setting for their implementation, avoiding
concurrency related problems (deadlocks,
indeterminism). Only a limited set of the features
of OCaml can be used in BSML to respect its
properties of safety: this paper describes a way to
add exception handling to this set by extending and
adapting OCaml's exceptions. After a precise
definition of the problems that arise and an
informal description of the solutions, an extension
of BSML is proposed. Formal semantics define the
behaviour in all possible cases, followed by a short
description of the implementation.

[40]

Louis Gesbert, Frédéric Gava, Frédéric Loulergue, and
Frédéric Dabrowski.
Bulk Synchronous Parallel ML with Exceptions.
In Peter Kacsuk, Thomas Fahringer, and Zsolt Nemeth, editors,
Distributed and Parallel Systems (DAPSYS 2006), pages 3342. Springer,
2006.
[ DOI ]
Bulk Synchronous Parallel ML is a highlevel
language for programming parallel
algorithms. Building upon OCaml and using the BSP
model, it provides a safe setting for their
implementation, avoiding concurrency related
problems (deadlocks, indeterminism). Only a limited
set of the features of OCaml can be used in BSML to
respect its safety: this paper describes a way to
add exception handling to this set by extending and
adapting OCaml's exceptions. The behaviour of these
new exceptions and the syntactic constructs to
handle them, together with their implementation, are
described in detail, and results over an example are
given.

[41]

Frédéric Loulergue, Radia Benheddi, Frédéric Gava, and Dimitri
LouisRegis.
Bulk Synchronous Parallel ML: Semantics and Implementation of the
Parallel Juxtaposition.
In International Computer Science Symposium in Russia (CSR
2006), volume 3967 of LNCS, pages 475486. Springer, 2006.
[ DOI ]
The Bulk Synchronous Parallel ML (BSML) library is
a library for Bulk Synchronous Parallel (BSP)
programming with the functional language Objective
Caml. It is based on an extension of the
lambdacalculus by parallel operations on a parallel
data structure named parallel vector. This paper
presents the execution model, given as a distributed
semantics, of the parallel juxtaposition, a
primitive of parallel composition for BSML. It also
presents an implementation, which follows the
semantics, and experiments using the parallel
juxtaposition.

[42]

Frédéric Loulergue, Frédéric Gava, and David Billiet.
Bulk Synchronous Parallel ML: Modular Implementation and Performance
Prediction.
In International Conference on Computational Science (ICCS),
volume 3515 of LNCS, pages 10461054. Springer, 2005.
[ DOI ]
The BSMLlib is a library for parallel programming
with the functional language Objective Caml. It is
based on an extension of the λcalculus by
parallel operations on a parallel data structure
named parallel vector. The execution time can be
estimated, deadlocks and indeterminism are
avoided. Programs are written as usual functional
programs (in Objective Caml) but using a small set
of additional functions. Provided functions are used
to access the parameters of the parallel machine and
to create and operate on parallel vectors. It
follows the execution and cost model of the Bulk
Synchronous Parallel model. The paper presents the
lastest implementation of this library and
experiments of performance prediction.

[43]

Frédéric Loulergue.
Optimizing Bulk Synchronous Parallel ML.
In Chao Lu and Roger Lee, editors, 6th ACIS International
Conference on Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing (SNPD 2005), pages 294299. IEEE Computer
Society, 2005.
Bulk Synchronous Parallel ML is a functional
parallel language based on the Bulk Synchronous
Parallelism model of computation. Deadlocks are
avoided and programs are deterministic. The
performance of programs can be accurately
predicted. This paper addresses the optimization of
BSML programs through the compilation of BSML
primitives to Objective Caml code with calls to a
low level library rather than their direct
implementation as a high level library.

[44]

Frédéric Gava and Frédéric Loulergue.
Semantics of a Functional Bulk Synchronous Parallel Language with
Imperative Features.
In Gaétan Joubert, W. Nagel, Frédéric Peters, and
W. Walter, editors, Parallel Computing: Software Technology, Algorithms,
Architectures and Applications, Proceedings of the 10th ParCo Conference,
pages 95102, Dresden, 2004. North Holland/Elsevier.
The Bulk Synchronous Parallel ML (BSML) is a
functional language for Bulk Synchronous Parallel
(BSP) programming, on top of the sequential
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a parallel data structure named
parallel vector, which is given by intention. The
Objective Caml language is a functional language but
it also offers imperative features. This paper
presents formal semantics of BSML with references,
assignment and dereferencing.

[45]

Frédéric Loulergue.
A Calculus of Functional BSP Programs with Explicit Substitution.
In Gaétan Joubert, W. Nagel, Frédéric Peters, and
W. Walter, editors, Parallel Computing: Software Technology, Algorithms,
Architectures and Applications, Proceedings of the 10th ParCo Conference,
pages 127134, Dresden, 2004. North Holland/Elsevier.
The BSlambdasigmaUparrowcalculus calculus is a
calculus of functional BSP programs with enumerated
parallel vectors and explicit substitution. This
confluent calculus is defined and proved
confluent. These results constitute the core of a
formal design for a Bulk Synchronous Parallel
dialect of ML (BSML) as well as a framework for
proving parallel abstract machine which can evaluate
BSML programs.

[46]

Frédéric Loulergue.
An Event Oriented Functional Parallel Language.
In Gaétan Joubert, W. Nagel, Frédéric Peters, and
W. Walter, editors, Parallel Computing: Software Technology, Algorithms,
Architectures and Applications, Proceedings of the 10th ParCo Conference,
pages 7986, Dresden, 2004. North Holland/Elsevier.
This paper presents the design of the core of a
parallel programming language called CDS*. It is
based on explicitlydistributed concrete data
structures and features compositional semantics,
higherorder functions and explicitly distributed
objects. The denotational semantics is outlined, the
(equivalent) operational semantics is presented and
a new realization of the latter is given as a
rewriting system. Simulated execution of examples
illustrates the language's flexibility and explicit
control of data layout on the parallel network.

[47]

Frédéric Loulergue.
Management of Communication Environments for Minimally Synchronous
Parallel ML.
In Z. Juhasz, P. Kacsuk, and D. Kranzimuller, editors,
Distributed and Parallel Systems (DAPSYS 2004), pages 185192. Springer,
2004.
Minimally Synchronous Parallel ML is a functional
parallel language whose execution time can then be
estimated and deadlocks and indeterminism are
avoided. Programs are written as usual ML programs
but using a small set of additional
functions. Provided functions are used to access the
parameters of the parallel machine and to create and
operate on a parallel data structure. It follows the
cost model of the Message Passing Machine model
(MPM). In the current implementation, the asynchrony
is limited by a parameter called the asynchrony
depth. When processes reach this depth a global
synchronization occurs. This is necessary to avoid
memory leak. In this paper we propose a mechanism to
avoid such synchronization barriers. This mechanism
relies on a more complex management of the
communication environments but with a small and
parametrized overhead.

[48]

Frédéric Loulergue.
Communication Primitives for Minimally Synchronous Parallel ML.
In International Conference on Computational Science (ICCS),
LNCS, pages 411414. Springer Verlag, 2004.
[ DOI ]
Minimally Synchronous Parallel ML is a functional
parallel language whose execution time can then be
estimated and deadlocks and indeterminism are
avoided. Programs are written as usual ML programs
but using a small set of additional
functions. Provided functions are used to access the
parameters of the parallel machine and to create and
operate on a parallel data structure. It follows the
cost model of the Message Passing Machine model
(MPM). This paper explore the writing of an
additional communication function using this small
set of primitives. It could also be considered as a
primitive rather than a function written using
primitives. So we developed a lowlevel
implementation of this function and compared it with
the highlevel implementation in terms of
efficiency.

[49]

Myrto Arapinis, Frédéric Loulergue, Frédéric Gava, and
Frédéric Dabrowski.
Semantics of Minimally Synchronous Parallel ML.
In W. Dosch and R. Y. Lee, editors, 4th International
Conference on Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing (SNPD'03), pages 260267. ACIS, 2003.
This paper presents a new functional parallel
language: Minimally Synchronous Parallel ML. It
shares with Bulk Synchronous Parallel ML its syntax
and highlevel semantics but it has a minimally
synchronous distributed semantics. It follows the
cost model of the Message Passing Machine model
(MPM).

[50]

Frédéric Dabrowski and Frédéric Loulergue.
Functional Bulk Synchronous Programming in C++.
In 21st IASTED International Multiconference, Applied
Informatics (AI 2003), Symposium on Parallel and Distributed Computing and
Networks, pages 462467. ACTA Press, february 2003.
This paper presents the BSFC++ library for
functional bulk synchronous parallel programming in
C++. It is based on an extension of the
λcalculus by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. This guarantees the
determinism and the absence of deadlock. Broadcast
algorithms are implemented using the core library.

[51]

Frédéric Dabrowski, Frédéric Loulergue, and Frédéric
Gava.
Pattern Matching of Parallel Values in Bulk Synchronous Parallel
ML.
In W. Dosch and R. Y. Lee, editors, 4th International
Conference on Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing (SNPD'03), pages 301308. ACIS, 2003.
This paper presents an extension of the functional
parallel language BSML (Bulk Synchronous Parallel
ML) and of the BSlambdacalculus (a calculus of
functional bulk synchronous parallel programs) with
pattern matching of parallel values.

[52]

Frédéric Gava and Frédéric Loulergue.
A Parallel Virtual Machine for Bulk Synchronous Parallel ML.
In International Conference on Computational Science (ICCS),
number 2657 in LNCS, pages 155164. Springer Verlag, june 2003.
[ DOI ]
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel (BSP) algorithms in
socalled direct mode. In a directmode BSP
algorithm, the physical structure of processes is
made explicit. The execution time can then be
estimated and deadlocks and indeterminism are
avoided. The BSMLlib library has been implemented
for the Objective Caml language. But there is
currently no full implementation of such a language
and an abstract machine is needed to validate such
an implementation. Our approach is based on a
bytecode compilation to a parallel abstract machine
performing exchange of data and synchronous requests
derived from the ZAM, the efficient abstract machine
of the Objective Caml language.

[53]

Frédéric Gava and Frédéric Loulergue.
A Polymorphic Type System for Bulk Synchronous Parallel ML.
In V. Malyshkin, editor, Seventh International Conference on
Parallel Computing Technologies (PaCT 2003), number 2763 in LNCS, pages
215229. Springer Verlag, 2003.
[ DOI ]
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a data structure named parallel
vector, which is given by intention. In order to
have an execution that follows the BSP model, and to
have a simple cost model, nesting of parallel
vectors is not allowed. The novelty of this paper
is a type system which prevents such nesting. This
system is correct w.r.t. the dynamic semantics
which is also presented.

[54]

Frédéric Gava, Frédéric Loulergue, and Frédéric
Dabrowski.
A Parallel Categorical Abstract Machine for Bulk Synchronous
Parallel ML.
In W. Dosch and R. Y. Lee, editors, 4th International
Conference on Software Engineering, Artificial Intelligence, Networking, and
Parallel/Distributed Computing (SNPD'03), pages 293300. ACIS, 2003.
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel (BSP) algorithms in
socalled direct mode. In a directmode BSP
algorithm, the physical structure of processes is
made explicit. The execution time can then be
estimated and deadlocks and indeterminism are
avoided. The BSMLlib library has been implemented
for the Objective Caml language. But there is
currently no full implementation of such a language
and an abstract machine is needed to have a
certified implementation. Our approach is based on a
bytecode compilation to a parallel abstract machine
performing exchange of data and synchronous requests
derived from the abstract machine of the Caml
language.

[55]

Frédéric Loulergue.
Parallel Juxtaposition for Bulk Synchronous Parallel ML.
In H. Kosch, L. Boszorményi, and H. Hellwagner, editors,
EuroPar 2003, number 2790 in LNCS, pages 781788. Springer Verlag, 2003.
[ DOI ]
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a parallel data structure named
parallel vector. An attempt to add a parallel
composition to this approach led to a nonconfluent
calculus and to a restricted form of parallel
composition. This paper presents a new, simpler and
more general semantics for parallel composition,
renamed here parallel juxtaposition, as well as an
associated cost model derived from the BSP cost
model.

[56]

Frédéric Loulergue.
Parallel Superposition for Bulk Synchronous Parallel ML.
In International Conference on Computational Science (ICCS),
number 2659 in LNCS, pages 223232. Springer Verlag, 2003.
[ DOI ]
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a parallel data structure named
parallel vector, which is given by intention. Those
operations are flat and allow BSP programming
in direct mode but it is impossible to express
directly divideandconquer algorithms. This paper
presents a new construction for the BSMLlib
library which can express divideandconquer
algorithms. It is called parallel superposition
because it can be seen as the parallel composition
of two BSP threads which can each use all the
processors. An associated cost model derived from
the BSP cost model is also given.

[57]

Gaétan Hains and Frédéric Loulergue.
Functional Bulk Synchronous Parallel Programming using the BSMLlib
Library.
In S. Gorlatch and C. Lengauer, editors, Constructive Methods
for Parallel Programming, Advances in Computation: Theory and Practice,
pages 165178. Nova Science Publishers, august 2002.
BSMLlib is a functional dataparallel library for
programming bulksynchronous parallel (BSP)
algorithms in Objective CAML. This article
demonstrates the expressivity of BSMLlib operations
on elementary algorithms. Other operations, on a
type of parallel sets, illustrate how BSP exchange
phases are written as small, simple programs. A
longer example, inspired by parallel join algorithms
for relational databases, demonstrates the advantage
of using BSMLlib's explicit processes for dynamic
load balancing. Finally, suggestions are made for
extending the library and building a complete BSML
language with the same operations.

[58]

Frédéric Loulergue.
Implementation of a Functional Bulk Synchronous Parallel Programming
Library.
In 14th IASTED International Conference on Parallel and
Distributed Computing Systems, pages 452457. ACTA Press, 2002.
The BSMLlib is a library for Bulk Synchronous
Parallel (BSP) programming with the functional
language Objective Caml. It is based on an extension
of the lambdacalculus by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. A first implementation of
this library was based on the BSPlib library, which
is not longer supported nor updated. Being the basis
of a framework for Grid computing, a new
implementation of the BSMLlib based on MPI has been
designed. Experimental results on a cluster of PCs
are presented.

[59]

Frédéric Loulergue.
Parallel Composition and Bulk Synchronous Parallel Functional
Programming.
In S. Gilmore, editor, Trends in Functional Programming, Volume
2, pages 7788. Intellect Books, 2001.
[ ePub ]
The work described here is part of our research
program to investigate new paradigms for declarative
parallel programming through special interpretations
of standard semantics. Our approach is a purely
functional approach to programming of BSP
algorithms. An attempt to add a parallel composition
to this approach led to a nonconfluent
calculus. This paper presents a solution for adding
the parallel composition to the weak callbyvalue
strategy of the BSlambda_p and to the BSMLlib
library.

[60]

Olivier Ballereau, Frédéric Loulergue, and Gaétan Hains.
Highlevel BSP Programming: BSML and BSλ.
In Gaétan Michaelson and Ph. Trinder, editors, Trends in
Functional Programming, pages 2938. Intellect Books, 2000.
[ ePub ]
A functional dataparallel language called BSML is
designed for programming bulksynchronous parallel
(BSP) algorithms in socalled direct mode. Its aim
is to combine the generality of languages like NESL
with the predictable performance of directmode BSP
algorithms. The BSML operations are motivated and
described. Experiments with a library implementation
of BSML show the possibility and limitations of
parallel performance prediction in this framework.

[61]

Frédéric Loulergue.
BSλ_{p}: Functional BSP Programs on Enumerated Vectors.
In J. Kazuki, editor, International Symposium on High
Performance Computing, number 1940 in LNCS, pages 355363. Springer,
October 2000.
[ DOI 
ePub ]
The BSlambda_p calculus is a calculus of functional
BSP programs on enumerated parallel vectors. This
confluent calculus is defined and a parallel cost
model is associated with a weak callbyvalue
strategy. These results constitute the core of a
formal design for a BSP dialect of ML.

[62]

Frédéric Loulergue and Gaétan Hains.
Parallel functional programming with explicit processes: Beyond
SPMD.
In C. Lengauer, M. Griebl, and S. Gorlatch, editors,
EuroPar'97 Parallel Processing, number 1300 in LNCS, pages 530537,
Passau, Germany, August 1997. Springer.
[ ePub ]
This paper presents the design for a purely
functional parallel language with higherorder
functions. Data layout is explicit in the language
and arbitrary synchronizations can be specified in
the nonstandard type system based on concrete data
structures. We present its denotational semantics,
operational semantics, full abstraction property and
program examples to illustrate its key features.
Article de tutoriel / Tutorial Paper

[1]

Victor Allombert, Mathias Bourgoin, and Frédéric Loulergue.
Parallel programming with OCaml: A tutorial.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 311, Orléans, France, 2018. IEEE.
[ DOI ]
OCaml is a multiparadigm (functional, imperative,
objectoriented) high level sequential
language. Types are statically inferred by the
compiler and the type system is expressive and
strong. These features make OCaml a very productive
language for developing efficient and safe programs.
In this tutorial we present three frameworks for
using OCaml to program scalable parallel
architectures: BSML, MultiML and Spoc.

[2]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
A lesson on verification of IoT software with FramaC.
In International Conference on High Performance Computing and
Simulation (HPCS), pages 2130, Orléans, France, 2018. IEEE.
[ DOI ]
This paper is a tutorial introduction to FramaC, a
framework for the analysis and verification of
sequential C programs, and in particular its EVA,
WP, and EACSL plugins. The examples are drawn from
Contiki, a lightweight operating system for the
Internet of Things.
Tutoriel / Tutorial

[1]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Towards reliable things: Formal verification of IoT software with
FramaC.
29th IEEE International Symposium on Software Reliability Engineering
(ISSRE), 2018.
Abstract Among distributed systems, connected
devices and services, also referred to as the
Internet of Things (IoT), have proliferated very
quickly in the past years. There are now billions of
interconnected devices, and this number is
growing. It is anticipated that by 2021, about 46
billions of devices will be in use. Some of these
devices are in service in safety and security
critical domains, and even in domains that are not
necessarily critical, privacy issues may arise with
devices collecting and transmitting a lot of
personal information. Formal methods have been used
successfully for years in highly critical domains,
now they can help to bring security into the IoT
field. In practice it is common to rely on a
combination of formal methods to achieve an
appropriate degree of guarantee: static analyses to
guarantee the absence of runtime errors, deductive
verification of functional correctness, dynamic
verification for parts that cannot be proved using
deductive verification. This tutorial is focused on
FramaC, which is a source code analysis platform
that aims at conducting verification of
industrialsize programs written in ISO C99 source
code. FramaC fully supports the combination of
formal methods approach, by providing to its users
with a collection of plugins that perform static
and dynamic analysis for safety and security
critical software. Moreover collaborative
verification across cooperating plugins is enabled
by their integration on top of a shared kernel, and
their compliance to a common specification language
ACSL. Recently FramaC has been applied to the
verification of software in the context of the
Internet of Things.

[2]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Secure your things: Secure development of IoT software with
FramaC.
In IEEE Cybersecurity Development Conference (SecDev), pages
126127. IEEE, 2018.
[ DOI ]
Among distributed systems, connected devices and
services, also referred to as the Internet of Things
(IoT), have proliferated very quickly in the past
years. There are now billions of interconnected
devices, and this number is growing. It is
anticipated that by 2021, about 46 billions of
devices will be in use. Some of these devices are
in service in security critical domains, and even in
domains that are not necessarily critical, privacy
issues may arise with devices collecting and
transmitting a lot of personal information. Moreover
insufficiently secured devices may be used for
example for massive distributed denial of service
attacks. This raises important security
challenges. Formal methods have been used
successfully for years in highly critical domains,
now they can help to bring security into the IoT
field. While the correctness of an implementation
with respect to a formal functional specification
provides the strongest form of guarantee, it can be
very costly to achieve. In practice it is therefore
more common to rely on a combination of formal
methods to achieve an appropriate degree of
guarantee: static analyses to guarantee the absence
of runtime errors, deductive verification of
functional correctness, dynamic verification for
parts that cannot be proved using deductive
verification. FramaC is a source code analysis
platform that aims at conducting verification of
industrialsize programs written in ISO C99 source
code. FramaC fully supports the combination of
formal methods approach, by providing to its users
with a collection of plugins that perform static
and dynamic analysis for safety and security
critical software. Moreover collaborative
verification across cooperating plugins is enabled
by their integration on top of a shared kernel, and
their compliance to a common specification language:
ACSL . Recently FramaC has been applied to the
verification of software in the context of the
Internet of Things, more specifically the
verification of modules of Contiki, an open source
operating system for the IoT.

[3]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Towards secure things, or how to verify IoT software with
FramaC.
Zooming Innovation in Consumer Electronics International Conference
(ZINC), 2018.
It is commonly said the "S" in Internet of Things
(IoT) stands for "Security": when it is not absent,
it is at least unnoticeable. However, security in
IoT becomes an important concern since in both the
industry and everyday life, it is gaining wider and
wider adoption. Thus, IoT is now targeting more
critical domains, and making domains that were not
critical, until now, collect and transmit a lot of
personal data. For years, this kind of challenges
have been successfully addressed in highly critical
domains using formal methods. Now, formal methods
can help to bring security in the IoT field. FramaC
is a source code analysis platform used to conduct
verification of industrialsize programs written in
ISO C99. It provides a collection of plugins that
perform static and dynamic analysis for safety and
security critical software. These plugins can
collaborate to the verification task thanks to their
integration on top of a shared kernel and their
compliance to ACSL, a specification language for C.
Participants will learn how to use different FramaC
analyzers and how to combine their results. During
the tutorial, several examples and usecases will
give them a clear practical vision of possible
usages of the underlying static and dynamic analyses
in their everyday work. The presented code fragments
are part of Contiki, a realworld lightweight
operating system for the IoT. Each part consists of
a presentation using slides and live demonstration,
and a session of exercises. The attendees will be
provided a virtual machine image for VirtualBox
containing all the tools ready to use, to work on
the exercises. We will also provide additional
exercises and we will be available during the
conference (and after) to help the attendees who may
want to go beyond the tutorial material.

[4]

Frédéric Loulergue.
Systematic Development of Programs for Scalable Computing Using Coq
(tutorial).
In International Conference on High Performance Computing and
Simulation (HPCS), Genoa, Italy, 2017. IEEE.
[ DOI ]
The SyDPaCC system is a set of libraries for the pr
oof assistant Coq that allows to write naive (i.e.
inef fi cient) functional programs then to transform
them into ef fi cient versions that could be
automatically parallelized within the framework
before being extracted from Coq to code in the
functional language OCaml plus calls to the parallel
functional programming library Bulk Synchronous
Parallel ML. These goals of the tutorial are to
provide an introduction to the development of
correctbyconstruction parallel programs, and to
able the attendees to develop functio nal parallel
programs using the SYDPACC system and the Coq proof
assistant.

[5]

Frédéric Loulergue.
Development of correctbyconstruction functional parallel programs
(tutorial).
In ACM Symposium on Applied Computing (SAC). ACM, 2016.
[ http ]
The SyDPaCC system is a set of libraries for the pr
oof assistant Coq that allows to write naive (i.e.
inef fi cient) functional programs then to transform
them into ef fi cient versions that could be
automatically parallelized within the framework
before being extracted from Coq to code in the
functional language OCaml plus calls to the parallel
functional programming library Bulk Synchronous
Parallel ML. These goals of the tutorial are to
provide an introduction to the development of
correctbyconstruction parallel programs, and to
able the attendees to develop functio nal parallel
programs using the SYDPACC system and the Coq proof
assistant.

[6]

Frédéric Loulergue and Julien Tesson.
Certified Parallel Program Calculation in Coq: A Tutorial.
In International Conference on High Performance Computing and
Simulation (HPCS). IEEE, 2014.
[ DOI ]
With the current generalization of parallel
architectures and increasing requirement of parallel
computation arises the concern of applying formal
methods, which allow specifications of parallel and
distributed programs to be precisely stated and the
conformance of an implementation to be verified
using mathematical techniques. However, the
complexity of parallel programs, compared to
sequential ones, makes them more errorprone and
difficult to verify. This calls for a strongly
structured form of parallelism which should not only
be equipped with an abstraction or model that
conceals much of the complexity of parallel
computation, but also provide a systematic way of
developing such parallelism from specifications for
practically nontrivial examples. Transformational
programming is a methodology that offers some scope
for making the construction of efficient programs
more mathematical. Program calculation is a kind of
program transformation based on the theory of
constructive algorithmics. An efficient program is
derived stepbystep through a sequence of
transformations that preserve the meaning and hence
the correctness. With suitable datastructures,
program calculation can be used for writing parallel
programs. However, once an efficient (and correct
with respect to the initial specification)
formulation of the program is obtained through
transformations, the program is then implemented
using a parallel library of algorithmic skeletons
most of the time written in C++ with calls to a
communication library such as MPI. There is no
formal correspondence between the efficient program
obtained by transformation and the C++ skeletal
program. Moreover the transformation itself is
usually a penandpaper process that could contain
errors. The SyDPaCC system is a set of
libraries for the Coq proof assistant that allows to
write naive functional programs then to transform
them into efficient versions that could be
automatically parallelised within the framework
before being extracted from Coq to real code in the
functional language OCaml plus calls to the parallel
functional programming library Bulk Synchronous
Parallel ML The tutorial is an introduction to the
Coq proof assistant and the SyDPaCC system
for the systematic development of correct and
verified parallel programs.
Exposé à un colloque international / Talk at an International Conference without Proceedings

[1]

Virginia Niculescu and Frédéric Loulergue.
Transforming powerlist based divide&conquer programs for an improved
execution model.
In High Level Parallel Programming and Applications (HLPP),
Orléans, France, 2018.
Divide&Conquer (DC) represents an important design
pattern for parallel programs. Powerlists are data
structures that could be successfully used for
defining parallel programs based on this paradigm.
These parallel recursive data structures and their
algebraic theories offer both a methodology to
design parallel algorithms and parallel programming
abstractions to ease the development of parallel
applications. The paper presents an improvement of
the parallel recursive programs defined based on
powerlists. The improvements is achieved by applying
transformation rules that introduce tuple functions
and prefix operators, for which a more efficient
execution model is defined. Together with the
execution model, a cost model is also defined in
order to allow a proper evaluation. The treated
examples emphasize the fact that the transformation
leads to important improvements of the programs, by
reducing their theoretical timecomplexities, and
the number of recursive calls, and also by allowing
splitting and combining of different data structures
to be packed together. Also, enhancing the function
that has to be computed to other useful functions
using a tuple, could improved the cost reduction
even more.

[2]

Thibaut Tachon, Gaétan Hains, Frédéric Loulergue, and Chong
Li.
Automated generation of BSP automata.
In High Level Parallel Programming and Applications (HLPP),
Münster, Germany, 2016.
BulkSynchronous Parallel (BSP) is a bridging model
between abstract execution and concrete parallel
architecture. In order to model BSP program
executions Hains adapted the finite automata theory
to the BSP paradigm by introducing BSP automata
theory. Benefit provided by BSP automata is
twofold: modeling BSP program control and
parallelizing finite state automata. The lack of
generation algorithm of BSP automata and the
illusory lack of application of this theory has been
preventing this theory from being used. We propose
in this paper an algorithm that generates a BSP
automaton recognizing a defined BSP language. In
order to demonstrate the usefulness of BSP automata
and help to design its use, two applications of the
BSP automata theory are provided. The parallel
recognition of an expression and debugging of a BSP
program.

[3]

Frédéric Loulergue.
On the Formal Verification of Computer Simulations.
In 5th Models and Simulations conference (MS5), The Finnish
Centre of Excellence in the Philosophy of the Social Sciences, Helsinki,
Finland, June 2012.
Atelier international / International Workshop

[1]

Dara Ly, Nikolai Kosmatov, Frédéric Loulergue, and Julien Signoles.
Soundness of a dataflow analysis for memory monitoring.
In Workshop on Languages and Tools for Ensuring CyberResilience
in Critical SoftwareIntensive Systems (HILT), Boston, MA, USA, 2018. ACM.
to appear.
An important concern addressed by runtime
verification tools for C code is related to
detecting memory errors. It requires to monitor some
properties of memory locations (e.g. their validity
and initialization) along the whole program
execution. Static analysis based optimizations have
been shown to significantly improve the performances
of such tools by reducing the monitoring of
irrelevant locations. However, soundness of the
verdict of the whole tool strongly depends on the
soundness of the underlying static analysis
technique. This paper tackles this issue for the
dataflow analysis used to optimize the EACSL
runtime assertion checking tool. We formally define
the core dataflow analysis used by EACSL and prove
its soundness.

[2]

Allan Blanchard, Frédéric Loulergue, and Nikolai Kosmatov.
From Concurrent Programs to Simulating Sequential Programs:
Correctness of a Transformation.
In Alexei Lisitsa, Andrei P. Nemytykh, and Maurizio Proietti,
editors, Proceedings Fifth International Workshop on Verification and
Program Transformation, Uppsala, Sweden, 29th April 2017, volume 253 of
Electronic Proceedings in Theoretical Computer Science, pages 109123. Open
Publishing Association, 2017.
[ DOI ]
FramaC is a software analysis framework that
provides a common infrastructure and a common
behavioral specification language to plugins that
implement various static and dynamic analyses of C
programs. Most plugins do not support
concurrency. We have proposed conc2seq, a FramaC
plugin based on program transformation, capable to
leverage the existing huge code base of plugins and
to handle concurrent C programs. In this paper we
formalize and sketch the proof of correctness of the
program transformation principle behind conc2seq,
and present an effort towards the full mechanization
of both the for malization and proofs with the
proof assistant Coq.

[3]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
A CHRBased Solver for Weak Memory Behaviors.
In 7th Workshop on Constraint Solvers in Testing, Verification,
and Analysis (CSTVA), pages 1522. CEUR Workshop Proceedings, 2016.
[ .pdf ]
With the wide expansion of multiprocessor
architectures, the analysis and reasoning for
programs under weak memory models has become an
important concern. This work presents an original
constraint solver for detecting program behaviors
respecting a particular memory model. It is
implemented in Prolog using CHR (Constraint Handling
Rules). The CHR formalism provides a convenient
generic solution for specifying memory models, that
benefits from the existing optimized implementations
of CHR and can be easily extended to new models. We
briefly present the solver design, illustrate the
encoding of memory model constraints in CHR and
discuss the benefits and limitations of the
proposal.

[4]

Allan Blanchard, Nikolai Kosmatov, Matthieu Lemerre, and Frédéric
Loulergue.
A case study on formal verification of the Anaxagoros hypervisor
paging system with FramaC.
In International Workshop on Formal Methods for Industrial
Critical Systems (FMICS), LNCS, pages 1530, Oslo, Norway, June 2015.
Springer.
[ DOI ]
Cloud hypervisors are critical software whose formal
verification can increase our confidence in the
reliability and security of the cloud. This work
presents a case study on formal verification of the
virtual memory system of the cloud hypervisor
Anaxagoros, a microkernel designed for resource
isolation and protection. The code under
verification is specified and proven in the
FramaC software verification framework, mostly
using automatic theorem proving. The remaining
properties are interactively proven with the Coq
proof assistant. We describe in detail selected
aspects of the case study, including parallel
execution and counting references to pages, and
discuss some lessons learned, benefits and
limitations of our approach.

[5]

Mohamad Al Hajj Hassan, Mostafa Bamha, and Frédéric Loulergue.
An Efficient Skewinsensitive Algorithm for Join Processing on Grid
Architectures.
In 5th ACM SIGPLAN workshop on HighLevel Parallel Programming
and Applications, pages 1118. ACM, 2011.
[ DOI ]
Scientific experiments in many domains generate a
huge amount of data whose size is in the range of
hundreds of megabytes to petabytes. These data are
stored on geographically distributed and
heterogeneous resources. Researchers who need to
analyze and have a fast access to such data are also
located all over the globe. Queries executed by
these researchers may require the transfer of huge
amount of data over the wide area network in a
reasonable time. Due to these emerging needs, the
grid infrastructure which connects widely
geographically distributed and heterogeneous
computing and storage resources was born. In this
paper, we are interested in treating join queries on
the grid. We propose a new parallel algorithm
allowing to reduce communication and disk
Input/Output costs to minimum.

[6]

Frédéric Gava, Louis Gesbert, and Frédéric Loulergue.
Type System for a Safe Execution of Parallel Programs in BSML.
In 5th ACM SIGPLAN workshop on HighLevel Parallel Programming
and Applications, pages 2734. ACM, 2011.
[ DOI ]
BSML, or Bulk Synchronous Parallel ML, is a
highlevel language based on ML and dedicated to
parallel computation. In this paper, an extended
type system that guarantees the safety of parallel
programs is presented. It prevents nondeterminism
and deadlocks by ensuring that the invariants needed
to preserve the structured parallelism are
verified. Imperative extensions (references,
exceptions) are included, and the system is designed
for compatibility with modules.

[7]

Wadoud Bousdira, Frédéric Gava, Louis Gesbert, Frédéric
Loulergue, and Guillaume Petiot.
Functional Parallel Programming with Revised Bulk Synchronous
Parallel ML.
In Koji Nakano, editor, First International Conference on
Networking and Computing (ICNC 2010), 2nd International Workshop on Parallel
and Distributed Algorithms and Applications (PDAA), pages 191196. IEEE
Computer Society, 2010.
[ DOI ]
Bulk Synchronous Parallel ML or BSML is a
highlevel language for programming parallel
algorithms. Built upon the Objective Caml language,
it provides a safe setting for implementing Bulk
Synchronous Parallel (BSP) algorithms. It avoids
concurrency related problems: deadlocks and non
determinism. BSML is based on a very small core of
parallel primitives that extended functional
sequential programming to functional BSP programming
with a parallel data structure and operations to
manipulate it. However, in practice the primitives
for writing the parallel noncommunicating parts of
the program are not so easy to use. Thus we designed
a new syntax that makes programs easier to write and
read. Revised BSML is presented and its
expressiveness and performance are illustrated
through an application example.

[8]

Radia Benheddi and Frédéric Loulergue.
DivideandConquer Programming with Minimally Synchronous Parallel
ML.
In J. Weglarz, R. Wyrzykowski, and B. Szymanski, editors,
Seventh International Conference on Parallel Processing and Applied
Mathematics (PPAM 2007), Workshop on LanguageBased Parallel Programming
Models, number 4967 in LNCS, pages 10781085. Springer, 2008.
[ DOI ]
Minimally Synchronous Parallel ML (MSPML) is a
functional parallel programming language. It is
based on a small number of primitives on a parallel
data structure. MSPML programs are written like
usual sequential ML program and use this small set
of functions. MSPML is deterministic and deadlock
free. The execution time of the programs can be
estimated. Divideandconquer is a natural way of
expressing parallel algorithms. MSPML is a flat
language: it is not possible to split the parallel
machine in order to implement divideandconquer
parallel algorithms. This paper presents an
extension of MSPML to deal with this kind of
algorithms: a parallel composition primitive.

[9]

Julien Tesson and Frédéric Loulergue.
Formal Semantics for the DRMA Programming Style Subset of the BSPlib
Library.
In J. Weglarz, R. Wyrzykowski, and B. Szymanski, editors,
Seventh International Conference on Parallel Processing and Applied
Mathematics (PPAM 2007), Workshop on LanguageBased Parallel Programming
Models, number 4967 in LNCS, pages 11221129. Springer, 2008.
[ DOI ]
BSPlib is a programming library for C and Fortran
which supports bulk synchronous parallelism
(BSP). This paper is about a formal semantics for
the DRMA programming style of the BSPlib
library. The aim is to study the behavior of BSPlib
programs and to propose some syntactic
characterizations used to provide guarantees on
semantic properties. This work is the basis for
future tools dedicated to the validation of BSPlib
programs.

[10]

Frédéric Loulergue.
A Calculus of Functional BSP Programs with Projection.
In International Parallel & Distributed Processing Symposium,
8th Workshop on Advances in Parallel and Distributed Computational Models.
IEEE Computer Society Press, 2006.
[ DOI ]
Bulk Synchronous Parallel ML (BSML) is an extension
of the functional language Objective Caml to program
Bulk Synchronous Parallel (BSP) algorithms. It is
deterministic, deadlock free and performances are
good and predictable. Parallelism is expressed with
a set of 4 primitives on a parallel data structure
called parallel vector. These primitives are pure
functional ones: they have no sideeffect. It is
thus possible, and we did it, to prove the
correctness of BSML programs using a proof assistant
like Coq. The BSλcalculus is an extension
of the λcalculus which models the core
semantics of BSML. Nevertheless some principles of
BSML are not well captured by this calculus. This
paper presents a new calculus, with a projection
primitive, which provides a better model of the core
semantics of BSML.

[11]

Frédéric Gava and Frédéric Loulergue.
A Functional Language for Departmental Metacomputing.
In S. Gorlatch, editor, 4th Workshop on Constructive Methods for
Parallel Programming, pages 6380. Westfälische WilheimsUniversitët
Münster, 2004.
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel algorithms, a model of
computing. Deadlocks and indeterminism are avoided
and the execution time can be then estimated. For
large scale applications where parallel processing
is helpful, more than one parallel machine is
needed. One speaks about metacomputing. A major
problem in programming parallel application for such
platforms is their hierarchical network structures:
latency and bandwidth of global ones are orders of
magnitude worse than those of local networks. Here
we consider how an extention of the wellsuited BSP
model might be for these kinds of computing and how
extended our functional language can be for this.

[12]

Gaétan Hains and Frédéric Loulergue.
Functional Bulk Synchronous Parallel Programming using the BSMLlib
Library.
In S. Gorlatch, editor, Second International Workshop on
Constructive Methods for Parallel Programming (CMPP'2000),
Research Report MIP200007, June 2000.
BSMLlib is a functional dataparallel library for
programming bulksynchronous parallel (BSP)
algorithms in Objective CAML. The definition of
BSMLlib and performance prediction for elementary
operations have been discussed in Sfp2000. This
article demonstrates the expressivity of operations
on elementary algorithms. Other operations, on a
type of parallel sets, illustrate the advantage of
explicit BSP exchange phases: the possibility of
predicting and minimising communication and
synchronization costs. A more complex example,
inspired by parallel join algorithms for relational
databases, demonstrates the advantage of using
BSMLlib's explicit processes for dynamic load
balancing. Finally, suggestions are made for
extending the library and building a complete BSML
language with the same operations.

[13]

Frédéric Loulergue, Gaétan Hains, and Christian Foisy.
A Calculus of RecursiveParallel BSP Programs.
In S. Gorlatch, editor, First International Workshop on
Constructive Methods for Parallel Programming (CMPP'98), Research
Report MIP9805, pages 5970. University of Passau, May 1998.
An extension of the lambdacalculus called BSlambda
is introduced as a formal basis for functional
languages expressing bulk synchronous parallel
algorithms. A local confluence result is explained
and a callbyvalue evaluation strategy is
defined. The operational meaning of the calculus is
illustrated by two programs defining the binary fold
algorithm, one flat and one parallelrecursive.
Colloque national / National Conference

[1]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
La logique contre les fantômes: comparaison de deux approches
pour la preuve d'un module de listes chaînées.
In 18e journées Approches Formelles dans l'Assistance au
Développement de Logiciels (AFADL), June 2019.

[2]

Wadoud Bousdira, Louis Gesbert, and Frédéric Loulergue.
Syntaxe et sémantique de Revised Bulk Synchronous Parallel ML.
In Sylvain Conchon and Assia Mahboubi, editors, Journées
Francophones des Langages Applicatifs (JFLA), Studia Informatica
Universalis, pages 117146. Hermann, 2011.
Bulk Synchronous Parallel ML (BSML) est une
extension du langage fonctionnel Objective Caml,
fondé sur un modèle structuré de
parallélisme, le modèle BSP. Ce modèle
assure au programmeur BSML la sûreté
d'exécution tout en lui laissant le strict
contrôle des processeurs. Le modèle de
prévision de performances de BSML est simple et
réaliste. Le parallélisme est exprimé en
utilisant un ensemble de primitives fonctionnelles
pures sur une structure de données parallèle
appelée vecteur parallèle. Cependant, les
programmes sont souvent difficiles à écrire
et leur mise au point peut être
fastidieuse. Nous proposons dans cet article une
nouvelle syntaxe et une sémantique associée
dans le but d'écrire des programmes plus courts
et plus lisibles. Nous formalisons la syntaxe et la
sémantique classiques ainsi que les nouvelles
syntaxe et sémantique, puis nous les
modélisons en Coq. Leur confluence est
établie.

[3]

H. Hashimoto, Zhenjiang Hu, Julien Tesson, Frédéric Loulergue, and
Masato Takeichi.
A Coq Library for Program Calculation.
In JSSST Conference on Software Science and Technology, Shimane
University, Shimane, Japan, 2009.
Program calculation, being a programming technique
that derives programs from specification by means of
formula manipulation, is a challenging activity. It
requires human insights and creativity, and needs
systems to help human to focus on clever parts of
the derivation by automating tedious ones and
verifying correctness of transformations. Different
from many existing systems, we show in this paper
that Coq, a popular theorem prover, provides a cheap
way to implement a powerful system to support
program calculation, which has not been recognized
so far. We design and implement a set of tactics for
the Coq proof assistant to help the user to derive
programs by program calculation and to write proofs
in calculational form. The use of these tactics is
demonstrated through program calculations in Coq
based on the theory of lists.

[4]

Radia Benheddi and Frédéric Loulergue.
Composition parallèle pour MSPML.
In P.E. Moreau and T. Hardin, editors, Journées
Francophones des Langages Applicatifs (JFLA), pages 101116. INRIA, 2006.
Cet article présente l'ajout, à un langage
fonctionnel parallèle appelé Minimally
Synchronous Parallel ML (MSPML), d'une primitive
pour la composition parallèle, appelée
juxtaposition. MSPML est un langage fonctionnel
parallèle basé sur un petit nombre de
primitives sur une structure de données
parallèle. Les programmes sont écrits comme
des programmes ML usuels en utilisant ce petit
ensemble de fonctions. MSPML est déterministe et
sans déblocages. Le temps d'exécution des
programmes peut étre estimé. Il a une
sémantique asynchrone, c'estàdire sans
barrière de synchronisation globale. Cette
propriété s'avère intéressante pour
des programmes non équilibrés à chaque
étape. Toutefois il ne permettait pas
d'écrire des algorithmes diviserpourrègner
parallèles qui sont courants dans la
littérature. La juxtaposition remédie à
cette limitation.

[5]

Frédéric Gava and Frédéric Loulergue.
Synthèse de types pour Bulk Synchronous Parallel ML.
In J.C. Filliatre, editor, Journées Francophones des
Langages Applicatifs (JFLA 2003), pages 153168, january 2003.
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a data structure named parallel
vector, which is given by intention. In order to
have an execution that follows the BSP model, and to
have a simple cost model, nesting of parallel
vectors is not allowed. The novelty of this paper
is a type system which prevents such nesting. An
algorithm for type inference is also presented.

[6]

A. Merlin, Gaétan Hains, and Frédéric Loulergue.
A SPMD Environment Machine for Functional BSP Programs.
In Proceedings of the Third Scottish Functional Programming
Workshop, august 2001.
A functional dataparallel language called BSML has
been designed for programming bulksynchronous
parallel (BSP) algorithms in socalled direct
mode. Its aim is to offer predictable and scalable
performance for BSP algorithms written as functional
programs. The current implementation of BSML is a
library and has not been validated w.r.t the
language's formal definition. As a library, it does
not allow static tools to support performance
management. As a first step in the construction of a
complete BSML language, we have defined a parallel
environment machine from our language's evaluation
semantics and begun testing the scalability and
predictability of its performance. This paper
presents the machine design and the results of our
experiments.

[7]

Frédéric Loulergue.
Parallel Composition and Bulk Synchronous Parallel Functional
Programming.
In Stephen Gilmore, editor, Proceedings of the second Scottish
Functional Programming Workshop, St Andrews, July 2000.
The work described here is part of our research
program to investigate new paradigms for declarative
parallel programming through special interpretations
of standard semantics. Our approach is a purely
functional approach to programming of BSP
algorithms. An attempt to add a parallel composition
to this approach led to a nonconfluent
calculus. This paper presents a solution for adding
the parallel composition to the weak callbyvalue
strategy of the BSlambda_p and to the BSMLlib
library.

[8]

Olivier Ballereau, Frédéric Loulergue, and Gaétan Hains.
Highlevel BSP Programming: BSML and BSλ.
In P Trinder and Greg Michaelson, editors, Proceedings of the
first Scottish Functional Programming Workshop, number RM/99/9 in Technical
Report, pages 4352, Edinburgh, august 1999. HeriotWatt University.
BSMLlib is a functional dataparallel library for
programming bulksynchronous parallel (BSP)
algorithms in Objective CAML. The definition of
BSMLlib and performance prediction for elementary
operations have been discussed in Sfp2000. This
article demonstrates the expressivity of operations
on elementary algorithms. Other operations, on a
type of parallel sets, illustrate the advantage of
explicit BSP exchange phases: the possibility of
predicting and minimising communication and
synchronization costs. A more complex example,
inspired by parallel join algorithms for relational
databases, demonstrates the advantage of using
BSMLlib's explicit processes for dynamic load
balancing. Finally, suggestions are made for
extending the library and building a complete BSML
language with the same operations.

[9]

Frédéric Loulergue.
Extension du BSλcalcul.
In P. Weis, editor, JFLA'99 : Journées Francophones des
Langages Applicatifs, pages 93112, MorzineAvoriaz, February 1999.
[ ePub ]
Après une présentation du calcul confluent
BSλ, extension du λcalcul par des
opérations parallèles permettant l'expression
d'algorithmes BSP, nous proposons trois nouvelles
opérations parallèles étendant le calcul. Le
calcul et cette nouvelle extension sont illustrés
par différentes algorithmes de diffusion donnés
dans une syntaxe à la ML.

[10]

Frédéric Loulergue.
BSML : Programmation BSP purement
fonctionnelle.
In D. Méry and GaétanR. Perrin, editors, Dixièmes
Rencontres Francophones du Parallélisme
(Renpar'10), pages 243246, Strasbourg, june 1998.
[ ePub ]
Nous d'efinissons une extension minimale du langage
fonctionnel Caml par des op'erations parall`eles
BSP, d'eriv'ees d'un mod`ele d'enotationel simple
avec processus explicites et statiques. Le langage
obtenu, BSML, pr'eserve la s'emantique
compositionnelle et permet des pr'evisions pr'ecises
de performances gr^ace au mod`ele de co^uts
BSP. Nous l'illustrons sur des algorithmes
parall`eles de base et montrons comment un
simulateur BSML permet,a`l'aide des param`etres BSP
des machines, de pr'evoir les performances
r'eelles.

[11]

Frédéric Loulergue and G Hains.
Programmation fonctionnelle parallèle en processus statiques: Une
approche dénotationnelle.
In M. Gengler and C. Queinnec, editors, Journées
Francophones des Langages Applicatifs JFLA'97. INRIA, 1997.
[ ePub ]
This paper presents the design for a purely
functional parallel language with higherorder
functions. It is based on specialisation of a
nonstandard model of lambda calculus (concrete data
structure): array data structures. We present its
denotational semantics, operational semantics, full
abstraction property and program examples to
illustrate its key features.
Rapport de recherche / Research Report

[1]

Allan Blanchard, Nikolai Kosmatov, and Frédéric Loulergue.
Concurrent Program Verification by Code Transformation:
Correctness.
Research Report RR201703, LIFO, Université d'Orléans, 2017.
[ .pdf ]
FramaC is a framework that provides a common
infrastructure and a common behavioral specification
language to plugins that implement various static
and dynamic analyses of C programs. Most plugin do
not support concurrency. We have proposed
conc2seq a plugin that handles concurrent C
programs, based on program transformation to
leverage the existing huge code base of plugins. In
this paper we formalize and sketch the correctness
of the program transformation principle behind
conc2seq, and present an effort towards the
full mechanization of both the formalization and
proofs with the proof assistant Coq

[2]

Arvid Jakobsson, Frédéric Dabrowski, Wadoud Bousdira,
Frédéric Loulergue, and Gaétan Hains.
Replicated Synchronization for Imperative BSP Programs.
Research Report RR201704, LIFO, Université d'Orléans, 2017.
[ .pdf ]

[3]

Frédéric Dabrowski, Frédéric Loulergue, and Thomas Pinsard.
Nested Atomic Sections with Thread Escape: A Formal Definition.
Technical Report RR201305, LIFO, Université d'Orléans,
2013.
[ .pdf ]
We consider a simple imperative language with
fork/join parallelism and lexically scoped nested
atomic sections from which threads can escape. In
this context, our contribution is the precise
definition of atomicity, wellsynchronisation and
the proof that the latter implies the strong form of
the former. A formalisation of our results in the
Coq proof assistant is also available.

[4]

Noman Javed and Frédéric Loulergue.
Parallel Programming with Orléans Skeleton Library.
Technical Report RR201105, LIFO, University of Orléans, March
2011.
Orleans Skeleton Library (OSL) is a library of
parallel algorithmic skeletons in C++ on top of
MPI. It provides a structured approach towards
parallel programming. Skeletons in OSL are based
over the bulk synchronous parallelism
model. Applications can be developed using different
combinations and compositions of the skeletons. This
paper illustrates the expressivity of OSL with two
applications: a two dimensional heat diffusion
simulation, and an exact nbody
simulation. Experiments using these applications are
performed on parallel machines.

[5]

Noman Javed and Frédéric Loulergue.
A Formal Programming Model of Orléans Skeleton Library.
Technical Report RR201106, LIFO, University of Orléans, April
2011.
Orléans Skeleton Library (OSL) is a library of
parallel algorithmic skeletons in C++ on top of
MPI. It provides a structured approach towards
parallel programming. Skeletons in OSL are based on
the bulk synchronous parallelism model. In this
paper we present formal semantics of OSL: a
programming model and its properties proved with the
Coq assistant.

[6]

Julien Tesson, H. Hashimoto, Zhenjiang Hu, Frédéric Loulergue, and
Masato Takeichi.
Program Calculation in Coq.
Technical Report RR200907, LIFO, University of Orléans, 2009.
Program calculation, being a programming technique
that derives programs from specification by means of
formula manipulation, is a challenging activity. It
requires human insights and creativity, and needs
systems to help human to focus on clever parts of
the derivation by automating tedious ones and
verifying correctness of transformations. Different
from many existing systems, we show in this paper
that Coq, a popular theorem prover, provides a cheap
way to implement a powerful system to support
program calculation, which has not been recognized
so far. We design and implement a set of tactics for
the Coq proof assistant to help the user to derive
programs by program calculation and to write proofs
in calculational form. The use of these tactics is
demonstrated through program calculations in Coq
based on the theory of lists.

[7]

Frédéric Loulergue, Zhenjiang Hu, and K. Kakehi.
A Tutorial Implementation of the Diffusion Algorithmic Skeleton with
the BSMLlib Library.
Technical Report METR200406, Department of Mathematical
Informatics, University of Tokyo, 2004.
Skeleton programming enables programmers to build
parallel programs easier by providing efficient
readymade parallel algorithms. The diffusion
skeleton was proposed (associated with a method for
program derivation) to abstract a good combination
of primitive skeletons, such as map, parallel
reduction and parallel prefix sum (scan). The
BSMLlib library whose design is based on formal
semantics is a library for the Objective Caml
language to support Bulk Synchronous Parallelism.
It offers a small set of primitives which permits to
write any deterministic BSP algorithm. In this paper
we study the implementation of the diffusion
parallel skeleton using the BSMLlib library in a
tutorial way.

[8]

Frédéric Loulergue.
Communication Primitives for Minimally Synchronous Parallel ML.
Technical Report 200402, University of Paris 12, LACL, 2004.
Minimally Synchronous Parallel ML is a functional
parallel language whose execution time can then be
estimated and deadlocks and indeterminism are
avoided. Programs are written as usual ML programs
but using a small set of additional
functions. Provided functions are used to access the
parameters of the parallel machine and to create and
operate on a parallel data structure. It follows the
cost model of the Message Passing Machine model
(MPM). This paper explore the writing of an
additional communication function using this small
set of primitives. It could also be considered as a
primitive rather than a function written using
primitives. So we developed a lowlevel
implementation of this function and compared it with
the highlevel implementation in terms of
efficiency.

[9]

Frédéric Gava and Frédéric Loulergue.
Verifying Functional Bulk Synchronous Parallel Programs Using the
Coq System.
Technical Report 187, University of Freiburg, D. Basin and B. Wolff
editor, TPHOLS'03, Emerging Trends, september 2003.
The Bulk Synchronous Parallel ML (BSML) is a
functional language for Bulk Synchronous Parallel
(BSP) programming. It is based on an extension of
the λcalculus by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. We present the formal proofs
of correctness of BSML programs in the Coq proof
assistant. Such development demonstrates the
usefulness of higherorder logic in the process of
software certification and parallel
applications. They also show that proof of rather
complex parallel algorithms may be made with
inductive types by using the certified programs.

[10]

Frédéric Gava and Frédéric Loulergue.
A Parallel Virtual Machine for Bulk Synchronous Parallel ML.
Technical Report 200301, University of Paris ValdeMarne, LACL,
january 2003.
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel (BSP) algorithms in
socalled direct mode. In a directmode BSP
algorithm, the physical structure of processes is
made explicit. The execution time can then be
estimated and deadlocks and indeterminism are
avoided. The BSMLlib library has been implemented
for the Objective Caml language. But there is
currently no full implementation of such a language
and an abstract machine is needed to validate such
an implementation. Our approach is based on a
bytecode compilation to a parallel abstract machine
performing exchange of data and synchronous requests
derived from the ZAM, the efficient abstract machine
of the Objective Caml language.

[11]

Frédéric Gava and Frédéric Loulergue.
Verifying Functional Bulk Synchronous Parallel Programs Using the
Coq System.
Technical Report 200302, University of Paris 12, LACL, 2003.
The Bulk Synchronous Parallel ML (BSML) is a
functional language for Bulk Synchronous Parallel
(BSP) programming. It is based on an extension of
the λcalculus by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. We present the formal proofs
of correctness of BSML programs in the Coq proof
assistant. Such development demonstrates the
usefulness of higherorder logic in the process of
software certification and parallel
applications. They also show that proof of rather
complex parallel algorithms may be made with
inductive types by using the certified programs.

[12]

Frédéric Dabrowski and Frédéric Loulergue.
Efficiency of Bulk Synchronous Parallel Programming using C++,
BSCF++ and BSMLlib.
Technical Report 200210, University of Paris ValdeMarne, LACL,
june 2002.
The BSMLlib library is a library for Bulk
Synchronous Parallel (BSP) programming with the
functional language Objective Caml. It is based on
an extension of the λcalculus by parallel
operations on a parallel data structure named
parallel vector, which is given by intention. The
BSFC++ library is a library for Functional Bulk
Synchronous Parallel programming in C++ which is
based on the FC++ library. We present those
libraries and give experimental results. For
comparaison, MPI/C++ versions of the same programs
were also ran on our cluster of PCs.

[13]

Frédéric Loulergue.
Implementation of the BSMLlib Library v0.2.
Technical Report 200211, University of Paris ValdeMarne, LACL,
july 2002.
The BSMLlib is a library for Bulk Synchronous
Parallel (BSP) programming with the functional
language Objective Caml. It is based on an extension
of the λcalcul by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. A first implementation of
this library was based on the BSPlib library, which
is not longer supported nor updated. Being the basis
of a framework for Grid computing, a new
implementation of the BSMLlib based on MPI has been
designed. Experimental results on a cluster of PCs
are presented.

[14]

Frédéric Dabrowski and Frédéric Loulergue.
Functional Bulk Synchronous Parallel Programming in C++.
Technical Report 200213, University of Paris ValdeMarne, LACL,
august 2002.
This paper presents the BSFC++ library for
functional bulk synchronous parallel programming in
C++. It is based on an extension of the
λcalculus by parallel operations on a
parallel data structure named parallel vector, which
is given by intention. This guarantees the
determinism and the absence of deadlock. Broadcast
algorithms are implemented using the core library.

[15]

Frédéric Loulergue.
Parallel Juxtaposition for Bulk Synchronous Parallel ML.
Technical Report 200217, University of Paris ValdeMarne, LACL,
november 2002.
The BSMLlib is a library for Bulk Synchronous
Parallel (BSP) programming with the functional
language Objective Caml. It is based on an extension
of the λcalculus by parallel operations on
a parallel data structure named parallel vector,
which is given by intention. An attempt to add a
parallel composition to this approach led to a
nonconfluent calculus and to a restricted form of
parallel composition. This paper presents a new,
simpler and more general semantics for parallel
composition, renamed here parallel juxtaposition, as
well as an associated cost model derived from the
BSP cost model.

[16]

Frédéric Gava and Frédéric Loulergue.
A Bulk Synchronous Parallel Categorical Abstract Machine.
Technical Report 200220, University of Paris ValdeMarne, LACL,
december 2002.
We have designed a functional dataparallel
language called BSML for programming
bulksynchronous parallel (BSP) algorithms in
socalled direct mode. In a directmode BSP
algorithm, the physical structure of processes is
made explicit. The execution time can then be
estimated and deadlocks and indeterminism are
avoided. The BSMLlib library has been implemented
for the Objective Caml language. But there is
currently no full implementation of such a language
and an abstract machine is needed to validate such
an implementation. Our approach is based on a
natural semantics and a bytecode compilation to a
parallel abstract machine performing exchange of
data and synchronous requests derived from the
abstract machine of the Caml language.

[17]

Frédéric Loulergue.
Compositionality in Functional Bulk Synchronous Parallelism.
Technical Report 200221, University of Paris ValdeMarne, LACL,
december 2002.
The BSλ_{p}calculs an extension of the
λcalculus by bulk synchronous parallel
(BSP) operations on a parallel data structure named
parallel vector. This paper presents how functional
composition is preserved in this framework both from
the semantics point of view and from the cost model
point of view. Those operations are flat and
allow BSP programming in direct mode but it is
impossible to express directly divideandconquer
algorithms. This paper also present a new kind of
composition, a new construction for the
BSλ_{p}calculus which can express
divideandconquer algorithms. It is called parallel
superposition. An associated cost model derived from
the BSP cost model is also given.

[18]

Frédéric Loulergue.
Confluence of the BSλcalculus.
Technical Report RR9910, LIFO, Université d'Orléans, May
1999.
The BSlambdacalculus, a formal basis for functional
languages expressing bulk synchronous parallel
algorithms, is presented. It is then shown to be
confluent.

[19]

Frédéric Loulergue and Gaétan Hains.
An Introduction to BSλ.
Technical Report RR9809, LIFO, Université d'Orléans,
September 1998.
This paper is an introduction to BSlambda a calculus
of parallelrecursive BSP programs. The calculus is
presented, the confluence of its flat subset is
stated and examples are gieven in an MLlike
syntax.
Mémoire / Thesis

[1]

Frédéric Loulergue.
Functional Programming for Parallel and Meta Computing:
Semantics, Systems and Proofs.
Habilitation à diriger des recherches thesis, University Paris Val
de Marne, december 2004.

[2]

Frédéric Loulergue.
Programmation fonctionnelle d'ordinateurs parallèles et de
métaordinateurs : sémantiques, systèmes et preuves.
Mémoire d'habilitation à diriger des recherches, University
Paris Val de Marne, décembre 2004.

[3]

Frédéric Loulergue.
Conception de langages fonctionnels pour la programmation
massivement parallèle.
thèse de doctorat, Université d'Orléans, LIFO, 4 rue
Léonard de Vinci, BP 6759, F45067 Orléans Cedex 2, France, January
2000.
Certains problèmes nécessitent des performances
que seules les machines massivement parallèles
peuvent offrir. Leur programmation demeure
néanmoins difficile. Les travaux étudiant le
mélange de la programmation fonctionnelle et du
parallélisme se répartissent en deux
catégories : les extensions explicitement
parallèles des langages fonctionnels  mais les
langages obtenus sont soit indéterministes
soit brisent l'aspect fonctionnel pur  et les
implantations parallèles avec sémantique
fonctionnelle  mais les langages obtenus
n'expriment pas directement les algorithmes
parallèles et ne permettent pas la prévision des
temps d'exécution. L'approche des langages à
patrons, dans lesquels seulement un ensemble fixé
d'opérations sont exécutées en parallèle,
est intermédiaire. Leur sémantique
fonctionnelle est explicite mais leur sémantique
opérationnelle parallèle est
implicite. L'ensemble de patrons doit étre le
plus complet possible mais cet ensemble s'avère
dépendant du domaine d'application. Nous
approfondissons cette position intermédiaire avec
pour objectif de parvenir à des langages
universels dans lesquels le code source permet de
déterminer le coét. Cette dernière exigence
nécessite que soient explicites dans les
programmes les lieux du réseau de processeurs de
la machine. Une approche dénotationnelle nous a
amené à une étude des limites de
l'expressivité des langages fonctionnels
parallèles à processus explicites. Une approche
opérationnelle nous a amené à définir des
λcalculs BSP (le modèle BSP ajoute une
notion de processus explicites au parallélisme de
données) confluents et universels pour les
algorithmes BSP. Nous avons également analysé
les conditions précises d'implantation parallèle
de tels calculs, expérimenté le style de
programmation associé et constaté qu'ils sont
suffisamment expressifs et que la prévision des
temps d'exécution parallèle y est possible.
Abstract: Some problems require performance that
only massively parallel computers offer, but their
programming is still difficult. Works on functional
programming and parallelism can be divided in two
categories: explicit parallel extensions of
functional languages  where resulting languages
are either nondeterministic or non functional 
and parallel implementations with functional
semantics  where resulting languages don't express
parallel algorithms directly and don't allow the
prediction of execution times. Algorithmic skeletons
languages, in which only a finite set of operations
(the skeletons) are run in parallel, constitutes an
intermediate approach. Their functional semantics is
explicit but their parallel operational semantics is
implicit. The set of algorithmic skeletons has to be
as complete as possible but it is often dependent on
the domain of application. We explore this
intermediate position thoroughly in order to obtain
universal parallel languages where source code
determines execution cost. This last requirement
forces the use of explicit processes corresponding
to the parallel machine's processors. A denotational
approach leads us to study the expressiveness of
functional parallel languages with explicit
processes. An operational approach leads us to
define BSP lambdacalculi (the BSP model adds a
notion of explicit process to dataparallelism) that
are confluent and universal for BSP algoritms. We
also analyze the conditions for parallel
implementation of such calculi. We have experimented
the associated programming style and have observed
that they are sufficiently expressive and allow the
prediction of execution times.

[4]

Frédéric Loulergue.
Programmation parallèle fonctionnelle en processus statiques: une
approche dénotationnelle.
Mémoire de DEA d'informatique, LIFO, Université
d'Orléans, Septembre 1996.
Poster / Poster

[1]

Jolan Philippe and Frédéric Loulergue.
Towards automatically optimizing PySke programs (poster).
In International Conference on High Performance Computing and
Simulation (HPCS). IEEE, 2019.
to appear.

[2]

Jolan Philippe and Frédéric Loulergue.
Towards the generation of correct Java programs (poster).
In International Conference on High Performance Computing and
Simulation (HPCS), pages 10551056, Orléans, France, 2018. IEEE.
[ DOI ]

[3]

Salwa Souaf and Frédéric Loulergue.
Strong security guarantees: from Alloy to Coq (poster).
In International Conference on High Performance Computing and
Simulation (HPCS), pages 10571058, Orléans, France, 2018. IEEE.
[ DOI ]

[4]

Jolan Philippe, Frédéric Loulergue, and Wadoud Bousdira.
Formalization of a Big Graph API in Coq (Poster).
In International Conference on High Performance Computing and
Simulation (HPCS), pages 893894, Genoa, Italy, 2017. IEEE.
[ DOI ]

[5]

Christopher D. Whitney and Frédéric Loulergue.
Towards a Verified Parallel Implementation of Frequent Itemset
Mining (Poster).
In International Conference on High Performance Computing and
Simulation (HPCS), pages 889890, Genoa, Italy, 2017. IEEE.
[ DOI ]

[6]

Noman Javed and Frédéric Loulergue.
A Metaprogrammed Bulk Synchronous Parallel Algorithmic Skeleton
Library.
In International Conference for High Performance Computing,
Networking, Storage and Analysis (SC08), Poster, 2008.
Exposé / Talk

[1]

Frédéric Loulergue, Mostafa Bamha, and Mohamad Al Hajj Hassan.
Handling Dataskew Effects in Join Operations using MapReduce.
Journées nationales du GdR GPL, GdT LaMHA, juin 2014.
For over a decade, MapReduce has become a prominent
programming model to handle vast amounts of raw data
in large scale systems. This model ensures
scalability, reliability and availability aspects
with reasonable query processing time. However
these large scale systems still face some
challenges: data skew, task imbalance, high disk I/O
and redistribution costs can have disastrous effects
on performance. In this paper, we introduce
MRFAJoin algorithm: a new frequency adaptive
algorithm based on MapReduce programming model and a
randomised key redistribution approach for join
processing of largescale datasets. A cost analysis
of this algorithm shows that our approach is
insensitive to data skew and ensures perfect
balancing properties during all stages of join
computation. These performances have been confirmed
by a series of experimentations.

[2]

Frédéric Loulergue.
Systematic Development of Correct Programs for Parallel Computing.
Faculty of Mathematics and Informatics, BabesBolyai University of
ClujNapoca, Romania, February 2014.
With parallel machines being now the norm (from
smartphones to supercomputers) and no longer
restricted to a niche, concurrent and parallel
programming has also to become widespread. To do so,
it has to move to structured paradigms, in the same
way sequential programming did more than forty years
ago. Moreover the concern of applying formal methods
arises, which allow specifications of parallel and
distributed programs to be precisely stated and the
conformance of an implementation to be verified
using mathematical techniques. However, the
complexity of parallel programs, compared to
sequential ones, makes them more errorprone,
difficult to verify and almost impossible to debug.
Parallel programs correctness by construction is
thus an essential feature. To build correct programs
by constructions is not a simple task, and usually
the methodologies used for this purpose are rather
theoretical based on a penandpaper style. Such
theoretical methodologies are usually based on
parallel recursive structures. They are named
“parallel” because their recursive nature makes
them very suitable for divideandconquer and thus
parallelism. However when one writes an application
using parallel recursive structures, it is not yet
implemented in parallel. A realisation of parallel
recursive structures and their basic operations
should be implemented using more concrete (and
actually more “flat” than recursive) parallel
programming languages. Moreover, a better approach
could be based on software tools and formalised
theories that allow a user to develop an efficient
parallel application by implementing easily simple
programs satisfying conditions, ideally
automatically, proved. For this purpose proof
assistants make the reasoning checkable by a
machine, may provide some automation in reasoning,
and actual programs may be extracted from
developments within these proof assistants. This
talk will present an effort for providing a
framework based on the Coq proof assistant for the
systematic development of correct programs for
parallel computing.

[3]

Frédéric Loulergue.
PaPDAS: Parallel Program Development with Algorithmic Skeletons.
Journée MTV2/MFDL du GDR GPL, Grenoble, Janvier 2014.
Le projet PaPDAS a pour objectif de fournir un
environnement pour le développement
systématique de programmes parallèles se
basant sur l'algorithmique constructive (théorie
des listes) et permettant soit d'exécuter
très efficacement les programmes obtenus, soit
de les compiler avec un compilateur vérifié
gérant le parallélisme. Dans cet exposé
je présente la dérivation formelle et
l'extraction depuis Coq d'un programme parallèle
fonctionnel pour le problème All Nearest Smaller
Values (ANSV) et la comparaison avec une
implantation directe non vérifiée d'un autre
algorithme pour résoudre le même
problème.

[4]

Frédéric Loulergue, Simon Robillard, Julien Tesson, Joeffrey
Légaux, and Zhenjiang Hu.
Dérivation formelle et extraction d'un programme
dataparalléle pour le problème des valeurs inférieures les plus
proches.
AFADL, juin 2014.
The All Nearest Smaller Values (ANSV) problem is an
important problem for parallel programming as it can
be used to solve several problems and is one of the
phases of several other parallel algorithms. We
formally develop by construction a functional
parallel program for solving the ANSV problem using
the theory of Bulk Synchronous Parallel (BSP)
homomorphisms within the Coq proof assistant. The
performances of the Bulk Synchronous Parallel ML
program obtained from Coq is compared to a version
derived without software support (penandpaper) and
implemented using the Orléans Skeleton Library
of algorithmic skeletons, and to a (unproved
correct) direct implementation of the BSP algorithm
of He and Huang.

[5]

Frédéric Loulergue.
Towards a Verified GTA Library in Coq.
Deuxième Journée Informatique en Nuage à Orléans
(JINO2), January 2013.
The goal of this work is threefold. First we aim at
modelling the algebraic structures used in the GTA
framework and prove the lemma used to calculate
programs from their GTA from inside the Coq proof
assistant. Second we aim at providing some
automation inside Coq to ease the definition of
applications, with the formal verification that they
actually fullfill the requirements of the GTA
framework. Third we aim at providing the way to
extract automatically actual parallel functional
programs and MapReduce programs from these Coq
developments. This talk will present the current
status of this workinprogress. This is a joint
work with Kento Emoto, Julien Tesson, and Frederic
Dabrowski.

[6]

Frédéric Loulergue.
Towards a Framework for Systematically Developing Correct Parallel
and Cloud Programs.
NII  National Institute of Informatics, Tokyo, October 2013.
Our general goal is to ease the development of
correct and verifiable parallel programs with
predictable performances using theories and tools to
allow a user to develop an efficient application
byimplementing simple programs satisfying conditions
easily, or ideally automatically, proved. To attain
this goal, a framework should be based on (1)
highlevel algebraic theories of parallel programs,
(2) modelling of these theories inside interactive
theorem provers (also named proof assistants) such
as Coq, to be able to write and reason on programs,
(3) axiomatisation of lower level parallel
programming primitives and their use to implement
the highlevel primitives in order to extract actual
parallel code from the developments make inside
proof assistants, (4) verified compilers to ensure
that the compiled code actually preserves the
semantics (and yet the correctness) of the extracted
source code. This first lecture presents the big
picture, an overview of the other lectures and the
many open questions related to this goal.

[7]

Frédéric Loulergue.
Program Verification with the Coq Proof Assistant.
NII  National Institute of Informatics, Tokyo, November 2013.
[ .pdf ]
Coq is a proof assistant based on the CurryHoward
correspondence relating terms of a typed Lambda
calculus with proof trees of a logical system in
natural deduction form. The calculus behind Coq is
the calculus of (co)inductive constructions, an
extension of the initial Calculus of
Constructions. The main usages of Coq are: (1)
program verification, (2) modelling and proof of
properties programming languages semantics, (3)
formalised mathematical theories. This lecture is a
short introduction to Coq for program verification.

[8]

Frédéric Loulergue.
Writing and Reasoning about Parallel Programs  Part 1.
NII  National Institute of Informatics, Tokyo, November 2013.
Bulk Synchronous Parallel ML (BSML) is a structured
parallel functional programming language based on
the Bulk Synchronous Parallelism (BSP) which offers
a high degree of abstraction like PRAM models but
yet a realistic cost model based on a structured
parallelism. BSML extends a functional programming
language of the MLfamily with a polymorphic data
structure and a very small set of primitives. In
this lecture we describe a library for reasoning
about BSML programs using the Coq interactive
theorem prover.

[9]

Frédéric Loulergue.
Writing and Reasoning about Parallel Programs  Part 2.
NII  National Institute of Informatics, Tokyo, November 2013.
Bulk Synchronous Parallel ML (BSML) is a structured
parallel functional programming language based on
the Bulk Synchronous Parallelism (BSP) which offers
a high degree of abstraction like PRAM models but
yet a realistic cost model based on a structured
parallelism. BSML extends a functional programming
language of the MLfamily with a polymorphic data
structure and a very small set of primitives. In
this lecture we describe a framework for extracting
actual parallel programs from proofs of BSML
programs in Coq. This framework is illustrated
through several example applications.

[10]

Frédéric Loulergue.
Constructive Algorithms in Coq.
NII  National Institute of Informatics, Tokyo, November 2013.
Program calculation, being a programming technique
that derives programs from specification by means of
formula manipulation, is a challenging activity. It
requires human insights and creativity, and needs
systems to help human to focus on clever parts of
the derivation by automating tedious ones and
verifying correctness of transformations. Different
from many existing systems, we show in this lecture
that Coq provides a cheap way to implement a
powerful system to support program calculation. We
design and implement a set of tactics for the Coq
proof assistant to help the user to derive programs
by program calculation and to write proofs in
calculational form. The use of these tactics is
demonstrated through program calculations in Coq
mainly based on the theory of lists.

[11]

Frédéric Loulergue.
Bulk Synchronous Parallel Homomorphisms.
NII  National Institute of Informatics, Tokyo, November 2013.
Algorithmic skeletons in conjunction with list
homomorphisms play an important role in formal
development of parallel algorithms. We have designed
a notion of homomorphism dedicated to bulk
synchronous parallelism. In this lecture, the BH
theory is presented, as well as its formalisation in
Coq. An implementation of a BH skeleton is
implemented in BSML and verified in Coq. In practice
a BH skeleton for the C++ Orléans Skeleton
Library C++ could also be used. The framework is
illustrated through various applications including
sparse matrix vector multiplication and the all
nearest smaller values problem.

[12]

Frédéric Loulergue.
GenerateTestAggregate for Cloud Computing in Coq.
NII  National Institute of Informatics, Tokyo, November 2013.
Emoto, Fisher and Hu have integrated the
generateandtest programing(GTA) paradigm and
semirings for aggregation of results, to propose a
novel parallel programming framework for MapReduce,
and to demonstrate how the framework can be
efficiently implemented as a library to support
parallel programming on Hadoop. In this lecture, we
first present the modelling the algebraic structures
used in the GTA framework and prove the lemma used
to calculate programs with the Coq proof
assistant. Second we show how to provide some
automation inside Coq to ease the definition of
applications, with the formal verification that they
actually fulfill the requirements of the GTA
framework. Third we provide the way to extract
automatically actual parallel functional programs
and MapReduce programs from these Coq developments.

[13]

Frédéric Loulergue.
Towards a Verified GTA Library.
Second Workshop on Frameworks for the Development of Correct
(parallel) Programs (FraDeCoPP2), November 2012.
The goal of this work is threefold. First we aim at
modelling the algebraic structures used in the GTA
framework and prove the lemma used to calculate
programs from their GTA from inside the Coq proof
assistant. Second we aim at providing some
automation inside Coq to ease the definition of
applications, with the formal verification that they
actually fullfill the requirements of the GTA
framework. Third we aim at providing the way to
extract automatically actual parallel functional
programs and MapReduce programs from these Coq
developments. This talk will present the current
status of this workinprogress. This is a joint
work with Kento Emoto, Julien Tesson, and Frederic
Dabrowski.

[14]

Frédéric Loulergue.
Une bibliothèque vérifiée de squelettes algorithmiques
sur tableaux équitablement répartis.
Journés Nationales du GDR Génie de la Programmation et du
Logiciel, Juin 2012.
Les squelettes algorithmiques sont des motifs
abstraits qui apparaissent souvent dans la
programmation parallèle. Issus de la
programmation fonctionnelle, ce sont des fonctions
d'ordre supérieur implantées en
parallèle et fournies au programmeur. De
manière classique, ils offrent une interface
simple à l'utilisateur sans les détails du
parallélisme. Nous proposons dans cet article
une série de squelettes algorithmiques
implantés avec le langage fonctionnel
parallèle BSML (Bulk Synchrounous Parallel
ML). Ces squelettes permettent de manipuler une
structure de données répartie, des tableaux
équitablement répartis. La
spécification de ces squelettes est données
par un type de module de l'assistant de preuve
Coq. Une implantation de ce type de module, et donc
correcte par rapport à la spécification des
squelettes, est réalisée à l'aide de la
partie fonctionnelle de Coq augmentée d'une
axiomatisation des primitives de BSML. L'extraction
de cette implantation en Coq donne du code BSML
prouvé correct. Ces squelettes parallèles
peuvent alors être compilés par le
compilateur BSML et enfin exécutés sur une
grande variété de machines
parallèles. Nous prouvons également
correctes des applications en utilisant la
spécification des squelettes et extrayons leurs
implantations en BSML.

[15]

Frédéric Loulergue.
Vers des environnements infonuagiques vérifiés.
Première Journée Informatique en Nuage à Orléans
(JINO1), Juillet 2012.
As the usage of the cloud becomes pervasive in our
lives, it is needed to ensure the reliability,
safety and security of cloud environments. In this
paper we study a usual software stack of a cloud
environment from the perspective of formal
verification. This software stack ranges from
applications to the hypervisor. We argue that most
of the layers could be practically formally
verified, even if the work to verify all levels is
huge.

[16]

Frédéric Loulergue.
Modélisation en théorie des types. Application au
développement de programmes parallèles corrects.
Journée CaSciModOT, Orléans, Décembre 2011.

[17]

Frédéric Loulergue.
Functional Parallel Programming with Revised Bulk Synchronous
Parallel ML.
PDAA Workshop, Hiroshima, November 2010.
Bulk Synchronous Parallel ML or BSML is a highlevel
language for programming parallel algorithms. Built
upon the Objective Caml language, it provides a safe
setting for implementing Bulk Synchronous Parallel
(BSP) algorithms. It avoids concurrency related
problems: deadlocks and non determinism. BSML is
based on a very small core of parallel primitives
that extended functional sequential programming to
functional BSP programming with a parallel data
structure and operations to manipulate it. However,
in practice the primitives for writing the parallel
noncommunicating parts of the program are not so
easy to use. Thus we designed a new syntax that
makes programs easier to write and read. Revised
BSML is presented and its expressiveness and
performance are illustrated through an application
example.

[18]

Frédéric Loulergue.
Systematic Development of Correct Bulk Synchronous Parallel
Programs.
PDCAT Conference, Wuhan, China, Deember 2010.
With the current generalisation of parallel
architectures arises the concern of applying formal
methods to parallelism. The complexity of parallel,
compared to sequential, programs makes them more
errorprone and difficult to verify. Bulk
Synchronous Parallelism (BSP) is a model of
computation which offers a high degree of
abstraction like PRAM models but yet a realistic
cost model based on a structured parallelism. We
propose a framework for refining a sequential
specification toward a functional BSP program, the
whole process being done with the help of the Coq
proof assistant. To do so we define BH, a new
homomorphic skeleton, which captures the essence of
BSP computation in an algorithmic level, and also
serves as a bridge in mapping from high level
specification to low level BSP parallel programs.

[19]

Frédéric Loulergue.
Revised Bulk Synchronous Parallel ML.
TOPS Seminar, Tokyo, November 2010.
Bulk Synchronous Parallel ML or BSML is a highlevel
language for programming parallel algorithms. Built
upon the Objective Caml language, it provides a safe
setting for implementing Bulk Synchronous Parallel
(BSP) algorithms. It avoids concurrency related
problems: deadlocks and nondeterminism. BSML is
based on a very small core of parallel primitives
that extended functional sequential programming to
functional BSP programming with a parallel data
structure and operations to manipulate it. However,
in practice the primitives for writing the parallel
noncommunicating parts of the program are not so
easy to use. Thus we designed a new syntax that
makes programs easier to write and read: Revised
BSML. In this talk, Revised BSML is presented and
its expressiveness and performance are illustrated
through an application examples. We also designed a
formal semantics and proved its properties using the
Coq proof assistant.

[20]

Frédéric Loulergue.
Correct Functional Parallel Programs in a Safe Execution
Environment.
Talk at the National Institute of Informatics (Japan), June 2008.
The design of parallel programs and parallel
programming languages is a tradeoff. On one hand
the programs should be efficient. But the efficiency
should not come at the price of non portability and
unpredictability of performances. The portability of
code is needed to allow code reuse on a wide variety
of architectures and to allow the existence of
legacy code. The predictability of performances is
needed to guarantee that the efficiency will always
be achieved on any architecture. Another very
important characteristic of parallel programs is the
complexity of their semantics. Deadlocks and
indeterminism often hinder the practical use of
parallelism by a large number of users. To avoid
these undesirable properties, a tradeoff has to be
made between the expressiveness of the language and
its structure which could decrease the
expressiveness. Bulk Synchronous Parallelism (BSP)
is a model of computation which offers a high degree
of abstraction like PRAM models but yet a realistic
cost model based on a structured parallelism:
deadlocks are avoided and indeterminism is limited
to very specific cases in the BSPlib. BSP programs
are portable across many parallel architectures.
The Bulk Synchronous Parallel ML language (BSML) is
based on a confluent extension of the
λcalculus. Thus it is deadlock free and
deterministic. Being a highlevel language, programs
are easier to write, to reuse and to compose. It is
even possible to certify the correctness of BSML
programs with the help of the Coq proof assistant.
However the correctness of these programs is ensured
with respect to the programming model of BSML. BSML,
as dataparallel languages, has a programming model
that is a formal or informal semantics of the
primitive operations it provides as viewed by the
programmer. However the realization of the
primitives on parallel architecture, or execution
model, is quite different. Thus to guarantee the
correct execution of these programs in parallel, it
is necessary to verify the equivalence of the
programming model and the execution model. This
talk presents the Bulk Synchronous Parallel
language, several formal semantics of BSML from the
programming model to the semantics of an abstract
parallel machine, and proof of correctness of BSML
programs with the Coq proof assistant.

[21]

F Loulergue.
Functional Bulk Synchronous Parallelism in ML.
The University of Tokyo, July 2007.

[22]

Radia Benheddi and Frédéric Loulergue.
Minimally Synchronous Parallel ML with Parallel Composition.
In Seventh Symposium on Trends in Functional Programming (TFP
2006), pages 387402. Nottingham, UK, 2006.
Parallel divideandconquer parallel algorithms are
very common in the litterature on parallel
algorithms. In order to ease their implementation in
Minimally Synchronous Parallel ML (MSPML), we add a
new primitive to this asynchronous parallel
functional language. MSPML is based on a parallel
data structure called parallel vectors and
operations to manipulate them. This paper presents
semantics and an implementation of MSPML with
parallel composition.

[23]

Louis Gesbert and Frédéric Loulergue.
Semantics of Bulk Synchronous Parallel ML with Exceptions.
In Zoltén Horvéth, editor, Draft proceedings of the 18th
International Symposium on Implementation and Application of Functional
Languages (IFL'06), 2006.

[24]

Frédéric Loulergue.
The PROPAC Project  Certified Parallel Programming.
The University of Tokyo, March 2006.
The goal of this project is to produce a
programming environment in which certified
parallebgl programs can be written, proved and
safely executed. More technically, several subgoals
are studied to achieve the overall goal of the
project: proofs, using the Coq proof assistant, of
programs written with the functional parallel
languages we developped; certification of the
compilers for these languages, from source code to
bytecode for parallel virtual machines and the
certification of these virtual machines; proofs of
imperative bulk synchronous parallel programs first
written with a small dedicated language and then
with an extension of C or Java; automatic
performance prediction of the programs written with
these languages.

[25]

Frédéric Loulergue and Frédéric Gava.
Calcul parallèle: bicephale.univparis12.fr et Bulk Synchronous
Parallel ML.
Séminaire du LACL, February 2005.
Certains problèmes comme la simulation de
phénomènes physiques ou chimiques ou la gestion
de bases de données de grande taille nécessitent
des performances que seules les machines
parallèles peuvent offrir. En plus de ces domaines
d'application, on utilise les machines parallèles
en recherche opérationnelle, en extraction de
connaissances, en réalité virtuelle et en
vérification. Les machines parallèles servent
ainsi à un sousensemble important de toutes les
applications de l'informatique. Leur programmation
demeure néanmoins plus difficile que celle des
machines séquentielles et la conception de
langages adaptés est un sujet de recherche actif.
Cet exposé présentera :  d'une part la machine
parallèle du Laboratoire d'Algorithmique,
Complexité et Logique (LACL), appelée
"bicephale.univparis12.fr" qui est de type grappe
de PC et les bases de l'utilisation pratique de
celleci ;  d'autre part le langage de
programmation de hautniveau Bulk Synchronous
Parallel ML (BSML) développé au LACL et quelques
exemples implémentés avec BSML et qui ont fait
l'objet d'expérimentations sur la grappe de PC.

[26]

Frédéric Loulergue.
Programmation de métaordinateur: de MSPML à DMML.
Exposé au Groupe de travail "Programmation", Laboratoire Preuves,
Programmes et Systèmes, Paris, March 2005.
De nombreux problèmes comme la simulation de
phénomènes physiques ou la manipulation de bases
de données de grandes tailles nécessitent des
performances que seuls les ordinateurs massivement
parallèles, voire les métaordinateurs, peuvent
fournir.A partir d'un modèle de parallélisme
structuré possèdant un modèle de prévision
de performances (Bulk Synchronous Parallel ou BSP)
nous avons conéu un ensemble d'opérations
universelles qui permet la programmation de
n'importe quel algorithme de ce modèle. Une
approche opérationnelle nous a amené à
définir des extensions du lambdacalcul par des
primitives BSP. Une bibliothèque basée sur ces
calculs, la bibliothèque BSML a été
développée pour le langage fonctionnel Objective
Caml. La programmation BSML s'appuie sur une
structure de données parallèle polymorphe. Les
programmes sont des fonctions Objective Caml
usuelles mais manipulent cette structure de
données parallèle à l'aide d'un petit ensemble
d'opérations dédiées. BSML suit le modèle
BSP et est donc garanti sans interblocage. La
confluence des calculs garantit quant à elle le
déterminisme de BSML. Un nouvel axe de recherche,
présenté dans cet exposé, est d'avoir un
langage pour le metacomputing, c'estàdire pour
la programmation de grappes de machines parallèles
(qui sont ellesmémes souvent des grappes de PC)
que nous appelons métaordinateurs. Pour obtenir
ce langage, appelée Departmental Metacomputing ML,
nous avons combiné le langage BSML, pour la
programmation de chaque noeud parallèle, et notre
nouveau langage MSPML (Minimally Synchronous
Parallel ML) pour la coordination de
l'ensemble. MSPML diffère de BSML par l'absence de
barrières de synchronisation globale.

[27]

Frédéric Loulergue.
Functional programming of parallel and meta computing.
Talk at the Komet Workshop, November 2004.

[28]

Frédéric Loulergue.
Compositions parallèles et BSML.
Seminaire du LIFO, Orléans, march 2003.

[29]

Frédéric Loulergue.
Bulk Synchronous Parallel ML: Safety and Expressivity.
Talk at The Tokyo Programming Seminar, November 2003.

[30]

Frédéric Loulergue.
Distributed Evaluation of Functional BSP Programs.
Talk at the International Workshop on High Level Parallel
Programming, Orléans, march 2001.

[31]

Frédéric Loulergue.
Conception de langages fonctionnels pour la programation massivement
parallèle : de BSlambda à la BSMLlib.
Séminaire du LACL, Crétail, october 2000.

[32]

Frédéric Loulergue.
Conception de langages fonctionnels pour la programation massivement
parallèle : de BSlambda à la BSMLlib.
Séminaire du groupe de travail "Programmation", PPS, Paris, may
2000.

[33]

Frédéric Loulergue.
Conception de langages fonctionnels pour la programation massivement
parallèle.
Exposé aux Journées GDR Parallélisme Répartition et
Objets, Lille, december 1999.

[34]

Frédéric Loulergue.
Un langages fonctionnel pur d'ordre supérieur explicitement
parallèle.
Exposé aux Journées GDR, Rennes, 1997.