1.
Malita, Mihaela; Popescu, George Vladut; Stefan, Gheorghe M.
Heterogenous Computing for Markov Models in Big Data (Proceedings Article)
In: 2019 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2019), pp. 1500-1505, IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA, 2019, ISBN: 978-1-7281-5584-5, (6th Annual Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, DEC 05-07, 2019).
Tags: Big Data; Markov Models; parallel architecture; accelerators; heterogenous computing
@inproceedings{WOS:000569996300272,
title = {Heterogenous Computing for Markov Models in Big Data},
author = {Mihaela Malita and George Vladut Popescu and Gheorghe M. Stefan},
doi = {10.1109/CSCI49370.2019.00279},
isbn = {978-1-7281-5584-5},
year = {2019},
date = {2019-01-01},
booktitle = {2019 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2019)},
pages = {1500-1505},
publisher = {IEEE},
address = {345 E 47TH ST, NEW YORK, NY 10017 USA},
abstract = {Many Big Data problems, including those related to Markov Models, are solved using heterogeneous systems: a host plus a parallel programmable accelerator. Current solutions for the accelerator part (for example, a GPU used as a GPGPU) provide limited acceleration due to architectural constraints. The paper introduces a programmable parallel accelerator able to perform efficient vector and matrix operations, avoiding the limitations of current systems designed from ``off-the-shelf'' components. Our main result is an architecture whose actual performance is a much higher fraction of its peak performance than that of established accelerators. The performance improvements come from two features: a reduction network added at the output of a linear array of cells, and an appropriate use of a serial register distributed along the same linear array of cells. Thus, for an n-state Markov Model, instead of a solution with size in O(n^2) and acceleration in O(n^2/log n), we offer an accelerator with size in O(n) and acceleration in O(n).},
note = {6th Annual Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, DEC 05-07, 2019},
keywords = {Big Data; Markov Models; parallel architecture; accelerators; heterogenous computing},
pubstate = {published},
tppubtype = {inproceedings}
}
Many Big Data problems, including those related to Markov Models, are solved using heterogeneous systems: a host plus a parallel programmable accelerator. Current solutions for the accelerator part (for example, a GPU used as a GPGPU) provide limited acceleration due to architectural constraints. The paper introduces a programmable parallel accelerator able to perform efficient vector and matrix operations, avoiding the limitations of current systems designed from "off-the-shelf" components. Our main result is an architecture whose actual performance is a much higher fraction of its peak performance than that of established accelerators. The performance improvements come from two features: a reduction network added at the output of a linear array of cells, and an appropriate use of a serial register distributed along the same linear array of cells. Thus, for an n-state Markov Model, instead of a solution with size in O(n²) and acceleration in O(n²/log n), we offer an accelerator with size in O(n) and acceleration in O(n).
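
The state update such an accelerator targets is the vector-matrix product pi_{t+1} = pi_t · P, where P is the n × n transition matrix: n² multiply-accumulate operations per step, which the linear array of cells and the reduction network parallelize. Below is a minimal Python sketch of that arithmetic, assuming a row-stochastic P; the names (markov_step, P, pi) are illustrative, and the code models only the computation, not the authors' hardware design.

def markov_step(pi, P):
    """One Markov step: pi'[j] = sum_i pi[i] * P[i][j]."""
    n = len(pi)
    # Each output component is an n-term dot product; in the proposed
    # architecture, the per-cell partial products pi[i] * P[i][j] are
    # summed by the reduction network attached to the linear array.
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

# Example: a 2-state chain with row-stochastic transition matrix P.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [1.0, 0.0]              # start in state 0
for _ in range(3):
    pi = markov_step(pi, P)
print(pi)                    # distribution after three steps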