Adrián Castelló
Title
Cited by
Year
Argobots: A lightweight low-level threading and tasking framework
S Seo, A Amer, P Balaji, C Bordage, G Bosilca, A Brooks, P Carns, ...
IEEE Transactions on Parallel and Distributed Systems 29 (3), 512-526, 2017
Cited by: 151; Year: 2017
SLURM support for remote GPU virtualization: Implementation and performance study
S Iserte, A Castelló, R Mayo, ES Quintana-Ortí, F Silla, J Duato, C Reaño, ...
2014 IEEE 26th International Symposium on Computer Architecture and High …, 2014
Cited by: 34; Year: 2014
High Performance and Portable Convolution Operators for Multicore Processors
P San Juan, A Castelló, MF Dolz, P Alonso-Jordá, ES Quintana-Ortí
SBAC-PAD 2020, 2020
Cited by: 25*; Year: 2020
Improving the User Experience of the rCUDA Remote GPU Virtualization Framework
C Reaño, F Silla, A Castelló, AJ Peña, R Mayo, ES Quintana-Ortí, J Duato
Cited by: 24; Year: 2014
PyDTNN: a user-friendly and extensible framework for distributed deep learning
S Barrachina, A Castelló, M Catalán, MF Dolz, JI Mestre
The Journal of Supercomputing 77, 9971-9987, 2021
Cited by: 19; Year: 2021
A Review of Lightweight Thread Approaches for High Performance Computing
A Castelló, AJ Peña, S Seo, R Mayo, P Balaji, ES Quintana-Ortí
2016 IEEE International Conference on Cluster Computing (CLUSTER 2016), 471-480, 2016
Cited by: 19; Year: 2016
Analysis of model parallelism for distributed neural networks
A Castelló, MF Dolz, ES Quintana-Ortí, J Duato
Proceedings of the 26th European MPI Users' Group Meeting, 1-10, 2019
Cited by: 16; Year: 2019
Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks
A Castelló, MF Dolz, ES Quintana-Ortí, J Duato
2nd High Performance Machine Learning Workshop (HPML 2019), 534-541, 2019
Cited by: 14; Year: 2019
On the use of remote GPUs and low-power processors for the acceleration of scientific applications
A Castelló, J Duato, R Mayo, AJ Peña, ES Quintana-Ortí, V Roca, F Silla
The Fourth International Conference on Smart Grids, Green Communications and …, 2014
Cited by: 14; Year: 2014
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations
A Castelló, S Seo, R Mayo, P Balaji, ES Quintana-Ortí, AJ Peña
International Conference on Parallel Processing (ICPP-2017), 60-69, 2017
Cited by: 13; Year: 2017
Enabling GPU Virtualization in Cloud Environments
S Iserte, FJ Clemente-Castelló, A Castelló, R Mayo, ES Quintana-Ortí
CLOSER 2016, 2016
Cited by: 13; Year: 2016
Reformulating the direct convolution for high-performance deep learning inference on ARM processors
S Barrachina, A Castelló, MF Dolz, TM Low, H Martínez, ES Quintana-Ortí, ...
Journal of Systems Architecture 135, 102806, 2023
Cited by: 12; Year: 2023
Anatomy of the BLIS family of algorithms for matrix multiplication
A Castelló, ES Quintana-Ortí, FD Igual
2022 30th Euromicro International Conference on Parallel, Distributed and …, 2022
Cited by: 9; Year: 2022
Accelerating distributed deep neural network training with pipelined MPI allreduce
A Castelló, ES Quintana-Ortí, J Duato
Cluster Computing 24 (4), 3797-3813, 2021
Cited by: 9; Year: 2021
A flexible research-oriented framework for distributed training of deep neural networks
S Barrachina, A Castelló, M Catalán, MF Dolz, JI Mestre
2021 IEEE International Parallel and Distributed Processing Symposium …, 2021
Cited by: 9; Year: 2021
GLT: A unified API for lightweight thread libraries
A Castelló, S Seo, R Mayo, P Balaji, ES Quintana-Ortí, AJ Peña
Euro-Par 2017: Parallel Processing: 23rd International Conference on …, 2017
Cited by: 8; Year: 2017
High performance and energy efficient inference for deep learning on multicore ARM processors using general optimization techniques and BLIS
A Castelló, S Barrachina, MF Dolz, ES Quintana-Ortí, P San Juan, ...
Journal of Systems Architecture 125, 102459, 2022
Cited by: 7*; Year: 2022
Programming parallel dense matrix factorizations with look-ahead and OpenMP
S Catalán, A Castelló, FD Igual, R Rodríguez-Sánchez, ES Quintana-Ortí
Cluster Computing 23, 359-375, 2020
Cited by: 7; Year: 2020
On the adequacy of lightweight thread approaches for high-level parallel programming models
A Castelló, R Mayo, K Sala, V Beltran, P Balaji, AJ Peña
Future Generation Computer Systems 84, 22-31, 2018
Cited by: 7; Year: 2018
Exploiting task-parallelism on GPU clusters via OmpSs and rCUDA virtualization
A Castelló, R Mayo, J Planas, ES Quintana-Ortí
The 1st IEEE International Workshop on Reengineering for Parallelism in …, 2015
Cited by: 7; Year: 2015