Adrián Castelló

Cited by

	All	Since 2019
Citations	480	345
h-index	12	9
i10-index	12	7

Citations per year

Year	Citations
2015	19
2016	22
2017	32
2018	54
2019	40
2020	53
2021	62
2022	58
2023	112
2024	20

Public access

34 articles

7 articles

available

not available

Based on funding mandates

Co-authors

Enrique S. Quintana-Ortí: Universitat Politècnica de València, Spain. Verified email at disca.upv.es
Manuel F. Dolz: Universitat Jaume I. Verified email at icc.uji.es
Jose Duato: Universitat Politècnica de València. Verified email at disca.upv.es
Antonio J. Peña: Barcelona Supercomputing Center (BSC). Verified email at bsc.es
Pavan Balaji: Argonne National Laboratory. Verified email at anl.gov
Sangmin Seo: Klaytn Foundation. Verified email at klaytn.foundation
Pedro Alonso-Jordá: Universitat Politècnica de València. Verified email at upv.es
Francisco D. Igual: Universidad Complutense de Madrid. Verified email at ucm.es
Sergio Iserte: Senior Researcher @ BSC. Verified email at bsc.es
Sandra Catalán: Universitat Jaume I. Verified email at uji.es
Rafael Rodríguez-Sánchez: Dep. Sistemas Informáticos, Universidad de Castilla-La Mancha. Verified email at uclm.es
Cristian Ramírez: Universitat Politècnica de València. Verified email at posgrado.upv.es

Adrián Castelló

Postdoc Fellow @ Universitat Politècnica de València (UPV)

Verified email at disca.upv.es

Code Auto-generation, Programming Models, High Performance Computing, Lightweight threading, Deep Neural Networks


Title	Authors	Venue	Cited by	Year
Argobots: A lightweight low-level threading and tasking framework	S Seo, A Amer, P Balaji, C Bordage, G Bosilca, A Brooks, P Carns, ...	IEEE Transactions on Parallel and Distributed Systems 29 (3), 512-526, 2017	151	2017
SLURM support for remote GPU virtualization: Implementation and performance study	S Iserte, A Castelló, R Mayo, ES Quintana-Ortí, F Silla, J Duato, C Reaño, ...	2014 IEEE 26th International Symposium on Computer Architecture and High …, 2014	34	2014
High Performance and Portable Convolution Operators for Multicore Processors	P San Juan, A Castelló, MF Dolz, P Alonso-Jordá, ES Quintana-Ortí	SBAC-PAD 2020, 2020	25*	2020
Improving the User Experience of the rCUDA Remote GPU Virtualization Framework	C Reano, F Silla, A Castelló, AJ Pena, R Mayo, ES Quintana-Ortí, J Duato		24	2014
PyDTNN: a user-friendly and extensible framework for distributed deep learning	S Barrachina, A Castelló, M Catalán, MF Dolz, JI Mestre	The Journal of Supercomputing 77, 9971-9987, 2021	19	2021
A Review of Lightweight Thread Approaches for High Performance Computing	A Castelló, AJ Peña, S Seo, R Mayo, P Balaji, ES Quintana-Ortí	2016 IEEE International Conference on Cluster Computing (CLUSTER 2016), 471-480, 2016	19	2016
Analysis of model parallelism for distributed neural networks	A Castelló, MF Dolz, ES Quintana-Ortí, J Duato	Proceedings of the 26th European MPI Users' Group Meeting, 1-10, 2019	16	2019
Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks	A Castelló, MF Dolz, ES Quintana-Ortí, J Duato	2nd High Performance Machine Learning Workshop (HPML 2019), 534-541, 2019	14	2019
On the use of remote GPUs and low-power processors for the acceleration of scientific applications	A Castelló, J Duato, R Mayo, AJ Pena, ES Quintana-Ortí, V Roca, F Silla	The Fourth International Conference on Smart Grids, Green Communications and …, 2014	14	2014
GLTO: On the Adequacy of Lightweight Thread Approaches for OpenMP Implementations	A Castelló, S Seo, R Mayo, P Balaji, ES Quintana-Ortí, AJ Peña	International Conference on Parallel Processing (ICPP-2017), 60-69, 2017	13	2017
Enabling GPU Virtualization in Cloud Environments	S Iserte, FJ Clemente-Castelló, A Castelló, R Mayo, ES Quintana-Ortí	CLOSER 2016, 2016	13	2016
Reformulating the direct convolution for high-performance deep learning inference on ARM processors	S Barrachina, A Castelló, MF Dolz, TM Low, H Martínez, ES Quintana-Ortí, ...	Journal of Systems Architecture 135, 102806, 2023	12	2023
Anatomy of the BLIS family of algorithms for matrix multiplication	A Castelló, ES Quintana-Ortí, FD Igual	2022 30th Euromicro International Conference on Parallel, Distributed and …, 2022	9	2022
Accelerating distributed deep neural network training with pipelined MPI allreduce	A Castelló, ES Quintana-Ortí, J Duato	Cluster Computing 24 (4), 3797-3813, 2021	9	2021
A flexible research-oriented framework for distributed training of deep neural networks	S Barrachina, A Castelló, M Catalán, MF Dolz, JI Mestre	2021 IEEE International Parallel and Distributed Processing Symposium …, 2021	9	2021
GLT: A unified API for lightweight thread libraries	A Castelló, S Seo, R Mayo, P Balaji, ES Quintana-Ortí, AJ Peña	Euro-Par 2017: Parallel Processing: 23rd International Conference on …, 2017	8	2017
High performance and energy efficient inference for deep learning on multicore ARM processors using general optimization techniques and BLIS	A Castelló, S Barrachina, MF Dolz, ES Quintana-Ortí, P San Juan, ...	Journal of Systems Architecture 125, 102459, 2022	7*	2022
Programming parallel dense matrix factorizations with look-ahead and OpenMP	S Catalán, A Castelló, FD Igual, R Rodríguez-Sánchez, ES Quintana-Ortí	Cluster Computing 23, 359-375, 2020	7	2020
On the adequacy of lightweight thread approaches for high-level parallel programming models	A Castelló, R Mayo, K Sala, V Beltran, P Balaji, AJ Peña	Future Generation Computer Systems 84, 22-31, 2018	7	2018
Exploiting task-parallelism on GPU clusters via OmpSs and rCUDA virtualization	A Castelló, R Mayo, J Planas, ES Quintana-Ortí	The 1st IEEE International Workshop on Reengineering for Parallelism in …, 2015	7	2015

Articles 1–20
