Kernel weaver: Automatically fusing database primitives for efficient GPU computation H Wu, G Diamos, S Cadambi, S Yalamanchili 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 107-118, 2012 | 129 | 2012 |
Thread tailor: dynamically weaving threads together for efficient, adaptive parallel applications J Lee, H Wu, M Ravichandran, N Clark Proceedings of the 37th annual international symposium on Computer …, 2010 | 125 | 2010 |
SIMD re-convergence at thread frontiers G Diamos, B Ashbaugh, S Maiyuran, A Kerr, H Wu, S Yalamanchili Proceedings of the 44th Annual IEEE/ACM International Symposium on …, 2011 | 111 | 2011 |
Red Fox: An execution environment for relational query processing on GPUs H Wu, G Diamos, T Sheard, M Aref, S Baxter, M Garland, S Yalamanchili Proceedings of Annual IEEE/ACM International Symposium on Code Generation …, 2014 | 102 | 2014 |
Optimizing data warehousing applications for GPUs using kernel fusion/fission H Wu, G Diamos, J Wang, S Cadambi, S Yalamanchili, S Chakradhar 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 86 | 2012 |
Efficient relational algebra algorithms and data structures for GPU G Diamos, H Wu, A Lele, J Wang, S Yalamanchili CERCS, Georgia Institute of Technology, Tech. Rep. GIT-CERCS-12-01, 2012 | 45 | 2012 |
Characterization and transformation of unstructured control flow in bulk synchronous GPU applications H Wu, G Diamos, J Wang, S Li, S Yalamanchili International Journal of High Performance Computing Applications 26 (2), 170-185, 2012 | 44 | 2012 |
Relational algorithms for multi-bulk-synchronous processors G Diamos, H Wu, J Wang, A Lele, S Yalamanchili ACM SIGPLAN Notices 48 (8), 301-302, 2013 | 30 | 2013 |
Optimizing data warehousing applications for GPUs using dynamic stream scheduling and dispatch of fused and split kernels H Wu, S Cadambi, ST Chakradhar US Patent 8,990,827, 2015 | 26 | 2015 |
Multipredicate join algorithms for accelerating relational graph processing on GPUs H Wu, D Zinn, M Aref, S Yalamanchili International Workshop on Accelerating Data Management Systems Using Modern …, 2014 | 24 | 2014 |
Accelerating simulation of agent-based models on heterogeneous architectures J Wang, N Rubin, H Wu, S Yalamanchili Proceedings of the 6th Workshop on General Purpose Processor Using Graphics …, 2013 | 20 | 2013 |
Relational learning with GPUs: Accelerating rule coverage CA Martínez-Angeles, H Wu, I Dutra, VS Costa, J Buenabad-Chávez International Journal of Parallel Programming 44 (3), 663-685, 2016 | 13 | 2016 |
Satisfying data-intensive queries using GPU clusters J Young, H Wu, S Yalamanchili 2012 SC Companion: High Performance Computing, Networking Storage and …, 2012 | 10 | 2012 |
General-purpose join algorithms for large graph triangle listing on heterogeneous systems D Zinn, H Wu, J Wang, M Aref, S Yalamanchili Proceedings of the 9th Annual Workshop on General Purpose Processing Using …, 2016 | 8 | 2016 |
CUTLASS V Thakkar, P Ramani, C Cecka, A Shivam, H Lu, E Yan, J Kosaian, ... GitHub, 2023 | 7 | 2023 |
An efficient block motion estimation method on CELL BE X He, Y Zhang, X He, H Wu, Y Zou 2008 International Conference on Audio, Language and Image Processing, 1672-1676, 2008 | 4 | 2008 |
Acceleration and execution of relational queries using general purpose graphics processing unit (GPGPU) H Wu Georgia Institute of Technology, 2015 | 1 | 2015 |
Non-rectangular matrix computations and data pattern processing using tensor cores A Shivam, A Kerr, H Wu, M Gupta, N Shustrov, Q Yang, A Kaatz, AA Atluri US Patent App. 17/700,239, 2023 | | 2023 |
Relational Learning with GPUs: Accelerating Rule Coverage CAM Angeles, H Wu, I Dutra, VS Costa, JB Chavez | | 2016 |
Characterization and transformation of unstructured control flow in GPU applications H Wu, G Diamos, S Li, S Yalamanchili 1st International Workshop on Characterizing Applications for Heterogeneous …, 2011 | | 2011 |