Conferences in DBLP
A European perspective on supercomputing.
The Roadrunner project and the importance of energy efficiency on the road to exascale computing.
Computing outside the box.
Implementation of a wide-angle lens distortion correction algorithm on the Cell Broadband Engine.
High-performance regular expression scanning on the Cell/B.E. processor.
Computer generation of fast Fourier transforms for the Cell Broadband Engine.
DBDB: optimizing DMA transfer for the Cell BE architecture.
Zero-content augmented caches.
Dynamic cache clustering for chip multiprocessors.
Less reused filter: improving L2 cache performance via filtering less reused lines.
Divide-and-conquer: a bubble replacement for low level caches.
OhHelp: a scalable domain-decomposing dynamic load balancing for particle-in-cell simulations.
Pattern-based sparse matrix representation for memory-efficient SMVM kernels.
Dynamic topology aware load balancing algorithms for molecular dynamics applications.
Fast memory snapshot for concurrent programming without synchronization.
QuakeTM: parallelizing a complex sequential application using transactional memory.
Refereeing conflicts in hardware transactional memory.
Parametric multi-level tiling of imperfectly nested loops.
Dynamic parallelization of single-threaded binary programs using speculative slicing.
Synchronization optimizations for efficient execution on multi-cores.
Chunking parallel loops in the presence of synchronization.
Efficient high performance collective communication for the Cell blade.
Practice of parallelizing network applications on multi-core architectures.
Towards 100 Gbit/s Ethernet: multicore-based parallel communication protocol design.
Virtualization polling engine (VPE): using dedicated CPU cores to accelerate I/O virtualization.
Fast and scalable list ranking on the GPU.
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems.
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs.
Creating artificial global history to improve branch prediction accuracy.
Exploring pattern-aware routing in generalized fat tree networks.
Understanding the interconnection network of SpiNNaker.
A graph based approach for MPI deadlock detection.
Maximizing MPI point-to-point communication performance on RDMA-enabled clusters with customized protocols.
MPI-aware compiler optimizations for improving communication-computation overlap.
Evaluating high performance communication: a power perspective.
FTL design exploration in reconfigurable high-performance SSD for server applications.
/scratch as a cache: rethinking HPC center scratch storage.
P-Code: a new RAID-6 code with optimal properties.
R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems.
Single-particle 3D reconstruction from cryo-electron microscopy images on GPU.
How GPUs can outperform ASICs for fast LDPC decoding.
A translation system for enabling data mining applications on GPUs.
Combining thread level speculation helper threads and runahead execution.
Limited early value communication to improve performance of transactional memory.
EpiFast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems.
Using many-core hardware to correlate radio astronomy signals.
A parallel Levenberg-Marquardt algorithm.
Adagio: making DVS practical for complex HPC applications.
A comprehensive power-performance model for NoCs with multi-flit channel buffers.
Rate-based QoS techniques for cache/memory in CMP platforms.
MPI collective communications on the Blue Gene/P supercomputer: algorithms and optimizations.
TransMetric: architecture independent workload characterization for transactional memory benchmarks.
Cancellation of loads that return zero using zero-value caches.
Auto-vectorization through code generation for stream processing applications.
Subdomain communication to increase scalability in large-scale scientific applications.
Access map pattern matching for data cache prefetch.
Prediction-based power estimation and scheduling for CMPs.
Design of a novel SIMD architecture by fusing operations and registers.
Thrifty interconnection network for HPC systems.
Performance modeling for DFT algorithms in FFTW.
PARSEC: hardware profiling of emerging workloads for CMP design.
Approximate kernel matrix computation on GPUs for large scale learning applications.
Dynamic task set partitioning based on balancing memory requirements to reduce power consumption.
High-performance CUDA kernel execution on FPGAs.
Load balancing using work-stealing for pipeline parallelism in emerging applications.
Prefetch optimizations on large-scale applications via parameter value prediction.
Designing multi-socket systems using silicon photonics.
An infrastructure for scalable and portable parallel programs for computational chemistry.