Results 321 to 330 of about 2,725,619
Some of the following articles may not be open access.

Optimizing Hardware Accelerated General Matrix-Matrix Multiplication for CNNs on FPGAs

IEEE Transactions on Circuits and Systems II: Express Briefs, 2020
Convolution is inarguably the most complex operation utilized in Convolutional Neural Networks (convnets). Owing to the billions of independent multiply-adds involved, convolution is being massively parallelized by the simultaneous utilization of many ...
Afzal Ahmad, M. A. Pasha
semanticscholar   +1 more source
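The entry above concerns lowering convolution to GEMM on FPGAs. The standard lowering is the "im2col" transform: each receptive field becomes a row of a matrix, so the convolution reduces to one matrix-matrix product. A minimal pure-Python sketch (single channel, valid padding, stride 1; the function name and layout are illustrative, not taken from the paper):

```python
def im2col_matmul_conv(image, kernels):
    """Lower a 2-D convolution (valid, stride 1) to a matrix-matrix product."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernels[0]), len(kernels[0][0])
    oh, ow = H - kh + 1, W - kw + 1
    # im2col matrix: one row per output position, one column per kernel weight
    cols = [[image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            for i in range(oh) for j in range(ow)]
    # Flatten each kernel into a row of the weight matrix
    weights = [[k[di][dj] for di in range(kh) for dj in range(kw)] for k in kernels]
    # The convolution is now a plain GEMM: (oh*ow, kh*kw) x (kh*kw, num_kernels)
    return [[sum(c[t] * w[t] for t in range(kh * kw)) for w in weights]
            for c in cols]  # result[p][f]: filter f's response at position p
```

This duplication of input pixels across rows is exactly what makes the GEMM formulation attractive for highly parallel hardware: the irregular sliding-window access pattern becomes a regular dense product.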

Performance-Aware Model for Sparse Matrix-Matrix Multiplication on the Sunway TaihuLight Supercomputer

IEEE Transactions on Parallel and Distributed Systems, 2019
General sparse matrix-sparse matrix multiplication (SpGEMM) is one of the fundamental linear algebra operations in a wide variety of scientific applications.
Yuedan Chen   +5 more
semanticscholar   +1 more source

Performance Evaluation of Accurate Matrix-Matrix Multiplication on GPU Using Sparse Matrix Multiplications

2020 Eighth International Symposium on Computing and Networking Workshops (CANDARW), 2020
Basic Linear Algebra Subprograms (BLAS) is a frequently used numerical library for linear algebra computations. However, it places little emphasis on computational accuracy, especially with respect to accuracy assurance of the results. Consequently, a high-precision matrix–matrix multiplication algorithm that assures the precision by double ...
Fumiya Ishiguro   +3 more
openaire   +1 more source
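To illustrate the accuracy problem the entry above addresses: a naively accumulated dot product can lose low-order terms to rounding, while an exactly summed one does not. A stand-in sketch using Python's `math.fsum` (this is only an illustration of accuracy-assured accumulation, not the paper's sparse-multiplication-based algorithm):

```python
import math

def accurate_dot(xs, ys):
    """Dot product with exactly rounded summation of the products (math.fsum)."""
    return math.fsum(x * y for x, y in zip(xs, ys))

# Cancellation example: naive left-to-right accumulation absorbs one of the 1.0 terms.
xs = [1e16, 1.0, -1e16, 1.0]
ys = [1.0, 1.0, 1.0, 1.0]
naive = 0.0
for x, y in zip(xs, ys):
    naive += x * y          # 1e16 + 1.0 rounds back to 1e16
# naive == 1.0, accurate_dot(xs, ys) == 2.0
```

Note that `fsum` makes the summation exact but the individual products are still rounded once; high-precision GEMM schemes also handle that product error.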

Avoiding matrix multiplication

1991
The fastest known algorithms for many problems on graphs use matrix multiplication as a sub-routine. Some examples of problems solved using matrix multiplication are recognition of transitive graphs, computing the transitive closure of a directed acyclic graph, and finding the neighborhood containment matrix of a graph.
Tze-Heng Ma, Jeremy P. Spinrad
openaire   +1 more source
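One of the applications the entry above names, transitive closure, reduces to repeated Boolean matrix multiplication: squaring the reachability matrix doubles the path length covered, so O(log n) multiplications suffice. A small sketch under that reduction (dense 0/1 adjacency lists; illustrative only):

```python
def transitive_closure(adj):
    """Transitive closure via repeated Boolean matrix 'squaring'."""
    n = len(adj)
    reach = [row[:] for row in adj]  # paths of length exactly 1
    steps = 1
    while steps < n:
        # reach := reach OR reach*reach (Boolean product) doubles covered path length
        reach = [[bool(reach[i][j]) or
                  any(reach[i][k] and reach[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        steps *= 2
    return reach
```

Each Boolean product here is cubic; the point of the papers in this area is that a fast matrix multiplication subroutine drops the cost of exactly this step.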

Fast matrix multiplication

Proceedings of the third annual ACM symposium on Theory of computing - STOC '71, 1971
This paper deals with three aspects of algebraic complexity. The first section is concerned with lower bounds on the number of operations required to compute several functions. Several theorems are presented and their proofs sketched. The second section deals with relationships among the complexities of several sets of functions.
openaire   +1 more source

Fast matrix multiplication

Journal of Computational Chemistry, 1987
Abstract: Several implementations of matrix multiplication (MMUL) in Fortran and VAX assembly language are discussed. On a VAX-11/780 computer, the most efficient MMUL is achieved through vector-scalar-multiply-and-add (VSMA) operations, rather than by means of dot products.
Carlos F. Bunge, Gerardo Cisneros
openaire   +1 more source

Three-Dimensional NAND Flash for Vector–Matrix Multiplication

IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2019
Three-Dimensional NAND flash technology is one of the most competitive integrated solutions for high-volume massive data storage. So far, there are few investigations on how to use 3-D NAND flash for in-memory computing in the neural network ...
Panni Wang   +6 more
semanticscholar   +1 more source

SIMD Matrix Multiplication

1990
Suppose that two n×n matrices A[0:n-1, 0:n-1] and B[0:n-1, 0:n-1] are to be multiplied on an SIMD hypercube to get the product matrix C where $$C[i,j] = \sum_{k=0}^{n-1} A[i,k] \cdot B[k,j], \quad 0 \le i,j < n$$
Sanjay Ranka, Sartaj Sahni
openaire   +1 more source
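The defining equation in the snippet above translates directly into code. A sequential reference implementation of that formula (plain triple loop over Python lists; the SIMD-hypercube mapping in the chapter itself distributes these same sums across processors):

```python
def matmul(A, B):
    """Reference matrix product: C[i,j] = sum over k of A[i,k] * B[k,j]."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]
```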

SPARSE MATRIX–VECTOR MULTIPLICATION

2004
Abstract This chapter introduces irregular algorithms and presents the example of parallel sparse matrix-vector multiplication (SpMV), which is the central operation in iterative linear system solvers. The irregular sparsity pattern of the matrix does not change during the multiplication, which may be repeated many times.
openaire   +1 more source
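The irregularity the entry above refers to comes from the sparse storage format: the matrix is kept as compressed rows of nonzeros, so the inner loop follows index arrays rather than a regular stride. A minimal SpMV sketch in the common CSR (compressed sparse row) layout (illustrative; the chapter's parallel distribution is not shown):

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """y = A @ x for A stored in CSR form.
    values:  nonzero entries, row by row
    col_idx: column index of each nonzero
    row_ptr: row r's nonzeros occupy values[row_ptr[r]:row_ptr[r+1]]
    """
    return [sum(values[t] * x[col_idx[t]]
                for t in range(row_ptr[r], row_ptr[r + 1]))
            for r in range(len(row_ptr) - 1)]
```

Because `row_ptr` and `col_idx` never change during iterative solves, repeated multiplications reuse the same sparsity pattern, as the abstract notes.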

Parallel matrix multiplication

2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2018
Utilizing all CPU cores available for numerical computations is a topic of considerable interest in HPC. This paper analyzes and compares four different parallel algorithms for matrix multiplication without block partitioning using OpenMP. The comparison of the algorithms is based on the achieved speed, memory bandwidth and efficient use of the cache ...
Nikola Tomikj, Marjan Gusev
openaire   +1 more source
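A simple member of the family the entry above studies is row-partitioned parallelism: each worker computes a disjoint set of output rows, with no block partitioning of the operands. A sketch using Python's `concurrent.futures` (illustrative only; the paper's four algorithms use OpenMP in a compiled language, where the threads actually run concurrently):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_matmul(A, B, workers=4):
    """Row-partitioned parallel matrix multiply: one task per output row."""
    n = len(A)
    Bt = [list(col) for col in zip(*B)]  # transpose B for row-wise access
    def row(i):
        return [sum(a * b for a, b in zip(A[i], Bt[j])) for j in range(len(Bt))]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(row, range(n)))
```

Transposing B first is one of the cache-use choices such comparisons measure: both operands are then traversed along rows, i.e. contiguously in memory.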
