Spark Matrix Multiplication (Scala)

This project implements matrix multiplication using Apache Spark in Scala.
It demonstrates multiplication of both coordinate matrices (COO format) and block matrices, with two implementations of each:

  • Using cogroup (the default)
  • Using join (provided as a commented-out alternative in the code)
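Both approaches hinge on the same dataflow: pair each entry (i, j, m) of M with every entry (j, k, n) of N on the shared index j, multiply, and sum the products per output cell (i, k). As a hedged illustration of that idea (function and variable names are mine, not from the repo), here is the same computation with plain Scala collections in place of Spark's cogroup:

```scala
// Sketch of COO multiplication keyed on the shared dimension j.
// M entries are (i, j, value); N entries are (j, k, value).
def cooMultiply(
    m: Seq[(Int, Int, Double)],
    n: Seq[(Int, Int, Double)]): Map[(Int, Int), Double] = {
  val mByJ = m.groupBy(_._2)                    // key M entries by column index j
  val nByJ = n.groupBy(_._1)                    // key N entries by row index j
  val products = for {
    j          <- (mByJ.keySet intersect nByJ.keySet).toSeq
    (i, _, mv) <- mByJ(j)                       // every M entry in column j
    (_, k, nv) <- nByJ(j)                       // paired with every N entry in row j
  } yield ((i, k), mv * nv)
  // Sum the partial products for each output cell (i, k)
  products.groupBy(_._1).map { case (ik, ps) => ik -> ps.map(_._2).sum }
}
```

In the Spark version, the two `groupBy` calls correspond to keying the RDDs by j and cogrouping (or joining) them, and the final summation corresponds to a `reduceByKey(_ + _)`.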

📂 Project Structure

Spark-Matrix-Multiplication/
├── MatrixMultiplication.scala   # Scala code with functions + small demo
├── README.md                    # Project documentation
└── .gitignore                   # Ignore unnecessary files

⚡ Features

  • Matrix multiplication for small and large matrices
  • Support for coordinate and block representations
  • Example with 16,384 x 16,384 sparse matrices
  • Runs on Databricks, Spark Shell, or any Spark environment

🔧 Requirements

  • Apache Spark (2.4+ or 3.x)
  • Scala (2.11 or 2.12 depending on your Spark version)
  • Optional: Databricks Community Edition for notebooks

▶️ How to Run

1. Load the code

In Spark shell:

:load MatrixMultiplication.scala

Or in Databricks:
Copy the contents of MatrixMultiplication.scala into a notebook cell and run it.


2. Run Coordinate Matrix Multiplication

val resultCoo = COOMatrixMultiply(M_RDD_Small, N_RDD_Small)
resultCoo.collect.foreach(println)

Example Input Matrices

M = [ [1, 2],
      [3, 4] ]

N = [ [5, 6],
      [7, 8] ]

Expected Output

((0,0),19.0)
((0,1),22.0)
((1,0),43.0)
((1,1),50.0)

3. Run Block Matrix Multiplication

val resultBlock = BlockMatrixMultiply(M_RDD_Block, N_RDD_Block, blockSize)
resultBlock.collect.foreach(println)

You can adjust the blockSize parameter (e.g., 2, 4, 8).
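Block multiplication follows the same pattern one level up: the matrix is tiled into bs × bs blocks, block (p, q) of M is joined with block (q, r) of N on the shared block index q, the block pairs are multiplied with ordinary dense multiplication, and the partial result blocks are summed per (p, r). A minimal sketch with plain Scala maps standing in for the block RDDs (names are illustrative, not the repo's):

```scala
// Dense multiply of two row-major bs x bs blocks.
def multiplyBlocks(a: Array[Double], b: Array[Double], bs: Int): Array[Double] = {
  val c = new Array[Double](bs * bs)
  for (i <- 0 until bs; q <- 0 until bs; j <- 0 until bs)
    c(i * bs + j) += a(i * bs + q) * b(q * bs + j)
  c
}

// Matrices stored as (blockRow, blockCol) -> row-major bs x bs array.
def blockMultiply(
    m: Map[(Int, Int), Array[Double]],
    n: Map[(Int, Int), Array[Double]],
    bs: Int): Map[(Int, Int), Array[Double]] = {
  val partials = for {
    ((p, q), mBlk)  <- m.toSeq
    ((q2, r), nBlk) <- n.toSeq if q2 == q     // join on the shared block index
  } yield ((p, r), multiplyBlocks(mBlk, nBlk, bs))
  // Element-wise sum of the partial blocks for each output block (p, r)
  partials.groupBy(_._1).map { case (pr, blks) =>
    pr -> blks.map(_._2).reduce((x, y) => x.zip(y).map(t => t._1 + t._2))
  }
}
```

In Spark the join-on-q and the per-(p, r) summation become a keyed join (or cogroup) followed by `reduceByKey`, which is why blockSize matters: larger blocks mean fewer, heavier shuffle records.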


4. Run Large Matrices (optional)

val R_RDD_Coo_large = COOMatrixMultiply(M_RDD_Coo_large, N_RDD_Coo_large)
println(R_RDD_Coo_large.count())

val R_RDD_Block_large = BlockMatrixMultiply(M_RDD_Block_large, N_RDD_Block_large, blockSize = 8)
println(R_RDD_Block_large.count())

⚠️ Multiplying 16,384 x 16,384 matrices requires significant cluster resources.


📘 Notes

  • Two implementations are provided: cogroup (used by default) and join (commented in code). You can switch by uncommenting.
  • The functions are generic: you can replace the sample matrices with your own RDDs.
  • For large-scale experiments, consider tuning partitioning and block size.
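For large-scale experiments you also need test data. One possible way to produce random sparse COO entries to feed into your own RDDs (a hypothetical helper, not part of the repo):

```scala
import scala.util.Random

// Generate up to `nnz` random entries of an n x n sparse matrix in COO form.
// Duplicate coordinates are collapsed, keeping the last value generated.
def randomCooEntries(n: Int, nnz: Int, seed: Long): Seq[(Int, Int, Double)] = {
  val rnd = new Random(seed)
  Seq.fill(nnz)((rnd.nextInt(n), rnd.nextInt(n), rnd.nextDouble()))
    .map { case (i, j, v) => ((i, j), v) }
    .toMap                                    // collapse duplicate (i, j) keys
    .map { case ((i, j), v) => (i, j, v) }
    .toSeq
}
```

The resulting sequence can be turned into an RDD with `sc.parallelize(...)` and passed to the multiplication functions in place of the sample matrices.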

👨‍💻 Author

Created by Yashwin Bangalore Subramani
