Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
Updated Jul 21, 2025 - Java
Hopsworks - Data-Intensive AI platform with a Feature Store
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Java library for approximate nearest neighbors search using Hierarchical Navigable Small World graphs
This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages, using Python, Scala, and Java as examples.
Custom AEMO MMS Data Model CSV reader for Apache Spark
Assignment for UoM lesson "Big Data"
MapReduce job development, RDD programming, medical data management, sales analysis, and efficient data integration for big data analysis. Spark: big data processing, Sqoop integration, and Spark Structured Streaming for real-time data.
Implementation of Hadoop and Spark
B2C online education website with a decoupled frontend and backend, the MVC design pattern, and a course recommendation system
How to use spark-testing-base in a Spark Java application. Feel free to make changes.
SUTD 2021 50.043 Database and Big Data Systems Code Dump
Examining the Relationship Between Tree Quality and Socioeconomic Status in New York City
Concepts and Applications of Big Data. Hadoop and Spark exercises
A CRISP-DM–based big data pipeline for predicting NYC ride-sharing trip fares: ingesting 2024 TLC data via Sqoop into HDFS/Hive, performing ETL and feature engineering with Spark & PySpark, training and tuning Linear Regression & Gradient Boosted Tree models, and outlining end-to-end deployment.