Update .NET 6 to .NET 8, .NET Framework 4.6.1 -> .NET Framework 4.8, dependencies.
grazy27 committed Nov 28, 2024
1 parent e7eccdf commit 992daf4
Showing 35 changed files with 144 additions and 115 deletions.
6 changes: 5 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -372,4 +372,8 @@ hs_err_pid*
.ionide/

# Mac dev
-.DS_Store
+.DS_Store
+
+# Scala intermediate build files
+**/.bloop/
+**/.metals/
9 changes: 6 additions & 3 deletions README.md
@@ -8,7 +8,7 @@

.NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer.

-.NET for Apache Spark runs on Windows, Linux, and macOS using .NET 6, or Windows using .NET Framework. It also runs on all major cloud providers including [Azure HDInsight Spark](deployment/README.md#azure-hdinsight-spark), [Amazon EMR Spark](deployment/README.md#amazon-emr-spark), [AWS](deployment/README.md#databricks) & [Azure](deployment/README.md#databricks) Databricks.
+.NET for Apache Spark runs on Windows, Linux, and macOS using .NET 8, or Windows using .NET Framework. It also runs on all major cloud providers including [Azure HDInsight Spark](deployment/README.md#azure-hdinsight-spark), [Amazon EMR Spark](deployment/README.md#amazon-emr-spark), [AWS](deployment/README.md#databricks) & [Azure](deployment/README.md#databricks) Databricks.

**Note**: We currently have a Spark Project Improvement Proposal JIRA at [SPIP: .NET bindings for Apache Spark](https://issues.apache.org/jira/browse/SPARK-27006) to work with the community towards getting .NET support by default into Apache Spark. We highly encourage you to participate in the discussion.

@@ -40,7 +40,7 @@
<tbody align="center">
<tr>
<td>2.4*</td>
-<td rowspan=4><a href="https://github.com/dotnet/spark/releases/tag/v2.1.1">v2.1.1</a></td>
+<td rowspan=5><a href="https://github.com/dotnet/spark/releases/tag/v2.1.1">v2.1.1</a></td>
</tr>
<tr>
<td>3.0</td>
@@ -50,6 +50,9 @@
</tr>
<tr>
<td>3.2</td>
</tr>
+<tr>
+<td>3.5</td>
+</tr>
</tbody>
</table>
@@ -61,7 +64,7 @@
.NET for Apache Spark releases are available [here](https://github.com/dotnet/spark/releases) and NuGet packages are available [here](https://www.nuget.org/packages/Microsoft.Spark).

## Get Started
-These instructions will show you how to run a .NET for Apache Spark app using .NET 6.
+These instructions will show you how to run a .NET for Apache Spark app using .NET 8.
- [Windows Instructions](docs/getting-started/windows-instructions.md)
- [Ubuntu Instructions](docs/getting-started/ubuntu-instructions.md)
- [MacOs Instructions](docs/getting-started/macos-instructions.md)
6 changes: 3 additions & 3 deletions azure-pipelines-e2e-tests-template.yml
@@ -58,10 +58,10 @@ stages:
mvn -version
- task: UseDotNet@2
-displayName: 'Use .NET 6 sdk'
+displayName: 'Use .NET 8 sdk'
inputs:
packageType: sdk
-version: 6.x
+version: 8.x
installationPath: $(Agent.ToolsDirectory)/dotnet

- task: DownloadBuildArtifacts@0
@@ -71,7 +71,7 @@
downloadPath: $(Build.ArtifactStagingDirectory)

- pwsh: |
-$framework = "net6.0"
+$framework = "net8.0"
if ($env:AGENT_OS -eq 'Windows_NT') {
$runtimeIdentifier = "win-x64"
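The truncated PowerShell step above selects a target framework and a runtime identifier based on the agent OS; a minimal POSIX-shell sketch of the same logic (the non-Windows branch is an assumption, since the original snippet is cut off here):

```shell
# Mirror of the pipeline's framework/RID selection (illustrative only).
framework="net8.0"
if [ "$AGENT_OS" = "Windows_NT" ]; then
  runtime_identifier="win-x64"
else
  # Assumption: non-Windows agents build for linux-x64.
  runtime_identifier="linux-x64"
fi
echo "$framework/$runtime_identifier"
```

The pipeline then feeds these values into `dotnet publish -f <framework> -r <runtime_identifier>` for the end-to-end tests.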
2 changes: 1 addition & 1 deletion benchmark/README.md
@@ -60,7 +60,7 @@ TPCH timing results is written to stdout in the following form: `TPCH_Result,<la
<true for sql tests, false for functional tests>
```
-**Note**: Ensure that you build the worker and application with .NET 6 in order to run hardware acceleration queries.
+**Note**: Ensure that you build the worker and application with .NET 8 in order to run hardware acceleration queries.
## Python
1. Upload [run_python_benchmark.sh](run_python_benchmark.sh) and all [python tpch benchmark](python/) files to the cluster.
6 changes: 3 additions & 3 deletions benchmark/csharp/Tpch/Tpch.csproj
@@ -2,8 +2,8 @@

<PropertyGroup>
<OutputType>Exe</OutputType>
-<TargetFrameworks>net461;net6.0</TargetFrameworks>
-<TargetFrameworks Condition="'$(OS)' != 'Windows_NT'">net6.0</TargetFrameworks>
+<TargetFrameworks>net48;net8.0</TargetFrameworks>
+<TargetFrameworks Condition="'$(OS)' != 'Windows_NT'">net8.0</TargetFrameworks>
<RootNamespace>Tpch</RootNamespace>
<AssemblyName>Tpch</AssemblyName>
</PropertyGroup>
@@ -16,7 +16,7 @@
</ItemGroup>

<Choose>
-<When Condition="'$(TargetFramework)' == 'net6.0'">
+<When Condition="'$(TargetFramework)' == 'net8.0'">
<PropertyGroup>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
</PropertyGroup>
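For context, the `<Choose>`/`<When>` pattern changed in this diff gates compiler settings on the active target framework during a multi-targeted build; a minimal standalone sketch of such a project file (names are illustrative, not from this repository):

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>net48;net8.0</TargetFrameworks>
  </PropertyGroup>
  <!-- Each framework in <TargetFrameworks> is built in its own pass;
       this block applies only to the net8.0 pass. -->
  <Choose>
    <When Condition="'$(TargetFramework)' == 'net8.0'">
      <PropertyGroup>
        <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
      </PropertyGroup>
    </When>
  </Choose>
</Project>
```

Because conditions on `$(TargetFramework)` are evaluated per-framework, a project can enable unsafe blocks for the .NET 8 build while leaving the .NET Framework 4.8 build untouched.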
6 changes: 3 additions & 3 deletions deployment/README.md
@@ -63,7 +63,7 @@ Microsoft.Spark.Worker is a backend component that lives on the individual worke
## Azure HDInsight Spark
[Azure HDInsight Spark](https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview) is the Microsoft implementation of Apache Spark in the cloud that allows users to launch and configure Spark clusters in Azure. You can use HDInsight Spark clusters to process your data stored in Azure (e.g., [Azure Storage](https://azure.microsoft.com/en-us/services/storage/) and [Azure Data Lake Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction)).

-> **Note:** Azure HDInsight Spark is Linux-based. Therefore, if you are interested in deploying your app to Azure HDInsight Spark, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** Azure HDInsight Spark is Linux-based. Therefore, if you are interested in deploying your app to Azure HDInsight Spark, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
### Deploy Microsoft.Spark.Worker
*Note that this step is required only once*
@@ -115,7 +115,7 @@ EOF
## Amazon EMR Spark
[Amazon EMR](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-what-is-emr.html) is a managed cluster platform that simplifies running big data frameworks on AWS.
-> **Note:** AWS EMR Spark is Linux-based. Therefore, if you are interested in deploying your app to AWS EMR Spark, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** AWS EMR Spark is Linux-based. Therefore, if you are interested in deploying your app to AWS EMR Spark, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
### Deploy Microsoft.Spark.Worker
*Note that this step is only required at cluster creation*
@@ -160,7 +160,7 @@ foo@bar:~$ aws emr add-steps \
## Databricks
[Databricks](http://databricks.com) is a platform that provides cloud-based big data processing using Apache Spark.
-> **Note:** [Azure](https://azure.microsoft.com/en-us/services/databricks/) and [AWS](https://databricks.com/aws) Databricks is Linux-based. Therefore, if you are interested in deploying your app to Databricks, make sure your app is .NET Standard compatible and that you use [.NET 6 compiler](https://dotnet.microsoft.com/download) to compile your app.
+> **Note:** [Azure](https://azure.microsoft.com/en-us/services/databricks/) and [AWS](https://databricks.com/aws) Databricks is Linux-based. Therefore, if you are interested in deploying your app to Databricks, make sure your app is .NET Standard compatible and that you use [.NET 8 compiler](https://dotnet.microsoft.com/download) to compile your app.
Databricks allows you to submit Spark .NET apps to an existing active cluster or create a new cluster every time you launch a job. This requires the **Microsoft.Spark.Worker** to be installed **first** before you submit a Spark .NET app.
31 changes: 16 additions & 15 deletions docs/building/ubuntu-instructions.md
@@ -6,7 +6,7 @@ Building Spark .NET on Ubuntu 18.04
- [Pre-requisites](#pre-requisites)
- [Building](#building)
- [Building Spark .NET Scala Extensions Layer](#building-spark-net-scala-extensions-layer)
-- [Building .NET Sample Applications using .NET Core CLI](#building-net-sample-applications-using-net-core-cli)
+- [Building .NET Sample Applications using .NET 8 CLI](#building-net-sample-applications-using-net-8-cli)
- [Run Samples](#run-samples)

# Open Issues:
@@ -16,7 +16,7 @@ Building Spark .NET on Ubuntu 18.04

If you already have all the pre-requisites, skip to the [build](ubuntu-instructions.md#building) steps below.

-1. Download and install **[.NET 6 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)** - installing the SDK will add the `dotnet` toolchain to your path.
+1. Download and install **[.NET 8 SDK](https://dotnet.microsoft.com/en-us/download/dotnet/8.0)** - installing the SDK will add the `dotnet` toolchain to your path.
2. Install **[OpenJDK 8](https://openjdk.java.net/install/)**
- You can use the following command:
```bash
@@ -110,65 +110,66 @@ Let us now build the Spark .NET Scala extension layer. This is easy to do:
```
cd src/scala
mvn clean package
```
You should see JARs created for the supported Spark versions:
* `microsoft-spark-2-3/target/microsoft-spark-2-3_2.11-<version>.jar`
* `microsoft-spark-2-4/target/microsoft-spark-2-4_2.11-<version>.jar`
* `microsoft-spark-3-0/target/microsoft-spark-3-0_2.12-<version>.jar`
+* `microsoft-spark-3-5/target/microsoft-spark-3-5_2.12-<version>.jar`
-## Building .NET Sample Applications using .NET 6 CLI
+## Building .NET Sample Applications using .NET 8 CLI
1. Build the Worker
```bash
cd ~/dotnet.spark/src/csharp/Microsoft.Spark.Worker/
-dotnet publish -f net6.0 -r linux-x64
+dotnet publish -f net8.0 -r linux-x64
```
<details>
<summary>&#x1F4D9; Click to see sample console output</summary>
```bash
-user@machine:/home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker$ dotnet publish -f net6.0 -r linux-x64
+user@machine:/home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker$ dotnet publish -f net8.0 -r linux-x64
Microsoft (R) Build Engine version 16.0.462+g62fb89029d for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.
Restore completed in 36.03 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark.Worker/Microsoft.Spark.Worker.csproj.
Restore completed in 35.94 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark/Microsoft.Spark.csproj.
Microsoft.Spark -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark/Debug/netstandard2.0/Microsoft.Spark.dll
-Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/Microsoft.Spark.Worker.dll
-Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/publish/
+Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/Microsoft.Spark.Worker.dll
+Microsoft.Spark.Worker -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/publish/
```
</details>
2. Build the Samples
```bash
cd ~/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples/
-dotnet publish -f net6.0 -r linux-x64
+dotnet publish -f net8.0 -r linux-x64
```
<details>
<summary>&#x1F4D9; Click to see sample console output</summary>
```bash
-user@machine:/home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples$ dotnet publish -f net6.0 -r linux-x64
+user@machine:/home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples$ dotnet publish -f net8.0 -r linux-x64
Microsoft (R) Build Engine version 16.0.462+g62fb89029d for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.
Restore completed in 37.11 ms for /home/user/dotnet.spark/src/csharp/Microsoft.Spark/Microsoft.Spark.csproj.
Restore completed in 281.63 ms for /home/user/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples/Microsoft.Spark.CSharp.Examples.csproj.
Microsoft.Spark -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark/Debug/netstandard2.0/Microsoft.Spark.dll
-Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/Microsoft.Spark.CSharp.Examples.dll
-Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/publish/
+Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/Microsoft.Spark.CSharp.Examples.dll
+Microsoft.Spark.CSharp.Examples -> /home/user/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/publish/
```
</details>
# Run Samples
-Once you build the samples, you can use `spark-submit` to submit your .NET 6 apps. Make sure you have followed the [pre-requisites](#pre-requisites) section and installed Apache Spark.
+Once you build the samples, you can use `spark-submit` to submit your .NET 8 apps. Make sure you have followed the [pre-requisites](#pre-requisites) section and installed Apache Spark.
-1. Set the `DOTNET_WORKER_DIR` or `PATH` environment variable to include the path where the `Microsoft.Spark.Worker` binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net6.0/linux-x64/publish`)
-2. Open a terminal and go to the directory where your app binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net6.0/linux-x64/publish`)
+1. Set the `DOTNET_WORKER_DIR` or `PATH` environment variable to include the path where the `Microsoft.Spark.Worker` binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.Worker/Debug/net8.0/linux-x64/publish`)
+2. Open a terminal and go to the directory where your app binary has been generated (e.g., `~/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/net8.0/linux-x64/publish`)
3. Running your app follows the basic structure:
```bash
spark-submit \
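# A hedged sketch of how such a command typically continues for a .NET for
# Apache Spark app (the jar version, master, and app name below are
# illustrative, not taken from this commit):
#   spark-submit \
#     --class org.apache.spark.deploy.dotnet.DotnetRunner \
#     --master local \
#     microsoft-spark-3-5_2.12-<version>.jar \
#     ./Microsoft.Spark.CSharp.Examples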