docs(jina): add jina integration section (#129)
* docs(jina): add jina integration section

hanxiao authored Feb 21, 2022
1 parent 88ac69f commit 8fe03dc
Showing 8 changed files with 197 additions and 9 deletions.
2 changes: 1 addition & 1 deletion docs/_static/docarray-dark.svg
Binary file added docs/fundamentals/jina-support/docarray-img.png
190 changes: 190 additions & 0 deletions docs/fundamentals/jina-support/index.md
@@ -0,0 +1,190 @@
(jina-support)=
# Jina

DocArray focuses on the local, monolithic developer experience. [Jina](https://github.com/jina-ai/jina) scales DocArray to the cloud. DocArray is also the default transit format in Jina: Executors talk to each other via serialized DocArray. The picture below shows how the two relate.

```{figure} position-jina-docarray.svg
:width: 80%

```

The next picture summarizes your development journey with DocArray and Jina. With a new project, first move horizontally left with DocArray; that often means improving quality and completing the logic in a local environment. When you are ready, move vertically up with Jina, equipping your application with a service endpoint, scalability and cloud-native features. Finally, you reach the point where your service is ready for production.

```{figure} position-jina-docarray-2.svg
:width: 80%

```


## Package dependency

If you are a Jina 3 user, you don't need to install `docarray` independently, as it is included in `pip install jina`. You can use `jina -v` in the terminal to check if you are using `3.x`.

When starting a Jina project, you can write imports either as
```python
from docarray import DocumentArray, Document
from jina import Flow
```

or as:

```python
from jina import Flow, DocumentArray, Document
```

They work exactly the same. You will be using the same install of `docarray` on your system, because the `jina` package exposes `DocumentArray` and `Document` from the `docarray` package.

You can update the DocArray package without updating Jina via `pip install -U docarray`. This usually works unless the Jina release notes say otherwise.
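To see which versions of the two packages are installed in the current environment, a small standard-library check can help. This is a minimal sketch, not part of either package: the helper name `installed_version` is made up for illustration, and it only reads package metadata without importing `jina` or `docarray`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing.

    `installed_version` is a hypothetical helper, not a Jina or DocArray API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Look up both packages; a missing package is reported instead of raising.
versions = {name: installed_version(name) for name in ("jina", "docarray")}
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Running it before and after `pip install -U docarray` shows whether only the `docarray` version changed.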

## Local code as a service

Consider the example below, where we use DocArray to pre-process an image DocumentArray:

```python
from docarray import Document, DocumentArray

da = DocumentArray.from_files('**/*.png')

def preproc(d: Document):
    return (d.load_uri_to_image_tensor()  # load
            .set_image_tensor_normalization()  # normalize color
            .set_image_tensor_channel_axis(-1, 0))  # switch color axis for the PyTorch model later

da.apply(preproc).plot_image_sprites(channel_axis=0)
```

The code can be run as-is. It will give you a plot like the following (depending on how many images you have):

```{figure} docarray-img.png
:width: 50%
```


When writing it with Jina, the code is slightly refactored into the Executor-style:

```python
from docarray import Document, DocumentArray

from jina import Executor, requests

class MyExecutor(Executor):

    @staticmethod
    def preproc(d: Document):
        return (d.load_uri_to_image_tensor()  # load
                .set_image_tensor_normalization()  # normalize color
                .set_image_tensor_channel_axis(-1, 0))  # switch color axis for the PyTorch model later

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        docs.apply(self.preproc)
```

To summarize, you need to make three changes:

- Import `Executor` and subclass it;
- Wrap your functions into class methods;
- Add the `@requests` decorator to the logic functions.

Now you can feed data to it via:

```python
from jina import Flow, DocumentArray

f = Flow().add(uses=MyExecutor)

with f:
    r = f.post('/', DocumentArray.from_files('**/*.png'), show_progress=True)
    r.plot_image_sprites(channel_axis=0)
```

You get the same results as before with some extra output from the console:

```text
Flow@26202[I]:🎉 Flow is ready to use!
🔗 Protocol: GRPC
🏠 Local access: 0.0.0.0:57050
🔒 Private network: 192.168.0.102:57050
🌐 Public address: 84.172.88.250:57050
⠋ DONE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 0:00:05 100% ETA: 0 seconds 80 steps done in 5 seconds
```
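For intuition about what happens when the Flow receives a request at `/`, here is a toy, standard-library model of decorator-based endpoint routing. It is a sketch of the general pattern only and is not Jina's implementation: `toy_requests`, `ToyExecutor`, `MyToyExecutor` and the `"*"` marker are all invented for illustration, and plain strings stand in for Documents.

```python
from typing import Callable, Dict, List, Optional

ANY_ENDPOINT = "*"  # hypothetical marker: the handler accepts every endpoint


def toy_requests(func: Optional[Callable] = None, *, on: str = ANY_ENDPOINT):
    """Tag a method with the endpoint it serves (toy stand-in, not Jina's `@requests`)."""

    def mark(f: Callable) -> Callable:
        f._endpoint = on
        return f

    # Support both `@toy_requests` and `@toy_requests(on='/path')`.
    return mark(func) if func is not None else mark


class ToyExecutor:
    def __init__(self) -> None:
        # Collect every tagged method into an endpoint -> handler table.
        self._routes: Dict[str, Callable] = {}
        for name in dir(type(self)):
            handler = getattr(self, name)
            endpoint = getattr(handler, "_endpoint", None)
            if endpoint is not None:
                self._routes[endpoint] = handler

    def handle(self, endpoint: str, docs: List[str]) -> List[str]:
        # Prefer an exact endpoint match, then fall back to the catch-all handler.
        handler = self._routes.get(endpoint, self._routes.get(ANY_ENDPOINT))
        if handler is None:
            return docs  # toy behavior: pass the data through unchanged
        return handler(docs)


class MyToyExecutor(ToyExecutor):
    @toy_requests
    def foo(self, docs):
        return [d.upper() for d in docs]

    @toy_requests(on="/reverse")
    def bar(self, docs):
        return [d[::-1] for d in docs]


ex = MyToyExecutor()
default_out = ex.handle("/", ["ab", "cd"])         # no exact match, so `foo` handles it
reverse_out = ex.handle("/reverse", ["ab", "cd"])  # exact match, so `bar` handles it
print(default_out, reverse_out)
```

In the `MyExecutor` example above, `foo` is decorated without an endpoint, which is why posting to `/` reaches it.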

## Three good reasons to use Jina

Okay, so I refactored the code from 10 lines to 24 lines. What's the deal? Here are three reasons to use Jina:

### A client-server architecture

One immediate consequence is that your logic now works as a service. You can host it remotely on a server and start a client to query it:

````{tab} Server
```python
from jina import Flow, DocumentArray
f = Flow(port=12345).add(uses=MyExecutor)
with f:
    f.block()
```
````
````{tab} Client
```python
from jina import Client, DocumentArray
c = Client(port=12345)
c.post('/', DocumentArray.from_files('**/*.png'), show_progress=True)
```
````

You can also query it via WebSockets, HTTP, or the GraphQL API. More details can be found in the [Jina Documentation](https://docs.jina.ai/).

### Scale it out

Scaling your server is as easy as adding `replicas`:

```python
from jina import Flow

f = Flow(port=12345).add(uses=MyExecutor, replicas=3)

with f:
    f.block()
```

This starts three parallel replicas of the Executor, which can improve the overall throughput. [More details can be found here.](https://docs.jina.ai/fundamentals/flow/create-flow/#replicate-executors)
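As a rough, library-independent illustration of why several copies of the same worker help, the sketch below pushes six slow requests through a pool of one worker and then through a pool of three. It uses only the standard library. `slow_preproc` and `serve` are hypothetical names, the 50 ms sleep is an arbitrary stand-in for real work, and threads in a pool are only an analogy: Jina replicas are separate Executor instances.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_preproc(x: int) -> int:
    """Stand-in for an expensive per-request step (hypothetical workload)."""
    time.sleep(0.05)
    return x * 2


def serve(requests_in, replicas: int):
    """Process all requests with `replicas` parallel workers.

    Returns the results (in input order) and the elapsed wall-clock seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        results = list(pool.map(slow_preproc, requests_in))
    return results, time.perf_counter() - start


one, t1 = serve(range(6), replicas=1)
three, t3 = serve(range(6), replicas=3)
print(f"1 replica: {t1:.2f}s, 3 replicas: {t3:.2f}s")
```

The results are identical in both runs; only the wall-clock time differs, and the gain depends on how much of the work can actually run in parallel.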

### Share and reuse it

You can share and reuse your Executor via [Hub](https://hub.jina.ai). Save your Executor in a folder, say `foo`, and then run:

```bash
jina hub push foo
```

This uploads your Executor logic to Jina Hub and allows you and other people to reuse it via Sandbox (as a hosted microservice), Docker image or source. For example, after `jina hub push`, you will get:


```{figure} jinahub-push.png
:width: 60%
```


If you want to use it as a Sandbox, you can change your Flow to:

```python
from jina import Flow, DocumentArray

f = Flow().add(uses='jinahub+sandbox://mp0pe477')

with f:
    f.post('/', DocumentArray.from_files('**/*.png'), show_progress=True)
```

In this case, the Executor is running remotely and managed by Jina Cloud. It does not use any of your local resources.

A single Executor can only do so much. You can combine multiple Executors in a Flow to accomplish a task: some written by you, some shared from the Hub, some running remotely, and some running in local Docker. There is little to worry about; all you need to do is keep adding Executors to your Flow with `.add()`.

## Summary


If you are starting something new, start with DocArray. If you want to scale it out and make it a publicly available cloud service, then use Jina.
Binary file added docs/fundamentals/jina-support/jinahub-push.png
1 change: 1 addition & 0 deletions docs/fundamentals/jina-support/position-jina-docarray.svg
8 changes: 2 additions & 6 deletions docs/get-started/what-is.md
@@ -57,8 +57,6 @@ In DocArray, there will also be a couple of feature release soon to allow big da
:width: 90%
```

The data in transit part of DocArray will become much clearer with Jina 3.0 release (expected in Feb. 2022).


## To AwkwardArray

@@ -113,6 +111,8 @@ In DocArray, the basic element one would work with is a Document, not `ndarray`.

## To Jina Users

DocArray focuses on the local & monolith developer experience. Jina scales DocArray to the Cloud. More details on their relations can be {ref}`found here<jina-support>`.

Jina 2.0-2.6 *kind of* have their own "DocArray", with very similar `Document` and `DocumentArray` interface. However, it is an old design and codebase. Since then, many redesigns and improvements have been made to boost its efficiency, usability and portability. DocArray is now an independent package that other frameworks such as future Jina 3.x and Finetuner will rely on.

The first good reason to use DocArray is its efficiency. Here is a side-by-side speed comparison of Jina 2.6 vs. DocArray on some common tasks.
@@ -135,7 +135,3 @@ Beside code refactoring and optimization, many features have been improved, incl
When first using DocArray, some Jina 2.x users may notice that static typing seems to be missing. This is due to a deliberate decision of DocArray: DocArray guarantees the types and constraints of the wire data, not the input data. In other words, only the functions listed under the {ref}`docarray-serialization` chapter trigger data validation.

To learn DocArray, the recommendation here is to forget about everything in Jina 2.x, although some interfaces may look familiar. Read [the fundamental sections](../fundamentals/document/index.md) from the beginning.

```{important}
The new Jina 3.0 (expected in Feb. 2022) will depend on the new DocArray. All Document & Document API from [Jina Docs](https://docs.jina.ai) will be removed. This documentation website of DocArray serves as the single source of truth.
```
4 changes: 2 additions & 2 deletions docs/index.md
@@ -73,13 +73,12 @@ This will install all requirements for reproducing tests on your local dev envir


```{important}
Jina 3.x[^1] users do not need to install `docarray` separately, as it is shipped with Jina. To check your Jina version, type `jina -vf` in the console.
Jina 3.x users do not need to install `docarray` separately, as it is shipped with Jina. To check your Jina version, type `jina -vf` in the console.
However, if the printed version is smaller than `0.1.0`, say `0.0.x`, then you are
not installing `docarray` correctly. You are probably still using an old `docarray` shipped with Jina 2.x.
```

[^1]: Jina 3.0rc will be released in Feb. 2022. Stay tune!


```{include} ../README.md
@@ -108,6 +107,7 @@ datatypes/index
:caption: Integrations
:hidden:
fundamentals/jina-support/index
fundamentals/notebook-support/index
fundamentals/fastapi-support/index
advanced/graphql-support/index