diff --git a/learning/tour-of-beam/learning-content/common-transforms/aggregation/count/description.md b/learning/tour-of-beam/learning-content/common-transforms/aggregation/count/description.md index 43ab5503240c..60fd0cc9f216 100644 --- a/learning/tour-of-beam/learning-content/common-transforms/aggregation/count/description.md +++ b/learning/tour-of-beam/learning-content/common-transforms/aggregation/count/description.md @@ -238,11 +238,11 @@ PCollection> input = pipeline.apply( If you replace `Count.globally` with `Count.perKey`, it will output the count numbers by key. It is also necessary to replace the generic type: ``` -PCollection> output = applyTransform(input); +PCollection> output = applyTransform(input); ``` ``` -static PCollection> applyTransform(PCollection> input) { +static PCollection> applyTransform(PCollection> input) { return input.apply(Count.perKey()); } ``` diff --git a/learning/tour-of-beam/learning-content/common-transforms/aggregation/max/description.md b/learning/tour-of-beam/learning-content/common-transforms/aggregation/max/description.md index 92a0ace73b4d..9b8cfbea8b11 100644 --- a/learning/tour-of-beam/learning-content/common-transforms/aggregation/max/description.md +++ b/learning/tour-of-beam/learning-content/common-transforms/aggregation/max/description.md @@ -42,11 +42,11 @@ func ApplyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { ``` {{end}} {{if (eq .Sdk "java")}} -You can find the global maximum value from the `PCollection` by using `Max.doublesGlobally()` +You can find the global maximum value from the `PCollection` by using `Max.integersGlobally()` ``` PCollection input = pipeline.apply(Create.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); -PCollection max = input.apply(Max.doublesGlobally()); +PCollection max = input.apply(Max.integersGlobally()); ``` Output diff --git a/learning/tour-of-beam/learning-content/common-transforms/aggregation/min/description.md
b/learning/tour-of-beam/learning-content/common-transforms/aggregation/min/description.md index 1343b9d8c85f..138c8aef640e 100644 --- a/learning/tour-of-beam/learning-content/common-transforms/aggregation/min/description.md +++ b/learning/tour-of-beam/learning-content/common-transforms/aggregation/min/description.md @@ -41,11 +41,11 @@ func ApplyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { ``` {{end}} {{if (eq .Sdk "java")}} -You can find the global minimum value from the `PCollection` by using `Min.doublesGlobally()` +You can find the global minimum value from the `PCollection` by using `Min.integersGlobally()` ``` PCollection input = pipeline.apply(Create.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); -PCollection min = input.apply(Min.doublesGlobally()); +PCollection min = input.apply(Min.integersGlobally()); ``` Output @@ -165,7 +165,7 @@ PCollection> output = applyTransform(input); ``` static PCollection> applyTransform(PCollection> input) { - return input.apply(Sum.integersPerKey()); + return input.apply(Min.integersPerKey()); } ``` {{end}} diff --git a/learning/tour-of-beam/learning-content/common-transforms/filter/description.md b/learning/tour-of-beam/learning-content/common-transforms/filter/description.md index 7a5b9522926d..96f4b549625b 100644 --- a/learning/tour-of-beam/learning-content/common-transforms/filter/description.md +++ b/learning/tour-of-beam/learning-content/common-transforms/filter/description.md @@ -51,7 +51,7 @@ world ### Built-in filters -The Java SDK has several filter methods built-in, like `Filter.greaterThan` and `Filter.lessThen` With `Filter.greaterThan`, the input `PCollection` can be filtered so that only the elements whose values are greater than the specified amount remain. Similarly, you can use `Filter.lessThen` to filter out elements of the input `PCollection` whose values are greater than the specified amount. 
+The Java SDK has several filter methods built-in, like `Filter.greaterThan` and `Filter.lessThan`. With `Filter.greaterThan`, the input `PCollection` can be filtered so that only the elements whose values are greater than the specified amount remain. Similarly, you can use `Filter.lessThan` to keep only the elements of the input `PCollection` whose values are less than the specified amount. Other built-in filters are: @@ -62,7 +62,7 @@ Other built-in filters are: * Filter.equal -## Example 2: Filtering with a built-in methods +## Example 2: Filtering with built-in methods ``` // List of integers @@ -404,11 +404,3 @@ func applyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { }) } ``` - -### Playground exercise - -You can find the complete code of the above example using 'Filter' in the playground window, which you can run and experiment with. - -Filter transform can be used with both text and numerical collection. For example, let's try filtering the input collection that contains words so that only words that start with the letter 'a' are returned. - -You can also chain several filter transforms to form more complex filtering based on several simple filters or implement more complex filtering logic within a single filter transform. For example, try both approaches to filter the same list of words such that only ones that start with a letter 'a' (regardless of the case) and containing more than three symbols are returned.
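To make the behavior of the built-in filters concrete, here is a plain-Python sketch of what `Filter.greaterThan` and `Filter.lessThan` compute (an illustration of the semantics only, not the Beam API):

```python
# Plain-Python sketch of the built-in filter semantics.
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Filter.greaterThan(3): only elements strictly greater than 3 remain.
greater_than_3 = [n for n in numbers if n > 3]

# Filter.lessThan(8): only elements strictly less than 8 remain.
less_than_8 = [n for n in numbers if n < 8]

print(greater_than_3)  # [4, 5, 6, 7, 8, 9, 10]
print(less_than_8)     # [1, 2, 3, 4, 5, 6, 7]
```

Chaining both conditions in one predicate would keep only the elements in the open interval (3, 8).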
diff --git a/learning/tour-of-beam/learning-content/core-transforms/additional-outputs/description.md b/learning/tour-of-beam/learning-content/core-transforms/additional-outputs/description.md index d228e6537498..7c4922314521 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/additional-outputs/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/additional-outputs/description.md @@ -104,7 +104,7 @@ func extractWordsFn(pn beam.PaneInfo, line string, emitWords func(string)) { ``` {{end}} {{if (eq .Sdk "java")}} -While `ParDo` always outputs the main output of `PCollection` (as a return value from apply), you can also force your `ParDo` to output any number of additional `PCollection` outputs. If you decide to have multiple outputs, your `ParDo` will return all the `PCollection` output (including the main output) combined. This will be useful when you are working with big data or a database that needs to be divided into different collections. You get a combined `PCollectionTuple`, you can use `TupleTag` to get a `PCollection`. +While `ParDo` always outputs the main output of `PCollection` (as a return value from apply), you can also force your `ParDo` to output any number of additional `PCollection` outputs. If you decide to have multiple outputs, your `ParDo` will return all the `PCollection` outputs (including the main output) combined. This will be useful when you are working with big data or a database that needs to be divided into different collections. You get a combined `PCollectionTuple`, you can use `TupleTag` to get a `PCollection`. A `PCollectionTuple` is an immutable tuple of heterogeneously typed `PCollection`, "with keys" `TupleTags`. A `PCollectionTuple` can be used as input or output for `PTransform` receiving or creating multiple `PCollection` inputs or outputs, which can be of different types, for example, `ParDo` with multiple outputs. 
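The multi-output idea above can be sketched in plain Python (hypothetical tag names; a dict stands in for the `PCollectionTuple`, and this is not Beam code):

```python
def apply_transform(numbers):
    """Route each element to one of two outputs in a single pass,
    mimicking a ParDo with a main output and one additional output."""
    above_100, below_100 = [], []
    for n in numbers:
        # One pass over the input; each element lands in exactly one output.
        (above_100 if n > 100 else below_100).append(n)
    # The dict plays the role of the PCollectionTuple: tags map to outputs.
    return {"above100": above_100, "below100": below_100}

outputs = apply_transform([10, 150, 3, 200, 99])
print(outputs["above100"])  # [150, 200]
print(outputs["below100"])  # [10, 3, 99]
```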
@@ -202,7 +202,7 @@ tens = results[None] # the undeclared main output You can find the full code of this example in the playground window, which you can run and experiment with. -The `applyTransform()` accepts a list of integers at the output two `PCollection` one `PCollection` above 100 and second below 100. +The `applyTransform()` accepts a list of integers and outputs two `PCollections`: one with elements above 100 and the other with elements below 100. You can also work with strings: {{if (eq .Sdk "go")}} diff --git a/learning/tour-of-beam/learning-content/core-transforms/branching/description.md b/learning/tour-of-beam/learning-content/core-transforms/branching/description.md index a01022de97ab..8adff770cdd5 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/branching/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/branching/description.md @@ -86,7 +86,7 @@ starts_with_b = input | beam.Filter(lambda x: x.startswith('B')) You can find the full code of this example in the playground window, which you can run and experiment with. -Accepts a `PCollection` consisting of strings. Without modification, it returns a new "PCollection". In this case, one `PCollection` includes elements in uppercase. The other `PCollection' stores inverted elements. +Accepts a `PCollection` consisting of strings. Without modification, it returns a new `PCollection`. In this case, one `PCollection` includes elements in uppercase. The other `PCollection` stores inverted elements. You can use a different method of branching. Since `applyTransforms` performs 2 conversions, it takes a lot of time. It is possible to convert `PCollection` separately.
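The branching idea can be sketched in plain Python (example data is hypothetical): the same immutable input feeds two independent transforms, one producing uppercase elements and one producing reversed elements:

```python
# Both branches read the same input; neither modifies it.
input_elements = ["apple", "banana", "cherry"]

# Branch 1: uppercase every element.
upper = [s.upper() for s in input_elements]

# Branch 2: reverse every element.
reversed_elements = [s[::-1] for s in input_elements]

print(upper)              # ['APPLE', 'BANANA', 'CHERRY']
print(reversed_elements)  # ['elppa', 'ananab', 'yrrehc']
```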
{{if (eq .Sdk "go")}} diff --git a/learning/tour-of-beam/learning-content/core-transforms/combine/combine-per-key/description.md b/learning/tour-of-beam/learning-content/core-transforms/combine/combine-per-key/description.md index 1f44151337dd..af6f9e7aa197 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/combine/combine-per-key/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/combine/combine-per-key/description.md @@ -14,9 +14,9 @@ limitations under the License. # CombinePerKey -CombinePerKey is a transform in Apache Beam that applies a `CombineFn` function to each key of a PCollection of key-value pairs. The `CombineFn` function can be used to aggregate, sum, or combine the values associated with each key in the input `PCollection`. +`CombinePerKey` is a transform in Apache Beam that applies a `CombineFn` function to each key of a `PCollection` of key-value pairs. The `CombineFn` function can be used to aggregate, sum, or combine the values associated with each key in the input `PCollection`. -The `CombinePerKey` transform takes in an instance of a `CombineFn` class and applies it to the input `PCollection`. The output of the transform is a new PCollection where each element is a key-value pair, where the key is the same as the input key, and the value is the result of applying the `CombineFn` function to all the values associated with that key in the input `PCollection`. +The `CombinePerKey` transform takes in an instance of a `CombineFn` class and applies it to the input `PCollection`. The output of the transform is a new `PCollection` where each element is a key-value pair, where the key is the same as the input key, and the value is the result of applying the `CombineFn` function to all the values associated with that key in the input `PCollection`. 
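The semantics of `CombinePerKey` described above can be sketched in plain Python (illustration only, not the Beam API): group the values by key, then fold each group with the combine function:

```python
from functools import reduce

def combine_per_key(pairs, combine_fn):
    """Apply combine_fn to all values that share a key, producing one
    (key, combined_value) entry per key — what CombinePerKey computes."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return {key: reduce(combine_fn, values) for key, values in grouped.items()}

pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
result = combine_per_key(pairs, lambda x, y: x + y)
print(result)  # {'a': 4, 'b': 6}
```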
{{if (eq .Sdk "go")}} ``` @@ -81,7 +81,7 @@ func applyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { {{if (eq .Sdk "java")}} ``` PCollection> input = pipeline - .apply("ParseCitiesToTimeKV", Create.of( + .apply(Create.of( KV.of("a", "apple"), KV.of("o", "orange"), KV.of("a", "avocado"), @@ -93,7 +93,7 @@ static PCollection> applyTransform(PCollection { + static class SumStringBinaryCombineFn extends BinaryCombineFn { @Override public String apply(String left, String right) { diff --git a/learning/tour-of-beam/learning-content/core-transforms/combine/simple-function/description.md b/learning/tour-of-beam/learning-content/core-transforms/combine/simple-function/description.md index 8eda136e7931..1946265fa66c 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/combine/simple-function/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/combine/simple-function/description.md @@ -14,7 +14,7 @@ limitations under the License. # Combine -`Combine` is a Beam transform for combining collections of elements or values in your data. Combine has variants that work on entire PCollections, and some that combine the values for each key in `PCollections` of **key/value** pairs. +`Combine` is a Beam transform for combining collections of elements or values in your data. Combine has variants that work on entire `PCollections`, and some that combine the values for each key in `PCollections` of **key/value** pairs. When you apply a `Combine` transform, you must provide the function that contains the logic for combining the elements or values. The combining function should be commutative and associative, as the function is not necessarily invoked exactly once on all values with a given key. Because the input data (including the value collection) may be distributed across multiple workers, the combining function might be called multiple times to perform partial combining on subsets of the value collection.
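Why commutativity and associativity matter can be shown with a small plain-Python sketch: for such a function (sum, here), combining partial results computed over arbitrary subsets gives the same answer as one global pass, which is exactly what lets workers combine subsets independently:

```python
# Sum is commutative and associative, so combining partial results
# over arbitrary subsets gives the same answer as one global pass.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

global_sum = sum(values)

# Simulate two workers, each combining its own subset, then merging.
partial_a = sum(values[:4])   # worker 1's subset
partial_b = sum(values[4:])   # worker 2's subset
merged = partial_a + partial_b

print(global_sum, merged)  # 55 55
```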
The Beam SDK also provides some pre-built combine functions for common numeric combination operations such as sum, min, and max. diff --git a/learning/tour-of-beam/learning-content/core-transforms/composite/description.md b/learning/tour-of-beam/learning-content/core-transforms/composite/description.md index 774f5deae924..c61dcee8950b 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/composite/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/composite/description.md @@ -195,7 +195,7 @@ func extractWords(s beam.Scope, input beam.PCollection) beam.PCollection { } ``` -You can use other transformations you can replace `Count` with `Filter` to output words starting with **p**: +You can use other transformations; for example, you can replace `Count` with `Filter` to output words starting with **p**: ``` func applyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { @@ -224,7 +224,7 @@ PCollection words = input })); ``` -You can use other transformations you can replace `Count` with `Filter` to output words starting with **p**: +You can use other transformations; for example, you can replace `Count` with `Filter` to output words starting with **p**: ``` PCollection filtered = input @@ -252,7 +252,7 @@ PCollection filtered = input words = input | 'ExtractWords' >> beam.FlatMap(lambda line: [word for word in line.split() if word]) ``` -You can use other transformations you can replace `Count` with `Filter` to output words starting with **p**: +You can use other transformations; for example, 
you can replace `Count` with `Filter` to output words starting with **p**: ``` filtered = (input | 'ExtractNonSpaceCharacters' >> beam.FlatMap(lambda line: [word for word in line.split() if word]) diff --git a/learning/tour-of-beam/learning-content/core-transforms/flatten/description.md b/learning/tour-of-beam/learning-content/core-transforms/flatten/description.md index 19618f02c9e3..db9cbc31dc65 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/flatten/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/flatten/description.md @@ -56,7 +56,7 @@ By default, the coder for the output `PCollection` is the same as the coder for When using `Flatten` to merge `PCollection` objects that have a windowing strategy applied, all of the `PCollection` objects you want to merge must use a compatible windowing strategy and window sizing. For example, all the collections you’re merging must all use (hypothetically) identical 5-minute fixed windows or 4-minute sliding windows starting every 30 seconds. -If your pipeline attempts to use `Flatten` to merge `PCollection` objects with incompatible windows, Beam generates an IllegalStateException error when your pipeline is constructed. +If your pipeline attempts to use `Flatten` to merge `PCollection` objects with incompatible windows, Beam generates an `IllegalStateException` error when your pipeline is constructed. 
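The merge that `Flatten` performs can be sketched in plain Python (illustration of the semantics, not the Beam API): several collections of the same element type become one collection containing all their elements:

```python
from itertools import chain

# Flatten merges several collections of the same type into one.
pc1 = [1, 2, 3]
pc2 = [4, 5]
pc3 = [6]

flattened = list(chain(pc1, pc2, pc3))
print(flattened)  # [1, 2, 3, 4, 5, 6]
```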
### Playground exercise diff --git a/learning/tour-of-beam/learning-content/core-transforms/map/co-group-by-key/description.md b/learning/tour-of-beam/learning-content/core-transforms/map/co-group-by-key/description.md index a15321cd8ea3..739f43f96201 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/map/co-group-by-key/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/map/co-group-by-key/description.md @@ -75,7 +75,7 @@ func formatCoGBKResults(key string, emailIter, phoneIter func(*string) bool) str {{if (eq .Sdk "java")}} You can use the `CoGroupByKey` transformation for a tuple of tables. `CoGroupByKey` groups results from all tables by similar keys in `CoGbkResults`, from which results for any particular table can be accessed using the `TupleTag` tag supplied with the source table. -For type safety, the Jav SDK requires you to pass each `PCollection` as part of a `KeyedPCollectionTuple`. You must declare a `TupleTag` for each input `PCollection` in the `KeyedPCollectionTuple` that you want to pass to `CoGroupByKey`. As output, `CoGroupByKey` returns a `PCollection>`, which groups values from all the input `PCollections` by their common keys. Each key (all of type K) will have a different `CoGbkResult`, which is a map from `TupleTag to Iterable`. You can access a specific collection in an `CoGbkResult` object by using the `TupleTag` that you supplied with the initial collection. +For type safety, the Java SDK requires you to pass each `PCollection` as part of a `KeyedPCollectionTuple`. You must declare a `TupleTag` for each input `PCollection` in the `KeyedPCollectionTuple` that you want to pass to `CoGroupByKey`. As output, `CoGroupByKey` returns a `PCollection>`, which groups values from all the input `PCollections` by their common keys. Each key (all of type K) will have a different `CoGbkResult`, which is a map from `TupleTag to Iterable`. 
You can access a specific collection in a `CoGbkResult` object by using the `TupleTag` that you supplied with the initial collection. ``` // Mock data diff --git a/learning/tour-of-beam/learning-content/core-transforms/map/group-by-key/description.md b/learning/tour-of-beam/learning-content/core-transforms/map/group-by-key/description.md index c6041905a2d8..d8c396bf3a75 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/map/group-by-key/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/map/group-by-key/description.md @@ -11,7 +11,7 @@ limitations under the License. --> # GroupByKey -`GroupByKey` is a transform that is used to group elements in a `PCollection` by key. The input to `GroupByKey` is a `PCollection` of key-value pairs, where the keys are used to group the elements. The output of `GroupByKey` is a PCollection of key-value pairs, where the keys are the same as the input, and the values are lists of all the elements with that key. +`GroupByKey` is a transform that is used to group elements in a `PCollection` by key. The input to `GroupByKey` is a `PCollection` of key-value pairs, where the keys are used to group the elements. The output of `GroupByKey` is a `PCollection` of key-value pairs, where the keys are the same as the input, and the values are lists of all the elements with that key. Let’s examine the mechanics of `GroupByKey` with a simple example case, where our data set consists of words from a text file and the line number on which they appear. We want to group together all the line numbers (values) that share the same word (key), letting us see all the places in the text where a particular word appears. @@ -30,7 +30,7 @@ cat, 9 and, 6 ``` -`GroupByKey` gathers up all the values with the same key and outputs a new pair consisting of the unique key and a collection of all of the values that were associated with that key in the input collection.
If we apply GroupByKey to our input collection above, the output collection would look like this: +`GroupByKey` gathers up all the values with the same key and outputs a new pair consisting of the unique key and a collection of all of the values that were associated with that key in the input collection. If we apply `GroupByKey` to our input collection above, the output collection would look like this: ``` cat, [1,5,9] @@ -61,11 +61,11 @@ PCollection> input = ...; // Apply GroupByKey to the PCollection input. // Save the result as the PCollection reduced. -PCollection>> reduced = mapped.apply(GroupByKey.create()); +PCollection>> reduced = input.apply(GroupByKey.create()); ``` {{end}} {{if (eq .Sdk "python")}} -While all SDKs have a GroupByKey transform, using GroupBy is generally more natural. The `GroupBy` transform can be parameterized by the name(s) of properties on which to group the elements of the PCollection, or a function taking the each element as input that maps to a key on which to do grouping. +While all SDKs have a `GroupByKey` transform, using `GroupBy` is generally more natural. The `GroupBy` transform can be parameterized by the name(s) of properties on which to group the elements of the `PCollection`, or a function that takes each element as input and maps it to a key on which to do grouping. ``` input = ... @@ -77,11 +77,11 @@ grouped_words = input | beam.GroupByKey() If you are using unbounded `PCollections`, you must use either non-global windowing or an aggregation trigger in order to perform a `GroupByKey` or `CoGroupByKey`. This is because a bounded `GroupByKey` or `CoGroupByKey` must wait for all the data with a certain key to be collected, but with unbounded collections, the data is unlimited. Windowing and/or triggers allow grouping to operate on logical, finite bundles of data within the unbounded data streams.
-If you do apply `GroupByKey` or `CoGroupByKey` to a group of unbounded `PCollections` without setting either a non-global windowing strategy, a trigger strategy, or both for each collection, Beam generates an IllegalStateException error at pipeline construction time. +If you do apply `GroupByKey` or `CoGroupByKey` to a group of unbounded `PCollections` without setting either a non-global windowing strategy, a trigger strategy, or both for each collection, Beam generates an `IllegalStateException` error at pipeline construction time. -When using `GroupByKey` or `CoGroupByKey` to group PCollections that have a windowing strategy applied, all of the `PCollections` you want to group must use the same windowing strategy and window sizing. For example, all the collections you are merging must use (hypothetically) identical 5-minute fixed windows, or 4-minute sliding windows starting every 30 seconds. +When using `GroupByKey` or `CoGroupByKey` to group `PCollections` that have a windowing strategy applied, all of the `PCollections` you want to group must use the same windowing strategy and window sizing. For example, all the collections you are merging must use (hypothetically) identical 5-minute fixed windows, or 4-minute sliding windows starting every 30 seconds. -If your pipeline attempts to use `GroupByKey` or `CoGroupByKey` to merge `PCollections` with incompatible windows, Beam generates an IllegalStateException error at pipeline construction time. +If your pipeline attempts to use `GroupByKey` or `CoGroupByKey` to merge `PCollections` with incompatible windows, Beam generates an `IllegalStateException` error at pipeline construction time. 
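Using the word/line-number data from the example above, the grouping that `GroupByKey` performs can be sketched in plain Python (illustration only, not the Beam API):

```python
# Input: (word, line_number) pairs, as in the text-file example above.
pairs = [("cat", 1), ("dog", 5), ("and", 1), ("jump", 3),
         ("tree", 2), ("cat", 5), ("dog", 2), ("and", 2),
         ("cat", 9), ("and", 6)]

# Gather all values that share a key into one list per unique key.
grouped = {}
for word, line in pairs:
    grouped.setdefault(word, []).append(line)

print(grouped["cat"])  # [1, 5, 9]
print(grouped["and"])  # [1, 2, 6]
```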
### Playground exercise @@ -118,7 +118,7 @@ func applyTransform(s beam.Scope, input beam.PCollection) beam.PCollection { {{if (eq .Sdk "java")}} ``` PCollection> input = pipeline - .apply("ParseCitiesToTimeKV", Create.of( + .apply(Create.of( KV.of("banana", 2), KV.of("apple", 4), KV.of("lemon", 3), diff --git a/learning/tour-of-beam/learning-content/core-transforms/map/map-elements/description.md b/learning/tour-of-beam/learning-content/core-transforms/map/map-elements/description.md index 27a07e4849fb..51c38a44b771 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/map/map-elements/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/map/map-elements/description.md @@ -28,7 +28,7 @@ PCollection wordLengths = input.apply( })); ``` -If your `ParDo` performs a one-to-one mapping of input elements to output elements–that is, for each input element, it applies a function that produces exactly one output element, you can use the higher-level `MapElements` transform.MapElements can accept an anonymous Java 8 lambda function for additional brevity. +If your `ParDo` performs a one-to-one mapping of input elements to output elements–that is, for each input element, it applies a function that produces exactly one output element, you can use the higher-level `MapElements` transform. `MapElements` can accept an anonymous Java 8 lambda function for additional brevity. 
Here’s the previous example using `MapElements` : diff --git a/learning/tour-of-beam/learning-content/core-transforms/map/pardo-one-to-one/description.md b/learning/tour-of-beam/learning-content/core-transforms/map/pardo-one-to-one/description.md index e7bdbc6565ed..0f3dd6f580a5 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/map/pardo-one-to-one/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/map/pardo-one-to-one/description.md @@ -264,9 +264,9 @@ wordLengths := beam.ParDo(s, func(word string) int { {{if (eq .Sdk "java")}} ### Accessing additional parameters in your DoFn -In addition to the element and the OutputReceiver, Beam will populate other parameters to your DoFn’s @ProcessElement method. Any combination of these parameters can be added to your process method in any order. +In addition to the element and the `OutputReceiver`, Beam will populate other parameters to your DoFn’s `@ProcessElement` method. Any combination of these parameters can be added to your process method in any order. -**Timestamp**: To access the timestamp of an input element, add a parameter annotated with @Timestamp of type Instant. For example: +**Timestamp**: To access the timestamp of an input element, add a parameter annotated with `@Timestamp` of type `Instant`. For example: ``` .of(new DoFn() { @@ -274,7 +274,7 @@ In addition to the element and the OutputReceiver, Beam will populate other para }}) ``` -**Window**: To access the window an input element falls into, add a parameter of the type of the window used for the input `PCollection`. If the parameter is a window type (a subclass of BoundedWindow) that does not match the input `PCollection`, then an error will be raised. If an element falls in multiple windows (for example, this will happen when using `SlidingWindows`), then the `@ProcessElement` method will be invoked multiple time for the element, once for each window. 
For example, when fixed windows are being used, the window is of type `IntervalWindow`. +**Window**: To access the window an input element falls into, add a parameter of the type of the window used for the input `PCollection`. If the parameter is a window type (a subclass of `BoundedWindow`) that does not match the input `PCollection`, then an error will be raised. If an element falls in multiple windows (for example, this will happen when using `SlidingWindows`), then the `@ProcessElement` method will be invoked multiple times for the element, once for each window. For example, when fixed windows are being used, the window is of type `IntervalWindow`. ``` .of(new DoFn() { @@ -298,7 +298,7 @@ In addition to the element and the OutputReceiver, Beam will populate other para }}) ``` -`@OnTimer` methods can also access many of these parameters. Timestamp, Window, key, `PipelineOptions`, `OutputReceiver`, and `MultiOutputReceiver` parameters can all be accessed in an @OnTimer method. In addition, an `@OnTimer` method can take a parameter of type `TimeDomain` which tells whether the timer is based on event time or processing time. Timers are explained in more detail in the Timely (and Stateful) Processing with Apache Beam blog post. +`@OnTimer` methods can also access many of these parameters. Timestamp, Window, key, `PipelineOptions`, `OutputReceiver`, and `MultiOutputReceiver` parameters can all be accessed in an `@OnTimer` method. In addition, an `@OnTimer` method can take a parameter of type `TimeDomain` which tells whether the timer is based on event time or processing time. Timers are explained in more detail in the [Timely (and Stateful) Processing with Apache Beam blog post](https://beam.apache.org/blog/timely-processing/). {{end}} @@ -401,7 +401,7 @@ class StatefulDoFn(beam.DoFn): You can find the full code of this example in the playground window, which you can run and experiment with. 
-You can work with any type of object.For example String: +You can work with any type of object. For example `String`: {{if (eq .Sdk "go")}} ``` diff --git a/learning/tour-of-beam/learning-content/core-transforms/partition/description.md b/learning/tour-of-beam/learning-content/core-transforms/partition/description.md index 59f6ad5f5c33..e951a38c1750 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/partition/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/partition/description.md @@ -70,7 +70,7 @@ fortieth_percentile = by_decile[4] You can find the full code of this example in the playground window, which you can run and experiment with. -The `applyTransforms` returns a slice of the PCollection, you can access it by index. In this case, we have two `PCollections`, one consists of numbers that are less than 100, the second is more than 100. +The `applyTransforms` returns a slice of the `PCollection`, you can access it by index. In this case, we have two `PCollections`, one consists of numbers that are less than 100, the second is more than 100. You can also divide other types into parts, for example: "strings" and others. diff --git a/learning/tour-of-beam/learning-content/core-transforms/side-inputs/description.md b/learning/tour-of-beam/learning-content/core-transforms/side-inputs/description.md index a64ccde52950..ae5bdffc827d 100644 --- a/learning/tour-of-beam/learning-content/core-transforms/side-inputs/description.md +++ b/learning/tour-of-beam/learning-content/core-transforms/side-inputs/description.md @@ -11,7 +11,7 @@ limitations under the License. --> # Side inputs -In addition to the main input `PCollection`, you can provide additional inputs to a `ParDo` transform in the form of side inputs. A side input is an additional input that your `DoFn` can access each time it processes an element in the input PCollection. 
When you specify a side input, you create a view of some other data that can be read from within the `ParDo` transform’s `DoFn` while processing each element. +In addition to the main input `PCollection`, you can provide additional inputs to a `ParDo` transform in the form of side inputs. A side input is an additional input that your `DoFn` can access each time it processes an element in the input `PCollection`. When you specify a side input, you create a view of some other data that can be read from within the `ParDo` transform’s `DoFn` while processing each element. Side inputs are useful if your `ParDo` needs to inject additional data when processing each element in the input `PCollection`, but the additional data needs to be determined at runtime (and not hard-coded). Such values might be determined by the input data, or depend on a different branch of your pipeline. {{if (eq .Sdk "go")}} @@ -172,7 +172,7 @@ If the side input has multiple trigger firings, Beam uses the value from the lat You can find the full code of this example in the playground window, which you can run and experiment with. -At the entrance we have a map whose key is the city of the country value. And we also have a `Person` structure with his name and city. We can compare cities and embed countries in `Person`. +As input, we have a map whose keys are cities and whose values are countries, along with a `Person` structure with their name and city. We can look up each person's city in the map and embed the matching country in `Person`. You can also use it as a variable for mathematical calculations. 
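The side-input enrichment described above can be sketched in plain Python (the data and field names are hypothetical, and this is not the Beam API): the city-to-country map plays the role of the side-input view that the `DoFn` reads for every main-input element:

```python
# Side input: a small city -> country map, available to every element.
cities_to_countries = {"Paris": "France", "Kyoto": "Japan"}

# Main input: Person records with a name and a city.
people = [
    {"name": "Alice", "city": "Paris"},
    {"name": "Hiro", "city": "Kyoto"},
]

# The "DoFn" reads each main-input element plus the side-input view,
# embedding the matching country in each Person.
enriched = [
    {**person, "country": cities_to_countries.get(person["city"], "unknown")}
    for person in people
]

print(enriched[0]["country"])  # France
```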
diff --git a/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md b/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md index 71abe616f1ad..d7c4cb4137dc 100644 --- a/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md +++ b/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md @@ -53,7 +53,7 @@ When using Java, you must specify your dependency on the Direct Runner in your p #### Set runner -In java, you need to set runner to `args` when you start the program. +In Java, you need to pass the runner in the program `args` when you start it. ``` --runner=DirectRunner