⚡️ Speed up method `ModelToComponentFactory._create_async_job_status_mapping` by 14% in PR #45178 (`async-job/cdk-release`) #45344

codeflash-ai · 2024-09-09T16:45:07Z

⚡️ This pull request contains optimizations for PR #45178

If you approve this dependent PR, these changes will be merged into the original PR branch async-job/cdk-release.

This PR will be automatically closed if the original PR is merged.

📄 `ModelToComponentFactory._create_async_job_status_mapping()` in `airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py`

📈 Performance improved by 14% (0.14x faster)

⏱️ Runtime went down from 3.33 milliseconds to 2.91 milliseconds
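
The 14% figure and the two runtimes above agree if the speedup is measured relative to the optimized runtime. That definition is an assumption about how the bot reports speedups, not something this PR states. A quick check of the arithmetic:

```python
# Runtimes reported in this PR, in milliseconds.
original_ms = 3.33
optimized_ms = 2.91

# Assumed definition: time saved divided by the optimized runtime.
speedup = (original_ms - optimized_ms) / optimized_ms

# Alternative view: fraction of the original runtime that was removed.
reduction = (original_ms - optimized_ms) / original_ms

print(f"speedup: {speedup:.1%}, runtime reduction: {reduction:.1%}")
```

Under the first definition the result rounds to 14%. Measured against the original runtime, the saving is closer to 12.6%.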

Explanation and details

The optimization restructures the internal status handling and removes redundant function calls to reduce runtime.

Changes made:

1. **Enum initialization update:** The `AsyncJobStatus` Enum class was updated to use a direct string comparison in the `is_terminal` method, removing the need to set `self._value` and `self._is_terminal` at initialization.
2. **Conditional optimization:** The `_get_async_job_status` method was changed to use `if/elif/else` instead of `match/case` for quicker evaluation.
3. **Intermediate `.dict()` handling:** The model's `.dict()` method is now called once, outside the loop, to avoid redundant calls.
4. **Structured logic for better readability:** The logic was simplified and streamlined for readability and faster processing.
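
The diff itself is not reproduced in this conversation, so the following is only a sketch of what the first two changes describe. `JobStatus` and `get_job_status` are hypothetical stand-ins for `AsyncJobStatus` and `_get_async_job_status`, and the member names and values are assumptions.

```python
from enum import Enum


class JobStatus(Enum):
    """Hypothetical stand-in for AsyncJobStatus; members are assumed."""

    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    TIMED_OUT = "TIMED_OUT"

    def is_terminal(self) -> bool:
        # Direct string comparison: every state except RUNNING is final,
        # so no per-member flag has to be stored in __init__.
        return self.value != "RUNNING"


def get_job_status(cdk_status: str) -> JobStatus:
    """Translate a CDK status key into an enum member with if/elif/else."""
    if cdk_status == "running":
        return JobStatus.RUNNING
    elif cdk_status == "completed":
        return JobStatus.COMPLETED
    elif cdk_status == "failed":
        return JobStatus.FAILED
    elif cdk_status == "timeout":
        return JobStatus.TIMED_OUT
    else:
        raise ValueError(f"Unsupported CDK status {cdk_status}")
```

Whether an `if/elif/else` chain actually beats `match/case` for four string literals depends on the interpreter version, so the gain is worth measuring rather than assuming.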

These changes reduce runtime by cutting the number of operations performed and by handling enums more efficiently.
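
As a rough illustration of the "call `.dict()` once" change, the sketch below builds an API-status to CDK-status mapping from a model-like object. `StatusMapModel`, `CDK_STATUS_NAMES` and `build_status_mapping` are hypothetical names, plain strings stand in for the enum members, and skipping a `type` key is an assumption about the serialized model.

```python
from typing import Dict, List, Mapping

# Hypothetical lookup from CDK status key to a status name.
CDK_STATUS_NAMES: Mapping[str, str] = {
    "running": "RUNNING",
    "completed": "COMPLETED",
    "failed": "FAILED",
    "timeout": "TIMED_OUT",
}


class StatusMapModel:
    """Hypothetical stand-in for the pydantic AsyncJobStatusMap model."""

    def __init__(self, running: List[str], completed: List[str],
                 failed: List[str], timeout: List[str]) -> None:
        self.running = running
        self.completed = completed
        self.failed = failed
        self.timeout = timeout

    def dict(self) -> Dict[str, object]:
        # Mimics a serialized model, including a non-status "type" entry.
        return {
            "type": "AsyncJobStatusMap",
            "running": list(self.running),
            "completed": list(self.completed),
            "failed": list(self.failed),
            "timeout": list(self.timeout),
        }


def build_status_mapping(model: StatusMapModel) -> Dict[str, str]:
    """Map each API status string to the CDK status it belongs to."""
    model_fields = model.dict()  # serialize once, before iterating
    api_to_cdk: Dict[str, str] = {}
    for cdk_status, api_statuses in model_fields.items():
        if cdk_status == "type":
            continue  # assumed metadata key, not a status
        for api_status in api_statuses:
            api_to_cdk[api_status] = CDK_STATUS_NAMES[cdk_status]
    return api_to_cdk
```

Calling `build_status_mapping(StatusMapModel(["api_running"], ["api_completed"], [], []))` yields one entry per API status, keyed by the API string.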

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 17 Passed − 🌀 Generated Regression Tests


# imports
from __future__ import annotations

import pytest  # used for our unit tests
from airbyte_cdk.sources.declarative.async_job.status import AsyncJobStatus
from airbyte_cdk.sources.declarative.models.declarative_component_schema import \
    AsyncJobStatusMap as AsyncJobStatusMapModel
from airbyte_cdk.sources.declarative.parsers.model_to_component_factory import \
    ModelToComponentFactory


# unit tests

@pytest.fixture
def factory():
    # Provide a fresh factory instance to each test
    return ModelToComponentFactory()

def test_standard_mapping(factory):
    model = AsyncJobStatusMapModel(running=['api_running'], completed=['api_completed'], failed=['api_failed'], timeout=['api_timeout'])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_empty_lists(factory):
    model = AsyncJobStatusMapModel(running=[], completed=['api_completed'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation


def test_unsupported_cdk_status(factory):
    # An unknown CDK status key should be rejected with a descriptive error
    with pytest.raises(ValueError, match="Unsupported CDK status unknown_status"):
        factory._get_async_job_status('unknown_status')
    # Outputs were verified to be equal to the original implementation


def test_multiple_api_statuses_per_cdk_status(factory):
    model = AsyncJobStatusMapModel(running=['api_running1', 'api_running2'], completed=['api_completed1', 'api_completed2'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_mixed_case_sensitivity(factory):
    model = AsyncJobStatusMapModel(running=['Api_Running'], completed=['API_COMPLETED'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_large_number_of_api_statuses(factory):
    model = AsyncJobStatusMapModel(running=[f'api_running{i}' for i in range(1000)], completed=[f'api_completed{i}' for i in range(1000)], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    expected_result = {f'api_running{i}': AsyncJobStatus.RUNNING for i in range(1000)}
    expected_result.update({f'api_completed{i}': AsyncJobStatus.COMPLETED for i in range(1000)})
    # Compare against the expected mapping built above
    assert codeflash_output == expected_result
    # Outputs were verified to be equal to the original implementation


def test_invalid_model_type(factory):
    config = {}
    with pytest.raises(AttributeError):
        factory._create_async_job_status_mapping({'running': ['api_running'], 'completed': ['api_completed']}, config)
    # Outputs were verified to be equal to the original implementation

def test_minimal_valid_input(factory):
    model = AsyncJobStatusMapModel(running=['api_running'], completed=[], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_maximal_valid_input(factory):
    model = AsyncJobStatusMapModel(running=['api_running'], completed=['api_completed'], failed=['api_failed'], timeout=['api_timeout'])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation




def test_empty_string_as_status(factory):
    model = AsyncJobStatusMapModel(running=[''], completed=['api_completed'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_whitespace_strings_as_status(factory):
    model = AsyncJobStatusMapModel(running=[' '], completed=['api_completed'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_special_characters_in_status(factory):
    model = AsyncJobStatusMapModel(running=['api_running!@#'], completed=['api_completed$%^'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_extremely_long_status_strings(factory):
    model = AsyncJobStatusMapModel(running=['a' * 1000], completed=['b' * 1000], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_unicode_characters_in_status(factory):
    model = AsyncJobStatusMapModel(running=['api_运行'], completed=['api_完成'], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_model_with_only_type_key(factory):
    model = AsyncJobStatusMapModel(type='AsyncJobStatusMap', running=[], completed=[], failed=[], timeout=[])
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

def test_model_with_additional_unrecognized_keys(factory):
    model = AsyncJobStatusMapModel(running=['api_running'], completed=['api_completed'], failed=[], timeout=[])
    # Note: model.dict() returns a copy, so this extra key never reaches the factory
    model_dict = model.dict()
    model_dict['extra_key'] = ['extra_status']
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation


def test_model_with_duplicate_cdk_statuses(factory):
    model = AsyncJobStatusMapModel(running=['api_running'], completed=['api_completed'], failed=['api_failed'], timeout=['api_timeout'])
    # Note: this overwrites a key on a copy; the model passed to the factory is unchanged
    model_dict = model.dict()
    model_dict['running'] = ['api_running2']
    config = {}
    codeflash_output = factory._create_async_job_status_mapping(model, config)
    # Outputs were verified to be equal to the original implementation

🔘 (none found) − ⏪ Replay Tests


vercel · 2024-09-09T16:45:09Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment

Name	Status	Preview	Comments	Updated (UTC)
airbyte-docs	⬜️ Ignored (Inspect)	Visit Preview		Sep 9, 2024 4:45pm

CLAassistant · 2024-09-09T16:45:13Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

codeflash-ai · 2024-09-10T12:59:17Z

This PR has been automatically closed because the original PR #45178 by maxi297 was closed.

codeflash-ai bot added the "⚡️ codeflash" label (Optimization PR opened by Codeflash AI) Sep 9, 2024

codeflash-ai bot mentioned this pull request Sep 9, 2024

feat(cdk): add async job components #45178

Merged

2 tasks

octavia-squidington-iii added the "CDK" (Connector Development Kit) and "community" labels Sep 9, 2024

marcosmarxm removed the community label Sep 9, 2024

codeflash-ai bot closed this Sep 10, 2024

codeflash-ai bot deleted the codeflash/optimize-pr45178-2024-09-09T16.44.58 branch September 10, 2024 12:59

Base automatically changed from async-job/cdk-release to master September 10, 2024 12:59
