Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape semicolons #148

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

izhan
Copy link
Contributor

@izhan izhan commented Jul 22, 2015

I am currently running Stream-Framework with production data, and recently ran into an issue with how stream-framework serialize Aggregated Activities.

In aggregated_activity_serializer.py, we have the line check_reserved(serialized, [';', ';;']) that checks if the serialized values contain semicolons. My use case is creating notifications for when a user (actor) replies (verb) with a comment (object) to another comment (target). Due to the architecture of the current system, I currently include the body of the comment as extra_context to the activity. The comment body may contain semicolons as it is user-input, and a SerializationException is thrown upon serialization.

This PR simply escapes semicolons before deserialization, allowing for semicolons in activities. I've added semicolons in the extra_context of a test activity to verify that it is correctly escaping the semicolons.

@izhan izhan force-pushed the escape_semicolons branch from 7b71a22 to 15b9e88 Compare July 22, 2015 01:56
@joealcorn
Copy link

We've just hit this problem as well, is there any reason this can't be merged in?

@tbarbugli
Copy link
Collaborator

this PR will break existing applications or make it impossible to upgrade without purging/updating all data in Redis.

@izhan
Copy link
Contributor Author

izhan commented Sep 16, 2015

indeed, it's an unlikely edge case, but this will break for any existing activities that serializes into a string that ends with a backslash.

@EvgeneOskin
Copy link

EvgeneOskin commented Apr 28, 2017

Is there any updates? We've hit this problem as well. It stops us from using python3 with this awesome project.

The reason in default pickle version. By default python3 use version 3. But it's incompatible with existed data (at the moment we use python2 😄 ).

In python3

>>> import pickle 
>>> pickle.dumps({'user_id': 12313, 'meta_id':19003}).decode('latin1') 
'\x80\x03}q\x00(X\x07\x00\x00\x00user_idq\x01M\x190X\x07\x00\x00\x00meta_idq\x02M;Ju.'
>>> pickle.dumps({'user_id': 12313, 'meta_id':19003}, 0).decode('latin1') 
'(dp0\nVuser_id\np1\nL12313L\nsVmeta_id\np2\nL19003L\ns.'

In python2

>>> import pickle 
>>> pickle.dumps({'user_id': 12313, 'meta_id':19003})
"(dp0\nS'user_id'\np1\nI12313\nsS'meta_id'\np2\nI19003\ns."

@tbarbugli
Copy link
Collaborator

@EvgeneOskin how is this PR related to pickle output differences between Python 2 and Python 3?

@EvgeneOskin
Copy link

EvgeneOskin commented Apr 28, 2017

As far as I understand this pull request fix a fail when serialized string contains semicolon.

Also, I've solved my issue with custom ActivitySerializer and AggregatedSerializer.

class Base64ActivitySerializer(ActivitySerializer):

    def dumps(self, activity):
        serialized = super(Base64ActivitySerializer, self).dumps(activity)
        return base64.b64encode(serialized.encode('latin1')).decode('latin1')

    def loads(self, serialized):
        data = base64.b64decode(serialized.encode('latin1')).decode('latin1')
        return super(Base64ActivitySerializer, self).loads(data)


class FollowingSerializer(NotificationSerializer):

    activity_serializer_class = Base64ActivitySerializer

To me it would be nice to use json to serialize extra_context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants