CellPhoneDB v5 (update and is_ppi flag) #269

Open
dbdimitrov opened this issue Nov 28, 2023 · 14 comments
Labels
bug Problem in the code update Major changes

Comments

@dbdimitrov
Collaborator

Hey Denes,

Recently, CellPhoneDB got bumped to v5, and the data is stored here:
https://github.com/ventolab/cellphonedb-data/tree/master

Seems to have changed format from:

'cellphonedb': {

Let me know if I can help with this.
Daniel

@dbdimitrov added the labels bug (Problem in the code), enhancement (Adds a new feature/information), update (Major changes) and removed the labels bug (Problem in the code), enhancement (Adds a new feature/information) on Nov 28, 2023
@dbdimitrov
Collaborator Author

@deeenes, please also use the is_ppi flag: I found a lot of erroneous interactions between enzymes and receptors (with no metabolite involved).

@dbdimitrov
Collaborator Author

Maybe how I process it here would help:

saezlab/liana-py#60

@npalacioescat
Member

Hi Daniel,

As far as I could find, pypath is already using the CellPhoneDB git repository as the source for the data (see here and then here), so I think it is already using the v5 version of the data.

What I found out now when checking this, is that although the retrieval of interactions works fine:

> from pypath.inputs import cellphonedb
> list(cellphonedb.cellphonedb_interactions())[-1]
CellphonedbInteraction(id_a='P16070', id_b='O43914', sources='CellPhoneDB', references='', interaction_type='unknown-unknown', type_a='unknown', type_b='unknown')

When you try to retrieve the ligand-receptor interactions it returns a tuple of empty sets:

> cellphonedb.cellphonedb_ligands_receptors()
(set(), set())

This seems to be an issue in how the complex annotations were being imported, which caused the ligand/receptor attributes to all be labeled as False. I think I fixed it in #279.

Regarding the use of is_ppi flag, seems a bit more complex to implement (and I wouldn't want to break anything), so maybe we can discuss in person and I could try to take a look into it, or we can wait for @deeenes to come back 😅

Since this should resolve your initial question, I'll close the issue and we can discuss the is_ppi thing later :)

Best

@dbdimitrov
Collaborator Author

@Nic-Nic thanks Nico. Though I would say the is_ppi flag is crucial, since there are now a lot of enzyme-enzyme interactions imported as ligand-receptor interactions 😅

@dbdimitrov dbdimitrov reopened this Feb 15, 2024
@dbdimitrov dbdimitrov changed the title from "CellPhoneDB v5" to "CellPhoneDB v5 (update and is_ppi flag)" Feb 15, 2024
@dbdimitrov
Collaborator Author

I renamed the issue and reopened since the two comments are tied. The flag was introduced along with the update of the database. 🙂

@dbdimitrov
Collaborator Author

PS. Also, there is no need to implement the flag; it's simply about setting it to False when the resource is obtained. We don't want to include those interactions, and I can think of only limited use for them even if we did.

npalacioescat pushed a commit that referenced this issue Feb 15, 2024
Added the `is_ppi` flag from the interaction database of CellPhoneDB
Fixes #269
@npalacioescat
Member

Added the flag to the import method of the interactions database from CellPhoneDB (see #281). The decision on whether to filter out the False ones or not, is more for @deeenes to take 😅
Since the flag is now there (once the PR is merged), you can easily then apply the filter in your code if you deem it necessary :)

@dbdimitrov
Collaborator Author

Hey Nico, thanks a lot.

I think it should definitely be False by default, or at least the clients should have it as False if possible, though that might be more work.

In short, they assume that the last production enzyme of a metabolite in one cell type and a receptor/enzyme in another cell type translate to a metabolite-receptor interaction. I think that's too specific to be pulled by default as ligand-receptor interactions by the clients :)

deeenes added a commit that referenced this issue Feb 16, 2024
…ropagating it to the web service (part of addressing #269)
@dbdimitrov dbdimitrov reopened this Feb 20, 2024
@dbdimitrov
Collaborator Author

Hey @deeenes @Nic-Nic,

It seems to me that the solution we discussed yesterday for liana, i.e. access the databases via the client, will not work if we don't filter the non-ppis here.

These non-PPIs are in any case incorporated into MetalinksDB, so for our use cases we don't need them.

So, I'm re-opening the issue. Let me know if you want me to add the line that filters the dataframe.

Daniel

@deeenes
Member

deeenes commented Feb 20, 2024

Hey @dbdimitrov, you're right, having the attribute itself doesn't result in the removal of those interactions. We need two little things:

  1. This is one of the few tasks that belongs to the scope of integration (between OmniPath & LIANA), so there should be one line either in LIANA or in the omnipath Python client that makes sure interactions with is_ppi=False are removed;
  2. In the OmniPath network dataset definitions, the non-PPI interactions (is_ppi=False) should go into a separate dataset, definitely not into the ligand-receptor one (this makes the previous point redundant, but better to be safe; it doesn't cost anything).

We'll soon take care of these
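
The two points above could be sketched roughly as follows. This is only an illustration on a made-up table: the column names `source`, `target` and `is_ppi`, the dataset names, and the helper `drop_non_ppi` are all assumptions for the example, not the actual OmniPath or LIANA code.

```python
import pandas as pd

# Made-up interaction table; column names are assumptions for illustration.
interactions = pd.DataFrame({
    "source": ["P1", "P2", "P3", "P4"],
    "target": ["R1", "R2", "R3", "R4"],
    "is_ppi": [True, False, True, False],
})

# Point 2: keep the non-PPI records in their own dataset, separate from
# the ligand-receptor one (dataset names are hypothetical).
datasets = {
    "ligand_receptor": interactions[interactions["is_ppi"]],
    "non_ppi": interactions[~interactions["is_ppi"]],
}


def drop_non_ppi(df: pd.DataFrame) -> pd.DataFrame:
    """Point 1: one-line safety filter on the integration side (hypothetical helper)."""
    if "is_ppi" not in df.columns:
        # Nothing to filter on; return the table unchanged.
        return df
    return df[df["is_ppi"]]


# Applying the guard to an already clean dataset changes nothing,
# which is why doing both steps is harmless.
safe = drop_non_ppi(datasets["ligand_receptor"])
```

With the dataset split in place the guard is redundant, but it still protects a client that receives an unsplit table.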

@dbdimitrov
Copy link
Collaborator Author

Ping @deeenes, it will become time sensitive very soon 😄

@dbdimitrov
Collaborator Author

@deeenes 👀

@and-rewsmith

@deeenes I am curious, if one is to use cellphonedb via Liana, what is the effect of this is_ppi not functioning correctly? Will we see erroneous interactions between enzymes and receptors?

It is unclear to me because it seems like an integration between omnipath and liana, and this issue may have been addressed in the liana repo since this discussion.

@dbdimitrov
Collaborator Author

Hi @and-rewsmith,

The flag still hasn't been implemented... If you wish to use liana with CellPhoneDBv5, you can format CellPhoneDB with:

import io

import pandas as pd
import requests

# download the interaction table from
# https://github.com/ventolab/cellphonedb-data/blob/master/data/interaction_input.csv
url = 'https://raw.githubusercontent.com/ventolab/cellphonedb-data/master/data/interaction_input.csv'
response = requests.get(url)
# fail early if the download did not succeed
response.raise_for_status()
resource = pd.read_csv(io.StringIO(response.content.decode('utf-8')), sep=',')
# keep only PPIs; copy so the assignments below do not trigger SettingWithCopyWarning
resource = resource[resource['is_ppi']][['interactors']].copy()
# replace + with _
resource['interactors'] = resource['interactors'].str.replace('+', '_', regex=False)
# if interactors contains two '-', replace the first one with '&'
resource['interactors'] = resource['interactors'].apply(lambda x: x.replace('-', '&', 1) if x.count('-') == 2 else x)
# split by - and expand into two columns
resource = resource['interactors'].str.split('-', expand=True)
# replace & with - in the first column
resource[0] = resource[0].str.replace('&', '-', regex=False)
resource.columns = ['ligand', 'receptor']

and then feed it to the resource parameter of any liana method.
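
To see what the string handling in the snippet above does without downloading anything, here is a small offline sketch of the same parsing steps. The function name `split_interactors` and the three toy strings are made up for illustration; they are not rows from the CellPhoneDB file.

```python
import pandas as pd


def split_interactors(interactors: pd.Series) -> pd.DataFrame:
    """Split 'A-B' interactor strings into ligand and receptor columns."""
    # complex subunits are joined with '+'; switch to '_'
    s = interactors.str.replace("+", "_", regex=False)
    # with exactly two '-', treat the first one as part of the first name
    s = s.apply(lambda x: x.replace("-", "&", 1) if x.count("-") == 2 else x)
    parts = s.str.split("-", expand=True)
    # restore the protected hyphen in the first column
    parts[0] = parts[0].str.replace("&", "-", regex=False)
    parts.columns = ["ligand", "receptor"]
    return parts.reset_index(drop=True)


# Toy strings (hypothetical names, not real CellPhoneDB entries)
toy = pd.Series(["LIG1-REC1", "LIG-2-REC2", "LIG3-REC3A+REC3B"])
pairs = split_interactors(toy)
```

Note that this heuristic assumes that, when a string has two hyphens, the hyphenated name is the first partner; a hyphenated second partner would be split in the wrong place.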

Hope this helps!
