Support for functional localization #240
for stimuli_idx in range(3, 14):
    data["sent"] += " " + data[f"stim{stimuli_idx}"].apply(str.lower)
what does this do? add comment
added
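For readers following the thread, here is a minimal sketch of what the loop appears to do: it appends the lowercased words from columns `stim3` through `stim13` to the `sent` column, separated by single spaces. The toy frame and the initialization of `sent` from `stim2` are assumptions for illustration and are not taken from the PR.

```python
import pandas as pd

# Hypothetical two-row frame; the real stimulus file is assumed to hold
# one word per column, named stim2 ... stim13.
data = pd.DataFrame({f"stim{i}": [f"W{i}", f"X{i}"] for i in range(2, 14)})

# Assumption: "sent" starts out as the first word, lowercased.
data["sent"] = data["stim2"].apply(str.lower)

# Append the remaining words, lowercased and space-separated,
# so each row ends up holding one full stimulus string.
for stimuli_idx in range(3, 14):
    data["sent"] += " " + data[f"stim{stimuli_idx}"].apply(str.lower)

first_sentence = data["sent"].iloc[0]
```

Each row of `sent` then holds the whole twelve-word stimulus as one lowercase string.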
from brainscore_language import load_dataset

BRAINIO_CACHE = os.environ.get("BRAINIO", f"{Path.home()}/.brainio")
os.environ["TOKENIZERS_PARALLELISM"] = "False"
comment why this is necessary
added
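A sketch of the likely rationale, under assumptions: the Hugging Face `tokenizers` library reads the `TOKENIZERS_PARALLELISM` environment variable, and disabling it avoids the fork-related warning (and potential deadlock) that can appear when tokenization runs alongside multi-process data loading. Whether that is the author's reason is not stated in this thread. The lowercase `"false"` and the `pathlib` join below are stylistic choices, not the PR's code.

```python
import os
from pathlib import Path

# Fall back to ~/.brainio when the BRAINIO environment variable is unset.
BRAINIO_CACHE = os.environ.get("BRAINIO", str(Path.home() / ".brainio"))

# Assumed reason: turn off the tokenizers thread pool so that forked
# worker processes do not trigger the parallelism warning. This should
# be set before the tokenizer is first used.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```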
class Fed10_langlocDataset(Dataset):
    def __init__(self):
        self.num_samples = 240
where is this being used?
line #103 in the extract_representations function
ok I'm not actually sure what this does -- looks like it's just used to zero-fill layer_name (??)
Could this not also be derived from self.sentences?
final_layer_representations = {
    "sentences": {layer_name: np.zeros((langloc_dataset.num_samples, hidden_dim)) for layer_name in layer_names},
    "non-words": {layer_name: np.zeros((langloc_dataset.num_samples, hidden_dim)) for layer_name in layer_names}
}
replaced langloc_dataset.num_samples with len(langloc_dataset.sentences)
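The change agreed on here can be sketched as follows: the buffer size comes from the data itself rather than from a hard-coded count. `ToyLanglocDataset`, the example strings, `layer_names`, and `hidden_dim` are hypothetical placeholders, not the PR's actual values.

```python
import numpy as np


class ToyLanglocDataset:
    """Hypothetical stand-in for the PR's dataset class."""

    def __init__(self, sentences, non_words):
        self.sentences = sentences
        self.non_words = non_words


langloc_dataset = ToyLanglocDataset(
    sentences=["the dog ran", "a cat sat", "birds fly south"],
    non_words=["blik nar fep", "zorp tel mib", "quaf lin dor"],
)
layer_names = ["layer.0", "layer.1"]  # placeholder layer identifiers
hidden_dim = 8  # placeholder hidden size

# Derive the row count from the data so it cannot drift out of sync.
num_samples = len(langloc_dataset.sentences)

# One zero-filled (num_samples, hidden_dim) buffer per condition and layer.
final_layer_representations = {
    condition: {name: np.zeros((num_samples, hidden_dim)) for name in layer_names}
    for condition in ("sentences", "non-words")
}
```

This assumes both conditions contain the same number of items, as the original single `num_samples` value implied.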
Users can now perform functional localization as described in "Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network".

Changes:
- The localization dataset lives in `data/fedorenko2010_localization` and can be loaded via the `data_registry`.
- A localization script, `model_helpers/localize`, that computes the language mask according to the paper mentioned above.
- `brainio`
- The `HuggingfaceSubject` class was adapted to extract activations from multiple layers at once and to make use of the localization script if the `use_localizer` flag is set to True. This extracts only the language-selective units from all the activations.

Usage:
- `examples/score_localization`
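To make the idea of a language mask concrete, here is a minimal sketch of one way such a mask could be computed: contrast each unit's activations on sentences against non-words and keep the units with the largest t-values. This is not the code in `model_helpers/localize`. The helper name `language_mask`, the use of Welch's t-statistic, the `top_k` parameter, and the synthetic activations are all assumptions for illustration.

```python
import numpy as np


def language_mask(sentence_acts, non_word_acts, top_k):
    """Boolean mask over units (hypothetical helper, not the PR's API).

    Both inputs have shape (num_stimuli, num_units).
    """
    n_s, n_n = len(sentence_acts), len(non_word_acts)
    mean_diff = sentence_acts.mean(axis=0) - non_word_acts.mean(axis=0)
    # Welch's t-statistic per unit (unequal variances allowed).
    std_err = np.sqrt(
        sentence_acts.var(axis=0, ddof=1) / n_s
        + non_word_acts.var(axis=0, ddof=1) / n_n
    )
    t_values = mean_diff / std_err
    # Keep the top_k units that respond most strongly to sentences.
    top_units = np.argsort(t_values)[-top_k:]
    mask = np.zeros(t_values.shape, dtype=bool)
    mask[top_units] = True
    return mask


# Synthetic activations: the first 5 of 50 units prefer sentences.
rng = np.random.default_rng(0)
sentence_acts = rng.normal(size=(240, 50))
sentence_acts[:, :5] += 3.0
non_word_acts = rng.normal(size=(240, 50))

mask = language_mask(sentence_acts, non_word_acts, top_k=5)
selected = sentence_acts[:, mask]  # keep only the selected units
```

Applying the mask to a layer's activations is then a plain boolean column selection, which matches the description of keeping only the language-selective units.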