The Circuits thread summarizes several abstract structures found in CV models. Branch Specialization in CV neural networks and Weight Banding in the final layers of neural networks have not been directly studied in LLMs, though the findings of several papers (1, 2) may be related.
The task is to study how these abstract structures are represented in CV and NLP models by applying the same inspection techniques to both groups of models. This would help answer the following questions: Are the abstract structures identified in CV models also found in NLP models? How do these structures differ among NLP models?
oserikov changed the title from "Imlement tests for abstract structures such as in Curcuits thread" to "1. Imlement tests for abstract structures such as in Curcuits thread" on Feb 26, 2022
oserikov changed the title from "1. Imlement tests for abstract structures such as in Curcuits thread" to "Imlement tests for abstract structures such as in Curcuits thread" on Feb 26, 2022
duration: scalable, either 175 or 350 hours
mentors: @oserikov, Ryan Teehan
difficulty: medium
requirements:
useful links:
Coding Challenge
Reproduce the core insight of Branch Specialization by demonstrating the presence of branch specialization in a CV model; reproduce the Individual Neurons analysis on a BERT model.
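As a possible starting point for the neuron-level part of the challenge, here is a minimal sketch of one common inspection step: listing the tokens that most strongly activate a single neuron. It assumes the activations have already been extracted from a model layer (for example, from the hidden states that Hugging Face `transformers` returns with `output_hidden_states=True`). The function name `top_activating_tokens` and the toy data are hypothetical, and whether this step matches the referenced Individual Neurons analysis is an assumption.

```python
from typing import List, Sequence, Tuple


def top_activating_tokens(
    activations: Sequence[Sequence[float]],
    tokens: Sequence[str],
    neuron: int,
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k tokens with the highest activation for one neuron.

    activations[i][j] is the activation of neuron j on token i.
    (Hypothetical helper, not part of any library.)
    """
    if len(activations) != len(tokens):
        raise ValueError("need exactly one activation row per token")
    scored = [(tok, row[neuron]) for tok, row in zip(tokens, activations)]
    # Sort by activation, highest first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data standing in for real model activations (4 tokens, 2 neurons).
tokens = ["the", "cat", "sat", "down"]
activations = [
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.5, 0.0],
]
print(top_activating_tokens(activations, tokens, neuron=0, k=2))
```

The helper only needs a table of activations, so the same function could be applied to a CV model by treating each row as an image patch or spatial position instead of a token.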