Directions forward

@yamins81
There are two main goals:

1) A screening set that is representative of the difficulty in the 1000-way categorization task, for creating a challenge submission

2) A screening set that is representative of the difficult cases in all of ImageNet that humans are nevertheless good at, for getting better neural fits

re 1)
We should use random L3 models (5 sets of features, one from each random model) and find a set of images that is hard to separate on average for the model class. This would mean extracting #N1 images from each synset, then computing the margins for all 2-way tasks for each image. We could then take the mean of each image's negative margins as its score and keep the #N2 lowest-scoring images.
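A minimal sketch of this scoring step, assuming the margins are already available as a NumPy array of shape `(n_models, n_images, n_tasks)`. The array layout, the function names, the pooling of negative margins across the random models, and the score of 0 for images with no negative margin are all assumptions made for illustration, not decisions taken in this thread.

```python
import numpy as np


def hardness_scores(margins):
    """Mean of the negative margins for each image.

    margins: array of shape (n_models, n_images, n_tasks), holding the
    margin of every image in every 2-way task for every random model.
    Negative margins are pooled across models and tasks (an assumption).
    Images with no negative margin get a score of 0 (also an assumption).
    """
    margins = np.asarray(margins, dtype=float)
    negative = margins < 0
    # Sum and count only the negative entries, per image.
    neg_sum = np.where(negative, margins, 0.0).sum(axis=(0, 2))
    neg_count = negative.sum(axis=(0, 2))
    return np.divide(
        neg_sum,
        neg_count,
        out=np.zeros_like(neg_sum),
        where=neg_count > 0,
    )


def select_hardest(scores, n2):
    """Indices of the n2 lowest-scoring (hardest) images, hardest first."""
    return np.argsort(scores, kind="stable")[:n2]
```

With 5 random models, `margins.shape[0]` would be 5, and `select_hardest(hardness_scores(margins), n2)` would give the candidate screening set for goal 1.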

re 2)
We should find the largest negative margins as above, but then test each of these margins in humans. This means that we will have a list of tuples ranked by margin (most negative first):
(image, distractor_synset, margin) 

We will then search through this list of tuples using psychophysics, going down in rank order, to find the first #N2 tuples on which human performance is above some threshold.
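The search could be sketched roughly as follows. `Candidate`, `screen_by_human_performance`, and the `human_performance` callable are hypothetical names; the callable stands in for whatever psychophysics measurement we end up running and is not an existing API.

```python
from typing import Callable, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    image: str
    distractor_synset: str
    margin: float


def screen_by_human_performance(
    candidates: Iterable[Candidate],
    human_performance: Callable[[Candidate], float],
    n2: int,
    threshold: float,
) -> List[Candidate]:
    """Walk the tuples from most negative margin upward and keep the
    first n2 whose measured human performance exceeds the threshold."""
    ranked = sorted(candidates, key=lambda c: c.margin)  # most negative first
    selected: List[Candidate] = []
    for cand in ranked:
        if len(selected) >= n2:
            break
        # human_performance is a placeholder for the psychophysics result.
        if human_performance(cand) > threshold:
            selected.append(cand)
    return selected
```

Stopping as soon as #N2 tuples pass means human testing is only needed for as far down the list as we actually have to go.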

Here are some training curve results for MCC2 classification.
The results for linearsvc are still being calculated (it takes about 210 minutes to generate one of these curves).

![screen shot 2013-09-30 at 5 22 44 pm](https://f.cloud.github.com/assets/2701347/1241139/cdc9188e-2a17-11e3-99a5-c8acd6783cb1.png)

Immediate points of action:
1) Deciding how many images per synset to extract (#N1), then extracting them.
2) Deciding the size of the screening set (#N2)

#N1 seems to be around 400 given the training curve (saturation around 300-350, need 50-100 test examples).

If you agree with this decision for #N1, then I will create a new dataset called `PixelHardSynsets`, which you should then extract:

``` python
import imagenet

# Proposed dataset; it will only exist once #N1 is agreed on.
dataset = imagenet.dldataset.PixelHardSynsets
```

