This repository was archived by the owner on May 9, 2023. It is now read-only.

Search term suggestions #882

Merged
merged 17 commits into from
Jul 27, 2020

danielnaab
Contributor

@danielnaab danielnaab commented Jul 6, 2020

This branch implements a term suggestion drop-down, using a hardcoded list of search terms from search.gov and fuse.js. See #881

There are now three suggestions files:

  • suggestions-raw.json - The curated source list. Only make edits to this file
  • suggestions-counts.json - Includes search.gov results counts for each keyword suggestion in suggestions-raw.json
  • suggestions.json - Final list of filtered suggestions; currently, keywords with fewer than 2 search results are omitted.
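
A minimal sketch of the filtering step described above. It assumes suggestions-counts.json holds records with `term` and `count` fields; the real file layout is not shown in this thread, and the sample data and function name are made up for illustration.

```javascript
// Hypothetical shape for suggestions-counts.json: an array of { term, count }.
// The entries below are illustrative, not real search.gov counts.
const counts = [
  { term: "unemployment", count: 12 },
  { term: "vacine", count: 0 },
  { term: "contact tracing", count: 1 },
  { term: "masks", count: 2 },
];

// Minimum number of search results a keyword needs to stay in the list.
const MIN_RESULTS = 2;

// Keep only terms with at least `minResults` results and return the bare terms.
function filterSuggestions(records, minResults = MIN_RESULTS) {
  return records
    .filter((record) => record.count >= minResults)
    .map((record) => record.term);
}

const suggestions = filterSuggestions(counts);
console.log(JSON.stringify(suggestions)); // ["unemployment","masks"]
```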

Please confirm the following steps are completed:

  • Choose the appropriate target branch:
    • Content
      • preview (approved content) <- Content branch
      • master (production) <- preview
  • Assign an appropriate reviewer:
    • Content admin or Project lead for merge to preview (approved content)
    • Engineer for merge to master (production)

😎 PREVIEW URL

@maomeara63
Contributor

Thoughts:

  1. Typeahead terms should, possibly, be in alphabetical order (or in order it would retrieve if you kept typing).
  2. Bolding should be consistent (if I type "vaccine," it bolds in suggestions 2 and 3, but not one).
  3. We shouldn't suggest misspelled words (e.g., vacine).
  4. We shouldn't suggest anything that leads to No Results (e.g., contact tracing). Routed query results ("We found this instead") are okay, I think.

An observation: I will say that when you search for something like "Church" the first three options have no results. I don't feel like that is getting any closer to a good solution than what we have now.

@debjudy
Contributor

debjudy commented Jul 7, 2020

@maomeara63 Dan and I were discussing building out a curated list instead of extracting terms from search.gov search queries, which is what populates the suggestions now, misspellings and all. We can discuss how we might want to tackle that.

@danielnaab
Contributor Author

fyi the current sort order is based on a calculation of string distance - basically terms that share the largest number of characters with the query are first.
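
To illustrate distance-based ordering in general, here is a small sketch that ranks terms by Levenshtein edit distance to the query. This is a stand-in chosen for simplicity: Fuse.js uses its own scoring, so this is not the library's algorithm, and the function names and sample terms are assumptions.

```javascript
// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Sort a copy of `terms` so the closest strings to `query` come first.
function rankByDistance(query, terms) {
  const q = query.toLowerCase();
  return terms
    .map((term) => ({ term, distance: editDistance(q, term.toLowerCase()) }))
    .sort((x, y) => x.distance - y.distance)
    .map((entry) => entry.term);
}

const ranked = rankByDistance("st", ["stay at home", "stimulus", "asthma", "testing"]);
console.log(ranked.join(", ")); // asthma, testing, stimulus, stay at home
```

In this toy scoring, a short term that merely contains the query can outrank a longer term that starts with it.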

An alternate way to do it is only show terms that start with the characters in the input box (Google does this), but we would need many terms to seed the suggestions with.
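
A rough sketch of the "starts with" alternative, assuming the suggestions are a plain array of strings. The function name, the limit of 5, and the sample terms are illustrative, not taken from the branch.

```javascript
// Return up to `limit` terms that begin with the typed characters,
// in alphabetical order. Matching is case-insensitive.
function prefixSuggestions(query, terms, limit = 5) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return terms
    .filter((term) => term.toLowerCase().startsWith(needle))
    .sort((a, b) => (a < b ? -1 : a > b ? 1 : 0))
    .slice(0, limit);
}

const sampleTerms = ["asthma", "stay at home", "stimulus", "assistance", "students"];
console.log(prefixSuggestions("st", sampleTerms).join(", "));
// stay at home, stimulus, students
```

With a short term list, many inputs would return nothing at all, which is why this approach needs a large seed list.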

@debjudy debjudy linked an issue Jul 7, 2020 that may be closed by this pull request
10 tasks
@debjudy
Contributor

debjudy commented Jul 7, 2020

@danielnaab Meghan, Kim, and I discussed this. We like the "Google" approach of suggesting words. Attached is a cleaned list of words found in search.gov queries. It excludes helper words such as prepositions, articles, pronouns, etc. It is just terms, though, not phrases, so not sure if that is helpful or not.

searchgovWords.txt

@danielnaab
Contributor Author

I moved the search terms into a standalone JSON file - https://github.com/18F/cv_faq/blob/search-term-suggestions/suggestions.json

Would everyone be comfortable making edits to this file? It's currently alphabetized, but the order doesn't matter. There's also test coverage that verifies that it's a valid JSON-formatted file, so there should be no worries about making syntax errors.
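
As an illustration of the kind of check such a test might perform, here is a sketch that validates the file's text. It assumes suggestions.json is a JSON array of non-empty strings; the repository's actual test and file layout are not shown here, and `validateSuggestions` is a hypothetical name.

```javascript
// Parse the text and confirm it is an array of non-empty strings.
// Returns { ok: true, count } on success or { ok: false, reason } on failure.
function validateSuggestions(jsonText) {
  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch (error) {
    return { ok: false, reason: `invalid JSON: ${error.message}` };
  }
  if (!Array.isArray(parsed)) {
    return { ok: false, reason: "expected a JSON array" };
  }
  const badIndex = parsed.findIndex(
    (item) => typeof item !== "string" || item.trim() === ""
  );
  if (badIndex !== -1) {
    return { ok: false, reason: `entry ${badIndex} is not a non-empty string` };
  }
  return { ok: true, count: parsed.length };
}

console.log(validateSuggestions('["masks", "testing"]').ok); // true
console.log(validateSuggestions('["masks", "testing",]').ok); // false (trailing comma)
```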

Cleaned up spelling, removed duplicates and junk entries.
@debjudy
Contributor

debjudy commented Jul 8, 2020

@danielnaab Meghan and I just reviewed the json. Making edits to it looks very manageable. We also looked at how the type ahead is working using the json which also looks good! Made multiple updates to the file this afternoon.

@debjudy
Contributor

debjudy commented Jul 9, 2020

Questions for Dan

  • Why does "anxiety" show up with "can i get"?
  • Is it worth having 3 letter words in the json?
  • Bolding only appears to happen when there is a full word match. Can we get it to bold as it finds matching characters?
  • Can we include apostrophes in the json?
  • HTML is appearing in the search box and in no search results.

…the suggestion template. (This works around a bug in accessible-autocomplete)
…sage of the suggestion template. (This works around a bug in accessible-autocomplete)"

This reverts commit 21d1368.
…arch results.

NOTE: This was initially tried in response to "anxiety" matching the term "can i get".
@danielnaab danielnaab force-pushed the search-term-suggestions branch from 87bfb98 to 80020ee Compare July 9, 2020 21:32
@danielnaab
Contributor Author

@debjudy I removed the input pre-filling, added in the exact search term into the suggestions drop down, and tweaked a search parameter that addresses the "anxiety" match. I think the latter change should be safe, but may want to test a bit to confirm other terms still have good suggestions.
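
One way to picture "the exact search term in the drop-down" is a helper that puts the typed text first and drops any duplicate of it from the matched terms. This is only a sketch under that assumption; `withExactQuery` is a hypothetical name and the branch may do this differently.

```javascript
// Put the user's typed text at the top of the suggestion list,
// removing any suggestion that is the same text (ignoring case).
function withExactQuery(query, suggestions) {
  const typed = query.trim();
  if (typed === "") return suggestions.slice();
  const rest = suggestions.filter(
    (term) => term.toLowerCase() !== typed.toLowerCase()
  );
  return [typed, ...rest];
}

console.log(withExactQuery("mask", ["masks", "Mask", "face mask"]).join(", "));
// mask, masks, face mask
```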

@debjudy
Contributor

debjudy commented Jul 10, 2020

@danielnaab Thanks!

@kimschulke @maomeara63 @jonadecker would you please do some testing on the search suggestions and let us know what you think.

@danielnaab danielnaab marked this pull request as ready for review July 21, 2020 19:56
@debjudy
Contributor

debjudy commented Jul 23, 2020

@danielnaab From testing this morning

  • Searched for "air conditioning", which appeared in the suggestion list, and clicked on the phrase in the list. The search results page stated it was showing 1 of 3, but no results were displayed.

  • None of the terms for routed queries are in the suggestion list, e.g. 1200, dental, asymptomatic. Seems odd that these aren't there.

  • When entering characters in search box, the suggestions are unexpected. For example,
    -- Start with "s" and get suggestions of "ask a question", "assistance", "assisted living"
    -- Start with "st" and first suggestion is "asthma". Next 4 in the list all start with "st"

@danielnaab
Contributor Author

I pushed a change that adds in the routed queries. WRT the search.gov count bug - there's a workaround I could look into adding so that such terms get removed from the suggestion list. WRT the Fuse.js "st" search behavior - I think we'll have to live with that - the library doesn't provide a way to prioritize "starts with" matches without compromising its fuzzy search behavior.
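
For context, one possible post-processing step (not part of this PR, and outside the library itself) would be to re-order an already-ranked result list so that terms starting with the query come first. The sketch below assumes the ranked results are available as an array of strings; `prefixFirst` is a hypothetical helper, and the approach does override part of the fuzzy ordering, which is the trade-off mentioned above.

```javascript
// Stable partition: terms that start with the query keep their relative
// order and move ahead of all other terms, which also keep their order.
function prefixFirst(query, rankedTerms) {
  const needle = query.trim().toLowerCase();
  const starts = [];
  const others = [];
  for (const term of rankedTerms) {
    (term.toLowerCase().startsWith(needle) ? starts : others).push(term);
  }
  return [...starts, ...others];
}

const fuzzyOrder = ["asthma", "stimulus", "testing", "stay at home"];
console.log(prefixFirst("st", fuzzyOrder).join(", "));
// stimulus, stay at home, asthma, testing
```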

danielnaab and others added 4 commits July 23, 2020 12:01
added 80 new terms based upon last 3 weeks of terms searched which yield results.
Update raw list based upon searches from last 3 weeks.
@jonadecker
Contributor

Still/again(?) seeing the "air conditioning" query weirdness Deb described above @danielnaab

Contributor

@jonadecker jonadecker left a comment

👍

@jonadecker jonadecker merged commit f1e8ef7 into preview Jul 27, 2020
@jonadecker jonadecker deleted the search-term-suggestions branch July 27, 2020 15:09

Successfully merging this pull request may close these issues.

Experiment with terms in type ahead
4 participants