17 changes: 9 additions & 8 deletions BLOG_GUIDE.md
@@ -37,8 +37,6 @@ categories:
meta_keywords: semantic search engine, neural search engine, keyword and natural language search, search relevance, benchmarking tests
meta_description: Learn how to create a semantic search engine in OpenSearch, including architecture and model options, benchmarking tests, and effects of different combination strategies and normalization protocols.
excerpt: In an earlier blog post, we described different ways of building a semantic search engine in OpenSearch. In this post, we'll dive further into the science behind it. We'll discuss the benefits of combining keyword-based search with neural search, the architecture and model options, and benchmarking tests and results. First, we'll provide an overview of our proposed solutions and a summary of the main results. Next, we'll outline the steps for creating a solution and fine-tuning it for your own document corpus. Finally, we'll discuss the effects of different combination strategies and normalization protocols on search relevance.
has_math: true
has_science_table: true
---
```

@@ -54,8 +52,10 @@ Variable | Description
`meta_keywords` | Meta keywords are provided by the marketing team once you put up a PR with the blog post.
`meta_description` | The meta description is provided by the marketing team once you put up a PR with the blog post.
`excerpt` | (Optional) A blog excerpt you want to appear on the [blog front page](https://opensearch.org/blog). If you don't provide this variable, the excerpt will contain the first paragraph of the blog. If you do provide your own excerpt, make sure it does not contain any special Markdown formatting because this formatting is not rendered and is displayed as is. For example, if you surround a word with backticks, the backticks are displayed literally instead of rendering the word in code font.
`has_math` | (Optional) If your blog post contains mathematical formulas, set this variable to `true` so you can use the [MathJax](https://www.mathjax.org/) syntax to render the formulas.
`has_science_table` | (Optional) By default, tables do not render grid lines. To add grid lines to your table, set this variable to `true`.
~~has_math~~ | ~~(Optional) If your blog post contains mathematical formulas, set this variable to `true` so you can use the [MathJax](https://www.mathjax.org/) syntax to render the formulas.~~
~~has_science_table~~ | ~~(Optional) By default, tables do not render grid lines. To add grid lines to your table, set this variable to `true`.~~

*Note:* `has_math` and `has_science_table` are no longer used.

## Title

@@ -81,11 +81,12 @@ If you want to specify the image width or another style, use HTML syntax:
<img src="/assets/media/blog-images/2023-02-13-semantic-search-solutions/vectors.jpg" alt="Similar vectors" width="700"/>
```

To center a Markdown image, specify the `img-centered` class for the image:
~~To center a Markdown image, specify the `img-centered` class for the image:~~

```
![Similar vectors](/assets/media/blog-images/2023-02-13-semantic-search-solutions/vectors.jpg){:class="img-centered"}
```
~~```![Similar vectors](/assets/media/blog-images/2023-02-13-semantic-search-solutions/vectors.jpg){:class="img-centered"}```~~


^ The Markdown convention that Jekyll uses for adding CSS classes causes issues with our publishing process. Please don't use it; use raw HTML instead, as described below.

To center an HTML image, include `class="centered"` in the image tag:

76 changes: 7 additions & 69 deletions README.md
@@ -1,38 +1,24 @@
<img src="https://opensearch.org/assets/img/opensearch-logo-themed.svg" height="64px">

- [OpenSearch.org website](#opensearchorg-website)
- [Getting help](#getting-help)
- [Contributing](#contributing)
- [Adding to the Partners page](#adding-to-the-partners-page)
- [Writing a blog](#writing-a-blog)
- [Building the website](#building-the-website)
- [Testing](#testing)
- [Link checker](#link-checker)
- [Code of Conduct](#code-of-conduct)
- [Security](#security)
- [License](#license)
- [Credits](#credits)
- [Copyright](#copyright)

# OpenSearch.org website

This repo contains the source for the [opensearch.org](https://opensearch.org/) website.
OpenSearch.org is now hosted by the Linux Foundation. This repo contains the source for the [opensearch.org](https://opensearch.org/) website as it existed when it was a statically generated Jekyll site. It is now a content repository rather than a code repository.

## Getting help
PRs are still used for intake and editorial.

If you find a bug, or have a feature request, please don't hesitate to open an issue in this repository.
## Getting help

If you need help and are unsure where to open an issue, try [forums](https://forum.opensearch.org/).
If you find a bug or have a feature request, please don't hesitate to open an issue in this repository. If you need help and are unsure where to open an issue, try the [forums](https://forum.opensearch.org/) or our [Slack instance](https://www.opensearch.org/slack.html).

## Contributing

We welcome contributions! Please see our [CONTRIBUTING](CONTRIBUTING.md) page to learn more about how to contribute to the website.

_Note:_ As of July 20, 2021, contributions are welcome on the `main` branch; the `prod` branch is now protected and holds the finalized version of the site. The `staging` branch has been removed and is no longer being used.

### Adding to the Partners page
### Becoming a Solutions Provider

If you are a partner, you are welcome to add your logo/link to our partners page. Please copy and edit the [sample file](_partners/_sample.md), and submit a pull request.
If you would like to become a Solutions Provider and be listed on our [solutions providers](https://opensearch.org/solutions-providers/) page, [submit your application](https://github.com/opensearch-project/project-website/issues/new?template=solutions_provider_update.md) through GitHub.

### Writing a blog

@@ -52,57 +38,9 @@ For writing guidelines, see the [OpenSearch Project Style Guidelines](https://gi

If you need help, contact [pajuric](https://github.com/pajuric).

### Building the website

This site uses [Jekyll](https://jekyllrb.com/). You can build the site and make it available on a local server via `docker-compose up -d`, or by installing all the dependencies on your local environment as follows (tested to work with Ruby 2.7.2).

1. Install [Ruby](https://www.ruby-lang.org/en/) and [Bundler](https://bundler.io/), then run `bundle install`.
2. Build and start Jekyll with `bundle exec jekyll serve`.
3. Browse the site at [`http://127.0.0.1:4000/`](http://127.0.0.1:4000/).

Alternatively, build the site with `bundle exec jekyll build`. The HTML output is generated into `/_site`. For the full configuration options when running Jekyll, see [this page](https://jekyllrb.com/docs/configuration/options/).

A full site build takes around 20 seconds. To shave off some time, you can build the development version, which omits `sitemap.xml` (the most time-consuming part of the build). The development version takes about 3 seconds to build, so it's great for fast iteration, but its output is very close to, not identical to, what is built in deployment.

```
BUNDLE_GEMFILE=Gemfile-dev bundle exec jekyll serve --config ./_config-dev.yml
```

#### Content Modifier

To automatically mitigate some common security risks, the generated pages are scanned and modified during the build by the [`ContentModifier` plugin](_plugins/content-modifier.rb). Because of its impact on build times, the plugin does not run when developing locally using `jekyll serve`. To force the plugin to run while serving, add the ENV flag `JEKYLL_ALLOW_CONTENT_MODIFIER`. For example:
```sh
JEKYLL_ALLOW_CONTENT_MODIFIER= bundle exec jekyll serve
```

### Search bar invisible pages

To prevent a document from appearing in search results, you can add `omit_from_search: true` to its front matter.
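
As a minimal sketch, the front matter of such a page might look like the following. Only `omit_from_search: true` comes from the text above; the `layout` and `title` values are hypothetical placeholders, not names taken from this repo.

```yaml
---
layout: default              # hypothetical layout name, for illustration only
title: Internal landing page # hypothetical title, for illustration only
omit_from_search: true       # keeps this document out of search results
---
```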


### Testing

#### Link checker

We use a link checker plugin to ensure that we don't have any broken links on the website. It does not run by default because it can slow down the build, especially when running `bundle exec jekyll serve`. To run the link checker, add the ENV flag `JEKYLL_LINK_CHECKER` or `JEKYLL_FATAL_LINK_CHECKER` with one of the valid values: `internal`, `forced`, `all`, or `retry`. Each successive option tests a larger range of links. For example:

```sh
JEKYLL_FATAL_LINK_CHECKER=all bundle exec jekyll build
```

**`JEKYLL_LINK_CHECKER` vs `JEKYLL_FATAL_LINK_CHECKER`**

Both accept the same values. The only difference is that `JEKYLL_FATAL_LINK_CHECKER` fails the build if a broken link is found.

**Env values**
1. **internal**: Validates only the internal links.
2. **forced**: Validates internal links and links that are technically internal but instead lead to an external page, for example, `/docs`.
3. **all**: Validates all links. However, this option does not retry retryable links or follow redirects, for example, HTTP 429 (too many requests, retry after) or HTTP 301 (permanent redirect).
4. **retry**: Validates all links and also retries links that return a retryable HTTP header.

## Code of Conduct

This project has adopted the [Amazon Open Source Code of Conduct](CODE_OF_CONDUCT.md). For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq), or contact [opensource-codeofconduct@amazon.com](mailto:opensource-codeofconduct@amazon.com) with any additional questions or comments.
This project has adopted the [OpenSearch Code of Conduct](CODE_OF_CONDUCT.md). See [NOTICE](https://github.com/opensearch-project/.github/blob/main/NOTICE.txt) for details.

## Security
