diff --git a/docs/api/configuration.md b/docs/api/configuration.md
index 6ce2a167f..5059e4679 100644
--- a/docs/api/configuration.md
+++ b/docs/api/configuration.md
@@ -19,7 +19,7 @@ embeddings:
  content: true
```

-Two top level settings are available to control where indexes are saved and if an index is a read-only index.
+Three top level settings are available to control where indexes are saved, whether an index is read-only and how indexes sync with cloud storage.

### path
```yaml
path: string
```
@@ -35,6 +35,9 @@ writable: boolean

Determines if the input embeddings index is writable (true) or read-only (false). This allows serving a read-only index.

+### cloud
+[Cloud storage settings](../../embeddings/configuration#cloud) can be set under a `cloud` top level configuration group.
+
## Pipeline

Pipelines are loaded as top level configuration parameters. Pipeline names are automatically detected in the YAML configuration and created upon startup. All [pipelines](../../pipeline) are supported.

diff --git a/docs/cloud.md b/docs/cloud.md
index 1fda2af11..750bb99ea 100644
@@ -39,7 +39,7 @@ docker build -t txtai --build-arg GPU=1 .
docker build -t txtai --build-arg COMPONENTS= .
```

-## Cache models in container images
+## Container image model caching

As mentioned previously, model caching is recommended to reduce container start times. The following commands demonstrate this. In all cases, it is assumed a config.yml file is present in the local directory with the desired configuration set.

@@ -100,3 +100,64 @@ docker build -t txtai-workflow .
# GPU build
docker build -t txtai-workflow --build-arg BASE_IMAGE=neuml/txtai-gpu .
```
+
+## Serverless Compute
+
+One of the most powerful features of txtai is building YAML-configured applications with the "build once, run anywhere" approach. API instances and workflows can run locally, on a server, on a cluster or serverless.
+
+Serverless instances of txtai are supported with frameworks such as [AWS SAM](https://github.com/aws/serverless-application-model) and [Serverless](https://github.com/serverless/serverless).
+
+The following steps show a basic example of how to spin up a serverless API instance with AWS SAM.
+
+- Create config.yml and template.yml
+
+```yaml
+# config.yml
+writable: true
+
+embeddings:
+  path: sentence-transformers/nli-mpnet-base-v2
+  content: true
+```
+
+```yaml
+# template.yml
+Resources:
+  txtai:
+    Type: AWS::Serverless::Function
+    Properties:
+      PackageType: Image
+      MemorySize: 3000
+      Timeout: 20
+      Events:
+        Api:
+          Type: Api
+          Properties:
+            Path: "/{proxy+}"
+            Method: ANY
+    Metadata:
+      Dockerfile: Dockerfile
+      DockerContext: ./
+      DockerTag: api
+```
+
+- Install [AWS SAM](https://pypi.org/project/aws-sam-cli/)
+
+- Run the following
+
+```bash
+# Get Dockerfile and application
+wget https://raw.githubusercontent.com/neuml/txtai/master/docker/aws/api.py
+wget https://raw.githubusercontent.com/neuml/txtai/master/docker/aws/Dockerfile
+
+# Build the Docker image
+sam build
+
+# Start API gateway and Lambda instance locally
+sam local start-api -p 8000 --warm-containers LAZY
+
+# Verify the instance is running (should return 0)
+curl http://localhost:8000/count
+```
+
+If successful, a local API instance is now running in a "serverless" fashion. This configuration can be deployed to AWS using SAM. [See this link for more information.](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-deploy.html)
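+
+To push the same configuration to an AWS account, the standard SAM deploy flow applies. A minimal sketch (the guided prompts for stack name, region and image repository depend on the account):
+
+```bash
+# Package and deploy the stack to AWS, prompting for deployment settings
+sam deploy --guided
+```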
diff --git a/docs/embeddings/configuration.md b/docs/embeddings/configuration.md
index 623e6b33e..6bdf49c02 100644
--- a/docs/embeddings/configuration.md
+++ b/docs/embeddings/configuration.md
@@ -1,32 +1,35 @@
# Configuration

-## method
+## Embeddings
+The following describes the available embeddings configuration. These parameters are set via the [Embeddings constructor](../methods#txtai.embeddings.base.Embeddings.__init__).
+
+### method
```yaml
method: transformers|sentence-transformers|words|external
```

Sentence embeddings method to use. Options listed below.

-### transformers
+#### transformers

Builds sentence embeddings using a transformers model. While this can be any transformers model, it works best with [models trained](https://huggingface.co/models?pipeline_tag=sentence-similarity) to build sentence embeddings.

-### sentence-transformers
+#### sentence-transformers

Same as transformers but loads models with the sentence-transformers library.

-### words
+#### words

Builds sentence embeddings using a word embeddings model.

-### external
+#### external

Sentence embeddings are loaded via an external model or API. Requires setting the `transform` parameter to a function that translates data into vectors.

If not provided, the method is inferred using the _path_. sentence-transformers and words require the [similarity](../../install/#similarity) extras package to be installed.

-## path
+### path
```yaml
path: string
```
@@ -34,7 +37,7 @@ path: string

Sets the path for a vectors model. When using a transformers/sentence-transformers model, this can be any model on the [Hugging Face Model Hub](https://huggingface.co/models) or a local file path. Otherwise, it must be a local file path to a word embeddings model.

-## backend
+### backend
```yaml
backend: faiss|hnsw|annoy
```
@@ -44,7 +47,7 @@ Approximate Nearest Neighbor (ANN) index backend for storing generated sentence embeddings.

Backend-specific settings are set with a corresponding configuration object having the same name as the backend (i.e. annoy, faiss, or hnsw). None of these are required and are set to defaults if omitted.

-### faiss
+#### faiss
```yaml
faiss:
  components: Comma separated list of components - defaults to "Flat" for small
@@ -61,7 +64,7 @@

See the following Faiss documentation links for more information.

- [Index Factory](https://github.com/facebookresearch/faiss/wiki/The-index-factory)
- [Search Tuning](https://github.com/facebookresearch/faiss/wiki/Faster-search)

-### hnsw
+#### hnsw
```yaml
hnsw:
  efconstruction: ef_construction param for init_index (int) - defaults to 200
@@ -72,7 +75,7 @@ hnsw:

See [Hnswlib documentation](https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md) for more information on these parameters.

-### annoy
+#### annoy
```yaml
annoy:
  ntrees: number of trees (int) - defaults to 10
@@ -81,14 +84,14 @@

See [Annoy documentation](https://github.com/spotify/annoy#full-python-api) for more information on these parameters. Note that Annoy indexes cannot be modified after creation; upserts/deletes and other modifications are not supported.

-## content
+### content
```yaml
content: string|boolean
```

Enables content storage. When true, the default content storage engine will be used. Otherwise, the string must specify the supported content storage engine to use.
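
Putting the settings above together, a minimal embeddings configuration might look like the following sketch (the model and backend choices are examples only; `method` is omitted since it is inferred from the path):

```yaml
path: sentence-transformers/nli-mpnet-base-v2
backend: faiss
content: true
```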
-## quantize
+### quantize
```yaml
quantize: boolean
```
@@ -96,9 +99,9 @@ quantize: boolean

Enables quantization of generated sentence embeddings. If the index backend supports it, sentence embeddings will be stored with 8-bit precision vs 32-bit. Only Faiss currently supports quantization.

-## Additional configuration for Transformers models
+### Additional configuration for Transformers models

-### tokenize
+#### tokenize
```yaml
tokenize: boolean
```
@@ -106,29 +109,89 @@ tokenize: boolean

Enables string tokenization (defaults to false). This method applies tokenization rules that only work with English language text and may increase the quality of English language sentence embeddings in some situations.

-## Additional configuration for Word embedding models
+### Additional configuration for Word embedding models

Word embeddings provide a good tradeoff of performance to functionality for a similarity search system. With that being said, Transformers models are making great progress in scaling performance down to smaller models and are the preferred vector backend in txtai for most cases.

Word embeddings models require the [similarity](../../install/#similarity) extras package to be installed.

-### storevectors
+#### storevectors
```yaml
storevectors: boolean
```

Enables copying of the vectors model set in path into the embeddings model output directory on save. This option enables a fully encapsulated index with no external file dependencies.

-### scoring
+#### scoring
```yaml
scoring: bm25|tfidf|sif
```

A scoring model builds weighted averages of word vectors for a given sentence. Supports BM25, TF-IDF and SIF (smooth inverse frequency) methods. If a scoring method is not provided, mean sentence embeddings are built.

-### pca
+#### pca
```yaml
pca: int
```

Removes _n_ principal components from generated sentence embeddings. When enabled, a TruncatedSVD model is built to help with dimensionality reduction. After pooling of vectors creates a single sentence embedding, this method is applied.
+
+## Cloud
+
+This section describes parameters used to sync compressed indexes with cloud storage. These parameters are only enabled if an embeddings index is stored as compressed. They are set via the [embeddings.load](../methods/#txtai.embeddings.base.Embeddings.load) and [embeddings.save](../methods/#txtai.embeddings.base.Embeddings.save) methods.
+
+### provider
+```yaml
+provider: string
+```
+
+The cloud storage provider. See the [full list of providers here](https://libcloud.readthedocs.io/en/stable/storage/supported_providers.html).
+
+### container
+```yaml
+container: string
+```
+
+Container/bucket/directory name.
+
+### key
+```yaml
+key: string
+```
+
+Provider-specific access key. Can also be set via the ACCESS_KEY environment variable. If the key is stored in the configuration file, ensure the file is secured.
+
+### secret
+```yaml
+secret: string
+```
+
+Provider-specific access secret. Can also be set via the ACCESS_SECRET environment variable. If the secret is stored in the configuration file, ensure the file is secured.
+
+### host
+```yaml
+host: string
+```
+
+Optional server host name. Set when using a local cloud storage server.
+
+### port
+```yaml
+port: int
+```
+
+Optional server port. Set when using a local cloud storage server.
+
+### token
+```yaml
+token: string
+```
+
+Optional temporary session token.
+
+### region
+```yaml
+region: string
+```
+
+Optional provider-specific parameter that sets the storage region.
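+
+As a concrete sketch, saving a compressed index and syncing it to cloud storage might look like this (the `s3` provider string and bucket name are illustrative; key/secret can also come from the ACCESS_KEY/ACCESS_SECRET environment variables):
+
+```python
+from txtai.embeddings import Embeddings
+
+# Build a small index
+embeddings = Embeddings({"path": "sentence-transformers/nli-mpnet-base-v2", "content": True})
+embeddings.index([(0, "test", None)])
+
+# A compressed index path (e.g. .tar.gz) enables the cloud sync described above
+embeddings.save("index.tar.gz", cloud={"provider": "s3", "container": "example-bucket"})
+```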
diff --git a/docs/install.md b/docs/install.md
index 10a930a05..5633abfa3 100644
@@ -52,6 +52,14 @@ Serve txtai via a web API.
pip install txtai[api]
```

+### Cloud
+
+Interface with cloud storage.
+
+```
+pip install txtai[cloud]
+```
+
### Database

Additional content storage options

diff --git a/src/python/txtai/embeddings/base.py b/src/python/txtai/embeddings/base.py
index 7fd2ace15..59043ac92 100644
@@ -340,13 +340,13 @@ def batchsimilarity(self, queries, data):
        # Add index and sort desc based on score
        return [sorted(enumerate(score), key=lambda x: x[1], reverse=True) for score in scores]

-    def exists(self, path, archive=None):
+    def exists(self, path, cloud=None):
        """
        Checks if an index exists at path.

        Args:
            path: input path
-            archive: archive configuration
+            cloud: cloud storage configuration

        Returns:
            True if index exists, False otherwise
@@ -355,23 +355,23 @@ def exists(self, path, cloud=None):
        # Check if this is an archive file and exists
        path, apath = self.checkarchive(path)
        if apath:
-            return self.archive.exists(apath, archive)
+            return self.archive.exists(apath, cloud)

        return os.path.exists(f"{path}/config") and os.path.exists(f"{path}/embeddings")

-    def load(self, path, archive=None):
+    def load(self, path, cloud=None):
        """
        Loads an existing index from path.

        Args:
            path: input path
-            archive: archive configuration
+            cloud: cloud storage configuration
        """

        # Check if this is an archive file and extract
        path, apath = self.checkarchive(path)
        if apath:
-            self.archive.load(apath, archive)
+            self.archive.load(apath, cloud)

        # Index configuration
        with open(f"{path}/config", "rb") as handle:
@@ -403,13 +403,13 @@ def load(self, path, cloud=None):
        if self.database:
            self.database.load(f"{path}/documents")

-    def save(self, path, archive=None):
+    def save(self, path, cloud=None):
        """
        Saves an index.

        Args:
            path: output path
-            archive: archive configuration
+            cloud: cloud storage configuration
        """

        if self.config:
@@ -446,7 +446,7 @@ def save(self, path, cloud=None):
        # If this is an archive, save it
        if apath:
-            self.archive.save(apath, archive)
+            self.archive.save(apath, cloud)

    def close(self):
        """

diff --git a/src/python/txtai/embeddings/cloud.py b/src/python/txtai/embeddings/cloud.py
index a7f8aade0..bd6efadcf 100644
@@ -37,7 +37,12 @@ def __init__(self, config):
        # Get client connection
        self.client = driver(
-            config.get("key", os.environ.get("access_key")), config.get("secret", os.environ.get("access_secret")), region=config.get("region")
+            config.get("key", os.environ.get("ACCESS_KEY")),
+            config.get("secret", os.environ.get("ACCESS_SECRET")),
+            host=config.get("host"),
+            port=config.get("port"),
+            token=config.get("token"),
+            region=config.get("region"),
        )

    def exists(self, path):
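
As a usage sketch of the renamed parameter (the provider, bucket and index names are hypothetical; key/secret fall back to the ACCESS_KEY/ACCESS_SECRET environment variables per the change above):

```python
from txtai.embeddings import Embeddings

# Cloud storage configuration, formerly passed as "archive"
cloud = {"provider": "s3", "container": "example-bucket"}

embeddings = Embeddings()

# exists/load/save all accept the cloud configuration
if embeddings.exists("index.tar.gz", cloud):
    embeddings.load("index.tar.gz", cloud)
```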