Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw Errors from AWS SDK or Expose Custom Errors in TypeScript API #9

Open
jfw225 opened this issue Aug 27, 2024 · 4 comments
Open

Comments

@jfw225
Copy link

jfw225 commented Aug 27, 2024

Firstly, I want to preface this by saying that I'm shocked (but also at the same time not shocked) that AWS hasn't released an official DAX client with V3 support by now, but you guys are awesome for putting this library together.

I run a production code base and have been trying to start using DAX for some of our APIs. In our code base, I have built an abstraction layer to swap between the standard AWS SDK DB client and your DAX client. For the most part, your DAX client has worked well as a drop-in replacement.

The only part that prevents your client from being a complete drop-in replacement (at least for our code base) is that when your client encounters a service error, you throw a custom DaxServiceError instance. There are a lot of places in our code base where we use the instanceof operator to handle certain errors from @aws-sdk/client-dynamodb (e.g. ConditionalCheckFailedException) which obviously wouldn't work with your custom error.

After digging into your source code, I do see that you have already done the hard part in terms of parsing the response to figure out the correct error. Ideally, it would be great if you could reconstruct the SDK errors from @aws-sdk/client-dynamodb. Otherwise, I would be grateful if you guys could add a quick typescript declaration for your custom DaxServiceError and expose the fields that are necessary to reconstruct the AWS SDK errors.

@daveharig
Copy link
Contributor

daveharig commented Aug 27, 2024

I had a similar issue, and agree with harmonizing the error handling. As a workaround, I ultimately built a system where it fails over to DDB client if DAX client fails. Then I log the DAX errors and deal with it in the morning. This covers every manor of potential DAX failure including db service, cluster availability, VPC network nightmares, capacity etc. This DAX client is a baby step for me to get my feet wet and offload some high use access patterns for cost and UX speed.

    let output: GetItemCommandOutput;
    if (useDax) {
      try {
        msDebugLog(
          helper,
          "DAX client selected, sending request to DAX cluster"
        );
        output = await new Promise((resolve, reject) => {
          // If daxClient is unable to get the response in 6 seconds, throw error and fallback to dynamoDbClient
          const timeout = setTimeout(() => {
            reject(new Error("DAX request timed out after 6 seconds"));
          }, 6000);
          // If this daxClient gets executed within 6 seconds, then we return the result and do not do a fallback
          daxClient
            .send(new GetItemCommand(getItemParams))
            .then((result) => {
              clearTimeout(timeout);
              resolve(result);
            })
            .catch((error) => {
              clearTimeout(timeout);
              reject(
                new Error(
                  "DAX client send error: " + JSON.stringify(error, null, 2)
                )
              );
            });
        });
      } catch (error) {
        msErrorLog(
          helper,
          "DAX client failed, falling over to DynamoDB client",
          error
        );
        output = await dynamoDbClient.send(new GetItemCommand(getItemParams));
      }
    } else {
      output = await dynamoDbClient.send(new GetItemCommand(getItemParams));
    }
    return output.Item as T | undefined;
  } 

@jfw225
Copy link
Author

jfw225 commented Aug 27, 2024

Thanks for the quick reply. I'm really glad I was not the only one experiencing VPC network nightmares. I'm working on a simple parser to reconstruct the SDK errors on my end anyway. I'll share it here once I have it working in case it's of any use.

@jfw225
Copy link
Author

jfw225 commented Aug 28, 2024

Okay here is my rough error parser. In case you want to harmonize the error parsing with the DB SDK, this may serve has a helpful starting point.

Some Notes

  • To make the VPC nightmares less prevalent, I created a 'dax-proxy' lambda function to handle all of the DAX interactions, and then invoke this lambda function in my other APIs. Thus, when handling errors (from AWS services), I needed them to be serializable (which is why there are the custom types for that).
  • At this point, the only DB SDK error that I actually retrieve extra info from is TransactionCanceledException because it gives insight into why it was canceled. To parse this extra info from the thrown error, I implemented a regex. This is definitely a little hacky, but I've attached the test suite I wrote to make sure it works.
  • Also, my DAX proxy is very robustly tested to make sure that we don't encounter a regression. I didn't include that full test suite.
  • Sometimes when DynamoDB client throws, it will just throw an object of type DynamoDBServiceException (not a child class). I still need to handle this case. the I mention this, but do not implement it below.
  • Everything below assumes that you are using the @aws-sdk/client-dynamodb client, not the lib-db client. Thus, that library may have slightly different errors (probably different classes as well).

DaxServiceError -> Serializable AWS SDK Error

(@yaws/dynamodb/src/core/dynamodb-utils/db.error-utils.ts)

import { CancellationReason } from "@aws-sdk/client-dynamodb";
import {
	SerializableAwsDbError,
	doesDbErrorNeedData,
	getAwsDbErrorNames,
} from "@yaws/dynamodb";
import { assertUnreachable } from "yoomi-shared-js";

/**
 * The DAX client that we use
 * @see https://github.com/kwojcicki/amazon-dax-client-v3
 * throws custom errors. However, sometimes these errors are caught by AWS
 * middleware and converted to a new error object (which loses the fields from
 * the original error). This happens in:
 * @see node_modules/@aws-sdk/client-dynamodb/node_modules/@smithy/middleware-retry/dist-cjs/index.js
 *
 * Thus, this function attempts to parse the error object from the message
 * string. This function is certainly hacky, but we rigorously test it.
 */
export function parseDaxServiceError(
	e: Error,
): SerializableAwsDbError | undefined {
	// The error message is in the format: `DaxServiceError: <aws-sdk-error-name>: ${message}`

	const names = getAwsDbErrorNames();
	type Name = (typeof names)[number];

	const constructOutput = (name: Name): SerializableAwsDbError => {
		if (doesDbErrorNeedData(name)) {
			switch (name) {
				// handle special cases individually
				case "TransactionCanceledException":
					return {
						type: "AwsDbError",
						name,
						$metadata: {},
						stack: e.stack,
						message: e.message,
						data: {
							CancellationReasons:
								_transactCanceledExtractReasonsFromErrorMsg(
									e.message,
								),
						},
					} satisfies SerializableAwsDbError;

				default:
					return assertUnreachable(name);
			}
		}

		return {
			type: "AwsDbError",
			name,
			$metadata: {},
			stack: e.stack,
			message: e.message,
			data: undefined,
		} satisfies SerializableAwsDbError;
	};

	for (const name of names) {
		const re = new RegExp(`DaxServiceError: ${name}`);

		if (re.test(e.message)) {
			return constructOutput(name);
		}
	}

	// if there was no match, then we return `undefined` to indicate that we
	// could not parse the error
	return undefined;
}

/**
 * We only export this for unit testing purposes.
 */
export function _transactCanceledExtractReasonsFromErrorMsg(
	msg: string,
): CancellationReason[] {
	const regex = /for specific reasons \[([^\]]*)\]/g;

	// get the first match
	const match = regex.exec(msg)?.[1];

	if (!match) {
		console.warn(`No match found for ${msg}`);

		return [];
	}

	return match
		.split(",")
		.map((e) => e.trim())
		.filter((e) => e.length > 0)
		.map((e) => ({ Code: e }));
}

AWS SDK Error Encoder/Decoder (stringify/parse)

import type * as DDB_IMPL from "@aws-sdk/client-dynamodb";
import {
	BackupInUseException,
	BackupNotFoundException,
	ConditionalCheckFailedException,
	ContinuousBackupsUnavailableException,
	DuplicateItemException,
	DynamoDBServiceException,
	ExportConflictException,
	ExportNotFoundException,
	GlobalTableAlreadyExistsException,
	GlobalTableNotFoundException,
	IdempotentParameterMismatchException,
	ImportConflictException,
	ImportNotFoundException,
	IndexNotFoundException,
	InternalServerError,
	InvalidEndpointException,
	InvalidExportTimeException,
	InvalidRestoreTimeException,
	ItemCollectionSizeLimitExceededException,
	LimitExceededException,
	PointInTimeRecoveryUnavailableException,
	PolicyNotFoundException,
	ProvisionedThroughputExceededException,
	ReplicaAlreadyExistsException,
	ReplicaNotFoundException,
	RequestLimitExceeded,
	ResourceInUseException,
	ResourceNotFoundException,
	TableAlreadyExistsException,
	TableInUseException,
	TableNotFoundException,
	TransactionCanceledException,
	TransactionConflictException,
	TransactionInProgressException,
} from "@aws-sdk/client-dynamodb";
import {
	Constructor,
	ExcludeStrict,
	ExtractStrict,
	assertUnreachable,
	isObjectGeneral,
	lazyFactoryFn,
	safeObjectKeys,
} from "yoomi-shared-js";

type DDB = typeof DDB_IMPL;

type AllAwsDbErrors = {
	[K in keyof DDB]: DDB[K] extends Constructor<unknown>
		? InstanceType<DDB[K]> extends DynamoDBServiceException
			? // we exclude the DynamoDBServiceException itself
				DynamoDBServiceException extends InstanceType<DDB[K]>
				? never
				: InstanceType<DDB[K]>
			: never
		: never;
}[keyof DDB];

type AwsDbErrorNames = AllAwsDbErrors["name"];

/**
 * Type mapping between some of the error names and the additional information
 * that they contain.
 */
interface AdditionalErrorInfo {
	TransactionCanceledException: {
		CancellationReasons: DDB_IMPL.CancellationReason[] | undefined;
	};
}

// the error names with additional information (strict exclude enforces that
// the keys of `AdditionalErrorInfo` are actually error names)
type WithData = ExtractStrict<AwsDbErrorNames, keyof AdditionalErrorInfo>;
type WithoutData = ExcludeStrict<AwsDbErrorNames, WithData>;

export interface SerializableAwsDbErrorBase<N extends AwsDbErrorNames, Data> {
	type: "AwsDbError";
	name: N;
	$metadata: DynamoDBServiceException["$metadata"];
	stack: string | undefined;
	message: string;

	/**
	 * We intentionally set this to `undefined` as well as `Data` in case that
	 * it is missing for some reason. We want this parser to be as robust as
	 * possible.
	 */
	data: Data | undefined;
}

// we intentionally distribute this union
type SerializableWithData<N extends WithData = WithData> = N extends N
	? SerializableAwsDbErrorBase<N, AdditionalErrorInfo[N]>
	: never;

// we intentionally do not distribute this union
type SerializableWithoutData<N extends WithoutData = WithoutData> =
	SerializableAwsDbErrorBase<N, undefined>;

export type SerializableAwsDbError =
	| SerializableWithData
	| SerializableWithoutData;

// TODO: ensure that we allow for `DynamoDBServiceException` objects that
// are not in the named error list. however, make sure to prioritize the named
// list.
export function isAwsDbError(e: unknown): e is AllAwsDbErrors {
	return e instanceof DynamoDBServiceException && e.name in ERROR_MAP;
}

export function isSerializableAwsDbError(
	e: unknown,
): e is SerializableAwsDbError {
	return isObjectGeneral<Pick<SerializableAwsDbError, "type">>(e, {
		type: (v, r): v is typeof r => v === ("AwsDbError" satisfies typeof r),
	});
}

export function toSerializableAwsDbError(
	e: AllAwsDbErrors,
): SerializableAwsDbError {
	const base: SerializableAwsDbErrorBase<AwsDbErrorNames, undefined> = {
		type: "AwsDbError",
		name: e.name,
		$metadata: e.$metadata,
		stack: e.stack,
		message: e.message,
		data: undefined,
	};

	if (doesDbErrorNeedData(e.name)) {
		switch (e.name) {
			case "TransactionCanceledException":
				return {
					...base,
					name: e.name,
					data: {
						CancellationReasons: e.CancellationReasons,
					},
				} satisfies SerializableAwsDbError;

			/* istanbul ignore next */
			default:
				return assertUnreachable(e.name);
		}
	}

	return {
		...base,
		name: e.name,
		data: undefined,
	} satisfies SerializableAwsDbError;
}

export function reconstructAwsDbError(
	s: SerializableAwsDbError,
): AllAwsDbErrors {
	const constructError = <N extends AwsDbErrorNames>(
		name: N,
	): InstanceType<DDB[N]> => {
		const C = ERROR_MAP[name];

		const output = new C({
			$metadata: s.$metadata,
			message: s.message,
		});

		output.stack = s.stack;

		return output satisfies AllAwsDbErrors as InstanceType<DDB[N]>;
	};

	if (doesDbErrorNeedData(s.name)) {
		switch (s.name) {
			case "TransactionCanceledException":
				return (() => {
					const e = constructError("TransactionCanceledException");

					e.CancellationReasons = s.data?.CancellationReasons;

					return e;
				})();

			/* istanbul ignore next */
			default:
				return assertUnreachable(s.name);
		}
	}

	return constructError(s.name);
}

export const getAwsDbErrorNames: () => AwsDbErrorNames[] = lazyFactoryFn(() => {
	const keys = safeObjectKeys(ERROR_MAP);

	return () => keys;
});

const getAwsDbErrorNamesWithData: () => WithData[] = lazyFactoryFn(() => {
	const keys = safeObjectKeys<AdditionalErrorInfo>({
		TransactionCanceledException: null,
	});

	return () => keys;
});

export function doesDbErrorNeedData(name: AwsDbErrorNames): name is WithData {
	return getAwsDbErrorNamesWithData().includes(name as WithData);
}

// mapping from error name to error constructor
const ERROR_MAP: {
	[E in AllAwsDbErrors as E["name"]]: DDB[E["name"]];
} = {
	ConditionalCheckFailedException,
	BackupInUseException,
	BackupNotFoundException,
	InternalServerError,
	RequestLimitExceeded,
	InvalidEndpointException,
	ProvisionedThroughputExceededException,
	ResourceNotFoundException,
	ItemCollectionSizeLimitExceededException,
	ContinuousBackupsUnavailableException,
	LimitExceededException,
	TableInUseException,
	TableNotFoundException,
	GlobalTableAlreadyExistsException,
	ResourceInUseException,
	TransactionConflictException,
	PolicyNotFoundException,
	ExportNotFoundException,
	GlobalTableNotFoundException,
	ImportNotFoundException,
	DuplicateItemException,
	IdempotentParameterMismatchException,
	TransactionInProgressException,
	ExportConflictException,
	InvalidExportTimeException,
	PointInTimeRecoveryUnavailableException,
	ImportConflictException,
	TableAlreadyExistsException,
	InvalidRestoreTimeException,
	ReplicaAlreadyExistsException,
	ReplicaNotFoundException,
	IndexNotFoundException,
	TransactionCanceledException,
};

Some Utilities from Custom Library (yoomi-shared-js)

This is certainly far less important, but I thought I would include it in case you wanted a complete understanding (and context) of the implementations above.

/**
 * The type representing the constructor of some class `T`.
 */
// eslint-disable-next-line @typescript-eslint/no-explicit-any
export type Constructor<T> = abstract new (...args: any[]) => T;


/**
 * Same as
 * @see Exclude
 * but requires types to be present in `T`.
 */
export type ExcludeStrict<T, U extends T> = Exclude<T, U>;

/**
 * Same as
 * @see Extract
 * but requires types to be present in `T`.
 */
export type ExtractStrict<T, U extends T> = Extract<T, U>;

/**
 * This function throws an error whenever it is called. It is useful for
 * situations where you want a switch statement to be exhaustive. Put this in
 * the default case of the switch statement and call it with the value of the
 * switch statement.
 */
export function assertUnreachable(
	x: never,
	opts?: { getError?: () => Error; msg?: string },
): never {
	if (opts?.getError) {
		throw opts.getError();
	}

	throw new Error(
		opts?.msg ??
			`Assert unreachable. Didn't expect to get here. x: ${JSON.stringify(
				x,
			)}`,
	);
}

/**
 * Lazily executes the given factory function and returns its result.
 *
 * Note: if you assign a field to the returned function, it will not be usable
 * until after the function is called. We could improve this by implementing the
 * returned function wrapper as a proxy that calls the factory function when any
 * fields are accessed.
 */
export function lazyFactoryFn<
	Fn extends (...args: any[]) => any,
	FacArgs extends any[] = any[],
>(factoryFn: (...args: FacArgs) => Fn, ...factoryArgs: FacArgs): Fn {
	let fn: Fn | undefined;

	return function (...args) {
		if (fn === undefined) {
			fn = factoryFn(...factoryArgs);
		}

		return fn(...args);
	} as Fn;
}


export function safeObjectKeys<
	const K extends string,
	const T extends { [KK in K]: unknown },
>(obj: T): (keyof T)[];

export function safeObjectKeys<T extends { [K in keyof T]: unknown }>(
	obj: keyof T extends string ? { [K in keyof T]: unknown } : never,
): (keyof T)[];

/**
 * Same as
 * @see Object.keys
 * but with a type-safe return type.
 */
export function safeObjectKeys<
	K extends string,
	T extends { [KK in K]: unknown },
>(obj: T): (keyof T)[] {
	return Object.keys(obj) as K[];
}

/**
 * A general predicate function that returns `true` if the given object contains
 * all of the keys and runtime types of the given `expected` object.
 */
export function isObjectGeneral<O extends { [K in keyof O]: O[K] }>(
	obj: unknown,
	expected: ObjTypeToObjValidator<O>,
	opts?: {
		getError?: ValidatorGetErrorFn<O>;
	},
): obj is O {
    throw new Error(`Implementation not included`).
}

Note that I didn't include the implementation or types for isObjectGeneral because they are pretty complex.

Test Suite for DaxServiceError -> Serializable AWS SDK Error

import { _transactCanceledExtractReasonsFromErrorMsg } from "../dax.error-parser";

const extractReasons = (str: string) => {
	return _transactCanceledExtractReasonsFromErrorMsg(str).map((r) => r.Code);
};

describe("Extract reasons from error strings", () => {
	test("should extract multiple reasons", () => {
		const input =
			" for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]";
		expect(extractReasons(input)).toEqual([
			"ConditionalCheckFailed",
			"ConditionalCheckFailed",
			"ConditionalCheckFailed",
		]);
	});

	test("should extract a single reason", () => {
		const input = " for specific reasons [ConditionalCheckFailed]";
		expect(extractReasons(input)).toEqual(["ConditionalCheckFailed"]);
	});

	test("should handle empty brackets", () => {
		const input = " for specific reasons []";
		expect(extractReasons(input)).toEqual([]);
	});

	test("should handle trailing commas", () => {
		const input =
			" for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ]";
		expect(extractReasons(input)).toEqual([
			"ConditionalCheckFailed",
			"ConditionalCheckFailed",
		]);
	});

	test("should handle strings without reasons correctly", () => {
		const input = "Transaction failed without specific reasons";
		expect(extractReasons(input)).toEqual([]);
	});

	test("should extract reasons mixed with DynamoDB error messages", () => {
		const input = `Transaction failed for specific reasons [TransactionConflict, NetworkFailure], another error for specific reasons [ProvisionedThroughputExceeded]`;
		expect(extractReasons(input)).toEqual([
			"TransactionConflict",
			"NetworkFailure",
		]);
	});

	test(`with multiple occurrences of the pattern`, () => {
		const input = `
(DaxServiceError: TransactionCanceledException: Transaction cancelled, please refer cancellation reasons for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]
          at <stack-trace-omitted>) {
        time: 1724800994829,
        code: 'TransactionCanceledException',
        retryable: true,
        requestId: 'L1K0IJ0B2QSIECUBRHIM4H9IVNVV4KQNSO5AEMVJF66Q9ASUAAJG',
        statusCode: 400,
        _tubeInvalid: false,
        waitForRecoveryBeforeRetrying: false,
        _message: 'Transaction cancelled, please refer cancellation reasons for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]',
        codeSeq: [ 4, 37, 38, 39, 58 ],
        cancellationReasons: [
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          },
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          },
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          }
        ]
      })
        `;

		expect(extractReasons(input)).toEqual([
			"ConditionalCheckFailed",
			"ConditionalCheckFailed",
			"ConditionalCheckFailed",
		]);
	});
});

@jfw225
Copy link
Author

jfw225 commented Aug 28, 2024

@daveharig I do also have a few questions as I am wrapping up my dax-proxy implementation. For all of the below, this is in a single-cluster setup with 1 active node.

Transact Writes

I can't seem to get TransactWrite commands to work correctly or reliably. When doing a TransactWrite with a put request, it seems that the write is "eventually consistent" (i.e. if I wait long enough, subsequent reads return expected values). However, when I perform a TransactWrite with a delete request, it does not seem to actually delete the items (maybe it does eventually, but it is definitely not immediate like the standard DB interface is).

My question: Does your client currently support TransactWrites using a TransactWriteCommand? If not, no big deal. I'll just prevent my DAX-enabled tables from using TransactWrites for now.

Queries

This is my DAX client setup:

export function createDaxClient(p: {
	isTestEnv: boolean;
}): DbGeneralInterface<DaxSupportedTable> {
	const dbBase = {
		awsClient: new AWS_DynamoDbClient({
			region: DYNAMO_DB_REGION,
			endpoint: getDaxClusterEndpoint(p),
		}),
	};

	const daxClient = new AmazonDaxClient({
		client: dbBase.awsClient,
		config: {
			connectTimeout: 5000,
		},
	});

	const db = new DynamoDbStandardInterface({
		type: "instance",
		client: daxClient,

		// applying the xray middleware seems to be causing an error (i think
		// that the dax client is not correctly instrumenting the command
		// middleware when it makes the request to the dax cluster)
		noGlobalMiddleware: true,
		isTableSupported: isDaxSupportedTable,
	});

	return db;
}

where

  • AWS_DynamoDbClient is the client from @aws-sdk/client-dynamodb (not the lib-db);
  • AmazonDaxClient is your client;
  • DynamoDbStandardInterface is our custom class to standardize the DynamoDB interface (parametrized by a client).

Under the hood, DynamoDbStandardInterface simply calls the send method of AWS_DynamoDbClient with the appropriate command from @aws-sdk/client-dynamodb. More every command other than TransactWrite and QueryCommand works as expected. However, when I try to use QueryCommand, I seem to always get an empty response (for an input that works fine with the standard client).

I noticed that you have paginateScan and paginateQuery methods in your client.

My question: Am I using the QueryCommand incorrectly with your client? Do I need to use the paginateQuery method for all of my queries? If so, I see the startingToken input is of type string | undefined when typically the start key is an object the the AWS attribute-value format. How would I reconcile this? Any examples would be greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants