Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHOW CHANGE STREAMS statement #137

Open
apstndb opened this issue Jun 9, 2022 · 13 comments
Open

SHOW CHANGE STREAMS statement #137

apstndb opened this issue Jun 9, 2022 · 13 comments

Comments

@apstndb
Copy link
Collaborator

apstndb commented Jun 9, 2022

refs #135 (comment)

  • Show watched tables and columns
  • Show FOR ALL or empty streams also
  • Show options(currently only retention_period)

Proposed layout

+----------------------------------------------------------------+----------------------+
| CHANGE_STREAM                                                  | OPTIONS              |
+----------------------------------------------------------------+----------------------+
| Empty                                                          |                      |
| EverythingStream FOR ALL                                       |                      |
| NamesAndAlbums FOR Singers(FirstName, LastName), Albums, Songs | retention_period=36h |
| SingerAlbumStream FOR Singers, Albums                          |                      |
+----------------------------------------------------------------+----------------------+

Semantics

SELECT CHANGE_STREAM_NAME || IF(`ALL`, " FOR ALL", IFNULL(" FOR " || (
  SELECT STRING_AGG(TABLE_NAME || IF(ALL_COLUMNS, "", FORMAT("(%s)", (
    SELECT STRING_AGG(COLUMN_NAME, ", ")
    FROM INFORMATION_SCHEMA.CHANGE_STREAM_COLUMNS CSC
    WHERE (CST.CHANGE_STREAM_NAME, CST.TABLE_NAME) = (CSC.CHANGE_STREAM_NAME, CSC.TABLE_NAME)
  ))), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_TABLES CST
  WHERE CS.CHANGE_STREAM_NAME = CST.CHANGE_STREAM_NAME
), "")) AS CHANGE_STREAM,
IFNULL((
  SELECT STRING_AGG(FORMAT("%s=%s", OPTION_NAME, OPTION_VALUE), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_OPTIONS CSO
  WHERE CS.CHANGE_STREAM_NAME = CSO.CHANGE_STREAM_NAME
), "") AS OPTIONS
FROM INFORMATION_SCHEMA.CHANGE_STREAMS CS

References

@yfuruyama
Copy link
Collaborator

Thanks for the suggested layout.

Another layout representation would be just using the response of the GetDatabaseDdlResponse. With this way we don't need to change our layout even if a new notation/syntax is added to the Change Stream in the future.

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 9, 2022

Equivalent DDLs

$ gcloud spanner databases ddl describe --project=$PROJECT_ID --instance=$INSTANCE_ID $DATABASE_ID | pcregrep -M 'CREATE CHANGE STREAM [^;]*;'
CREATE CHANGE STREAM Empty;
CREATE CHANGE STREAM EverythingStream
  FOR ALL;
CREATE CHANGE STREAM NamesAndAlbums
  FOR Singers(FirstName, LastName), Albums, Songs OPTIONS (
  retention_period = '36h'
);
CREATE CHANGE STREAM SingerAlbumStream
  FOR Singers, Albums;

@yfuruyama
Copy link
Collaborator

Thanks! That's the one. What do you think?

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 9, 2022

spanner-cli already have the MySQL-styleSHOW CREATE TABLE <table_name> pseudo-statement.
What about generic SHOW CREATE <object_type> <object_name> to show the DDL?

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 9, 2022

The generic SHOW CREATE can also support indexes and views.

spanner> SHOW CREATE INDEX SongsBySongName\G
*************************** 1. row ***************************
Name: SongsBySongName
 DDL: CREATE INDEX SongsBySongName ON Songs(SongName)
1 rows in set (0.47 sec)

spanner> SHOW CREATE CHANGE STREAM EverythingStream\G
*************************** 1. row ***************************
Name: EverythingStream
 DDL: CREATE CHANGE STREAM EverythingStream
  FOR ALL
1 rows in set (0.37 sec)

spanner> SHOW CREATE VIEW SingerNames\G
*************************** 1. row ***************************
Name: SingerNames
 DDL: CREATE VIEW SingerNames SQL SECURITY INVOKER AS SELECT
  Singers.SingerId AS SingerId,
  Singers.FirstName || ' ' || Singers.LastName AS Name
FROM Singers
1 rows in set (0.32 sec)

(We should consider about UNIQUE and NULL_FILTERED)

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 9, 2022

@yfuruyama
Copy link
Collaborator

That's a good idea! +1 👍

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 10, 2022

Implemented SHOW statements are fallen into three categories.

  1. List all target-typed schema objects.
  • SHOW TABLES
  • SHOW DATABASES(It is higher level objects than DDLs of database.)
  1. Show details of the target-typed schema objects filtered by parent name.
  • SHOW INDEX FROM <table>
  • SHOW COLUMNS FROM <table>
  1. Show the DDL for the specified object.
  • SHOW CREATE TABLE

3 can be generalized as discussed but I think 1(SHOW CHANGE STREAMS) is also needed.

I think it can be generalized as the sugar for SELECT <known name column or *> FROM INFORMATION_SCHEMA.<snake_cased_object_type>.
(We should consider about _CATALOG and _SCHEMA columns like TABLE_SCHEMA of the INFORMATION_SCHEMA.TABLES)

@yfuruyama
Copy link
Collaborator

Thanks for summarizing that. I totally agree with the summarized categories.

The user story might be running SHOW CHANGE STREAMS at first to list the change streams, then running SHOW CREATE CHANGE STREAM to get the details of the change stream:

spanner> SHOW CHANGE STREAMS;
+-------------------+
| Change_Streams    |
+-------------------+
| EverythingStream  |
| NamesAndAlbums    |
| SingerAlbumStream |
+-------------------+

spanner> SHOW CREATE CHANGE STREAM EverythingStream;
+------------------+-----------------------------------------------+
| Change_Stream    | Create_Change_Stream                          |
+------------------+-----------------------------------------------+
| EverythingStream | CREATE CHANGE STREAM EverythingStream FOR ALL |
+------------------+-----------------------------------------------+

I think it can be generalized as the sugar for SELECT <known name column or *> FROM INFORMATION_SCHEMA.<snake_cased_object_type>.

I'm not sure if we can assemble the proper CREATE xxx statement in a general way from INFORMATION_SCHEMA database. Perhaps it would be easier to scrape from the DDL as comment: #137 (comment)

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 10, 2022

I'm not sure if we can assemble the proper CREATE xxx statement in a general way from INFORMATION_SCHEMA database. Perhaps it would be easier to scrape from the DDL as comment: #137 (comment)

I agree it is impossible to assemble DDLs from INFORMATION_SCHEMA in a general way and it is possible using projects.instances.databases.getDdl.
We should use projects.instances.databases.getDdl for the SHOW CREATE statement.

As for SHOW <schema_types>, we can simply query INFORMATION_SCHEMA even if the scheme object type is unknown.

spanner> SHOW CHANGE STREAMS;
Warning: `SHOW CHANGE STREAMS` has not yet natively supported.
Fallback to `SELECT * FROM INFORMATION_SCHEMA.CHANGE_STREAMS` (Subject to change in future versions)
+-----------------------+----------------------+--------------------+-------+
| CHANGE_STREAM_CATALOG | CHANGE_STREAM_SCHEMA | CHANGE_STREAM_NAME | ALL   |
+-----------------------+----------------------+--------------------+-------+
|                       |                      | Empty              | false |
|                       |                      | EverythingStream   | true  |
|                       |                      | NamesAndAlbums     | false |
|                       |                      | SingerAlbumStream  | false |
+-----------------------+----------------------+--------------------+-------+

In thought experimental, most of SHOW <object_type_plural> can be implemented by the same shape of query.

SELECT <prefix>_NAME FROM INFORMATION_SCHEMA.<object_type_plural_snake> WHERE (<parent_prefix>_CATALOG, <parent_prefix>_SCHEMA) = ("", "")
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS WHERE (CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS WHERE (CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA) = ("", "")
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CHANGE_STREAM_NAME FROM INFORMATION_SCHEMA.CHANGE_STREAMS WHERE (CHANGE_STREAM_CATALOG, CHANGE_STREAM_SCHEMA) = ("", "")

Note: INFORMATION_SCHEMA.INDEXES will not be usable without filtering out PRIMARY KEY. _CATALOG _SCHEMA is usually empty.

@yfuruyama
Copy link
Collaborator

As for SHOW <schema_types>, we can simply query INFORMATION_SCHEMA even if the scheme object type is unknown.

Sounds good.

In thought experimental, most of SHOW <object_type_plural> can be implemented by the same shape of query.
SELECT _NAME FROM INFORMATION_SCHEMA.<object_type_plural_snake> WHERE (<parent_prefix>_CATALOG, <parent_prefix>_SCHEMA) = ("", "")

How can we get the parent_prefix from the given object_type in a general way? For example: SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "").

@apstndb
Copy link
Collaborator Author

apstndb commented Jun 13, 2022

How can we get the parent_prefix from the given object_type in a general way? For example: SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "").

I don't think it can possible to get the parent_prefix in a general way. It is only for thinking the consistent behavior of SHOW < object_type> statements.

@yfuruyama
Copy link
Collaborator

Yeah perhaps we have to implement each SHOW <object> statement separately so that we can use an appropriate table/column from INFORMATION_SCHEMA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants