Aryn connectors for reading and writing docsets #1147

Merged 6 commits on Feb 4, 2025

Conversation

austintlee
Contributor

No description provided.

@austintlee austintlee requested a review from HenryL27 January 31, 2025 23:09
    def __init__(self, docs: list[dict[str, Any]]):
        self.docs = docs

    def to_docs(self, query_params: "BaseDBReader.QueryParams") -> list[Document]:
Collaborator

Some day I feel like we should turn these lists that we're passing around into iterators/generators (all the way to the Ray dataset construction), but not right now. This applies to all the readers.
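To illustrate the iterator/generator idea raised in this comment, here is a minimal sketch rather than the project's actual reader code. `Doc` is a hypothetical stand-in for `Document`, `iter_docs` is an illustrative name, and the `doc_id` and `properties` keys are assumed for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Iterator


@dataclass
class Doc:
    """Hypothetical stand-in for the real Document class."""
    doc_id: str
    properties: dict[str, Any] = field(default_factory=dict)


def iter_docs(raw_docs: Iterable[dict[str, Any]]) -> Iterator[Doc]:
    """Yield one Doc at a time instead of building a full list in memory."""
    for raw in raw_docs:
        # Each record is converted only when the consumer asks for it.
        yield Doc(doc_id=raw["doc_id"], properties=raw.get("properties", {}))
```

A caller that still needs a list can wrap the call in `list(...)`, so a lazy variant like this could coexist with the list-returning `to_docs` while readers are migrated.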

Comment on lines 68 to 72
    def create_target_idempotent(self, target_params: "BaseDBWriter.TargetParams"):
        pass

    def get_existing_target_params(self, target_params: "BaseDBWriter.TargetParams") -> "BaseDBWriter.TargetParams":
        pass
Collaborator

Is it a huge lift to add docset-creation functionality to this?

Contributor Author

If there are multiple writer tasks, will each one attempt to create a docset? Since name is not unique, creating a docset from multiple writer tasks will result in multiple docsets (with the same name) being created.
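The concern above can be reproduced with a small simulation. This is a sketch under stated assumptions: `FakeDocstore` is a hypothetical in-memory stand-in (not the real service) that mimics only the property described here, namely that IDs are unique and names are not. The two helper functions are illustrative names as well.

```python
import itertools


class FakeDocstore:
    """Hypothetical in-memory store: docset IDs are unique, names are not."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.docsets: dict[str, str] = {}  # docset_id -> name

    def create_docset(self, name: str) -> str:
        docset_id = f"ds-{next(self._counter)}"
        self.docsets[docset_id] = name
        return docset_id


def create_per_task(store: FakeDocstore, name: str, num_tasks: int) -> list[str]:
    """Each writer task creates its own docset: one docset per task."""
    return [store.create_docset(name) for _ in range(num_tasks)]


def create_once(store: FakeDocstore, name: str, num_tasks: int) -> list[str]:
    """Create the docset once up front, then hand the same ID to every task."""
    docset_id = store.create_docset(name)
    return [docset_id] * num_tasks
```

With three tasks, `create_per_task` leaves three same-named docsets in the store, while `create_once` leaves one. That is one reason to resolve the docset ID before the write fans out instead of inside each writer task.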

Comment on lines 652 to 653
api_key = ArynConfig.get_aryn_api_key()
aryn_url = ArynConfig.get_aryn_url()
Collaborator

We should be able to pass these in as params here too, I think.

Contributor Author

ok
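One way to read the suggestion is "explicit argument first, config as fallback". The sketch below is not the merged code. `FakeArynConfig` is a hypothetical stand-in for `ArynConfig`, the environment variable names and the default URL are assumptions made for the example, and `resolve_aryn_settings` is an illustrative helper name.

```python
import os
from typing import Optional


class FakeArynConfig:
    """Hypothetical stand-in for ArynConfig; reads assumed environment variables."""

    @staticmethod
    def get_aryn_api_key() -> str:
        return os.environ.get("ARYN_API_KEY", "")  # variable name is an assumption

    @staticmethod
    def get_aryn_url() -> str:
        # Placeholder default, not a real endpoint.
        return os.environ.get("ARYN_URL", "https://example.invalid/v1")


def resolve_aryn_settings(
    aryn_api_key: Optional[str] = None,
    aryn_url: Optional[str] = None,
) -> tuple[str, str]:
    """Prefer values passed by the caller; fall back to the config otherwise."""
    api_key = aryn_api_key if aryn_api_key is not None else FakeArynConfig.get_aryn_api_key()
    url = aryn_url if aryn_url is not None else FakeArynConfig.get_aryn_url()
    return api_key, url
```

Comparing against `None` instead of relying on truthiness lets a caller pass an empty string deliberately without silently falling back to the config.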

Comment on lines 638 to 639
@context_params
def aryn(self, docset_id: str, **kwargs) -> DocSet:
Collaborator

What params come from the context?

Contributor Author

I don't think I need this. I'll remove it.

Comment on lines 824 to 825
api_key = ArynConfig.get_aryn_api_key()
aryn_url = ArynConfig.get_aryn_url()
Collaborator

These two should be passable as params as well.

Contributor Author

ok

Comment on lines 826 to 831
        if docset_id is None and create_new_docset and name is not None:
            headers = {
                "Authorization": f"Bearer {api_key}"
            }
            res = requests.post(url=f"{aryn_url}/docsets", data={"name": name}, headers=headers)
            docset_id = res.json()["docset_id"]
Collaborator

I guess docstore allows multiple docsets with the same name?

Contributor Author

Yes, only IDs are unique.
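Because only the returned ID identifies the new docset, it may be worth failing loudly when the create request does not succeed. The following is a sketch under assumptions, not the merged implementation. `resolve_docset_id` is an illustrative name, and `post` is an injected callable assumed to behave like `requests.post` (its return value is assumed to offer `raise_for_status()` and `json()`, as a `requests` response does). The `/docsets` path and the `docset_id` response field are taken from the snippet under review.

```python
from typing import Any, Callable, Optional


def resolve_docset_id(
    docset_id: Optional[str],
    create_new_docset: bool,
    name: Optional[str],
    api_key: str,
    aryn_url: str,
    post: Callable[..., Any],
) -> Optional[str]:
    """Return the given docset ID, or create a docset and return its new ID."""
    if docset_id is None and create_new_docset and name is not None:
        headers = {"Authorization": f"Bearer {api_key}"}
        res = post(url=f"{aryn_url}/docsets", data={"name": name}, headers=headers)
        # Surface HTTP errors here instead of failing later on a missing key.
        res.raise_for_status()
        docset_id = res.json()["docset_id"]
    return docset_id
```

Injecting `post` keeps the sketch testable without network access. In real use the caller would pass `requests.post`.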

@austintlee austintlee enabled auto-merge (squash) February 4, 2025 05:54
@austintlee austintlee merged commit 24dcec6 into main Feb 4, 2025
10 of 15 checks passed