Merge pull request #16 from coudertlab/interpenetratedtopologyresult
Add InterpenetratedTopologyResult
Liozou authored Jul 10, 2023
2 parents 57932eb + 0f67e5c commit d4f2e84
Showing 18 changed files with 318 additions and 144 deletions.
1 change: 1 addition & 0 deletions .github/workflows/CI.yml
@@ -18,6 +18,7 @@ jobs:
matrix:
version:
- '1.6'
- '1.9'
- 'nightly'
os:
- ubuntu-latest
2 changes: 1 addition & 1 deletion .github/workflows/documentation.yml
@@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@latest
with:
version: '1.6'
version: '1.9'
- name: Install dependencies
run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
- name: Build and deploy
6 changes: 3 additions & 3 deletions Project.toml
@@ -1,7 +1,7 @@
name = "CrystalNets"
uuid = "7952bbbe-a946-4118-bea0-081a0932faa9"
authors = ["Lionel Zoubritzky [email protected]"]
version = "0.3.6"
version = "0.4.0"

[deps]
ArgParse = "c7e460c6-2fb9-53a9-8c5b-16f535851c63"
@@ -27,8 +27,8 @@ ArgParse = "1.1"
Chemfiles = "0.10"
Graphs = "1.3"
PeriodicGraphEmbeddings = "0.2.2"
PeriodicGraphEquilibriumPlacement = "0.1, 0.2"
PeriodicGraphs = "0.8.1, 0.9"
PeriodicGraphEquilibriumPlacement = "0.2"
PeriodicGraphs = "0.10"
Pkg = "1.5"
ProgressMeter = "1.7"
PrecompileTools = "1"
33 changes: 23 additions & 10 deletions docs/src/faq.md
@@ -96,18 +96,14 @@ Most often, the difference will come from either:

## How can I do a database topology analysis with CrystalNets.jl?

The built-in way to do this consists in using the [`determine_topology_dataset`](@ref) function, or [`guess_topology_dataset`](@ref) in some cases.
These functions expect the path of a directory containing CIF files within (possibly in subdirectories).
The built-in way to do this consists in using the [`determine_topology_dataset`](@ref) function.
This function expects the path of a directory containing CIF files within (possibly in subdirectories).

## How can I directly access the genome of my structure instead of its name?

The result of [`determine_topology`](@ref) is either a [`TopologicalGenome`](@ref) or a
`Vector{Tuple{Vector{Int},TopologyResult}}`, depending on whether the input
contains multiple interpenetrating subnets or not. In the second case, extract the relevant
[`TopologyResult`](@ref).
The result `x` of [`determine_topology`](@ref) is an [`InterpenetratedTopologyResult`](@ref). Its `length` gives the number of interpenetrated substructures. Each of its values, for instance `x[1]`, is a tuple `(topo, n)` meaning that the substructure is an `n`-fold catenated net of topology `topo`. `topo` itself is a [`TopologyResult`](@ref), which stores the result of a topology computation for possibly several clusterings. The [`TopologicalGenome`](@ref) associated to a given clustering can be extracted by indexing the [`TopologyResult`](@ref), for instance `t = topo[Clustering.SingleNodes]` (or simply `t = topo[:SingleNodes]`).

A [`TopologyResult`](@ref) can store the result for different clustering options, so the
topological genome should be chosen by extracting the relevant result. For example:
For example:

```jldoctest im19faq
julia> path_to_im19 = joinpath(dirname(dirname(pathof(CrystalNets))), "test", "cif", "IM-19.cif");
@@ -117,17 +113,32 @@ AllNodes: rna
SingleNodes: bpq
julia> typeof(result)
InterpenetratedTopologyResult
julia> length(result)
1
julia> topo, n = only(result);
julia> n # catenation multiplicity
1
julia> topo
AllNodes: rna
SingleNodes: bpq
julia> typeof(topo)
TopologyResult
julia> genome_allnodes = result[Clustering.AllNodes]
julia> genome_allnodes = topo[Clustering.AllNodes]
rna
julia> typeof(genome_allnodes)
TopologicalGenome
```

In case where all clusterings lead to the same genome, it can simply be accessed
by calling `first(result)`.
by calling `first(topo)`.

Having obtained a [`TopologicalGenome`](@ref), the topological genome itself can be accessed
by converting it to a `PeriodicGraph`:
@@ -137,6 +148,8 @@ julia> genome = PeriodicGraph(genome_allnodes)
PeriodicGraph3D(6, PeriodicEdge3D[(1, 2, (0,0,0)), (1, 3, (0,0,0)), (1, 4, (0,0,0)), (1, 4, (0,0,1)), (1, 5, (0,0,0)), (1, 6, (0,0,0)), (2, 4, (0,0,1)), (2, 6, (-1,0,0)), (3, 4, (0,0,1)), (3, 5, (0,-1,0)), (4, 5, (0,0,0)), (4, 6, (0,0,0))])
```

In case of error during topology identification, the returned `genome` is a `PeriodicGraph{0}`.

The string representation of the genome is simply `string(genome)`:

``` im19faq
24 changes: 10 additions & 14 deletions docs/src/index.md
@@ -66,25 +66,19 @@ julia> determine_topology("/path/to/unstable/net.cif")
unstable 1 1 1 1 1 2 0 2 2 1
```

In both known and unknown cases, the result is a [`TopologyResult`](@ref).
In both known and unknown cases, the result is an [`InterpenetratedTopologyResult`](@ref).

#### Interpenetrating substructures

If the file contains multiple interpenetrating substructures, the result is a
`Vector{Tuple{Vector{Int}, TopologyResult}}`, where each entry is a tuple
`(vmap, result)` with:
If the file contains multiple interpenetrating substructures, each substructure and its catenation multiplicity can be extracted from the [`InterpenetratedTopologyResult`](@ref).

- `vmap`: the list of vertices of the initial graph that were kept for this substructure.
The initial graph is the one exported in .vtf as `input`. See also
[`parse_chemfile`](@ref) and [`CrystalNets.Crystal`](@ref) for manipulations on the initial graph.
- `result`: the [`TopologyResult`](@ref) for this substructure.
For example:

```julia
julia> determine_topology("/path/to/intertwinned/structures.cif")
2-element Vector{Tuple{Vector{Int64}, TopologyResult}}:
([2, 3, 4, 6], pcu)
([1, 5, 7, 8], srs)
julia> x = determine_topology("/path/to/intertwinned/structures.cif")
2 interpenetrated substructures:
Subnet 1 pcu
Subnet 2 srs
```

#### Using options
@@ -134,5 +128,7 @@ dia
Run `CrystalNets --help` for the list of options available to the executable.

!!! tip
In terms of performance, the compiled executable is the best option if you only want to identify a few structures from time to time. For intensive workloads with many structures to identify, it is best to use `CrystalNets.jl` as a Julia module through the
[`determine_topology_dataset`](@ref) and [`guess_topology_dataset`](@ref) functions. The module is also the best option to perform more advanced analyses on the net in Julia, or to use the [`Options`](@ref) unavailable to the executable.
In terms of performance, the compiled executable is the best option if you only want to identify a few structures from time to time. Using [the website](https://progs.coudert.name/topology) is recommended as well for this use-case, unless the nets you study are too big.

For intensive workloads with many structures to identify, it is best to use `CrystalNets.jl` as a Julia module through the
[`determine_topology_dataset`](@ref) function. The module is also the best option to perform more advanced analyses on the net in Julia, or to use the [`Options`](@ref) unavailable to the executable.
14 changes: 14 additions & 0 deletions docs/src/lib/internals.md
@@ -88,9 +88,23 @@ CrystalNets.expand_collisions
CrystalNets.collision_nodes
```

## Archives

```@docs
CrystalNets.make_archive
```

## Utils

```@docs
CrystalNets.@toggleassert
CrystalNets.check_dimensionality
```

## Other

```@docs
CrystalNets.guess_topology
CrystalNets.guess_topology_dataset
CrystalNets.recognize_topology
```
7 changes: 2 additions & 5 deletions docs/src/lib/public.md
@@ -7,18 +7,16 @@ CrystalNet
UnderlyingNets
TopologicalGenome
TopologyResult
InterpenetratedTopologyResult
```

## Main functions

```@docs
determine_topology
determine_topology_dataset
guess_topology
guess_topology_dataset
parse_chemfile
topological_genome
recognize_topology
```

## Options
@@ -28,7 +26,7 @@ CrystalNets.Options
StructureType
Bonding
Clustering
ClusterKinds
CrystalNets.ClusterKinds
```

## Other utilities
@@ -51,6 +49,5 @@ empty_default_archive!
change_current_archive!
refresh_current_archive!
add_to_current_archive!
make_archive
CrystalNets.export_arc
```
12 changes: 5 additions & 7 deletions docs/src/python.md
@@ -66,7 +66,7 @@ The same warnings are printed at the beginning, followed by the same exports. Th

## Usage

Let's now consider a programmatic use-case where the goal is to identify the topology of a complex MOF structure according the [`SingleNodes`](@ref Clustering) and [`AllNodes`](@ref Clustering) clusterings. The main structure may contain interpenetrating substructures and for each substructure.
Let's now consider a programmatic use-case where the goal is to identify the topology of a complex MOF structure according the [`SingleNodes`](@ref Clustering) and [`AllNodes`](@ref Clustering) clusterings. The main structure may contain interpenetrating substructures.

The function is expected to error if the topologies are different between the two clusterings. Otherwise, it returns a list of pairs whose first element is the dimensionality of the subnet and the second element is the name of the corresponding topology. If there is no known name, the topological genome is used instead.
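
The contract described above can be tried out without a Julia installation. The following is a minimal sketch in which plain Python tuples and dicts stand in for the `InterpenetratedTopologyResult` returned by `jl.determine_topology`. The names `MockGenome`, `unique_topology`, `summarize` and the mock data are hypothetical and are not part of CrystalNets.jl or its Python bindings; only the `(topology, multiplicity)` pair shape is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MockGenome:
    """Hypothetical stand-in for a TopologicalGenome: dimensionality and name."""
    dim: int
    name: str


def unique_topology(topo):
    """Return (dimensionality, name) if both clusterings agree, else raise."""
    single = topo["SingleNodes"]
    allnodes = topo["AllNodes"]
    if single != allnodes:
        raise ValueError(
            f"SingleNodes ({single.name}) and AllNodes ({allnodes.name}) differ"
        )
    return (single.dim, single.name)


def summarize(result):
    """Visit each (topology, multiplicity) pair, one per substructure."""
    return [unique_topology(topo) for topo, _nfold in result]


# Mock result (assumed data): two interpenetrated substructures,
# the first one 2-fold catenated.
mock_result = [
    ({"SingleNodes": MockGenome(3, "pcu"), "AllNodes": MockGenome(3, "pcu")}, 2),
    ({"SingleNodes": MockGenome(3, "srs"), "AllNodes": MockGenome(3, "srs")}, 1),
]

print(summarize(mock_result))
```

The multiplicity is unpacked and ignored here because the function described above only reports the dimensionality and the name of each subnet.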

@@ -77,12 +77,10 @@ def identify_topology(cif):
options = jl.CrystalNets.Options(structure=jl.StructureType.MOF)
# Since the structure is specified as a MOF, the default clusterings are AllNodes and SingleNodes
result = jl.determine_topology(cif, options) # Main call
if jl.isa(result, jl.Vector): # indicates interpenetrating substructures
# for each x in result:
# * x[0] is the list of nodes belonging to the substructure
# * x[1] is the topology of the substructure
return [check_unique_topology(x[1]) for x in result]
return [check_unique_topology(result)]
# for each x in result:
# * x[0] is the topology of the substructure.
# * x[1] is the catenation multiplicity of this subnet.
return [check_unique_topology(x[0]) for x in result]

def check_unique_topology(result):
singlenodes = result[jl.Clustering.SingleNodes] # topology for SingleNodes
2 changes: 1 addition & 1 deletion docs/src/visualization.md
@@ -183,4 +183,4 @@ Other available export options are disabled by default:
- `export_clusters` for the clusters. The only difference with `export_subnets` is that the
graph induced by the clusters is not trimmed yet.
- `export_net` for the net before separation into connected components. This is equivalent
to catenating the result of `export_subnets` into a single file.
to concatenating the result of `export_subnets` into a single file.
7 changes: 2 additions & 5 deletions src/CrystalNets.jl
@@ -25,17 +25,14 @@ export CrystalNet,
UnderlyingNets,
TopologicalGenome,
TopologyResult,
InterpenetratedTopologyResult,
determine_topology,
determine_topology_dataset,
guess_topology,
guess_topology_dataset,
parse_chemfile,
topological_genome,
recognize_topology,
StructureType,
Bonding,
Clustering,
ClusterKinds
Clustering

using LinearAlgebra: det, dot, norm, rank, cross
import LinearAlgebra
26 changes: 15 additions & 11 deletions src/archive.jl
@@ -47,23 +47,25 @@ dia
It is also possible to directly access the topological genome as a `PeriodicGraph`
by parsing the name as a [`TopologicalGenome`](@ref):
```jldoctest
julia> parse(TopologicalGenome, "pcu").genome
julia> PeriodicGraph(parse(TopologicalGenome, "pcu"))
PeriodicGraph3D(1, PeriodicEdge3D[(1, 1, (0,0,1)), (1, 1, (0,1,0)), (1, 1, (1,0,0))])
julia> string(parse(TopologicalGenome, "nbo").genome) == REVERSE_CRYSTALNETS_ARCHIVE["nbo"]
julia> string(PeriodicGraph(parse(TopologicalGenome, "nbo"))) == REVERSE_CRYSTALNETS_ARCHIVE["nbo"]
true
```
"""
const REVERSE_CRYSTALNETS_ARCHIVE = Dict{String,String}(id => (startswith(key, "unstable") ? key[10:end] : key) for (key, id) in CRYSTALNETS_ARCHIVE)

export REVERSE_CRYSTALNETS_ARCHIVE

export clean_default_archive!,
set_default_archive!,
empty_default_archive!,
change_current_archive!,
refresh_current_archive!,
add_to_current_archive!,
make_archive,
REVERSE_CRYSTALNETS_ARCHIVE
add_to_current_archive!

# export make_archive

function _reset_archive!()
global CRYSTALNETS_ARCHIVE
@@ -317,7 +319,7 @@ function add_to_current_archive!(id::AbstractString, genome::AbstractString)
end

"""
make_archive(path, destination=nothing)
make_archive(path, destination=nothing, verbose=false)
Make an archive from the files located in the directory given by `path` and export
it to `destination`, if specified. Each file of the directory should correspond
@@ -334,22 +336,24 @@ function make_archive(path, destination, verbose=false)
verbose && print("Handling "*name*"... ")
flag = false
flagerror = Ref{Any}(Tuple{Vector{Int},String}[])
genomes::Vector{Tuple{Vector{Int},String}} = try
results::InterpenetratedTopologyResult = try
x = topological_genome(UnderlyingNets(parse_chemfile(path*f)))
verbose && println(name*" done.")
x
catch e
flag = true
flagerror[] = e
Tuple{Vector{Int},String}[]
InterpenetratedTopologyResult()
end
for (i, (vmap, genome)) in enumerate(genomes)
for (i, (topology, nfold)) in enumerate(results)
genome = string(topology)
if startswith(genome, "unstable") || genome == "non-periodic"
flag = true
push!(flagerror[]::Vector{Vector{Int}}, (vmap, genome))
push!(flagerror[]::Vector{Tuple{Vector{Int},String}}, (vmap, genome))
continue
end
arc[genome] = length(genomes) == 1 ? name : (name * '_' * string(i))
verbose && nfold != 1 && println(nfold, "-fold catenated net found for ", name)
arc[genome] = length(results) == 1 ? name : (name * '_' * string(i))
end
if flag
e = flagerror[]
30 changes: 18 additions & 12 deletions src/executable.jl
@@ -236,6 +236,19 @@ function split_clusterings(s)
end
end

"""
main(ARGS)
Function called when using the module as an executable.
Return code can be:
* 0: no error
* 1: the chemical bond system has no periodicity
* 2: invalid input
* 3: parsing error
* 4: internal CrystalNets.jl error
* 5: unhandled CrystalNets.jl error, please report
"""
function main(args)
try
_parsed_args = parse_commandline(args)
@@ -424,7 +437,7 @@ function main(args)
end
end

unets = try
unets::UnderlyingNets = try
if iskey
g = try
PeriodicGraph(input_file)
@@ -452,7 +465,7 @@ function main(args)
return invalid_input_error("""The input cannot be analyzed because of the following error:""",
e, catch_backtrace())
end
genomes::Vector{Tuple{Vector{Int},TopologyResult}} = try
genomes::InterpenetratedTopologyResult = try
topological_genome(unets)
catch e
return internal_error("""Internal error encountered while computing the topological genome:""",
@@ -488,20 +501,13 @@ function main(args)
end
=#

if length(genomes) == 1
id = genomes[1][2]
println(id)
all(x -> isnothing(x.name), values(id)) && return 1
return 0
end

if length(genomes) == 0
println(TopologyResult(""))
println(genomes)
return 1
end

println(genomes)
return 1
return 0
catch e
return unhandled_error("CrystalNets encountered an unhandled exception:",
e, catch_backtrace())
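
For scripts that wrap the executable, the return codes listed in the `main` docstring of this file can be turned into readable messages. The following is a minimal sketch: the table is copied from that docstring, while the helper name `describe_exit_code` is hypothetical and not part of CrystalNets.jl.

```python
# Return codes as documented in the `main` docstring of src/executable.jl.
EXIT_CODES = {
    0: "no error",
    1: "the chemical bond system has no periodicity",
    2: "invalid input",
    3: "parsing error",
    4: "internal CrystalNets.jl error",
    5: "unhandled CrystalNets.jl error, please report",
}


def describe_exit_code(code: int) -> str:
    """Map a process return code to its documented meaning (hypothetical helper)."""
    return EXIT_CODES.get(code, f"unknown return code {code}")


print(describe_exit_code(3))
```

In practice the argument would be the `returncode` of a finished process, for instance from `subprocess.run`; how the executable is invoked on a given system is left out because it depends on the local installation.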
2 changes: 1 addition & 1 deletion src/options.jl
@@ -116,7 +116,7 @@ function Base.parse(::Type{_Clustering}, s::AbstractString)
elseif s == "PEM"
return Clustering.PEM
end
throw(ArgumentError(lazy"No clustering from string $x"))
throw(ArgumentError(lazy"No clustering from string \"$s\""))
end

"""

2 comments on commit d4f2e84

@Liozou Liozou commented on d4f2e84 Jul 10, 2023

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/87170

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.4.0 -m "<description of version>" d4f2e848952d21d7c12de3163dab60807088b067
git push origin v0.4.0
