| marp | theme | class | paginate | backgroundColor | color | header | footer |
|---|---|---|---|---|---|---|---|
| true | default | lead | true | | | Scientific AI and the Future of OME-Zarr | [](https://fideus.io) Matt McCormick, PhD \| fideus labs \| EMBL BIA 2025 |
Matt McCormick, PhD, fideus labs
EMBL Advanced Methods in Bioimage Analysis, September 17, 2025
HTML slides | PDF slides | GitHub repository
50 minutes + 10 minutes Q&A
- Extended introduction to ngff-zarr (15 min)
  - Converting bioimages to OME-Zarr
- Introduction to MCP Servers (15 min)
  - Add the ngff-zarr MCP server to agentic AI tools
- The ngff-zarr MCP Server in Action (15 min)
  - AI-powered conversions and batch processing
- fideus labs introduction (5 min)
Next-Generation Scientific Imaging
What is OME-Zarr?
- Cloud-native bioimaging file format from the Open Microscopy Environment (OME)
- Built on Zarr - chunked, compressed array storage
- Multiscale pyramidal data structure
- Interoperable across platforms and tools
- FAIR data principles: Findable, Accessible, Interoperable, Reusable
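To make the chunked, multiscale layout concrete, here is a minimal Python sketch that lists the array shape and chunk count at each pyramid level. The image shape, the chunk size, the factor-of-2 downsampling, and the stopping rule are illustrative assumptions, not values taken from the OME-Zarr specification or from ngff-zarr.

```python
import math


def pyramid_levels(shape, chunk, factor=2, min_size=64):
    """Return (shape, n_chunks) per level, shrinking until an axis would drop below min_size."""
    levels = []
    current = tuple(shape)
    while True:
        # Number of chunks needed to tile the array at this resolution.
        n_chunks = math.prod(math.ceil(s / c) for s, c in zip(current, chunk))
        levels.append((current, n_chunks))
        nxt = tuple(max(1, s // factor) for s in current)
        if min(nxt) < min_size:
            break
        current = nxt
    return levels


# Hypothetical 3D volume (z, y, x) stored in 64^3 chunks.
for level, (shape, n_chunks) in enumerate(pyramid_levels((512, 1024, 1024), (64, 64, 64))):
    print(f"level {level}: shape={shape}, chunks={n_chunks}")
```

A viewer that only needs an overview can read the few chunks of the coarsest level instead of the thousands at full resolution, which is what makes the pyramid useful for streaming.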
Traditional Problems:
- Vendor-specific proprietary formats
- Monolithic files difficult to stream
- Limited cloud compatibility
- Poor scalability for large datasets
OME-Zarr Solutions:
- Open specification
- Chunked data access
- Cloud-optimized storage
- Parallel processing friendly
- ngff-zarr is a lean and kind open-source toolkit for working with OME-Zarr, the next-generation file format for scientific imaging.
- Provides command-line, Python, TypeScript, and AI interfaces for converting, validating, optimizing, and analyzing bioimaging data.
- Developed by the OME-Zarr and ITK communities for interoperability and performance.
- Supports a wide range of scientific image formats and workflows.
- Convert your scientific images (NRRD, TIFF, HDF5, and more) to OME-Zarr for scalable, cloud-ready storage.
- Validate OME-Zarr datasets to ensure compliance and interoperability.
- Optimize chunking and compression for efficient access and storage.
- Integrate with AI and analysis tools via the Model Context Protocol (MCP).
- Automate batch processing and reproducible workflows for large-scale projects.
Install Visual Studio Code
Download VS Code:
- Web: Visit code.visualstudio.com
- Linux: `sudo snap install code --classic` or download .deb/.rpm
- macOS: Download from website or `brew install --cask visual-studio-code`
- Windows: Download installer or `winget install Microsoft.VisualStudioCode`
Pixi is a fast, modern, and reproducible package and environment manager built on the conda ecosystem. It provides:
- Easy, reproducible environments for any language
- Task runner for project automation
- Isolation and cross-platform support (Linux, macOS, Windows)
- Simple dependency management with a single file (`pixi.toml` or `pyproject.toml`)
On Linux/macOS:

    wget -qO- https://pixi.sh/install.sh | sh

On Windows (PowerShell):

    powershell -ExecutionPolicy ByPass -c "irm -useb https://pixi.sh/install.ps1 | iex"

After installation, add ~/.pixi/bin (Linux/macOS) or %USERPROFILE%\.pixi\bin (Windows) to your PATH if not done automatically.
Pixi lets you define and run project tasks in your pixi.toml or pyproject.toml.
To run a task (e.g., start):

    pixi run start

You can define custom tasks (like test, lint, etc.) and run them the same way:

    pixi run test
    pixi run lint

Pixi ensures all dependencies and the environment are set up before running your task.
Enter an interactive shell with your project environment activated:

    pixi shell

What happens:
- Environment activated - all dependencies available
- Direct command execution - no need for `pixi run` prefix
- Easy exit - just type `exit` when done
    pixi run convert

- Automatic multiscale generation - without aliasing artifacts
- Intelligent chunking - optimized for access patterns
- Metadata preservation - spatial information maintained
- Compression applied - reduced file size
- Cloud-ready format - object-store optimized, can be served via HTTP
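The phrase "without aliasing artifacts" can be illustrated with a small Python sketch. It compares naive strided downsampling with block averaging on a 1D stripe pattern. The box filter used here is a stand-in chosen for brevity; the source does not say which smoothing method ngff-zarr applies, so this is not a description of its implementation.

```python
def downsample_stride(signal, factor=2):
    """Naive downsampling: keep every factor-th sample (prone to aliasing)."""
    return signal[::factor]


def downsample_mean(signal, factor=2):
    """Average each block of `factor` samples, a simple low-pass filter before decimation."""
    usable = len(signal) - len(signal) % factor  # drop a trailing partial block
    return [sum(signal[i:i + factor]) / factor for i in range(0, usable, factor)]


stripes = [0, 1] * 4  # finest possible stripe pattern at this resolution

print(downsample_stride(stripes))  # [0, 0, 0, 0]: the stripes vanish entirely
print(downsample_mean(stripes))    # [0.5, 0.5, 0.5, 0.5]: mean intensity is preserved
```

Striding reports a uniformly dark signal that was never there, while averaging keeps the overall brightness, which is the behavior a coarse pyramid level should have.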
    pixi run convert-ome-zarr-0.5
    # Count the number of files created
    find carp.ome.zarr -type f | wc -l

    pixi run convert-sharding
    # Count the number of files created
    find carp.ome.zarr -type f | wc -l

- Sharding enabled - multiple chunks stored in single files
- Optimized storage - fewer small files, better filesystem performance
What is Sharding? Sharding combines multiple small chunks into larger "shard" files, dramatically reducing the number of files needed to store data while maintaining random access capabilities.
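The effect on file count can be estimated with simple arithmetic. The sketch below assumes one file per chunk without sharding and one file per shard with sharding, and ignores metadata files. The array shape, chunk shape, and chunks-per-shard values are hypothetical, not the settings of the carp dataset.

```python
import math


def count_chunks(shape, chunk_shape):
    """Number of chunks needed to tile an array of the given shape."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunk_shape))


def count_shards(shape, chunk_shape, chunks_per_shard):
    """Number of shard files when each shard groups a block of chunks along every axis."""
    shard_shape = tuple(c * n for c, n in zip(chunk_shape, chunks_per_shard))
    return math.prod(math.ceil(s / sh) for s, sh in zip(shape, shard_shape))


shape = (512, 1024, 1024)      # hypothetical volume (z, y, x)
chunk_shape = (64, 64, 64)     # hypothetical chunk size
chunks_per_shard = (2, 4, 4)   # hypothetical: 32 chunks per shard

n_chunks = count_chunks(shape, chunk_shape)
n_shards = count_shards(shape, chunk_shape, chunks_per_shard)
print(f"unsharded: {n_chunks} files, sharded: {n_shards} files")
```

With these numbers, 2048 chunk files collapse into 64 shard files, while each chunk inside a shard remains individually addressable.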
Connecting AI to Your Data
What is Model Context?
- Information the AI model can "see" and reason about
- Limited capacity - typically measured in tokens (words/symbols)
- Temporary memory - context is conversation-specific
- Scope of knowledge for making informed decisions
Why Context Matters:
- Better understanding - more relevant, accurate responses
- Tool selection - AI chooses appropriate tools for the task
- Data integration - combines multiple information sources
- Workflow automation - maintains state across complex operations
The Challenge: How do we give AI access to your scientific data and tools?
Universal standard for connecting AI assistants to external data and tools
Key Components:
- MCP Client - integrated in AI applications
- MCP Server - exposes specific capabilities
- Transport Layer - JSON-RPC 2.0 communication
- Standardized Interface - tools, resources, prompts
    AI Application (Qodo, Claude, etc.)
        ↕ JSON-RPC 2.0
    MCP Client
        ↕ STDIO/HTTP
    MCP Server (ngff-zarr)
        ↕
    Scientific Data & Tools
Benefits:
- Single protocol for all integrations
- Bidirectional communication
- Context-aware AI interactions
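For a sense of what travels between client and server, here is a minimal Python sketch of a JSON-RPC 2.0 request that asks an MCP server to run a tool. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification. The tool name and its arguments are hypothetical placeholders, not the actual interface of the ngff-zarr MCP server.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = make_tool_call(
    1,
    "convert_to_ome_zarr",  # hypothetical tool name, for illustration only
    {"input": "vs_male.nrrd", "output": "vs_male.ome.zarr"},  # hypothetical arguments
)
print(json.dumps(request, indent=2))
```

The server replies with a response carrying the same `id`, which is how the client matches results to requests.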
Before MCP:
- Custom integrations for each tool
- Limited AI access to scientific data
- Manual, error-prone workflows
With MCP:
- Natural language interface to scientific tools
- Automated data processing pipelines
- AI-driven optimization and analysis
- Reproducible computational workflows
Install uv, if not already installed:

    pixi global install uv

uvx, which comes with uv, will be used to install the ngff-zarr-mcp command-line tool and its dependencies, and run the MCP server.
    {
      "mcpServers": {
        "ngffZarr": {
          "command": "uvx",
          "args": ["ngff-zarr-mcp"]
        }
      }
    }

AI-Powered Scientific Image Processing
Core Functions:
- Convert scientific formats to OME-Zarr
- Inspect and validate OME-Zarr stores
- Optimize compression and chunking
- Generate processing scripts
- Batch operation planning
AI Integration:
- Natural language commands
- Intelligent parameter selection
- Automated workflow generation
Put the Qodo Anteater to work!
In Qodo chat:

    Convert the vs_male.nrrd file to OME-Zarr format and
    find the optimal compression codec for this type of data.

Watch the AI agent:
- Analyze the input file
- Select appropriate parameters
- Execute the conversion
- Report optimization results
Ask the AI to explore:

    Examine the contents of carp.ome.zarr and tell me
    about its structure, dimensions, and metadata

The AI agent will:
- Inspect multiscale levels
- Report spatial metadata
- Analyze chunk structure
- Suggest next steps
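The kind of inspection described above can be approximated by hand. The Python sketch below summarizes the resolution levels in a `multiscales` metadata dictionary laid out as in OME-Zarr 0.4; in version 0.5 the same structure is expected under an `ome` key in `zarr.json`. The example metadata is made up for illustration and is not the content of carp.ome.zarr.

```python
def summarize_multiscales(attrs):
    """Return (path, {axis: scale}) for each level of the first multiscale image."""
    multiscale = attrs["multiscales"][0]
    axis_names = [axis["name"] for axis in multiscale["axes"]]
    summary = []
    for dataset in multiscale["datasets"]:
        # Each level carries a "scale" transformation giving its voxel spacing.
        scale = next(
            t["scale"] for t in dataset["coordinateTransformations"] if t["type"] == "scale"
        )
        summary.append((dataset["path"], dict(zip(axis_names, scale))))
    return summary


# Hypothetical metadata for a two-level pyramid.
example_attrs = {
    "multiscales": [{
        "version": "0.4",
        "axes": [
            {"name": "z", "type": "space", "unit": "micrometer"},
            {"name": "y", "type": "space", "unit": "micrometer"},
            {"name": "x", "type": "space", "unit": "micrometer"},
        ],
        "datasets": [
            {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 0.5, 0.5]}]},
            {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [2.0, 1.0, 1.0]}]},
        ],
    }]
}

for path, spacing in summarize_multiscales(example_attrs):
    print(f"level {path}: spacing={spacing}")
```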
Scale up with AI automation:

    I have a folder of 50 similar NRRD files.
    Generate a Python script to batch convert them all
    to OME-Zarr with the same optimal settings

The AI agent creates:
- Complete Python script
- Error handling
- Progress reporting
- Optimized parameters from previous analysis
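A script of the kind requested might look roughly like the sketch below. It is not output from the AI agent. The converter is passed in as a function so the loop, error handling, and progress reporting can be read independently of any tool. The default converter shells out to an `ngff-zarr` command with `-i` and `-o` flags, which is an assumption about the CLI that should be checked against its documentation.

```python
import subprocess
from pathlib import Path


def cli_convert(src, dst):
    """Convert one file by shelling out; the command name and flags are assumptions."""
    subprocess.run(["ngff-zarr", "-i", str(src), "-o", str(dst)], check=True)


def batch_convert(input_dir, output_dir, convert=cli_convert):
    """Convert every .nrrd file in input_dir; return (converted, failed) lists."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    sources = sorted(input_dir.glob("*.nrrd"))
    converted, failed = [], []
    for index, src in enumerate(sources, start=1):
        dst = output_dir / (src.stem + ".ome.zarr")
        try:
            convert(src, dst)
            converted.append(dst)
            status = "ok"
        except Exception as error:  # keep going so one bad file does not stop the batch
            failed.append((src, str(error)))
            status = f"failed: {error}"
        print(f"[{index}/{len(sources)}] {src.name}: {status}")
    return converted, failed
```

Calling `batch_convert("nrrd_inputs", "zarr_outputs")` (both directory names are placeholders) processes the whole folder and returns the failures for later review instead of aborting on the first error.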
Today's Demo Shows:
- Conversational scientific computing
- Automated optimization
- Reproducible workflows
- Accessible advanced techniques
Tomorrow's Possibilities:
- Multi-modal analysis pipelines
- Intelligent experiment design
- Automated quality control
- Cross-platform integration
Fostering trust and advancing understanding from scientific and biomedical images
About fideus labs
Specialties:
- Biomedical Imaging - ITK core development
- Scientific Visualization - advanced rendering
- Open science - pioneering decentralized science
- AI Integration - intelligent workflows
Open Source Leadership:
- ITK (Insight Toolkit) core team
- OME-Zarr ecosystem contributor
- Curate ngff-zarr development
Research Partnership:
- Government laboratories
- Academic institutions
- Industry leaders
- Open source communities
fideus labs services:
- Custom imaging solutions
- Scientific software development
- Training and consultation
Connect
We are hiring! Send us your CV and GitHub profile.
- OME-Zarr - Future of scientific imaging formats
- MCP Servers - Bridge AI and scientific tools
- Natural Language - New interface for scientific computing
- Accessible Research - Cloud-native, collaborative science
What we covered:
- OME-Zarr fundamentals and conversion
- MCP architecture and benefits
- AI-powered scientific workflows
Let's discuss:
- Your specific use cases
- Integration challenges
- Future possibilities
- Next steps for implementation





