Scientific AI and the Future of OME-Zarr: Building Intelligent Bioimage Analysis Workflows

fideus-labs/scientific-ai-omezarr-tutorial

Scientific AI and the Future of OME-Zarr

Building Intelligent Bioimage Analysis Workflows

Matt McCormick, PhD, fideus labs

EMBL Advanced Methods in Bioimage Analysis, September 17, 2025

🌐 HTML slides | 📄 PDF slides | 📂 GitHub repository

📜 License: Content CC-BY-4.0 | Code MIT


Today's Journey

50 minutes + 10 minutes Q&A

  1. Extended introduction to ngff-zarr (15 min)
    • Converting bioimages to OME-Zarr
  2. Introduction to MCP Servers (15 min)
    • Add the ngff-zarr MCP server to agentic AI tools
  3. The ngff-zarr MCP Server in Action (15 min)
    • AI-powered conversions and batch processing
  4. fideus labs introduction (5 min)


Part 1: Introduction to ngff-zarr

Next-Generation Scientific Imaging


What is OME-Zarr?

  • Cloud-native bioimaging file format from the Open Microscopy Environment (OME)
  • Built on Zarr - chunked, compressed array storage
  • Multiscale pyramidal data structure
  • Interoperable across platforms and tools
  • FAIR data principles: Findable, Accessible, Interoperable, Reusable
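
To make the multiscale pyramid concrete, here is a minimal sketch. It assumes each level halves every axis until the image is no larger than a size threshold. The function name `pyramid_shapes`, the factor of 2, and the 64-voxel threshold are illustrative assumptions, not values required by the OME-Zarr specification.

```python
def pyramid_shapes(shape, min_size=64, factor=2):
    """Return the array shape at each pyramid level, full resolution first."""
    current = tuple(shape)
    levels = [current]
    # Keep downsampling while any axis is still larger than min_size.
    while max(current) > min_size:
        # Ceiling division keeps the partial voxel at the edge; never go below 1.
        current = tuple(max(1, -(-n // factor)) for n in current)
        levels.append(current)
    return levels


if __name__ == "__main__":
    # Hypothetical 512 x 1024 x 1024 volume.
    for level, level_shape in enumerate(pyramid_shapes((512, 1024, 1024))):
        print(level, level_shape)
```

A viewer can then open the coarsest level for an overview and fetch finer levels only for the region on screen.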

Why OME-Zarr Matters

Traditional Problems:

  • 🏭 Vendor-specific proprietary formats
  • 📦 Monolithic files difficult to stream
  • ☁️ Limited cloud compatibility
  • 🐢 Poor scalability for large datasets

OME-Zarr Solutions:

  • 📖 Open specification
  • 🧩 Chunked data access
  • 🌐 Cloud-optimized storage
  • ⚡ Parallel processing friendly
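
Chunked access means a reader fetches only the chunks that overlap the region it needs. The sketch below shows the index arithmetic, assuming a regular chunk grid; `chunks_for_region` and the shapes used are hypothetical names and values chosen for illustration.

```python
from itertools import product


def chunks_for_region(start, stop, chunk_shape):
    """List the chunk grid indices that overlap the half-open region [start, stop)."""
    per_axis = []
    for lo, hi, size in zip(start, stop, chunk_shape):
        first = lo // size        # chunk holding the first requested voxel
        last = (hi - 1) // size   # chunk holding the last requested voxel
        per_axis.append(range(first, last + 1))
    return list(product(*per_axis))


# One 100 x 100 pixel window from a single slice, with 64^3 chunks.
roi_chunks = chunks_for_region(
    start=(0, 100, 100), stop=(1, 200, 200), chunk_shape=(64, 64, 64)
)
print(len(roi_chunks))  # 9 chunks are read
```

For a 512 x 1024 x 1024 volume stored in 64-voxel chunks (2048 chunks in total), this request reads 9 of them, and each chunk can be fetched in parallel.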

What is ngff-zarr?

  • ngff-zarr is a lean and kind open-source toolkit for working with OME-Zarr, the next-generation file format for scientific imaging.
  • Provides command-line, Python, TypeScript, and AI interfaces for converting, validating, optimizing, and analyzing bioimaging data.
  • Developed by the OME-Zarr and ITK communities for interoperability and performance.
  • Supports a wide range of scientific image formats and workflows.

[Image: ngff-zarr logo]


What can ngff-zarr do for you?

  • 🔄 Convert your scientific images (NRRD, TIFF, HDF5, and more) to OME-Zarr for scalable, cloud-ready storage.
  • ✅ Validate OME-Zarr datasets to ensure compliance and interoperability.
  • 🛠️ Optimize chunking and compression for efficient access and storage.
  • 🤖 Integrate with AI and analysis tools via the Model Context Protocol (MCP).
  • 🚀 Automate batch processing and reproducible workflows for large-scale projects.

🛠️ Hands-On: Converting bioimages to OME-Zarr


💻 Prerequisites: VS Code Installation

Download VS Code:

  • 🌐 Web: Visit code.visualstudio.com
  • 🐧 Linux: sudo snap install code --classic or download .deb/.rpm
  • 🍎 macOS: Download from website or brew install --cask visual-studio-code
  • 🪟 Windows: Download installer or winget install Microsoft.VisualStudioCode

[Image: Pixi logo]

📦 Prerequisites: Pixi reproducible software environment


What is Pixi?

Pixi is a fast, modern, and reproducible package and environment manager built on the conda ecosystem. It provides:

  • 🚀 Easy, reproducible environments for any language
  • 🛠️ Task runner for project automation
  • 🔒 Isolation and cross-platform support (Linux, macOS, Windows)
  • 📦 Simple dependency management with a single file (pixi.toml or pyproject.toml)

⬇️ How to install Pixi

On Linux/macOS:

wget -qO- https://pixi.sh/install.sh | sh

On Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm -useb https://pixi.sh/install.ps1 | iex"

After installation, add ~/.pixi/bin (Linux/macOS) or %USERPROFILE%\.pixi\bin (Windows) to your PATH if not done automatically.


🚀 How to run Pixi tasks

Pixi lets you define and run project tasks in your pixi.toml or pyproject.toml.

To run a task (e.g., start):

pixi run start

You can define custom tasks (like test, lint, etc.) and run them the same way:

pixi run test
pixi run lint

Pixi ensures all dependencies and the environment are set up before running your task.


🐚 Interactive shell with pixi shell

Enter an interactive shell with your project environment activated:

pixi shell

What happens:

  • 🔧 Environment activated - all dependencies available
  • 🎯 Direct command execution - no need for pixi run prefix
  • 🚪 Easy exit - just type exit when done

👩‍💻 Exercise 1: Convert the sample NRRD image to OME-Zarr

pixi run convert

What Just Happened?

  • 🔍 Automatic multiscale generation - without aliasing artifacts
  • 🧩 Intelligent chunking - optimized for access patterns
  • 📊 Metadata preservation - spatial information maintained
  • 🗜️ Compression applied - reduced file size
  • ☁️ Cloud-ready format - object-store optimized, can be served via HTTP
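
Why "without aliasing artifacts" matters can be seen on a one-dimensional toy signal. The sketch below compares naive subsampling with block averaging. Block averaging is used here only as a simple stand-in for a smoothing filter; it is an assumption for illustration and not necessarily the filter ngff-zarr applies.

```python
def subsample(signal, factor=2):
    """Naive downsampling: keep every `factor`-th sample (prone to aliasing)."""
    return signal[::factor]


def average_downsample(signal, factor=2):
    """Downsample by averaging each block of `factor` samples."""
    usable = len(signal) - len(signal) % factor  # drop a trailing partial block
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


stripes = [0, 1] * 4  # finest possible stripe pattern

print(subsample(stripes))           # [0, 0, 0, 0]: the stripes vanish entirely
print(average_downsample(stripes))  # [0.5, 0.5, 0.5, 0.5]: mean intensity is kept
```

Naive subsampling reports a black image at the coarser level, while smoothing before downsampling preserves the average intensity.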

👩‍💻 Exercise 2: Convert the sample NRRD image to OME-Zarr version 0.5

pixi run convert-ome-zarr-0.5
# Count the number of files created
find carp.ome.zarr -type f | wc -l
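
On systems without find and wc (for example a default Windows shell), the same count can be taken from Python. This is a minimal sketch; the helper name `count_files` is made up for illustration, and it counts anything that resolves to a regular file, which is roughly what the shell pipeline does.

```python
from pathlib import Path


def count_files(store):
    """Count regular files under a directory store, including dotfiles."""
    return sum(1 for path in Path(store).rglob("*") if path.is_file())


if __name__ == "__main__":
    print(count_files("carp.ome.zarr"))
```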

👩‍💻 Exercise 3: Convert the sample NRRD image to OME-Zarr with sharding

pixi run convert-sharding
# Count the number of files created
find carp.ome.zarr -type f | wc -l

What Just Happened? ✨ New in OME-Zarr 0.5

  • 🪣 Sharding enabled - multiple chunks stored in single files
  • 📦 Optimized storage - fewer small files, better filesystem performance

What is Sharding? Sharding combines multiple small chunks into larger "shard" files, dramatically reducing the number of files needed to store data while maintaining random access capabilities.
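
The effect on file count is simple arithmetic. The sketch below assumes one file per chunk without sharding and one file per shard with sharding, for a single resolution level and ignoring metadata files. The array, chunk, and shard shapes are hypothetical, not the settings used by the exercise.

```python
from math import ceil, prod


def count_blocks(array_shape, block_shape):
    """Number of blocks needed to tile an array, counting partial edge blocks."""
    return prod(ceil(n / b) for n, b in zip(array_shape, block_shape))


array_shape = (512, 1024, 1024)  # hypothetical volume
chunk_shape = (64, 64, 64)       # hypothetical chunk size
shard_shape = (256, 256, 256)    # hypothetical shard size, a multiple of the chunk size

print(count_blocks(array_shape, chunk_shape))  # 2048 files without sharding
print(count_blocks(array_shape, shard_shape))  # 32 files with sharding
```

Each shard in this example holds 64 chunks, and each chunk remains individually addressable inside its shard.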


Part 2: Introduction to MCP Servers

Connecting AI to Your Data


🧠 Understanding Large Language Model (LLM) Context

What is Model Context?

  • 📝 Information the AI model can "see" and reason about
  • 🧮 Limited capacity - typically measured in tokens (words, word fragments, or symbols)
  • ⏱️ Temporary memory - context is conversation-specific
  • 🎯 Scope of knowledge for making informed decisions

Why Context Matters:

  • 🔍 Better understanding - more relevant, accurate responses
  • 🎛️ Tool selection - AI chooses appropriate tools for the task
  • 🔗 Data integration - combines multiple information sources
  • 🚀 Workflow automation - maintains state across complex operations

The Challenge: How do we give AI access to your scientific data and tools?


What is the Model Context Protocol (MCP)?

Universal standard for connecting AI assistants to external data and tools

Key Components:

  • 🤖 MCP Client - integrated in AI applications
  • 🖥️ MCP Server - exposes specific capabilities
  • 🔗 Transport Layer - JSON-RPC 2.0 messages over STDIO or HTTP
  • 🔧 Standardized Interface - tools, resources, prompts

MCP Architecture

AI Application (Qodo, Claude, etc.)
    ↕ JSON-RPC 2.0
MCP Client
    ↕ STDIO/HTTP
MCP Server (ngff-zarr)
    ↕
Scientific Data & Tools
Scientific Data & Tools

Benefits:

  • Single protocol for all integrations
  • Bidirectional communication
  • Context-aware AI interactions
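
To show what travels between client and server, here is a minimal sketch of two JSON-RPC 2.0 requests built in Python. The method names `tools/list` and `tools/call` come from the MCP specification; the tool name `convert_to_ome_zarr` and its arguments are hypothetical placeholders, not the actual tool schema of the ngff-zarr MCP server.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a Python dict."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool. The tool name and arguments are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {
        "name": "convert_to_ome_zarr",
        "arguments": {"input": "vs_male.nrrd", "output": "vs_male.ome.zarr"},
    },
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In a real session the client first performs an initialize handshake, and the AI application generates these messages for you; the sketch only shows their shape.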

Why MCP for Scientific Computing?

Before MCP:

  • 🔧 Custom integrations for each tool
  • 🚫 Limited AI access to scientific data
  • ✋ Manual, error-prone workflows

With MCP:

  • 💬 Natural language interface to scientific tools
  • 🤖 Automated data processing pipelines
  • 🧠 AI-driven optimization and analysis
  • 🔄 Reproducible computational workflows

🛠️ Hands-On: Configure Qodo with the ngff-zarr MCP


Install uv, if not already installed

pixi global install uv

uvx, which ships with uv, installs the ngff-zarr-mcp command-line tool and its dependencies on demand and runs the MCP server.


Install Qodo Extension in VS Code

[Image: Qodo extension]


Add Qodo MCP Tools

[Image: Qodo Add MCP Tools]


Add new MCP

[Image: Qodo Add new MCP]


Add the ngff-zarr MCP server config

{
  "mcpServers": {
    "ngffZarr": {
      "command": "uvx",
      "args": ["ngff-zarr-mcp"]
    }
  }
}
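
If a client already has other MCP servers configured, the ngffZarr entry sits alongside them under the same "mcpServers" key. The sketch below merges the entry into an existing configuration without modifying the original. The helper `add_mcp_server` and the pre-existing "other" server are hypothetical and only illustrate the structure.

```python
import json


def add_mcp_server(config, name, command, args):
    """Return a copy of `config` with one more entry under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


# Hypothetical configuration that already lists another server.
existing = {"mcpServers": {"other": {"command": "npx", "args": ["some-server"]}}}

updated = add_mcp_server(existing, "ngffZarr", "uvx", ["ngff-zarr-mcp"])
print(json.dumps(updated, indent=2))
```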

[Image: Qodo ngff-zarr MCP server]


Watch the ngff-zarr MCP server start

[Image: Qodo ngff-zarr MCP server start]


Part 3: The ngff-zarr MCP Server

AI-Powered Scientific Image Processing


ngff-zarr MCP Server Capabilities

Core Functions:

  • 🔄 Convert scientific formats to OME-Zarr
  • 🔍 Inspect and validate OME-Zarr stores
  • 🛠️ Optimize compression and chunking
  • 📝 Generate processing scripts
  • 📦 Batch operation planning

AI Integration:

  • 💬 Natural language commands
  • 🎯 Intelligent parameter selection
  • 🤖 Automated workflow generation

🛠️ Hands-On: AI-Powered Conversion


💬 Convert a bioimage with AI assistance

Put the Qodo Anteater to work!

In Qodo chat:

Convert the vs_male.nrrd file to OME-Zarr format and
find the optimal compression codec for this type of data.

✨ Watch the AI agent:

  1. 🔍 Analyze the input file
  2. 🎯 Select appropriate parameters
  3. ⚙️ Execute the conversion
  4. 📊 Report optimization results
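
"Finding the optimal codec" comes down to compressing representative data with each candidate and comparing the results. The sketch below does this with codecs from the Python standard library on synthetic bytes. Both are stand-ins chosen so the example runs anywhere: real OME-Zarr stores typically use other codecs (for example Blosc or Zstandard), and the agent's actual procedure may differ.

```python
import bz2
import lzma
import zlib


def compare_codecs(data):
    """Return {codec name: compressed size / original size} for a bytes object."""
    codecs = {
        "zlib": lambda raw: zlib.compress(raw, 6),
        "bz2": lambda raw: bz2.compress(raw, 9),
        "lzma": lambda raw: lzma.compress(raw),
    }
    return {name: len(fn(data)) / len(data) for name, fn in codecs.items()}


# Synthetic stand-in for one image chunk: an 8-bit ramp repeated many times.
sample = bytes(range(256)) * 1024

# Print the codecs from best to worst compression ratio.
for name, ratio in sorted(compare_codecs(sample).items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.4f}")
```

A fuller comparison would also time compression and decompression, since the smallest output is not always the fastest to read.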

💬 Examine OME-Zarr contents

Ask the AI to explore:

Examine the contents of carp.ome.zarr and tell me
about its structure, dimensions, and metadata

✨ The AI agent will:

  • 🔍 Inspect multiscale levels
  • 📐 Report spatial metadata
  • 🧩 Analyze chunk structure
  • ✨ Suggest next steps
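
The multiscale levels the agent reports can also be read by hand from the store's root metadata. The sketch below assumes the metadata has already been parsed into a dict, either from `.zattrs` (OME-Zarr 0.4) or from `zarr.json` (OME-Zarr 0.5, where it is nested under "attributes" and "ome"). The function name `list_scales` and the example values are illustrative.

```python
def list_scales(root_metadata):
    """Return (dataset path, scale factors) pairs from OME-Zarr root metadata."""
    attributes = root_metadata.get("attributes", root_metadata)  # 0.5 nests attributes
    ome = attributes.get("ome", attributes)                      # 0.5 adds an "ome" key
    result = []
    for dataset in ome["multiscales"][0]["datasets"]:
        # Pick the first transformation of type "scale", if there is one.
        scale = next(
            (t["scale"] for t in dataset["coordinateTransformations"]
             if t["type"] == "scale"),
            None,
        )
        result.append((dataset["path"], scale))
    return result


# Hypothetical 0.4-style metadata with two resolution levels.
example = {
    "multiscales": [{
        "datasets": [
            {"path": "scale0/image",
             "coordinateTransformations": [{"type": "scale", "scale": [1.0, 0.5, 0.5]}]},
            {"path": "scale1/image",
             "coordinateTransformations": [{"type": "scale", "scale": [2.0, 1.0, 1.0]}]},
        ]
    }]
}
for path, scale in list_scales(example):
    print(path, scale)
```

For a store on disk, the dict would come from something like `json.loads(Path("carp.ome.zarr/zarr.json").read_text())`.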

💬 Generate batch script

Scale up with AI automation:

I have a folder of 50 similar NRRD files.
Generate a Python script to batch convert them all
to OME-Zarr with the same optimal settings

✨ The AI agent creates:

  • 🐍 Complete Python script
  • ⚠️ Error handling
  • 📈 Progress reporting
  • 🎯 Optimized parameters from previous analysis
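
A generated script might look roughly like the sketch below. It is not the agent's actual output. The conversion step shells out to the ngff-zarr command-line tool with `-i` and `-o` flags, which is an assumption about the CLI; the function names are hypothetical, and the converter is passed in as a parameter so it can be swapped for a Python API call or for a stub when testing.

```python
import subprocess
from pathlib import Path


def convert_with_cli(src, dst):
    """Convert one file by shelling out to the ngff-zarr CLI (flags assumed)."""
    subprocess.run(["ngff-zarr", "-i", str(src), "-o", str(dst)], check=True)


def batch_convert(input_dir, output_dir, convert=convert_with_cli):
    """Convert every .nrrd file in input_dir; return (succeeded, failed) lists."""
    sources = sorted(Path(input_dir).glob("*.nrrd"))
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    succeeded, failed = [], []
    for index, src in enumerate(sources, start=1):
        dst = output_dir / (src.stem + ".ome.zarr")
        try:
            convert(src, dst)
            succeeded.append(dst)
            status = "ok"
        except Exception as error:  # keep going so one bad file does not stop the batch
            failed.append((src, error))
            status = f"failed: {error}"
        # Progress report: position in the batch, file names, and outcome.
        print(f"[{index}/{len(sources)}] {src.name} -> {dst.name}: {status}")
    return succeeded, failed
```

Calling `batch_convert("nrrd_inputs", "zarr_outputs")` with hypothetical folder names would process each file in turn and report failures at the end instead of aborting on the first one.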

The Future of Scientific AI

Today's Demo Shows:

  • 💬 Conversational scientific computing
  • 🤖 Automated optimization
  • 🔄 Reproducible workflows
  • ✨ Accessible advanced techniques

Tomorrow's Possibilities:

  • 🧬 Multi-modal analysis pipelines
  • 🧠 Intelligent experiment design
  • 🛡️ Automated quality control
  • 🌐 Cross-platform integration

fideus labs

Fostering trust 🤝 and advancing understanding 🧠 from scientific and biomedical images 🔬


Specialties:

  • Biomedical Imaging - ITK core development
  • Scientific Visualization - advanced rendering
  • Open science - pioneering decentralized science
  • AI Integration - intelligent workflows

Open Source Leadership:

  • ITK (Insight Toolkit) core team
  • OME-Zarr ecosystem contributor
  • Curator of ngff-zarr development

Our Approach

Research Partnership:

  • Government laboratories
  • Academic institutions
  • Industry leaders
  • Open source communities

Connect With Us

fideus labs services:

  • Custom imaging solutions
  • Scientific software development
  • Training and consultation

Connect

We are hiring! Send us your CV and GitHub profile.


Key Takeaways

✅ OME-Zarr - Future of scientific imaging formats

✅ MCP Servers - Bridge AI and scientific tools

✅ Natural Language - New interface for scientific computing

✅ Accessible Research - Cloud-native, collaborative science


Questions & Discussion

What we covered:

  • OME-Zarr fundamentals and conversion
  • MCP architecture and benefits
  • AI-powered scientific workflows

Let's discuss:

  • Your specific use cases
  • Integration challenges
  • Future possibilities
  • Next steps for implementation
