
Commit

Merge branch 'main' into jp-3861_pathloss_style
melanieclarke authored Feb 20, 2025
2 parents 56faa1d + 30f1606 commit 9b44bc2
Showing 2 changed files with 130 additions and 60 deletions.
145 changes: 104 additions & 41 deletions docs/jwst/skymatch/description.rst
@@ -6,31 +6,33 @@ Description

Overview
--------
The ``skymatch`` step can be used to compute sky values in a collection of
input images that contain both sky and source signal. The sky values can be
computed for each image separately or in a way that matches the sky levels
amongst the collection of images so as to minimize their differences.
This operation is typically applied before doing cosmic-ray rejection and
combining multiple images into a mosaic.
When running the ``skymatch`` step in a matching mode,
it compares *total* signal levels in *the overlap regions* of a set of input
images and computes the signal offsets either for each image *or a set/group
of images* (see the Image Groups section below) that will
minimize -- in a least squares sense -- the residuals across
the entire set. This comparison is performed directly on the input images
without resampling them onto a common grid. The overlap regions are computed
directly on the sky (celestial sphere) for each pair of input images.
Matching based on total signal level is especially useful for images that
are dominated by large, diffuse sources, where it is
difficult -- if not impossible -- to find and measure true sky.
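
As a minimal sketch (assuming the step is run directly through its Python
interface, with a hypothetical association file name), matching mode can be
invoked as follows:

.. code-block:: python

    from jwst.skymatch import SkyMatchStep

    # Sketch: compute matched sky offsets for all images in an association.
    # By default the computed values are recorded but not subtracted.
    result = SkyMatchStep.call("my_asn.json", skymethod="match")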

Note that the meaning of "sky background" depends on the chosen sky computation
method. When the matching method is used, for example, the reported "sky" value
is only the offset in levels between images and does not necessarily include
the true total sky level.

.. note::
    Throughout this document the term "sky" is used in a generic sense,
    referring to any kind of non-source background signal, which may include
    actual sky, as well as instrumental (e.g. thermal) background, etc.

The step records information in three keywords that are included in the output
files:
@@ -45,16 +47,74 @@ BKGSUB
a boolean indicating whether or not the sky was subtracted from the
output images. Note that by default the step argument "subtract" is set to
``False``, which means that the sky will *NOT* be subtracted
(see the :ref:`skymatch step arguments <skymatch_arguments>` for more
details).

Both the "BKGSUB" and "BKGLEVEL" keyword values are important information for
downstream tasks, such as
:ref:`outlier detection <outlier_detection_step>` and
:ref:`resampling <resample_step>`.
Outlier detection will use the BKGLEVEL values to internally equalize the
images, which is necessary to prevent false detections due to overall
differences in signal levels between images, and the resample step will
subtract the BKGLEVEL values from each input image when combining them into
a mosaic.
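
As an illustration, the recorded values can be inspected after the step has
run. The sketch below assumes the keywords are exposed through the datamodel
``meta.background`` attributes and uses a hypothetical file name:

.. code-block:: python

    from jwst import datamodels

    # Sketch: inspect the sky values recorded by the skymatch step
    # (assumes BKGLEVEL/BKGSUB map to meta.background.level/subtracted).
    with datamodels.open("jw00001_nrcb1_cal.fits") as model:
        print(model.meta.background.level)       # BKGLEVEL
        print(model.meta.background.subtracted)  # BKGSUB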

Sky background
--------------
For a detailed discussion of JWST background components, please see
`Rigby et al. "How Dark the Sky: The JWST Backgrounds", 2023
<https://doi.org/10.48550/arXiv.2211.09890>`_ and
`"JWST Background Model" section in the JWST User Documentation
<https://jwst-docs.stsci.edu/jwst-general-support/jwst-background-model>`_.
Here we just note that some components (e.g., in-field zodiacal light)
result in reproducible background structures in all detectors when they are
exposed simultaneously, while other components (e.g. stray light, thermal
emission) can produce a background that varies from one exposure to the
next. The type of background structure that dominates a particular dataset
affects the optimal way to group images in the ``skymatch`` step.

Image Groups
------------
When computing and matching sky background on a set of input images, a *single
sky level* (or offset, depending on the selected ``skymethod``) can be computed
either for each input image or for groups of two or more input images.

When background is dominated by zodiacal light, images taken at the same time
(e.g., NIRCam images from all short-wave detectors) can be sky matched
together; that is, a single background
level can be computed and applied to all these images because we can assume
that for the next exposure we will get a similar background structure, albeit
with an offset level (common to all images in an exposure). Using grouped
images with a common background level offers several advantages:
more data are used to compute the single sky level, and the
background level of images that do not overlap individually
with any other images in other exposures can still be adjusted (as long as they
belong to a group). This is the default operating mode for the ``skymatch``
step.

Identification of images that belong to the same "exposure" and therefore
can be grouped together is based on several attributes described in
`jwst.datamodels.ModelContainer`. This grouping is performed automatically
in the ``skymatch`` step using the
`jwst.datamodels.ModelContainer.models_grouped` property or
:py:meth:`jwst.datamodels.ModelLibrary.group_indices`.
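
For instance, a quick way to see how a set of inputs would be grouped is
sketched below, using the ``models_grouped`` property mentioned above
(the association file name is hypothetical):

.. code-block:: python

    from jwst.datamodels import ModelContainer

    # Sketch: list the exposure-level groups that skymatch would use.
    container = ModelContainer("my_asn.json")
    for group in container.models_grouped:
        print([model.meta.filename for model in group])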

However, when the background across different detectors in a single "exposure"
(or "group") is dominated by unpredictable background components, we can no
longer use a single background level for all images in a group. In this case,
it may be desirable to match image backgrounds independently. This can be
achieved either by setting the ``image_model.meta.group_id`` attribute to a
unique string or integer value for each image, or by adding the ``group_id``
attribute to the ``members`` of the input ASN table - see
`~jwst.datamodels.ModelContainer` for more details.
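
A minimal sketch of the first approach, assigning a unique ``group_id`` to
each image before running the step (the association file name is
hypothetical), is shown below:

.. code-block:: python

    from jwst.datamodels import ModelContainer
    from jwst.skymatch import SkyMatchStep

    # Sketch: force per-image sky matching by giving every image its own group.
    container = ModelContainer("my_asn.json")
    for i, model in enumerate(container):
        model.meta.group_id = f"image_{i + 1}"

    result = SkyMatchStep.call(container, skymethod="match")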

.. note::
    The group ID (``group_id``) is used by both the ``tweakreg`` and
    ``skymatch`` steps, so modifying it for one step will affect the results
    of the other step. If it is desirable to apply different grouping
    strategies to the ``tweakreg`` and ``skymatch`` steps, one may need to run
    each step individually and provide a different ASN as input to each step.
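
For example, a sketch of that workflow, with hypothetical step-specific
association files, might look like:

.. code-block:: python

    from jwst.skymatch import SkyMatchStep
    from jwst.tweakreg import TweakRegStep

    # Sketch: use a different grouping (via different ASNs) for each step.
    tweakreg_result = TweakRegStep.call("asn_for_tweakreg.json")
    skymatch_result = SkyMatchStep.call("asn_for_skymatch.json")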

Assumptions
-----------
@@ -70,9 +130,10 @@ value computations.
The first method, called "local", is essentially an enhanced version of the
original sky subtraction method used in older versions of
`astrodrizzle <https://drizzlepac.readthedocs.io/en/latest/astrodrizzle.html>`_.
This method simply computes the mean/median/mode/etc. value of the sky
separately in each input image. This method has been upgraded to use DQ
flags to exclude bad pixels from the computations of sky statistics.
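
Conceptually, the "local" computation reduces to something like the following
sketch (with placeholder arrays; the actual step selects the statistic and
clipping through its step arguments):

.. code-block:: python

    import numpy as np

    # Conceptual sketch of a "local" sky estimate for one image:
    # a single statistic over all pixels not flagged as bad in the DQ array.
    rng = np.random.default_rng(seed=1)
    data = 0.3 + 0.01 * rng.standard_normal((2048, 2048))  # placeholder image
    dq = np.zeros(data.shape, dtype=np.uint32)  # placeholder DQ array
    dq[:100, :100] = 1  # pretend these pixels are flagged as bad

    good = dq == 0
    local_sky = np.median(data[good])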

In addition to the classic "local" method, two other methods have been
introduced: "global" and "match", as well as a combination of the
@@ -86,7 +147,7 @@ two -- "global+match".

#. The "match" algorithm computes only a correction value for each image, such
that, when applied to each image, the mismatch between *all* pairs of images
is minimized, in the least-squares sense. For each pair of images, the sky
mismatch is computed *only* in the regions in which the two images overlap
on the sky.

@@ -95,9 +156,9 @@ two -- "global+match".
only pair-wise intersection of adjacent images without having
a common intersection region (on the sky) in all images.

Note that if the argument "match_down=True", matching will be done to the
image with the lowest sky value, and if "match_down=False" it will be done
to the image with the highest value
(see :ref:`skymatch step arguments <skymatch_arguments>` for full details).

#. The "global+match" algorithm combines the "global" and "match" methods.
@@ -123,19 +184,20 @@ The ``skymatch`` step can also accept user-supplied sky values for each image.
This is useful when sky values have been determined based on a custom workflow
outside the pipeline. To use this feature, the user must provide a list of sky
values matching the number of images (``skylist`` parameter) and set the
``skymethod`` parameter to "user". The ``skylist`` must be a two-column
whitespace-delimited file with the first column containing the image filenames
and the second column containing the sky values. There must be exactly one line
per image in the input list.
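
A sketch of this usage, with a hypothetical association and sky-value file,
is shown below:

.. code-block:: python

    from jwst.skymatch import SkyMatchStep

    # Sketch: apply user-supplied sky values. "sky_values.txt" is a
    # hypothetical two-column whitespace-delimited file, one line per image:
    #
    #   jw00001_nrcb1_cal.fits  0.12
    #   jw00001_nrcb2_cal.fits  0.15
    #
    result = SkyMatchStep.call("my_asn.json", skymethod="user",
                               skylist="sky_values.txt")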

Examples
--------
To get a better idea of the behavior of these different methods, the tables
below show the results for two hypothetical sets of images. The first example
is for a set of 6 images that form a 2x3 mosaic, with every image having
overlap with its immediate neighbors. The first column of the table gives the
actual (fake) sky signal that was imposed in each image, and the subsequent
columns show the results computed by each method (i.e. the values of the
resulting BKGLEVEL keywords).
All results are for the case where the step argument ``match_down = True``,
which means matching is done to the image with the lowest sky value.
Note that these examples are for the highly simplistic case where each example
@@ -230,16 +292,17 @@ accomplishes this by comparing the sky values in the
overlap regions of each image pair. The quality of sky matching will
obviously depend on how well these sky values can be estimated.
True background may not be present at all in some images, in which case
the computed "sky" may be the surface brightness of a large galaxy, nebula,
etc.

Here is a brief list of possible limitations and factors that can affect
the outcome of the matching (sky subtraction in general) algorithm:

#. Because sky computation is performed on *flat-fielded* but
*not distortion corrected* images, it is important to keep in mind
that flat-fielding is performed to obtain correct surface brightnesses.
Because the surface brightness of a pixel containing a point-like source
will change inversely with a change to the pixel area, it is advisable to
mask point-like sources through user-supplied mask files. Values
different from zero in user-supplied masks indicate good data pixels.
Alternatively, one can use the ``upper`` parameter to exclude the use of
45 changes: 26 additions & 19 deletions jwst/resample/tests/test_resample_step.py
@@ -854,12 +854,13 @@ def test_resample_undefined_variance(nircam_rate, shape):

@pytest.mark.parametrize('ratio', [0.7, 1.2])
@pytest.mark.parametrize('rotation', [0, 15, 135])
@pytest.mark.parametrize('crpix', [(256, 488), (700, 124)])
@pytest.mark.parametrize('crval', [(22.04019, 11.98262), (22.0404, 11.983)])
@pytest.mark.parametrize('shape', [(1020, 1010)])
def test_custom_wcs_resample_imaging(nircam_rate, ratio, rotation, crpix, crval, shape):
    im = AssignWcsStep.call(nircam_rate, sip_approx=False)
    im.data += 5

    result = ResampleStep.call(
        im,
        output_shape=shape,
@@ -907,15 +908,17 @@ def test_custom_refwcs_resample_imaging(nircam_rate, output_shape2, match,
    rng = np.random.default_rng(seed=77)
    im.data[:, :] = rng.random(im.data.shape)

    if output_shape2 is None:
        crpix = None
    else:
        crpix = (output_shape2[-1] // 2, output_shape2[-2] // 2)
    crval = tuple(np.mean(im.meta.wcs.footprint(), axis=0))
    rotation = 15
    ratio = 0.7

    # first pass - create a reference output WCS:
    result = ResampleStep.call(
        im,
        output_shape=output_shape2,
        crpix=crpix,
        crval=crval,
        rotation=rotation,
@@ -924,11 +927,15 @@ def test_custom_refwcs_resample_imaging(nircam_rate, output_shape2, match,

    # make sure results are nontrivial
    data1 = result.data
    weight1 = result.wht

    assert not np.all(np.isnan(data1))

    if crpix is not None:
        assert np.allclose(result.meta.wcs(*crpix), crval, rtol=1e-12, atol=0)

    refwcs = str(tmp_path / "resample_refwcs.asdf")
    asdf.AsdfFile({"wcs": result.meta.wcs, "array_shape": data1.shape}).write_to(refwcs)

    result = ResampleStep.call(
        im,
@@ -937,6 +944,7 @@ def test_custom_refwcs_resample_imaging(nircam_rate, output_shape2, match,
    )

    data2 = result.data
    weight2 = result.wht
    assert not np.all(np.isnan(data2))

    if output_shape2 is not None:
@@ -945,19 +953,18 @@ def test_custom_refwcs_resample_imaging(nircam_rate, output_shape2, match,
    if match:
        # test output image shape
        assert data1.shape == data2.shape
        assert np.allclose(data1, data2, equal_nan=True, rtol=1.0e-7, atol=1e-7)

    # make sure pixel values are similar, accounting for scale factor
    # (assuming inputs are in surface brightness units)
    iscale2 = (im.meta.photometry.pixelarea_steradians
               / compute_mean_pixel_area(im.meta.wcs, shape=im.data.shape))
    input_mean = np.nanmean(im.data)
    total_weight = np.sum(weight1)
    output_mean_1 = np.nansum(data1 * weight1) / total_weight
    output_mean_2 = np.nansum(data2 * weight2) / total_weight
    assert np.isclose(input_mean * iscale2, output_mean_1)
    assert np.isclose(input_mean * iscale2, output_mean_2)

    im.close()
    result.close()
@@ -971,7 +978,7 @@ def test_custom_refwcs_pixel_shape_imaging(nircam_rate, tmp_path):
    im.data[:, :] = rng.random(im.data.shape)

    crpix = (600, 550)
    crval = (22.04019, 11.98262)
    rotation = 15
    ratio = 0.7

