Timeseries pipeline issues #201

@samaloney

  • General
    • Not integrated with the CLI or config
      • Can't set the JSOC email via config files or the CLI
      • Uses a different root folder variable from the rest of the code
  • Performance
    • Stores a lot of data twice, plus move overhead
      • Don't copy files; symlink them into the final directory
    • Doesn't use as many cores as possible, and passes large objects between processes
      • Now use a semaphore to limit DRMS calls but use many cores for processing
      • Don't pass the entire FITS file to the processes, only the WCS
    • Shouldn't reproject the entire image at the start, which is a huge waste; only reproject the ARs
    • Could write animation frames to memory and use mpl animation (possible memory issues)
  • These errors come up repeatedly on long runs
    • The QUALITY column seems to be missing occasionally
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 785/2393 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2026-02-17 07:23:27 - root - INFO: 2014-04-24_12038_Beta-Gamma_Dai_Xb0_Mb0_Cb0_Xa0_Ma0_Ca0
2026-02-17 07:25:16 - root - ERROR: 'QUALITY'
Traceback (most recent call last):
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/drms_pipeline.py", line 77, in <module>
    aia_maps, hmi_maps = drms_pipeline(
                         ^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 479, in drms_pipeline
    aia_query, aia_export = aia_query_export(hmi_query, aia_keys, wavelengths)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 588, in aia_query_export
    euv_value = [aia_rec_find(qstr, keys, 3, 12) for qstr in qstrs_euv]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 588, in <listcomp>
    euv_value = [aia_rec_find(qstr, keys, 3, 12) for qstr in qstrs_euv]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 674, in aia_rec_find
    while qry["QUALITY"].values[0] != 0 and retry < retries:
          ~~~^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/pandas/core/frame.py", line 4107, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/pandas/core/indexes/range.py", line 417, in get_loc
    raise KeyError(key)
KeyError: 'QUALITY'
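
The traceback above ends in `pandas/core/indexes/range.py`, which suggests (an inference, not something confirmed here) that `qry.columns` was a `RangeIndex`, as it is for a completely empty `DataFrame`. If so, the query may have returned no records at all rather than a record without `QUALITY`. A minimal sketch of a guard, assuming a hypothetical helper `first_quality` that is not part of arccnet, with made-up sample rows:

```python
import pandas as pd


def first_quality(qry: pd.DataFrame):
    """Return the first QUALITY value, or None if there is nothing to read.

    Hypothetical helper for illustration; not part of arccnet.
    """
    # Covers both an empty result and a result that lacks the column.
    if qry.empty or "QUALITY" not in qry.columns:
        return None
    return qry["QUALITY"].iloc[0]


# Made-up stand-ins for query results.
empty = pd.DataFrame()
good = pd.DataFrame({"T_REC": ["2014-04-24T00:00:00Z"], "QUALITY": [0]})
bad = pd.DataFrame({"T_REC": ["2014-04-24T00:00:12Z"], "QUALITY": [65536]})

results = [first_quality(df) for df in (empty, good, bad)]
```

In `aia_rec_find`, a `None` result could perhaps be treated like a bad-quality record so that the existing retry loop moves on instead of raising.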
  • A length mismatch happens in this code (can't find the traceback right now)

    instr = query["INSTRUME"][0][0:3]
    path_prefix = []
    export.urls.drop_duplicates(ignore_index=True, inplace=True)
    query.drop_duplicates(ignore_index=True, inplace=True)
    for time, wvl in zip(query["T_REC"], query["WAVELNTH"]):
        time = sunpy.time.parse_time(time).to_value("ymdhms")
        year, month, day = time["year"], time["month"], time["day"]
        newdir = f"{path}/01_raw/{year}/{month}/{day}/SDO/{instr}/"
        Path(newdir).mkdir(parents=True, exist_ok=True)
        path_prefix.append(f"{newdir}{int(wvl)}.")
    existing_files = [glob.glob(f"{dirs}*.fits") for dirs in np.unique(path_prefix)]
    existing_files = list(itertools.chain.from_iterable(existing_files))
    matching_files = [comp_list(file, existing_files) for file in export.urls["filename"]]
    missing_files = [not value for value in matching_files]
    export.urls["filename"] = path_prefix + export.urls["filename"]
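
One plausible cause, not yet confirmed by a traceback: `path_prefix` gets one entry per row of `query`, while `export.urls` is deduplicated separately, so the two can end up with different lengths before they are combined on the last line. A minimal sketch that checks the counts first and fails with a clearer message, using a hypothetical helper `prefix_filenames` (not part of arccnet) and made-up paths and file names:

```python
def prefix_filenames(prefixes, filenames):
    """Prepend one prefix to each filename, failing loudly if the counts differ.

    Hypothetical helper for illustration; not part of arccnet.
    """
    prefixes, filenames = list(prefixes), list(filenames)
    if len(prefixes) != len(filenames):
        raise ValueError(
            f"{len(prefixes)} query rows but {len(filenames)} export files; "
            "the two tables are out of step"
        )
    return [prefix + name for prefix, name in zip(prefixes, filenames)]


# Made-up example values.
prefixes = ["data/01_raw/2014/4/24/SDO/AIA/171.", "data/01_raw/2014/4/24/SDO/AIA/193."]
filenames = ["a.fits", "b.fits"]
combined = prefix_filenames(prefixes, filenames)
```

The returned list could then be assigned to `export.urls["filename"]`. Logging both lengths when the check fails should at least show which of the two tables lost rows.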

  • New SSL timeout error

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 24/2305 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2026-02-17 16:22:50 - root - INFO: 2011-03-02_11164_Beta-Gamma-Delta_Ekc_Xb0_Mb0_Cb0_Xa0_Ma0_Ca1
2026-02-17 16:27:31 - root - ERROR: The read operation timed out
Traceback (most recent call last):
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/drms_pipeline.py", line 77, in <module>
    aia_maps, hmi_maps = drms_pipeline(
                         ^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 487, in drms_pipeline
    cnt_dls, cnt_exs = l1_file_save(ic_export, ic_query, path)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/sdo_processing.py", line 727, in l1_file_save
    export.download(directory="", index=export.urls[missing_files].index)
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/drms/client.py", line 567, in download
    shutil.copyfileobj(response, out_file)
  File "/home/smaloney/.pyenv/versions/3.11.13/lib/python3.11/shutil.py", line 197, in copyfileobj
    buf = fsrc_read(length)
          ^^^^^^^^^^^^^^^^^
  File "/home/smaloney/.pyenv/versions/3.11.13/lib/python3.11/http/client.py", line 473, in read
    s = self.fp.read(amt)
        ^^^^^^^^^^^^^^^^^
  File "/home/smaloney/.pyenv/versions/3.11.13/lib/python3.11/socket.py", line 718, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/smaloney/.pyenv/versions/3.11.13/lib/python3.11/ssl.py", line 1314, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/smaloney/.pyenv/versions/3.11.13/lib/python3.11/ssl.py", line 1166, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 25/2305 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Traceback (most recent call last):
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/aiapy/util/net.py", line 39, in _get_data_from_jsoc
    jsoc_result = drms.Client().query(query, key=key, seg=seg)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/drms/client.py", line 1007, in query
    self._raise_query_error(lres)
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/drms/client.py", line 638, in _raise_query_error
    raise DrmsQueryError(msg)
drms.exceptions.DrmsQueryError: Command '/home/jsoc/releases/jsoc-v10.0.5/bin/linux_x86_64/show_series -qz JSOC_DBHOST=hmidb2:5432 JSOC_DBNAME=jsoc JSOC_DBUSER=apache' returned non-zero status code 1 [status=5]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/projects/arccnet/arccnet/data_generation/timeseries/drms_pipeline.py", line 68, in <module>
    pointing_table = calibrate.util.get_pointing_table(source="jsoc", time_range=[start - 6 * u.hour, end])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/aiapy/calibrate/util.py", line 253, in get_pointing_table
    _get_data_from_jsoc(query=f"aia.master_pointing3h[{start.isot}Z-{end.isot}Z]", key="**ALL**")
  File "/net/maedoc.ap.dias.ie/maedoc/home_cr/smaloney/.virtualenvs/arccnet/lib/python3.11/site-packages/aiapy/util/net.py", line 42, in _get_data_from_jsoc
    raise OSError(msg) from e
OSError: Unable to query the JSOC.
 Error message: Command '/home/jsoc/releases/jsoc-v10.0.5/bin
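
Both failures above (the read timeout during `export.download` and the JSOC query error in `get_pointing_table`) look like transient server-side problems, so retrying with a delay may be enough to keep long runs going. A minimal sketch, assuming a hypothetical `retry` helper that is not part of arccnet, drms or aiapy; `flaky_download` is a stand-in for the real call:

```python
import time


def retry(func, *, attempts=3, base_delay=1.0,
          exceptions=(TimeoutError, OSError), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration. TimeoutError is already a subclass
    of OSError; it is listed separately only to make the intent explicit.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for a call that fails twice and then succeeds.
calls = {"n": 0}


def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("The read operation timed out")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = retry(flaky_download, attempts=5, base_delay=0.5, sleep=delays.append)
```

In the pipeline the wrapped callable could be something like `lambda: export.download(...)` with the existing arguments. Whether a retried download safely overwrites a partially written file has not been checked.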
  • Also getting these warnings, which may be OK, but I feel they shouldn't be happening
    WARNING: SunpyUserWarning: Using 'time' assumes an Earth-based observer. [sunpy.physics.differential_rotation]
    2026-02-17 11:27:19 - sunpy - WARNING: SunpyUserWarning: Using 'time' assumes an Earth-based observer.
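
For the Performance point about limiting DRMS calls with a semaphore while still using many workers, here is a minimal sketch of the idea. It uses threads to stay short; a multi-process version would need a `multiprocessing` semaphore shared with the workers. `fetch`, `process` and `MAX_REMOTE_CALLS` are hypothetical stand-ins, not names from arccnet:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_REMOTE_CALLS = 2  # assumed limit on simultaneous JSOC requests
remote_slots = threading.BoundedSemaphore(MAX_REMOTE_CALLS)

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous "remote" calls observed


def fetch(item):
    """Stand-in for a DRMS query or download; at most MAX_REMOTE_CALLS run at once."""
    global _active, peak_active
    with remote_slots:
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        time.sleep(0.01)  # pretend to wait on the network
        with _lock:
            _active -= 1
    return item


def process(item):
    """Stand-in for local processing, which the semaphore does not limit."""
    return fetch(item) ** 2


with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process, range(10)))
```

Only the remote call sits inside the semaphore, so all eight workers can take part in processing while the server never sees more than two requests at a time.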
