Executing blocks with malformed results section erases org file contents #486

Open
pati-ni opened this issue Jul 28, 2023 · 26 comments

@pati-ni

pati-ni commented Jul 28, 2023

In some scenarios (it does not happen often), executing a src block erases the content that follows it in the org-mode file. As a result, everything up to at least the next src block gets erased.

My guess is that, in the #+RESULTS: section, the :RESULTS: drawer is not properly closed with :END:. Re-executing the src block then erases everything up to the next :END:. With luck, the next block has been executed and the deletion stops at that block's :END:. If not, it will most likely erase everything. Usually I can revert this with an undo, if I notice the behavior in time.

But this can lead to some nasty situations. Is there a way to introduce a fail-safe for this?
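As a rough illustration of what such a fail-safe could check, here is a minimal Python sketch. It is not how emacs-jupyter or Org work; the helper name `results_drawer_is_closed` is hypothetical. It relies only on the Org syntax rule that a drawer cannot contain a heading or another drawer, so meeting either before :END: means the drawer is malformed and the result should not be replaced.

```python
import re

def results_drawer_is_closed(lines, results_index):
    """Return True if the :RESULTS: drawer after lines[results_index] is closed.

    Hypothetical helper for illustration. A heading or a second drawer
    opening before :END: is treated as evidence of a missing :END:.
    """
    i = results_index + 1
    # A result that is not wrapped in a drawer has nothing to check.
    if i >= len(lines) or lines[i].strip().upper() != ":RESULTS:":
        return True
    for line in lines[i + 1:]:
        stripped = line.strip()
        if stripped.upper() == ":END:":
            return True
        if re.match(r"\*+ ", line):
            return False  # reached a heading: the drawer was never closed
        if re.fullmatch(r":[-\w]+:", stripped):
            return False  # another drawer starts: the first one was never closed
    return False  # end of file without :END:
```

A caller would run this check before deleting an old result and refuse to delete anything when it returns False.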

@ed9w2in6

ed9w2in6 commented Sep 27, 2023

I regularly see a similar symptom, but only with commands that fail and trigger errors or warnings.
Org mode then complains; here is an excerpt from my *Warnings* buffer:

 ■  Warning (org-element-cache): org-element--cache: Unregistered buffer modifications detected (312536 != 312301). Resetting.
If this warning appears regularly, please report the warning text to Org mode mailing list (M-x org-submit-bug-report).
The buffer is: scratch.org
 Current command: nil
 Backtrace:
nil

So a good starting point on any investigation would be:

  • how emacs-jupyter puts the error message ([goto-error]?) into the org buffer
    • via jupyter-handle-error
  • how org-element--cache-sync expects the buffer to be edited.

Sorry that I do not have time right now to look further into this, but I will investigate it later since it affects me a lot.
I used to just ignore it, but lately I found that huge sections of my org file had been deleted without my noticing.
Undo also does not work, but luckily I had made a backup a few weeks ago.

@nnicandro
Collaborator

Could you give an example where Emacs-Jupyter causes the :END: line to be missing from the results of a source block? There shouldn't be any such cases happening, if there are let me know.

It would be great if you could provide a minimum working example Org file with example source blocks that can reproduce your problem. I am aware of the scenario that you mention, but evaluating source blocks with Emacs-Jupyter should always create valid Org documents.

@akirakyle
Contributor

I have also been seeing these issues but haven't had a chance to debug further and create a minimum working example. I suspect the org-element--cache issues may be related to the way jupyter handles ANSI colors, as I still see inconsistent behavior for error outputs that contain escape characters to color them.

@ed9w2in6

@nnicandro
For me, I am not sure whether the :END: line is always missing when the contents get erased.
I am sure that:

I can reproduce this quite reliably in my current setup, so I'll try to look into it, maybe at the weekend, to get a minimal setup.

@NightMachinery

I have also seen this happen, e.g.,

* before
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
8
#+end_src

#+RESULTS:
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
9
#+end_src
* after
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
8
#+end_src

#+RESULTS:
: 8

Here the problem is that I have deleted the blank line after #+RESULTS, but I think emacs-jupyter should be robust to such user misbehaviors. Can't it just check for the next #+begin_src and always stop there? After all, all results are prepended with : , so there should never be a line in the results that starts with #+begin_src. We can even just stop at the first #+.
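The proposed rule can be sketched in a few lines of Python. This only illustrates the suggestion and is not Org's or emacs-jupyter's behavior; the function name `result_end` is hypothetical, and the rule assumes that a result never itself contains a #+begin_src line.

```python
import re

def result_end(lines, results_index):
    """Index of the first line after #+RESULTS: that must not be deleted.

    Hypothetical rule from the proposal: a result region ends at the first
    "#+begin_src" line or heading, whichever comes first.
    """
    for i in range(results_index + 1, len(lines)):
        line = lines[i]
        if line.lstrip().lower().startswith("#+begin_src"):
            return i
        if re.match(r"\*+ ", line):
            return i
    return len(lines)
```

With the "before" example, the line right after #+RESULTS: is already a #+begin_src line, so the result region is empty and the following source block would be left alone.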

@ed9w2in6

ed9w2in6 commented Oct 20, 2023 via email

@NightMachinery

@ed9w2in6 You’re right, the issue I raised is distinct from yours, but perhaps the solution I proposed would work for your case, as well?

@nnicandro
Collaborator

@NightMachinery In Org, src blocks can also be the results of execution of some other source block, e.g.

#+begin_src shell :wrap "src jupyter-python"
echo 9
#+end_src

Running the above yields

#+begin_src shell :wrap "src jupyter-python"
echo 9
#+end_src

#+RESULTS:
#+begin_src jupyter-python
9
#+end_src

When a new result is inserted, it is Org, not Emacs-Jupyter, that removes the source block that has the #+RESULTS keyword attached to it; see org-babel-insert-result. There is really no way to work around this unless we mess with Org's internals.

@NightMachinery

@nnicandro I think this problem is severe enough that it warrants forking org-babel-insert-result. I am pretty sure I have lost code to this behavior.

This org feature of inserting a source block as a result of another thing is not that useful (I would personally use eval instead in such a situation, which will work even if I migrate to a script).

We can of course check :wrap and delete the begin_src block only when such a :wrap is present. This new behavior could even be sent upstream, no?

@mankoff

mankoff commented Oct 23, 2023

Just to confirm that I can reliably reproduce this. Evaluate buffer with error. An example is:

#+BEGIN_SRC jupyter-python :exports both
import numpy as np
import xarray as xr
import pandas as pd

# foo = xr.Dataset({'foo': xr.DataArray(data=[100,200], dims=['dim'], coords={'dim':['a','b']})})
# bar = xr.Dataset({'bar': xr.DataArray(data=[200,300], dims=['dim'], coords={'dim':['b','c']})})
# baz = xr.Dataset({'baz': xr.DataArray(data=[200,100], dims=['dim'], coords={'dim':['b','a']})})
# print((foo['foo']-bar['bar']).values)
# print((foo['foo'] - baz['baz']).values)

times = pd.date_range(start='2000-01-01',freq='1D',periods=3)
dims = np.array(['aa','ab']).astype(np.object)

foo = xr.Dataset({'foo': xr.DataArray(data=[[1,2],[3,4],[5,6]], dims=['time','dim'], coords={'time':times, 'dim':dims})})
bar = xr.Dataset({'bar': xr.DataArray(data=[[2,1],[4,3],[6,5]], dims=['time','dim'], coords={'time':times, 'dim':dims[::-1]})})
# print((foo['foo']-bar['bar']).values)

ds = xr.Dataset()
ds['time'] = (('time'), times)
ds['dim'] = (('dim'), dims)
ds['baz'] = (('time','dim'), foo['foo']-bar['bar'])
print(ds)
#+END_SRC

Which errors with:

#+RESULTS:
:RESULTS:
: /tmp/ipykernel_521850/3839449454.py:12: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
:   dims = np.array(['aa','ab']).astype(np.object)
# [goto error]
#+begin_example
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[13], line 12
      5 # foo = xr.Dataset({'foo': xr.DataArray(data=[100,200], dims=['dim'], coords={'dim':['a','b']})})
      6 # bar = xr.Dataset({'bar': xr.DataArray(data=[200,300], dims=['dim'], coords={'dim':['b','c']})})
      7 # baz = xr.Dataset({'baz': xr.DataArray(data=[200,100], dims=['dim'], coords={'dim':['b','a']})})
      8 # print((foo['foo']-bar['bar']).values)
      9 # print((foo['foo'] - baz['baz']).values)
     11 times = pd.date_range(start='2000-01-01',freq='1D',periods=3)
---> 12 dims = np.array(['aa','ab']).astype(np.object)
     14 foo = xr.Dataset({'foo': xr.DataArray(data=[[1,2],[3,4],[5,6]], dims=['time','dim'], coords={'time':times, 'dim':dims})})
     15 bar = xr.Dataset({'bar': xr.DataArray(data=[[2,1],[4,3],[6,5]], dims=['time','dim'], coords={'time':times, 'dim':dims[::-1]})})

File ~/local/mambaforge/envs/ds/lib/python3.10/site-packages/numpy/__init__.py:305, in __getattr__(attr)
    300     warnings.warn(
    301         f"In the future `np.{attr}` will be defined as the "
    302         "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
    304 if attr in __former_attrs__:
--> 305     raise AttributeError(__former_attrs__[attr])
    307 # Importing Tester requires importing all of UnitTest which is not a
    308 # cheap import Since it is mainly used in test suits, we lazy import it
    309 # here to save on the order of 10 ms of import time for most users
    310 #
    311 # The previous way Tester was imported also had a side effect of adding
    312 # the full `numpy.testing` namespace
    313 if attr == 'testing':

AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. 
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
#+end_example
:END:

Note that there is both an #+end_example and :END:.

However, if I then type C-/ which is (undo), it deletes the next ~100 lines of the buffer. Sections, babel blocks with BEGIN_SRC, etc. (a comment above said to the next source block but I am seeing this bug consume multiple source blocks).

I can reliably recreate this bug with my (large) config. I cannot create an MWE.

@ed9w2in6

ed9w2in6 commented Oct 24, 2023

@mankoff The symptoms that you just described are exactly the same as mine.
Do you see the org-element--cache warning messages?

@nnicandro Do you prefer me and @mankoff to open another issue? At least the symptoms are different, if it turns out to be the same cause we can just close one of them.

@mankoff

mankoff commented Oct 24, 2023

@ed9w2in6 I'm running latest org from git and updated yesterday. I cannot recreate this bug today. The git log has a lot of entries on org-element and the cache. Can you check if you still have this bug on the latest commit? I'm at

  • 098f08159 - (HEAD -> main, origin/main, origin/HEAD) org-open-at-point: Preserve point unless opening link moves the point (2023-10-23)

@mankoff

mankoff commented Oct 26, 2023

> @ed9w2in6 I'm running latest org from git and updated yesterday. I cannot recreate this bug today. The git log has a lot of entries on org-element and the cache. Can you check if you still have this bug on the latest commit?

Never mind, it is still here :(.

@mankoff

mankoff commented Oct 26, 2023

I posted about this issue on the Org list in case it is related to org-cache.

Their suggestion:

> This is most likely a problem with emacs-jupyter. It does something that bypasses `after-change-functions', which is not allowed in Org mode.
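To see why bypassing change hooks can confuse a cache, consider this toy Python model. It is only an analogy under stated assumptions, not Org's implementation: the `Buffer` and `ElementCache` classes and all their attributes are invented for illustration. The cache learns about edits only through hooks, so an edit made while hooks are inhibited leaves it out of step with the buffer, and the next sync has to reset.

```python
class Buffer:
    """Toy text buffer with a modification counter and change hooks."""

    def __init__(self, text=""):
        self.text = text
        self.tick = 0                 # incremented on every modification
        self.after_change_hooks = []
        self.inhibit_hooks = False    # stands in for inhibiting modification hooks

    def insert(self, position, string):
        self.text = self.text[:position] + string + self.text[position:]
        self.tick += 1
        if not self.inhibit_hooks:
            for hook in self.after_change_hooks:
                hook(position, len(string))


class ElementCache:
    """Toy cache that only learns about edits through the buffer's hooks."""

    def __init__(self, buffer):
        self.buffer = buffer
        self.seen_tick = buffer.tick
        self.resets = 0
        buffer.after_change_hooks.append(self.on_change)

    def on_change(self, position, length):
        self.seen_tick = self.buffer.tick

    def sync(self):
        """Return True if the cache is consistent, else reset and return False."""
        if self.seen_tick != self.buffer.tick:
            self.resets += 1          # "unregistered modification": start over
            self.seen_tick = self.buffer.tick
            return False
        return True


buf = Buffer("abc")
cache = ElementCache(buf)
buf.insert(0, "x")            # hooks run, cache stays consistent
first_sync = cache.sync()
buf.inhibit_hooks = True
buf.insert(0, "y")            # hooks skipped, cache never hears about it
second_sync = cache.sync()
```

In this model `first_sync` is True and `second_sync` is False, which mirrors the shape of the "Unregistered buffer modifications detected" warning: two counters that should agree do not.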

@mankoff

mankoff commented Oct 26, 2023

> @mankoff The symptoms that you just described are exactly the same as mine.
> Do you see the org-element--cache warning messages?

Yes, I get the same cache warning. I see

Toggle...
Warning (org-element-cache): org-element--cache: Unregistered buffer modifications detected (57640 != 55418). Resetting.
If this warning appears regularly, please report the warning text to Org mode mailing list (M-x org-submit-bug-report).
The buffer is: misc.org
 Current command: nil
 Backtrace:
"  backtrace-to-string(nil)
  org-element--cache-sync(# 37660)
  org-element-at-point()
  (progn (org-element-at-point))
  (unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate))
  (let ((save-match-data-internal (match-data))) (unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate)))
  (let ((element (let ((save-match-data-internal (match-data))) (unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate))))) (and (eq (org-element-type element) 'src-block) (>= (line-beginning-position) (let* ((parray (and t (let* ... ...)))) (if parray (let* ((val ...)) (if (eq val ...) 'nil (let ... val))) (let* ((val ...)) (cond (... ...) (... ...) (t ...)))))) (<= (line-end-position) (save-excursion (save-restriction (widen) (goto-char (let* (...) (if parray ... ...))) (skip-chars-backward \" \\11\\n\") (line-end-position)))) (org-element--property :language element nil nil)))
  org-eldoc-get-src-lang()
  (let ((lang (org-eldoc-get-src-lang))) (cond ((string= lang \"org\") nil) ((or (string= lang \"emacs-lisp\") (string= lang \"elisp\")) (cond ((and (boundp 'eldoc-documentation-functions) (fboundp 'elisp-eldoc-var-docstring) (fboundp 'elisp-eldoc-funcall)) (let ((eldoc-documentation-functions ...)) (eldoc-print-current-symbol-info))) ((fboundp 'elisp-eldoc-documentation-function) (elisp-eldoc-documentation-function)) (t (let (eldoc-documentation-function) (eldoc-print-current-symbol-info))))) ((or (string= lang \"c\") (string= lang \"C\")) (if (require 'c-eldoc nil t) (progn (c-eldoc-print-current-symbol-info)))) ((string= lang \"css\") (if (require 'css-eldoc nil t) (progn (css-eldoc-function)))) ((string= lang \"php\") (if (require 'php-eldoc nil t) (progn (php-eldoc-function)))) ((or (string= lang \"go\") (string= lang \"golang\")) (if (require 'go-eldoc nil t) (progn (go-eldoc--documentation-function)))) (t (let ((doc-fun (org-eldoc-get-mode-local-documentation-function lang)) (callback (car args))) (if (functionp doc-fun) (progn (if (functionp callback) (funcall doc-fun callback) (funcall doc-fun))))))))
  (or (org-eldoc-get-breadcrumb) (org-eldoc-get-src-header) (let ((lang (org-eldoc-get-src-lang))) (cond ((string= lang \"org\") nil) ((or (string= lang \"emacs-lisp\") (string= lang \"elisp\")) (cond ((and (boundp ...) (fboundp ...) (fboundp ...)) (let (...) (eldoc-print-current-symbol-info))) ((fboundp 'elisp-eldoc-documentation-function) (elisp-eldoc-documentation-function)) (t (let (eldoc-documentation-function) (eldoc-print-current-symbol-info))))) ((or (string= lang \"c\") (string= lang \"C\")) (if (require 'c-eldoc nil t) (progn (c-eldoc-print-current-symbol-info)))) ((string= lang \"css\") (if (require 'css-eldoc nil t) (progn (css-eldoc-function)))) ((string= lang \"php\") (if (require 'php-eldoc nil t) (progn (php-eldoc-function)))) ((or (string= lang \"go\") (string= lang \"golang\")) (if (require 'go-eldoc nil t) (progn (go-eldoc--documentation-function)))) (t (let ((doc-fun (org-eldoc-get-mode-local-documentation-function lang)) (callback (car args))) (if (functionp doc-fun) (progn (if ... ... ...))))))))
  org-eldoc-documentation-function(#f(compiled-function (string &rest plist) #))
  run-hook-with-args-until-success(org-eldoc-documentation-function #f(compiled-function (string &rest plist) #))
  eldoc-documentation-default()
  eldoc--invoke-strategy(nil)
  eldoc-print-current-symbol-info()
  #f(compiled-function () #)()
  apply(#f(compiled-function () #) nil)
  timer-event-handler([t 0 0 500000 nil #f(compiled-function () #) nil idle 0 nil])
" Disable showing Disable logging

@yantar92

> This org feature of inserting a source block as a result of another thing is not that useful

It is useful in some scenarios. But can be disabled, if you wish to (:results none). If you find something missing, feel free to write a feature request.

@mankoff

mankoff commented Oct 26, 2023

Hi @yantar92 - thanks for following up here. It does appear that we may be discussing two different problems in this GitHub issue. The error in the more recent comments above, which I also reported on the Org mailing list, is not about code generating a source block. It happens with 'normal' #+RESULTS:.

See comment here: #486 (comment)

@yantar92

yantar92 commented Oct 26, 2023

> However, if I then type C-/ which is (undo), it deletes the next ~100 lines of the buffer. Sections, babel blocks with BEGIN_SRC, etc. (a comment above said to the next source block but I am seeing this bug consume multiple source blocks).
>
> I can reliably recreate this bug with my (large) config. I cannot create an MWE.

I am unable to reproduce with my config.

@mankoff

mankoff commented Oct 26, 2023

I was unable to reproduce with my config yesterday either :). But I was today. Same config! :(.

@yantar92

I am seeing the following in ob-jupyter:

        ;; KLUDGE: Remove the file result-parameter so that
        ;; `org-babel-insert-result' doesn't attempt to handle it while
        ;; async results are pending.  Do the same in the synchronous
        ;; case, but not if link or graphics are also result-parameters,
        ;; only in Org >= 9.2, since those in combination with file mean
        ;; to interpret the result as a file link, a useful meaning that
        ;; doesn't interfere with Jupyter style result insertion.

Note that an async evaluation API is in place in the latest Org mode. There is no need to write custom async code, which might indeed be prone to various errors. Check out org-babel-comint-async-register (if it is not sufficient, consider writing a feature request).

@ed9w2in6

A package with similar (same?) issue: nobiot/org-transclusion#105
Their fix: nobiot/org-transclusion@eb3ff3c

I believe our issue here may be similar to theirs, where the culprit is inhibit-modification-hooks.

usage in jupyter-org-client:

;; Don't add these changes to the undo list, gives a slight speed up.
(let ((buffer-undo-list t)
      (inhibit-modification-hooks t)
      next begin1 end1)
  (while (/= begin end)
    (setq next (next-single-property-change begin 'jupyter-ansi nil end))

usage in jupyter-repl (may not need to change, but maybe change it too for consistency?):

jupyter/jupyter-repl.el

Lines 748 to 755 in 3a31920

;;
;; For reference see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32278
(let ((inhibit-modification-hooks t)
      (beg (point-min))
      (end (point-max))
      (new-len (with-current-buffer new-code
                 (- (point-max) (point-min)))))
  (run-hook-with-args

A quick search in org-mode mailing list reveals a few similar issues: https://list.orgmode.org/?q=inhibit-modification-hooks

I am not sure about the right fix, maybe just doing a (org-element-cache-reset)?

From reading the long mailing list threads for a bit, insert-file-contents apparently also needs to do a cache reset dance.

@yantar92

> I am not sure about the right fix, maybe just doing a (org-element-cache-reset)?

Why do you need to set inhibit-modification-hooks to start with? Running org-element-cache-reset will cause performance degradation in the whole Org file.

@fakeGenuis

I ran into a similar problem. I am using Doom with the latest emacs-jupyter. When the code takes a long time to run and then hits an error, the stdout produced during the run gets erased. A minimal example:

#+begin_src jupyter-python :session test :async yes
from time import sleep

n_all = 10
for i in range(n_all):
    sleep(0.05)
    print(i)
1/0
#+end_src

The results block looks like this:

#+RESULTS:
:RESULTS:
#+begin_example
ter-python :session 
#+end_example
# [goto error]
: ---------------------------------------------------------------------------
: ZeroDivisionError                         Traceback (most recent call last)
: Cell In[46], line 7
:       5     sleep(0.05)
:       6     print(i)
: ----> 7 1/0
: 
: ZeroDivisionError: division by zero
:END:

Note that without 1/0 in the last line, the results are 0-9 wrapped inside an example block. Now they are replaced with a fragment of text from elsewhere in my org file (ter-python :session). Either commenting out sleep(0.05) or changing the header args to :async no fixes it.

If I change n_all to 9, the stdout is not wrapped inside an example block and it works fine. Any idea which function/variable triggers this wrapping?

@akirakyle
Contributor

For anyone here experiencing the org-element-cache warning issue, you can try out the fix in #515 and see if that helps. I'm not sure about the other issue with erasing org file contents. I've seen this happen before, but never reproducibly, so I've always attributed it to accidentally deleting :END: or something else in the org buffer while long-running async code is executing, so that the org document it tries to insert its results into is itself malformed.

@pati-ni
Author

pati-ni commented Nov 28, 2023

One thing I have just noticed concerns the output: if it spans several lines, it is formatted as an example block. If a #+begin_example block is not properly closed with #+end_example, the :END: line is ignored and everything is deleted up to the next #+end_example. So it may be that, in blocks that raise errors, this formatting sequence is not emitted properly.
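A check for this situation could look roughly like the Python sketch below. It is an illustration only, with a hypothetical function name `first_unclosed_block`, and it assumes it is given just the lines between :RESULTS: and the first :END: that follows. While a block is open, everything up to its matching #+end_ line is treated as content, which is also how the runaway deletion described above arises.

```python
import re

BEGIN = re.compile(r"^\s*#\+begin_(\S+)", re.IGNORECASE)
END = re.compile(r"^\s*#\+end_(\S+)", re.IGNORECASE)

def first_unclosed_block(drawer_lines):
    """Return (block_name, line_number) of a block left open, or None.

    Hypothetical helper: `drawer_lines` is assumed to hold only the lines
    between :RESULTS: and the first :END: after it.
    """
    open_block = None  # (name, 1-based line number) of the block we are inside
    for number, line in enumerate(drawer_lines, start=1):
        if open_block is None:
            begin = BEGIN.match(line)
            if begin:
                open_block = (begin.group(1).lower(), number)
        else:
            end = END.match(line)
            if end and end.group(1).lower() == open_block[0]:
                open_block = None
    return open_block
```

A result of None means every block inside the drawer is closed; anything else names the block that would swallow the :END: line.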

@pati-ni
Author

pati-ni commented Mar 31, 2025

To follow up on this, I think that the results from the Python interpreter are not sanitized properly. Does anybody else still experience cases where executing a block that generates warnings/error messages erases the contents of the notebook?

EDIT: It is important to mention that I use the async facility and I edit the buffer while waiting for the block to finish.

8 participants