RSH 16: Debugging
Debugging: we all do it, we are never taught it. How we approach it and some tools we use
Video
Collaborative notes taken during the session
Ice-breaker question
What was the most difficult bug you have chased/solved? What was the longest
debugging session/period for you?
- couple of days for one line of code
- couple of hours (around 10h) and learning that it was not straightforward to put simple text into an HDF5 file
- couple of days, most difficult was to find the bug in a part where I was absolutely sure it was right
- couple of days, gave up for a few months (on that new module), solved it in a few hours next time I looked at it. The issue with locating the bug was that debugging/print statements led me to believe that the error was in the wrong part of the code, due to asynchronous execution. In the end, the error was a simple index error.
- over a week. A bug that was very hard to reproduce and trigger. Software had poor test coverage.
-
Is this related? How to keep an overview about the code that gets more and more complex?
- definitely related. the more modular and "well structured" code, the easier it may be to locate/corner the bug.
-
Python logging module: https://docs.python.org/3/library/logging.html
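A minimal sketch of how the logging module can stand in for ad-hoc print statements. The logger name "demo" and the function mean() are made up for illustration, and the in-memory stream is only there to make the output easy to inspect:

```python
import io
import logging

# Collect log records in memory so the result is easy to look at;
# a real script would more likely log to stderr or to a file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)       # DEBUG records are filtered out at this level
logger.propagate = False            # keep the example independent of the root logger


def mean(values):
    """Average of a list, with log messages instead of print statements."""
    logger.debug("mean() called with %d values", len(values))
    if not values:
        logger.warning("empty input, returning 0.0")
        return 0.0
    return sum(values) / len(values)


mean([1.0, 2.0, 3.0])   # only a DEBUG record, which is suppressed
mean([])                # emits a WARNING record
print(stream.getvalue(), end="")
```

Changing the level to logging.DEBUG would make the debug messages appear without touching the function itself.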
-
Git bisect exercise: https://github.com/coderefinery/git-bisect-exercise/
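The idea behind bisection can be sketched in a few lines of Python. This is not how git implements it, only the underlying binary search; the commit names and the is_bad() check are hypothetical, and the sketch assumes that the first commit is good, the last one is bad, and the bug stays once it has appeared:

```python
def first_bad(commits, is_bad):
    """Return (index of the first bad commit, number of commits tested).

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is bad as well.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        steps += 1
        if is_bad(commits[middle]):
            bad = middle    # the bug was introduced at or before "middle"
        else:
            good = middle   # the bug was introduced after "middle"
    return bad, steps


# Hypothetical history of 500 commits with the bug introduced at index 137.
commits = [f"commit-{i:03d}" for i in range(500)]


def is_bad(commit):
    # Stand-in for "check out this commit and run the failing test".
    return int(commit.split("-")[1]) >= 137


index, steps = first_bad(commits, is_bad)
print(commits[index], "found after testing", steps, "commits")
```

Halving the search range each time means 500 commits need at most 9 tests, which is why bisection pays off even for long histories.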
-
Error message:
Code:
import ipyparallel as ipp
rc = ipp.Client()
rc.ids
executor = rc.become_dask(ncores=4)
executor
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/ipyparallel/controller/hub.py", line 559, in dispatch_query
handler(idents, msg)
File "/opt/conda/lib/python3.7/site-packages/ipyparallel/controller/hub.py", line 1467, in become_dask
self.distributed_scheduler = scheduler = Scheduler(**kwargs)
File "/opt/conda/lib/python3.7/site-packages/distributed/scheduler.py", line 885, in __init__
self.bandwidth = parse_bytes(dask.config.get("distributed.scheduler.bandwidth"))
File "/opt/conda/lib/python3.7/site-packages/dask/utils.py", line 1164, in parse_bytes
s = s.replace(" ", "")
AttributeError: 'int' object has no attribute 'replace'
- Do you not start at the bottom and read up?
- this is what I most often do in these cases
- oops :-) it seems Python and Fortran/C unwind errors in different orders
- Would say depends a little on the kind of program you are debugging. When using frameworks sometimes the answer is in the middle of the traceback.
- I usually
- Check changelog as well. Might be that a fix is mentioned.
- great point!
- changelogs, issues, pull requests
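A small sketch of reading a traceback from both ends, using the standard traceback module. The three functions are simplified, hypothetical stand-ins that only mimic the shape of the error above (an int passed where a string is expected); they are not the real ipyparallel or dask code:

```python
import traceback


def parse_bytes(s):
    # Hypothetical stand-in: assumes a string such as "100 MB".
    return s.replace(" ", "")


def start_scheduler(config):
    return parse_bytes(config["bandwidth"])


def become_dask(config):
    return start_scheduler(config)


def call_chain(config):
    """Run become_dask() and return the function names found in the traceback."""
    try:
        become_dask(config)
    except AttributeError as error:
        frames = traceback.extract_tb(error.__traceback__)
        return [frame.name for frame in frames], str(error)
    return [], ""


names, message = call_chain({"bandwidth": 100000000})  # an int, not a string
print("where it blew up:   ", names[-1])   # last frame: where the error was raised
print("who started it:     ", names[1])    # first call made from our own code
print("what went wrong:    ", message)
```

The last frame shows where the error was raised, but the cause (the int in the configuration) only becomes visible further up, which matches the comment that the answer is sometimes in the middle of the traceback.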
"Sleep on it" :smile: :+1:
-
Ask somebody else to have a look?
-
Also just trying to explain to the little rubber duck what is happening and what should actually happen.
- Often I have found the solution by explaining the problem to somebody. So this is real :-)
-
The Zen of Python: https://www.python.org/dev/peps/pep-0020/
-
set -e
-
set -euo pipefail
-
set -x
-
-q, -v, -vv, -vvv
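One way such verbosity flags could be wired up in a Python script, as a sketch using argparse and logging from the standard library. The mapping from flags to levels is an assumption made for this example, not a convention every tool follows:

```python
import argparse
import logging


def log_level(argv):
    """Translate repeated -v / -q flags into a logging level."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-q", "--quiet", action="count", default=0)
    args = parser.parse_args(argv)

    # Start at WARNING; each -v lowers the threshold by one level (10),
    # each -q raises it by one level.
    level = logging.WARNING - 10 * args.verbose + 10 * args.quiet
    # Clamp to the range of the standard levels.
    return max(logging.DEBUG, min(logging.CRITICAL, level))


for flags in ([], ["-v"], ["-vv"], ["-vvv"], ["-q"]):
    print(flags, "->", logging.getLevelName(log_level(flags)))
```

The returned value can be passed to logging.basicConfig(level=...) at the start of the script.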
"debuggers do not remove bugs, they run your code in slow motion" [citation needed]
- TIL: https://en.wikipedia.org/wiki/Heisenbug
The example that we use gdb on (later also Valgrind): https://github.com/ResearchSoftwareHour/demo-debugging :+1:
-
fortran example: fortran/bugs/
- -g -ffpe-trap=zero,invalid,overflow,underflow (gfortran)
-
Would it be easier to do this in an IDE (?), e.g. like Spyder, where one could see the variable and their values?
- yes I think so, it's a lot easier to follow the position and see the variables
-
Is there a way to print all variables and their values currently in memory? +1
- yes (I don't remember off-hand how, though)
- try "help all" in gdb; "info locals" prints the local variables of the current frame
- how about in the python (was it pdb?)
import pdb ; pdb.set_trace()
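In Python the local variables of a function are available as a dictionary, both inside a pdb session (for example by printing locals()) and from ordinary code. A minimal sketch; snapshot_locals() and kinetic_energy() are hypothetical helpers written for this example:

```python
import inspect


def snapshot_locals():
    """Return a copy of the local variables of the calling function."""
    caller = inspect.currentframe().f_back
    return dict(caller.f_locals)


def kinetic_energy(mass, velocity):
    energy = 0.5 * mass * velocity ** 2
    state = snapshot_locals()   # everything defined in this function so far
    return energy, state


energy, state = kinetic_energy(2.0, 3.0)
for name, value in sorted(state.items()):
    print(f"{name} = {value!r}")
```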
-
I remember in Matlab I could put a command into the code and it would stop exactly there and give me access to the values stored at that moment.
- yes - a "breakpoint". In Python 3.7 and later this would be breakpoint(), which is a built-in and needs no import; on older versions use import pdb; pdb.set_trace()
- in visual studio code you can also click next to a line and it sets a breakpoint at that line (never used though, just did it by accident and wondered what was wrong :D)
- Thank you!
- See also debug adapter protocol and a list of supported languages
-
For Rust developers, I was happy to find out about the macro dbg!()
-
Have you heard about sourcery?
- I haven't, please tell us more ...
- https://sourcery.ai/ ?
- Yes, I heard about it in a podcast. I thought it checks your code and makes suggestions, maybe another sort of debugging?
-
awesome! I missed the variable inspector from RStudio when starting with Python. Now I will look into whether there is something similar in VSCode, do you know? -> seems there is one under the run and debug view (play sign with bug)
-
In Matlab the very similar variable inspector could handle the changing variables when one goes into the functions.
-
anyone with debugging advice on Haskell? Would like to hear/read any stories.
- I would also like to know :-)
-
https://valgrind.org/
-
@bast any chance of showing how to approach a segfault? +2
-
even describing what it is. It's still a bit vague to me.
- https://www.geeksforgeeks.org/core-dump-segmentation-fault-c-cpp/
- Thanks!
-
Python takes care of memory leaks?
- Not really. You can still have them but they have a different form. For instance Python has a Garbage Collector that ensures unused memory is released at specific times when no additional references exist to this memory. You can still have a memory leak if you do something like:
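def func(mylist=[]):   # the default list is created once and shared by all calls
    mylist.append("hello")
    print(mylist)

func()  # prints ['hello']
func()  # prints ['hello', 'hello']: the list keeps growing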
...
-
any advice on how to detect memory leaks (that are small when testing but may increase later when running with real data) in python?
- Test the code in loops, repeatedly calling it.
- Plot your memory usage over time
- yes, memory profiling. ideally at the end it should be back to zero
- are there specific tools for memory profiling?
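One tool that ships with Python is tracemalloc. The sketch below follows the advice above (call the code repeatedly and compare memory before and after) and contrasts the mutable-default pattern with a fixed version. The function names, the 10 kB payload and the number of calls are arbitrary choices for this example:

```python
import tracemalloc


def leaky(cache=[]):
    # The default list is created once, so every call adds to the same list.
    cache.append(bytearray(10_000))
    return len(cache)


def fixed(cache=None):
    # A fresh list per call: it is released when the function returns.
    if cache is None:
        cache = []
    cache.append(bytearray(10_000))
    return len(cache)


def growth_in_bytes(func, calls=1000):
    """Traced memory still allocated after calling func() many times."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(calls):
        func()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


leaky_growth = growth_in_bytes(leaky)
fixed_growth = growth_in_bytes(fixed)
print(f"leaky: {leaky_growth / 1e6:.1f} MB still allocated after 1000 calls")
print(f"fixed: {fixed_growth / 1e6:.1f} MB still allocated after 1000 calls")
```

Running the same measurement with increasing numbers of calls and plotting the result shows whether memory use grows with the amount of work, which is the pattern to look for before moving to real data.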
-
Any suggestion on detecting memory leaks in CUDA? I know about cuda-memcheck (run with cuda-memcheck [memcheck_options] app_name [app_options]). Are there other/better alternatives?
- Looking around, I found Cudagrind, it hasn't been updated for many years though. Would be nice to see what else can be used.
-
Is pair programming sitting down together and working it out together? If yes, it was super fun! A scientist and a software engineer. I guess we learned a lot from each other.
- Can be but usually refers to "collaborating" in solving problems or implementing solutions (my understanding)
- My understanding is original, two people and one set of code, one editor.
- there is also "mob programming": one person types, many others (Twitch :P) are in the same room and navigate, and people give the keyboard to others after some time
-
RSH live from Nordic RSE: https://nordic-rse.org/events/2020-online-get-together/
Thank you! :) Learned a lot and will start debugging now
Thanks for another great episode of RSHour! :heart: +1 :+1:
Thank you :-)
Notes we used when planning this session
Introduction
- This is a really challenging topic to talk about, since there are
- So many ways to go wrong
- So many ways to fix it
- No one right answer, it's more of an art than a science
- https://en.wikipedia.org/wiki/List_of_software_bugs
Types of bugs
- syntax errors
- runtime errors
- Hard errors that are clearly reported as errors
- results are wrong (like the kelvin example)
- Heisenbug: trying to study the bug changes it, e.g. adding print statements changes timings. Memory locations change. etc
- Compiled with or without optimizations. Stepped through with debugger to eliminate race conditions.
- you add a print statement and bug goes away: memory bugs
- schrödinbug: it never worked, but you never noticed it.
- Local (clearly identify to one line) vs systematic (a property of the whole system)
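The first three categories above can be told apart with a small Python sketch. The snippets are invented for illustration; in particular the temperature conversion is only a guess at what a "kelvin"-style wrong-result bug might look like, not the example used in the session:

```python
def classify(source):
    """Run a code snippet and report which kind of problem (if any) shows up."""
    try:
        code = compile(source, "<snippet>", "exec")
    except SyntaxError:
        return "syntax error"        # caught before anything runs
    try:
        exec(code, {})
    except Exception as error:
        return f"runtime error ({type(error).__name__})"
    return "ran without complaint"   # which does not mean the result is right


print(classify("print('hello'"))     # missing closing parenthesis
print(classify("x = 1 / 0"))         # fails loudly while running

# Hypothetical wrong-result bug: the sign of the offset is wrong,
# but nothing crashes and nothing is reported.
wrong = "kelvin = 20.0 - 273.15"
print(classify(wrong))

# An assertion turns the silent wrong result into a clearly reported error.
checked = wrong + "\nassert kelvin >= 0, 'temperature below absolute zero'"
print(classify(checked))
```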
Approach to debugging
- Reality not as fancy as you might think
- Take a break
- Reducing size of the problem
- Make it run faster, but still produce the bug (smaller input data, fewer iterations, etc)
- Removing degrees of freedom: Disable optional features until you get straight to it
- I thought efficient debugging is like an efficient tree-search of possibilities "inside our head": eliminating as big branches as possible as early as possible.
- Finding the point of the problem
- bisection
- git bisect: when you have good version control, when was it introduced?
- Turn off optional features and see if the problem still occurs
- "Deactivating code"/skipping code to locate memory problems: making the result wrong but making the code not crash
- git grep
- it can be useful to make the code produce "not scientifically meaningful" results for the sake of debugging
- Now, you roughly know the point. What now?
- Reading error messages
- How to approach stack traces and finding the problem
- How to pick the interesting part out of error message: read from both the top and the bottom
- internet-searching solutions with the right error message
- Turn it into a unit test or example script, that is self-contained
- Can you make the example portable to someone else to try (git, conda, containers, etc?)
- This is basically a prerequisite to asking someone for help
- Asking for help
- When to ask for help (after you have narrowed it down)
- What to include when asking (not "it doesn't work")
- what to do if it crashes/fails in somebody else's code (library or package)
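The point about turning a bug into a self-contained unit test could look roughly like this. The function last_n() and its bug are invented for the example; the test marked as an expected failure documents the bug without breaking the test run, and can be shared as is when asking for help:

```python
import unittest


def last_n(items, n):
    """Return the last n items (hypothetical function with a bug for n == 0)."""
    return items[-n:]   # bug: items[-0:] is items[0:], i.e. the whole list


class LastNReproducer(unittest.TestCase):
    def test_typical_input_works(self):
        self.assertEqual(last_n([1, 2, 3], 2), [2, 3])

    @unittest.expectedFailure
    def test_zero_items_requested(self):
        # Smallest input found that still shows the problem.
        self.assertEqual(last_n([1, 2, 3], 0), [])


def run_reproducer():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LastNReproducer)
    return unittest.TextTestRunner(verbosity=2).run(suite)


result = run_reproducer()
print("known bugs reproduced:", len(result.expectedFailures))
```

Once the bug is fixed, removing the expectedFailure decorator turns the reproducer into a regression test.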
Preparing code for debugging
- various points from Zen of Python, for example "Errors should never pass silently / unless explicitly silenced", but there is more relevant in it too
- print debugging
- writing good error messages, catching errors the right way
- don't trap all errors and ignore them
- logging and verbosity
- shell: set -x
- -v, -q
- stdout vs stderr for printing stuff
- logging module show example (rkdarst)
- Everything defined in levels: debug, info, warning, error, critical
- https://github.com/NordicHPC/envkernel/blob/master/envkernel.py
- assertions, programming for safety
- shell script strict modes
- set -e ; set -u ; set -o pipefail . often done as set -euo pipefail
- debug compile flags
- "debuggers do not remove bugs, they run your code in slow motion"
- gdb/pdb type interface (AF), let it run until error happens
- demo-debugging/fortran/bugs
- explicit error to make it start
- print statements
- Useful as starting point
- Maybe use logging module instead?
- jupyter: %%debug (RD)
- import IPython ; IPython.embed() or from code import interact; interact()
- Valgrind and memory bugs (Radovan)
- C/C++ example with 3 memory bugs (use after free, memory leak, out of bounds access)
- debugging through IDE; remote debugging
- variable inspector (RD)
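For the points above on good error messages and not trapping all errors, a minimal Python sketch. ConfigError and both loader functions are hypothetical names; the first loader shows the pattern to avoid, the second catches only the error it can explain and keeps the original one attached:

```python
import json


class ConfigError(Exception):
    """Raised when a configuration cannot be used (hypothetical name)."""


def load_config_silently(text):
    # Anti-pattern: every error is swallowed, so a broken file looks like
    # an empty configuration and the problem surfaces much later.
    try:
        return json.loads(text)
    except Exception:
        return {}


def load_config(text, source="<string>"):
    # Catch only the error we can explain, say what and where, and chain
    # the original exception so its traceback is not lost.
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        raise ConfigError(
            f"could not parse {source}: {error.msg} "
            f"(line {error.lineno}, column {error.colno})"
        ) from error


broken = "{bad"
print("silent loader returned:", load_config_silently(broken))

try:
    load_config(broken, source="settings.json")
    message, cause = "", None
except ConfigError as error:
    message, cause = str(error), error.__cause__
print("strict loader said:", message)
```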
Bonus if we have time